Advances in cloud computing, along with the big data movement, have transformed the business IT landscape. By leveraging the cloud, companies gain on-demand capacity and mobile access to their business-critical systems and information. At the same time, the amount of structured and unstructured data created by, and available to, organizational users is a constantly moving target, with IDC estimating that the digital universe will grow by a factor of 10 between 2013 and 2020. But while both of these IT megatrends can be catalysts for innovation and growth, organizations face significant new challenges when trying to seize the opportunities they present.
Corporate datasets are growing more diverse, complex and massive in size. This information consists of vast collections of unstructured, file-based content such as Word documents, Excel spreadsheets, images, videos and PDFs (Big Content), as well as the structured data (Big Data) that is collected by smart machines and wearables, or that resides in various business systems across the enterprise, such as CRMs and ERPs. For organizations, the challenge lies in creating harmony between their unstructured content repositories and structured data systems, not only to make critical business information easily accessible to employees, but also to pinpoint and analyze with great precision the information that is most relevant to business objectives.
While cloud computing offers more effective ways of utilizing applications and storage, it also has the potential to further complicate the information management environment by introducing new sets of silos. These information silos, where data is often maintained by designated lines of business and incapable of interacting with other business systems, lend additional layers of complexity to the growing information disconnect.
In the simplest of terms, organizations are now dealing with information overload and, as a result, operational inefficiencies. With disparate silos of structured data and unstructured content, some residing on-premises and others in the cloud, the mere process of finding the right information quickly is often riddled with more technical complexity than ever before. How can businesses optimize the benefits provided by the cloud while also eliminating the Big Data and Big Content challenges their implementations present?
Software vendors have recognized the need to enable seamless integration among various repositories. Most vendors offer open and well-documented web service APIs to improve interoperability between different systems and cloud services.
To bridge the information disconnect, organizations must understand how data in different repositories is related. This can be either a manual mapping process or one that leverages emerging technologies, such as Artificial Intelligence and Machine Learning. Either way, the objects should be bound together with metadata.
Often overlooked in today’s information ecosystem, metadata is defined as “data about the data.” It consists of the attributes, properties and tags that describe and classify information. Metadata natively exists for all structured content throughout an organization, and most unstructured content also contains basic metadata properties, such as type, author and date created.
Using metadata, organizations can add business-relevant intelligence by associating structured data and unstructured content with a project, customer, sales rep, workflow state or virtually any other attribute they define. Doing so enables businesses to classify data intelligently around their unique business characteristics and processes, ultimately giving structure to unstructured content.
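As a minimal sketch of this idea, the Python snippet below attaches business attributes to a few files and then selects files by those attributes rather than by folder or file name. The `ContentItem` class, the `find_by_metadata` helper, and the sample file names and attribute values are all hypothetical and chosen only for illustration; a real content system would keep this metadata in its own store.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentItem:
    """An unstructured file plus the metadata that describes it (hypothetical model)."""
    path: str
    metadata: Dict[str, str] = field(default_factory=dict)


def find_by_metadata(items: List[ContentItem], **criteria: str) -> List[ContentItem]:
    """Return the items whose metadata matches every requested attribute."""
    return [
        item for item in items
        if all(item.metadata.get(key) == value for key, value in criteria.items())
    ]


# Illustrative sample data: the attributes are whatever the business defines.
items = [
    ContentItem("proposal_q3.docx",
                {"type": "proposal", "customer": "Acme", "workflow_state": "draft"}),
    ContentItem("invoice_1001.pdf",
                {"type": "invoice", "customer": "Acme", "workflow_state": "approved"}),
    ContentItem("site_photo.jpg",
                {"type": "image", "project": "Warehouse"}),
]

# Everything related to one customer, regardless of file type or location.
acme_items = find_by_metadata(items, customer="Acme")

# Narrow further by combining attributes.
acme_drafts = find_by_metadata(items, customer="Acme", workflow_state="draft")
```

Once the attributes exist, the files can be queried much like rows in a structured system, which is the sense in which metadata gives structure to unstructured content.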
By serving as the bridge that connects unstructured content with structured data systems, metadata helps eliminate silos, freeing information from the confines of business systems, departments and devices, and linking public and private datasets. How exactly does it bridge the divide? The process involves intelligently linking information in structured data systems to unstructured content repositories to establish relevance, with an enterprise information management (EIM) system, for example, serving as the conduit. Consider this: a proposal is important because it is related to a certain customer that is managed in the CRM system, or an invoice is of interest because it is related to a certain vendor or project in the ERP system. With an integrated layer of metadata within an EIM system, it’s possible to provide users with instant access to the most up-to-date information, regardless of where the file is stored or where it originated.
As another example, using a metadata-centric architecture, sales reps can search across all data repositories from within a CRM system for content that is relevant to a customer profile. The findings may include sales proposals, open support tickets and outstanding invoices – files stored in the cloud or on-premises, representing information that would otherwise remain hidden within separate information silos. The power of metadata connects these sources and offers a 360-degree view of all data assets related to an object profile, creating added business intelligence about that profile’s behaviors and patterns.
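The sales-rep scenario can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular product: the repository names, record fields and the `customer_view` function are hypothetical, and in-memory lists stand in for what would be web service API calls to each system.

```python
from collections import defaultdict

# Hypothetical repositories, some in the cloud and some on-premises.
# Each record carries a shared metadata attribute, customer_id.
REPOSITORIES = {
    "cloud_file_share": [
        {"name": "acme_proposal.docx", "type": "proposal", "customer_id": "C-100"},
        {"name": "globex_proposal.docx", "type": "proposal", "customer_id": "C-200"},
    ],
    "on_premises_erp": [
        {"name": "INV-1001", "type": "invoice", "customer_id": "C-100"},
    ],
    "support_desk": [
        {"name": "TICKET-42", "type": "support_ticket", "customer_id": "C-100"},
    ],
}


def customer_view(customer_id, repositories):
    """Collect every item tagged with customer_id, grouped by item type.

    Each entry records which repository the item came from, so the caller
    sees one combined view without needing to know where files live.
    """
    view = defaultdict(list)
    for repo_name, records in repositories.items():
        for record in records:
            if record.get("customer_id") == customer_id:
                view[record["type"]].append((repo_name, record["name"]))
    return dict(view)


view = customer_view("C-100", REPOSITORIES)
for item_type, hits in view.items():
    print(item_type, hits)
```

The shared `customer_id` attribute is what makes the lookup possible; without it, each repository would have to be searched separately using its own conventions.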
Indeed, metadata plays a powerful role in searching, analyzing and even in drawing relational conclusions from big data. To this end, it empowers organizations to profile their data – to make sense of it for their users in an era in which information is abundant, but is often left untapped. To derive true business value, all big data must be essentially reduced using such profiling techniques, eliminating that which is irrelevant and narrowing in on that which is compelling and actionable.
While metadata has been around for a long time, it is just beginning to earn recognition as a powerful tool for extracting important insights from big data and cloud deployments. By serving to bridge the gap across content repositories and cloud applications, it is enabling business users to quickly and easily locate the information needed to improve performance, make more informed decisions and provide greater value to customers.
By Mika Javanainen