Data is the most valuable asset any organization has, and should be treated as such by extracting value from each stage of the lifecycle.
By Brian Pawlowski, Chief Development Officer
For years, organizations across sectors have viewed their data as a static entity, something to be stored and never used again. This is far from the truth. Data, when managed effectively across its lifecycle, can unlock new pathways to business success. However, data is only as valuable as the ability to act on it. Enterprises must reframe their approach to data management in terms of actionable verbs: generating, transforming, and preserving. Think of this approach as a data factory, in which data continually runs through workflows that collect, store, and optimize it to arm enterprises with key business insights.
There are three main components to the data factory framework: raw materials, works in progress, and finished goods. When these components operate in unison, as in any good factory, they enable organizations to maximize the value of their unstructured data assets, turning telemetry insights into action that drives market growth and sustained success.
The starting point in the data factory is where data is generated by any cloud-based, edge, or on-prem device. Currently, data, in particular unstructured data like video and imagery, is being generated and collected at an unprecedented rate amid societal shifts toward digital transformation and hybrid work environments. In fact, 90% of the world’s data was created over the past two years, and 80% of all data collected is unstructured. That’s an unfathomable amount of data being generated every second of every day, from video surveillance footage to medical imagery like MRIs to IoT sensing and monitoring logs. Additionally, new workloads and applications, such as VR/AR, are contributing to the exponential increase of unstructured data. One additional behavior of unstructured data complicates this factory model: it is mobile across the entire factory, and data doesn’t just move in one direction.
After data is generated and collected, it moves to the work-in-progress phase, where three capabilities matter most. The first is performance. The second is the ability to access the data easily and collaborate from anywhere. The third is AI data enrichment, which is first applied at this stage to derive more value from the data. AI can accomplish enrichment in a variety of ways: putting metadata tags on data to make it easily searchable, applying analytics to understand patterns and gain visibility, or simply helping with categorization.
Why is this enrichment process so crucial? According to IDC, with the meteoric rise of unstructured data, companies will soon succeed or fail based on their ability to harness the power of their data by enriching it for increased digital competitiveness. This means companies will be able to use AI to garner insights from their data for new product development and innovation breakthroughs and to streamline their operational workflows, reducing the manual effort required so employees can focus on higher-level business priorities.
Finally, after data has been enriched, it is ready to be archived in a way that keeps it easily accessible for repurposing because, ultimately, data never dies. For example, the same footage can be used initially for internal training purposes, and then reused for future content production by a corporate marketing video team. Forensic analytics can be performed on the same data repeatedly with a different goal in mind each time, making the data factory a cyclical process.
Long-term data archiving requires a different set of capabilities. The solution must store data at the lowest possible media cost; it must be sustainable and green, that is, efficient in power and data center real estate costs; and it must operate reliably at exabyte scale for years.
Data is one of the most important resources to any enterprise, and as new ways to generate value from data surface every day through analytics, AI/ML, and more, it will only become more important. With the data lifecycle more cyclical than ever, more organizations should adopt a factory-like operation where maximum value from data can be created, extracted, and repurposed continuously. One example of this repurposing is that old medical imagery and research data are being recalled from archives and searched to create new drugs or to study the behavior of diseases. Breakthroughs like these demonstrate the importance of all data, no matter how old. Data is fluid, and organizations need to make sure they have a dynamic infrastructure at the right price point based on where data is in the lifecycle.
About the Author:
Brian Pawlowski is the Chief Development Officer for Quantum. Brian began his career as a software engineer for several well-known technology companies, and has over 35 years of experience in building technologies and leading teams at global technology companies. Brian served on the Board of Trustees for the Anita Borg Institute for Women and Technology and was a Board member at the Linux Foundation. Brian studied computer science at Arizona State University and studied physics at the University of Texas at Austin and MIT.