Consider this fact: according to a 2013 article in Science Daily, "90 percent of the world's data was created in the last 2 years," and in 2017 the numbers only continue to rise. The amount of data generated via social media is staggering; 4 million text messages are sent each minute within the U.S. alone. With massive amounts of data being generated in today's society, we are forced to think differently about how to process, store, and analyze all of it.
Organizations are embarking on data modernization efforts not only to update their current capabilities, but also to provide new processing and storage capabilities for the data generated today. In reality, processing and storing the data is the easy part; the tougher issues still remain to be addressed, such as data governance, data quality, metadata, and master data management. It is arguably even more important to focus on these basics in today's world: more data with quality issues only creates more challenges, and if we can store the data but cannot analyze it, we have achieved nothing.
Many companies in the commercial sector, particularly in commercial healthcare, are modernizing their data infrastructure and implementing innovative data analytics solutions. The Federal market can leverage the expertise and knowledge gained in the commercial sector, particularly from companies with a presence in both. This white paper presents ideas to consider when modernizing your data, and it also describes the basics that you still need to focus on when evolving your data infrastructure.
While organizations are drowning in data (recall that 90 percent of the world's data has been generated in the last 2 years), Forbes reports that less than 0.5 percent of that data is actually analyzed. To analyze the data, you first have to be able to process and store it. We understand that many companies and agencies have already made significant investments in their data infrastructure. Enterprise Data Warehouses are commonplace today, and many organizations have been reaping the benefits of integrated data environments for years. So now what? Do we abandon that sunk cost and move to new technology? No, you don't have to lose your existing investments. You should maximize them by modernizing where needed and then deploying new capabilities that provide seamless integration. The legacy data infrastructure still plays a critical role in an enterprise data architecture; the task at hand is integrating the legacy and modern infrastructure to create the best value for your organization. Consider these basic items when focused on modernization:
- Think differently about Data Integration: The world of data integration has evolved over the years. Enhance your Extract, Transform, and Load (ETL) processes by upgrading to real-time data services, and optimize your integration techniques by considering where it makes sense to perform the integration: at the analytic layer or at the operational layer.
- Open up your Environments – Focus on Exploration: Provide the capabilities for your business users to research, analyze, and play with the data—allow new tools and capabilities to provide sandboxes where data can be explored.
- Choose Federation over Centralization: With today's vast amounts of data, it is not efficient to centralize everything. Look for opportunities to keep the data stored where it is and integrate it using data virtualization techniques.
- Use Open Source Tools: Look for opportunities where you can augment your existing data technology stack with open source tools. The open source community provides a powerful set of tools to not only ingest and pre-process the data, but also to gain insight and explore the data.
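The federation point above can be illustrated with a minimal sketch: instead of copying everything into one central warehouse, a virtualization layer queries each source where it lives and joins the results at read time. The store and table names here (`legacy`, `modern`, `customers`, `events`) are hypothetical stand-ins, and in-memory SQLite databases play the role of two independent data environments.

```python
import sqlite3

# Hypothetical stand-ins for two independent data stores.
legacy = sqlite3.connect(":memory:")   # e.g., a legacy data warehouse
modern = sqlite3.connect(":memory:")   # e.g., a new operational store

legacy.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
legacy.executemany("INSERT INTO customers VALUES (?, ?)",
                   [(1, "Acme Corp"), (2, "Globex")])

modern.execute("CREATE TABLE events (customer_id INTEGER, event TEXT)")
modern.executemany("INSERT INTO events VALUES (?, ?)",
                   [(1, "login"), (1, "purchase"), (2, "login")])

def federated_events():
    """Join data across both stores at query time, leaving each in place."""
    names = dict(legacy.execute("SELECT id, name FROM customers"))
    return [(names.get(cid, "unknown"), event)
            for cid, event in modern.execute(
                "SELECT customer_id, event FROM events")]

for name, event in federated_events():
    print(name, event)
```

In practice this read-time join would be handled by a data virtualization product or a federated query engine rather than hand-written code, but the principle is the same: the data stays where it is stored.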
It is important to integrate your legacy infrastructure, depicted at the bottom of the diagram, with your new data tools and technologies. Understanding the role that your legacy data infrastructure plays in your overall enterprise data architecture is critical for success.
Modernizing your existing architecture will take time and effort, but the Return on Investment (ROI) will be significant, and you will be well poised for the future wave of data coming your way. It is also prudent to incorporate data modernization efforts into your organization's IT Strategy, where existing capabilities can be reviewed against new technology and adjustments made on a consistent basis. The data management world is continually evolving, and it will be critical to stay on top of the trends and capabilities.
While it is important to stay on top of the latest trends and capabilities, it is just as important not to lose sight of the basics of data management. Data management is a discipline that encompasses a wide variety of capabilities, such as data integration, data quality, data governance, metadata management, master data management, data security, and data architecture. All of these disciplines are tightly integrated, and many depend on one another. The challenge has always been, and still remains, ensuring that there is demonstrated value in performing these disciplines and that they align with the business goals of the organization. In most cases, they are an afterthought.
Focus on the Basics
It is important, particularly with the vast amount and variety of data being generated today, not to lose sight of the basic data management capabilities that will become even more important as the data world continues to evolve.
- Stay focused on Data Accountability: One thing that is not changing in today's world is the regulation and scrutiny placed on an organization's data. Staying focused on the basic principles of data governance across your entire data ecosystem is critical for answering targeted questions. You need to understand not only what, when, and how data is transformed across your legacy environment, but also how it is ingested, processed, stored, and analyzed in your new data environments. Being able to bridge your legacy data warehouse and your Hadoop ecosystem remains important; establishing accountability at the data process, subject, and element level provides full transparency, regardless of environment.
- Implement "Fit for Purpose" Data Quality: Not all data is treated equally, so ensure that your data quality program implements various levels of data grading. Understand how the data is consumed from the various data environments so that you can establish the right level of data quality control.
- Implement End-to-End Metadata: Ensure that all facets of metadata (business, operational, and technical) are captured across all legacy and new data environments, ensuring traceability and lineage.
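The "fit for purpose" grading idea above can be sketched as a set of tiered checks: failures on critical fields drop a record to a lower grade than failures on less important ones, and downstream consumers decide which grades are acceptable for their use. The field names, rules, and grade letters below are hypothetical examples, not a prescribed rule set.

```python
# Hypothetical tiered data-quality grading: each rule carries the grade a
# record falls to if the check fails; a record's overall grade is the worst
# grade among its failures.
GRADES = ["A", "B", "C"]  # A = fit for all uses, C = exploration only

RULES = [
    # (field, check, grade assigned if the check fails)
    ("patient_id", lambda v: v is not None and str(v).strip() != "", "C"),
    ("admit_date", lambda v: v is not None, "B"),
    ("notes",      lambda v: isinstance(v, str), "B"),
]

def grade_record(record):
    """Return (grade, failed_fields) for a record dict."""
    worst, failures = "A", []
    for field, check, fail_grade in RULES:
        if not check(record.get(field)):
            failures.append(field)
            if GRADES.index(fail_grade) > GRADES.index(worst):
                worst = fail_grade
    return worst, failures

# A complete record passes all checks; a sparse one is downgraded.
print(grade_record({"patient_id": "P-100", "admit_date": "2017-03-01",
                    "notes": "routine visit"}))
print(grade_record({"patient_id": "", "admit_date": None}))
```

A real data quality program would drive such rules from metadata rather than hard-coding them, but the grading principle, matching the strictness of the control to how the data will be consumed, is the same.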
All disciplines associated with managing data are vital to ensuring that the data is of the highest quality, consistent, understood, and, most importantly, valuable to your business.
As your organization adopts new data tools and technologies, or modernizes its existing data environments, remember that it all comes back to a simple concept: you need to trust the data in order to find its value. To trust the data, you need to understand the basics: How is it defined? How is it used? What is its quality? How does it travel across your data environments? Does it need to be secured? Bottom line: stay focused on the basics as you modernize. In addition, look to companies that have implemented innovative modernization solutions in the commercial sector, whose lessons learned can be applied to modernization in the Federal market.
Initially, Unissant's data services and solutions focused on building data integration and storage capabilities using traditional tools and technologies, such as structured database platforms (Oracle, SQL Server, Sybase, etc.), Extract, Transform, and Load (ETL) tools, and basic reports built with standard tools such as Business Objects, Cognos, and MicroStrategy. As technology capabilities expanded, we began delivering solutions that process higher volumes of data on major data appliances, such as Teradata and Oracle Exadata. Data visualization capabilities, delivered with tools such as Tableau, SAS Visual Analytics, and MicroStrategy Visual Insights, brought greater value to our clients and became an important aspect of our solutions. As data technology capabilities continued to expand over the past 5 years, the majority of our clients moved from primarily on-premise solutions to cloud-based solutions or a hybrid of both. Unissant has experience delivering data repositories on-premise and in the cloud, leveraging platforms such as Amazon Web Services (AWS) and Microsoft Azure. In addition, our clients look to us for solutions in which all types of data (structured, semi-structured, and unstructured) can be ingested, stored, and analyzed to achieve the greatest business value.
As a result of our experience over the past 10 years, we have developed our own proprietary Enterprise Information Management (EIM) Framework, which provides a series of accelerators, best practices, and templates for deploying any data integration, storage, management, and analytics solution. Using rigorous requirements gathering, best-in-class technical expertise, and award-winning creative talent, Unissant has the right combination of skills to help its clients design and deploy successful business intelligence projects. Unissant's clients range from small and midsize enterprises to Fortune 1000 companies and large Federal agencies, and Unissant has developed strategic relationships with global partners to ensure that we provide world-class services.