Federal agencies are moving to modernize their infrastructure and adopt artificial intelligence (AI), machine learning (ML), and other emerging technologies. As they do so, ensuring data integrity is crucial for building reliable models, making informed decisions, and achieving mission success.
MeriTalk recently spoke with Gabrielle Rivera, vice president at Maximus, to discuss data integrity challenges and how to overcome them, as well as key considerations for enabling efficient and secure information sharing.
MeriTalk: What are the fundamental principles of data integrity, and what are key considerations for data integrity as Federal agencies modernize IT?
Rivera: The fundamental principles of data integrity are accuracy, consistency, and security, which must be applied throughout the entire data lifecycle. These principles ensure that the data is complete and free from errors or anomalies that could compromise data quality.
This is particularly important when looking at data migration for modernization efforts. If the source data lacks integrity, whatever you are migrating or converting – whether you are refactoring or reengineering code – will carry those quality problems and errors forward. These problems only compound downstream.
MeriTalk: What are some of the data integrity challenges that you have seen impact modernization efforts at Federal agencies?
Rivera: Federal agencies, particularly Federal financial agencies, are dealing with spaghetti code – legacy code, local code, and source code that is tangled together and riddled with anomalies and ambiguities. This creates data integrity challenges when refactoring or reengineering to prepare for migration.
Before trying to migrate data or layer on new technologies, such as AI/ML tools or automation, the integrity and quality issues in the legacy code and data must be addressed with a comprehensive, enterprise-wide data management strategy that includes a migration-readiness strategy.
Sandboxes or testbeds are important to modernization. Thoroughly testing refactored code, data, applications, and systems in testbeds assures data integrity and security during updates, migrations, and reconfigurations. This is especially critical for time-sensitive modernization efforts.
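To make that kind of testbed validation concrete, here is a minimal sketch of one way a team might check migrated data: comparing record counts and order-independent checksums between the source and target data sets. The function names and sample records are hypothetical illustrations, not part of any specific agency system or Maximus tool.

```python
import hashlib

def table_fingerprint(rows):
    """Return (row_count, checksum) for an iterable of records.

    Rows are serialized to a canonical string and hashed in sorted order,
    so the fingerprint does not depend on insertion order.
    """
    count = 0
    digest = hashlib.sha256()
    for row in sorted("|".join(map(str, r)) for r in rows):
        digest.update(row.encode("utf-8"))
        count += 1
    return count, digest.hexdigest()

def validate_migration(source_rows, target_rows):
    """Compare source and migrated data sets and report any mismatch."""
    src_count, src_hash = table_fingerprint(source_rows)
    tgt_count, tgt_hash = table_fingerprint(target_rows)
    if src_count != tgt_count:
        return f"Row count mismatch: source={src_count}, target={tgt_count}"
    if src_hash != tgt_hash:
        return "Checksum mismatch: record contents differ after migration"
    return "OK: counts and checksums match"

# Hypothetical sample data standing in for a source table and its migrated copy.
source = [(1, "GRANT", 1500.00), (2, "LOAN", 250.75)]
migrated = [(2, "LOAN", 250.75), (1, "GRANT", 1500.00)]
print(validate_migration(source, migrated))  # OK: counts and checksums match
```

Checks like these are deliberately simple; in practice they would run alongside application-level and security testing in the sandbox before any cutover.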
MeriTalk: How do data integrity practices factor into agencies’ decisions about combining and storing data?
Rivera: Agencies need to assess the security and storage parameters for different options like containerization, data warehouses, data lakes, and hybrid lakehouse approaches. By evaluating how the data will be used, accessed, stored, and secured, agency IT teams can determine the most appropriate storage solution that balances cost, security, and accessibility. Considerations around data retention policies and the economics of storage are also important. For example, using cold data storage offers lower costs and is designed for data that is infrequently accessed or archived for long periods.
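As a rough illustration of that kind of tiering decision, the sketch below encodes a hypothetical retention policy in Python that routes records to hot or cold storage based on how recently they were accessed. The thresholds, tier names, and seven-year retention window are assumptions chosen for the example, not recommendations from the interview.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical policy: anything untouched for 90 days moves to lower-cost
# cold storage; anything past a 7-year retention window is flagged for
# archival review rather than silently deleted.
COLD_AFTER = timedelta(days=90)
RETENTION_LIMIT = timedelta(days=365 * 7)

def storage_tier(last_accessed, created, now=None):
    """Return the storage tier a record should live in under the policy above."""
    now = now or datetime.now(timezone.utc)
    if now - created > RETENTION_LIMIT:
        return "archive-review"
    if now - last_accessed > COLD_AFTER:
        return "cold"
    return "hot"

# Example: a record created three years ago but read last week stays in hot storage.
now = datetime.now(timezone.utc)
print(storage_tier(last_accessed=now - timedelta(days=7),
                   created=now - timedelta(days=3 * 365)))  # hot
```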
Agencies often strive to be cloud first, when really, they should be cloud smart. While cloud migration offers potential benefits, it’s important to consider the challenges involved and the readiness of the environment. Hybrid environments, such as a combined on-premises and cloud-based architecture – a lakehouse approach – can provide a more gradual transition, but they require careful planning and management to avoid increasing costs and complexity.
In the push for modernization, agency IT teams may look for shiny objects – the latest solutions – but they should be careful to consider the overlap of costs between the legacy on-premises environments and the cloud environments meant to replace them.
MeriTalk: Agencies need to be able to exchange data with other agencies and organizations. What are some key considerations for data governance and storage strategies to make the exchange of information more efficient and enable effective security and access control?
Rivera: The varying data management practices and access policies across Federal agencies create obstacles for data sharing and collaboration, which can limit their ability to serve constituents effectively. Inconsistencies in formats and protocols are also obstacles to data sharing.
Creating a collaborative, well-governed approach to data exchange that prioritizes data integrity, security, and accessibility across organizational boundaries is key for usability and availability. Considerations include the types of data being exchanged, the frequency of exchanges, data storage decisions, and identifying compatible and secure exchange methods.
Each interacting agency, along with additional parties such as industry contractors, needs to critically assess roles, processes, and responsibilities for managing the data. How they exchange information from a cybersecurity standpoint is critical. User-based and role-based access controls are important to maintain data quality and confidentiality, especially when multiple agencies are accessing or exchanging highly sensitive information, such as personally identifiable information in the case of Federal financial agencies.
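A minimal sketch of role-based access control along those lines is shown below; the roles, permitted actions, and function names are hypothetical placeholders rather than any agency's actual policy.

```python
# Hypothetical role definitions mapping roles to the operations they may
# perform on a shared data set; names are illustrative only.
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "data_steward": {"read", "update"},
    "exchange_admin": {"read", "update", "share"},
}

def is_authorized(user_roles, action):
    """Allow an action only if at least one of the user's roles grants it."""
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in user_roles)

# An analyst from a partner agency can read shared records but not re-share them.
print(is_authorized(["analyst"], "read"))   # True
print(is_authorized(["analyst"], "share"))  # False
```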
MeriTalk: How can agencies evaluate industry partners and build effective partnerships to leverage data for mission success, customer satisfaction, and resource optimization?
Rivera: Contractor-owned, contractor-operated environments (COCOs) can offer the advantage of experience with agency systems and with the data procedures and privacy protocols required for agency-to-agency exchanges. But COCOs can also add complexity and challenges to data exchange.
Effective data exchange necessitates transparency, communication, and collaboration among all stakeholders. Comprehensive dashboarding and reporting are crucial for data-driven decision-making and resource optimization, fostering stakeholder trust, customer satisfaction, and mission success. Federal agencies should evaluate industry partners based on their commitment to transparency, collaboration, and accessibility.
MeriTalk: What is Maximus’ approach to helping agency leaders unleash the full potential of their data?
Rivera: It is important to have a holistic data strategy and approach to identify upstream and downstream effects and interdependencies. We strive to eliminate the risks and maximize the benefits of agencies’ growing data stores with solution designs that leverage the full potential of the data while keeping it secure.
When agencies leverage the full potential of their data, they can make better decisions, enable accurate forecasting, enhance transparency, and provide insight into operations. We focus on maximizing the applicability of the data without layering on duplicative platforms or tool sets. We collaborate with agencies to deliver a solution that best fits each agency’s requirements and security parameters.
This article was originally published by MeriTalk on September 19, 2024.