Data lineage is the comprehensive tracking of data usage within an organization. This includes how data originates, how it is transformed, how it is calculated, its movement between different systems, and ultimately how it is utilized in applications, reporting, analysis, and decision-making.
With the increasing complexities of business technology, data lineage analysis has become essential for most organizations. This article provides an overview of the fundamentals, importance, uses, and challenges of data lineage.
Data lineage facilitates improved data transparency, quality, and consistency by enabling organizations to track and understand the complete lifecycle of their data assets. It supports better decisions when sourcing, using, and transforming data, especially for larger organizations with mission-critical applications and intricate data landscapes.
There are several factors to consider with data lineage:
Data lineage plays a key role in keeping data valuable and effective in a business setting. Here are a few ways that data lineage can deliver benefits to an organization.
Data has incredible value in an information age. To realize the full value, data must be accurate and accessible. In other words, it becomes trustworthy only when it can be understood by anyone using it, and when the processing steps keep the data accurate. Data lineage provides transparency into the flow of data. It increases understanding and makes it easier for non-technical users to capture insights from existing datasets, especially for aggregated or calculated data.
Data management regulations are becoming more stringent each year, making effective data management increasingly important. Data lineage can help organizations comply with GDPR, CCPA, and other data privacy laws. The transparency of data lineage makes data access, audits, and overall accountability easier. Accurate data lineage is crucial for demonstrating compliance with regulatory requirements, thereby mitigating the risk of project delays, fines, and other penalties.
Data lineage enables stronger data governance by providing the information needed to monitor, manage, and ensure compliance with issued standards and guidelines. Because data lineage offers traceability of origin, flow, transformation, and destination, it allows businesses to improve data quality, reduce inconsistencies and errors, and strengthen data management practices.
Data lineage allows companies to trace the path of data from its current form back to its source. Data lineage offers a transparent record, facilitating the understanding and management of data variability and quality throughout its journey, and ensuring reliable data for decision-making. This is particularly relevant for companies modernizing existing systems.
With data lineage, trust in data accuracy and accessibility, improved data quality, and a stronger ability to govern data all combine for better collaboration across teams. Data lineage avoids data siloing and facilitates interdepartmental activity. When data engineers and analysts utilize the same set of data, it fosters cross-functional teamwork and minimizes errors due to bad or inconsistent data. Data lineage encourages a sense of unification as team members across an organization work from the same, trusted data.
There are multiple ways that data lineage can add business value to organizations.
Zengines has invested in data lineage capabilities to support end-to-end migration of data from existing source systems to new target business systems. Data lineage is often the first research step required to ensure an efficient and accurate data migration.
Data lineage exposes data quality issues by providing a clear view of the data journey, highlighting areas where inconsistencies or errors may have occurred. This makes it easier to engage in effective, detailed data analytics.
Consider, for instance, a financial services company with decades-old COBOL programs. Data lineage provides insights for organizations trying to replicate reporting or other outputs from these aging programs.
Data lineage makes it easier to identify and trace errors back to their source. Finding the root cause of an error quickly is extremely valuable in a world where time is at a premium.
An important aspect of data security and privacy compliance is keeping data safeguarded at all times. Data lineage provides an understanding of the data lifecycle that can show information security groups the steps that must be reviewed and secured.
Comprehensive data lineage makes it easier to demonstrate compliance with data privacy regulations. For example, banks and payment processors are subject to the GLBA (Gramm-Leach-Bliley Act), PCI DSS (Payment Card Industry Data Security Standard), the EU GDPR (General Data Protection Regulation), and many other regulations that protect Personally Identifiable Information (PII). Knowing how any data element is used allows it to be protected, masked, or hidden when appropriate.
Data Mesh and Data Fabric are advanced data architectures that help to decentralize data and integrate it across diverse data sources. Understanding the data lineage allows data management teams to make trustworthy data available to Data Mesh / Data Fabric consumers. Data lineage makes it possible to determine the correct data to store and use for a given purpose (decision making, analytics, reporting, etc.). Data lineage is typically part of any new Data Mesh / Data Fabric initiative.
Data lineage is useful but can also face challenges. Here are a few potential issues.
Siloed data continues to be a major hurdle for tracing business data across departments and organizations. Consider what happens when a securities trade is made. The security details are usually maintained in a reference data / Master Data Management application. The bid/ask information comes from many different market data vendors and is updated continuously. The trading application computes the value of the trade, and any tax impact is computed in an investment accounting application. Is the same data being used across them all? Do they use different terminology? Do the applications all use the same pricing information? For accurate reporting and good decision making, it is vital that the same data is used in every step.
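To make this concrete, here is a minimal sketch in Python of that kind of cross-system consistency check. The system names, identifiers, and record layouts are hypothetical; the sketch simply flags a security whose price differs between the reference data application and the trading or accounting applications.

```python
# Minimal sketch: check that three applications used the same price for the same security.
# System names, identifiers, and record layouts are hypothetical, for illustration only.
reference_data = {"CUSIP-037833100": {"price": 189.25, "currency": "USD"}}
trading_system = {"CUSIP-037833100": {"exec_price": 189.25, "currency": "USD"}}
accounting_sys = {"CUSIP-037833100": {"valuation_price": 189.10, "currency": "USD"}}

def check_security(security_id: str, tolerance: float = 0.01) -> list[str]:
    """Return discrepancies for one security across the three systems."""
    ref = reference_data[security_id]["price"]
    candidates = [
        ("trading", trading_system[security_id]["exec_price"]),
        ("accounting", accounting_sys[security_id]["valuation_price"]),
    ]
    return [
        f"{security_id}: {name} used {value}, reference data says {ref}"
        for name, value in candidates
        if abs(value - ref) > tolerance
    ]

for issue in check_security("CUSIP-037833100"):
    print(issue)
```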
Mapping data lineage in increasingly complex environments is also a concern. Mixes of on-premises and cloud storage, along with remote, hybrid, and in-person work environments, make data complexity and fragmentation a growing issue that requires attention.
Historically, capturing and maintaining data lineage has been resource-intensive work performed by analysts with a deep understanding of the business. Given the quantity of data and code involved, a manual approach is prohibitively expensive for most companies. Most software solutions provide a partial view, only showing data stored in relational databases or excluding logic found in computer programs.
The best option is to find a balance between manual and automated solutions that enable cost-effective data lineage frameworks.
Data lineage is more than a backward-looking activity. Organizations also need to maintain up-to-date lineage information as systems are changed and replaced over time. In an era of constant change, data lineage teams are challenged to incorporate new forms of data usage or data transformation.
Data lineage is becoming a critical part of any company’s data management strategy. In an information age where data and analytics are king, data lineage enables companies to maintain clean, transparent, traceable datasets. This empowers data-driven decision-making and encourages cross-collaborative efforts.
Data lineage addresses a central part of business operations. It provides a powerful sense of digital clarity as organizations navigate increasingly complex tools, systems, and regulatory landscapes.
Forward-thinking technical and non-technical leaders alike should be encouraging their organizations to improve their data lineage strategies. Investments in data lineage result in valuable new data assets that provide greater business agility and competitive advantage.
Data lineage isn’t just a nice-to-have—it’s essential for modern businesses navigating system changes, compliance pressures, and complex tech stacks. Whether you're migrating from legacy systems, improving analytics, or strengthening data governance, data lineage empowers teams to move faster, reduce risk, and make better decisions.
At Zengines, we’ve built our data lineage capabilities to do more than just document data flow. Our lineage engine integrates deeply with legacy codebases, like mainframe COBOL modules, and modern environments alike—giving you full visibility into how data is transformed, used, and governed across your systems. With AI-powered analysis, automation, and an intuitive interface, Zengines transforms lineage from a bottleneck into a business advantage.
Ready to see what intelligent data lineage can do for your organization?

Boston, MA - March 4, 2026 - Zengines, an AI technology company specializing in data migration and mainframe and AS/400 data lineage, today announced it has been selected to demo live at FinovateSpring 2026, taking place May 5–7 in San Diego, California.
Finovate is one of the most prestigious fintech event series, drawing over 1,200 senior-level executives from banks, credit unions, and financial institutions - including nine of the top 10 U.S. banks. Demo slots are awarded through a competitive application and selection process, with only the most innovative and market-ready fintech companies earning a spot on stage.
Zengines will use its seven-minute live demo - Finovate's signature format - to showcase its Data Lineage product: an AI-powered research and visualization tool purpose-built for large financial institutions managing the complexity of “black box” systems.
What sets Zengines apart? Traditional lineage tools show you the map - at the surface level. Zengines gives you the map and the context behind it - built exclusively for the decades-old COBOL, RPG, and PL/1 systems no one fully understands anymore.
Conventional tools produce technically accurate data flow diagrams. They cannot tell you why a calculation exists, what business rule drives it, or what it means for your regulatory obligations. That context is buried in the code itself - and Zengines is built to surface it.
Two things define the Zengines platform:
Together, these enable three outcomes financial institutions are struggling to achieve today:
"Being selected to demo at Finovate is a meaningful validation of what we've built," said Caitlyn Truong, CEO and Co-Founder of Zengines. "The financial institutions in that room are dealing with exactly the challenges our lineage tool was designed to solve - regulatory mandates, modernization programs, and the 'black box' problem of legacy systems that no one can fully see into. We're excited to show them that contextual lineage is what actually moves the needle."
“Finovate demos are about showing, not telling, and Zengines’ contextual data lineage is something that I’m sure our audience is going to really appreciate seeing at FinovateSpring this May,” said Greg Palmer, VP and Host of Finovate. "The FIs in our audience are wrestling with legacy infrastructure that's been accumulating complexity for decades. Zengines' ability to understand what's inside those systems before trying to modernize them or meet regulatory requirements is exactly the kind of solution that is likely to resonate with them.”
The Zengines Data Lineage tool is currently deployed at several Fortune 100 financial institutions across codebases spanning hundreds of thousands of source modules and tens of millions of lines of code, where teams use it at enterprise scale to compress analysis that previously took months into minutes.
FinovateSpring 2026 will feature RegTech, AI, data optimization, and risk management among its key themes - making it an ideal stage for Zengines to connect with the financial institutions and consulting partners navigating solutions to support these exact priorities.
Zengines is an AI technology company helping financial institutions trace, map, change, and move their data to manage legacy systems, modernize, and meet regulatory compliance requirements. Our Mainframe Data Lineage solution goes beyond traditional lineage tools by delivering contextual intelligence - not just where data flows, but the business logic, calculation rules, and institutional knowledge embedded in decades of legacy code. Our Data Migration platform accelerates data conversion programs using AI, reducing time and risk across core conversions, system implementations, and new client onboarding. Zengines serves financial services firms and their technology and service provider partners - where the cost of getting data wrong is highest.
Learn more at zengines.ai

For Chief Risk Officers and Chief Actuaries at European insurers, Solvency II compliance has always demanded rigorous governance over how capital requirements get calculated. But as the framework evolves — with Directive 2025/2 now in force and Member States transposing amendments by January 2027 — the bar for data transparency is rising. And for carriers still running actuarial calculations, policy administration, or claims processing on legacy mainframe or AS/400 systems, meeting that bar gets harder every year.
Solvency II isn't just about holding enough capital. It's about proving you understand why your models produce the numbers they do — where the inputs originate, how they flow through your systems, and what business logic transforms them along the way. For insurers whose critical calculations still run on legacy languages like COBOL or RPG, that proof is becoming increasingly difficult to produce.
At its core, Solvency II's data governance requirements are deceptively simple. Article 82 of the Directive requires that data used for calculating technical provisions must be accurate, complete, and appropriate.
The Delegated Regulation (Articles 19-21 and 262-264) adds specificity around governance, internal controls, and modeling standards. EIOPA's guidelines go further, recommending that insurers implement structured data quality frameworks with regular monitoring, documented traceability, and clear management rules.
In practice, this means insurers need to demonstrate:
For modern cloud-based platforms with well-documented APIs and metadata catalogs, these requirements are manageable. But for the legacy mainframe or AS/400 systems that still process the majority of core insurance transactions at many European carriers, this level of transparency requires genuine investigation.
Many large European insurers run core business logic on mainframe or AS/400 systems that have been evolving for 30, 40, even 50+ years. Policy administration, claims processing, actuarial calculations, reinsurance — the systems that generate the numbers feeding Solvency II models were often written in COBOL by engineers who retired decades ago.
The documentation hasn't kept pace. In many cases, it was never comprehensive to begin with. Business rules were encoded directly into procedural code, updated incrementally over the years, and rarely re-documented after changes. The result is millions of lines of code that effectively are the documentation — if you can read them.
This creates a compounding problem for Solvency II compliance:
When supervisors or internal audit ask how a specific reserve calculation works, or where a risk factor in your internal model originates, the answer too often requires someone to trace it through the code manually. That trace depends on a shrinking pool of specialists who understand legacy COBOL systems — specialists who are increasingly close to retirement across the European insurance industry.
Every year the knowledge gap widens. And every year, the regulatory expectations for data transparency increase.
The Solvency II framework isn't standing still. The amending Directive published in January 2025 introduces significant updates that amplify data governance demands:
National supervisors across Europe — from the ACPR in France to BaFin in Germany to the PRA in the UK — are tightening their expectations in parallel. The ACPR, for instance, has been specifically increasing its focus on the quality of data used by Solvency II functions, requiring actuarial, risk management, and internal audit teams to demonstrate traceability and solid evidence.
And the consequences of falling short are becoming tangible. Pillar 2 capital add-ons, supervisory intervention, and in severe cases, questions about the suitability of responsible executives — these aren't theoretical outcomes. They're tools that European supervisors have demonstrated willingness to use.
Every CRO at a European insurer knows the scenario: a supervisor asks a pointed question about how a specific technical provision was calculated, or requests that you trace a data element from source through to its appearance in a QRT submission. Your team scrambles. The mainframe or AS/400 specialists — already stretched thin — get pulled from other work. Days or weeks pass before the answer materializes.
These examinations are becoming more frequent and more granular. Supervisors aren't just asking for high-level descriptions of data flows. They want attribute-level traceability. They want to see the actual business logic that transforms raw policy data into the numbers in your regulatory reports.
For carriers whose critical processing runs through legacy mainframe or AS/400 systems, these requests expose a fundamental vulnerability: institutional knowledge that exists only in people's heads, supported by code that only a handful of specialists can interpret.
The question isn't whether your supervisor will ask. It's whether you'll be able to answer confidently when they do.
The good news: you don't have to replace your entire core system to solve the transparency problem. AI-powered tools can now parse legacy codebases and extract the data lineage that's been locked inside for decades.
This means:
The goal isn't to decommission your legacy systems overnight. It's to shine a light into the black box — so you can demonstrate the governance and control that Solvency II demands over systems that still run your most critical functions.
The European insurers who navigate Solvency II most smoothly aren't necessarily the ones with the newest technology. They're the ones who can clearly articulate how their risk management processes work — including the parts that run on infrastructure built before many of today's actuaries were born.
That clarity doesn't require a multi-year transformation program. It requires the ability to extract and document what your systems already do, in a format that satisfies both internal governance requirements and supervisory scrutiny.
For CROs, Chief Actuaries, and compliance leaders managing legacy technology estates, that capability is rapidly moving from nice-to-have to essential — especially as the 2027 transposition deadline for the amended Solvency II Directive approaches.
The carriers that invest in legacy system transparency now won't just be better prepared for their next supervisory review. They'll have a foundation for every modernization decision that follows — because you can't confidently change what you don't fully understand.
Zengines helps European insurers extract data lineage and calculation logic from legacy mainframe or AS/400 systems. Our AI-powered platform parses COBOL and RPG code and related infrastructure to deliver the transparency that Solvency II demands — without requiring a rip-and-replace modernization.

Every data migration has a moment of truth — when stakeholders ask, "Is everything actually correct in the new system?" Most teams don’t have the tools they need to answer that question.
Data migrations consume enormous time and budget. But for many organizations, the hardest part isn't moving the data — it's proving it arrived correctly. Post-migration reconciliation is the phase where confidence is either built or broken, where regulatory obligations are met or missed, and where the difference between a successful go-live and a costly rollback becomes clear.
For enterprises in financial services — and the consulting firms guiding them through modernization — reconciliation isn't optional. The goal of any modernization, vendor change, or M&A integration is value realization — and reconciliation is the bookend that proves the change worked, giving stakeholders and regulators the confidence to move forward.
Most migration programs follow a familiar arc: assess the source data, map it to the target schema, transform it to meet the new system's requirements, load it, and validate. On paper, it's linear. In practice, the validation step is where many programs stall.
Here's why. Reconciliation requires you to answer a deceptively simple question: Does the data in the new system accurately represent what existed in the old one — and does it behave the same way?
That question has layers. At the surface level, it's a record count exercise — did all 2.3 million accounts make it across? But beneath that, reconciliation means confirming that values transformed correctly, that business logic was preserved, that calculated fields produce the same results, and that no data was silently dropped or corrupted in transit.
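To illustrate the first two of those layers, here is a minimal sketch in Python, assuming source and target extracts are available as CSV files with a shared business key. The file and column names are hypothetical, and a real reconciliation would go much further than this.

```python
import pandas as pd

# Minimal sketch of two reconciliation layers: record counts and field-level value checks.
# File and column names ("account_id", "balance") are hypothetical; a real reconciliation
# would also replay transformation rules and handle type and rounding differences explicitly.
source = pd.read_csv("source_accounts.csv")   # extract from the legacy system
target = pd.read_csv("target_accounts.csv")   # extract from the new platform

# Layer 1: did every record make it across?
print(f"record counts: source={len(source)}, target={len(target)}")

# Layer 2: do values match record by record, joined on a shared business key?
merged = source.merge(target, on="account_id", suffixes=("_src", "_tgt"),
                      how="outer", indicator=True)
orphans = merged[merged["_merge"] != "both"]            # silently dropped or unexpected rows
matched = merged[merged["_merge"] == "both"]
mismatches = matched[matched["balance_src"].round(2) != matched["balance_tgt"].round(2)]

print(f"orphan records: {len(orphans)}, balance mismatches: {len(mismatches)}")
```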
For organizations subject to regulatory frameworks like BCBS 239, CDD, or CIP, reconciliation also means demonstrating an auditable trail. Regulators don't just want to know that data moved — they want evidence that you understood what moved, why it changed, and that you can trace any value back to its origin.
Three factors make post-migration reconciliation consistently harder than teams anticipate.
The most effective migration programs don't treat reconciliation as a phase that happens at the end. They build verifiability into every step — so that by the time data lands in the new system, the evidence trail already exists.
This requires two complementary capabilities: intelligent migration tooling that tracks every mapping and transformation decision, and deep lineage analysis that surfaces the logic embedded in legacy systems so you actually know what "correct" looks like.
The mapping and transformation phase of any migration is where most reconciliation problems originate. When a business analyst maps a source field to a target field, applies a transformation rule, and moves on, that decision needs to be recorded — not buried in a spreadsheet that gets versioned twelve times.
AI-powered migration tooling can accelerate this phase significantly. Rather than manually comparing schemas side by side, pattern recognition algorithms can predict field mappings based on metadata, data types, and sample values, then surface confidence scores so analysts can prioritize validation effort where it matters most. Transformation rules — whether written manually or generated through natural language prompts — are applied consistently and logged systematically.
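As a simplified illustration of how mapping candidates can be scored — not a description of any specific product's algorithm — the sketch below compares field names and declared types and prints a confidence score for the best match. The source and target field names are hypothetical.

```python
from difflib import SequenceMatcher

# Simplified sketch of scoring candidate field mappings by name and type similarity.
# Field names and types are hypothetical; real tooling would also profile sample values
# and learn from previously confirmed mappings.
source_fields = [("CUST_NM", "CHAR"), ("ACCT_BAL", "DECIMAL"), ("OPEN_DT", "DATE")]
target_fields = [("customer_name", "string"), ("account_balance", "decimal"), ("opened_date", "date")]

TYPE_EQUIV = {("CHAR", "string"), ("DECIMAL", "decimal"), ("DATE", "date")}

def name_similarity(a: str, b: str) -> float:
    """Fuzzy similarity between normalized field names, from 0.0 to 1.0."""
    normalize = lambda s: s.lower().replace("_", "")
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def confidence(src: tuple, tgt: tuple) -> float:
    """Blend name similarity with a bonus when the declared types are compatible."""
    bonus = 0.2 if (src[1], tgt[1]) in TYPE_EQUIV else 0.0
    return round(0.8 * name_similarity(src[0], tgt[0]) + bonus, 2)

for src in source_fields:
    best = max(target_fields, key=lambda tgt: confidence(src, tgt))
    print(f"{src[0]} -> {best[0]}  (confidence {confidence(src, best)})")
```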
The result is that when a stakeholder later asks, "Why does this field look different in the new system?" — the answer is traceable. You can point to the specific mapping decision, the transformation rule that was applied, and the sample data that validated the match. That traceability is foundational to reconciliation.
Reconciliation gets exponentially harder when the source system is a mainframe running COBOL code that was last documented in the 1990s. When the new system produces a different calculation result than the old one, someone has to determine whether that's a migration error or simply a difference in business logic between the two platforms.
This is where mainframe data lineage becomes critical. By parsing COBOL modules, job control language, SQL, and associated files, lineage analysis can surface the calculation logic, branching conditions, data paths, and field-level relationships that define how the legacy system actually works — not how anyone thinks it works.
Consider a practical example: after migrating to a modern cloud platform, a reconciliation check reveals that an interest accrual calculation in the new system produces a different result than the legacy mainframe. Without lineage, the investigation could take weeks. An analyst would need to manually trace the variable through potentially thousands of lines of COBOL code, across multiple modules, identifying every branch condition and upstream dependency.
With lineage analysis, that same analyst can search for the variable, see its complete data path, understand the calculation logic and conditional branches that affect it, and determine whether the discrepancy stems from a migration error or a legitimate difference in how the two systems compute the value. What took weeks now takes hours — and the finding is documented, not locked in someone's head.
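To show what such a search can look like once the lineage has been extracted from the code, here is a minimal sketch that walks upstream from a target variable through a small, hand-built lineage graph. The field names, module names, and graph structure are hypothetical.

```python
# Minimal sketch: walk upstream through an already-extracted lineage graph.
# The graph below is hand-built and hypothetical; in practice it would be produced by
# parsing COBOL modules, JCL, and copybooks. Each entry maps a field to the fields it
# is derived from, along with the module that performs the computation.
lineage = {
    "ACCRUED-INTEREST": [("DAILY-RATE", "INTCALC.cbl"), ("PRINCIPAL-BAL", "INTCALC.cbl")],
    "DAILY-RATE":       [("ANNUAL-RATE", "RATECONV.cbl"), ("DAY-COUNT-BASIS", "RATECONV.cbl")],
    "PRINCIPAL-BAL":    [("POSTED-TXNS", "LEDGERUPD.cbl")],
}

def trace_upstream(field: str, depth: int = 0) -> None:
    """Print every upstream field and the module that derives the current field from it."""
    for source_field, module in lineage.get(field, []):
        print("  " * depth + f"{field} <- {source_field}  [{module}]")
        trace_upstream(source_field, depth + 1)

trace_upstream("ACCRUED-INTEREST")
```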
The real power of combining intelligent migration with legacy lineage is that reconciliation becomes a structured, evidence-based process rather than an ad hoc investigation.
When you can trace a value from its origin in a COBOL module, through the transformation rules applied during migration, to its final state in the target system — you have end-to-end data provenance. For regulated financial institutions, that provenance is exactly what auditors and compliance teams need. For consulting firms delivering these programs, it's the difference between a defensible methodology and a best-effort exercise.
For Tier 1 consulting firms and systems integrators delivering modernization programs, post-migration reconciliation is often where project timelines stretch and client confidence erodes. The migration itself may seem to go smoothly, but then weeks of reconciliation cycles — investigating discrepancies, tracing values back through legacy systems, re-running transformation logic — consume budget and test relationships.
Tooling that accelerates both sides of this equation changes the engagement model. Migration mapping and transformation that would have taken a team months can be completed by a smaller team in weeks. Lineage analysis that would have required dedicated mainframe SMEs for months of manual code review becomes an interactive research exercise. And the reconciliation evidence is built into the process, not assembled after the fact.
This translates directly to engagement economics: faster delivery, reduced SME dependency, lower risk of costly rework, and a more compelling value proposition when scoping modernization programs.
Whether you're leading a migration internally or advising a client through one, these principles will strengthen your reconciliation outcomes.
The goal of any modernization program isn't the migration itself — it's the value that comes after. Faster operations, better insights, reduced risk, regulatory confidence. Reconciliation is the bookend that earns trust in the change and clears the path to that value.
Zengines combines AI-powered data migration with mainframe data lineage to give enterprises and consulting firms full traceability from source to target — so you can prove the migration worked and move forward with confidence.