
Can Your Bank Prove BCBS-239 Compliance? The Data Lineage Reality Check

November 4, 2025
Caitlyn Truong

The 2008 financial crisis exposed a shocking truth: major banks couldn't accurately report their own risk exposures in real-time.

When Lehman Brothers collapsed, regulators discovered that institutions didn't know their actual exposure to toxic assets -- not because they were hiding it, but because they genuinely couldn't aggregate their own data fast enough.

Fifteen years and billions in compliance spending later, only 2 out of 31 Global Systemically Important Banks fully comply with BCBS-239 -- the regulation designed to prevent this exact problem.

The bottleneck? Data lineage.

Who Must Comply and When

BCBS-239 applies from January 1, 2016 for Global Systemically Important Banks (G-SIBs) and is recommended by national supervisors for Domestic Systemically Important Banks (D-SIBs) three years after their designation. In practice, this means hundreds of banks worldwide are now expected to comply.

Unlike regulations with fixed annual filing deadlines, BCBS-239 is an ongoing compliance requirement. Supervisors can test a bank's compliance through occasional, short-deadline requests on selected risk issues, gauging its capacity to aggregate risk data rapidly and produce risk reports.

Think of it as a fire drill that can happen at any moment -- and with increasingly serious consequences for failure.

The Sobering Statistics

More than a decade after publication and eight years past the compliance deadline, the results are dismal. Only 2 out of 31 assessed Global Systemically Important Banks fully comply with all principles, and no single principle has been fully implemented by all banks.

Even more troubling, the compliance level across all principles barely improved from an average of 3.14 in 2019 to 3.17 in 2022 on a scale of 1 ("non-compliant") to 4 ("fully compliant"). At this rate of improvement, full compliance is decades away.

What Happens If Your Bank Fails BCBS-239 Compliance?

The consequences are escalating. The ECB guide explicitly mentions:

  • Enforcement actions against the institution
  • Capital add-ons to compensate for data risk
  • Removal of responsible executives who fail to drive compliance
  • Operational restrictions on new business lines or acquisitions

The Basel Committee has made clear that banks' progress towards BCBS-239 compliance in recent years has not been satisfactory, and that supervisory authorities are expected to take stronger measures to accelerate implementation.

What Banks Are Doing (And Why It's Not Enough)

Most banks have responded to BCBS-239 with predictable tactics:

  • Governance restructuring: Creating Chief Data Officer roles and data governance committees
  • Policy documentation: Writing comprehensive data management policies and frameworks
  • Technology investments: Purchasing disparate tools like data catalogs, metadata management tools, and master data management platforms
  • Remediation programs: Launching multi-year, multi-million dollar compliance initiatives

These tactics are positive steps forward -- necessary, but not sufficient for compliance. In other words, banks are checking boxes without fundamentally solving the problem.

The issue? Banks are treating BCBS-239 like a project with an end date, when it's actually an operational capability that must be demonstrated continuously.

The Data Lineage Bottleneck

Among the 14 principles, one capability has emerged as the make-or-break factor for compliance: data lineage.

Data lineage has been identified as one of the key challenges banks face in aligning with the BCBS-239 principles, as it is one of the more time-consuming and resource-intensive activities the regulation demands.

Why Data Lineage Is Different

Data lineage -- the ability to trace data from its original source through every transformation to its final destination -- sits at the intersection of virtually every BCBS-239 principle. The European Central Bank refers to data lineage as "a minimum requirement of data governance" in the latest BCBS 239 recommendations.

Here's why lineage is uniquely difficult:

It's invisible until you need it.
Unlike a data governance policy you can show an auditor or a data quality dashboard you can pull up, lineage is about proving flows, transformations, and dependencies that exist across dozens or hundreds of systems. You can't fake it in a PowerPoint.

It crosses organizational and system boundaries.
Complete lineage requires cooperation between IT, risk, finance, operations, and business units -- each with their own priorities, systems, and definitions. Further, data hand-offs occur in and between systems, databases, and files, which adds to the complexity of connecting what happens at each hand-off. Regulators are increasingly requiring detailed traceability of reported information, which can only be achieved through lineage across organizations and systems.

It must be current and complete.
The ECB requires "complete and up-to-date data lineages on data attribute level (starting from data capture and including extraction, transformation and loading) for the risk indicators, and their critical data elements." A lineage document from six months ago is worthless if your systems have changed.

It must work under pressure.
Supervisors increasingly require institutions to demonstrate the effectiveness of their data frameworks through on-site inspections and fire drills, with data lineage providing the audit trail necessary for these reviews. When a regulator asks "prove this number came from where you say it came from," you have hours -- not days -- to respond.

The Eight Principles That Demand Data Lineage Proof

While 11 of the 14 principles benefit from good data lineage, regulatory guidance makes it explicitly mandatory for eight:

  • Principle 2 (Data Architecture): Demonstrate integrated data architecture through documented lineage flows
  • Principle 3 (Accuracy & Integrity): Prove data accuracy by showing traceable lineage from source to report
  • Principle 4 (Completeness): Demonstrate comprehensive risk coverage through lineage mapping
  • Principle 6 (Adaptability): Respond to ad-hoc requests using lineage to quickly identify relevant data
  • Principle 7 (Report Accuracy): Validate report numbers through documented lineage and audit trails
  • Principles 12-14 (Supervisory Review): Provide lineage evidence during audits and fire drills

The Technology Gap: Why Traditional Tools Fall Short

Most banks have invested heavily in data catalogs, metadata management platforms, and governance frameworks. Yet they still can't produce lineage evidence under audit conditions. Why?

Traditional approaches have three fatal flaws:

1. Manual Documentation

Excel-based lineage documentation becomes outdated within weeks as systems change. By the time you finish documenting one data flow, three others have been modified. Manual approaches simply can't keep pace with modern banking environments.

2. Point Solutions That Only Support Newer Applications

Modern data lineage tools can map cloud warehouses and APIs, but they hit a wall when they encounter legacy mainframe systems. They can't parse COBOL code, decode JCL job schedulers, or trace data through decades-old custom applications -- exactly where banks' most critical risk calculations often live.

3. Incomplete Coverage

Lineage that stops at the data warehouse is fundamentally incomplete under BCBS-239's end-to-end data lineage requirements. Regulators want to see the complete path -- from original source system through every transformation, including hard-coded business logic in legacy applications, to the final risk report. Most tools miss 40-70% of the actual transformation logic.

How AI-Powered Data Lineage Changes the Game

This is where AI-powered solutions like Zengines fundamentally differ from traditional approaches.

Instead of manually documenting lineage, Zengines can automatically and comprehensively:

  • Parse legacy mainframe code (COBOL, RPG, Focus, etc.) to extract data flows and transformation logic
  • Trace calculations backward from any report field to ultimate source systems
  • Document relationships between tables, fields, programs, files and job schedulers
  • Generate audit-ready evidence in minutes instead of months
  • Maintain currency by updating lineage as code changes, so documentation never goes stale
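To make backward tracing concrete, here is a toy sketch -- not Zengines' implementation -- of walking a lineage graph from a report field back to its ultimate sources. The field names and graph structure are invented for illustration; in practice the graph is extracted from code and schemas, not hand-written:

```python
from collections import deque

# Hypothetical lineage graph: each field points back to the fields
# it is derived from. All names here are invented for illustration.
LINEAGE = {
    "report.total_exposure": ["calc.exposure_sum"],
    "calc.exposure_sum": ["staging.exposure_amt", "staging.fx_rate"],
    "staging.exposure_amt": ["mainframe.LOAN-BAL"],
    "staging.fx_rate": ["mainframe.FX-RATE-TBL"],
}

def trace_to_sources(field, graph):
    """Walk backward from a report field to its ultimate source fields."""
    sources, seen, queue = [], {field}, deque([field])
    while queue:
        current = queue.popleft()
        upstream = graph.get(current, [])
        if not upstream:          # no inputs recorded -> original source
            sources.append(current)
        for parent in upstream:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sources

print(trace_to_sources("report.total_exposure", LINEAGE))
# -> ['mainframe.LOAN-BAL', 'mainframe.FX-RATE-TBL']
```

The real work, of course, is building that graph across millions of lines of legacy code; once it exists, answering "where did this number come from?" is a fast traversal rather than a manual investigation.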

Solving the "Black Box" Problem

For many banks, the biggest lineage gap isn't in modern systems -- it's in legacy mainframes where critical risk calculations were encoded 20-60 years ago by developers who have long since retired. These systems are effectively "black boxes": they produce numbers, but no one can explain how.

Zengines' Mainframe Data Lineage capability specifically addresses this challenge by:

  • Parsing COBOL and RPG modules to expose calculation logic and data dependencies
  • Tracing variables across millions of lines of legacy code
  • Identifying hard-coded values, conditional logic, and branching statements
  • Visualizing data flows across interconnected mainframe programs and external files
  • Extracting "requirements" that were never formally documented but are embedded in code

This capability is essential for banks that need to prove how legacy calculations work -- whether for regulatory compliance, system modernization, or simply understanding their own risk models.

Assessment: Can Your Bank Prove Compliance Right Now?

The critical question isn't "Do we have data lineage?" It's "Can we prove compliance through data lineage right now, under audit conditions, with short notice?"

Most banks would answer: "Well, sort of..."

That's not good enough anymore.

We've translated ECB supervisory expectations into a practical, principle-by-principle checklist. This isn't about aspirational capabilities or future roadmaps -- it's about what you can demonstrate today, under audit conditions, with short notice.

The Bottom Line

The bottleneck to full BCBS-239 compliance is clear: data lineage.

Traditional approaches -- manual documentation, point solutions, incomplete coverage -- can't solve this problem fast enough. The compliance deadline was 2016. Enforcement is escalating. Fire drills are becoming more frequent and demanding.

Banks that solve the lineage challenge with AI-powered automation will demonstrate compliance in hours instead of months. Those that don't will continue struggling with the same gaps, facing increasing regulatory pressure, and risking enforcement actions.

The technology to solve this exists today. The question is: how long can your bank afford to wait?

Schedule a demo with our team today to get started.

You may also like

For Chief Risk Officers and Chief Actuaries at European insurers, Solvency II compliance has always demanded rigorous governance over how capital requirements get calculated. But as the framework evolves — with Directive 2025/2 now in force and Member States transposing amendments by January 2027 — the bar for data transparency is rising. And for carriers still running actuarial calculations, policy administration, or claims processing on legacy mainframes or AS/400s, meeting that bar gets harder every year.

Solvency II isn't just about holding enough capital. It's about proving you understand why your models produce the numbers they do — where the inputs originate, how they flow through your systems, and what business logic transforms them along the way. For insurers whose critical calculations still run on legacy languages like COBOL or RPG, that proof is becoming increasingly difficult to produce.

What Solvency II Actually Requires of Your Data

At its core, Solvency II's data governance requirements are deceptively simple. Article 82 of the Directive requires that data used for calculating technical provisions must be accurate, complete, and appropriate.

The Delegated Regulation (Articles 19-21 and 262-264) adds specificity around governance, internal controls, and modeling standards. EIOPA's guidelines go further, recommending that insurers implement structured data quality frameworks with regular monitoring, documented traceability, and clear management rules.

In practice, this means insurers need to demonstrate:

  • Data traceability: A clear, auditable path from source data through every transformation to the final regulatory output — whether that's a Solvency Capital Requirement calculation, a technical provision, or a Quantitative Reporting Template submission.
  • Calculation transparency: How does a policy record become a reserve estimate? What actuarial assumptions apply, and where do they come from?
  • Data quality governance: Structured frameworks with defined roles, KPIs, and continuous monitoring — not just point-in-time checks during reporting season.
  • Impact analysis capability: If an input changes, what downstream calculations and reports are affected?

For modern cloud-based platforms with well-documented APIs and metadata catalogs, these requirements are manageable. But for the legacy mainframe or AS/400 systems that still process the majority of core insurance transactions at many European carriers, this level of transparency requires genuine investigation.

The Legacy System Problem That Keeps Getting Worse

Many large European insurers run core business logic on mainframe or AS/400 systems that have been evolving for 30, 40, even 50+ years. Policy administration, claims processing, actuarial calculations, reinsurance — the systems that generate the numbers feeding Solvency II models were often written in COBOL by engineers who retired decades ago.

The documentation hasn't kept pace. In many cases, it was never comprehensive to begin with. Business rules were encoded directly into procedural code, updated incrementally over the years, and rarely re-documented after changes. The result is millions of lines of code that effectively are the documentation — if you can read them.

This creates a compounding problem for Solvency II compliance:

When supervisors or internal audit ask how a specific reserve calculation works, or where a risk factor in your internal model originates, the answer too often requires someone to trace it through the code manually. That trace depends on a shrinking pool of specialists who understand legacy COBOL systems — specialists who are increasingly close to retirement across the European insurance industry.

Every year the knowledge gap widens. And every year, the regulatory expectations for data transparency increase.

The Regulatory Pressure Is Intensifying

The Solvency II framework isn't standing still. The amending Directive published in January 2025 introduces significant updates that amplify data governance demands:

  • Enhanced ORSA requirements now mandate analysis of macroeconomic scenarios and systemic risk conditions — requiring even more data inputs with clear provenance.
  • Expanded reporting obligations split the Solvency and Financial Condition Report into separate sections for policyholders and market professionals, each requiring precise, auditable data.
  • New audit requirements mandate that the balance sheet disclosed in the SFCR be subject to external audit — increasing scrutiny on the data chain underlying reported figures.
  • Climate risk integration requires insurers to assess and report on climate-related financial risks, adding new data dimensions that must be traceable through existing systems.

National supervisors across Europe — from the ACPR in France to BaFin in Germany to the PRA in the UK — are tightening their expectations in parallel. The ACPR, for instance, has been specifically increasing its focus on the quality of data used by Solvency II functions, requiring actuarial, risk management, and internal audit teams to demonstrate traceability and solid evidence.

And the consequences of falling short are becoming tangible. Pillar 2 capital add-ons, supervisory intervention, and in severe cases, questions about the suitability of responsible executives — these aren't theoretical outcomes. They're tools that European supervisors have demonstrated willingness to use.

The Supervisory Fire Drill

Every CRO at a European insurer knows the scenario: a supervisor asks a pointed question about how a specific technical provision was calculated, or requests that you trace a data element from source through to its appearance in a QRT submission. Your team scrambles. The mainframe or AS/400 specialists — already stretched thin — get pulled from other work. Days or weeks pass before the answer materializes.

These examinations are becoming more frequent and more granular. Supervisors aren't just asking for high-level descriptions of data flows. They want attribute-level traceability. They want to see the actual business logic that transforms raw policy data into the numbers in your regulatory reports.

For carriers whose critical processing runs through legacy mainframe or AS/400s, these requests expose a fundamental vulnerability: institutional knowledge that exists only in people's heads, supported by code that only a handful of specialists can interpret.

The question isn't whether your supervisor will ask. It's whether you'll be able to answer confidently when they do.

Extracting Lineage from Legacy Systems

The good news: you don't have to replace your entire core system to solve the transparency problem. AI-powered tools can now parse legacy codebases and extract the data lineage that's been locked inside for decades.

This means:

  • Automated tracing of how data flows through COBOL and RPG modules, job schedulers, and database operations — across thousands of programs, without needing to know where to look.
  • Calculation logic extraction that reveals the actual mathematical expressions and business rules governing how risk data gets transformed — not just that Field A maps to Field B, but what happens during that transformation.
  • Visual mapping of branching conditions and downstream dependencies, so compliance teams can answer supervisor questions in hours instead of weeks.
  • Preserved institutional knowledge that doesn't walk out the door when your legacy specialists retire — because the logic is documented in a searchable, auditable format.

The goal isn't to decommission your legacy systems overnight. It's to shine a light into the black box — so you can demonstrate the governance and control that Solvency II demands over systems that still run your most critical functions.

From Compliance Burden to Strategic Advantage

The European insurers who navigate Solvency II most smoothly aren't necessarily the ones with the newest technology. They're the ones who can clearly articulate how their risk management processes work — including the parts that run on infrastructure built before many of today's actuaries were born.

That clarity doesn't require a multi-year transformation program. It requires the ability to extract and document what your systems already do, in a format that satisfies both internal governance requirements and supervisory scrutiny.

For CROs, Chief Actuaries, and compliance leaders managing legacy technology estates, that capability is rapidly moving from nice-to-have to essential — especially as the 2027 transposition deadline for the amended Solvency II Directive approaches.

The carriers that invest in legacy system transparency now won't just be better prepared for their next supervisory review. They'll have a foundation for every modernization decision that follows — because you can't confidently change what you don't fully understand.

Zengines helps European insurers extract data lineage and calculation logic from legacy mainframe or AS/400 systems. Our AI-powered platform parses COBOL and RPG code and related infrastructure to deliver the transparency that Solvency II demands — without requiring a rip-and-replace modernization.

Every data migration has a moment of truth -- when stakeholders ask, "Is everything actually correct in the new system?" Most teams don't have the tools they need to answer that question.

Data migrations consume enormous time and budget. But for many organizations, the hardest part isn't moving the data — it's proving it arrived correctly. Post-migration reconciliation is the phase where confidence is either built or broken, where regulatory obligations are met or missed, and where the difference between a successful go-live and a costly rollback becomes clear.

For enterprises in financial services — and the consulting firms guiding them through modernization — reconciliation isn't optional. The goal of any modernization, vendor change, or M&A integration is value realization — and reconciliation is the bookend that proves the change worked, giving stakeholders and regulators the confidence to move forward.

The Reconciliation Gap

Most migration programs follow a familiar arc: assess the source data, map it to the target schema, transform it to meet the new system's requirements, load it, and validate. On paper, it's linear. In practice, the validation step is where many programs stall.

Here's why. Reconciliation requires you to answer a deceptively simple question: Does the data in the new system accurately represent what existed in the old one — and does it behave the same way?

That question has layers. At the surface level, it's a record count exercise — did all 2.3 million accounts make it across? But beneath that, reconciliation means confirming that values transformed correctly, that business logic was preserved, that calculated fields produce the same results, and that no data was silently dropped or corrupted in transit.
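Those layers can be expressed as code. A minimal, illustrative sketch -- the records, fields, and transformation rule are all invented, and a real reconciliation would run against the actual source and target systems:

```python
# Invented sample extracts for illustration only.
source = [
    {"acct": "1001", "balance": "2500.00", "state": "CA"},
    {"acct": "1002", "balance": "980.50", "state": "NY"},
]
target = [
    {"acct": "1001", "balance": "2500.00", "state": "California"},
    {"acct": "1002", "balance": "980.50", "state": "New York"},
]

# The migration intentionally expanded state abbreviations, so
# "correct" must account for that transformation rule.
STATE_MAP = {"CA": "California", "NY": "New York"}

def reconcile(src, tgt):
    issues = []
    if len(src) != len(tgt):                     # layer 1: record counts
        issues.append(f"count mismatch: {len(src)} vs {len(tgt)}")
    tgt_by_key = {r["acct"]: r for r in tgt}
    for s in src:                                # layer 2: field values
        t = tgt_by_key.get(s["acct"])
        if t is None:
            issues.append(f"{s['acct']}: record dropped in migration")
            continue
        if t["balance"] != s["balance"]:
            issues.append(f"{s['acct']}: balance changed in transit")
        if t["state"] != STATE_MAP[s["state"]]:  # transformation-aware check
            issues.append(f"{s['acct']}: state not transformed as expected")
    return issues

print(reconcile(source, target))  # -> [] (all checks pass)
```

The key point: the check for `state` can only be written because the transformation rule is known and recorded -- reconciliation without that knowledge degenerates into guessing which differences are intentional.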

For organizations subject to regulatory frameworks like BCBS 239, CDD, or CIP, reconciliation also means demonstrating an auditable trail. Regulators don't just want to know that data moved — they want evidence that you understood what moved, why it changed, and that you can trace any value back to its origin.

Why Reconciliation Is So Difficult

Three factors make post-migration reconciliation consistently harder than teams anticipate.

  • The source system is often a black box. When you're migrating off a legacy mainframe or a decades-old custom application, the business logic embedded in that system may not be documented anywhere. Interest calculations, fee structures, conditional processing rules — these live in COBOL modules, job schedulers, and tribal knowledge. You can't reconcile output values if you don't understand how they were originally computed.
  • Transformation introduces ambiguity. Data rarely moves one-to-one. Fields get split, concatenated, reformatted, and coerced into new data types. A state abbreviation becomes a full state name. A combined name field becomes separate first and last name columns. Each transformation is a potential point of divergence, and without a systematic way to trace what happened, discrepancies become investigative puzzles rather than straightforward fixes.
  • Scale makes manual verification impossible. A financial institution migrating off a mainframe might be dealing with tens of thousands of data elements spread across thousands of modules. Spot-checking a handful of records doesn't provide the coverage that stakeholders and regulators require. But exhaustive manual comparison across millions of records, hundreds of fields, and complex calculated values simply doesn't scale.
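To illustrate the second point, a hedged sketch of a transformation that records a trace entry as it runs, so a discrepancy found weeks later can be tied back to the rule that produced it. The field names and the naive space-split rule are illustrative only:

```python
# Illustrative only: every transformation appends a trace entry.
trace_log = []

def split_name(record):
    """Split a combined 'name' field into first/last name columns.
    (Naive space split -- middle names would need a richer rule.)"""
    first, _, last = record["name"].partition(" ")
    trace_log.append({
        "rule": "split_name",
        "source_field": "name",
        "target_fields": ["first_name", "last_name"],
        "input": record["name"],
        "output": (first, last),
    })
    return {"first_name": first, "last_name": last}

row = split_name({"name": "Ada Lovelace"})
print(row)        # -> {'first_name': 'Ada', 'last_name': 'Lovelace'}
```

With a log like this, "why does this field look different?" is answered by looking up the trace entry, not by reverse-engineering the pipeline.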

A Better Approach: Build Reconciliation Into the Migration, Not After It

The most effective migration programs don't treat reconciliation as a phase that happens at the end. They build verifiability into every step — so that by the time data lands in the new system, the evidence trail already exists.

This requires two complementary capabilities: intelligent migration tooling that tracks every mapping and transformation decision, and deep lineage analysis that surfaces the logic embedded in legacy systems so you actually know what "correct" looks like.

Getting the Data There — With Full Traceability

The mapping and transformation phase of any migration is where most reconciliation problems originate. When a business analyst maps a source field to a target field, applies a transformation rule, and moves on, that decision needs to be recorded — not buried in a spreadsheet that gets versioned twelve times.

AI-powered migration tooling can accelerate this phase significantly. Rather than manually comparing schemas side by side, pattern recognition algorithms can predict field mappings based on metadata, data types, and sample values, then surface confidence scores so analysts can prioritize validation effort where it matters most. Transformation rules — whether written manually or generated through natural language prompts — are applied consistently and logged systematically.

The result is that when a stakeholder later asks, "Why does this field look different in the new system?" — the answer is traceable. You can point to the specific mapping decision, the transformation rule that was applied, and the sample data that validated the match. That traceability is foundational to reconciliation.
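As a rough illustration of confidence-scored mapping prediction, here is a toy heuristic using only field-name similarity. Production tooling would also weigh data types and sample values, and these schema names are invented:

```python
from difflib import SequenceMatcher

# Invented source (legacy) and target (modern) schemas.
source_fields = ["CUST_NM", "ACCT_BAL", "OPEN_DT"]
target_fields = ["customer_name", "account_balance", "open_date"]

def name_similarity(a, b):
    """Crude normalized field-name similarity in [0, 1]."""
    norm = lambda s: s.lower().replace("_", "")
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def predict_mappings(src, tgt):
    """Suggest the best target field for each source field, with a
    score an analyst can use to prioritize manual validation."""
    suggestions = {}
    for s in src:
        best = max(tgt, key=lambda t: name_similarity(s, t))
        suggestions[s] = (best, round(name_similarity(s, best), 2))
    return suggestions

for src, (tgt, score) in predict_mappings(source_fields, target_fields).items():
    print(f"{src:>8} -> {tgt:<15} confidence {score}")
```

Low-confidence suggestions get routed to an analyst; high-confidence ones get sampled for validation. Either way, the suggestion, the score, and the final decision are logged rather than buried in a spreadsheet.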

Understanding What "Right" Actually Means — Legacy System Lineage

Reconciliation gets exponentially harder when the source system is a mainframe running COBOL code that was last documented in the 1990s. When the new system produces a different calculation result than the old one, someone has to determine whether that's a migration error or simply a difference in business logic between the two platforms.

This is where mainframe data lineage becomes critical. By parsing COBOL modules, job control language, SQL, and associated files, lineage analysis can surface the calculation logic, branching conditions, data paths, and field-level relationships that define how the legacy system actually works — not how anyone thinks it works.

Consider a practical example: after migrating to a modern cloud platform, a reconciliation check reveals that an interest accrual calculation in the new system produces a different result than the legacy mainframe. Without lineage, the investigation could take weeks. An analyst would need to manually trace the variable through potentially thousands of lines of COBOL code, across multiple modules, identifying every branch condition and upstream dependency.

With lineage analysis, that same analyst can search for the variable, see its complete data path, understand the calculation logic and conditional branches that affect it, and determine whether the discrepancy stems from a migration error or a legitimate difference in how the two systems compute the value. What took weeks now takes hours — and the finding is documented, not locked in someone's head.

Bringing Both Sides Together

The real power of combining intelligent migration with legacy lineage is that reconciliation becomes a structured, evidence-based process rather than an ad hoc investigation.

When you can trace a value from its origin in a COBOL module, through the transformation rules applied during migration, to its final state in the target system — you have end-to-end data provenance. For regulated financial institutions, that provenance is exactly what auditors and compliance teams need. For consulting firms delivering these programs, it's the difference between a defensible methodology and a best-effort exercise.

What This Means for Consulting Firms

For Tier 1 consulting firms and systems integrators delivering modernization programs, post-migration reconciliation is often where project timelines stretch and client confidence erodes. The migration itself may seem to go smoothly, but then weeks of reconciliation cycles -- investigating discrepancies, tracing values back through legacy systems, re-running transformation logic -- consume budget and test relationships.

Tooling that accelerates both sides of this equation changes the engagement model. Migration mapping and transformation that would have taken a team months can be completed by a smaller team in weeks. Lineage analysis that would have required dedicated mainframe SMEs for months of manual code review becomes an interactive research exercise. And the reconciliation evidence is built into the process, not assembled after the fact.

This translates directly to engagement economics: faster delivery, reduced SME dependency, lower risk of costly rework, and a more compelling value proposition when scoping modernization programs.

Practical Steps for Stronger Reconciliation

Whether you're leading a migration internally or advising a client through one, these principles will strengthen your reconciliation outcomes.

  • Start with lineage, not mapping. Before you map a single field, understand the business logic in the source system. What calculations are performed? What conditional branches exist? What upstream dependencies feed the values you're migrating? This upfront investment pays for itself many times over during reconciliation.
  • Track every transformation decision. Every mapping, every transformation rule, every data coercion should be logged and traceable. When discrepancies surface during reconciliation — and they will — you need to be able to reconstruct exactly what happened to any given value.
  • Profile data before and after. Automated data profiling at both the source and target gives you aggregate-level validation — record counts, completeness rates, value distributions, data type consistency — before you ever get to record-level comparison. This is your first line of defense and often catches systemic issues early.
  • Don't treat reconciliation as pass/fail. Not every discrepancy is an error. Some reflect intentional business logic differences between old and new systems. The goal isn't zero discrepancies — it's understanding and documenting every discrepancy so stakeholders can make informed decisions about go-live readiness.
  • Build for repeatability. If your organization does migrations frequently — onboarding new clients, integrating acquisitions, switching vendors — your reconciliation approach should be systematized. What you learn from one migration should make the next one faster and more reliable.
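The "profile before and after" step can be sketched in a few lines. This toy example computes aggregate statistics for invented source and target extracts and flags drift; real profiling would query the actual systems:

```python
# Toy profiling sketch; extracts and field names are invented.
def profile(rows, fields):
    stats = {}
    for f in fields:
        values = [r.get(f) for r in rows]
        present = [v for v in values if v not in (None, "")]
        stats[f] = {
            "count": len(values),
            "completeness": len(present) / len(values) if values else 0.0,
            "distinct": len(set(present)),
        }
    return stats

source = [{"acct": "1001", "rate": "3.5"}, {"acct": "1002", "rate": "4.0"}]
target = [{"acct": "1001", "rate": "3.5"}, {"acct": "1002", "rate": None}]

src_stats = profile(source, ["acct", "rate"])
tgt_stats = profile(target, ["acct", "rate"])

for f in ["acct", "rate"]:
    status = "MATCH" if src_stats[f] == tgt_stats[f] else "DRIFT"
    print(f, status)   # 'rate' drifts: a value was silently dropped
```

Aggregate checks like this catch systemic problems (a silently dropped column, a truncated load) cheaply, before anyone spends time on record-level comparison.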

The goal of any modernization program isn't the migration itself — it's the value that comes after. Faster operations, better insights, reduced risk, regulatory confidence. Reconciliation is the bookend that earns trust in the change and clears the path to that value.

Zengines combines AI-powered data migration with mainframe data lineage to give enterprises and consulting firms full traceability from source to target — so you can prove the migration worked and move forward with confidence.

Mainframes aren't going anywhere overnight. Despite the industry's push toward cloud migration and modernization, the reality is that many financial institutions still rely on mainframe systems to process millions of daily transactions, calculate interest accruals, manage account records, and run core business operations. And they will for years to come.

Modernization is the eventual reality for every organization still running on mainframe. But "eventual" is doing a lot of heavy lifting in that sentence. For many financial institutions, a full modernization effort is on the roadmap but years away — dependent on budget cycles, vendor timelines, regulatory considerations, and a hundred other competing priorities. In the meantime, these systems still need to be maintained — and that's where things get increasingly risky.

The hidden cost of "just making a change"

When a business requirement changes — say, a new regulation requires a different calculation methodology, or a product team needs to update how accrued interest is computed — someone has to go into the mainframe and update the code. Sounds straightforward enough. Except it's not.

Mainframe COBOL codebases are often decades old. They've been written, rewritten, and patched by generations of engineers, many of whom have long since left the organization. A single mainframe environment can contain tens of thousands of COBOL modules, each with hundreds or thousands of lines of code. Variables branch across modules. Tables are read and updated in ways that aren't always documented. Conditional logic sends data down different paths depending on record types, dates, or account classifications that may have made perfect sense in 1998 but aren't intuitive to anyone working today.

Before a mainframe engineer can write a single new line of code, they need to answer a deceptively simple question: What will this change affect?

And answering that question — tracing a variable backward through modules, understanding which tables get updated, identifying upstream and downstream dependencies — can take weeks or even months of manual investigation. One engineer we've worked with estimated that investigating the impact of a change takes substantially longer than actually making the change.

Why mainframe management feels like navigating a black box

The term "black box" gets used a lot in mainframe conversations, and for good reason. The challenge isn't that the code doesn't work — it usually works remarkably well. The challenge is that nobody fully understands how and why it works the way it does.

Consider what a typical investigation looks like without modern tooling. An engineer receives a request from the business: "We need to update how we calculate X." To comply, that engineer has to:

  • Determine a relevant starting point for researching "X", which may be a business term or a system term; this starting point could be, for example, a system variable in a frequently accessed COBOL module
  • Open the relevant COBOL module (which might be thousands of lines long)
  • Find and trace the variable in question through the code
  • Identify every table and field it touches
  • Follow it across modules when it gets called or referenced elsewhere, keeping track of pathways where the variable may take on a new name
  • Map out conditional branching logic that might treat the variable differently based on account type, date ranges, or other factors
  • Determine which downstream processes depend on the output
  • Document all of this before they can even begin to assess whether the change is safe to make
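Even the earliest of those steps, finding every module that touches a field, is laborious at mainframe scale. As a rough illustration only, here is a minimal Python sketch that scans a few invented COBOL fragments for whole-word references to a field. The module names and fields (`ACCRUE01`, `WS-ACCRUED-INT`, and so on) are hypothetical; real lineage tooling parses the language properly rather than searching text.

```python
import re

# Hypothetical, heavily simplified COBOL fragments keyed by module name.
# A real module can run to thousands of lines; this is illustration only.
MODULES = {
    "ACCRUE01": """
        MOVE WS-DAILY-RATE TO WS-ACCRUED-INT
        COMPUTE WS-ACCRUED-INT = WS-BALANCE * WS-DAILY-RATE
    """,
    "STMT09": """
        MOVE WS-ACCRUED-INT TO STMT-INT-LINE
    """,
    "FEES02": """
        COMPUTE WS-FEE = WS-BALANCE * 0.01
    """,
}

def find_references(field):
    """Return {module: [line numbers]} where the field appears as a whole word."""
    pattern = re.compile(rf"\b{re.escape(field)}\b")
    hits = {}
    for name, source in MODULES.items():
        lines = [i + 1 for i, line in enumerate(source.splitlines())
                 if pattern.search(line)]
        if lines:
            hits[name] = lines
    return hits

print(find_references("WS-ACCRUED-INT"))
```

Even this toy search finds the field in two modules; it says nothing about whether each hit is a read or a write, which conditional path it sits on, or whether the value continues under a different name, which is exactly the context an engineer still has to reconstruct by hand.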

Now multiply that by the reality that a single environment might contain tens of thousands, hundreds of thousands, or even millions of modules. It's not hard to see why organizations describe their mainframe as a black box — and why changes feel so high-stakes.

The real risk: unintended consequences

The fear isn't hypothetical. When an engineer updates a module without fully understanding the dependencies, the consequences can ripple across systems. A calculation that looked isolated might feed into downstream reporting. A field that seemed unused might actually be read by another module under specific conditions. A change to one branch of conditional logic might alter outputs for an account type that wasn't part of the original requirement.

These kinds of unintended consequences don't always surface immediately. Sometimes they show up in reconciliation discrepancies weeks later. Sometimes a client calls and says, "My statement looks different this month." By that point, the investigation to find the root cause is just as painful as the original change — if not more so.

This is why many mainframe teams default to a conservative posture. They move slowly, pad timelines, and layer in extensive manual review. Not because they aren't skilled, but because the risk of getting it wrong is too high and the tools available to them haven't evolved with the complexity of the systems they manage.

A better approach: data lineage for mainframe management

This is where mainframe data lineage changes the equation. Rather than manually tracing code paths and building dependency maps from scratch every time a change is requested, data lineage technology can parse COBOL modules at scale and generate a comprehensive, searchable view of how data flows through the system.

With data lineage in place, that same engineer who used to spend months investigating a change can now:

  • Search for a specific variable, table, or field and immediately see every module that reads, writes, or updates it
  • Trace the data path forward and backward to understand exactly where a value originates and where it ends up
  • View calculation logic to understand the mathematical expressions and business rules embedded in the code
  • Identify conditional branching to see where and why data gets treated differently based on record types or other criteria
  • Understand cross-module dependencies to assess the full blast radius of a proposed change before making it
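As a rough sketch of the forward and backward tracing described above, the example below builds a tiny data-flow graph from hypothetical (source, target) pairs, the kind of edges a COBOL parser might emit for MOVE and COMPUTE statements, and walks it in both directions. All field names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical data-flow edges: (source_field, target_field) means the
# target is written from the source somewhere in the codebase.
EDGES = [
    ("WS-BALANCE", "WS-ACCRUED-INT"),
    ("WS-DAILY-RATE", "WS-ACCRUED-INT"),
    ("WS-ACCRUED-INT", "STMT-INT-LINE"),
    ("STMT-INT-LINE", "STMT-REPORT"),
]

downstream = defaultdict(set)  # field -> fields it feeds
upstream = defaultdict(set)    # field -> fields that feed it
for src, dst in EDGES:
    downstream[src].add(dst)
    upstream[dst].add(src)

def trace(start, graph):
    """Collect everything transitively reachable from `start`."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen

# Blast radius of changing WS-ACCRUED-INT: everything downstream of it.
print(trace("WS-ACCRUED-INT", downstream))  # {'STMT-INT-LINE', 'STMT-REPORT'}
# Provenance: everything upstream that contributes to its value.
print(trace("WS-ACCRUED-INT", upstream))    # {'WS-BALANCE', 'WS-DAILY-RATE'}
```

The point of the sketch is the query pattern, not the parsing: once data flows are captured as a graph, "what does this change affect?" becomes a traversal that answers in milliseconds instead of a manual investigation that takes weeks.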

Instead of navigating thousands of lines of raw COBOL to answer a single question, the engineer gets a curated, structured view of exactly the information they need. The investigation that used to take months can happen in minutes.

Not just for modernization day — for every day between now and then

Much of the conversation around mainframe data lineage focuses on migration and modernization. And yes, lineage is critical for those efforts — but the value starts long before modernization kicks off.

Every time a business requirement changes, every time a regulation is updated, every time an engineer needs to write or modify code — they're navigating the same black box. Data lineage doesn't just prepare you for the future. It makes your mainframe safer and more manageable right now, during the months or years between today and the day you're ready to modernize.

For mainframe teams, it means less time investigating and more time executing. For risk and compliance leaders, it means greater confidence that changes won't introduce unintended consequences. For the business, it means faster turnaround on change requests without increasing operational risk.

And when modernization day does arrive, you'll be ready

Here's the other advantage of investing in data lineage now: when your organization is ready to modernize, you won't be starting from scratch.

Modernization isn't just about moving everything from the old system to the new one. It requires making deliberate decisions about what to bring forward and what to leave behind. Which business rules are still relevant? Which calculations need to be replicated exactly, and which should be redesigned? Which data paths reflect current requirements, and which are artifacts of decisions made decades ago?

Without lineage, those questions send teams back into the same manual investigation cycle — except now they're doing it across tens of thousands of modules under the pressure of a migration timeline. With lineage already in place, your team walks into modernization with a comprehensive understanding of how the current system works, what it does, and why.

And the value doesn't stop at cutover. Post-migration, lineage gives you a baseline for reconciliation. When the new system produces a different output than the old one — and it will — lineage helps you trace back to the original logic and understand why the results differ. Was it an intentional change? A missed business rule? A calculation that was carried over incorrectly? Instead of guessing, your team can pinpoint the source of the discrepancy and resolve it with confidence.

The mainframe isn't the problem. The lack of visibility is.

Organizations that rely on mainframes aren't behind — they're running proven, reliable infrastructure that processes critical transactions every day. The challenge has never been the mainframe itself. It's that the tools and processes for understanding what's inside it haven't kept pace with the complexity of the systems or the speed at which the business needs to evolve.

Data lineage closes that gap. Whether modernization is two years away or five, understanding what's inside the black box isn't something you can afford to wait on. Your teams need that visibility today to manage changes safely — and they'll need it even more when the time comes to move forward.

Zengines' Mainframe Data Lineage solution parses COBOL code at scale to give your team searchable, visual access to the data paths, calculation logic, dependencies, and business rules embedded in your mainframe.
