
Three Keys to Successful Mainframe Refactoring: A Practical Guide

August 14, 2025
Esther Jesurum

With 96% of companies moving mainframe workloads to the cloud, yet 74% of modernization projects failing, organizations need a systematic approach to refactoring legacy systems. The difference between success and failure lies in addressing three critical challenges: dependency visibility, testing optimization, and knowledge democratization.

The Hidden Challenge

Mainframe systems built over decades contain intricate webs of dependencies that resist modernization, but the complexity runs deeper than most organizations realize. Unlike modern applications designed with clear interfaces, documentation standards and plentiful knowledge resources, legacy systems embed business logic within data relationships, file structures, and program interactions that create three critical failure points during mainframe refactoring:

Hidden Dependencies: Runtime data flows and dynamic relationships that static analysis cannot reveal, buried in millions of lines of code across interconnected systems.

Invisible Testing Gaps: Traditional validation approaches fail to catch the complex data transformations and business logic embedded in mainframe applications, leaving critical edge cases undiscovered until production.

Institutional Knowledge Scarcity: The deep understanding needed to navigate these invisible complexities exists only in the minds of departing veterans.

Any one of these challenges can derail a refactoring project. Combined, they create a perfect storm that explains why 74% of modernization efforts fail. Success requires ensuring this critical information is available throughout the refactoring effort, not left to chance or discovery during code transformation.

Key 1: Master Data Dependencies Before Code Conversion

The Problem: Complex data flows and dynamic dependencies create invisible relationships that static analysis cannot reveal. These relationships span program execution flows, database navigation patterns, and runtime behaviors.

Implementation Checklist

□ Trace Data Element Journeys Across All Systems

  • Identify every program action that reads, modifies, or depends on specific data structures
  • Map cross-application data sharing through job control language (JCL) and program execution sequences

□ Understand Database and Program Execution Patterns

  • Analyze JCL/CL job flows to understand program dependencies and execution order
  • Map hierarchical (IMS) and network (IDMS) database structures and navigation paths
  • Identify data-driven business logic that changes based on content and processing context

□ Access Hidden Business Rules

  • Identify validation logic embedded in program execution sequences
  • Discover error handling routines that function as business rules
  • Uncover edge cases handled through decades of modifications

□ Generate Impact Analysis

  • Visualize effects of modifying specific programs or data structures
  • Understand downstream impacts from changing data formats or program execution flows
  • Access comprehensive decomposition analysis for monolithic applications
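
To make the impact-analysis idea concrete, here is a minimal sketch in Python. It assumes the read and write relationships have already been extracted from JCL and program source. The job steps, program names, and dataset names are hypothetical, and this is an illustration of the general technique rather than how any particular product works.

```python
from collections import defaultdict, deque

# Hypothetical job steps: (program, datasets read, datasets written).
JOB_STEPS = [
    ("CUSTLOAD", ["CUST.RAW"], ["CUST.MASTER"]),
    ("CUSTVAL", ["CUST.MASTER"], ["CUST.VALID"]),
    ("BILLING", ["CUST.VALID", "RATES.TABLE"], ["BILL.OUT"]),
    ("REPORTS", ["BILL.OUT"], ["MONTHLY.RPT"]),
]


def build_graph(steps):
    """Directed edges: dataset -> program that reads it, program -> dataset it writes."""
    graph = defaultdict(set)
    for program, reads, writes in steps:
        for dataset in reads:
            graph[dataset].add(program)
        for dataset in writes:
            graph[program].add(dataset)
    return graph


def downstream(graph, start):
    """Breadth-first walk returning every program and dataset reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


graph = build_graph(JOB_STEPS)
impact = downstream(graph, "CUST.MASTER")
print(sorted(impact))
```

Changing the format of CUST.MASTER in this toy example touches every later program and dataset in the chain, which is the kind of downstream effect the checklist asks teams to surface before conversion begins.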

What It Looks Like in Real Life

Manual Approach: Teams spend months interviewing SMEs, reading through millions of lines of undocumented code, and creating spreadsheets to track data flows and job dependencies. The scale and complexity make it impossible to find all relationships—critical dependencies exist in JCL execution sequences, database navigation patterns, and runtime behaviors that are buried in decades of modifications. Even after extensive documentation efforts, teams miss interconnected dependencies that cause production failures.

With Zengines: Complete data lineage mapping across all systems in days. Interactive visualization shows exactly how customer data flows from the 1985 COBOL program through job control sequences, database structures, and multiple processing steps, including execution patterns and database behaviors that documentation never captured.

Success Metrics

  • Complete visibility into data flows, program dependencies, and execution patterns
  • Real-time access to comprehensive refactoring complexity analysis
  • Zero surprises during code conversion phase

Key 2: Implement Data Lineage-Driven Testing

The Problem: Traditional testing approaches fail to validate the complex data transformations and business logic embedded in mainframe applications. While comprehensive testing includes performance, security, and integration aspects, the critical foundation is ensuring data accuracy and transformation correctness.

Implementation Checklist

□ Establish Validation Points at Every Data Transformation

  • Identify test checkpoints at each step where data changes hands between programs
  • Monitor intermediate calculations and business rule applications
  • Track data transformation throughout the process

□ Generate Comprehensive Data-Driven Test Scenarios

  • Create test cases covering all conditional logic branches based on data content
  • Build transaction sequences that replicate actual data flow patterns
  • Include edge cases and error conditions that exercise unusual data processing paths

□ Enable Data-Focused Shadow Testing

  • Process test data through refactored systems alongside legacy systems
  • Compare data transformation results at every lineage checkpoint
  • Monitor data accuracy and consistency during parallel data processing

□ Validate Data Integrity at Scale

  • Test with comprehensive datasets to identify data accuracy issues
  • Monitor for cumulative calculation errors in long-running data processes
  • Verify data transformations produce identical results to legacy systems
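
The shadow-testing step above can be sketched as a checkpoint-by-checkpoint comparison. The example below is a simplified illustration: the checkpoint names, field names, and values are hypothetical, and it assumes both systems can emit their intermediate outputs in the same order.

```python
from decimal import Decimal

# Hypothetical checkpoint outputs captured from each system for one test transaction.
legacy_run = [
    ("parse_input", {"principal": Decimal("1000.00")}),
    ("apply_rate", {"interest": Decimal("12.50")}),
    ("round_total", {"total": Decimal("1012.50")}),
]
refactored_run = [
    ("parse_input", {"principal": Decimal("1000.00")}),
    ("apply_rate", {"interest": Decimal("12.49")}),
    ("round_total", {"total": Decimal("1012.49")}),
]


def first_divergence(legacy, refactored):
    """Return (checkpoint, field, legacy_value, new_value) for the first mismatch, else None."""
    if len(legacy) != len(refactored):
        raise ValueError("runs have a different number of checkpoints")
    for (name_a, out_a), (name_b, out_b) in zip(legacy, refactored):
        if name_a != name_b:
            raise ValueError(f"checkpoint order differs: {name_a} vs {name_b}")
        for field in sorted(out_a.keys() | out_b.keys()):
            if out_a.get(field) != out_b.get(field):
                return name_a, field, out_a.get(field), out_b.get(field)
    return None


result = first_divergence(legacy_run, refactored_run)
print(result)
```

Reporting the first checkpoint that differs, rather than only the final output, shows where the discrepancy was introduced: here at the rate calculation, not at the rounding step that inherits it. Using `Decimal` instead of binary floats avoids introducing rounding noise of the test harness's own.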

What It Looks Like in Real Life

Manual Approach: Testing teams manually create hundreds of test cases, then spend weeks comparing data outputs from old and new systems. The sheer volume of data transformation points makes comprehensive coverage impractical—when data discrepancies appear across thousands of calculation steps, teams have no way to trace where in the complex multi-program data flow the difference occurred. Manual comparison of data transformations across interconnected legacy systems becomes impossible at scale.

With Zengines: Automated test generation creates thousands of data scenarios based on actual processing patterns. Self-service validation at every data transformation checkpoint pinpoints exactly where refactored logic produces different data results, down to the specific calculation or business rule application.

Success Metrics

  • Test coverage across all critical data transformation points
  • Validation of data accuracy and business logic correctness
  • Confidence in refactored data processing before cutover

Key 3: Democratize Institutional Knowledge

The Problem: Critical system knowledge exists only in the minds of retiring experts, creating bottlenecks that severely delay modernization projects.

Implementation Checklist

□ Access Comprehensive Data Relationship Mapping

  • Obtain complete visualization of how data flows between systems and programs
  • Understand business logic and transformation rules embedded in legacy code
  • Enable team members to explore system dependencies without expert consultation

□ Extract Business Context from Legacy Systems

  • Capture business rules and validation requirements from existing code
  • Link technical implementations to business processes and requirements
  • Create accessible knowledge bases with complete rule extraction

□ Enable Independent Impact Analysis

  • Provide capabilities to show downstream effects of proposed changes
  • Allow developers to trace data origins and dependencies during refactoring
  • Support business analysts in validating modernized logic

□ Eliminate SME Consultation Bottlenecks

  • Provide role-based access to comprehensive system analysis
  • Enable real-time exploration of data flows and business rules
  • Deliver complete context for development and testing teams

What It Looks Like in Real Life

Manual Approach: Junior developers submit tickets asking "What happens if I change this customer validation routine?" and wait 2 weeks for Frank to review the code and explain the downstream impacts. The interconnected nature of decades-old systems makes it impractical to document all relationships—Frank might remember 47 downstream systems, but miss the obscure batch job that runs monthly. The breadth of institutional knowledge across millions of lines of code is impossible to capture manually, creating constant bottlenecks as project velocity crawls.

With Zengines: Any team member clicks on the validation routine and instantly sees its complete impact map—every consuming program, all data flows, and business rules. Questions get answered in seconds instead of weeks, keeping modernization projects on track.

Success Metrics

  • 80% reduction in SME consultation requests
  • Independent access to system knowledge for all team members
  • Accelerated decision-making without knowledge transfer delays

Technology Enablers

Modern platforms like Zengines automate much of the dependency mapping, testing framework creation, and knowledge extraction.

Take Action

Successful mainframe refactoring demands more than code conversion expertise. Organizations that master data dependencies, implement lineage-driven testing, and democratize institutional knowledge create sustainable competitive advantages in their modernization efforts. The key is addressing these challenges systematically before beginning code transformation, not discovering them during production deployment.

Next Steps: Assess your current capabilities in each area and prioritize investments based on your specific modernization timeline and business requirements.

You may also like

BOSTON, MA – November 12, 2025 – Zengines is pleased to announce that the company's CEO and Co-Founder, Caitlyn Truong, has been recognized as a winner of the 2025 Info-Tech Awards by Info-Tech Research Group, a global leader in IT research and advisory.

Truong has been named a winner in the Women Leading IT award category.

The Info-Tech Awards celebrate outstanding achievements in IT, recognizing both individual leaders and organizations that have demonstrated exceptional leadership, innovation, and impact. The Women Leading IT Award celebrates exceptional women whose strength of leadership is driving innovation and transformation in their organization and the IT industry.

Since founding Zengines in 2020, Truong has led the development of AI-powered solutions that address two of the most pressing data management challenges facing enterprise organizations: data migration and data lineage. Under her leadership, Zengines has partnered with some of the largest enterprises to accelerate and de-risk their most critical business initiatives—from customer onboarding and system modernization to M&A integration and compliance requirements. The company's innovative approach helps organizations complete data conversions up to 80% faster while significantly reducing risk and cost, transforming processes that traditionally required large teams of specialists and months of manual work into streamlined operations achievable in minutes or days through AI-driven automation.

"I'm deeply honored by this recognition from Info-Tech and applaud their commitment to celebrating women in tech," says Caitlyn Truong, CEO of Zengines. "At Zengines, we're solving some of the most complex challenges the industry hasn't been able to crack: helping companies understand, modernize, and move their most valuable asset - their data. We're succeeding thanks to our incredible teammates - including women leaders who earned their place through grit and skill. When we amplify this power between women in tech - sharing knowledge, championing success, staying in the fight - we create leaders who know how to do hard things. That's the future worth building."

The 2025 Info-Tech Award winners were selected from a competitive pool of hundreds of candidates. The Women Leading IT Award winners were determined by their track record of innovation, leadership, and business impact, and their contribution to the advancement of women in technology through mentorship, advocacy, or initiatives that support diversity in IT.

"Women Leading IT within the 2025 Info-Tech Awards celebrates leaders whose vision and execution have driven measurable progress in innovation, inclusion, and organizational performance," says Tom Zehren, Chief Executive Officer at Info-Tech Research Group. "Congratulations to this year's honorees for strengthening their organizations through strategic leadership and opening doors for the future generation of IT leaders. Each Women Leading IT winner for 2025 exemplifies the strength of inclusive leadership that is shaping IT's next chapter."

The full list of winners and more information about the Info-Tech Awards are available from Info-Tech Research Group.

About Zengines

Zengines is a technology company that transforms how organizations handle data migrations and mainframe modernization. Zengines serves business analysts, developers, and transformation leaders who need to map, change, and move data across systems. With deep expertise in AI, data migration, and legacy systems, Zengines helps organizations reduce time, cost, and risk associated with their most challenging data initiatives. Learn more at zengines.ai.

About Info-Tech Research Group

Info-Tech Research Group is the world's leading research and advisory firm, proudly serving over 30,000 IT, HR, and marketing professionals. The company produces unbiased, highly relevant research and provides industry-leading advisory services to help leaders make strategic, timely, and well-informed decisions. For nearly 30 years, Info-Tech has partnered closely with teams to provide them with everything they need, from actionable tools to analyst guidance, ensuring they deliver measurable results for their organizations.

To learn more about Info-Tech Research Group or to access the latest research, visit infotech.com.

The 2008 financial crisis exposed a shocking truth: major banks couldn't accurately report their own risk exposures in real-time.

When Lehman Brothers collapsed, regulators discovered that institutions didn't know their actual exposure to toxic assets -- not because they were hiding it, but because they genuinely couldn't aggregate their own data fast enough.

Fifteen years and billions in compliance spending later, only 2 out of 31 Global Systemically Important Banks fully comply with BCBS-239 -- the regulation designed to prevent this exact problem.

The bottleneck? Data lineage.

Who Must Comply and When

BCBS-239 has applied to Global Systemically Important Banks (G-SIBs) since January 1, 2016, and national supervisors recommend it for Domestic Systemically Important Banks (D-SIBs) three years after their designation. In practice, this means hundreds of banks worldwide are now expected to comply.

Unlike regulations with fixed annual filing deadlines, BCBS-239 is an ongoing compliance requirement. Supervisors can test a bank's compliance with occasional requests on selected risk issues with short deadlines, gauging a bank's capacity to aggregate risk data rapidly and produce risk reports.

Think of it as a fire drill that can happen at any moment -- and with increasingly serious consequences for failure.

The Sobering Statistics

More than a decade after publication and eight years past the compliance deadline, the results are dismal. Only 2 out of 31 assessed Global Systemically Important Banks fully comply with all principles, and no single principle has been fully implemented by all banks.

Even more troubling, the compliance level across all principles barely improved from an average of 3.14 in 2019 to 3.17 in 2022 on a scale of 1 ("non-compliant") to 4 ("fully compliant"). At this rate of improvement, full compliance is decades away.
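
The "decades away" estimate follows from simple arithmetic on the two scores quoted above. The sketch below assumes, purely for illustration, that improvement continues linearly at the 2019 to 2022 pace; the variable names are ours.

```python
from decimal import Decimal

# Scores quoted in the text, on the 1-to-4 compliance scale.
score_2019 = Decimal("3.14")
score_2022 = Decimal("3.17")
full_compliance = Decimal("4")

# Assumption: improvement continues linearly at the 2019-2022 pace.
gain_per_year = (score_2022 - score_2019) / (2022 - 2019)
years_remaining = (full_compliance - score_2022) / gain_per_year

print(f"Gain per year: {gain_per_year}")
print(f"Years to reach 4.0 from 2022: {years_remaining}")
```

At 0.01 points per year, closing the remaining 0.83-point gap would take roughly 83 years under this straight-line assumption.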

What Happens If Your Bank Fails BCBS-239 Compliance?

The consequences are escalating. The ECB guide explicitly mentions:

  • Enforcement actions against the institution
  • Capital add-ons to compensate for data risk
  • Removal of responsible executives who fail to drive compliance
  • Operational restrictions on new business lines or acquisitions

The Basel Committee makes it clear that banks' progress towards BCBS 239 compliance in recent years has not been satisfactory and that increased measures on the part of the supervisory authorities are to be expected to accelerate implementation.

What Banks Are Doing (And Why It's Not Enough)

Most banks have responded to BCBS-239 with predictable tactics:

  • Governance restructuring: Creating Chief Data Officer roles and data governance committees
  • Policy documentation: Writing comprehensive data management policies and frameworks
  • Technology investments: Purchasing disparate tools like data catalogs, metadata management tools, and master data management platforms
  • Remediation programs: Launching multi-year, multi-million dollar compliance initiatives

These tactics are positive steps, and they are necessary, but they are not sufficient for compliance. In other words, banks are checking boxes without fundamentally solving the problem.

The issue? Banks are treating BCBS-239 like a project with an end date, when it's actually an operational capability that must be demonstrated continuously.

The Data Lineage Bottleneck

Among the 14 principles, one capability has emerged as the make-or-break factor for compliance: data lineage.

Data lineage has been identified as one of the key challenges that banks have faced in aligning to the BCBS-239 principles, as it is one of the more time consuming and resource intensive activities demanded by the regulation.

Why Data Lineage Is Different

Data lineage -- the ability to trace data from its original source through every transformation to its final destination -- sits at the intersection of virtually every BCBS-239 principle. The European Central Bank refers to data lineage as "a minimum requirement of data governance" in the latest BCBS 239 recommendations.

Here's why lineage is uniquely difficult:

It's invisible until you need it.
Unlike a data governance policy you can show an auditor or a data quality dashboard you can pull up, lineage is about proving flows, transformations, and dependencies that exist across dozens or hundreds of systems. You can't fake it in a PowerPoint.

It crosses organizational and system boundaries.
Complete lineage requires cooperation between IT, risk, finance, operations, and business units -- each with their own priorities, systems, and definitions. Further, data hand-offs occur within and between systems, databases, and files, which makes it harder to connect what happens at each hand-off. Regulators increasingly require detailed traceability of reported information, which can only be achieved through lineage across organizations and systems.

It must be current and complete.
The ECB requires "complete and up-to-date data lineages on data attribute level (starting from data capture and including extraction, transformation and loading) for the risk indicators, and their critical data elements." A lineage document from six months ago is worthless if your systems have changed.

It must work under pressure.
Supervisors increasingly require institutions to demonstrate the effectiveness of their data frameworks through on-site inspections and fire drills, with data lineage providing the audit trail necessary for these reviews. When a regulator asks "prove this number came from where you say it came from," you have hours -- not days -- to respond.

The Eight Principles That Demand Data Lineage Proof

While 11 of the 14 principles benefit from good data lineage, regulatory guidance makes it explicitly mandatory for eight:

  • Principle 2 (Data Architecture): Demonstrate integrated data architecture through documented lineage flows
  • Principle 3 (Accuracy & Integrity): Prove data accuracy by showing traceable lineage from source to report
  • Principle 4 (Completeness): Demonstrate comprehensive risk coverage through lineage mapping
  • Principle 6 (Adaptability): Respond to ad-hoc requests using lineage to quickly identify relevant data
  • Principle 7 (Report Accuracy): Validate report numbers through documented lineage and audit trails
  • Principles 12-14 (Supervisory Review): Provide lineage evidence during audits and fire drills

The Technology Gap: Why Traditional Tools Fall Short

Most banks have invested heavily in data catalogs, metadata management platforms, and governance frameworks. Yet they still can't produce lineage evidence under audit conditions. Why?

Traditional approaches have three fatal flaws:

1. Manual Documentation

Excel-based lineage documentation becomes outdated within weeks as systems change. By the time you finish documenting one data flow, three others have been modified. Manual approaches simply can't keep pace with modern banking environments.

2. Point Solutions That Only Support Newer Applications

Modern data lineage tools can map cloud warehouses and APIs, but they hit a wall when they encounter legacy mainframe systems. They can't parse COBOL code, interpret JCL job definitions, or trace data through decades-old custom applications -- exactly where banks' most critical risk calculations often live.

3. Incomplete Coverage

Lineage that stops at the data warehouse is fundamentally incomplete under BCBS-239's end-to-end data lineage requirements. Regulators want to see the complete path -- from original source system through every transformation, including hard-coded business logic in legacy applications, to the final risk report. Most tools miss 40-70% of the actual transformation logic.

How AI-Powered Data Lineage Changes the Game

This is where AI-powered solutions like Zengines fundamentally differ from traditional approaches.

Instead of manually documenting lineage, Zengines can automatically and comprehensively:

  • Parse legacy mainframe code (COBOL, RPG, Focus, etc.) to extract data flows and transformation logic
  • Trace calculations backward from any report field to ultimate source systems
  • Document relationships between tables, fields, programs, files and job schedulers
  • Generate audit-ready evidence in minutes instead of months
  • Maintain relevancy and currency through lineage updates as code changes
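
Tracing a report field back to its ultimate sources amounts to walking a lineage graph upstream. The following is a minimal sketch of that idea, not a description of Zengines' implementation. The field names and the lineage map are hypothetical, and in practice the map would be produced by parsing code and job definitions.

```python
# Hypothetical lineage: each field maps to the fields it is derived from.
DERIVED_FROM = {
    "risk_report.total_exposure": ["warehouse.exposure_sum"],
    "warehouse.exposure_sum": ["mainframe.POSN-AMT", "mainframe.FX-RATE"],
    "mainframe.POSN-AMT": ["trading_system.position_amount"],
    "mainframe.FX-RATE": ["rates_feed.fx_rate"],
}


def ultimate_sources(field, lineage, _path=()):
    """Walk upstream until reaching fields with no recorded parents."""
    if field in _path:
        raise ValueError(f"cycle in lineage at {field}")
    parents = lineage.get(field, [])
    if not parents:
        return {field}
    sources = set()
    for parent in parents:
        sources |= ultimate_sources(parent, lineage, _path + (field,))
    return sources


sources = ultimate_sources("risk_report.total_exposure", DERIVED_FROM)
print(sorted(sources))
```

A lineage that stopped at `warehouse.exposure_sum` would miss both mainframe fields and the systems feeding them, which is the end-to-end gap regulators are probing.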

Solving the "Black Box" Problem

For many banks, the biggest lineage gap isn't in modern systems -- it's in legacy mainframes where critical risk calculations were encoded 20-60 years ago by developers who have long since retired. These systems are literal "black boxes": they produce numbers, but no one can explain how.

Zengines' Mainframe Data Lineage capability specifically addresses this challenge by:

  • Parsing COBOL and RPG modules to expose calculation logic and data dependencies
  • Tracing variables across millions of lines of legacy code
  • Identifying hard-coded values, conditional logic, and branching statements
  • Visualizing data flows across interconnected mainframe programs and external files
  • Extracting "requirements" that were never formally documented but are embedded in code
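
As a rough illustration of what "identifying hard-coded values" means, the sketch below scans a few lines of COBOL for `MOVE` statements whose source is a literal. It is a deliberately naive, regex-based example of our own: it ignores continuation lines, comments, copybooks, and most of the language, so it should not be read as how a production parser works. The COBOL fragment and its data names are invented for the example.

```python
import re

# Invented COBOL fragment for illustration only.
COBOL_SOURCE = """\
       MOVE 0.0425 TO WS-RATE.
       MOVE CUST-BAL TO WS-BALANCE.
       IF WS-BALANCE > 10000
           MOVE 'PREMIUM' TO WS-TIER
       END-IF.
       COMPUTE WS-INTEREST = WS-BALANCE * WS-RATE.
"""

# MOVE <quoted or numeric literal> TO <data name>
MOVE_LITERAL = re.compile(
    r"""\bMOVE\s+
        (?P<literal>'[^']*'|"[^"]*"|[+-]?\d+(?:\.\d+)?)
        \s+TO\s+
        (?P<target>[A-Z0-9-]+)""",
    re.IGNORECASE | re.VERBOSE,
)


def hard_coded_moves(source):
    """Return (line number, target field, literal) for each literal MOVE found."""
    found = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for match in MOVE_LITERAL.finditer(line):
            found.append((line_no, match.group("target"), match.group("literal")))
    return found


moves = hard_coded_moves(COBOL_SOURCE)
for line_no, target, literal in moves:
    print(f"line {line_no}: {target} <- {literal}")
```

Even this toy scan surfaces an interest rate and a customer tier label that exist only in code. It still misses the `10000` threshold in the `IF` condition, a reminder of why simple pattern matching is not a substitute for real parsing.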

This capability is essential for banks that need to prove how legacy calculations work -- whether for regulatory compliance, system modernization, or simply understanding their own risk models.

Assessment: Can Your Bank Prove Compliance Right Now?

The critical question isn't "Do we have data lineage?" It's "Can we prove compliance through data lineage right now, under audit conditions, with short notice?"

Most banks would answer: "Well, sort of..."

That's not good enough anymore.

We've translated ECB supervisory expectations into a practical, principle-by-principle checklist. This isn't about aspirational capabilities or future roadmaps -- it's about what you can demonstrate today, under audit conditions, with short notice.

The Bottom Line

The bottleneck to full BCBS-239 compliance is clear: data lineage.

Traditional approaches -- manual documentation, point solutions, incomplete coverage -- can't solve this problem fast enough. The compliance deadline was 2016. Enforcement is escalating. Fire drills are becoming more frequent and demanding.

Banks that solve the lineage challenge with AI-powered automation will demonstrate compliance in hours instead of months. Those that don't will continue struggling with the same gaps, facing increasing regulatory pressure, and risking enforcement actions.

The technology to solve this exists today. The question is: how long can your bank afford to wait?

Schedule a demo with our team today to get started.

BOSTON, MA – October 29, 2025 – Zengines, the AI-powered data migration and data lineage platform, announces expanded support for RPG (Report Program Generator) language in its Data Lineage product. Organizations running IBM i (AS/400) systems can now rapidly analyze legacy RPG code alongside COBOL, dramatically accelerating modernization initiatives while reducing dependency on scarce programming expertise.

Breaking Through the RPG "Black Box"

Many enterprises still rely on mission-critical applications written in RPG decades ago, creating what Zengines calls the "black box" problem – legacy technology where business logic, data flows, and requirements are locked away in legacy code with little to no documentation. As companies undertake digital transformation and cloud migration initiatives, understanding these legacy systems has become a critical bottleneck.

The challenge with RPG is particularly acute. While COBOL's descriptive, English-like syntax makes it easier to "read," RPG's fixed-format column specifications and cryptic operation codes require developers to decode what goes in which column while tracing through numbered indicators to follow the logic. This complexity, combined with a shrinking pool of RPG expertise, makes understanding these systems even more critical—and difficult—than their COBOL counterparts.
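
To show what "decoding what goes in which column" looks like, here is a small Python sketch that slices a fixed-format calculation specification by position. The column ranges reflect our understanding of the ILE RPG fixed-form layout and should be verified against IBM's reference before being relied on. The sample line and its field names are invented.

```python
# Column ranges (1-based, inclusive) assumed for a fixed-form calculation spec.
C_SPEC_COLUMNS = {
    "factor1": (12, 25),
    "opcode": (26, 35),
    "factor2": (36, 49),
    "result": (50, 63),
}


def parse_c_spec(line):
    """Split one calculation spec line into its named column fields."""
    line = line.ljust(80)
    if line[5].upper() != "C":  # position 6 holds the form type
        raise ValueError("not a calculation specification")
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in C_SPEC_COLUMNS.items()
    }


def make_c_spec(factor1, opcode, factor2, result):
    """Build a sample line by padding each field to its column width."""
    return (
        " " * 5 + "C" + " " * 5
        + factor1.ljust(14) + opcode.ljust(10)
        + factor2.ljust(14) + result.ljust(14)
    )


sample = make_c_spec("BALANCE", "MULT", "RATE", "INTEREST")
parsed = parse_c_spec(sample)
print(parsed)
```

The meaning of each token depends entirely on where it sits in the line, which is why a reader without RPG experience cannot simply skim the code the way they might skim COBOL.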

"The majority of our enterprise customers are running legacy technology across multiple platforms – both mainframe COBOL environments and IBM i systems with RPG code," said Caitlyn Truong, CEO of Zengines. "By expanding our support to include RPG alongside COBOL, we can now address the full spectrum of legacy code challenges these organizations face. This means our customers can leverage a single AI-powered platform to comprehensively analyze, understand and modernize their legacy technology estate, rather than cobbling together multiple point solutions or relying on increasingly scarce programming expertise across different languages and systems."

Minutes, Not Months: AI-Powered Legacy Code Analysis

The enhanced Zengines Data Lineage platform automatically ingests RPG code, job schedulers, and related artifacts to deliver:

  • Interactive data lineage visualization – Graphical representation of data paths, sources, and hard-coded values
  • Comprehensive code intelligence – Relationships between modules, tables, fields, variables, and files
  • Business logic extraction – Calculation logic, branching conditions, and transformation rules
  • Actionable insights – Tables and fields inventory, profiling, and impact analysis

This capability is critical for organizations navigating system replacements, M&A integrations, compliance initiatives, and technology modernization programs where understanding legacy RPG logic is essential for de-risking implementations.

Real-World Impact: From Guesswork to Precision

Efforts to manage and modernize legacy systems break down when teams lack a complete understanding of existing logic. Migrations stall when teams cannot achieve functional coverage or resolve test failures. When validating new systems against legacy outputs, discrepancies inevitably emerge – but without understanding why the old system produces specific results, teams cannot effectively test, replicate, or improve functionality.

"Our customers use Zengines to reverse-engineer business requirements from legacy code," added Truong. "When a new system returns a different result for an interest calculation compared to that of  the 40-year-old RPG program, teams need to understand the original logic to make informed decisions about what to preserve and what to update. That's the power of shining a light into the black box."

Immediate Availability

RPG parsing capability is now available on the Zengines Data Lineage platform. Organizations can analyze both COBOL and RPG codebases within a single integrated platform.

About Zengines

Zengines is a technology company that transforms how organizations handle data migrations and modernization initiatives. Zengines serves business analysts, developers, and transformation leaders who need to map, change, and move data across systems. With deep expertise in AI, data migration, and legacy systems, Zengines helps organizations reduce time, cost, and risk associated with their most challenging data initiatives.

Media Contact:

Todd Stone

President, Zengines

todd@zengines.ai
