Legacy Modernisation vs. Full Rewrite: How to Decide
The full rewrite is almost always the wrong answer. But so is indefinitely patching a system that can't scale. Here is a practical framework for deciding between them.
In 2000, Joel Spolsky published “Things You Should Never Do” — the essay where he argued that a full rewrite was the single worst strategic mistake a software company could make. His case study was Netscape, which had decided to rewrite its browser from scratch. The result was years of stagnation, a product that shipped late and worse than what it replaced, and a company that never recovered its competitive position.
In 2026, with cloud-native tooling, better migration patterns, and infrastructure that can be provisioned in minutes, the picture is more nuanced than Spolsky’s original framing. Rewrites are more tractable than they were in 2000. The ecosystem of migration tooling is richer. The ability to run parallel systems with a clean traffic-splitting mechanism is genuinely easier.
But the core warning still holds: the full rewrite is almost always the wrong answer. That is not because rewriting is never valid; it sometimes is. It is because the conditions that make a rewrite safe are almost never present when organisations want to do one. The desire for a rewrite tends to peak at precisely the moment when it is most dangerous: when the system is under load, the team is under pressure, and the business cannot afford to pause feature delivery while engineering rebuilds the plane mid-flight.
When the Full Rewrite Is Tempting
The conditions that produce rewrite conversations are consistent across industries and company stages. If you’ve been in one of these situations, you know exactly how compelling the argument feels in the moment.
The legacy system is slow, expensive to change, and the engineering team is deeply unhappy working in it. Every new feature requires touching five different places. Deployments are a manual ritual that everyone dreads. The on-call rotation is a source of sustained misery. From inside the team, a rewrite feels like relief.
A new platform or framework promises a step-change in developer velocity. “We could build this in half the time if we were using the modern stack.” This is frequently true in isolation, and frequently misleading as a justification for a rewrite — because it doesn’t account for the cost of getting to the point where you’re building in the new stack rather than migrating to it.
The team has accumulated enough context that the system’s limitations feel more visible than its value. “We’d build it completely differently if we started today.” This is almost certainly true. It is also irrelevant. The question is not how you would build it today; the question is how to get from here to a better place without stopping the business.
A new CTO arrives and wants to put their mark on the architecture. This is the most dangerous rewrite trigger. It is the one least likely to be grounded in technical necessity and most likely to be driven by the desire to demonstrate vision. New CTOs should spend at least six months understanding a system before recommending a rewrite. Many don’t.
Why Rewrites Fail
The mechanism of rewrite failure is consistent enough to describe as a pattern.
The time estimate is wrong by a factor of two to five. The legacy system contains encoded business logic that nobody documented: edge cases accumulated over years of production operation, regulatory requirements baked into processing flows, exception handling that reflects real events that happened in the past. Little of this is visible in the documentation or in the system’s everyday behaviour. It only becomes visible when the new system doesn’t handle something the old system handled, and production incidents reveal the gap. The new system always takes longer than estimated because you spend the later months of the project discovering requirements that were implicit in the old one.
You’re maintaining two systems simultaneously. While the rewrite is in progress, the legacy system keeps running. It keeps accumulating bugs, security patches, and regulatory changes. Every week the old system needs attention is a week where engineers are split between maintaining the existing system and building the new one. The rewrite timeline stretches because the legacy system will not let you ignore it.
The new system launches without institutional knowledge. This is the Netscape problem, precisely described. The old system handled things badly — but it handled them. When the new system ships, it ships with clean architecture and clean code, and it also ships without the edge-case handling that the old system accumulated through years of production pain. These gaps appear as production incidents in the first weeks of operation. If the old system has been decommissioned, there is no fallback. If it hasn’t, you’re running two systems indefinitely.
The business keeps moving. During an eighteen-month rewrite, the commercial team is not waiting. New requirements are coming in. The product roadmap is not paused. In a rewrite, new requirements usually go into the old system (because the new system isn’t ready) or into the new system (causing scope creep that delays launch). Neither is good. The rewrite competes with the business for engineering capacity, and it is rarely the winner.
The Decision Framework
The question is rarely “rewrite or not.” It is “what is the right intervention for this system, at this stage, with this team, given this regulatory environment?” Four questions frame the decision honestly.
1. Can you isolate the system’s bounded contexts?
A bounded context, in Domain-Driven Design terms, is a subsystem that has a clear boundary — it owns its own data, it has clear interfaces with the rest of the system, and a team could work on it without constant coordination with other teams. Legacy systems that have been maintained without architectural discipline tend to have boundaries that are implicit, contested, and leaky. Everything talks to everything. The database is shared. The data model was designed for one use case and extended for twenty others.
If you can identify bounded contexts — even messy, imperfect ones — you can modernise incrementally. Carve off one context, rebuild it cleanly, validate it in production, and continue. This is the strangler fig pattern: the new system grows around the old one until the old one can be retired.
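As a rough illustration of the strangler fig pattern, here is a minimal Python sketch of a routing facade that sits in front of both systems. Everything in it is hypothetical: the `StranglerFacade` class, the handler functions, and the assumption that the first path segment names the bounded context. A real facade would more likely live in an API gateway or reverse proxy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

Handler = Callable[[str], str]


@dataclass
class StranglerFacade:
    """Routes requests to the legacy system unless their context has been migrated."""
    legacy: Handler
    migrated: Dict[str, Handler] = field(default_factory=dict)

    def migrate(self, context: str, handler: Handler) -> None:
        """Send one bounded context's traffic to its rebuilt service."""
        self.migrated[context] = handler

    def roll_back(self, context: str) -> None:
        """Return a context to the legacy system if production validation fails."""
        self.migrated.pop(context, None)

    def handle(self, path: str) -> str:
        # Assumption: the first path segment names the bounded context,
        # e.g. "/invoicing/42" belongs to "invoicing".
        context = path.strip("/").split("/", 1)[0]
        return self.migrated.get(context, self.legacy)(path)


# Hypothetical stand-ins for the real systems.
def legacy_monolith(path: str) -> str:
    return f"legacy:{path}"


def new_invoicing_service(path: str) -> str:
    return f"new:{path}"


facade = StranglerFacade(legacy=legacy_monolith)
facade.migrate("invoicing", new_invoicing_service)
```

With this in place, `facade.handle("/invoicing/42")` reaches the new service while `facade.handle("/payroll/7")` still reaches the monolith. The `roll_back` method is the point of the design: each carved-off context can be returned to the old system independently if it misbehaves in production.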
If you cannot identify bounded contexts because the system has no meaningful internal boundaries, you have a more fundamental problem. Spend six months mapping the system — understanding data flows, dependency graphs, implicit contracts — before you do anything architectural. The mapping phase is not glamorous, but it is the prerequisite for any safe modernisation approach.
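One way to start the mapping phase is to record which modules touch which database tables and see how they cluster. The sketch below assumes you have already gathered that inventory by hand or from query logs; the module and table names are invented for illustration, and shared tables are only one of several coupling signals worth mapping.

```python
from collections import defaultdict
from typing import Dict, List, Set


def modules_by_table(access: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert a module -> tables inventory into table -> modules."""
    users: Dict[str, Set[str]] = defaultdict(set)
    for module, tables in access.items():
        for table in tables:
            users[table].add(module)
    return users


def shared_tables(access: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Tables touched by more than one module: the leaky boundaries."""
    return {t: m for t, m in modules_by_table(access).items() if len(m) > 1}


def candidate_contexts(access: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group modules that are linked, directly or transitively, by a shared table."""
    users = modules_by_table(access)
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in access:
        if start in seen:
            continue
        group: Set[str] = set()
        stack = [start]
        while stack:
            module = stack.pop()
            if module in group:
                continue
            group.add(module)
            for table in access[module]:
                stack.extend(users[table] - group)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical inventory of table access per module.
access = {
    "billing": {"invoices", "customers"},
    "collections": {"invoices"},
    "payroll": {"employees"},
    "hr": {"employees", "leave"},
    "reporting": {"audit_log"},
}
groups = candidate_contexts(access)
```

Several small groups suggest boundaries worth investigating. A single group containing almost every module is the "everything talks to everything" case, and a sign that more mapping is needed before any architectural work.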
2. Is the problem the technology or the design?
These are different problems with different solutions.
Technology problems — old runtime, no test coverage, slow deployment pipeline, outdated dependencies — can often be solved without changing the system’s architecture. Upgrading a runtime, adding a test suite, modernising a deployment pipeline, and migrating a database are all achievable without a rewrite. They are uncomfortable and they require discipline, but they are tractable.
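When a system has no test coverage, one common first step before a runtime or dependency upgrade is a characterisation test: record what the code does today, then check that the upgraded code still does the same. The sketch below is one possible shape for that, not a prescribed method. The late-fee functions and their undocumented cap are invented for illustration.

```python
import json
from typing import Callable, Dict, Iterable, List, Sequence


def record_baseline(func: Callable, inputs: Iterable[Sequence]) -> Dict[str, str]:
    """Capture current behaviour, including raised exceptions, keyed by JSON-encoded input."""
    baseline: Dict[str, str] = {}
    for args in inputs:
        key = json.dumps(list(args))
        try:
            baseline[key] = repr(func(*args))
        except Exception as exc:  # failures are part of the observed behaviour too
            baseline[key] = f"raised {type(exc).__name__}"
    return baseline


def find_regressions(func: Callable, baseline: Dict[str, str]) -> List[str]:
    """Return the inputs for which func no longer matches the recorded baseline."""
    current = record_baseline(func, [json.loads(key) for key in baseline])
    return [key for key in baseline if current[key] != baseline[key]]


# Hypothetical legacy rule with an undocumented cap.
def legacy_late_fee(amount_cents: int, days_late: int) -> int:
    if days_late <= 0:
        return 0
    return min(amount_cents * days_late // 1000, 5000)


# Hypothetical upgraded version that silently lost the cap.
def upgraded_late_fee(amount_cents: int, days_late: int) -> int:
    if days_late <= 0:
        return 0
    return amount_cents * days_late // 1000


samples = [(10000, 0), (10000, 3), (900000, 30)]
baseline = record_baseline(legacy_late_fee, samples)
regressions = find_regressions(upgraded_late_fee, baseline)
```

Here `regressions` contains only the large overdue invoice, which is the case where the forgotten cap matters. The baseline is only as good as the sample inputs, so production traffic is usually a better source than hand-picked values.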
Design problems — a monolith that cannot be decomposed, a data model that encodes the wrong domain concepts, a coupling pattern that makes every change a ripple effect — may require more radical surgery. But “more radical surgery” still doesn’t necessarily mean a full rewrite. It may mean a targeted architectural intervention: extracting one service, migrating one domain, refactoring one subsystem.
The mistake is misidentifying a technology problem as a design problem and using it to justify a rewrite when a more targeted upgrade would have served.
3. What is the regulatory and compliance risk of running two systems simultaneously?
In regulated industries, such as Malaysian fintech under Bank Negara Malaysia (BNM), insurance under the relevant prudential frameworks, and healthcare under HIPAA or local equivalents, running parallel systems is not always feasible. Data sovereignty requirements, audit trail obligations, and transaction integrity standards may make it impossible to operate two versions of a payment system simultaneously.
This is a genuine constraint, and it cuts both ways. On one hand, it is a forcing function toward a clean cutover, which makes a targeted rewrite more defensible. On the other hand, a clean cutover requires that the new system is complete and validated before the old one is retired — which is a much higher bar than a gradual migration. The regulatory environment doesn’t make rewrites easier. It makes their failure modes more severe.
4. Do you have the team to build the new system and maintain the old one?
This question eliminates more proposed rewrites than any other. The honest answer is almost always no.
Building a new system requires your best engineers on the new system. Maintaining the old system requires experienced engineers who understand its quirks. If your team has fifteen engineers, you cannot split them into a new-system team and a legacy-maintenance team and expect either group to be effective. You will end up with a new system that accumulates architectural shortcuts because the senior engineers are firefighting the old one, and a legacy system that accumulates changes the new system doesn’t know about.
The organisations that successfully execute rewrites tend to be large enough that they can staff both efforts properly. Most of the organisations that want to do rewrites are not.
The Outcome Matrix
Working through these four questions honestly produces a clear decision:
Modernise incrementally when you can identify bounded contexts, the problem is primarily technological, the regulatory environment supports gradual migration, and the team is not large enough to staff a parallel effort. Use the strangler fig pattern: build new capability alongside the old, migrate traffic gradually, retire old components as they are replaced.
Targeted rewrite when the design problem is localised to one subsystem, the bounded context is clear, the regulatory environment permits a clean cutover for that subsystem, and you have a dedicated team that can execute it without destabilising the rest of the organisation. Rewrite that service, validate it, and continue with the next one.
Full rewrite only when all of the following are true: the system is genuinely beyond decomposition, the team is large enough to staff both efforts, the regulatory environment supports a clean cutover, and business leadership understands that new feature development will stop for the duration. This set of conditions is met rarely. When it is, a rewrite is defensible. When it isn’t, it’s a plan for a multi-year derailment.
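The three outcomes above can be written down as a small decision function, which is a useful way to check that the answers to the four questions have actually been recorded. This is a sketch under stated assumptions: the field names are invented, and the two fall-through branches (map first, and incremental as the default) are our reading of the framework rather than rules it states explicitly.

```python
from dataclasses import dataclass, replace
from enum import Enum


class Intervention(Enum):
    MAP_FIRST = "map the system before deciding"
    INCREMENTAL = "modernise incrementally"
    TARGETED_REWRITE = "targeted rewrite"
    FULL_REWRITE = "full rewrite"


@dataclass(frozen=True)
class Assessment:
    bounded_contexts_identifiable: bool
    beyond_decomposition: bool            # mapping done, no viable boundaries found
    design_problem_localised: bool        # design flaw confined to one subsystem
    clean_cutover_permitted: bool         # regulator allows a hard switch
    dedicated_team_available: bool        # for one subsystem, without destabilising the rest
    can_staff_both_efforts: bool          # new build plus legacy maintenance
    leadership_accepts_feature_freeze: bool


def recommend(a: Assessment) -> Intervention:
    # Full rewrite only when every one of its conditions holds.
    if (a.beyond_decomposition and a.can_staff_both_efforts
            and a.clean_cutover_permitted and a.leadership_accepts_feature_freeze):
        return Intervention.FULL_REWRITE
    # Assumption: without visible boundaries, mapping comes before any intervention.
    if not a.bounded_contexts_identifiable:
        return Intervention.MAP_FIRST
    if (a.design_problem_localised and a.clean_cutover_permitted
            and a.dedicated_team_available):
        return Intervention.TARGETED_REWRITE
    # Assumption: incremental modernisation is the default when nothing else qualifies.
    return Intervention.INCREMENTAL


# A hypothetical fifteen-engineer team with messy but identifiable contexts.
small_team = Assessment(
    bounded_contexts_identifiable=True,
    beyond_decomposition=False,
    design_problem_localised=False,
    clean_cutover_permitted=False,
    dedicated_team_available=False,
    can_staff_both_efforts=False,
    leadership_accepts_feature_freeze=False,
)
one_bad_subsystem = replace(
    small_team,
    design_problem_localised=True,
    clean_cutover_permitted=True,
    dedicated_team_available=True,
)
```

For these inputs, `recommend(small_team)` returns the incremental option and `recommend(one_bad_subsystem)` returns a targeted rewrite. The function adds no insight of its own. Its value is that every flag has to be set honestly before it returns anything.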
What Modernisation Actually Looks Like
The strangler fig pattern sounds elegant in a conference talk. In practice, it is a sustained programme of unglamorous work: carving off one service at a time, building the seams (APIs, event streams, data synchronisation mechanisms) that allow old and new to coexist, and validating each step in production before taking the next.
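One way to validate a step in production is a parallel run: keep serving the legacy answer, call the new implementation in the shadow, and log every disagreement. The sketch below is a minimal version of that idea. The `ParallelRun` class and the quote functions are hypothetical, and it assumes the calls are free of side effects, which will not hold for operations such as payment posting, where regulation may rule out running both systems at all.

```python
from typing import Any, Callable, List, Tuple


class ParallelRun:
    """Serves the legacy result while comparing it with a candidate implementation."""

    def __init__(self, legacy: Callable[[Any], Any], candidate: Callable[[Any], Any]):
        self.legacy = legacy
        self.candidate = candidate
        self.mismatches: List[Tuple[Any, Any, Any]] = []

    def __call__(self, request: Any) -> Any:
        expected = self.legacy(request)
        try:
            actual = self.candidate(request)
        except Exception as exc:
            # A candidate failure is recorded but never reaches the caller.
            self.mismatches.append((request, expected, f"raised {type(exc).__name__}"))
            return expected
        if actual != expected:
            self.mismatches.append((request, expected, actual))
        return expected


# Hypothetical legacy rule: staff orders are free, a fact nobody documented.
def legacy_quote(order: dict) -> int:
    return 0 if order["customer"] == "staff" else order["total"]


# Hypothetical rebuilt service that does not know about the staff rule.
def new_quote(order: dict) -> int:
    return order["total"]


quote = ParallelRun(legacy_quote, new_quote)
results = [
    quote({"customer": "acme", "total": 1200}),
    quote({"customer": "staff", "total": 800}),
]
```

After these two calls, `quote.mismatches` holds the staff order, which is exactly the kind of implicit requirement that otherwise surfaces as a production incident after cutover. Traffic moves to the candidate only once the mismatch log stays empty for long enough to be convincing.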
It requires discipline to resist the temptation to expand scope. The bounded context you are modernising needs to be genuinely bounded — not “while we’re here, let’s also fix the adjacent module.” Scope creep in a strangler fig migration produces the same outcome as scope creep in a rewrite: an overlong programme that delivers value late.
We’ve worked through this pattern in practice with enterprise clients, most recently in a legacy ERP cloud migration where the existing system was deeply integrated with a set of downstream business processes that couldn’t be paused. The approach was a phased extraction of bounded contexts, migrating each to a cloud-native architecture while maintaining the integration seams with the legacy system. It took longer than the initial estimate for a rewrite would have suggested. But it delivered value continuously, rather than at the end of an eighteen-month programme. And the system that resulted was maintainable because the team understood every architectural decision they had made along the way.
The Question Behind the Question
The full rewrite conversation is almost never really about technology. It is about frustration — with accumulated debt, with the pace of change, with a system that feels like it is working against you. That frustration is legitimate. The system probably is working against you. The question is whether a rewrite is the right response to that frustration, or whether the frustration is a signal that a different kind of intervention is needed.
The question is almost never “rewrite or not.” It is “how do we reduce the risk of each decision to an acceptable level while keeping the business running?” That requires a plan, not a vision. And plans require honest answers to questions that organisations under pressure frequently don’t want to ask.
Related Reading
- Legacy Modernisation in Regulated Industries: A Framework — The compliance-first engineering framework for decomposing regulated legacy systems.
- Case study: Legacy ERP Cloud Migration — A real programme showing how Nematix approached enterprise legacy modernisation.
- Case study: Building a Scalable Digital Product — How a fintech startup migrated from a monolith to service-oriented architecture at scale.