Correctness vs Change: Which Matters More?

Source: infoq.com

Correctness in software is limited to well-understood components.
In ongoing software development, our core work is changing code.
Build changeable software on top of existing, well-understood components.
Improve your delivery automations, and your team will get better at your core work.
When software interacts with humans, aim for “better” instead of “correct.”

When I started programming, I loved the puzzle-solving aspect of it: how can I make the computer find the answer? But there’s a harder problem: how do I know this answer is correct? When programs get bigger than one person can produce, this problem becomes very hard. We draw on static typing and automated testing, even proof assistants and property testing when we get serious. Yet all this has one assumption behind it: we know what “correct” is.

In my second programming job, in meetings with the businesspeople who created our requirements, I dug into corner cases and feature interactions. And dug. And dug. They were annoyed with me, but how could we get this program right if they couldn’t tell me what “right” was? Now I know that they were right to be annoyed. Conflicting requirements are the norm, and we move forward anyway.

Sometimes correctness is possible, and it matters. The rest of the time, optimize for change, so that we can incrementally discover and approach “right.”

When is correctness possible?

Are you writing code to be baked into hardware? Scientific data analysis? Financial processing? Internal activity tracking? A website for a 30-day ad campaign?

Software is not one industry.

Back in the day, people thought all software should be written with a full specification up front, because correctness was paramount. Meir Lehman, in his 1980 paper “Programs, Lifecycles, and Laws of Software Evolution,” insisted that we strive for full specification, even as he provided clear evidence against its usefulness.

The paper contains a crucially important categorization of programs, and an insightful but mistaken analysis of the software lifecycle. (It also contains some “laws” of software evolution that people get hung up on, but they’re not relevant here.)

The important bit is the categorization. Many arguments over correctness “become reconcilable or irrelevant under an adequate program classification scheme.”

Distinguish among situations: S/P/E

Lehman divides programs according to how well we understand the solution. With the distinctions he makes, we can resolve many arguments over correctness vs change. The trick is to ask, what kind of program are we talking about? Lehman proposes three categories:

S-programs, which are specified and solvable;
P-programs, where we can define “better” but can’t solve for “best;” and
E-programs, which are a part of the system they are trying to improve.

S: a solution is specified

When we can fully define “correct,” and it’s computable in reasonable time, then we implement it with an S-program. “S-programs are programs whose function is formally defined by and derivable from a specification.” This works in the small, when we need a function and we know exactly what it should do. In a large program, how often do we know the full requirements, with all conflicts resolved?

Specifications are extremely hard to write. Hila Peleg told me that when her research group is trying to prove that a program correctly implements a specification, and they find a discrepancy, the bug is usually in the specification. I am grateful to the people on committees who pore over the details of specifications like CSS. They are doing the hard work of defining “correct.”

Lehman remarks that implementations of S-programs, once correct, never need to change. They’re done. You might tweak one for performance. But if the requirements change, then you write a different S-program. You may use the code for the old one as a starting point for the new one, but the original is still correct by its own definition. The holy grail, for Lehman, is to break every software system down into S-programs. He is right that this is a useful intention, but misguided in his dreams of completing this breakdown before beginning development.

S-programs are satisfying to write. They are the puzzles we can solve.

P: a problem is specified

Other puzzles can’t be solved perfectly, but we can seek better and better solutions. Some questions, such as a real-life traveling salesman problem or chess, aren’t computable in our lifetime. We can’t get to “right,” but we can define “better.” Lehman calls these P-programs. “The problem statement and its solution approximate the real world situation.”
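To make “better, not best” concrete, here is a minimal Python sketch of a traveling-salesman heuristic. It is an illustration, not anything from Lehman’s paper: the names tour_length and two_opt, the random points, and the choice of the 2-opt heuristic are all assumptions made for the example. The code can always tell whether one tour is better than another, but it never claims to have found the best one.

```python
import math
import random


def tour_length(tour, points):
    """Total length of the closed tour that visits points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(tour, points):
    """Keep reversing segments for as long as a reversal shortens the tour.

    This finds a *better* tour, with no guarantee that it is the *best* one.
    """
    best = list(tour)
    best_len = tour_length(best, points)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment between positions i and j (inclusive).
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                candidate_len = tour_length(candidate, points)
                # Accept only strict improvements, so the loop must terminate.
                if candidate_len < best_len - 1e-12:
                    best, best_len = candidate, candidate_len
                    improved = True
    return best


# Hypothetical input: twelve random cities in the unit square.
rng = random.Random(42)
points = [(rng.random(), rng.random()) for _ in range(12)]
start = list(range(len(points)))
improved_tour = two_opt(start, points)
```

The only claim the code can back up is comparative: tour_length(improved_tour, points) is no larger than tour_length(start, points). Swapping in a different cost function (gas mileage, say) changes what “better” means without changing the shape of the program.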

Stochastic programs belong here, like Monte Carlo and other statistical simulations that try to answer the question “How likely is it?” These programs are impossible to fully test, since they aren’t supposed to do the same thing every time. This makes proofs and mathematical reasoning extra valuable.
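A small Python sketch shows why reasoning helps here. The dice question, the function name estimate_probability, the seed, and the tolerance are all assumptions chosen for illustration. Counting outcomes by hand gives the exact answer; a test of the simulation can only check that the estimate lands near it.

```python
import random


def estimate_probability(trials, rng):
    """Estimate the chance that two fair dice sum to at least 10."""
    hits = 0
    for _ in range(trials):
        if rng.randint(1, 6) + rng.randint(1, 6) >= 10:
            hits += 1
    return hits / trials


# Reasoning gives the exact value: 6 of the 36 outcomes qualify
# (4+6, 5+5, 6+4, 5+6, 6+5, 6+6).
EXACT = 6 / 36

# A seeded generator makes this particular run repeatable, but the check
# is still a tolerance, not an equality.
estimate = estimate_probability(100_000, random.Random(2024))
assert abs(estimate - EXACT) < 0.01
```

The tolerance is a judgment call: too tight and a correct program fails by chance, too loose and a wrong one passes. The exact value from counting outcomes is what tells us where to center it.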

Lehman recognizes that P-programs do change over time, as we learn more about the problem. The traveling salesman, for instance, may start to care about gas mileage or where they end up on Friday afternoon. Using the algorithm teaches us ways to add to it. For this reason, Lehman groups these in with the next category, under “applications.”

E: the software is entwined in the problem space

That leaves everything else: all the software built with incomplete requirements, and every system whose requirements shift after it goes into production. Sound familiar? We want this imprecision to be a failing of human organizations, which should be producing nice, sound specifications. But as Lehman points out, this is more than endemic: it’s unavoidable. Software used by humans is not separate from the human systems that use it. Software is part of that system, entwined in it. Therefore, any new software that changes the system it’s in leads to a new situation, and to new demands on the software. “The program has become a part of the world it models.”

Lehman uses the word embedded for this kind of program, but he does not mean embedded-in-hardware. That’s confusing, so I prefer “entwined.” He is referring to programs that participate in a sociotechnical system, where the software needs to learn and grow along with the humans using it. The program is entwined with the problem it is helping to solve. Lehman: “The need for continuing change is intrinsic to the nature of computer usage.”

This kind of software is never “done.”

If you wish to complete puzzles, you won’t be satisfied with this work. But if you wish to help people with real problems, E-programs are powerful.

There’s a reason that most custom software is of this type, which we call “applications.”

Compose applications out of S-programs whenever you can

Lehman saw that E- and P-programs could be broken down, so that many of the parts were S-programs, well-defined and solvable. Lehman was correct in that we want to compose our applications out of S-programs as much as possible. But he got the order wrong: fuzzy, changing E-programs come first, and S-programs come later. Lehman also thought that each S-program would be implemented by the development team. Now we have a different strategy: find existing components and then adapt the application design to use them. Custom software development is more about integration than construction. Lehman predicted: “All large programs (software systems) will be constructed as structures of S-programs.” Lehman didn’t see this key aspect of an S-problem: it can be solved by someone else.

S-programs are reusable; optimize for correctness

S-programs are valuable; well-specified and correct, they are reusable components. S-programs are also satisfying to implement, which led to something Lehman did not expect: oodles of open source libraries and frameworks, plus software as a service.

While coding, whenever I whittle down a problem to a well-defined and abstract piece, it starts getting fun, and a flag raises in my head: someone has solved this already. It’s time to search for someone else’s already-verified implementation, and adapt to that.

When you are creating a reusable S-program, correctness matters. You know what correctness is; now optimize for it. This is a job for proofs, types, and property tests — whatever fits your program.
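As a small illustration of optimizing for correctness, here is a hand-rolled property test in Python for a tiny S-program, run-length encoding. The functions rle_encode, rle_decode, and check_properties are hypothetical names invented for this sketch. A dedicated property-testing library would generate inputs more cleverly; plain random strings are used here only to keep the example self-contained.

```python
import random


def rle_encode(text):
    """Run-length encode a string into a list of (character, count) pairs."""
    runs = []
    for ch in text:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def rle_decode(runs):
    """Invert rle_encode."""
    return "".join(ch * count for ch, count in runs)


def check_properties(trials=500, seed=0):
    """Check the specification on many random inputs, not a few hand-picked ones."""
    rng = random.Random(seed)
    for _ in range(trials):
        # A two-letter alphabet makes repeated characters (long runs) likely.
        text = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        runs = rle_encode(text)
        # Property 1: decoding the encoding gives back the original text.
        assert rle_decode(runs) == text
        # Property 2: adjacent runs never share a character (runs are maximal).
        assert all(a[0] != b[0] for a, b in zip(runs, runs[1:]))
        # Property 3: every run has a positive count.
        assert all(count >= 1 for _, count in runs)
    return trials


check_properties()
```

The three properties together are the specification: any implementation that satisfies them for all inputs is correct by definition. That is what makes this an S-program, and why the effort spent on checking it pays off for everyone who reuses it.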

Problems move from E to S after we gain understanding

In 1980, even with the crucial insight that “Quality in general cannot at present be designed and built into a program ab initio. Rather, they are gradually achieved by evolutionary change and refinement,” Lehman still remonstrated that if we only design hard enough, we can break every application down into S-programs. He wasn’t completely wrong.

Lehman wants to break the program down before it is written, but that’s not how it works. Components of applications can be defined as S-programs after they’ve been written several times, in different circumstances.

In the late nineties, we created ORM frameworks inside our applications, until the problem became widespread and well-understood enough that frameworks like Hibernate could solve the problems experienced by many companies. After that, committees could write standards like JPA. People used to write CRUD applications by hand, until the patterns became clear enough for a framework like Rails to fit them widely. Good standards emerge from many implementations, and then those standards feed correct implementations.

E-programs are most custom software; optimize for change

The point of this categorization is to let go of correctness when it is in conflict with change.

If you’re making software for humans to use, you’re probably not creating an S-program. “Custom” software is fit for purpose, part of the operations of a particular business system. These system-entwined applications are not reusable. Each one is designed for the situation, to help particular humans with an activity or to lead them toward a particular behavior. An application needs to be correct enough, and it needs to keep getting better. Optimize for change.

What does it mean to optimize for change?

Don’t write code; change code

In his paper, Lehman studied the software lifecycle, from Requirements to Design to Detailed Design to Programming to Test to Rollout to Maintenance. He observed that problems occurred at every handoff between phases. He also noticed a crucial fact: all the other phases are encapsulated (but smaller) inside Maintenance. Yet he missed the solution that now seems clear. He got stuck on “We need to make stronger separations between the phases,” instead of “Forget them all, except maintenance.” Start somewhere and iterate. Make changes small and easy, so that we can barely tell one phase from another. The handoff problems disappear.

Let go of the idea that we write applications. You can write an S-program, and call it done. When the application is entwined in the system it supports, all development is change.

When you set up the first walking skeleton of your program, include continuous integration and delivery. You need builds, automated tests, a way to run security checks, and automated deployment. These are table stakes for the “change software” game.

Reduce fear of changing code

The biggest obstacle to movement is not the work involved; it is fear. Many modern practices help reduce this. Refactoring, with automated tests, helps us build a clear model of the code so that we can change it with confidence. Microservices help us limit the impact of changing a particular piece. Lehman even anticipated microservices: “Each module could be implemented as a program running on its own microprocessor and the system implemented as a distributed system.” He knew the world wasn’t ready for them, though: “Many problems in connection with the design and construction of such systems still need to be solved.” Some of these problems are now solved. We have automation for the maintenance of software components in quantity; we have APIs for infrastructure, builds and delivery.

Change how you change code

The crucial question in software architecture is “how will we change it?” We design both the future state of the system, and a path from here to there.

To keep growing, we need to keep getting better at changing it, too. When setting up your delivery process and automations, consider flexibility. Delivery is always an E-program. The software we want to deliver is not the same, the infrastructure we deploy it to varies, and the quality checks we need in our delivery structure change with our organization’s needs. To keep our code flexible, we keep learning as developers, and we keep teaching our automations how to do better work.

When you can’t know correct, aim for better

Lehman said, “Absolute correctness of the program as a whole is not the real issue. It is the usability of the program and the relevance of its output in a changing world that must be the main concern.” This is not easy to test, but we can measure or detect our effects on the system our program is entwined in. Microsoft says that a feature is not “shipped” until it is emitting telemetry data that reveals whether people use it. This tells the developer if their changes had an impact on the company.

This is the beauty of development work. “Fundamentally, the peculiar nature of software systems will always leave software engineering in a class of its own.” In programming, solving puzzles is only the beginning. Effecting change in the world is the more compelling game.