Boeing Failures Illuminate Greater Software Challenge

“If x: do y; else: do z.” The beauty of software is in its ability to eliminate human error; a computer can be trusted to execute code accurately every single time, all at incredible speeds while occupying minimal space. But what happens when the code is so complex that the humans writing it cannot identify a bug? As confirmed in the recent tragedies involving Boeing 737-MAX aeroplanes, the computer will still perform exactly as directed.

Software has been adapted and improved over past decades, and humanity has trusted it with more as it has progressed. It now controls our stock markets, mobile communications, robots in warehouses, rockets (for both spaceflight and weaponry), aeroplanes and, most recently, automated vehicles.

Until recently, the costs of software disasters were almost exclusively monetary. Minor flaws in apps or robots were not ideal but almost never fatal. Now, as we begin to trust the software in planes and cars every day with our lives, a new standard of 100% accuracy is required to maintain safety.

Whilst software has constantly improved in efficiency, it has more than commensurately increased in complexity. The original Onboard Maintenance Function programmed into Boeing 737 planes took two and a half years to develop, including more than 1700 requirements in 32,000 lines of code. Today, Boeing 787 software includes around 14 million lines of code.

The underlying complexity lies within the design of coded programs. A main body is run, calling upon thousands of smaller functions as each line is executed. These smaller functions call on smaller functions themselves, each acting as a ‘black box’ – if you input the right arguments, the function returns an output as designed. To execute each function, you need not understand how it works, but simply trust that it does. As such, hundreds of engineers are often responsible for the final product. Each of these engineers writes code in their own style, modelling and solving real physical and biological problems through text editors. It is even common for programmers not to understand the physical problem they are solving at all – their goal is simply that their code works. Herein lies an inefficiency complicating the entire system.
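As a rough illustration of this layering, the Python sketch below stacks three small functions on top of one another. The function names and the numbers passed in are hypothetical and are not taken from any aircraft system; only the physics (the ideal gas law and the standard lift equation) is textbook material.

```python
# Hypothetical illustration of nested 'black box' functions.
# Names and input values are assumptions made for this sketch.

R_DRY_AIR = 287.05  # specific gas constant for dry air, J/(kg*K)


def celsius_to_kelvin(temp_c: float) -> float:
    """Lowest-level helper: convert a temperature to kelvin."""
    return temp_c + 273.15


def air_density(pressure_pa: float, temp_c: float) -> float:
    """Ideal-gas air density in kg/m^3; trusts celsius_to_kelvin."""
    return pressure_pa / (R_DRY_AIR * celsius_to_kelvin(temp_c))


def lift_newtons(pressure_pa: float, temp_c: float, speed_ms: float,
                 wing_area_m2: float, lift_coeff: float) -> float:
    """Lift equation L = 0.5 * rho * v^2 * S * C_L; trusts air_density."""
    rho = air_density(pressure_pa, temp_c)
    return 0.5 * rho * speed_ms ** 2 * wing_area_m2 * lift_coeff


def main() -> None:
    # The caller only needs to know what to pass in, not how the
    # functions underneath arrive at their answers.
    lift = lift_newtons(101_325.0, 15.0, 70.0, 125.0, 1.2)
    print(f"Estimated lift: {lift:.0f} N")


if __name__ == "__main__":
    main()
```

Whoever writes `main` can use `lift_newtons` without knowing any gas law, which is convenient, and is also how a programmer can ship working code without understanding the physical problem beneath it.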

Naturally, this means that fixing a bug can be extremely complex. Each individual function is tested rigorously to account for every input case, effectively eliminating human keystroke error. This bottom-up approach ensures that accuracy can be tested at every level. However, when a bug is buried within millions of lines of code, many relying on one another to work as required, isolating and editing an error is sometimes impossible without compromising other functions. As such, updates are typically appendages that override previous code in order to improve functionality or eliminate bugs, thereby increasing the size and complexity of the overall software program.
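A minimal Python sketch of why bottom-up testing alone is not enough: in the hypothetical example below, two functions each pass their own tests, yet produce a wrong answer when one feeds the other, because they disagree about units. The function names and the 300-metre minimum are assumptions made for illustration only.

```python
# Hypothetical example: each function is correct on its own terms,
# but the two disagree about units when combined.

FEET_PER_METRE = 3.28084


def read_altitude_ft(raw_metres: float) -> float:
    """Convert a raw altimeter value in metres to feet."""
    return raw_metres * FEET_PER_METRE


def is_above_minimum(altitude_m: float, minimum_m: float = 300.0) -> bool:
    """Return True if an altitude given in METRES exceeds the minimum."""
    return altitude_m > minimum_m


# Bottom-up tests: every function passes in isolation.
assert abs(read_altitude_ft(100.0) - 328.084) < 1e-6
assert is_above_minimum(301.0)
assert not is_above_minimum(299.0)

# Integration: the aircraft is at 100 m, well below the 300 m minimum,
# but the value handed over is in feet, so the check wrongly passes.
altitude = read_altitude_ft(100.0)
print(is_above_minimum(altitude))  # True, although 100 m is below the minimum
```

Neither function contains a keystroke error, so testing them one at a time would never reveal the fault. It only appears where the two meet.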

James Somers unpacks this challenge in detail in his cautionary article in The Atlantic. He quotes Nancy Leveson, a professor of aeronautics and astronautics at the Massachusetts Institute of Technology who has been studying software safety for 35 years.

“Software is different. Just by editing the text in a file somewhere, the same hunk of silicon can become an autopilot or an inventory-control system. This flexibility is software’s miracle, and its curse. Because it can be changed cheaply, software is constantly changed; and because it’s unmoored from anything physical—a program that is a thousand times more complex than another takes up the same actual space—it tends to grow without bound. ‘The problem,’ Leveson wrote in a book, ‘is that we are attempting to build systems that are beyond our ability to intellectually manage.’”

As we have seen with Boeing’s malfunction, trusting software too complex for humans to intellectually manage can now be fatal. The real problem, however, is not in the computer and not in the accuracy of the code. It is that humans often fail to identify cases or requirements that are statistically improbable, or that they judge to be impossible. Unprecedented physical environments can create scenarios that software engineers were not able to foresee from behind their computer screens. When the circumstances acting on the system are not those it was designed to handle, the program acts exactly as it was told to in the circumstances it believes it is in (e.g. nose-diving due to a bad sensor reading in the 737-MAX case).
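To make the point concrete, here is a deliberately simplified Python sketch of a rule that behaves exactly as written when its input is wrong. It is not Boeing’s actual logic: the threshold, the function names and the sensor values are all hypothetical, and the cross-checked variant is just one conceivable guard, included for contrast.

```python
# Simplified, hypothetical control rule. This is NOT the real 737-MAX
# software; thresholds and names are assumptions for illustration.

STALL_WARNING_AOA_DEG = 15.0  # hypothetical angle-of-attack threshold


def trim_command(aoa_deg: float) -> str:
    """Command nose-down whenever the reported angle looks too high."""
    if aoa_deg > STALL_WARNING_AOA_DEG:
        return "NOSE_DOWN"
    return "HOLD"


def trim_command_cross_checked(aoa_left: float, aoa_right: float,
                               max_disagreement_deg: float = 5.0) -> str:
    """One possible guard: refuse to act when two sensors disagree."""
    if abs(aoa_left - aoa_right) > max_disagreement_deg:
        return "DISENGAGE"
    return trim_command((aoa_left + aoa_right) / 2.0)


true_aoa = 4.0      # what the aircraft is actually doing
faulty_aoa = 40.0   # what a damaged sensor reports

print(trim_command(true_aoa))                              # HOLD
print(trim_command(faulty_aoa))                            # NOSE_DOWN
print(trim_command_cross_checked(true_aoa, faulty_aoa))    # DISENGAGE
```

The first function has no bug in the usual sense. It does precisely what it was told to do, and the failure lies in the unstated assumption that the sensor is always right.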

Whilst Boeing will no doubt correct the problems that caused the 737-MAX crashes, they will also inevitably add to the complexity of the plane’s software systems. Modern, model-based coding languages have been developed to make unperceived requirements easier to identify; however, it is costly and time-consuming to rewrite entire software systems in a new language.

Looking forward, our next challenge will come as automated vehicles are rolled out. Every day, drivers encounter scenarios on the road that they could never have expected to see. With over 100 million lines of code controlling today’s non-autonomous cars, these systems are already even more complicated than aeroplane software. As Somers identifies, “when you’re writing code that controls a car’s throttle, for instance, what’s important is the rules about when and how and by how much to open it. But these systems have become so complicated that hardly anyone can keep them straight in their head.” Before more lives are put in the hands of software every day, underlying system and language designs need to be simplified so that engineers can be certain of our safety.

Lachlan Mackay is a Research Analyst with Montaka Global Investments. To learn more about Montaka, please call +612 7202 0100.
