Why Your Software is Never Perfect


We occasionally have students ask for help with software: “My software is perfect, but it doesn’t work!” Your software is never perfect. My software is never perfect.

I recently found that software I wrote 37 years ago made someone’s top ten list. It’s not a top ten list anyone would aspire to be on: software that I wrote in 1979 is number two on a list of ten historical software bugs with extreme consequences. I learned some very important lessons from that experience.

Background

My first job out of college was to document what everyone thought was a complete and very well-tested set of software for processing low-level data from the solar backscatter ultraviolet / total ozone mapping spectrometer (SBUV/TOMS) instruments on NASA’s Nimbus 7 satellite. The entire development team had already moved on to other projects with different employers. They left behind a large set of code and a very tall stack of computer printouts that contained their test results.

I started from the state of “What is this ‘FORTRAN’ language?” but quickly proceeded to “How can this code possibly work?” and from there to “There’s no way this code can work!” Only on reaching that final stage of understanding did I look at that massive stack of test results. Except for the developers who had abandoned ship, I was the first to do so. Nobody else had looked at those test results; they had instead looked at the amazing thickness of the printouts.

Testing by thickness always has been and always will be a phenomenally bad idea. Some of those test printouts were slim; these were failed compilations. The rest were what was then called “ABEND dumps.” In those days, the equivalent of what is now called a segmentation fault resulted in the entire virtual memory for the process in question being printed out in hexadecimal, a huge waste of paper. (The modern equivalent is a segfault and core dump.) Not one test indicated success.

This turned out to be a career-maker for me. I made a name for myself by fixing that mess. As a result, I was subsequently given the privilege of working directly for the principal investigator of that pair of instruments and his team of scientists. Instead of the low-level computer science stuff involved with my first task, my next task truly did involve scientific programming.

Why the Nimbus 7 satellite did not discover the ozone hole

Of the two ozone measuring instruments on the Nimbus 7 satellite, one (SBUV) had been flown previously, but the more precise instrument (TOMS) was brand new. The previously flown instrument sometimes yielded flaky results when the solar angle was low, and the team scientists were worried that the same would apply to this newer instrument. The scientific team did not want their good scientific names sullied by suspect data. As a result, they vehemently insisted that I filter out those suspect results by resetting data where the solar incidence angle was low and where the estimated ozone quantity lay outside a predetermined range to a value that meant “missing or invalid data”.
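
To make the requirement concrete, here is a minimal sketch of such a filter in C. The actual code was FORTRAN, and every name, sentinel, and threshold below is invented for illustration; only the shape of the logic comes from the description above.

    #include <stddef.h>

    /* Hypothetical sentinel and limits; the real values belonged to the
     * SBUV/TOMS processing code. */
    #define MISSING_DATA          -9999.0
    #define MIN_SUN_ELEVATION_DEG 6.0
    #define MIN_OZONE_DU          180.0
    #define MAX_OZONE_DU          650.0

    /* Reset any retrieval made with the sun low in the sky whose ozone value
     * falls outside the "plausible" range. This is the kind of filter that
     * hid the ozone hole: a genuine anomaly looks exactly like the suspect
     * data the filter was meant to remove. */
    void filter_ozone(double ozone_du[], const double sun_elevation_deg[], size_t n)
    {
        for (size_t i = 0; i < n; ++i)
            if (sun_elevation_deg[i] < MIN_SUN_ELEVATION_DEG &&
                (ozone_du[i] < MIN_OZONE_DU || ozone_du[i] > MAX_OZONE_DU))
                ozone_du[i] = MISSING_DATA;
    }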

I argued that if I did what they asked there would be no way to discover anomalies. “We can change your code if we discover anomalies,” was the reply. I suggested that I produce two products, an unfiltered one for NASA internal use only and a filtered version for release to the wider research community. They did not want any part of that, either, on the basis that the unfiltered version would somehow get outside of NASA. “Do what I told you to do, or we will tell your employer to assign someone else to us,” said the principal investigator. I capitulated and did what he told me to do.

Karma!

The Nimbus 7 satellite did not discover the ozone hole. Credit for that discovery instead goes to Joseph Farman, who simply pointed a device invented in the 1920s (a Dobson spectrophotometer) up into the sky. Mr. Farman received a very nice obituary in the New York Times. The SBUV/TOMS team will more or less die anonymously. That’s karma.

Because I should have been more insistent with the scientific team, I too was stricken by karma. In 1986, curious minds at NASA wanted very much to know why their very expensive satellite had not discovered what a person using a 1920s-era device had discovered. The scientific team discovered that my name was all over the code that hid the ozone hole from NASA. (They conveniently forgot why this was the case.) This made people high up in NASA want to talk to me, personally. Despite having switched employers three times and having moved 1400+ miles away from that initial job, I received numerous phone calls and even a few visits from people very high up in NASA that year. I told them why that code existed, and also how to fix it. Voilà! After reprocessing the archived satellite data, the Antarctic springtime ozone hole appeared every year.

What I learned

  • Lesson number one:

    Take responsibility for your code.
    Version control software provides the ability to establish blame (or credit) for who wrote/modified every line of code in the codebase. Your name is the sole name attached to the code you write, not your boss’s name, nor that of your customer. You never know who’s going to come back to you seven years or more after the fact regarding the code that you wrote. It’s your name that will be on the code, so take full responsibility for it. While I took full responsibility for fixing that very bad code right out of college, I did not take full responsibility for the code I wrote immediately afterward. I should have.
  • Lesson number two:

    Your code is never perfect.
    As I noted at the outset, this site occasionally receives posts that start with “My code is perfect, but it doesn’t work right! Help me!” If your code doesn’t work right, it is not perfect, by definition. Typical code has a bug per one hundred lines. Well-crafted, well-inspected, and well-tested code typically has one bug per one thousand lines, or perhaps one bug per every ten thousand lines if done very carefully. Pushing beyond one bug per a few thousand lines of code is very hard and very expensive. The Space Shuttle flight software reportedly had fewer than one bug per two hundred thousand lines of code. This incredibly low error rate was achieved at the cost of writing code at the glacial rate of two lines of code per person per week, after taking into account the hours people spent writing and reviewing requirements, writing and reviewing test code, writing and reviewing the test results, and attending meeting after boring meeting. Even with all that, the Space Shuttle flight software was not perfect. It was, however, as close to perfection as code can be. (Note: I did not participate in that process. It would have killed me.)
  • Lesson number three:

    Even if your code is perfect, it is not perfect.
    This is the difference between verification and validation. Verification asks whether the code does exactly what the requirements or user stories say the code should do. There’s a hidden bug just waiting to manifest itself if the tests are incomplete (and the tests always are incomplete). While verification is hard, validation is harder yet. Validation asks whether the requirements/user stories are themselves correct, and that is not something that typically can be automated. In the case of Nimbus 7, there was a faulty requirement to filter out suspect data. Because I initially balked at writing the code to implement this, there was an explicit test, written by me and reviewed by others, that ensured that the code filtered out those suspect values. Faulty requirements result not only in faulty code but also in faulty tests that prove that the code behaves faultily, just as required.
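
    To make the distinction concrete, here is a self-contained sketch in C, with invented names and thresholds, of a test that verifies a Nimbus 7-style filtering requirement. The test passes, so verification succeeds; validation fails, because the requirement itself should never have been written.

        #include <assert.h>

        #define MISSING_DATA          -9999.0
        #define MIN_SUN_ELEVATION_DEG 6.0

        /* The faulty requirement, restated as code: at low sun angles,
         * out-of-range retrievals are erased. */
        static double filter_one(double ozone_du, double sun_elevation_deg)
        {
            if (sun_elevation_deg < MIN_SUN_ELEVATION_DEG &&
                (ozone_du < 180.0 || ozone_du > 650.0))
                return MISSING_DATA;
            return ozone_du;
        }

        /* Verification: the code does exactly what the requirement says,
         * so both asserts pass. No test here can ask whether erasing
         * "impossible" values was the right requirement in the first place. */
        int main(void)
        {
            assert(filter_one(120.0, 4.0) == MISSING_DATA); /* ozone-hole value, dutifully erased */
            assert(filter_one(300.0, 45.0) == 300.0);       /* normal retrieval, untouched */
            return 0;
        }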

52 replies
  1. rootone says:

    Most compilers have options to insert code which can help with the detection of anomalies.
    This does result in longer run times, but modern CPUs don't really notice it.
    If you have got your source code polished down to being near 100% reliable, there might be a small advantage in turning off those debugging options.
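
    For example, with GCC or Clang, the sanitizer options are one such mechanism (the buggy program below is invented for illustration):

        /* Build with runtime checking enabled:
         *     cc -g -fsanitize=address,undefined -o demo demo.c
         * and the out-of-bounds read below is caught the first time it runs.
         * Build with a plain `cc -O2 demo.c` and it may appear to "work". */
        #include <stdio.h>

        int main(void)
        {
            int values[4] = { 1, 2, 3, 4 };
            int sum = 0;

            for (int i = 0; i <= 4; ++i)   /* bug: should be i < 4 */
                sum += values[i];          /* i == 4 reads past the array */

            printf("sum = %d\n", sum);
            return 0;
        }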

  2. eachus says:
    [QUOTE="Sherwood Botsford"]Hmm. Pascal was my first language. Years ago when I was using Turbo Pascal, you had the option for range checking. By default it was on. The compiler inserted checks on every array and pointer operation so that the program couldn't access data not originally assigned to that variable. Wonder how much that slows down the software. My estimate is only a few percent.”

    Negative. Not the idea, but the impact of enabling global checking. One of the first things I learned when we were developing an Ada compiler at Honeywell was how to find every place the compiler generated code to raise errors. In some of those places I was looking to fix the compiler, because it had missed an inference. Some? Oops! Blush, my error. And a few really belonged there.

    Today you can program in SPARK, which is technically a strict subset of Ada, plus a verifier and other tools. It is extra work, but worth it for software you need to trust. Sometimes you need that quality, or better. I remember one program where the "Red Team" I was on was dinged because, ten years after the software was fielded, no decision had been made about whether to hire the contractor to maintain the code or to use an organic (AF) facility. I just shook my head. There were still no reported bugs, and I don't think there ever will be. Why? If you remember the movie War Games, the project was writing the code for WOPR. Technically the computer did not have firing authority. It just decoded launch control messages presented to the officers in the silos (well, in between a group of five silos). The software could also change targeting by reprogramming the missiles directly. We very much wanted the software to be perfect when fielded, and not changed for any reason without months (or longer) of revalidation.

    Let me also bring up two disasters that show the limits of what can be accomplished. In a financial maneuver, Arianespace let a contract to upgrade the flight control software for Ariane 4. I'm hazy on why this was needed, but Arianespace, to save money on Ariane 5, decided they were going to reuse all of the flight control and engine management hardware and software from Ariane 4 on Ariane 5. The Ariane 4 FCS software was subcontracted, by the German firm doing the hardware part of the upgrade, to an English company. The software developers, aware that the software would also be used on the Ariane 5, asked to see the high-level design documents for Ariane 5. Fearing that this would give the English an advantage in bidding for Ariane 5 work, the (French) management at Arianespace refused.

    Since the guidance hardware and software were already developed, the test plan was a "full-up" test where the engines and gyroscopes would be mounted, as in the Ariane 5, on a two-degree-of-freedom platform which would allow for a simulated flight of Ariane 5's first stage from launch to burnout. That test ran way over budget and behind schedule, becoming the "long pole in the tent," the last box on a PERT or Gantt chart. Rather than wait another year on a project already behind schedule, they went ahead with the launch.

    If you have any experience with software for systems like this, you know that there are about a dozen "constants" that are only constant for a given version of that rocket, jet engine, airplane, etc. Things like gross takeoff weight, moments of inertia, not-to-exceed engine deflections, and so on. Since the software hadn't been touched, the Ariane 5 launched with Ariane 4 parameters. One difference was that Ariane 5 heads east a lot earlier in the mission. Ariane 4 also had a program to update the inertial guidance system which needed to run until 40 seconds after launch. (On Ariane 5 it could have been shut off at t=0. An Ariane 4 launch could be aborted and rapidly "turned around" until t=6; that capability was used, but could not apply to Ariane 5, which was airborne immediately after t=0.) On the first Ariane 5 launch the clock reached t=39, the Ariane 5 was "impossibly" far away from the launch site, and the unnecessary software (which was only needed, remember, on the pad) aborted both flight guidance computers. The engines deflected too far, the stack broke up, and a lot of VIPs got showered with hardware. (Fortunately no one was killed.)

    What does this story tell you? That software you write, perfect for its intended function, can cause half a billion euros of damage if it is used for an unintended purpose. We used to say that software sharing needs to be bottom up, not top down, because an M2 tank is not an M1 tank. Now we point to Ariane instead.

    The second case is even more of a horror, and not just because it killed several hundred people. The A320 was programmed by the French in a language that they developed to allow formal proofs of correctness. The prototype crashed at an airshow; fortunately, most of the people on board survived. The pilot had been operating way outside the intended operating envelope. He made a low slow pass, then went to pull up to clear some trees. The plane wanted to land, but there wasn't enough runway, and by the time the plane and pilot agreed on aborting the landing and going around, it was too late. The engines couldn't spool up fast enough. Unfortunately, the (French) investigators didn't look at why the engines hadn't been spooled up already.

    There were several more A320 crashes during landings and very near the runway. Lots of guesses, no clear answers. Then there was a flight into Strasbourg; the pilots had flown the route many times before, but this time they were approaching from the north due to winds. The plane flew through a pass, then dived into the ground. The French "probable cause" claims pilot error in setting the descent rate into the autopilot. The real problem seems to have been that the pilots set a waypoint in the autopilot at a beacon in the pass. The glide path for their intended runway, if extended to the beacon, was underground. The autopilot would try to put the aircraft into the middle of the glide path as soon as possible after the last waypoint. Sounds perfectly sensible in English, French, and the special programming language. But of course, that requirement was deadly.

    The other crashes were not as clear. Yes, a waypoint above a rise in the ground, and an altitude for clearing the waypoint that put the plane above the center of the glide path. Pilots argue that the A320 could be recovered after flipping its tail up. The software has been fixed; the French say the problem was a dual-use control (degrees of descent or hundreds of feet per minute) and that pilots could misread or ignore the setting. But the real lesson is that it doesn't matter how fancy your tools are if they can conceal fatal flaws in the programmer's thinking. (And conceal them from those who review the software as well.)

  3. donpacino says:

    This is an amazing article that anyone aspiring to be an electrical or software engineer should read.

    I also recently got bitten by a Nimbus-style situation, although on a much, much smaller scale.

  4. scottdave says:

    Great article. Thanks for sharing. Here it is, nearly 2 years old when I come across and read it, and everything is still relevant. I'm taking a data science class; much of what goes into that is deciding how to handle certain data — like flagging it or filtering it out.

  5. Sherwood Botsford says:

    Hmm.  Pascal was my first language.  Years ago when I was using Turbo Pascal, you had the option for range checking.  By default it was on.  The compiler inserted checks on every array and pointer operation so that the program couldn't access data not originally assigned to that variable.  Wonder how much that slows down the software.  My estimate is only a few percent.

  6. David Reeves says:

    Humans are unreliable. This is why people are working on formal specifications and automatic programming. I think that eventually we will program only at a high specification level and the computer will implement the code. It is possible now for a robot brain to write its own software to solve a problem, using first order predicate logic. This code is correct in a logical sense, since it is deduced from the original set of axioms in the robot's knowledge base. I have done some work on this myself. The result is an algorithm which can be implemented in an appropriate language.

    Pascal was mentioned. Wirth has the right idea. He builds his software the Swiss way, which is to say intolerant of error. He wrote a book years ago called Systematic Programming. I was happy to find a used copy on eBay. It's worth reading even today.

  7. anorlunda says:

    [QUOTE="Svein, post: 5565530, member: 538805"]Yes – but not all things are that simple.”The question is not if you can make everything simply, but rather if you can make anything simply.

  8. Svein says:

    [QUOTE="anorlunda, post: 5565499, member: 455902"]My argument in #39 is that we are delivering lower quality software today than in the past because we don't follow the KISS principle. Refute that if you will. To prevent muddling the point being argued, please stick to the MOV app as the benchmark,(because to apply KISS, we must start with something simple.)”Yes – but not all things are that simple. Anecdote: A customer came to me and announced: We want to put the horse betting status on the Internet! Here we have a simple statement that is nowhere near a specification. What is missing is (among several other details):

    • What should the status screen look like?
    • How often should it be updated?
    • How should we handle the fact that there are a varying number of horses in each race?
    • How should a stricken horse be handled – should we stop displaying it, should we display it in a contrasting font or should we do something else?

    Being stubborn, I insisted on having all the details in the requirement spec. I then presented the customer with my functional spec – and since the deadline was now just two weeks away, they accepted it promptly. The software was delivered on time and it worked. And – before you mention PHP – this was in 1995!

  9. anorlunda says:

    [QUOTE="Svein, post: 5565301, member: 538805"]That is where the paperwork comes in. Done correctly, it is of vital importance.

    1. The Requirement Specification is the responsibility of the customer. …
    2. The Functional Specification is the responsibility of the developer. …
    3. Ideally: Revised Req. Spec., revised Func. Spec., …
    4. Hopefully: A Req. spec. that is reasonably clear and realistic and a Func. Spec. that is realistic.

    I have used that model several times.” I too have used that model many times. It promotes bloat rather than combating growth. Neither can that process scale down to keep a simple function simple (KISS). Please reconsider the MOV application from #39. Use an order of magnitude budget estimate of $10000 (in 2016 $) and 48 hours elapsed time, including all specs, negotiations, lawyers, implementation, testing and documentation. Can your process fit in that budget? Could the product meet the same reliability performance as #39? My argument in #39 is that we are delivering lower quality software today than in the past because we don't follow the KISS principle. Refute that if you will. To prevent muddling the point being argued, please stick to the MOV app as the benchmark (because to apply KISS, we must start with something simple).

  10. Svein says:

    [QUOTE="anorlunda, post: 5564128, member: 455902"]I'm sure that you're right.  But let me ask you two questions.

    1. How do you draw the line between necessary requirements and bloat?  (Focusing first on a simple app like the MOV controller helps clarify.)
    2. Who has the responsibility and authority to draw that line?

    “That is where the paperwork comes in. Done correctly, it is of vital importance.

    1. The Requirement Specification is the responsibility of the customer. The first revision is usually both "over the top" and imprecise, but it is a start.
    2. The Functional Specification is the responsibility of the developer. It should not be a carbon copy of the Req. Spec., but a description of what the developer thinks it is possible to deliver in a realistic time frame.
    3. Ideally: Revised Req. Spec., revised Func. Spec., …
    4. Hopefully: A Req. spec. that is reasonably clear and realistic and a Func. Spec. that is realistic.

    I have used that model several times. Coupled with being stubborn I have flatly refused to start developing something until the project has arrived at point 4. above. Usually, I have done some notes to myself along the way about how to develop what I thought the customer wanted. Usually, those notes have to be discarded since the final Req. Spec. bore little resemblance to the first informal inquiry.

  11. FactChecker says:

    [QUOTE="anorlunda, post: 5564128, member: 455902"]I'm sure that you're right.  But let me ask you two questions.1. How do you draw the line between necessary requirements and bloat?  (Focusing first on a simple app like the MOV controller helps clarify.)[/quote]That is the million dollar question that I can't answer.  Even trying to isolate a part like a MOV controller gets complicated when issues like redundancy, failure modes, communication standards, customer preferences, etc. come into play.  It is often hard for technical software people to communicate (or even anticipate) the risk / cost / schedule consequences of decisions.[quote]2. Who has the responsibility and authority to draw that line? “It is somewhat mysterious to me, but here is what I think.  In military contracts top level requirements are initially set very optimistically by the military so that they can see how much different contractors can propose.  As the contract winner is selected and development proceeds, they see that some requirements were unrealistic.  They try to find some "low hanging fruit" that they did not think of before and can be done in place of the reduced / modified initial requirements.  It's a negotiation.

  12. anorlunda says:

    [QUOTE="FactChecker, post: 5564107, member: 500115"]But I have one argument. I think that the old software would be very easy to apply new software processes to. The problem is that hundreds of new requirements are piled on. Many of the new requirements are good ideas, but there is a tendency to go overboard. They are trying to anticipate the needs of the next 30 years, which is very difficult if not impossible.”I'm sure that you're right.  But let me ask you two questions.

    1. How do you draw the line between necessary requirements and bloat?  (Focusing first on a simple app like the MOV controller helps clarify.)
    2. Who has the responsibility and authority to draw that line? 
  13. FactChecker says:

    anorlunda said: Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version. Worse, the modern standards probably prohibit the "old" version because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls. I argue that we are abandoning the KISS principle.[QUOTE="Svein, post: 5563974, member: 538805"]Hear, hear!”Yes. When hardware is the subject, people adhere to the KISS principle, but for software they abandon it. But I have one argument. I think that the old software would be very easy to apply new software processes to. The problem is that hundreds of new requirements are piled on. Many of the new requirements are good ideas, but there is a tendency to go overboard. They are trying to anticipate the needs of the next 30 years, which is very difficult if not impossible.

  14. Svein says:

    [QUOTE="anorlunda, post: 5563946, member: 455902"]Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures, and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version.  Worse, the modern standards probably prohibit the "old" version  because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls.  I argue that we are abandoning the KISS principle.”Hear, hear!

  15. anorlunda says:

    The closest I ever came to military software was an association with the Saturn V Project, so I can't comment on things military. But the point of post #34 was old versus new, so let's compare apples with apples. Compare the same mundane application then and now.

    Consider a controller for a motor operated valve (MOV). The valve can be asked to open, close, or to maintain an intermediate position. The controller may monitor and protect the MOV from malfunctions. In the old days, the logic for this controller would be expressed in perhaps 100-150 bytes of instructions, plus 50 bytes of data. That is so little that not even an assembler would be needed. Just program it in machine language and type the 200 hex digits by hand into the ROM burner. A 6502, or 8008, or 6809 CPU variant with on-chip ROM would do the work. The software would have been the work product of a single person working less than one work-day, perhaps checked by a second person. Instantiations would cost about $1 each. (In the really old days, it would have been done with discrete logic.)

    In the modern approach, the logic would be programmed in a high level language. That needs libraries, and those need an OS (probably a Linux variant), and that brings in more libraries. With all those libraries come bewildering dependencies and risks (for example https://www.physicsforums.com/threads/science-vulnerability-to-bugs.878975/#post-5521131). All that software needs periodic patches, so we need to add an Internet connection (HORRORS!) and add a user interface. With that comes all the cybersecurity and auditing overhead. All in all, the "modern" implementation includes ##10^4## to ##10^6## times more software than the "old" 200 byte version, to perform the same invariant MOV controller function.

    Now you can fairly call me old fashioned, but I find it hard to imagine how the world's best quality control procedures and software standards could ever make the "modern" implementation as risk free or reliable as the "old" 200 byte version. Worse, the modern standards probably prohibit the "old" version because it can't be verifiabull, auditabull, updatabull, securabull, or lots of other bulls. I argue that we are abandoning the KISS principle.

    Now, the reason that this is more than a pedantic point is the IoT (Internet of Things). We are about to become surrounded by billions of ubiquitous micro devices implemented the "modern" way rather than the "old" way. It is highly germane to stop and consider whether that is wise.
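
    For a sense of scale, here is roughly what that 200-byte controller's logic amounts to, written in C for readability (the original would have been hand-entered machine code; all names and limits here are invented):

        enum command { CMD_OPEN, CMD_CLOSE, CMD_HOLD };

        #define POSITION_TOLERANCE 2    /* raw position-sensor counts */
        #define MAX_MOTOR_CURRENT  200  /* raw ADC counts; above this, trip out */

        /* Called periodically. drive() takes -1 = close, 0 = stop, +1 = open.
         * Returns -1 on a protective trip, else 0. */
        int mov_step(enum command cmd, int setpoint, int position,
                     int motor_current, void (*drive)(int direction))
        {
            if (motor_current > MAX_MOTOR_CURRENT) { /* jammed valve: protect the motor */
                drive(0);
                return -1;
            }
            switch (cmd) {
            case CMD_OPEN:  drive(+1); break;
            case CMD_CLOSE: drive(-1); break;
            case CMD_HOLD:
                if      (position < setpoint - POSITION_TOLERANCE) drive(+1);
                else if (position > setpoint + POSITION_TOLERANCE) drive(-1);
                else                                               drive(0);
                break;
            }
            return 0;
        }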

  16. FactChecker says:

    [QUOTE="D H, post: 5563246, member: 42688"]The Space Shuttle flight software was written at a rate of about two or three lines of production code per person per month. The people behind that code weren't twiddling their thumbs for 159 working hours and then writing two or three lines of code. They were instead attending meetings (NASA loves meetings), writing requirements and specifications, critiquing the production code, writing and critiquing test code, evaluating test results, and tracing requirements to sub-requirements to specifications to code to tests, backwards and forwards. This was largely done by hand (the automated tools didn't exist), and of course was done without modern software development techniques such as Agile. While still very expensive, development of critical software has come a long ways since then.”The things they were able to do with such tiny (capability-wise) computers back then is amazing to me.  And they are still doing great things communicating with vehicles that were launched decades ago.

  17. D H says:

    [QUOTE="anorlunda, post: 5562506, member: 455902"]Where it comes to embedded real time control software, I dispute that so much has really changed.  In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.”The Space Shuttle flight software was written at a rate of about two or three lines of production code per person per month. The people behind that code weren't twiddling their thumbs for 159 working hours and then writing two or three lines of code. They were instead attending meetings (NASA loves meetings), writing requirements and specifications, critiquing the production code, writing and critiquing test code, evaluating test results, and tracing requirements to sub-requirements to specifications to code to tests, backwards and forwards. This was largely done by hand (the automated tools didn't exist), and of course was done without modern software development techniques such as Agile. While still very expensive, development of critical software has come a long ways since then.

  18. FactChecker says:

    [QUOTE="anorlunda, post: 5562506, member: 455902"]Where it comes to embedded real time control software, I dispute that so much has really changed.  In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.”In the military aerospace application, the control law software of 30 years ago can not be compared with current software.  A control law diagram from 30 years ago would fit on a single sheet of paper.  Modern control law software is on hundreds of pages of diagrams.  The requirements have exploded 1000 fold along with the associated SW complexity.

  19. QuantumQuest says:

    [QUOTE="anorlunda, post: 5562506, member: 455902"]Where it comes to embedded real time control software, I dispute that so much has really changed. In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.”I agree. I was talking about software at large and not particularly for embedded software.

  20. anorlunda says:

    [QUOTE="QuantumQuest, post: 5562415, member: 554291"]In the present days on the other hand, there are huge demands for software to solve way more difficult/demanding problems, operate on a lot of newer domains, …a”Where it comes to embedded real time control software, I dispute that so much has really changed.  In a few cases, modern software may have difficulty doing as well as some of the vintage stuff.

  21. QuantumQuest says:

    [QUOTE="anorlunda, post: 5562392, member: 455902"]Often neglected is the fact that many (perhaps most?) bugs are benign. They may never get triggered, or their negative effect not noticeable.An interesting example came up during the Y2K remediation. Some software in nuclear power plants had operated successfully for 35 years or more. Presumably, software standards were much better in 1999 than in 1965, so new software is expected to have many fewer bugs. On the other hand, the old software had amply demonstrated that any bugs remaining must be benign. New software may have few bugs (never say zero), but they may not be benign.So which is safer, the old or new software? That question should not be answered flippantly.That same question arose in many critical applications in many industries during the Y2K years. It is the age old debate between new and better versus proven.”In my opinion, each software – old vs, new, must be judged according to its domain and problem/s it solved/solves at its time.In the old days there were not so many programming languages and dialects of them, a few programming paradigms, testing was all but trivial and maintenance was difficult at best – talking about big software. Positive thing is that developers had the time and opportunity to focus better on what they were creating, so for that instant there were mostly benign bugs.In the present days on the other hand, there are huge demands for software to solve way more difficult/demanding problems, operate on a lot of newer domains, operate in an inter-domain fashion, and although there is a multitude of programming languages – this holds for multi-branch descendants too, a lot of programming and testing tools, libraries and frameworks, its complexity increases at a very steep fashion and even with the best designed tool chains for full cycle development, bugs are inevitable and this I think also justifies that are not so benign, as in the old days.I don't think that a direct comparison can be done between old and new software. There is a whole lot of independent and interdependent factors between old and new, as well as in each of them, that makes a direct comparison very difficult.

  22. anorlunda says:

    Often neglected is the fact that many (perhaps most?) bugs are benign.  They may never get triggered, or their negative effect not noticeable. An interesting example came up during the Y2K remediation.  Some software in nuclear power plants had operated  successfully for 35 years or more.  Presumably, software standards were much better in 1999 than in 1965, so new software is expected to have many fewer bugs.  On the other hand,  the old software had amply demonstrated that any bugs remaining must be benign.   New software may have few bugs (never say zero), but they may not be benign. So which is safer, the old or new software?  That question should not be answered flippantly. That same question arose in many critical applications in many industries during the Y2K years.  It is the age old debate between new and better versus proven.

  23. D H says:

    [QUOTE="StatGuy2000, post: 5553895, member: 339302"]This raises a question for me. Software plays a critical importance in systems where safety and/or reliability is of critical importance (e.g. nuclear power plants, electric grids, air-traffic control, medical equipment, etc.) Under those circumstances, the threshold for tolerance of bugs in software would be extremely low (if not close to non-existent). Under these circumstances, how could engineers or software developers ensure that the system would be as close to "perfect" as possible?”By working very carefully, by using thorough testing, by using processes that reduce the number of bugs that arise, by having a top-to-bottom attitude toward quickly stomping those few bugs that do arise, and by having others check everything.

  24. Svein says:

    If you are producing drivers for some high-speed peripheral (say Gigabit Ethernet hardware), your software is going to be run thousands of times every day. And – if the product is successful – in thousands of locations every day. The chance of even a small bug going unnoticed is close to zero.

    A hardware-related anecdote: I once wrote an Ethernet driver for a system-on-a-chip based on one of the ARM CPUs. The Ethernet driver worked perfectly, but somehow no IP packet made it to the IP handler. The cause turned out to be a combination of the Ethernet packet format and a peculiarity of the ARM CPU:

    • The 32 bit ARM CPU could only access words on a 32 bit boundary. If you violated this rule, the CPU did not protest, but read the closest 32 bit word, scrambling the contents along the way.
    • The Ethernet packet format starts with two 6-byte addresses plus a 2-byte packet type. Add this together, and you get a 14-byte offset from the start of the Ethernet packet to the payload.
    • The IP software assumes that the IP packet is aligned on a 32 bit boundary (or that the CPU can access the contents as if it were on a 32 bit boundary).

    Those three facts together resulted in the ARM CPU trying to make sense of a scrambled IP address. The solution: copy the payload to a buffer aligned on a 32 bit boundary (a bad solution, since copying a packet from one place in memory to another is slower than transmitting the packet across the Ethernet).
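
    In C, the workaround looks roughly like this (a sketch; only the 14-byte offset and the copy-to-an-aligned-buffer fix come from the anecdote above, and all names are invented):

        #include <stdint.h>
        #include <string.h>

        #define ETH_HEADER_LEN 14  /* 6 + 6 + 2 bytes, per the list above */

        /* The IP payload starts 14 bytes into the frame, so even if the frame
         * buffer is 32-bit aligned, the payload is not. On a CPU that silently
         * rounds misaligned loads down to the nearest word boundary, reading a
         * 32-bit IP address through such a pointer yields scrambled bytes.
         * The (slow) fix: copy the payload to an aligned buffer first. */
        void handle_frame(const uint8_t *frame, size_t frame_len,
                          uint8_t *aligned_buf /* 32-bit aligned, large enough */)
        {
            if (frame_len < ETH_HEADER_LEN)
                return;  /* runt frame: nothing to hand to the IP layer */
            memcpy(aligned_buf, frame + ETH_HEADER_LEN, frame_len - ETH_HEADER_LEN);
            /* ... pass aligned_buf to the IP handler, which can now safely
             * read 32-bit fields in place ... */
        }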

  25. Sherwood Botsford says:

    @StatGuy2000 If the figures are correct, then one way to do this is to minimize code size. Fewer lines = fewer bugs.

    For really critical code, write the code in different languages by different teams. Do they get the same results? Code in 3 languages, and run all three at the same time against real time data; two out of three separate systems have to agree to take action.

    Use discipline & bondage languages that enforce strong typing, bounds checking, and strict calling conventions. Sure, it slows down the code. You want it in a microsecond, or you want it right in 3 microseconds?

    Use standard libraries whenever possible. Someone else has found the bugs. Or most of them.

    A large number of problems aren't with the code, but with the operating system libraries. The obvious response to that is to use a well audited operating system. OpenBSD, last I looked, had an enviable reputation for how few exploitable bugs it shipped.

  26. mfb says:

    There are a lot of guidelines that help make bugs less frequent and easier to find. As an example, some commands or concepts are known to be bug-prone, and you can avoid them. Apart from that: code reviews, extensive test cases (typically written earlier than the code itself), and so on.

    A common way to avoid hardware issues (e.g. from radiation damage) is the majority vote system: have three identical systems, and do what at least two of the systems suggest to do. If one of them fails, its output is overridden by the other two systems.
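
    A minimal sketch of such a 2-out-of-3 voter in C (invented names; a real system votes on every output and treats total disagreement as a fault in its own right):

        #include <stdbool.h>

        /* Write the majority value of three redundant channels to *out and
         * return true; return false when all three disagree, which must be
         * reported as a failure rather than masked. */
        bool majority_vote(int a, int b, int c, int *out)
        {
            if (a == b || a == c) { *out = a; return true; }
            if (b == c)           { *out = b; return true; }
            return false;  /* no majority: flag the fault */
        }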

  27. StatGuy2000 says:

    DH, a well written article – thank you for sharing. I quote the following from your post above: "Well-crafted, well-inspected, and well-tested code typically has one bug per one thousand lines, or perhaps one bug per every ten thousand lines if done very carefully."

    This raises a question for me. Software plays a critical role in systems where safety and/or reliability is of the utmost importance (e.g. nuclear power plants, electric grids, air-traffic control, medical equipment, etc.). Under those circumstances, the threshold for tolerance of bugs in software would be extremely low (if not close to non-existent). How could engineers or software developers ensure that the system would be as close to "perfect" as possible?

  28. anorlunda says:

    My personal favorites for obscure bugs came on two occasions when the same person asked me for help.

    1. He couldn't find the problem despite numerous re-compilations adding PRINT statements for diagnostic purposes. He tried to re-compile once again, and out of the corner of my eye I caught the card reader shuffling the order of his cards.
    2. He couldn't find another bug. I helped him to narrow it down to a single statement, then a particular expression, then to floating multiply of two specific values.  I wrote a new program to multiply those two values and compare the result with the known answer.  It failed 0.003% of the time.  The floating point unit was replaced and the bug vanished.  Even in today's world, I suspect that programmers expect the CPU to faithfully execute the code with zero error rate.
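
    A sketch of the kind of harness described in item 2, in C (the values are invented; in the story above, the known-good answer came from outside the suspect hardware):

        #include <stdio.h>

        int main(void)
        {
            /* volatile defeats compile-time constant folding, forcing the
             * FPU to perform a fresh multiply on every iteration */
            volatile double x = 3.141592653589793;
            volatile double y = 2.718281828459045;
            const double expected = x * y;  /* reference product, computed once */
            long failures = 0;

            for (long i = 0; i < 10000000L; ++i)
                if (x * y != expected)
                    ++failures;

            printf("%ld failures in 10000000 multiplies\n", failures);
            return 0;
        }
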
  29. Borg says:

    [QUOTE="mfb, post: 5493691, member: 405866"]You can certainly overdo it with making new classes and functions for everything, of course – a function that does nothing but calling a different function can be useful, but if you do that 5 layers deep something could be wrong.”Definitely agree.  There is seldom a reason for code to be nested so deep.  Usually when I see something like that, the person should have written a recursive function.

  30. mfb says:

    [QUOTE="Borg, post: 5493686, member: 185214"][QUOTE="Sherwood Botsford, post: 5493671, member: 590802"]I knew a programmer who kept his tabs set at 8 characters, but used a 132 character window. If he found that his code was wandering off the right edge, he'd abstract a chunk of it out. His claim was that you could only understand so many levels deep.”Putting a chunk of code in a separate method to keep the code clean isn't the worst thing in the world.  8 space tabs is pretty excessive though.[/quote]Doing it only after ~8-15 indentation steps (not sure how to interpret the statement) is way too late, however.You can certainly overdo it with making new classes and functions for everything, of course – a function that does nothing but calling a different function can be useful, but if you do that 5 layers deep something could be wrong.

  31. Borg says:

    [QUOTE="Sherwood Botsford, post: 5493671, member: 590802"]I knew a programmer who kept his tabs set at 8 characters, but used a 132 character window. If he found that his code was wandering off the right edge, he'd abstract a chunk of it out. His claim was that you could only understand so many levels deep.”Putting a chunk of code in a separate method to keep the code clean isn't the worst thing in the world.  8 space tabs is pretty excessive though.You probably don't mean abstraction the way that I have to deal with it.  In my current project, I mostly have to deal with code that is accessed through abstract interfaces even though there may only be a single implementation of the class.  The interfaces are then created through multiple configuration files that are themselves abstracted.  There is no organization as to where various code is located in the source tree such that html-based code can often be found at the database layers making it extremely difficult to modify and test without running the server.  Then, toss in a few undocumented expectations of OS environment variables for extra fun.  Figuring out the code when someone does this can be a nightmare.  :oldruck:

  32. Sherwood Botsford says:

    [QUOTE="elusiveshame, post: 5493667, member: 510934"]How is that littering? It's providing insight to the next programmer who either has to troubleshoot or fix code that was untested. Would you rather look for a needle in a haystack, or have some guidance to help you fix troubled code?”Littering in a somewhat different sense is to scatter randomly.  Trees litter the ground with leaves.  Of late it has come to mean trash.It is likely to get you in trouble with your boss.  If you leave comments that are perjoritive about the code, and it results in an accident or financial loss, then the company as a group knew of the bug, and therefore is more culpable.On a complex project change submissions are reviewed by someone, so it often takes a conspiracy or very enlightened policy to get this sort of comment embedded.

  33. Sherwood Botsford says:

    The most arcane bug I've tracked was on an MS-DOS system. At that point drives larger than 33 MB were just coming into play. You had to split them into two drives because 32-and-change MB was all that DOS could address. Twice a day a BAT file would run, backing up the department accounting data to an external tape drive. I was called in because the backup would crash every time the BAT file ran. Olivetti, the machine's maker, had been around and swapped motherboards, and this and that. It took me a day to reproduce the problem. I basically made a clone of the original machine, including tape drive and controller. The problem didn't manifest itself if the machine had a single virtual drive, so the apparent data fault was due to the logical partition. I kept cutting the data on that drive in half. Two days later: if the first file on the second logical drive was under 509 bytes in length, the driver for the tape drive would crash.

    Some languages are more error prone than others. Something to be said for "discipline and bondage" languages like Pascal with strong typing and range checking.

    I knew a programmer who kept his tabs set at 8 characters, but used a 132 character window. If he found that his code was wandering off the right edge, he'd abstract a chunk of it out. His claim was that you could only understand so many levels deep.

    Another time a grad student came to me. "Can you help me get this code to run?" I looked at it. Fortran. Written with identifiers like R6 and Q27. "Look: First thing, give each variable a name that is at least 5 characters long, has a vowel, and is meaningful to you. Exception: you can use single characters for loop counters if the other end of the loop is on the same page. Second, for each subroutine write a 10 line description of what it is supposed to do." He grumbled and went away.

    Several days later he came back, and I gave him another lesson from my first programming course: "No gotos. No abnormal loop exits." That took him longer. While writing good code is hard, there are lots of ways to write bad code. He did eventually get his code to run, and his thesis involving black hole physics got him his master's.

  34. elusiveshame says:

    [QUOTE="Sherwood Botsford, post: 5493661, member: 590802"]So you litter the code with comments like,/* Somewhere in this block is a bug that bites when processing partial data segments *//* Untested code: proceed at own risk */”How is that littering? It's providing insight to the next programmer who either has to troubleshoot or fix code that was untested. Would you rather look for a needle in a haystack, or have some guidance to help you fix troubled code?

  35. Sherwood Botsford says:

    [QUOTE="eltodesukane, post: 5487045, member: 394501"]"It’s your name that will be on the code, so take full responsibility for it."Good advice, but usually this can not be done.How many times does a programmer says the code is not ready, but the employer says we release it now anyway?Same problem in design, architecture or else..If an architect is hired to replace a wonderful bay windows with a concrete wall, then the architect will do that.If this concrete wall is viewed as an ugly abomination, the decision maker has to be blamed, not the architect or designer hired to do it.”So you litter the code with comments like,/* Somewhere in this block is a bug that bites when processing partial data segments *//* Untested code: proceed at own risk */

  36. elusiveshame says:

    Nice article. It's interesting to see how major firms design and code their software. You're right, though, in that there is no perfect software.

  37. anorlunda says:

    I was once involved with a service organization that helped companies deal with the Y2K bug. When it was all over, the IT workers were fired without even a thank-you handshake, the business managers who created the service company went for a week-long celebration in Bermuda, and the public and the media said, “See, nothing happened. The Y2K bug was a myth in the first place.”

    I also think of the initial launch of Lotus 1-2-3 in 1983. The 1.0 release was limited and buggy. The news reported that the Lotus startup spent $5 million on advertising and only $1 million on the software itself. Their retort was simple. They said, “If this product is successful, we will have truckloads of money to abandon 1.0 and write a proper 2.0. If it is not successful, we will quickly abandon it anyhow. So every penny spent on debugging and quality is wasted.” As it turned out, customers liked the buggy initial release enough that the 2.0 version was indeed financed.

    My point is that society is very hypocritical about bugs and flaws. We get so easily indignant when hearing of bugs in so-called “mission critical” places. But the reality is that those programmers who slave to check for bugs (or to not create bugs in the first place) are among the least valued members of the profession. We will never have a Turing award for one of those people.

    Debugging is a very thankless task.

  38. newjerseyrunner says:

    I think code rot is the biggest producer of bugs and complicated code. Lots of times developers will come up with beautifully simple designs for complicated problems, but then requests start trickling in for changes that were never expected to be made. This causes refactoring of small parts of the code, which ends up making it more rigid after a while.

  39. .Scott says:

    I also like your article, although I would quibble about the difficulty of getting code down to 1 bug per 10,000 lines.
    With thorough code reviews, thorough black-box modular testing, thorough review of code-coverage results, and good programming talent, you can get to that level and still have good productivity. Also, having been involved in several such efforts, those quality procedures themselves are not the bear. Making all of that auditable is the bear. After all, someone is paying for that quality and they want evidence that they’re getting it.

  40. D H says:

    “Wow, that was a fun read. @D-H, that’s one Insights article that is really insightful.”
    Thanks, and thanks to everyone else who liked my article.

    “Life isn’t fair. Most developers are forced to follow orders and meet the requirements handed down, as you were.”
    That was my first job out of college. That’s to be expected for a freshout. My career has evolved since then. I’ve learned that “debugging the blank sheet of paper” is my personal briar patch. (Debugging the blank sheet of paper: “There’s supposed to be a design or an architecture for X on this blank sheet of paper. Fix it!”)

    A couple of random thoughts I did not put in my Insights article:

    The article I cited in my Insights article missed a key point. The end of that article suggested that all ten of those “historical software bugs with extreme consequences” would not have occurred with improved testing. That was not the case with my “historical bug.” We had tests, and the software passed those tests. I thoroughly understand the concept of filtering out bad data from remote sensors. While filtering is essential, it’s usually reserved for egregiously bad values such as what appears to be a 6+ sigma outlier. Digital noise is not Gaussian; thank you, cosmic rays.

    Even the lowest person on the totem pole working on software that must kill if written correctly (counterexample: software that erroneously starts WWIII), software that must not kill if written correctly (counterexample: the [URL=’https://en.wikipedia.org/wiki/Therac-25′]Therac-25 software[/URL]), or software that must not lose 400 million dollars if written correctly (counterexample: the [URL=’https://en.wikipedia.org/wiki/Knight_Capital_Group’]Knight Capital fiasco[/URL]) bears the burden of taking ownership of one’s code. These are fields where you do not ship just because the boss says “ship it, now!”

  41. Borg says:

    “Who’s setting this allotted time?” I’ve been on projects where management set the time and where the developers were asked to provide an estimate. I’ve never felt that I couldn’t question an estimate, even if I was the one who provided it in the first place.

  42. rootone says:

    There was a time when I made most of my living from freelance programming.
    On one occasion I was tasked with putting right a buggy application after the original coder had departed and was seemingly uncontactable;
    furthermore, there was little in the way of any documentation other than a description of what the system was supposed to achieve.
    I told the client that it looked like there would need to be, at minimum, a couple of weeks just going through the code, testing things and making notes.
    The client was not happy to be told that and said they would get somebody else to do the job.
    In the end I think what happened is the whole thing got rewritten from scratch by somebody.

  43. rcgldr says:

    “In my experience, there is rarely a case of not being able to do the job within the allotted time.”Who’s setting this allotted time? The two main issues I’ve seen are overly optimistic schedules set by management, or unexpected technology issues, usually related to hardware.

    In DH’s example, the issue wasn’t software bugs, as the software was doing what it was asked to do, which was to mask certain types of anomalies.

    I was spoiled by my first job, back in 1973. It was a multi-processor / multi-tasking online database system for pharmacies (mostly prescriptions in the database, with support for insurance billing features). The system never experienced data corruption. There were instances of hardware failures that resulted in temporary down time, but the backup/recovery procedures, developed using a test system, worked the first time they were used in an actual failure. This was a near mission-critical environment. At the other extreme, I’ve worked in environments where a project was thrown together just to release something, and most of the software effort was spent tracking down and fixing bugs. In a few rare cases, there was a dual effort: the quick and dirty approach just to release something (like a beta release), in parallel with a proper development cycle to produce code that would replace the quick and dirty release.

  44. anorlunda says:

    Wow, that was a fun read. @D-H, that’s one Insights article that is really insightful.

    Life isn’t fair. Most developers are forced to follow orders and meet the requirements handed down, as you were. More fortunate developers are ahead of the curve. They create the future, then show users what they really wanted, contrary to what they asked for. Perhaps the most famous example of that was Steve Jobs and his team with the iPhone. But an even better example was Dan Bricklin and Bob Frankston with VisiCalc.

  45. Borg says:

    “"It’s your name that will be on the code, so take full responsibility for it."
    Good advice, but usually this cannot be done.
    How many times does a programmer say the code is not ready, but the employer says we release it now anyway?”
    In my experience, there is rarely a case of not being able to do the job within the allotted time. If a developer gets sidetracked by other priorities, then the deadline is extended or someone else picks up the slack. Quite often, when I hear this excuse, it’s from someone who isn’t doing their job correctly: either they’re goofing off or they’re stuck and are too afraid (or proud) to ask for help. I have no pity for the goof-offs, and the prideful can be their own worst enemy. For everyone else, a simple five minute discussion of how to tackle a problem can make all the difference. I have no problem asking a junior developer how something works if I think that he has a better insight into it.

  46. Hornbein says:

    I’ve been a pro programmer but my main interest is music. The standards are completely different. In music your stuff has to be pretty close to perfect if you want to make a living. In software, a total incompetent with credentials can make a living. The demand for programmers is so high that you can get away with anything. In music, demand is so low that you can be very talented and lose money.

    If the situation were reversed, software WOULD be perfect. You’d starve if it weren’t.

    Steve Morse won a ton of Best Guitarist polls. Keyboardist T Lavitz said of guitarist Steve Morse that in five years of rehearsing and performing very difficult music he never heard Steve make a mistake. Not once. Nevertheless he couldn’t make it as a solo act. Classical music is even more demanding, to a degree that’s almost inconceivable. If programming were like that, you’d have to start at age three then do it five hours a day for the rest of your life in order to have a chance to make it. And not a very good chance at that.

    Guitarist Mick Goodrick advised people to stay out of music if they could do anything else. Indeed, those who make it a career generally can’t do anything else. I think there is no space left over in the brain for anything else. It’s too demanding.

    When I grew up I found that many of those famous jazz musicians like Barney Kessell really made their money playing for TV commercials and stuff like that. It’s kept secret because it’s depressing. Entertainers can’t be depressing.

    Yes, I know, lots of pop stars have little musical ability. That’s different. Pop stardom has almost nothing to do with music.

  47. eltodesukane says:

    "It’s your name that will be on the code, so take full responsibility for it."Good advice, but usually this can not be done.How many times does a programmer says the code is not ready, but the employer says we release it now anyway?Same problem in design, architecture or else..If an architect is hired to replace a wonderful bay windows with a concrete wall, then the architect will do that.If this concrete wall is viewed as an ugly abomination, the decision maker has to be blamed, not the architect or designer hired to do it.

  48. QuantumQuest says:

    Really nice article. I share the same thoughts as Borg. What I tried to do for myself to improve my skills from the outset was to be pedantic enough about testing, and especially about proper commenting and documentation, in the web world where I began my professional coding endeavor. Back in that era there was a lot of improvisation, catchy things to include, and proprietary things to follow in order to survive. Fortunately this “web 1” era is long gone. After a specific point, I jumped onto the bandwagon of enterprise software – mainly Java – but I kept on taking some medium web projects sometimes just for the fun of it. I found many times poorly documented code and ad-hoc solutions, courtesy of top developers. This does not help anyone. When such developers leave a company, their leftovers are just a mess. Surviving these things, although part of the process, is something I personally try to avoid recreating. There is always room to write good code that can be followed by others. I also vote for simplicity, provided that it fits the project at hand. I definitely agree with the three final points that are made in this Insight, especially with the fact that there has never been and can never be perfect code. There are bugs that will survive almost every testing process and manifest themselves at the wrong time. If we talk about real-time software, this can be fatal. Continuous testing and adaptation, including modifications, must be done on a regular basis.

  49. Pepper Mint says:

    I share the same thoughts as Borg. I realize the best part of coding, or software development in general, is to simplify things as much as possible while guaranteeing that the basic functions provided are preserved at the same time. This also helps to reduce costs for business development (e.g. maintenance) and resource management (e.g. extra experts are no longer needed), etc.

  50. Borg says:

    Nice article DH. One of the biggest lessons that I had to learn as a junior developer was that complexity did not mean that previous developers knew what they were doing. My first reaction would be that they must be really good programmers if they could write code that was difficult to follow. I would be afraid to make changes for fear of breaking something in all that complexity. These days, simplicity is my goal and I have no problem taking a chain saw to overly complex code as I’m doing on my current project. The link in my signature is my mantra for what not to do when writing software.

  51. jedishrfu says:

    Wonderful article! It brings back memories of working with GE and crashing the system while running in master mode with some software that was never designed to run in that environment. Guess who got the credit/blame for the crash. Even after it was explained to the technical staff, one customer service rep piped up at the end: “So it was xxxx who crashed the system, right?” The crash actually had a bright side: it illustrated how another service was crashing things, and we found the bug there too. But still, the one who crashed it lives on…
