Humble Pi is a book by Matt Parker about maths mistakes. In it, he goes through a series of stories and explains what had to go wrong for each of them to end in dire consequences. Usually, many layers of protection have to fail before there are real-world consequences. This pushed me to think about my own mistakes and whether or not they had actual consequences.
I make mistakes all the time, especially in my work. I write buggy code, I misplace things, I make spelling mistakes. Some of those end up costing time, others cost money, to me or to someone else.
Let's take a simple example: the time I lost all of the music I had gathered over the years. I was in university. One day I had to reinstall my phone from scratch, and after doing so I realized I did not have a single copy of my music library left. How could this happen? I had started with three copies of these files: one on my computer, one on an external hard drive and one on my phone. I didn't often add new things to this library, so I did not interact with those copies often. A few months after I had first backed everything up on the hard drive, I needed to reuse that drive to transfer a large amount of data from one computer to another. My music was taking up too much space, so I deleted it, which was fine because I still had two other copies. I made a mental note to copy the files to another hard drive but never did. I probably forgot, or was too lazy to do it.
Then, a few weeks after that, I decided it was time to reinstall the computer that held the music from scratch. So I wiped the hard drive. I did think about the music files, but I remembered that I still had a copy on my phone, so I thought I would be fine. Again, I made a mental note to copy the files from the phone and moved on.
Finally, when I had to reinstall the phone, I had already lost all of my backups and had completely forgotten that I needed them. I was also probably a bit lazy. In a few months, I had gone from three copies to zero. Fortunately, the only things I lost were a bunch of audio files that a bit of time could restore. Later I started using Spotify, which solved the problem completely.
What I find interesting in this story is that one of the main reasons I got rid of my safety nets was that making a backup would take time and effort. The same is true of most safety nets I use in my programming work: it takes effort to write tests, and it takes effort to make your continuous integration tool test your code and reject buggy updates. It also goes beyond that: the deployment procedures we use for some of our most critical software often seem cumbersome. But they have saved me from outages so many times that I would never think of getting rid of them.
All of this made me realize something that I probably already knew: what we need to strive for, in software engineering at least, is not to reduce the amount of effort needed to produce new software. It is to reduce the combined amount of effort AND pain. You can take a hit in the effort department if it significantly reduces the probability of failure. You also need to factor in the amount of pain a failure generates: for some applications, even one failure might be catastrophic, so it makes sense to put a lot of effort into making sure they work properly.
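To make that trade-off a bit more concrete, here is a minimal sketch. It assumes the combined cost can be modelled as the upfront effort plus the pain of a failure weighted by its probability, which is a simplification introduced for illustration. The function name `expected_cost` and all of the numbers are hypothetical, in arbitrary units, and not measurements of anything.

```python
def expected_cost(effort: float, failure_probability: float, pain: float) -> float:
    """Upfront effort plus the probability-weighted pain of a failure.

    Assumption: effort and pain are expressed in the same arbitrary unit
    (hours, for example) so that they can be added together.
    """
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    return effort + failure_probability * pain


# Hypothetical scenario: the safety net triples the effort
# but halves the probability of a painful failure.
no_safety_net = expected_cost(effort=1.0, failure_probability=0.5, pain=20.0)
with_safety_net = expected_cost(effort=3.0, failure_probability=0.25, pain=20.0)

print(no_safety_net)    # 11.0
print(with_safety_net)  # 8.0
```

With these made-up numbers, the safety net costs more effort and still comes out ahead, because the pain it avoids outweighs the extra work. If the pain of a failure were small, the same calculation would tell you to skip the safety net.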
Before concluding, I would just like to add a link to Matt Parker's personal website, http://standupmaths.com, and to https://mathsgear.co.uk, the maths merchandise website he runs with a couple of friends.
Happy New Year!