Today, I stumbled on this short paper about how systems fail and I had to share it with others. I think it is that good :)
The paper is 5 pages. No nonsense or fluff. Just information about failures in systems, our mistakes while processing these failures, what we often attribute the failures to, etc. It is all laid out as pithy one-liners with short descriptions. The only possible downside is that there are no illustrative examples.
While the paper is rooted in patient safety, I think it applies to all systems that deal with safety. This includes software. Even ignoring safety, the paper offers good info about failures in systems that can be applied to software.
If you are into system reliability and safety (or software development), then give it 25 minutes of your time (if you haven’t already).