An import bug escaped into production this week. The root cause analysis took us to the usual place; “If we had more test time, we would have caught it.”
I’ve been down this road so many times, I’m beginning to see things differently. No, even with more test time we probably would not have caught it. Said bug would have only been caught via a rigorous end-to-end test that would have arguably been several times more expensive than this showstopper production bug will be to fix.
Our reasonable end-to-end tests include so many fakes (to simulate production) that their net just isn’t big enough.
However, I suspect a mental end-to-end walkthrough, without fakes, may have caught the bug. And possibly, attention to the “follow-through” may have been sufficient. The “follow-through” is a term I first heard Microsoft’s famous tester, Michael Hunter, use. The “follow-through” is what might happen next, per the end state of some test you just performed.
Let’s unpack that: Pick any test, let’s say you test a feature to allow a user to add a product to an online store. You test the hell out of it until you reach a stopping point. What’s the follow-on test? The follow-on test is to see what can happen to that product once it has been added to the online store. You can buy it, you can delete it, you can let it get stale, you can discount it, etc… I’m thinking nearly every test has several follow-on tests.