Just before the holidays, we went live with another fairly large group of users who need slightly different features from our previous users. The bug DB is quickly filling up with bugs discovered in production. These bugs are logged by the business/support arm of our team because testers can’t keep up. Many of the bugs don’t have repro steps and appear to be related to multiple users, performance, deadlocking, or misunderstood features. Other bugs are straightforward: oversights uncovered when users take the app through new paths for the first time.
My team is struggling to patch critical production issues to keep the users working through their deadlines. I want to investigate every new bug to determine its repro steps and prepare to verify its fix. Instead, I’m jumping from one patch to the next, attempting to certify the patches for production. Keeping up is difficult. New emails arrive every few seconds.
This is an awkward phase, but the team is reacting well, maintaining a good reputation for quick fixes. Nevertheless, I’m stressed.
Am I doing something wrong?
Do I suck for letting these bugs get to prod in the first place?
Should I be working late every night to clean up the bug DB?
Am I the bottleneck, too slow at getting patches to users?
Should I certify patches that only partially fix a bug?
Should I be writing new tests to verify these bugs and prevent them from returning?
Should I be out in the trenches, watching how users actually behave?
Can you relate?