At Automattic we’re always dogfooding; we’re always testing as we develop and therefore we’re always finding bugs.
When I come across one of these bugs I try to capture as much as I can about what I was doing at the time when it happened, and also capture some evidence (a screenshot, visual recording, error messages, browser console) of it happening. But even then on occasion I can’t work out how to reproduce a bug.
This doesn’t mean the bug isn’t reproducible; it’s just that there are so many factors in the highly distributed, multi-level systems we develop and test that I don’t know how to recreate the exact circumstances, and therefore I don’t know how to reproduce the bug.
The exact set of circumstances that may cause a bug is never entirely reproducible, just as you can never live the exact same moment of your life again.
So, when this bug occurred, perhaps the server was under a particular level of load at that exact moment in time. Or perhaps one of the load balancers was restarting, or maybe my home broadband speed was affected by my children streaming The Octonauts in the next room.
So it can seem almost like a miracle when you do know how to reproduce a bug.
But what do you do with all of those bugs that you haven’t been able to reproduce?
As with many things, I am of two minds about what to do with these bugs.
Bug reports without instructions on how to reproduce them are much more difficult to action since there’s no hypothesis listed to work against.
I recently came across a bug which bothered me because I couldn’t reproduce it. I didn’t raise it, since I didn’t have what I thought was enough information to make the bug report actionable, or testable once it was ‘fixed’.
A couple of weeks later I heard reports of this bug happening to customers. I quickly raised it without reproducible steps, but with a hunch about what might be causing it.
This happened to be enough information for a developer to look through the code and find something that wasn’t quite right in particular circumstances: there was a race condition.
I was able to use Chrome’s built-in throttling tools to reproduce a similar issue, and therefore I could verify the behaviour both before and after the fix was applied.
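To illustrate why throttling can surface this kind of bug, here is a minimal sketch in Python. It is not the actual bug or code from this story; the search box, the `fetch` helper and the latency values are all hypothetical. It shows two requests sent in order, where the last response to arrive wins, so changing only the timing changes the result.

```python
import asyncio


async def fetch(query, latency):
    # Hypothetical request: the sleep stands in for network latency.
    await asyncio.sleep(latency)
    return f"results for {query}"


async def search_box(latencies):
    # Hypothetical UI state: whichever response arrives last is what gets shown.
    state = {"shown": None}

    async def on_keystroke(query, latency):
        state["shown"] = await fetch(query, latency)

    # The user types "a" and then "ab"; both requests are started in that order.
    await asyncio.gather(
        on_keystroke("a", latencies[0]),
        on_keystroke("ab", latencies[1]),
    )
    return state["shown"]


# Usual conditions: responses come back in the order they were sent.
fast = asyncio.run(search_box((0.01, 0.02)))

# Simulated throttling: the first request is slow, so its stale response lands last.
throttled = asyncio.run(search_box((0.05, 0.01)))

print(fast)
print(throttled)
```

Under the usual timings the newest results are shown, but once the first request is slowed down its stale response overwrites the newer one. Nothing in the code changed between the two runs, only the timing, which is why a bug like this can look impossible to reproduce until the timing is controlled deliberately.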
This exercise has taught me to raise every bug that I find (unless it already exists as a bug report, of course), with as much detail as I can at the time, even if that means I don’t know how to specifically reproduce it.
There is a good chance that the report will contain enough detail to allow someone familiar with the codebase to do some targeted investigation, looking for potential ‘code smells’ that could be causing the issue, and then to fix it and explain what was wrong.
What’s your experience in raising bugs you don’t know how to reproduce?