Should you raise bugs you don’t know how to reproduce?

At Automattic we’re always dogfooding; we’re always testing as we develop and therefore we’re always finding bugs.

When I come across one of these bugs I try to capture as much as I can about what I was doing at the time, along with some evidence of it happening (a screenshot, a screen recording, error messages, the browser console). But even then, on occasion, I can't work out how to reproduce a bug.

This doesn't mean the bug isn't reproducible; it's just that there are so many factors at play in the highly distributed, multi-level systems we develop and test that I don't know how to recreate the exact circumstances, and therefore I don't know how to reproduce the bug.

The exact set of circumstances that causes a bug is never entirely reproducible, just as you can never live the exact same moment of your life again.

So, when this bug occurred, perhaps the server was under a particular level of load at that exact moment in time. Or one of the load-balancers was restarting, or maybe even my home broadband speed was affected by my children streaming The Octonauts in the next room.

So it can seem almost like a miracle when you do know how to reproduce a bug.

But what do you do with all of those bugs that you haven’t been able to reproduce?


Fixing bugs in production: is it that expensive any more?

You’ve most likely seen a variant of this chart before:

[Chart: the relative cost of fixing a bug, rising steeply from the requirements phase through to production]

I hadn't seen it for a while until yesterday, but it's an old favourite of test managers and test consultants, used to justify a lot of testing before releasing to production.

But I question whether it’s that accurate anymore.

Sure, in the good old days of having a production release once or twice a year, it cost an order of magnitude more to fix a bug in production. But does it really cost that much more in the present age of continuous delivery and continuous deployment, where we release into production every fortnight, week, or day?

If the timeline on the chart above is a year, then of course bugs will cost more to fix: if the project took a year, you presumably don't have a very rapid software development process. There are also more likely to be requirements 'bugs' in production, because an awful lot happened during the year the requirement was being developed. Hence along came agile, with its smaller iterations and frequent releases.

Mission critical systems aside, most web or other software applications we build today can be easily repaired.

Big waterfall projects, like building a plane, are bound to fail. The Boeing 787 Dreamliner was an epic fail. Not only was it years late after five separate delays, it had two major lithium-ion battery faults in its first 52,000 hours of flying, which caused months of grounding, no doubt affected future sales, and cost millions of dollars in damages. But it seems to have been well tested:

“To evaluate the effect of cell venting resulting from an internal short circuit, Boeing performed testing that involved puncturing a cell with a nail to induce an internal short circuit. This test resulted in cell venting with smoke but no fire. In addition, to assess the likelihood of occurrence of cell venting, Boeing acquired information from other companies about their experience using similar lithium-ion battery cells. On the basis of this information, Boeing assessed that the likelihood of occurrence of cell venting would be about one in 10 million flight hours.”

NTSB Interim Report DCA13IA037 pp.32-33

After months of grounding, retesting, and completely redesigning the battery system, the cause of the original battery failures is still unknown. If they can't work out what the problem is after it has occurred twice in production, it's not likely it could have been found or resolved in initial testing.

But most of us don’t work on such mission critical systems anyway.

And production fixes can be very easy.

Take this very different example: I provide support for a production script that uploads a bunch of files to a server. There was a recent issue where a file name had an apostrophe in it, which meant the file was skipped when it should have been uploaded.

Upon finding out about the problem I immediately looked at my unit tests. Did I have a unit test with a file name containing an apostrophe? No, I didn't. I wrote a quick unit test; it failed, as expected. I made a quick change to the regular expression constant that matches file names so it included apostrophes, then reran the unit test, which passed. Yippee. I quickly reran all the other unit and integration tests and all passed, meaning I could confidently package and release the script. All of this was done in a few minutes.
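
As a rough sketch of what that looked like (the real script isn't shown here, so the pattern, the isValidFileName function, and the file names are all hypothetical, and I've used JavaScript purely for illustration):

    const assert = require('assert');

    // Hypothetical version of the fix: the original pattern had no
    // apostrophe in its character class, so "o'reilly.txt" was skipped.
    const FILE_NAME_PATTERN = /^[\w\-. ']+\.\w+$/;

    function isValidFileName(name) {
      return FILE_NAME_PATTERN.test(name);
    }

    // The test I wrote first, which failed before the fix:
    assert.ok(isValidFileName("o'reilly.txt"),
      'file names with apostrophes should be uploaded');

    // And a quick check that existing behaviour still holds:
    assert.ok(isValidFileName('report-2013.txt'));
    assert.ok(!isValidFileName(''));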

I could possibly have prevented this by doing more thorough testing to begin with, but I'm pretty sure that would have taken more effort than it did to fix the production bug by writing a test for it and repackaging the script. So for me there was no increase in cost whatsoever in finding that bug 'late'.

Unless you're working on mission-critical software, shipping some bugs into production is almost always better than shipping no software at all. If you make very small, frequent deployments into production, the cost of fixing bugs once they have gone live will only be marginally greater than the cost of trying to find every bug before you ship. And the longer you spend making sure your requirements are 100% correct and everything is 100% tested, the more likely, ironically, your software is to be out of date, and hence incorrect, once you finally go live.

Should you raise trivial bugs?

This post is part of the Pride & Paradev series.


You should raise trivial bugs

Some of the world's best companies have become that way through attention to detail. There are lots of famous stories about Steve Jobs's pedantic nature when he was in charge of Apple: for example, how he would debate for half an hour over the shade of grey for the bathroom signs in Apple Stores, or how he rang Google's Vic Gundotra at home early on a Sunday morning to let him know the shade of yellow in Google's logo was wrong on the iPhone, and that he had already put an Apple engineer on it.

Raising trivial bugs is paying close attention to detail. Whilst customers might not notice something so little that is wrong, from little things big things grow.

You should raise trivial bugs even if you choose not to fix them; that way you keep a list of the known bugs you are releasing into production, should you choose to fix them one day.

You shouldn’t raise trivial bugs

The problem with raising trivial bugs is that it slows your velocity and takes focus away from other things: namely building new functionality to release to customers.

If you focus your effort on fixing 100% of the issues ever identified with your application, you will never actually ship the thing; you'll just continually try to perfect it. Fixing one trivial bug may then highlight other trivial bugs, and so the cycle begins.

By the time you release your application, a trivial bug may not even matter that much, as it may be in a feature that won't even be used: but you can only find this out by releasing your product into production, trivial bugs and all.

Should software testers fix the bugs they find?

This post is part of the Pride & Paradev series.


Software testers should fix the bugs they find

I admit it, I often fix bugs I find. I can’t help it.

After being on a project for a couple of months you start to notice the same trivial bugs being found again and again. If you know how to fix it, why not fix it?

One example is a trailing comma in a JavaScript array: it'll make IE7 go berserk, and it's easy enough to fix (remove the trailing comma), so I'll just fix it if I find it. Another is a button that is meant to submit a form but won't in IE7: to support IE7 you need an input type=submit instead. Again, I'll change it so that it works.
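
To show the first of those in code (a minimal sketch; the array contents are made up):

    // Broken in IE7: old versions of JScript count the trailing comma
    // as an extra, undefined element, so this array has length 4 in
    // IE7 but length 3 in other browsers.
    var broken = ['red', 'green', 'blue', ];

    // Fixed: remove the trailing comma and the length is 3 everywhere.
    var fixed = ['red', 'green', 'blue'];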

The benefits are that user stories move to 'done' faster, as the fix doesn't require programmer involvement, and it's less disruptive to the programmer, who will be working on another story and would otherwise need to context-switch.

Software testers should not fix the bugs they find

You'll often be tempted to do a quick bug fix when you know why something is broken, but you should avoid it. If you quickly fix it, the programmer who created the bug doesn't get the feedback that they made a mistake, and will repeat the same mistake over and over again.

Over time, if the programmers know that you’ll fix bugs, they’ll naturally start providing you with buggier code as they know that you’ll just fix it as needed.

Programmers crave feedback, both positive and negative; that's why it's good to have a tester on an agile team. But fixing bugs yourself means there's less feedback being given, and less communication happening.

There's a small chance that you may introduce regression bugs when fixing a bug yourself, but this alone isn't a reason to stop: there are better ones, which I have already mentioned.

Another minor reason is that it may look like you're not finding bugs, but again this shouldn't be a reason on its own, because testers shouldn't be measured on how many bugs they find.

Bug tracking & cane toads

This post is part of the Pride & Paradev series.


Should agile software development teams use tools to track bugs?


Don’t use a tool to track bugs

When you're part of a collaborative, co-located agile team working in an iterative manner, it's often more efficient to fix bugs as they're found than to spend time raising and tracking them in a bug tracking tool. The time spent raising and managing a bug is often greater than the time it takes to actually fix it, so in that case it's better avoided.

Most testers aren't comfortable with this approach, initially at least, because it may look like they're not raising bugs. But a tester should never be measured on how many bugs they have raised: doing so encourages testers to game the system by raising insignificant bugs and splitting one bug into many, which is a waste of everyone's time.

Once a tester realizes their job isn't to record bugs but to deliver bug-free stories, they will be a lot more comfortable not raising and tracking bugs. The only true measurement of the quality of the testing performed is bugs missed, and those aren't recorded anyway.

One approach I have taken is to simply record bugs on sticky notes or index cards stuck to the team's story wall. This is a lightweight approach: the only time taken is writing the sticky note, and once the bug is resolved the note can be scrunched into a ball, a symbolic act of squashing the bug.


Use a tool to track bugs

“What’s measured improves” ~ Peter Drucker

If you’ve got remote team members, you can’t really avoid using a tool to track bugs. It ensures you’re communicating and tracking the progress effectively across geographic borders.

Without some form of bug tracking tool in place on your project, it's difficult to keep a historical record of bugs and how they were resolved. Without that record, some of the nastiest bugs may reappear: I call these cane toads.

Beware of Cane Toads

If you weren't aware, cane toads are a highly invasive species of toad in Australia, introduced to control native cane beetles but now threatening native wildlife. They have two notable characteristics:

  1. They secrete toxic poison affecting their surroundings; and
  2. They have an uncanny ability to survive even the harshest of conditions (here in Queensland there are competitions to kill cane toads but they’re amazingly hard to kill: just when you think you’re done with one it’ll bounce back to life).

Therefore, a cane toad on a software project is an issue that:

  1. Causes other issues by secreting toxic poison; and/or
  2. Seems to come back to life even though you’re sure you’ve already killed and buried it before.

Without tracking these cane toads, and how you fixed them, you'll see them reemerge in patterns you can't explain. With a record, you can easily look up when a bug last happened, what you did about it then, and why it shouldn't be happening again.

This is also why it’s really important to have automated regression tests.
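
A lightweight way to keep a toad dead is to pin it down with an automated regression test named after the original bug (a hypothetical JavaScript sketch; the bug number and the parseAmount function are made up):

    const assert = require('assert');

    // Hypothetical: parseAmount once returned NaN for values with
    // thousands separators. That was cane toad #42; this test makes
    // sure it stays buried.
    function parseAmount(text) {
      return parseFloat(text.replace(/,/g, ''));
    }

    // Regression test for bug #42: if this fails, the toad is back.
    assert.strictEqual(parseAmount('1,234.50'), 1234.5);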

You must keep bug tracking tools as lightweight as possible: they should allow rapid recording and resolution of bugs. Avoid the temptation to introduce a 'defect workflow' of any description; empower people to do the right thing when tracking and resolving bugs, and don't make it harder than it should be.

An even better approach is to incorporate bug tracking into your existing user story tracking tool. Trello allows checklists on user story cards: these are a great way to track bugs in a lightweight, flexible manner.

[Screenshot: Trello bug tracking, with bugs as checklist items on a user story card]

Software Quality Metrics

Software quality metrics are an interesting topic and, in my experience, there doesn't seem to be a widely used or accepted list of metrics for measuring software quality. After many years of thought on the topic, and many years of trialing different metrics, I believe the number one metric that accurately measures software quality is defects in production. Quality software doesn't have defects in production, so that's the metric we should use to measure whether testing has been done successfully.

Various organizations I have worked in have used this metric in different ways. One organization called each production defect a 'quality spill'. Another used a mean time to failure (MTTF) metric, which is often used to measure the reliability of a production system or machine: think of your car, and how long it runs before it breaks down.
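
As a rough sketch of how that metric works (the numbers here are invented): if a system accumulates 10,000 hours of operation across 5 failures, its MTTF is 10,000 / 5 = 2,000 hours. A rising MTTF from release to release suggests production quality is improving; a falling one is an early warning.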

The issue I have with some other software quality metrics is that they motivate people the wrong way. For example, a metric based on bug count encourages testers to report bugs; but it can also encourage them to report bugs that aren't bugs, or to split one major bug into multiple tickets, just so the metrics look good. And is a high bug count (in test) a bad thing? Doesn't it mean you got all the bugs? Does a low bug count mean the developers are doing a good job, or that you didn't catch all the bugs? That's why defects in production are a true measure of software quality. No one wants bugs in production; they cause all sorts of headaches. In the last few days there have been numerous embarrassing, public computer glitches, some related to the start of the year 2010. Have we become complacent after Y2K?

  • 3 Jan 2010: “Businesses stung by BOQ computer bug” (link)
  • 3 Jan 2010: “Bank of Queensland’s (BOQ) Eftpos terminals go down costing retailers thousands” (link)
  • 3 Jan 2010: “Chaos as check-in problems affect Qantas” (link)
  • 3 Jan 2010: “Flights delayed after check-in system malfunction” (link)
  • 10 Dec 2009: “Computer glitch brings Brisbane trains to a standstill” (link)
  • 16 Dec 2009: “Check-in failure sparks Brisbane Airport delays” (link)
  • 16 Nov 2009: “Computer glitch delays Qantas flights” (link)

What's interesting is that the Amadeus system Qantas uses failed in November and failed again today. The lesson here is that if you do discover bugs in production, you should make sure you fix them.