Does your organization conduct extensive post-release testing in production environments?
If you do, it probably indicates an unhealthy testing process: you’ve fallen into the “let’s just test it in production” trap.
If testing in non-production environments were reflective of production behaviour, there would be no need to test in production at all. But often testing isn’t reflective of real production behaviour, so we test in production to mitigate the risk of things going wrong.
It’s also often the case that issues found in a QA environment don’t appear in a local development environment.
But it makes much more sense to test in an environment as close to where the code was written as possible: it’s much cheaper, easier and more efficient to find and fix bugs early.
For example, say you were testing how a feature behaves at different times of day across several time zones. As you progress through the test environments, this becomes increasingly difficult to test:
In a local development environment: you could fake the time and timezone to see how your application behaves (see the sketch after this list).
In a CI or QA environment: you could change the time on a single server and restart your application to see how it behaves under various time scenarios: not as easy as faking the time locally, but still fairly easy to do.
In a pre-production environment: you’ll probably have clustered web servers, so you’ll be looking at changing the time on something like six or eight servers to test this feature. Plus it will affect anyone else using the environment.
In a production environment: you’ll need to wait until the actual time to test the feature as you won’t be able to change the server times in production.
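As a concrete illustration of the local case, here’s a minimal sketch of faking time in a test, assuming a Python codebase and the freezegun library (any time-mocking approach works the same way); the feature and test names are hypothetical:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

from freezegun import freeze_time


def greeting(now_utc: datetime, tz_name: str) -> str:
    """Hypothetical feature under test: the greeting depends on local time of day."""
    local = now_utc.astimezone(ZoneInfo(tz_name))
    return "Good morning" if local.hour < 12 else "Good evening"


@freeze_time("2024-06-01 21:00:00")  # freeze the clock at 21:00 UTC
def test_next_morning_in_sydney():
    now = datetime.now(timezone.utc)
    # 21:00 UTC is 07:00 the next day in Sydney (UTC+10)
    assert greeting(now, "Australia/Sydney") == "Good morning"


@freeze_time("2024-06-01 21:00:00")
def test_same_evening_in_new_york():
    now = datetime.now(timezone.utc)
    # 21:00 UTC is 17:00 the same day in New York (UTC-4 in June)
    assert greeting(now, "America/New_York") == "Good evening"
```

Running the same scenarios in pre-production or production means changing server clocks or waiting for those wall-clock times to come around, which is exactly the cost described above.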
Clearly it’s cheaper, easier and more efficient to test changing times in an environment closer to where the code was written.
You should aim to conduct as much testing as you can in earlier test environments and taper it off, so that by the time you get a change into production you’re confident it has been tested comprehensively. This probably requires some change to your testing process though.
How to Remedy A ‘Test in Production’ Culture
As soon as you find an issue in a later environment, ask: why wasn’t this found in an earlier environment? Ultimately, ask: why can’t we reproduce this in a local environment?
Some Hypothetical Examples
Example Two: our tests failed in pre-production but didn’t fail in QA, because pre-production gets a regular backup of the production database whereas QA often gets very out of date. We schedule a task to periodically restore the QA database from a production snapshot to ensure the data is reflective of production.
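A rough sketch of what that scheduled refresh could look like, assuming Postgres and a nightly sanitised snapshot; the snapshot path, host name and database name are placeholders, not real infrastructure:

```python
import subprocess

SNAPSHOT_PATH = "/backups/production-latest.dump"        # assumed snapshot location
QA_DATABASE_URL = "postgresql://qa-db.internal/app_qa"   # assumed QA connection string


def refresh_qa_database() -> None:
    """Restore the latest production snapshot into QA so test data stays current."""
    subprocess.run(
        ["pg_restore", "--clean", "--if-exists", "--no-owner",
         "--dbname", QA_DATABASE_URL, SNAPSHOT_PATH],
        check=True,
    )


if __name__ == "__main__":
    # Run from cron or a CI schedule, e.g. nightly before the QA test suite.
    refresh_qa_database()
```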
Example Three: our tests failed in production but didn’t fail in pre-production, because emails weren’t being sent correctly in production and we couldn’t test them in pre-production/QA for fear of accidentally sending real emails. We configure our QA environments to send emails, but only to a whitelist of specified addresses we use for testing, so no accidental emails go out. We can then be confident that changes to emails are tested in QA.
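A minimal sketch of that safeguard, assuming a simple Python SMTP wrapper; the whitelist addresses and mail host are placeholders:

```python
import smtplib
from email.message import EmailMessage

QA_WHITELIST = {"qa-inbox@example.com", "tester@example.com"}  # assumed test addresses
SMTP_HOST = "smtp.qa.internal"                                 # assumed QA mail relay


def send_email(to_address: str, subject: str, body: str) -> bool:
    """Send email in QA, but only to addresses on the whitelist."""
    if to_address not in QA_WHITELIST:
        # Drop (or log) anything addressed to a real customer.
        return False

    message = EmailMessage()
    message["To"] = to_address
    message["From"] = "noreply@example.com"
    message["Subject"] = subject
    message.set_content(body)

    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(message)
    return True
```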
It’s easy to fall into the trap of just testing things in production even though it’s far riskier: things often go wrong with real data, the consequences are more severe, and it’s harder to test comprehensively because you can’t change or fake things as easily.
Instead of just accepting “we’ll test it in production”, try instead to ask, “how can we test this much earlier whilst being confident our changes are reflective of actual behaviour?”
You’ll be much less stressed, your testing will be much more efficient and effective, and you’ll have a healthier testing process.