How are you handling the db in automation suites?
I’m running into issues where the test DB is, by necessity, a rather weighty 900 MB, so a simple drop and restore from a known backup is hugely time-consuming.
“If you automate a mess, you get an automated mess.” -Rod Michael
In my current role at Automattic I primarily work on end-to-end automated tests for WordPress.com. These tests run against live data (Production) no matter where our UI client (Calypso) is running (for example on localhost), so we just make sure our config points to the data that we need (test sites) and create other test data within the e2e scenarios.
In previous organisations I have used a scaled-down backup of production that had specific test data ‘seeded’ into it. Our DBAs had a bunch of scripts that would take a backup and cleanse/remove a whole heap of data (for example, archived products, orders), which resulted in a small, manageable backup that we could quickly restore into an environment. I found this to be a good approach as it gave us realistic data without a time-consuming restore whenever we needed one, e.g. before a CI test run.
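To make the cleanse-then-seed idea concrete, here is a minimal sketch of what such a script might look like. It is not what our DBAs actually ran: it uses SQLite from Python's standard library as a stand-in for a real database, and the `products` and `orders` tables, their columns, the rule for which orders to drop, and the function names are all assumptions made up for illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for a restored production backup (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL, archived INTEGER NOT NULL);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, product_id INTEGER NOT NULL);
        """
    )
    conn.executemany(
        "INSERT INTO products (id, name, archived) VALUES (?, ?, ?)",
        [(1, "current widget", 0), (2, "retired widget", 1)],
    )
    conn.executemany(
        "INSERT INTO orders (id, product_id) VALUES (?, ?)",
        [(1, 1), (2, 2), (3, 2)],
    )
    conn.commit()
    return conn


def cleanse(conn: sqlite3.Connection) -> None:
    """Remove bulky rows the tests never touch, then reclaim the space."""
    # Assumption: orders for archived products are safe to drop.
    # Delete them first so no order is left pointing at a removed product.
    conn.execute(
        "DELETE FROM orders WHERE product_id IN "
        "(SELECT id FROM products WHERE archived = 1)"
    )
    conn.execute("DELETE FROM products WHERE archived = 1")
    conn.commit()
    # VACUUM rebuilds the database so the freed pages are released.
    conn.execute("VACUUM")


def seed(conn: sqlite3.Connection) -> None:
    """Insert the known test data that the automated tests rely on."""
    conn.execute(
        "INSERT INTO products (id, name, archived) VALUES (?, ?, 0)",
        (9001, "e2e test product"),
    )
    conn.commit()
```

Calling `cleanse(conn)` followed by `seed(conn)` on a restored copy, and then backing up the result, would give the small, pre-seeded backup described above.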
I also shared some other data creation techniques in a previous answer.
Butch Mayhew asks…
What are the different ways (in order best to worst) to handle data creation for UI Automated checks?
The answer to this one very much depends on your environment and what tools/access you have to manage data, but here are some ways I have handled this on various projects:
Within the scenario itself: for full e2e tests, having no or limited dependencies on any data, for example by following a CRUD e2e flow during your test, is often the best approach as you don’t have to manage the data. The downside of these types of tests is that each step in your e2e flow is dependent on the preceding ones passing, and if you want a clean data store, you may need hooks to clean up in the case of test failure.
Using before/after hooks: for more detailed UI tests, you can use before and after hooks to create/delete test data via direct database calls, or by calling a service. I would avoid using the UI to do this where possible, since creating data through the UI is slower and less reliable than a direct call.
Seeding an Environment: I have worked on projects where we had a collection of SQL scripts that were used to ‘seed’ test data into an environment, which meant repeatable tests in CI. Our CI process was to restore the database completely, then seed the data needed for the tests. Depending on your database size this can slow down your CI process, but the benefit is that you have a consistent, untainted test environment for each test run.
Manually Created: this is the approach I recommend least, or at least I recommend you have as little data as possible that needs to be manually created in this way. At WordPress.com, our e2e tests do have a configuration item listing a couple of test sites to run against. Since setting these up is a once-off task and the sites can be shared, it would be overkill to either seed them all the time or script them into our test hooks; running against an existing site is the most efficient way for CI.
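As a rough illustration of the before/after hooks approach above, here is a minimal sketch using Python's built-in `unittest` and `sqlite3` modules. The in-memory database, the `products` table and the test itself are hypothetical stand-ins; a real suite would point at the test environment's database or a service, and the test body would drive the UI.

```python
import sqlite3
import unittest

# Stand-in for the test environment's database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


class ProductPageTest(unittest.TestCase):
    def setUp(self):
        # Before hook: create the data with a direct database call, not via the UI.
        cur = conn.execute("INSERT INTO products (name) VALUES (?)", ("e2e widget",))
        conn.commit()
        self.product_id = cur.lastrowid

    def tearDown(self):
        # After hook: runs whether the test passed or failed, so no data is left behind.
        conn.execute("DELETE FROM products WHERE id = ?", (self.product_id,))
        conn.commit()

    def test_product_is_available(self):
        # A real UI check would open the product page in a browser here;
        # this sketch only confirms the hook-created row exists.
        row = conn.execute(
            "SELECT name FROM products WHERE id = ?", (self.product_id,)
        ).fetchone()
        self.assertEqual(row[0], "e2e widget")
```

Saved as a module, this can be run with `python -m unittest`. Because the cleanup lives in `tearDown` rather than at the end of the test, a failing assertion still leaves the data store clean for the next test.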