How Canaries Help Us Merge Good Pull Requests

I recently published an article on the WordPress.com Developer’s Blog about how we run automated canary tests on pull requests to give us confidence to release frequent changes without breaking things. Feel free to check it out.

AMA: Difference between explicit and fluent wait

Anonymous asks…

What is the difference between Explicit wait and Fluent wait?

My response…

I hadn’t heard of fluent waiting before, only explicit and implicit waiting.

From my post about Waiting in C# WebDriver:

Implicit Waiting

Implicit, or implied waiting involves setting a configuration timeout on the driver object where it will automatically wait up to this amount of time before throwing a NoSuchElementException.

The benefit of implicit waiting is that you don’t need to write code in multiple places to wait as it does it automatically for you.

The downsides to implicit waiting include unnecessary waiting when doing negative existence assertions and having tests that are slow to fail when a true failure occurs (opposite of ‘fail fast’).

Explicit Waiting

Explicit waiting involves putting explicit waiting code in your tests in areas where you know that it will take some time for an element to appear/disappear or change.

The most basic form of explicit waiting is putting a sleep statement in your WebDriver code. This should be avoided at all costs as it will always sleep and easily blow out test execution times.

WebDriver provides a WebDriverWait class which allows you to wait for an element in your code.

As for fluent waits, according to this page it’s a type of explicit wait where you also configure how often the condition is polled and which exceptions are ignored while waiting. I don’t believe WebDriverJs supports fluent waits.
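To make the idea concrete, here’s a minimal sketch of the polling loop that sits behind any explicit wait. It isn’t WebDriver’s implementation: waitUntil, timeoutMs and pollMs are names I’ve made up for illustration. A configurable polling interval and ignoring errors while the condition isn’t yet met are the kind of knobs Java’s FluentWait exposes.

```javascript
// Minimal sketch of an explicit wait: poll a condition until it is truthy or time runs out.
// waitUntil and its option names are hypothetical, not part of WebDriver.
function waitUntil( condition, { timeoutMs = 5000, pollMs = 100 } = {} ) {
  const deadline = Date.now() + timeoutMs;
  return new Promise( function( resolve, reject ) {
    function poll() {
      Promise.resolve()
        .then( condition )
        .catch( function() {
          // Treat errors (e.g. element not found yet) as "not ready", like an ignored exception
          return false;
        } )
        .then( function( result ) {
          if ( result ) {
            return resolve( result );
          }
          if ( Date.now() >= deadline ) {
            return reject( new Error( `Condition not met within ${ timeoutMs }ms` ) );
          }
          setTimeout( poll, pollMs );
        } );
    }
    poll();
  } );
}

// Example: a condition that only becomes true on the third check
let checks = 0;
waitUntil( () => ++checks >= 3, { timeoutMs: 1000, pollMs: 10 } )
  .then( () => console.log( `ready after ${ checks } checks` ) );
```

Unlike a sleep statement, this returns as soon as the condition holds, and unlike an implicit wait it only applies where you ask for it.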

AMA: Moving automated tests from Java to JavaScript

Anonymous asks…

I am currently using a BDD framework with Cucumber, Selenium and Java for automating a web application. I used page factory to store the objects and using them in java methods I wanted to replace the java piece of code with javaScript like mocha or webdriverio. could you share your thoughts on this? can I still use page factory to maintain objects and use them in js files

My response…

What’s the reasoning for moving to JavaScript from Java? Despite having common names, there’s very little otherwise in common (Car is to Carpet as Java is to JavaScript.)

I wouldn’t move for moving’s sake, since I see no benefit in writing BDD style web tests in JavaScript. If anything, e2e automated tests are much harder to write in JavaScript/Node because everything is asynchronous, so you have to deal with promises etc., which is much harder than just using Java (or Ruby).

Aside: I still dream of writing e2e tests in Ruby: it’s just so pleasant. But our new user interface is written extensively in JavaScript (React) so it makes sense from a sustainability point of view to use JS over Ruby.

 

Why you should use CSS selectors for your WebDriver tests

I didn’t use to be a fan of CSS selectors for automated web tests, but I changed my mind.

The reason I didn’t use to be a fan of CSS selectors is that historically they weren’t really encouraged by Watir, since the Watir API was designed to find elements by type and attribute, so the Watir API would look something like:

browser.div(:class => 'highlighted')

where the same CSS selector would look like:

div.highlighted

Since WebDriver doesn’t use the same element type/attribute API and just uses findElement with a By selector, CSS selectors make the most sense since they’re powerful and self-contained.

The best thing about using CSS selectors, in my opinion, is that Chrome Dev Tools allows you to search the DOM using a CSS selector (and XPath selectors, but please don’t use XPath), using Command/Control & F:

Using CSS selectors to find elements in Chrome Dev Tools

So you can ‘test’ your CSS in a live browser window before deciding to use it in your WebDriver test.

The downside of using CSS selectors is that they’re a bit less self-explanatory than explicitly using by.className or by.id.

But CSS selectors are pretty powerful, especially pseudo selectors like nth-of-type, and I’ve found the only thing you can’t really do in CSS is select by text value. You probably shouldn’t be doing that anyway, as text values are more likely to change (since they’re copy that is often changed by your business) and can be localised, in which case your tests won’t run across different cultures.

The most powerful usage of CSS selectors is where you add your own data attributes to elements in your application and use these to select elements: straightforward, efficient and less brittle than other approaches. For example:

a[data-e2e-value="free"]
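If you use data attributes a lot, a tiny helper can keep the selectors consistent. This is only a sketch: dataSelector is a made-up name, not something WebDriver provides, and it assumes your attributes all share the data- prefix.

```javascript
// Hypothetical helper: builds a CSS attribute selector for a custom data attribute.
function dataSelector( name, value, tag = '' ) {
  // Escape backslashes and double quotes so the value stays inside the quoted attribute value
  const escaped = String( value ).replace( /(["\\])/g, '\\$1' );
  return `${ tag }[data-${ name }="${ escaped }"]`;
}

console.log( dataSelector( 'e2e-value', 'free', 'a' ) ); // a[data-e2e-value="free"]
```

The resulting string can be passed straight to By.css, or pasted into the Chrome Dev Tools search box to check it first.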

How do you identify elements in your WebDriver automated tests?

AMA: Trunk Guardian Service?

Sue asks…

I read a LinkedIn blog post from 2015 by Keqiu Hu from LinkedIn about flaky UI tests. He explains how they fixed their flaky UI tests for the LinkedIn app. Among other things they implemented what they called the “Trunk Guardian service” which runs automated UI tests on the last known good build twice and if the test passes on the first run but fails on the second it is marked as ‘flaky’ and disabled and the owner is notified to fix it or get rid of it. I wondered what your thoughts were on such a “Trunk Guardian service” – if the culture / process was in place to solve the other issues that create flaky tests, could such a thing be worth the effort to implement? Article: Test Stability – How We Make UI Tests Stable

My response…

Continue reading “AMA: Trunk Guardian Service?”

AMA: IE11 Button Clicking in Selenium

Anthony asks…

I have coded to click buttons on IE11/Win7 but the latest version of Selenium IE doesn’t click the buttons correctly most of times. Most of times, it clicks one button below. I thought it might be loading time so added some waiting but still click one button below or two buttons below sometimes. I googled this and found several posting saying Selenium IE doesn’t click buttons well. Now I have moved it to FF but I am still wondering why IE is not accurate. I know a lot of Selenium test developers in the field but they are having the same issue or they know a workaround. What do you think of this issue on IE11? Are you aware of this issue? FYI, the buttons are not regular HTML tag. The menu system with clickable tag is created by javascript. Thank you!

My Response…

We actually don’t run any tests in Internet Explorer any more since these weren’t finding any browser specific bugs (we do exploratory testing in Internet Explorer instead).

But I have heard of problems generally with the IEDriver tool. If you’re working on a JavaScript generated app, I think the best thing for you to do would be to execute a JavaScript click event instead of using a native click in Selenium. The exact syntax will depend on which language you’re using Selenium in, but it should look something like this:

this.driver.executeScript( 'return arguments[0].click();', webElement );

I hope this solution helps!

AMA: CodeceptJS support for Safari and IE?

Sahana Asks…

We area VOD startup and we have web app, mobile apps and TV apps. I am writing acceptance tests for web app now and chose codeceptjs framework since we have our website’s front end code in Javascript. We have dockerised the processes and docker images for codeceptjs webdriver IO is availble only for chrome and firefox browsers. How can I handle Safari , Internet Explorer browsers ? Looks like CodeceptJs does not support IE and Safari browsers. Do you have any suggestion?

My Response…

I’ve never personally found the return on investment of getting automated tests running across Internet Explorer and Safari to be worthwhile as in my experience this took more effort than the bugs it found. So I personally stick to running our full e2e test suite in our most used browser (Chrome) and supplementing this with exploratory testing on all other browsers.

Having said that, the reason you won’t be able to use Docker containers for these purposes is that they’re Linux based, while Internet Explorer requires Microsoft Windows and Safari requires Apple macOS to run. To be able to use these browsers for your existing automated tests you can sign up to an on-demand browser service like SauceLabs and use the remote WebDriver protocol to execute your tests.

AMA: Test Data Infrastructure

Anonymous asks…

Do you have set up (inexpensive) infrastructure to store data collected in your automated tests? We are currently using using selenium Java webdriver to automate our tests and IntelliJ as our IDE. We create data from scratch for each and every test case :(

My response…

I’m a little confused by the question and whether it’s about test data (data that is needed by the automated tests) or test results data (insights into the results of our automated tests). So I’ll answer both 😀

Infrastructure to manage test data

Our tests run on specific test accounts and sites on production databases. Since our tests are end-to-end in fashion, we try to make our tests have as few dependencies as possible on existing data. Often an end-to-end scenario will involve creating, viewing, editing and deleting something. If we don’t do all of this through our UI, we can use hooks that use either services or database jobs to clean up the data. I explained this in more detail previously.

Infrastructure to manage test results data

We use CircleCI for automated end-to-end tests. We have a number of projects that run different types of end-to-end tests from the same code repository for different purposes (canary tests, visual-diff tests, full regression tests for example).

We generate x-unit test results (from Mocha/Magellan) which CircleCI uses to provide insights into our test results.

You can also drill down into slowest tests and most failed tests etc.

Since all our tests are open source you can view these build insights yourself!

We’re pretty happy with the insights we get from CircleCI at the moment so we don’t currently see a need to develop anything ourselves.

WebDriverJs Select Lists in Chrome

Chromedriver/Chrome is pretty great at executing WebDriverJs scripts without taking away your focus (so you can execute them in the background whilst doing other things). The one exception I found was selecting items in a select list. I found it would do this:

Continue reading “WebDriverJs Select Lists in Chrome”

Feature Toggles for Automated e2e Tests

Feature toggles aren’t just for production code. Feature toggles are also a powerful technique to change the behaviour of your automated end-to-end tests without changing code.

Continue reading “Feature Toggles for Automated e2e Tests”

Prioritising Test Reliability over Perfection

If you saw my talk at GTAC last year, ‘your tests aren’t flaky’, then you’re probably aware of my view on flaky tests actually being indicative of broader application/systems problems that we should address over making our tests less flaky.

But what if you’re in a situation where you work with a system where you can’t feasibly improve the reliability? Say you’ve got a domains page that should show you a list of available domains but since it’s using an external third-party service it sometimes just shows nothing?

Continue reading “Prioritising Test Reliability over Perfection”

AMA: product APIs for test automation

Michael Karlovich asks…

What’s your design approach for incorporating internal product APIs into test automation? I don’t mean in order to explicitly test them, but more for leveraging them to stage data and set application states.

My response…

As explained previously, in my current role at Automattic I primarily work on end-to-end automated tests for WordPress.com. These tests run against live data (Production) no matter where our UI client (Calypso) is running (for example on localhost), so we don’t use APIs for staging data or setting application state.

In previous roles we utilised a REST API to create dynamic data for an internally used web application which we found useful/necessary for repeatable UI tests.

We also utilised test controllers to set web application state for a public website. These test controllers were very handy as they allowed you to visit something like http://myteststore.com/testsetup/checkout which would set up an order for you with products in your session, and instantly display the checkout page, which would typically take 8 steps from the start of the process to this page.

This saved us lots of time and made our specific tests more deterministic as we could avoid the 8 or so ‘setup’ steps and use a single URL to access our page.

This approach had a couple of downsides in that this couldn’t ever be deployed to production, and it didn’t test a realistic user flow which includes those ‘setup’ steps. There were two things we had to do to avoid the risk of using this approach: firstly, ensure through config that these test controllers were never deployed to production, and secondly, ensure we had some end-to-end coverage so we were at least testing some real user flows.

AMA: R.Y.O. Page Objects 2.0

Michael Karlovich asks…

Do you have any updated thoughts on rolling your own page objects with Watir? The original post is almost 4 years old but is still the basis (loosely) of every page object framework I’ve built since then.

My response…

Wow, I can’t believe that post is almost four years old. I have also used this for the basis of every page object framework I have built since then.

I recently had a look at our JavaScript (ES2015) code of page objects and despite ES2015 not having meta-programming support like Ruby, our classes are remarkably similar to what I was proposing ~4 years ago.

I believe this is because some patterns are classic and therefore almost timeless: they can be applied over and over again to different contexts. There’s a huge amount of negativity towards best practices of late, but I can seriously say that page objects are a best practice for test automation of UI systems. That isn’t saying they will be exactly the same in every context, but there’s a common best-practice pattern there which you most likely should be using.

Page objects, as a pattern, typically:

  • Inherit from a base page object/container which stores common actions like:
    • instantiating the object looking for a known element that defines that page’s existence
    • optionally allowing a ‘visit’ to the page during instantiation using some defined URL/path
    • providing actions and properties common to all pages, for example: waiting for the page, checking the page is displayed, getting the title/url of the page, and checking cookies and local storage items for that page;
  • Define actions as methods which are ways of interacting with that page (such as logging in);
  • Do not expose internals about the page outside the page – for example they typically don’t expose elements or element selectors which should only be used within actions/methods for that page which are exposed; and
  • Can also be modelled as components for user interfaces that are built using components to give greater reusability of the same components across different pages.
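The attributes above can be sketched in ES2015 roughly as follows. This is an illustration rather than our actual code: the class names, selectors and URL are invented, driver is assumed to be any WebDriverJs-style object whose methods return promises, and the { css: … } locator is WebDriverJs shorthand for By.css.

```javascript
// Sketch of a base page object. Class names, selectors and URLs are hypothetical.
class BasePage {
  constructor( driver, expectedSelector, url = null ) {
    this.driver = driver;
    this.expectedSelector = expectedSelector; // element that defines this page's existence
    this.url = url;
  }

  // Optionally visit the page, then fail early if its defining element is missing
  async init( { visit = false } = {} ) {
    if ( visit ) {
      await this.driver.get( this.url );
    }
    await this.driver.findElement( { css: this.expectedSelector } );
    return this;
  }

  // Example of a property common to all pages
  title() {
    return this.driver.getTitle();
  }
}

class LoginPage extends BasePage {
  constructor( driver ) {
    super( driver, '#login-form', 'https://example.com/login' );
  }

  // Action exposed to tests; the selectors stay private to the page object
  async login( username, password ) {
    await ( await this.driver.findElement( { css: '#username' } ) ).sendKeys( username );
    await ( await this.driver.findElement( { css: '#password' } ) ).sendKeys( password );
    await ( await this.driver.findElement( { css: 'button[type="submit"]' } ) ).click();
  }
}
```

A test would then read something like: const page = await new LoginPage( driver ).init( { visit: true } ); await page.login( 'user', 'secret' ); with no selectors leaking into the test itself.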

The biggest benefit I have found from using page objects as a pattern is having more deterministic end-to-end tests: once I have instantiated a page object I know I am on that page, so my tests will fail more reliably with a better understanding of what went wrong.

Are there any other pattern attributes you would consider vital for page objects?

AMA: test automation tooling for MS web stack?

Sean asks…

 Now that you’ve gained experience testing with JavaScript, do you have a preference for tooling? Would you lean more towards JS than Watin for a MS web stack?

BONUS: Any tips on testing modules that rely on dynamically created SQL? Common sense suggests testing to the nearest clearly defined “business value” and eventually separating concerns/refactoring. Any weakly held strong options?

My response…

I still think you should write your tests in the same language as your app, so for a MS web stack I would lean towards SpecFlow/WebDriver (see SpecDriver for an example). I am not sure whether Watin is actively maintained or whether it supports browsers other than IE, but I know the C# WebDriver bindings are increasingly solid.

Using Mocha in JavaScript for e2e tests continues to be painful. We’re patching lots of different aspects of it, which makes me think we would probably be better off using a different tool that does what we want.

Bonus answer: I think your idea makes sense as there are elements of context and unpredictability, so starting with one approach and letting it evolve over time through refactoring is often the best outcome.

AMA: bdd frameworks, api test tools & unit testing responsibility

Swapnil Waghmare asks…

Are BDD frameworks like Specflow, Cucumber better for E2E tests?

My response…

There are two ways I can interpret this question (my additions are in bold).

Firstly:

Are BDD frameworks that use ‘Given/When/Then’ feature files like Specflow, Cucumber better than frameworks that use ‘describe/it’ blocks like RSpec and Mocha for e2e tests?

As I previously explained, Automattic’s unit tests are written in Mocha, so that was a logical choice for writing e2e tests as there is a lot of familiarity of it within Automattic, which will hopefully mean more developers are interested in the e2e tests we are writing using Mocha/WebDriverJs.

There are some challenges with writing end-to-end tests with Mocha (mainly that Mocha tests are all independent so will continue to run if a previous step in the scenarios fails) so I haven’t completely ruled out investigating a move to Cucumber at some point for the e2e tests.

Secondly:

Are BDD frameworks like Specflow, Cucumber better for E2E tests than using them for automated integration, component or unit tests?

I think you could write integration tests or even unit tests in Given/When/Then format as most unit tests follow the same arrange/act/assert pattern anyhow which is exactly what Given/When/Then is.

Keep in mind there is overhead in maintaining all the step definitions and feature files for Cucumber/Specflow that give you non technical readability so if you don’t require that readability it is probably overkill. But a personal preference nonetheless.

Swapnil Waghmare also asks…

Which tools have you used for API testing, which ones would you recommend?

My response…

At the moment my main focus is on automated unit test and automated e2e test coverage in JavaScript.

I try to keep things simple, so in the past when I’ve written REST integration tests I’ve just called them using built-in libraries, and have used Postman quite a lot for manual testing and debugging.

Swapnil Waghmare finally asks…

What is the reason behind writing E2E tests using JavaScript? Isn’t any Object oriented language a better choice for writing E2E tests? Should QA’s write Unit tests?

My response…

 

Firstly, JavaScript is actually object oriented; it’s just prototype-based rather than class-based like Java, C# or Ruby. Newer versions of the language standard, ECMAScript (ES2015 onwards), add class syntax which makes it feel more class-based.
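As a small illustration of the point about object orientation, the sketch below defines the same made-up page type twice: once with an ES5 constructor function and its prototype, and once with ES2015 class syntax. The names are invented for the example.

```javascript
// ES5 style: a constructor function with methods on its prototype
function PageES5( title ) {
  this.title = title;
}
PageES5.prototype.describe = function() {
  return 'Page: ' + this.title;
};

// ES2015 style: class syntax over the same prototype mechanism
class PageES2015 {
  constructor( title ) {
    this.title = title;
  }
  describe() {
    return `Page: ${ this.title }`;
  }
}

console.log( new PageES5( 'Login' ).describe() );
console.log( new PageES2015( 'Login' ).describe() );
// The class method still lives on the prototype
console.log( typeof PageES2015.prototype.describe );
```

Both objects behave the same way; the class version is just a more familiar way of writing it if you come from Java, C# or Ruby.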

I’ve actually already answered why I chose JavaScript in a previous answer, so I’ll summarise that here and you can read the rest in that answer.

WordPress.com built an entirely new UI for managing sites using 100% JavaScript with React for the main UI components. I am responsible for e2e automated tests across this UI, and whilst I originally contemplated, and trialled even, using Ruby, this didn’t make long term sense for WordPress.com where the original WordPress developers are mostly PHP and the newer UI developers are all JavaScript.

As for whether QAs should write unit tests: no, I don’t think so. I believe unit tests should drive software development, and writing code by writing unit tests is much easier than having someone else add unit tests afterwards, since code that wasn’t written with testability in mind is unlikely to be very testable. One benefit of writing unit tests alongside the code is that the code will mostly be better, as the tests are consuming the API that you are developing.

AMA: JS vs Ruby

Butch Mayhew asks…

I have noticed you blogging more about JS frameworks. How do these compare to Watir/Ruby? Would you recommend one over the other?

My response…

I had a discussion recently with Chuck van der Linden about this same topic as he has a lot of experience with Watir and is now looking at JavaScript testing frameworks like I have done.

Some Background

WordPress.com built an entirely new UI for managing sites using 100% JavaScript with React for the main UI components. I am responsible for e2e automated tests across this UI, and whilst I originally contemplated, and trialled even, using Ruby, this didn’t make long term sense for WordPress.com where the original WordPress developers are mostly PHP and the newer UI developers are all JavaScript.

Whilst I see merit in both views, I still think having your automated acceptance tests in the same language as your application leads to better maintainability and adoptability.

I still think writing automated acceptance tests in Ruby is much cleaner and nicer than JavaScript Node tests, particularly as Ruby allows meta-programming which means page objects can be implemented really neatly.

The JavaScript/NodeJS landscape is still very immature where people are using various tools/frameworks/compilers and certain patterns or de facto standards haven’t really emerged yet. The whole ES6/ES2015/ES2016 thing is very confusing to newcomers like me, especially on NodeJS where some ES6+ features are supported, but others require something like Babel to compile your code.

But generally with the direction ES is going, writing page objects as classes is much nicer than using functions for everything as in ES5.

Whilst there’s nothing I have found that is better (or even as good) in JavaScript/Mocha/WebDriverJS than Ruby/RSpec/Watir-WebDriver, I still think it’s a better long term decision for WordPress.com to use the JavaScript NodeJS stack for our e2e tests.

AMA: intention vs implementation cucumber scenarios

Stan asks…

Internally in our company, we have various discussions about writing cucumber especially intentions vs implementations.

Surprisingly the business loves the idea of writing cucumber with implementations. For them this is much tighter control. How do you feel about this? Have you had similar experience?

My response…

I’ve never been successful in getting business people interested in writing/editing or even reading business test specifications. Do your business people like the idea of writing imperative cucumber scenarios, or actually writing imperative cucumber scenarios?

I personally prefer having discussions with the business about key user scenarios with examples, and then using the information I gather from these discussions to write scenarios that convey the intention of these scenarios and examples.

Tying cucumber scenarios directly to the implementation I have found leads to scenarios that are a lot more brittle to change, and I personally find using patterns like page objects are great for reducing any implementation maintainability overhead as fields/forms/elements implementation detail are stored in one place, whereas with imperative scenarios these are scattered across many feature files/scenarios. For example, I prefer

Given I am a customer from a small company
When I enter my personal and company details

over

When I enter "Sarah" into the "first name" field
And I enter "Jones" into the "last name" field
And I enter "ACME Corp" into the "company" field
And I click the "order" button

The other benefit of declarative or intention-based scenarios is these can be written before an implementation even exists, so it’s more likely these will emerge before/during software development, instead of being written as a “regression testing catch up” activity after the software already exists.

I’m interested in whether you’ve had success with this, or whether it’s early stages and you’re trying to see if what the business desires will actually work?

 

Checking an image is actually visible in WebDriverJs

I recently discovered a gap in one of my e2e automated tests where I was checking the existence of an uploaded image in the DOM, but not that the image was actually displayed.

driver.isElementPresent( By.css( `img[alt='upload.jpg']` ) ).then( function( present ) {
  assert.equal( present, true, 'Image not displayed' );
} );

If the DOM has a reference to the image but it isn’t actually rendered, this test will still pass. This isn’t ideal.

I remembered my post about how to check that an image is actually rendered using WebDriver in C# and so I used the same JavaScript script which WebDriverJs sends to the driver:

driver.findElement( By.css( `img[alt='upload.jpg']` ) ).then( function( element ) {
  // naturalWidth is only greater than 0 once the browser has actually loaded the image
  return driver.executeScript( 'return (typeof arguments[0].naturalWidth!="undefined" && arguments[0].naturalWidth>0)', element ).then( function( displayed ) {
    assert.equal( displayed, true, 'Image not displayed' );
  } );
} );

This works a treat. I’ve also moved it into a helper function so I can reuse it anywhere without repeating it.
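A helper along those lines might look like the sketch below. The name imageIsRendered is made up for illustration, and driver is assumed to be a WebDriverJs-style driver whose executeScript returns a promise.

```javascript
// Sketch of a reusable helper; imageIsRendered is a hypothetical name, not part of WebDriverJs.
// Resolves to true only if the browser has actually loaded the image element.
function imageIsRendered( driver, element ) {
  const script = 'return (typeof arguments[0].naturalWidth!="undefined" && arguments[0].naturalWidth>0)';
  return driver.executeScript( script, element );
}
```

In a test this would be used as imageIsRendered( driver, element ).then( ( displayed ) => assert.equal( displayed, true, 'Image not displayed' ) ); after finding the element as before.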

Testing email in e2e tests with Mailosaur

In writing automated end-to-end tests for WordPress.com I have encountered some flows where I need to test that an email is received and act on something in that email (an invitation link for example).

My first attempt at this was to set up a Gmail account for testing purposes and use plus addressing to have an unlimited number of unique email addresses. This works well for manual testing, but having an automated script retrieve the Gmail email is a little tricky, as it means either automating their UI (very brittle and slow and possibly even against their TOS) or creating an ad-hoc IMAP connection, which Gmail discourages by requiring you to lower the security of your account to allow this.

I had a look at a couple of email testing services, namely Mailinator and Mailosaur. They both offer an API you can call to retrieve emails sent to special inboxes you set up for testing purposes.

We ended up choosing Mailosaur as it automatically breaks every email down into a nicely formatted JSON object ready for us to easily inspect and extract content (and it was cheaper – win).

Mailosaur’s JSON view (some data obfuscated)

This means we can use the Mailosaur NPM package to easily retrieve, verify and visit links in emails using its API.
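Once an email has been retrieved as JSON, pulling out the link to visit is plain object handling. The sketch below doesn’t call the Mailosaur package at all; it works on a sample object, and both the helper name findLinkInEmail and the email shape (an html.links array with href and text) are assumptions for illustration, so check them against the JSON you actually receive.

```javascript
// Hypothetical helper: find the first link in an email whose href contains a given fragment.
// The email shape ( html.links[] with href/text ) is an assumption for this sketch.
function findLinkInEmail( email, hrefFragment ) {
  const links = ( email.html && email.html.links ) || [];
  const match = links.find( ( link ) => link.href.includes( hrefFragment ) );
  if ( ! match ) {
    throw new Error( `No link containing "${ hrefFragment }" found in email "${ email.subject }"` );
  }
  return match.href;
}

// Made-up sample data standing in for a retrieved email
const sampleEmail = {
  subject: 'You have been invited',
  html: {
    links: [
      { href: 'https://example.com/help', text: 'Help' },
      { href: 'https://example.com/accept-invite/abc123', text: 'Accept invitation' },
    ],
  },
};

console.log( findLinkInEmail( sampleEmail, '/accept-invite/' ) );
```

The returned URL can then be handed to driver.get to continue the flow in the browser.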

I love the simplicity of this service as it makes the email part of our automated end-to-end tests very easy and reliable.

Running Automated Tests with A/B Testing

Like a lot of modern, data driven sites, WordPress.com uses A/B testing extensively to introduce new features. These tests may be as simple as a label change or as complex as changing the entire sign up flow, for example by offering a free trial.

Since I have been working on a set of automated end-to-end tests for WordPress.com, I have found A/B testing to be problematic for automated testing on this very fast moving codebase, namely:

  1. Automated tests need to be deterministic: having a randomised experiment as an A/B test means the first test run may get an entirely different sign up flow than a second test run which is very hard to automate; and
  2. Automated tests need to know which experiments are running otherwise they may encounter unexpected behaviour randomly.

What we need is two methods to deal with A/B tests when running automated tests:

  1. We need to be able to see which A/B tests are active and compare this to a known list of expected A/B tests – so that we don’t suddenly encounter some unexpected/random behaviour for some of our test runs
  2. We need to be able to set the desired behaviour to the control group so that our tests are deterministic.

Different sites conduct A/B testing using different tools and approaches. WordPress.com uses HTML5 local storage to set which A/B tests are active and which group the user belongs to.

Luckily it’s easy to read and update local storage using WebDriver and JavaScript. This means our approach is as follows:

  1. Each time a page object is initialised, there is a call on the base page model that checks the A/B tests that are active using something like return window.localStorage.ABTests; and then compares this to the known list of A/B tests which are checked in as a config item. This fails the test if there’s a new A/B test introduced that isn’t in the list of known tests. This is better than not knowing about the A/B test and failing based upon some non-deterministic behaviour.
  2. When a new A/B test is introduced and we wish to ensure our automated tests always use the control group, we can set this using a similar method window.localStorage.setItem('ABTests','{"flow":"default"}'); and refresh the page.
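The two steps above can be sketched as a pair of helpers. This is an illustration rather than our actual code: the function names and the known list are invented, driver is assumed to be a WebDriverJs-style driver, and the ABTests item is assumed to hold a JSON object mapping each test name to its group, as in the setItem example above.

```javascript
// knownABTests would normally come from checked-in config; the names here are hypothetical.
const knownABTests = [ 'flow' ];

// Reads the active A/B tests from local storage and fails on any we don't know about
async function checkForUnknownABTests( driver, known ) {
  const raw = await driver.executeScript( 'return window.localStorage.ABTests;' );
  const active = Object.keys( JSON.parse( raw || '{}' ) );
  const unknown = active.filter( ( name ) => ! known.includes( name ) );
  if ( unknown.length > 0 ) {
    throw new Error( `Unknown A/B tests active: ${ unknown.join( ', ' ) }` );
  }
  return active;
}

// Forces one A/B test into a chosen group, then reloads so the page picks it up
async function setABTestGroup( driver, name, group ) {
  const raw = await driver.executeScript( 'return window.localStorage.ABTests;' );
  const tests = JSON.parse( raw || '{}' );
  tests[ name ] = group;
  await driver.executeScript( 'window.localStorage.setItem( "ABTests", arguments[0] );', JSON.stringify( tests ) );
  await driver.navigate().refresh();
}
```

The base page object would call checkForUnknownABTests( driver, knownABTests ) on initialisation, and a test that needs the control group would call something like setABTestGroup( driver, 'flow', 'default' ) first.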

Ideally it would be good to know and plan every A/B test for our automated e2e tests, but since this isn’t possible, checking against known A/B tests and ensuring control groups are set means our automated tests are at least more consistent and deterministic, and fail a lot faster and more consistently when a new A/B test has been introduced.

How do you deal with non-determinism with A/B tests?