Using async/await with WebDriverJs

We’ve been using WebDriverJs for a number of years, along with the control flow promise manager it offers, which makes writing WebDriverJs commands in a synchronous, blocking style a bit easier, particularly when using promises.

The problem with the promise manager is that its magic is hard to understand: sometimes it just works, and other times it is confusing and unpredictable. It is also hard for the Selenium project to develop and support, so it’s being deprecated later this year.

Fortunately, recent versions of Node.js support asynchronous functions and the await keyword, which make writing WebDriverJs tests much easier and more understandable.

I’ve recently updated my WebDriverJs demo project to use async/await, so I’ll use that project as an example to explain what is involved.

WebDriverJs would allow you to write consecutive statements like this without worrying about waiting for each statement to finish – note the use of test.it instead of the usual mocha it function:

test.it( 'can wait for an element to appear', function() {
	const page = new WebDriverJsDemoPage( driver, true );
	page.waitForChildElementToAppear();
	page.childElementPresent().then( ( present ) => {
		assert( present, 'The child element is not present' );
	} );
} );

When you were waiting on the return value from a promise you could use a .then function to wait for the value as shown above.

This is quite a simple example and this could get complicated pretty quickly.

Since the promise manager is being removed, we need to update our tests so they continue to execute in the correct order. We can make the test function asynchronous by adding the async keyword, remove the test. prefix on the it block, and add await every time we need a statement to finish before continuing:

it( 'can wait for an element to appear', async function() {
	const page = new WebDriverJsDemoPage( driver, true );
	await page.waitForChildElementToAppear();
	assert( await page.childElementPresent(), 'The child element is not present' );
} );

I personally find this much easier to read and understand, less ‘magic’, but the one bit that stands out is visiting the page and creating the new page object. The code in the constructor for this page, and other pages, is asynchronous as well, however we can’t have an async constructor!

export default class BasePage {
	constructor( driver, expectedElementSelector, visit = false, url = null ) {
		this.explicitWaitMS = config.get( 'explicitWaitMS' );
		this.driver = driver;
		this.expectedElementSelector = expectedElementSelector;
		this.url = url;

		if ( visit ) this.driver.get( this.url );

		this.driver.wait( until.elementLocated( this.expectedElementSelector ), this.explicitWaitMS );
	}
}

How we can get around this is to define a static async function that acts as a constructor and returns our new page object for us.

So, our BasePage now looks like:

export default class BasePage {
	constructor( driver, expectedElementSelector, url = null ) {
		this.explicitWaitMS = config.get( 'explicitWaitMS' );
		this.driver = driver;
		this.expectedElementSelector = expectedElementSelector;
		this.url = url;
	}

	static async Expect( driver ) {
		const page = new this( driver );
		await page.driver.wait( until.elementLocated( page.expectedElementSelector ), page.explicitWaitMS );
		return page;
	}

	static async Visit( driver, url ) {
		const page = new this( driver, url );
		if ( ! page.url ) {
			throw new Error( `URL is required to visit the ${ page.constructor.name }` );
		}
		await page.driver.get( page.url );
		await page.driver.wait( until.elementLocated( page.expectedElementSelector ), page.explicitWaitMS );
		return page;
	}
}

In our Expect and Visit functions we call new this( driver ) which creates an instance of the child class which suits our purposes. So, this means our spec now looks like:

it( 'can wait for an element to appear', async function() {
	const page = await WebDriverJsDemoPage.Visit( driver );
	await page.waitForChildElementToAppear();
	assert( await page.childElementPresent(), 'The child element is not present' );
} );

which means we can await visiting and creating our page objects and we don’t have any asynchronous code in our constructors for our pages. Nice.
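To see the static async factory pattern in isolation, here is a minimal sketch that runs without a browser. FakeDriver, DemoPage, the '#elementappearsparent' selector and the localhost URL are hypothetical stand-ins I've made up for illustration; only the shape of Visit mirrors the BasePage code above.

```javascript
// Hypothetical stand-in for a WebDriverJs driver: records calls and resolves asynchronously.
class FakeDriver {
  constructor() {
    this.calls = [];
  }
  async get( url ) {
    this.calls.push( `get ${ url }` );
  }
  async wait( condition ) {
    this.calls.push( `wait ${ condition }` );
  }
}

class BasePage {
  // The constructor only assigns fields: no asynchronous work happens here.
  constructor( driver, expectedElementSelector, url = null ) {
    this.driver = driver;
    this.expectedElementSelector = expectedElementSelector;
    this.url = url;
  }

  // Inside a static method 'this' is the class the method was called on,
  // so 'new this( ... )' creates an instance of the child class.
  static async Visit( driver, url ) {
    const page = new this( driver, url );
    if ( ! page.url ) {
      throw new Error( `URL is required to visit the ${ page.constructor.name }` );
    }
    await page.driver.get( page.url );
    await page.driver.wait( page.expectedElementSelector );
    return page;
  }
}

// Hypothetical child page that supplies its own selector and a default URL.
class DemoPage extends BasePage {
  constructor( driver, url = 'http://localhost/demo' ) {
    super( driver, '#elementappearsparent', url );
  }
}

// Usage: the spec awaits the factory instead of calling 'new' directly.
async function example() {
  const driver = new FakeDriver();
  const page = await DemoPage.Visit( driver );
  return { page, calls: driver.calls };
}
```

Because Visit is called on DemoPage, the page that comes back is a DemoPage, and the constructor itself stays free of asynchronous code.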

Once we’re ready to stop using the promise manager, we can set the SELENIUM_PROMISE_MANAGER environment variable to 0 and WebDriverJs won’t use it any more.

Summary

The promise manager is being removed from WebDriverJs, but using await in async functions is a much nicer solution anyway, so now is the time to make the move. What are you awaiting for? 😊

Check out the full demo code at https://github.com/alisterscott/webdriver-js-demo

AMA: Difference between explicit and fluent wait

Anonymous asks…

What is the difference between Explicit wait and Fluent wait?

My response…

I hadn’t heard of fluent waiting before, only explicit and implicit waiting.

From my post about Waiting in C# WebDriver:

Implicit Waiting

Implicit, or implied waiting involves setting a configuration timeout on the driver object where it will automatically wait up to this amount of time before throwing a NoSuchElementException.

The benefit of implicit waiting is that you don’t need to write code in multiple places to wait as it does it automatically for you.

The downsides to implicit waiting include unnecessary waiting when doing negative existence assertions and having tests that are slow to fail when a true failure occurs (opposite of ‘fail fast’).

Explicit Waiting

Explicit waiting involves putting explicit waiting code in your tests in areas where you know that it will take some time for an element to appear/disappear or change.

The most basic form of explicit waiting is putting a sleep statement in your WebDriver code. This should be avoided at all costs as it will always sleep and easily blow out test execution times.

WebDriver provides a WebDriverWait class which allows you to wait for an element in your code.

As for fluent waits, according to this page it’s a type of explicit wait with more limited conditions on it. I don’t believe WebDriverJs supports fluent waits.
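Since I don’t believe WebDriverJs has a fluent wait built in, here is a rough sketch of a hand-rolled polling wait in plain JavaScript with a configurable timeout and polling interval. The waitUntil name and its options are my own invention for illustration; it is not part of WebDriverJs.

```javascript
// Poll 'condition' until it returns a truthy value or 'timeoutMs' elapses.
// Errors thrown by the condition are treated as "not ready yet" and retried.
async function waitUntil( condition, { timeoutMs = 5000, intervalMs = 100 } = {} ) {
  const deadline = Date.now() + timeoutMs;
  while ( true ) {
    try {
      const result = await condition();
      if ( result ) {
        return result;
      }
    } catch ( error ) {
      // Ignore the error and retry until the deadline passes.
    }
    if ( Date.now() >= deadline ) {
      throw new Error( `Condition not met within ${ timeoutMs }ms` );
    }
    await new Promise( ( resolve ) => setTimeout( resolve, intervalMs ) );
  }
}
```

In a test you would pass a function that checks the page, for example one that resolves to true once an element is present, rather than sleeping for a fixed time.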

Why you should use CSS selectors for your WebDriver tests

I didn’t use to be a fan of CSS selectors for automated web tests, but I changed my mind.

The reason I didn’t use to be a fan of CSS selectors is that historically they weren’t really encouraged by Watir, since the Watir API was designed to find elements by type and attribute, so the Watir API would look something like:

browser.div(:class => 'highlighted')

where the same CSS selector would look like:

div.highlighted

Since WebDriver doesn’t use the same element type/attribute API and just uses findElement with a By selector, CSS selectors make the most sense since they’re powerful and self-contained.

The best thing about using CSS selectors, in my opinion, is that Chrome Dev Tools allows you to search the DOM using a CSS selector (and XPath selectors, but please don’t use XPath), using Command/Control + F:

[Image: Using CSS selectors to find elements in Chrome Dev Tools]

So you can ‘test’ your CSS in a live browser window before deciding to use it in your WebDriver test.

The downside of using CSS selectors is that they’re a bit less self-explanatory than explicitly using by.className or by.id.

But CSS selectors are pretty powerful, especially pseudo selectors like nth-of-type. I’ve found the only thing you can’t really do in CSS is select by text value, which you probably shouldn’t be doing anyway: text values are more likely to change (since they’re copy that is often changed by your business) and can be localised, in which case your tests won’t run across different cultures.

The most powerful usage of CSS selectors is where you add your own data attributes to elements in your application and use these to select elements: straightforward, efficient and less brittle than other approaches. For example:

a[data-e2e-value="free"]
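If you use data attributes a lot, a small helper can keep the selectors consistent. This is only a sketch: dataSelector is a hypothetical name, and the attribute naming simply follows the data-e2e-value example above.

```javascript
// Build a CSS attribute selector such as a[data-e2e-value="free"].
// Backslashes and double quotes in the value are escaped so the selector stays valid.
function dataSelector( name, value, tag = '' ) {
  const escaped = String( value ).replace( /\\/g, '\\\\' ).replace( /"/g, '\\"' );
  return `${ tag }[data-${ name }="${ escaped }"]`;
}

// Produces the selector used in the example above.
const freePlanLink = dataSelector( 'e2e-value', 'free', 'a' );
```

The resulting string can be pasted into the Dev Tools search box to check it, or passed to a CSS locator in your WebDriver test.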

How do you identify elements in your WebDriver automated tests?

AMA: IE11 Button Clicking in Selenium

Anthony asks…

I have coded to click buttons on IE11/Win7 but the latest version of Selenium IE doesn’t click the buttons correctly most of times. Most of times, it clicks one button below. I thought it might be loading time so added some waiting but still click one button below or two buttons below sometimes. I googled this and found several posting saying Selenium IE doesn’t click buttons well. Now I have moved it to FF but I am still wondering why IE is not accurate. I know a lot of Selenium test developers in the field but they are having the same issue or they know a workaround. What do you think of this issue on IE11? Are you aware of this issue? FYI, the buttons are not regular HTML tag. The menu system with clickable tag is created by javascript. Thank you!

My Response…

We actually don’t run any tests in Internet Explorer any more since these weren’t finding any browser specific bugs (we do exploratory testing in Internet Explorer instead).

But I have heard of problems generally with the IEDriver tool. If you’re working on a JavaScript generated app, I think the best thing to do, instead of using a native click in Selenium, is to execute a JavaScript click event. The exact syntax will depend on which language you’re using Selenium in, but it should look something like this:

this.driver.executeScript( 'return arguments[0].click();', webElement );

I hope this solution helps!

Save password prompts in Chrome 57 with WebDriver

When running Selenium WebDriver scripts against the latest version of Chrome (57) it shows a save password prompt that hasn’t appeared previously whilst using Chromedriver, as far as I know.

[Image: Chrome 57 save password prompt]

Upgrading WebdriverJs to Selenium 3

Yes, I know that Selenium 3 has been out for a while, but I’ve finally got around to looking at updating our end-to-end tests to use it. Newer versions of Firefox require Geckodriver, which requires Selenium 3.3+, so it’s a forced upgrade of sorts.


Checking web element styles using WebDriverJs

I try to avoid incorporating any layout or style based checks or locators into my automated end-to-end tests, since these typically change more often, leading to a higher test maintenance burden.

But I did have a circumstance recently where I wanted to check that a change I dynamically made to a page was reflected in the resultant web element’s style.


Checking an image is actually visible in WebDriverJs

I recently discovered a gap in one of my e2e automated tests where I was checking the existence of an uploaded image in the DOM, but not that the image was actually displayed.

driver.isElementPresent( By.css( `img[alt='upload.jpg']` ) ).then( function( present ) {
  assert.equal( present, true, 'Image not displayed' );
} );

If the DOM has a reference to the image, but it isn’t actually rendered this test will pass. This isn’t ideal.

I remembered my post about how to check that an image is actually rendered using WebDriver in C# and so I used the same JavaScript script which WebDriverJs sends to the driver:

driver.findElement( By.css( `img[alt='upload.jpg']` ) ).then( function( element ) {
  driver.executeScript( 'return (typeof arguments[0].naturalWidth!=\"undefined\" && arguments[0].naturalWidth>0)', element ).then( function( present ) {
    assert.equal( present, true, 'Image not displayed' );
  } );
} );

This works a treat. I’ve also moved it into a helper function so I can use it anywhere without repeating it.
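A helper along these lines might look like the sketch below. The imageVisible name and the async/await style are my assumptions for illustration, not the exact helper from my test suite; the script is the same one sent to the driver above.

```javascript
// Same check as above: true only when the browser has actually loaded the image.
const IMAGE_LOADED_SCRIPT =
  'return (typeof arguments[0].naturalWidth!="undefined" && arguments[0].naturalWidth>0)';

// Hypothetical helper: resolves to true when the given image element is rendered.
// 'driver' is anything with a WebDriverJs-style executeScript( script, ...args ) method.
async function imageVisible( driver, webElement ) {
  return await driver.executeScript( IMAGE_LOADED_SCRIPT, webElement );
}
```

A test would then find the image element once and assert that imageVisible( driver, element ) resolves to true.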

Testing end-to-end with Mocha

As part of my excellent Excellence Wrangler role at Automattic, one of my key tasks has been establishing some end-to-end tests for WordPress.com using Mocha with WebDriverJs. Our testing pyramid doesn’t look much like a pyramid:

[Image: WordPress.com test pyramid]

We’ve got lots of React unit tests at the bottom: these are to speed development.

We’re intentionally missing a middle: the REST API we consume has its own unit tests, so we don’t need integration tests for it. We don’t have detailed full stack acceptance tests of our UI: these are too slow and brittle.

We have a handful of e2e flow tests at the top. These protect the user experience, and we run them on every deployment and frequently in production. They can be brittle on such a fast moving code base, but we limit their number (depth) so they still give us good confidence that everything is working well together while limiting our overhead.

So what do these end-to-end tests look like?

I hadn’t used Mocha before and I was used to writing end-to-end tests in Gherkin format in tools like Cucumber and Specflow so I initially began writing end-to-end tests that looked like this:

test.describe('WordPress.com Sign Up', function() {
  test.beforeEach(function() {
    driver.manage().deleteAllCookies();
  });

  test.it('Can Create A Free Blog', function() {
    var signupFlow = new SignUpFlow( driver, 'desktop' );
    signupFlow.createFreeBlog( 'en' );
  });

  test.it('Can Create A New Site With a Paid Domain Upgrade', function() {
    var signupFlow = new SignUpFlow( driver, 'desktop' );
    signupFlow.CreateSiteWithDomainPaidByCreditCard( 'en' );
  });
});

I was pushing the code down into flow classes which I have used before, but the issue with this was the output I was getting from Mocha wasn’t very rich:

WordPress.com Sign Up
      ✓ Can Create A Free Blog
      ✓ Can Create A New Site With a Paid Domain Upgrade

I then realized by looking at an end-to-end test written by another developer that you can nest describe and it statements to give you much more expressive end-to-end tests.

test.describe( `Sign Up (${screenSize})`, function() {

  test.describe( 'Free Site:', function() {
    test.before( 'Delete Cookies and Local Storage', function() {
      driverManager.clearCookiesAndDeleteLocalStorage( driver );
    } );

    test.describe( 'Sign up for a free site', function() {

      test.describe( 'Step One: Themes', function() {
        test.before( 'Can see the choose a theme page as the starting page', function() {
          this.chooseAThemePage = new ChooseAThemePage( driver, { visit: true } );
          return this.chooseAThemePage.displayed().then( ( displayed ) => {
            assert.equal( displayed, true, 'The choose a theme start page is not displayed' );
          } );
        } );

        test.it( 'Can select the first theme', function() {
          return this.chooseAThemePage.selectFirstTheme();
        } );
      } );

      test.describe( 'Step Two: Domains', function() {
        test.before( 'Can then see the domains page ', function() {
          this.findADomainComponent = new FindADomainComponent( driver );
          return this.findADomainComponent.displayed().then( ( displayed ) => {
            assert.equal( displayed, true, 'The choose a domain page is not displayed' );
          } );
        } );

        test.it( 'Can search for a blog name', function() {
          return this.findADomainComponent.searchForBlogNameAndWaitForResults( blogName );
        } );

        test.it( 'Can see a free WordPress.com blog address in results ', function() {
          return this.findADomainComponent.freeBlogAddress().then( ( actualAddress ) => {
            assert.equal( actualAddress, expectedBlogAddress, 'The expected free address is not shown' )
          } );
        } );

        test.it( 'Can select the free address', function() {
          return this.findADomainComponent.selectFreeAddress();
        } );
      } );

This gives us rich feedback:

Sign Up (desktop)
    Free Site:
      Sign up for a free site
        Step One: Themes
          ✓ Can see the choose a theme page as the starting page
          ✓ Can select the first theme
        Step Two: Domains
          ✓ Can then see the domains page
          ✓ Can search for a blog name
          ✓ Can see a free WordPress.com blog address in results
          ✓ Can select the free address

The mistake I had made, without realizing it, was not creating enough nesting: instead of having Step One, Step Two etc. next to one another, they should be nested within each other. This is because if they’re next to one another, Mocha will run Step Two, Step Three etc. even if Step One has failed, which is not what we want in an end-to-end scenario where each step is dependent on the previous one.

So, it now looks something like this:

test.describe( 'Free Site:', function() {
    test.before( 'Delete Cookies and Local Storage', function() {
      driverManager.clearCookiesAndDeleteLocalStorage( driver );
    } );

    test.describe( 'Sign up for a free site', function() {
      test.describe( 'Step One: Themes', function() {
        test.before( 'Can see the choose a theme page as the starting page', function() {
          this.chooseAThemePage = new ChooseAThemePage( driver, { visit: true } );
          return this.chooseAThemePage.displayed().then( ( displayed ) => {
            assert.equal( displayed, true, 'The choose a theme start page is not displayed' );
          } );
        } );

        test.it( 'Can select the first theme', function() {
          return this.chooseAThemePage.selectFirstTheme();
        } );

        test.describe( 'Step Two: Domains', function() {
          test.before( 'Can then see the domains page ', function() {
            this.findADomainComponent = new FindADomainComponent( driver );
            return this.findADomainComponent.displayed().then( ( displayed ) => {
              assert.equal( displayed, true, 'The choose a domain page is not displayed' );
            } );
          } );

          test.it( 'Can search for a blog name', function() {
            return this.findADomainComponent.searchForBlogNameAndWaitForResults( blogName );
          } );

          test.it( 'Can see a free WordPress.com blog address in results ', function() {
            return this.findADomainComponent.freeBlogAddress().then( ( actualAddress ) => {
              assert.equal( actualAddress, expectedBlogAddress, 'The expected free address is not shown' )
            } );
          } );

          test.it( 'Can select the free address', function() {
            return this.findADomainComponent.selectFreeAddress();
          } );

          test.describe( 'Step Three: Plans', function() {

which means the output is slightly different but still very useful:

Sign Up (mobile)
  Free Site:
    Sign up for a free site
      Step One: Themes
        ✓ Can select the first theme
        Step Two: Domains
          ✓ Can search for a blog name
          ✓ Can see a free WordPress.com blog address in results
          ✓ Can select the free address
          Step Three: Plans
            ✓ Can select the free plan

These tests are much better written this way. The only issue I am left facing with Mocha is that when a before hook fails (such as logging in), the generic afterEach hook we have to take screenshots is not triggered (this is only triggered when an it block is run).

Waiting for AJAX calls in WebDriver C#

I was trying to work out how to wait for AJAX calls to complete in C# WebDriver before continuing a test.

Whilst I believe that your UI should visually indicate that AJAX activity is occurring (such as a spinner), in which case you should be able to wait until that indicator changes, sometimes you don’t have a visual indicator. If you use jQuery for your AJAX calls, you can use a JavaScript call to jQuery.active to determine if there are any active AJAX requests, and wait until this value is zero.

I wrapped this into a WebDriver extension method on Driver, so you can call it like this:

Driver.FindElement(By.Id("name")).Set("Alister");
Driver.WaitForAjax();
Driver.FindElement(By.Id("next")).Click();

The actual extension method looks like this:

public static void WaitForAjax(this IWebDriver driver, int timeoutSecs = 10, bool throwException=false)
{
  for (var i = 0; i < timeoutSecs; i++)
  {
    var ajaxIsComplete = (bool)(driver as IJavaScriptExecutor).ExecuteScript("return jQuery.active == 0");
    if (ajaxIsComplete) return;
    Thread.Sleep(1000);
  }
  if (throwException)
  {
    throw new Exception("WebDriver timed out waiting for AJAX call to complete");
  }
}

I hope you find this helpful if you’re ever in the same situation.
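For anyone using WebDriverJs rather than C#, a rough JavaScript port of the same idea could look like this. It is a sketch under assumptions: waitForAjax and its options are names I've made up, the pollMs option is an addition so the polling interval can be shortened, and unlike the C# version it returns true or false rather than nothing.

```javascript
// Rough JavaScript port of the C# extension method above.
// Checks jQuery.active up to 'timeoutSecs' times, pausing 'pollMs' between checks
// (one second by default), and stops as soon as no AJAX requests are in flight.
async function waitForAjax( driver, { timeoutSecs = 10, throwException = false, pollMs = 1000 } = {} ) {
  for ( let i = 0; i < timeoutSecs; i++ ) {
    const ajaxIsComplete = await driver.executeScript( 'return jQuery.active == 0' );
    if ( ajaxIsComplete ) {
      return true;
    }
    await new Promise( ( resolve ) => setTimeout( resolve, pollMs ) );
  }
  if ( throwException ) {
    throw new Error( 'WebDriver timed out waiting for AJAX call to complete' );
  }
  return false;
}
```

As with the C# version, this only helps when the page under test actually uses jQuery for its AJAX calls.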

Five automated acceptance test anti-patterns

Whilst being involved with lots of people writing automated acceptance tests using tools like SpecFlow and WebDriver I’ve seen some ‘anti-patterns’ emerge that can make these tests non-deterministic (flaky), very fragile to change and less efficient to run.

Here are five ‘anti-patterns’ I’ve seen and what you can do instead.

Anti-pattern One: Not using page-objects

Page objects are just a design pattern to ensure automated UI tests use reusable, modular code. Not using them, eg, writing WebDriver code directly in step definitions, means any changes to your UI will require updates in lots of different places instead of the one ‘page’ class.

Bad

[When(@"I buy some '(.*)' tea")]
public void WhenIBuySomeTea(string typeOfTea)
{
     Driver.FindElement(By.Id("tea-"+typeOfTea)).Click();
     Driver.FindElement(By.Id("buy")).Click();
}

Better

[When(@"I buy some '(.*)' tea")]
public void WhenIBuySomeTea(string typeOfTea)
{
     MenuPage.BuyTea(typeOfTea);
}
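The page class itself isn’t shown here, so this is a rough sketch of what one could look like, written in JavaScript in the WebDriverJs style rather than C#. The MenuPage shape and the buyTea method name are assumptions for illustration; the element IDs are the ones from the ‘Bad’ example.

```javascript
// Hypothetical page object: the element IDs for the menu page live in one place.
class MenuPage {
  constructor( driver ) {
    this.driver = driver;
  }

  // Select a tea and buy it; callers never see the locators.
  async buyTea( typeOfTea ) {
    const tea = await this.driver.findElement( { id: `tea-${ typeOfTea }` } );
    await tea.click();
    const buy = await this.driver.findElement( { id: 'buy' } );
    await buy.click();
  }
}
```

If the ID of the buy button changes, only this class needs updating, not every step definition that buys tea.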

Anti-pattern Two: Complicated set up scenarios within the tests themselves

Whilst there’s a place for automated end-to-end scenarios (I call these user journeys), I prefer most acceptance tests to jump straight to the point.

Bad

Scenario: Accept Visa and Mastercard for Australia
 Given I am on the home page for Australia
 And I choose the tea menu
 And I select some 'green tea'
 And I add the tea to my basket
 And I choose to checkout
 Then I should see 'visa' is accepted
 And I should see 'mastercard' is accepted

Better

This usually requires adding some special functionality to your app, but the ability for testing to ‘jump’ to certain pages with data automatically set up makes automated tests much easier to read and maintain.

Scenario: Accept Visa and Mastercard for Australia
 Given I am on the checkout page for Australia
 Then I should see 'visa' is accepted
 And I should see 'mastercard' is accepted

Anti-pattern Three: Using complicated XPath or CSS selectors

Using element identification selectors that have long chains from the DOM in them leads to fragile tests, as any change to that chain in the DOM will break your tests.

Bad

private static readonly By TeaTypeSelector =
            By.CssSelector(
                "#input-tea-type > div > div.TeaSearchRow > div.TeaSearchCell.no > div:nth-child(2) > label");

Better

Identify by ‘id’ (unique) or ‘class’. If there are multiple elements in a group, create a parent container and iterate through them.

private static readonly By TeaTypeSelector = By.Id("teaType");

Anti-pattern Four: Directly executing JavaScript

Since WebDriver can directly execute any arbitrary JavaScript, it can be tempting to bypass DOM manipulation and just run the JavaScript.

Bad

public void RemoveTea(string teaType)
{
  (driver as IJavaScriptExecutor).ExecuteScript(string.Format("viewModel.tea.types.removeTeaType(\"{0}\");", teaType));
}

Better

It is much better to let WebDriver control the browser elements, which should fire the correct JavaScript events and call the JavaScript, as that way you avoid having to keep your ‘test’ JavaScript in sync with your ‘real’ JavaScript.

public void RemoveTea(string teaType)
{
  driver.FindElement(By.Id("remove-"+teaType)).Click();
}

Anti-pattern Five: Embedding implementation detail in your features/scenarios

Acceptance test scenarios are meant to convey intention over implementation. If you start seeing things like URLs in your test scenarios you’re focusing on implementation.

Bad


 Scenario: Social media links displayed on checkout page
   Given I am on the checkout page for Australia
   Then I should see a link to 'http://twitter.com/beautifultea'
   And I should see a link to 'https://facebook.com/beautifultea'
 

Better

Hide implementation detail in the steps (or pages, or config) and make your scenarios about the test intention.


 Scenario: Social media links displayed on checkout page
   Given I am on the checkout page for Australia
   Then I should see a link to the Beautiful Tea Twitter account
   And I should see a link to the Beautiful Tea Facebook page
 

I hope you’ve enjoyed these anti-patterns. Leave a comment below if you have any of your own.

100,000 e2e selenium tests? Sounds like a nightmare!

This story begins with a promo email I received from Sauce Labs…

“Ever wondered how an Enterprise company like Salesforce runs their QA tests? Learn about Salesforce’s inventory of 100,000 Selenium tests, how they run them at scale, and how to architect your test harness for success”

[Image: Sauce Labs promotional email]

100,000 end-to-end selenium tests and success in the same sentence? WTF? Sounds like a nightmare to me!

I dug further and got burnt by the molten lava: the slides confirmed my nightmare was indeed real:

[Image: Salesforce Selenium slide]

“We test end to end on almost every action.”

Ouch! (and yes, that is an uncredited image from my blog used in the completely wrong context)

But it gets worse. Salesforce have 7500 unique end-to-end WebDriver tests which are run on 10 browsers (IE6, IE7, IE8, IE9, IE10, IE11, Chrome, Firefox, Safari & PhantomJS) on 50,000 client VMs that cost multiple millions of dollars, totaling 1 million browser tests executed per day (which equals 20 selenium tests per day, per machine, or over 1 hour to execute each test).
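Those back-of-the-envelope numbers can be checked in a few lines of JavaScript. The variable names are mine; the figures are the ones quoted from the slides.

```javascript
// Figures quoted from the Salesforce slides.
const uniqueTests = 7500;
const browsers = 10;
const clientVMs = 50000;
const browserTestsPerDay = 1000000;

// Each unique test run once on every browser.
const testBrowserCombinations = uniqueTests * browsers; // 75,000

// Load per machine, and the time each test has if a machine runs them back to back all day.
const testsPerMachinePerDay = browserTestsPerDay / clientVMs; // 20
const hoursPerTest = 24 / testsPerMachinePerDay; // 1.2
```

So each VM gets through about 20 tests in 24 hours, which works out to a little over an hour per test.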

[Image: Salesforce UI testing portfolio]

My head explodes! (and yes, another uncredited image from this blog used out of context and with my title removed).

But surely that’s only one place right? Not everyone does this?

A few weeks later I watched David Heinemeier Hansson say this:

“We recently had a really bad bug in Basecamp where we actually lost some data for real customers and it was incredibly well tested at the unit level, and all the tests passed, and we still lost data. How the f*#% did this happen? It happened because we were so focused on driving our design from the unit test level we didn’t have any system tests for this particular thing.
…And after that, we sort of thought, wait a minute, all these unit tests are just focusing on these core objects in the system, these individual unit pieces, it doesn’t say anything about whether the whole system works.”

~ David Heinemeier Hansson – Ruby on Rails creator

and read that he had written this:

“…layered on top is currently a set of controller tests, but I’d much rather replace those with even higher level system tests through Capybara or similar. I think that’s the direction we’re heading. Less emphasis on unit tests, because we’re no longer doing test-first as a design practice, and more emphasis on, yes, slow, system tests (Which btw do not need to be so slow any more, thanks to advances in parallelization and cloud runner infrastructure).”

~ David Heinemeier Hansson – Ruby on Rails creator

I started to get very worried. David is the creator of Ruby on Rails and very well respected within the ruby community (despite being known to be very provocative and anti-intellectual: the ‘Fox News’ of the ruby world).

But here is dhh telling us to replace lower level tests with higher level ‘system’ (end to end) tests that use something like Capybara to drive a browser because unit tests didn’t find a bug and because it’s now possible to parallelize these ‘slow’ tests? Seriously?

Speed has always been seen as the Achilles’ heel of end to end tests because everyone knows that fast feedback is good. But parallelization solves this right? We just need 50,000 VMs like Salesforce?

No.

Firstly, parallelization of end to end tests actually introduces its own problems, such as what to do with tests that you can’t run in parallel (for example, ones that change global state of a system such as a system message that appears to all users), and it definitely makes test data management trickier. You’ll be surprised the first time you run an existing suite of sequential e2e tests in parallel, as a lot will fail for unknown reasons.

Secondly, the test feedback to someone who’s made a change still isn’t fast enough to enable confidence in making a change (by the time your app has been deployed and the parallel end-to-end tests have run, the person who made the change has most likely moved onto something else).

But the real problem with end to end tests isn’t actually speed. The real problem with end to end tests is that when end to end tests fail, most of the time you have no idea what went wrong so you spend a lot of time trying to find out why. Was it the server? Was it the deployment? Was it the data? Was it the actual test? Maybe a browser update that broke Selenium? Was the test flaky (non-deterministic or non-hermetic)?

Rachel Laycock and Chirag Doshi from ThoughtWorks explain this really well in their recent post on broken UI tests:

“…unlike unit tests, the functional tests don’t tell you what is broken or where to locate the failure in the code base. They just tell you something is broken. That something could be the test, the browser, or a race condition. There is no way to tell because functional tests, by definition of being end-to-end, test everything.”

So what’s the answer? On one hand you have David’s FUD about unit testing not catching a major bug in Basecamp. On the other hand you need to face the issue that having a large suite of end to end tests will most likely result in you spending all your time investigating test failures instead of delivering new features quickly.

If I had to choose just one, I would definitely choose a comprehensive suite of automated unit tests over a comprehensive suite of end-to-end/system tests any day of the week.

Why? Because it’s much easier to supplement comprehensive unit testing with human exploratory end-to-end system testing (and you should anyway!) than trying to manually verify units function from the higher system level, and it’s much easier to know why a unit test is broken as explained above. And it’s also much easier to add automated end-to-end tests later than trying to retrofit unit tests later (because your code probably won’t be testable and making it testable after-the-fact can introduce bugs).

To answer our question, let’s imagine for a minute that you were responsible for designing and building a new plane. You obviously need to test that your new plane works. You build a plane by creating parts (units), putting these together into components, and then putting all the components together to build the (hopefully) working plane (system).

If you only focused on unit tests, like David mentioned in his Basecamp example, you could be pretty confident that each piece of the plane had been tested well and works correctly, but you wouldn’t be confident it would fly!

If you only focused on end to end tests, you’d need to fly the plane to check the individual units and components actually work (which is expensive and slow), and even then, if/when it crashed, you’d need to examine the black-box to hopefully understand which unit or component didn’t work, as we currently do when end-to-end tests fail.

But, obviously we don’t need to choose just one. And that’s exactly what Airbus does when it’s designing and building the new Airbus A350:

As with any new plane, the early design phases were riddled with uncertainty. Would the materials be light enough and strong enough? Would the components perform as Airbus desired? Would parts fit together? Would it fly the way simulations predicted? To produce a working aircraft, Airbus had to systematically eliminate those risks using a process it calls a “testing pyramid.” The fat end of the pyramid represents the beginning, when everything is unknown. By testing materials, then components, then systems, then the aircraft as a whole, ever-greater levels of complexity can be tamed. “The idea is to answer the big questions early and the little questions later,” says Stefan Schaffrath, Airbus’s vice president for media relations.

The answer, which has been the answer all along, is to have a balanced set of automated tests across all levels, with a disciplined approach to having a larger number of smaller specific automated unit/component tests and a smaller number of larger general end-to-end automated tests to ensure all the units and components work together. (My diagram below with attribution)

Automated Testing Pyramid

Having just one level of tests, as shown by the stories above, doesn’t work (but if it did I would rather have automated unit tests). Just like having a diet of just chocolate doesn’t work, nor does a diet that deprives you of anything sweet or enjoyable (but if I had to choose I would rather have a diet of healthy food only than a diet of just chocolate).

Now if only we could convince Salesforce to be more like Airbus and not fly a complete plane (or 50,000 planes) to test everything every time they make a change, and stop David from continuing on his anti-unit, pro-system testing, anti-intellectual rampage, which will result in more damage to our industry than it’s worth.

Do you REALLY need to run your WebDriver tests in IE?

I recently read that Microsoft are now on board to officially support Selenium WebDriver from Internet Explorer (IE) 11+

Whilst I welcome the news, I try to avoid running WebDriver tests in Internet Explorer completely for the following reasons:

  • Internet Explorer is a very non-testable browser. Whilst everyone agrees testability of your app is paramount, testability of its run-time container, the browser, is equally important. Settings such as security zones, proxies and auto-complete in IE must be manually configured on each machine instead of being programmatically specified by profiles in Firefox and Chrome; and
  • Because IE has historically been so hard to test, WebDriver’s support for IE is much less mature and much less stable and efficient than Firefox and Chrome

The only way automated UI tests can succeed (and the chances of success aren’t high to begin with), is if they are fast and consistent. WebDriver against IE is neither (I see it more of a problem with IE than WebDriver). So if you want to use WebDriver, don’t test against IE, test against Firefox or Chrome.

But in my role as a consultant, I continually hear managers say that we must run our WebDriver automated tests in Internet Explorer. There are usually one or two reasons given:

  1. Our web app is for internal staff only and our only supported browser is IE (which is usually IE8); and/or
  2. Our web app (or the one we pay for) has been specifically coded to work only in IE and therefore it’s not possible to test in another browser.

You need to explain that your WebDriver automated tests aren’t the only tests you’ll run against your app. In a corporate environment (such as those who only support IE8), chances are you’ll have a period of business acceptance testing or user acceptance testing. This will be conducted by users in the browser they use, so this straight away mitigates the risk of only running your automated tests against a non-IE browser.

From my experience testing many applications against older versions of IE, the one thing that doesn’t work well (and causes web apps to break) is not the HTML but JavaScript support. If your app contains a decent amount of JavaScript you could write some JavaScript tests in a tool like js-test-driver and run these automatically against older versions of IE. That way you can be assured your JavaScript is working without having to deal with IE/WebDriver issues (and slow running tests).

As for applications specifically coded to work in IE: web standards exist for a reason, and in my opinion it’s crazy to develop a web app that is tied to a single vendor’s browser implementation. Microsoft purposely made IE11 report itself to a web server as not being IE so that it can avoid this exact situation happening in the future.

Chances are if your app is hard-coded to only work in IE then it won’t work in IE11 anyway. If it works in IE11, then it’ll work in Chrome and Firefox as they all follow web standards, and you can run your WebDriver tests reliably now.

I believe you’re better off not having any automated UI tests if there’s a mandate in place that you must run them against IE. If you can’t automatically test your app in Firefox or Chrome, I believe you’re better off spending your time manually testing your app in IE than trying to maintain a test suite that will never be efficient or reliable.

Using appium in Ruby for iOS automated functional testing

As previously explained, I recently started on an iOS project and have spent a bit of time comparing iOS automation tools and chose Appium as the superior tool.

The things I really like about Appium are that it is language/framework agnostic as it uses the standard WebDriver JSON Wire Protocol, it doesn’t require any modifications to your app, it supports testing web views (also known as hybrid apps), and it supports Android, which matters since we are concurrently developing an Android application (it also supports OSX and Firefox OS but we aren’t developing for those, yet). There isn’t another iOS automated testing tool, that I know of, that ticks that many boxes for me.

Getting Started

The first thing to do is download the appium.app package from the appium website. I had an issue with the latest version (0.11.2) launching the server which can be resolved by opening the preferences and checking “Override existing sessions”.

You run the server from inside the appium.app which takes your commands and relays them to the iOS simulator. There’s also a very neat ‘inspector’ tool which shows you all the information you need to know about your app and how to identify elements.

Note: there’s currently a problem with XCode 5.0.1 (the latest version as I write) which means Instruments/UIAutomation won’t work at all. You’ll need to downgrade (uninstall/reinstall) to XCode 5.0 to get appium to work.

Two Ruby Approaches

This confused me a little to start, but there are actually two vastly different ways to use appium in ruby.

1) Use the standard selenium-webdriver gem

If you’re used to using WebDriver, like me, this will be the most straightforward approach (this is the approach I have taken). Appium extends the API to add different gestures by calling execute_script from the driver, so all other commands stay the same (for example, find_element).

2) Use the appium_lib library

There is a Ruby gem appium_lib that has a different API to the selenium-webdriver gem to control appium. I don’t see any massive benefits to this approach besides having an API that is more specific to app testing.

Using Selenium-WebDriver to start appium in ruby

Launching an appium app is as simple as defining some capabilities with a path to your .app file you have generated using XCode (this gets put into a deep folder so you can write the location to a file and read it from that file).

require 'selenium-webdriver'

capabilities = {
  'browserName' => 'iOS',
  'platform' => 'Mac',
  'version' => '6.1',
  'app' => appPath # path to the .app file generated by XCode
}
driver = Selenium::WebDriver.for :remote,
  desired_capabilities: capabilities,
  url: "http://127.0.0.1:4723/wd/hub"
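
Reading the app location back from a file, as mentioned above, could look something like this. It’s only a minimal sketch: the read_app_path and ios_capabilities helpers are names I’ve made up for illustration, and it assumes your build step has already written the path of the .app into a text file.

```ruby
# Hypothetical helper: reads the path of the built .app bundle from a text
# file that the build step is assumed to have written.
def read_app_path(path_file)
  unless File.exist?(path_file)
    raise ArgumentError, "No app path file found at #{path_file}"
  end

  app_path = File.read(path_file).strip
  raise ArgumentError, "#{path_file} is empty" if app_path.empty?

  app_path
end

# Hypothetical helper: builds the same capabilities hash as above from the
# stored path.
def ios_capabilities(app_path)
  {
    'browserName' => 'iOS',
    'platform' => 'Mac',
    'version' => '6.1',
    'app' => app_path
  }
end
```

You would then pass ios_capabilities(read_app_path('app_path.txt')) as the desired capabilities, where app_path.txt is whatever file name you chose.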

Locating elements

Once you’ve launched your app, you’ll be able to use the appium inspector to see element attributes you can use in appium. Name is a common attribute, and if you find that it’s not being shown, you can set the accessibilityIdentifier property in your Objective-C view code, which will flow through to appium. This makes for much more robust tests than relying on labels or xpath expressions.

driver.find_element(:name, "ourMap").displayed?

Enabling location services for appium testing

This got me stuck for a while as there’s quite a bit of conflicting information about appium on how to handle the location services dialog. Whilst you should be able to interact with it as a normal dialog in the latest version of appium, I would rather not see it at all, so I wrote a method to copy a plist file with location services enabled in it to the simulator at the beginning of the test run. It’s quite simple (you can manually copy the clients.plist after manually enabling location services):

require 'fileutils'

# Copies a clients.plist that already has location services enabled
# into the simulator's locationd cache folder.
def copy_location_services_authentication_to_sim
  source = "#{File.expand_path(File.dirname(__FILE__))}/clients.plist"
  destination = "#{File.expand_path('~')}/Library/Application Support/iPhone Simulator/7.0/Library/Caches/locationd"
  FileUtils.cp_r(source, destination, :remove_destination => true)
end

Waiting during appium tests

This is exactly the same as selenium-webdriver. There’s an implicit wait, or you can explicitly wait like so:

driver.manage.timeouts.implicit_wait = 10
wait = Selenium::WebDriver::Wait.new :timeout => 30
wait.until { driver.find_element(:name, 'monkeys').displayed? }
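
If you’re wondering what an explicit wait is doing, it is essentially a polling loop. Here is a rough plain Ruby sketch of the idea. It is not the gem’s actual implementation, and wait_until is a name I’ve made up; in real tests you’d keep using Selenium::WebDriver::Wait as above.

```ruby
# Rough sketch of an explicit wait: poll the block until it returns a truthy
# value or the timeout expires.
def wait_until(timeout: 30, interval: 0.2)
  deadline = Time.now + timeout
  loop do
    begin
      result = yield
      return result if result
    rescue StandardError
      # Assumption: treat any error (e.g. element not found yet) as
      # "not ready" and try again on the next pass.
    end
    raise "Timed out after #{timeout} seconds" if Time.now >= deadline
    sleep interval
  end
end
```

For example, wait_until(timeout: 30) { driver.find_element(:name, 'monkeys').displayed? } would behave much like the explicit wait above.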

Mobile gestures

The obvious difference between a desktop web browser and a mobile app is gestures. Appium adds gestures to WebDriver using execute_script. I recommend using the percentage method (0.5 etc) instead of pixel method as it is more resilient to UI change.

For example:

driver.execute_script 'mobile: tap', :x => 0.5, :y => 0.5

or

b = driver.find_element :name, 'Sign In'
driver.execute_script 'mobile: tap', :element => b.ref
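
If you only know a position in pixels (from a screenshot, say), converting it to the percentage form is simple arithmetic. A small sketch, where relative_point is a hypothetical helper of my own and the screen dimensions are whatever your simulator reports:

```ruby
# Hypothetical helper: converts a pixel position into the 0.0-1.0 fractions
# used by the percentage form of 'mobile: tap' shown above.
def relative_point(x_px, y_px, screen_width, screen_height)
  {
    :x => (x_px.to_f / screen_width).round(3),
    :y => (y_px.to_f / screen_height).round(3)
  }
end

# The centre of a 320x568 screen:
# relative_point(160, 284, 320, 568) # => {:x=>0.5, :y=>0.5}
```

The resulting hash can be passed straight through, for example driver.execute_script 'mobile: tap', relative_point(160, 284, 320, 568).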

Testing Embedded Web Views

The native and web views seamlessly combine so you can use the same find_element method to find either. The appium.app inspector displays the appropriate attributes.

Note: I can’t seem to be able to execute a gesture (eg. swipe) over a Web View. I don’t know whether this is a bug or a limitation of Appium.

Summary

I have found that using the familiar selenium-webdriver gem with appium has been very powerful and efficient. Being able to open an interactive prompt (pry or irb) and explore your app using the selenium-webdriver library and the appium.app inspector is very powerful as you can script on the fly. Whilst appium still seems relatively immature, it seems a very promising approach to iOS automation.

Now to get watir-webdriver to work with appium.

Watir-WebDriver with GhostDriver on OSX: headless browser testing

GhostDriver has been released which means it is now easy to run reliable headless WebDriver tests on Mac OSX.

Steps to get working on OSX

  1. First make sure you have homebrew installed
  2. Run
    brew update

    then

    brew install phantomjs

    which should install PhantomJS 1.8.1 or newer

  3. Run irb and start using GhostDriver!
    require 'watir-webdriver'
    b = Watir::Browser.new :phantomjs
    b.goto "www.google.com"
    b.url #"http://www.google.com.au/"
    b.title #"Google"

I’ve tested it on a large test suite (123 scenarios) and it behaves the same as other browsers with full JavaScript support. It took 8m13s in total: surprisingly it is slightly slower than ChromeDriver (7m30s) in my testing, but a little faster than the Firefox Driver (9m33s).
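
To put those timings in perspective, here’s a quick Ruby calculation. The to_seconds and percent_slower helpers are just made up for this illustration.

```ruby
# Converts a duration such as "8m13s" into seconds.
def to_seconds(duration)
  match = duration.match(/\A(\d+)m(\d+)s\z/)
  raise ArgumentError, "Unrecognised duration: #{duration}" unless match

  minutes, seconds = match.captures.map(&:to_i)
  minutes * 60 + seconds
end

# How much longer (in percent) a run took compared with a baseline run.
def percent_slower(duration, baseline)
  run = to_seconds(duration)
  base = to_seconds(baseline)
  ((run - base) * 100.0 / base).round(1)
end

# percent_slower('8m13s', '7m30s') # => 9.6  (GhostDriver vs ChromeDriver)
# percent_slower('9m33s', '8m13s') # => 16.2 (Firefox Driver vs GhostDriver)
```

So on this suite GhostDriver took roughly 10% longer than ChromeDriver, and the Firefox Driver took roughly 16% longer than GhostDriver.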

Well done to all involved in this project. It’s great to see a reliable, realistic headless browser with full JavaScript support for WebDriver finally released.

And yes, in case you’re wondering, it does screenshots!

Using WebDriver to automatically check for JavaScript errors on every page

Update: please see my newer post on this that doesn’t require changes to your application.


One of the benefits of using a page-object model is that you can perform certain actions on every page in your application that you visit, such as checking for accessibility.

One such check is automatically checking for JavaScript errors on page load. There are a couple of approaches out there; one involves copying and pasting a small snippet of JavaScript into each page. Since the application we are working on uses a standard template for every page, we simply add some JavaScript to the common page header that is the first thing to load on the page, and catches any JavaScript errors that occur:

define(["amdUtils/string/interpolate"], function(interpolate) {
    window.jsErrors = [];

    window.onerror = function (errorMessage, url, lineNumber) {
        var message = interpolate("Error: [{{0}}], url: [{{1}}], line: [{{2}}]", [errorMessage, url, lineNumber]);
        window.jsErrors.push(message);
        return false;
    };
});

It’s then a matter of checking the jsErrors each time we visit a page. This is example C# code we use in the base page class for every page:

var js = driver as IJavaScriptExecutor;
ICollection javascriptErrors = null;
for (var i = 0; i < 20; i++)
{
  javascriptErrors = js.ExecuteScript("return window.jsErrors") as ICollection;
  if (javascriptErrors != null) break;
  System.Threading.Thread.Sleep(1000);
}
Assert.IsNotNull(javascriptErrors, "Can't seem to load JavaScript on the page to find JavaScript errors. Check that JavaScript is enabled.");
var javaScriptErrorsAsString = javascriptErrors.Cast<string>().Aggregate("", (current, error) => current + (error + ", "));
Assert.AreEqual("", javaScriptErrorsAsString, "Found JavaScript errors on page load: " + javaScriptErrorsAsString);

This code waits until it can read the jsErrors on the page. If it can’t, it means that JavaScript didn’t load and this is an error. Once it gets the jsErrors, it checks that there are none.
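
The same check could be sketched in Ruby along these lines. This isn’t code from our project: javascript_errors and assert_no_javascript_errors are names I’ve made up, and the only assumption about driver is that it responds to execute_script, as selenium-webdriver and watir-webdriver both do.

```ruby
# Polls window.jsErrors until the page has defined it, mirroring the C# loop
# above. Returns the array of error messages (empty when there are none).
def javascript_errors(driver, attempts: 20, delay: 1)
  attempts.times do
    errors = driver.execute_script('return window.jsErrors')
    return errors unless errors.nil?
    sleep delay
  end
  raise "Can't read window.jsErrors. Check that JavaScript is enabled."
end

# Raises with a readable message if any JavaScript errors were recorded.
def assert_no_javascript_errors(driver, **options)
  errors = javascript_errors(driver, **options)
  return if errors.empty?

  raise "Found JavaScript errors on page load: #{errors.join(', ')}"
end
```

Calling assert_no_javascript_errors(driver) from a base page class would give you the same behaviour on every page you visit.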

This code has been very useful for us. It has caught a number of JavaScript errors, and is especially great for finding cross browser JavaScript issues, as we run our acceptance tests in 5 different browsers.

Checking an image is actually visible using WebDriver

I didn’t realize it’s actually a little tricky to check that an image is loaded when using WebDriver. WebDriver will only complain if the image tag you’re looking for isn’t in the DOM, not if the image link is broken and not actually visible.

For example, in watir-webdriver (ruby), this doesn’t really work as I would expect as the image isn’t actually visible on the ‘brokenimage’ page.

require 'watir-webdriver'
b = Watir::Browser.new :firefox
b.goto 'https://dl.dropbox.com/u/18859962/brokenimage.html'
puts b.image(id: 'watermelon').visible? #true but is not visible

The way to check that it is actually visible is to check that the JavaScript property ‘naturalWidth’ is greater than 0.

b = Watir::Browser.new :firefox
b.goto 'https://dl.dropbox.com/u/18859962/brokenimage.html'
puts b.execute_script("return (typeof arguments[0].naturalWidth!=\"undefined\" && arguments[0].naturalWidth>0)", b.image(id: 'watermelon'))

Unfortunately this doesn’t work in IE, so you should use the ‘complete’ JavaScript method in IE (which doesn’t work in other browsers):

b = Watir::Browser.new :ie
b.goto 'https://dl.dropbox.com/u/18859962/brokenimage.html'
puts b.execute_script("return arguments[0].complete", b.image(id: 'watermelon'))

In C#, you can wrap this up into a WebDriver extension method so you can call this directly from Driver, passing in the image element.

public static bool IsImageVisible(this IWebDriver driver, IWebElement image)
  {
    var script = TestConfig.DriverType == "ie"
                ? "return arguments[0].complete"
                : "return (typeof arguments[0].naturalWidth!=\"undefined\"" +
                  " && arguments[0].naturalWidth > 0)";
    return (bool) ((IJavaScriptExecutor) driver).ExecuteScript(script, image);
  }

// Usage
Driver.IsImageVisible(Driver.FindElement(By.Id("watermelon")));

If it’s important that images load correctly in your application, you should probably start putting some of these in your WebDriver page objects. It’s simple to write a verify images method on a page that iterates through each image in the DOM and checks that it’s visible using the techniques above. Have fun.
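
As a starting point, a verify images method in Ruby might look something like the sketch below. The broken_images and image_loaded_script names are my own, and browser is assumed to be a watir-webdriver browser (anything that responds to images and execute_script will do). It uses the naturalWidth check, so older versions of IE would need the ‘complete’ variant instead.

```ruby
# The naturalWidth check from above, kept in one place.
def image_loaded_script
  'return (typeof arguments[0].naturalWidth != "undefined"' \
    ' && arguments[0].naturalWidth > 0)'
end

# Returns the images on the current page that failed to load.
def broken_images(browser)
  browser.images.reject do |image|
    browser.execute_script(image_loaded_script, image)
  end
end
```

A page object could then fail the test whenever broken_images(browser) is not empty.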

Update: 30 November
I wrote about a slightly more elegant C# approach to do this directly from the element.

Are your IE WebDriver tests running slow? Maybe it’s the screenshots

My current job involves running a suite of automated acceptance and accessibility tests automatically across four browsers (IE8, IE9, Firefox & Chrome) on every check in. These are run automatically using a ThoughtWorks Go pipeline which is run on a freshly deployed integrated QA environment immediately after all unit, integration and JavaScript automated tests pass.

Whilst I set up five agents to run these tests in parallel across the different browsers, the build was as slow as its slowest member (much like a buffalo herd) which happened to be IE8 (followed closely by IE9).

Test Agents

Initially the execution times looked something like this:

  • Chrome ~50 secs
  • Firefox ~1 min 10 secs
  • IE9 ~4 mins 30 secs
  • IE8 ~5 mins

I was wondering why on earth it was taking so long, when Simon Stewart pointed out how screenshots work in the IE Driver. I set up the tests to take a screenshot at the end of each scenario, which meant each browser was taking about 18 screenshots per test run.

I didn’t know but the IE Driver maximizes then restores the IE window every time it takes a screenshot, and it also parses the entire DOM to take a screenshot. This is why it was taking so long to execute the tests.

I removed the screenshots from the IE runs and was able to reduce both IE8 and IE9 to just over 2 minutes execution time. Not the best, after all it’s over twice as slow as Chrome, but better than 5 minutes previously!

In the future, I’ll avoid taking any screenshots using IE Driver wherever possible.
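
One way to do that is a small guard around the end-of-scenario screenshot. This is only a sketch in Ruby: the helper names and the browser name strings are assumptions of mine, and driver is assumed to respond to save_screenshot, as selenium-webdriver drivers do.

```ruby
# Screenshots are skipped for IE because the IE Driver makes them so slow.
def screenshot_after_scenario?(browser_name)
  !%w[ie internet_explorer].include?(browser_name.to_s.downcase)
end

# Saves a screenshot only when the browser makes that cheap.
# Returns true if a screenshot was taken, false if it was skipped.
def capture_scenario_screenshot(driver, browser_name, path)
  return false unless screenshot_after_scenario?(browser_name)

  driver.save_screenshot(path)
  true
end
```

Hooking capture_scenario_screenshot into whatever runs after each scenario keeps the screenshots for Chrome and Firefox while leaving the IE runs fast.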