Are your IE WebDriver tests running slow? Maybe it’s the screenshots

My current job involves running a suite of automated acceptance and accessibility tests across four browsers (IE8, IE9, Firefox & Chrome) on every check-in. These run automatically in a ThoughtWorks Go pipeline against a freshly deployed, integrated QA environment, immediately after all unit, integration and automated JavaScript tests pass.

Whilst I set up five agents to run these tests in parallel across the different browsers, the build was only as fast as its slowest member (much like a buffalo herd), which happened to be IE8 (followed closely by IE9).

Test Agents

Initially the execution times looked something like this:

  • Chrome ~50 secs
  • Firefox ~1 min 10 secs
  • IE9 ~4 mins 30 secs
  • IE8 ~5 mins

I was wondering why on earth it was taking so long, when Simon Stewart pointed out how screenshots work in the IE Driver. I had set up the tests to take a screenshot at the end of each scenario, which meant each browser was taking about 18 screenshots per test run.

What I didn’t know is that the IE Driver maximizes and then restores the IE window every time it takes a screenshot, and it also parses the entire DOM to do so. This is why the tests were taking so long to execute.

I removed the screenshots from the IE runs and was able to reduce both IE8 and IE9 to just over 2 minutes execution time. Not great, it’s still over twice as slow as Chrome, but much better than the previous 5 minutes!
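The change itself is trivial if your screenshots come from a single after-scenario hook. Here’s a minimal sketch in Python (our tests aren’t necessarily written in Python; the hook name and path are illustrative only): just skip the screenshot when the driver is Internet Explorer.

def after_scenario(driver, scenario_name):
    # driver.name reports 'internet explorer' for the IE Driver
    if driver.name != "internet explorer":
        driver.save_screenshot("screenshots/" + scenario_name + ".png")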

In the future, I’ll avoid taking any screenshots using IE Driver wherever possible.

Automated WCAG 2.0 accessibility testing in the build pipeline

The web application I am currently working on must meet accessibility standards: WCAG 2.0 AA to be precise. WCAG 2.0 is a collection of guidelines that help ensure your site is accessible to people with disabilities.

Examples of poor accessibility design include a missing alt attribute on an image, or not specifying the language of a document, which should be declared like this:

<html lang="fr">

Building Accessibility In

Whilst later on we’ll be doing accessibility assessments with people who are blind or have low vision, we need to make sure we build accessibility in from the start. To do this, we need automated accessibility tests as part of our continuous integration build pipeline. My challenge this week was to do just that.

Automated Accessibility Tests

First I needed to find a tool to validate against the spec. We’re developing the web application locally, so we need something that can run locally too. There’s a tool called TotalValidator which offers a command line version; the only downside is the licensing cost, as to use the command line version you need to buy at least six copies of the tool at £25 per copy, approximately US$240 in total. There’s also no trial of the command line tool unless you buy at least one copy of the pro tool at £25. I didn’t want to spend money on something that might not work, so I kept looking for alternatives.

There are two sites I found that validate accessibility by URL or supplied HTML code: AChecker and WAVE.

AChecker: this tool works really well. It even supplies a REST API, but I couldn’t find a way to call the API to validate HTML code (instead of by URL), which is what I would like to do. The software behind AChecker is open source (PHP), so you can install your own instance behind your firewall if you wish.
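Calling the web service by URL is straightforward. The Python sketch below reflects how I understand the API to work from AChecker’s documentation; the endpoint, parameter names and guideline identifier are assumptions to verify against the current docs, and you need to register for your own web service ID.

import requests

# Assumed AChecker web service endpoint and parameters (check the docs).
response = requests.get("http://achecker.ca/checkacc.php", params={
    "uri": "http://example.com/page-to-check",
    "id": "YOUR_WEB_SERVICE_ID",  # issued when you register with AChecker
    "guide": "WCAG2-AA",          # validate against WCAG 2.0 AA
    "output": "rest",             # return XML results rather than an HTML page
})
print(response.text)  # XML listing known, likely and potential problems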

WAVE: a new tool recently released by WebAIM (Web Accessibility in Mind). Again this is an online checker that allows you to validate by URL or by supplied HTML code, but unfortunately there’s no API (yet) and the results aren’t as easy to read programmatically.

My Solution

The final solution I came up with is a specifically tagged web accessibility feature in our acceptance tests project. Its scenarios use WebDriver to navigate through our application, capturing the HTML source code of each page visited. Finally, it visits the AChecker tool online, validates each piece of HTML source code it collected, and fails the build if any accessibility problems are found.
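Stripped of our project specifics, the approach looks roughly like the Python sketch below. It is a sketch only: the application URLs are made up, and the AChecker form locators and success message are assumptions for illustration rather than the real element IDs.

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()

# 1. Walk through the application, collecting the HTML source of each page.
pages = {}
for name, url in [("login", "http://qa.example.com/login"),
                  ("home", "http://qa.example.com/home")]:   # hypothetical pages
    driver.get(url)
    pages[name] = driver.page_source

# 2. Paste each captured page into the online AChecker form and check the result.
failures = []
for name, html in pages.items():
    driver.get("http://achecker.ca/checker/index.php")
    driver.find_element(By.LINK_TEXT, "Paste HTML Markup").click()  # assumed tab label
    driver.find_element(By.ID, "pastehtml").send_keys(html)         # assumed textarea id
    driver.find_element(By.ID, "validate_paste").click()            # assumed button id
    if "No known problems" not in driver.page_source:               # assumed success text
        failures.append(name)

driver.quit()
assert not failures, "Accessibility problems found on: " + ", ".join(failures)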

AChecker Results

Build Light

We have a specific build light that runs all the automated acceptance tests and accessibility tests. If any of these fail, the light goes red.

Build Status Lights

It’s much better if it looks like this:

Build Lights All Green

Summary

It was fairly easy to use an existing free online accessibility checker to validate all the HTML code in our locally developed application, and to make the result highly visible to the development team. By building accessibility in, we’re reducing the number of issues we expect to find when we conduct more formal accessibility testing.

Bonus Points: faster accessibility feedback

Ideally, a developer or tester should be able to check accessibility as a page is being developed (rather than waiting for the build to fail). The easiest way I have found to do this behind a firewall is the WAVE Firefox extension, which displays errors as you use your site. Fantastic!

(Apple and Microsoft each have one known accessibility problem, Google has nine!)

Apple.com accessibility

Disgraceful degradation

Old browsers are a headache for websites: to develop for, to test, you name it, they’re nothing but bad news.

Thankfully, modern browsers like Google Chrome, Apple Safari & Mozilla Firefox are not only more compliant with open standards, but are generally updated automatically, so there are going to be far fewer legacy versions in the wild.

There are two techniques I am familiar with for catering for older browsers: graceful degradation and progressive enhancement. Essentially these achieve the same outcome, sites that still work on older browsers, but they are different approaches to the same problem.

Graceful degradation is building a site that is optimized for modern browsers, but then adding functionality to gracefully degrade (but still function) when accessed via an older browser.

Progressive enhancement is building a simple site that functions everywhere, then adding enhancements that take advantage of modern browser technology when it’s available.

Whilst these achieve similar outcomes, I believe we’ve reached a tipping point where most people use non-Microsoft, non-legacy browsers, so graceful degradation is the better bet these days.

At work today we noticed a problem where a Google font wasn’t loading on a dev machine, and our site rendered in Comic Sans (the font we all love to hate), which immediately gave me a great idea, disgraceful degradation: display the content of our site in Comic Sans if you’re using IE8 or below. That should make them upgrade.

Where is the ‘story’ in user stories?

There’s an old Jerry Weinberg quote:

“no matter what the problem is, it’s always a people problem”

which pretty much describes every project I’ve worked on over the years. Lately, I’ve particularly noticed that most of the tough problems evident on projects are more people or business problems than technology problems, which makes me think it’s worthwhile for me to continue my exploration of the business/user end of my list of software development roles.

BA = Business Analyst
UX = User Experience
ET = Exploratory Tester
TT = Technical Tester
Dev = Software Developer

In this vein, I’ve recently been trying to understand how to better articulate user stories, as one day I’d love to work as a business analyst.

Most nights I read stories to my (almost) three-year-old son as a nice way to end the day. Lately I have been making up my own impromptu stories to keep things interesting. I have really enjoyed making up stories on the spot; I think I’d be a good BA.

But thinking about user stories along with bedtime stories immediately raises a question: where is the ‘story’ in user stories?

Most user stories at work sound something like this: “As a user, I want to log onto the system, so that I can access my information”. What a shitty story! Sorry, but seriously, if I told this story to my two-year-old son, he’d die of boredom!

I’ve spent a fair amount of time reading about user stories but I still can’t find out why they’re actually called stories, because I don’t think they are actual stories:

sto·ry/ˈstôrē/
Noun:
An account of imaginary or real people and events told for entertainment: “an adventure story”.

The closest thing I have found to actual user stories is the concept of ‘soap opera‘ testing scenarios which outline implausible yet possible scenarios:

“A man (Chris Patterson) and his wife (Chris Patterson) want to take their kids (from previous marriages, Chris Patterson, a boy, and Chris Patterson, a girl) on a flight from San Francisco to Glasgow to San Jose (Costa Rica) to San Jose (California) back to San Francisco. He searches for flights by schedule. He’s a Super-Elite-Premium frequent flyer, but he doesn’t want the upgrade that the system automatically grants him so that he can sit with his wife and kids in economy class. He requires a kosher meal, his wife is halal, the boy is a vegetarian, and the girl is allergic to wheat. He has four pieces of luggage per person (including two pairs of skis, three sets of golf clubs, two 120 lb. dogs, and three overweight suitcases), where his frequent flyer plan allows him (but only him) to take up to four checked items, but the others can only take two each. He gets to the last step on the payment page before he realizes that he has confused San Jose (California) for San Jose (Costa Rica), so the order of the itinerary is wrong. The airline cancels the flight after it has accepted his bags, and reroutes him on a partner. The partner cancels the flight (after it has accepted the bags) to San Jose (California) so it reroutes him to another competitor, who cancels the flight (after accepting the bags) to San Jose (Costa Rica) and reroutes him to another partner, who goes bankrupt after it has accepted the bags for the San Francisco flight.”

~ Michael Bolton

Now that’s a real user story!

So, I think we have two choices on the user stories front. We can either make our user stories read like real, juicy stories, or at least start calling them something else!

The color of acceptance is gray

James Shore recently wrote some brilliant words about acceptance testing:

I think “acceptance” is actually a nuanced problem that is fuzzy, social, and negotiable. Using tests to mediate this problem is a bad idea, in my opinion. I’d rather see “acceptance” be done through face-to-face conversations before, after, and during development of code, centering around whiteboard sketches (earlier) and manual demonstrations (later) rather than automated tests.

To rephrase: “acceptance” should be a conversation, and it’s one that we should allow to grow and change as the customer sees the software and refines her understanding of what she wants. Testing is too limited, and too rigid. Asking customers to read and write acceptance tests is a poor use of their time, skill, and inclinations.

This is pretty much where my head is at right now around automating acceptance tests. Automated tests are black and white, acceptance is gray.

“The color of truth is gray.”
~ André Gide

I’d rather have a handful of end-to-end automated functional tests that cover the typical journey of a user than a large set of acceptance tests constantly in a state of flux as the system is being developed and acceptance is being defined and changed.

We need to take feedback from the customer that we are building the right thing and ensure our automated tests model this, not make the customer responsible for specifying the actual tests.

Mobile apps still need automated tests

Jonathan Rasmusson recently wrote what I consider to be quite a contentious blog post about iOS application development titled “It’s not about the unit tests”.

“…imagine my surprise when I entered a community responsible for some of the worlds most loved mobile applications, only to discover they don’t unit test. Even more disturbing, they seem to be getting away with it!”

Whilst I agree with the general theme of the blog post, which is to change your mind and challenge your assumptions:

“All I can say is to keep growing sometimes we need to challenge our most cherished assumptions. It doesn’t always feel good, but that’s how we grow, gain experience, and turn knowledge into wisdom.”

“The second you think you’ve got it all figured out you’ve stopped living.”

I don’t agree with the content.

Jonathan’s basic premise is that you can get away with little or no unit testing for your iOS application for a number of reasons, including developing for a smaller screen size, having no legacy code, using one language, visual development, and building on a mature platform. But the real reason iOS developers get away with it, he argues, is by caring.

“These people simply cared more about their craft, and what they were doing, than their contemporaries. They ‘out cared’ the competition. And that is what I see in the iOS community.”

But in writing this post, I believe he missed two critical factors when deciding whether to have automated tests for your iOS app.

iOS users are unforgiving

If you accidentally release an app with a bug, see how quickly you’ll start getting one-star reviews and nasty comments in the App Store. See how quickly new users will uninstall your app and never use it again.

The App Store approval process is not capable of supporting quick bug fixes

Releasing a new version of your app that fixes a critical bug may take you 2 minutes (you don’t even need to fix a broken test or write a new test for it!), but it then takes Apple 5-10 business days to release it to your users. This doesn’t stop the one-star reviews and comments destroying your reputation in the meantime.

Case in Point: Lasoo iPhone app

I love the Lasoo iPhone app, because it allows me to read store catalogs on my phone (I live in an apartment block and we don’t get them delivered). Recently I upgraded the app and tried to use it, but it wouldn’t even start. I tried the usual close/reopen and delete/reinstall, but still nothing. I then checked the App Store:

Lasoo iPhone app reviews

Oh boy, hundreds of one-star reviews within a couple of days: the app is stuffed! I then checked Twitter to make sure they knew it was broken, and to my surprise they’d fixed it immediately but were waiting for Apple to approve the fix.

I can’t speculate on whether Lasoo care about their app or not, but imagine for a second if they had just one automated test, one that launched the app to make sure it worked, run every time a change was made, no matter how small. That one automated test would have saved them from hundreds of one-star reviews and from having to apologize to customers on Twitter whilst they waited for Apple to approve the fix.
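To be clear about how little I’m asking for, here is a sketch of such a test using Appium’s Python client. This is purely illustrative: I have no idea what toolchain Lasoo use, the device name and app path below are hypothetical, and any iOS driver that can launch the app would do the job.

from appium import webdriver

caps = {
    "platformName": "iOS",
    "deviceName": "iPhone Simulator",  # hypothetical device
    "app": "/path/to/Lasoo.app",       # hypothetical path to the built app
}

# Launch the app via a local Appium server; this alone fails if the app
# crashes on startup.
driver = webdriver.Remote("http://localhost:4723/wd/hub", caps)
try:
    assert driver.page_source, "App launched but rendered nothing"
finally:
    driver.quit()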

Which raises another point:

“[Apple] curate and block apps that don’t meet certain quality or standards.”

The Lasoo app was so broken it wouldn’t even start, so how did it get through Apple’s approval process for certain quality or standards?

Just caring isn’t enough to protect you from introducing bugs

We all make mistakes, even if we care. That’s why we have automated tests, to catch those mistakes.

Not having automated tests is a bit like having unprotected sex. You can probably get away with it forever if you’re careful, but the consequences of getting it wrong are pretty dramatic. And just because you can get away with it doesn’t mean that other people will be able to.

Why hot-desking is a bad idea

Hot-desking (aka hotelling) in open plan offices seems to be growing in popularity, and why not, since it seems to make sense from a financial and collaborative viewpoint. But I believe it’s actually a bad idea. Here’s why:

It’s unhygienic: Most hot-desk arrangements I have seen involve a thin client PC (eg. Windows Thin PC) on each hot-desk, which lets the current user access their computer session wherever they log in. Since the average keyboard has sixty times more germs than a toilet seat, I actually feel disgusted every time I sit down at a hot-desk expecting to use the filthy keyboard all day (much like if someone asked me to set up my computer on a toilet seat).

It’s confusing: Not knowing where someone is each day is particularly confusing, especially for new starters who are still getting to know people. Sure, IM solves this to some degree, but I’ve still spent time roaming the office floors looking for people because I didn’t know where they were sitting that day.

It doesn’t actually work: Even though organizations adopt hot-desking so they can cut down on the number of desks and get people to sit together, I have found that people still get established at certain desks when they know they’ll be working there for some time, as they can’t be bothered packing up their things and setting up at a new desk each day. The only time these people seem to get displaced is when they go on leave and someone else has to sit at their ‘hot desk’ amongst the stuff they have left behind. The lack of dedicated space has also been shown to make employees feel isolated and teamless, amongst other things:

A study released by the University of Sheffield in the UK shows it diminishes the connection between colleagues, and the scattered locations make it difficult for people to communicate with each other.

It continues the obsession with open plan: I seriously don’t like open plan offices and don’t understand why software development workplaces continue to foster them. They encourage constant interruption and distraction, which inhibit productivity. Paul Graham explained it best, back in 2004:

After software, the most important tool to a hacker is probably his office. Big companies think the function of office space is to express rank. But hackers use their offices for more than that: they use their office as a place to think in. And if you’re a technology company, their thoughts are your product. So making hackers work in a noisy, distracting environment is like having a paint factory where the air is full of soot.

So what is the answer?

Progressive software companies like Campaign Monitor provide dedicated offices to each team member and a large kitchen table to ensure everyone eats lunch together every day.

I like the idea of providing a dedicated office to each team member to work quietly without disruption, and separate (soundproofed) open areas for team collaboration and socializing. The open areas should be easy to book or usable for impromptu discussions, and must be clean and connected, to maximise productivity.

As Kelly Executive Recruitment GM Ray Fleming says:

Productivity and motivation are maximised when employees have their own workspace. It helps them to feel part of the organisation and solidifies their position in the team, and businesses need to keep this in mind. Businesses also need to be aware that shifting to hot-desking just to save money may drive some employees to look elsewhere for employment.