AMA: JS vs Ruby

Butch Mayhew asks…

I have noticed you blogging more about JS frameworks. How do these compare to Watir/Ruby? Would you recommend one over the other?

My response…

I had a discussion recently with Chuck van der Linden about this same topic as he has a lot of experience with Watir and is now looking at JavaScript testing frameworks like I have done.

Some Background built an entirely new UI for managing sites using 100% JavaScript with React for the main UI components. I am responsible for e2e automated tests across this UI, and whilst I originally contemplated, and trialled even, using Ruby, this didn’t make long term sense for where the original WordPress developers are mostly PHP and the newer UI developers are all JavaScript.

Whilst I see merit in both views: I still think having your automated acceptance tests in the same language as your application leads to better maintainability and adoptability.

I still think writing automated acceptance tests in Ruby is much cleaner and nicer than JavaScript Node tests, particularly as Ruby allows meta-programming which means page objects can be implemented really neatly.

The JavaScript/NodeJS landscape is still very immature where people are using various tools/frameworks/compilers and certain patterns or de facto standards haven’t really emerged yet. The whole ES6/ES2015/ES2016 thing is very confusing to newcomers like me, especially on NodeJS where some ES6+ features are supported, but others require something like Babel to compile your code.

But generally with the direction ES is going, writing page objects as classes is much nicer than using functions for everything as in ES5.

Whilst there’s nothing I have found that is better (or even as good) in JavaScript/Mocha/WebDriverJS than Ruby/RSpec/Watir-WebDriver, I still think it’s a better long term decision for to use the JavaScript NodeJS stack for our e2e tests.

Watir-WebDriver with GhostDriver on OSX: headless browser testing

GhostDriver has been released which means it is now easy to run reliable headless WebDriver tests on Mac OSX.

Steps to get working on OSX

  1. First make sure you have homebrew installed
  2. Run
    brew update


    brew install phantomjs

    which should install PhantomJS 1.8.1 or newer

  3. Run irb and start using GhostDriver!
    require 'watir-webdriver'
    b = :phantomjs
    b.goto ""
    b.url #""
    b.title #"Google"

I’ve tested it on a large test suite (123 scenarios) and it behaves the same as other browsers with full JavaScript support. It took 8m13s in total: surprisingly it is slightly slower than ChromeDriver (7m30s) in my testing, but a little faster than the Firefox Driver (9m33s).

Well done to all involved in this project. It’s great to see a reliable, realistic headless browser with full JavaScript support for WebDriver finally released.

And yes, in case you’re wondering, it does screenshots!

Getting the WebDriver driver from a WebDriver element

In watir-webdriver it is really easy to access both the webdriver driver and the watir browser objects from an element:

require 'watir-webdriver'
b =
b.goto ''
button = b.button(name: 'btnK')
button.driver #webdriver
button.browser #watir browser

I never knew how to do this in C# until someone named Robert left a comment yesterday on an old blog post with instructions on how to do so.

You can get the webdriver by using.
var driver = ((IWrapsDriver)webElement).WrappedDriver;

Neat. This means the C# check image present extension method I wrote about previously can be implemented directly from the element itself:

public static bool IsImageVisible(this IWebElement image)
  var driver = ((IWrapsDriver)image).WrappedDriver;
  var script = TestConfig.DriverType == "ie"
             ? "return arguments[0].complete"
             : "return (typeof arguments[0].naturalWidth!=\"undefined\"" +
             " && arguments[0].naturalWidth>0)";
  return (bool)((IJavaScriptExecutor)driver).ExecuteScript(script, image);

This means it can be simply called like:


instead of the cumbersome:


Thanks Robert!

Checking an image is actually visible using WebDriver

I didn’t realize it’s actually a little tricky to check that an image is loaded when using WebDriver. WebDriver will only complain if the image tag you’re looking for isn’t in the DOM, not if the image link is broken and not actually visible.

For example, in watir-webdriver (ruby), this doesn’t really work as I would expect as the image isn’t actually visible on the ‘brokenimage’ page.

require 'watir-webdriver'
b = :firefox
b.goto ''
puts b.image(id: 'watermelon').visible? #true but is not visible

The way to check that is is actually visible is to check a JavaScript property ‘naturalWidth’ is greater than 0.

b = :firefox
b.goto ''
puts b.execute_script("return (typeof arguments[0].naturalWidth!=\"undefined\" && arguments[0].naturalWidth>0)", b.image(id: 'watermelon'))

Unfortunately this doesn’t work in IE, so you should use the ‘complete’ JavaScript method in IE (which doesn’t work in other browsers):

b = :firefox
b.goto ''
puts b.execute_script("return arguments[0].complete", b.image(id: 'watermelon'))

In C#, you can wrap this up into a WebDriver extension method so you can this directly from Driver passing in the image element.

public static bool IsImageVisible(this IWebDriver driver, IWebElement image)
    var script = TestConfig.DriverType == "ie"
                ? "return arguments[0].complete"
                : "return (typeof arguments[0].naturalWidth!=\"undefined\"" +
                  " && arguments[0].naturalWidth > 0)";
    return (bool) ((IJavaScriptExecutor) driver).ExecuteScript(script, image);

// Usage

If it’s important that images load correctly in your application, you should probably start putting some of these in your WebDriver page objects. It’s simple to write a verify images method on a page that iterates through each image in the DOM and checks that it’s visible using the techniques above. Have fun.

Update: 30 November
I wrote about a slightly more elegant C# approach to do this directly from the element.

The webdriver-user-agent gem now supports random user agents

My webdriver-user-agent gem now supports random user agents. This idea belonged to Christoph Pilka who released the webdriver-user-agent-randomizer gem and suggested that we merge this feature back into the orginal gem.

Well, I have done it and now you can access this functionality like so:

require 'selenium-webdriver'
require 'webdriver-user-agent'
driver = UserAgent.driver(:agent => :random)
driver.execute_script('return navigator.userAgent')
# random agent like "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1"

See README for full details.

Roll your own page objects

There seems to be a lot of focus being put into page object ruby gems at the moment. Cheezy has done a fantastic job of the aptly named page-object that supports Watir-Webdriver and Selenium-Webdriver, and then there’s the more recent site_prism (also fantastic) by Nat Ritmeyer that works with Capybara. Before these two came along, I even wrote my own; the now retired watir-page-helper gem.

The premise of these gems is they make it super easy to create page objects for your ruby automated testing projects. But today I want to discuss another crazy idea with you: do you even need a gem at all to do page objects?


I recently refactored some automated tests that Chris McMahon wrote as a potential framework for Wikimedia Foundation (creators of Wikipedia). Chris’s code used Cheezy’s excellent page-object gem so I happily went about my refactoring his code using that gem. Suddenly… I found instead of helping me it started to hinder me. I kept having to refer to page-object user guide I got from Cheezy in Austin to work out how things work. Namely:

  1. How to define elements: as they are different from watir-webdriver (eg. list_item vs li, cell vs td etc.)
  2. How to identify elements: as they are limited to certain supported attributes by element type, unlike watir-webdriver which supports every attribute for all elements
  3. What methods each element provides and what each does: as different elements create different methods with different behaviours, so calling a link element clicks it, whilst a span element returns its text.

The main problem I personally found was that Page-object has essentially created its own DSL to describe elements in page objects, and this DSL is subtly and not so subtly different from the Watir-Webdriver API, so the API I know and love doesn’t work in a lot of places.

An example. There’s a common menu bar on all the Wikimedia sites that displays the logged in user as a link (to the user’s page).

The link is only recognizable by its title attribute, and whilst this is supported by watir-webdriver (it supports any attribute), it is not supported by page-object. The source html looks like:

<a class="new" accesskey="." title="Your user page [ctrl-.]" href="/wiki/User:Alister.scott">Alister.scott</a>

What I would have liked to do was:

  link :logged_in_as, :title => 'Your user page [ctrl-.]'

But, instead, I had to do this (which isn’t very DRY):

  def logged_in_as => 'Your user page [ctrl-.]').text

I believe essentially what has happened is the page-object, in its neutrality between selenium-webdriver and watir-webdriver, has created its own DSL that is somewhat of a halfway point between the two. This is probably fine for most people starting out, it’s just an API to learn, but for someone like me who has extensive experience with the watir-webdriver API (and loves the power of it), I find it limiting. This is particularly evident when I write a majority of my code using the watir-webdriver API under IRB.

So, I had to take a re-think. Why not roll my own page objects for Wikimedia Foundation?

Roll your own page objects

I recently had a discussion with a colleague/good friend about page objects which went along the lines of “I don’t understand those page object gems because you end up writing a custom page object pattern for each project anyway, as every project/application you work on is different in its own way”. It was one of those aha moments for me.

What I needed was to roll my own Wikimedia page objects.

Taking it back to basics, essentially there are three functions I see a page object pattern provides:

  1. Ability to easily instantiate pages in a consistent manner;
  2. Ability to concisely describe elements on a page, keep it DRY by avoiding repetition of element identifiers (using the underlying driver’s API); and
  3. Ability to provide higher level methods that use the elements to perform user oriented functions.

You can probably notice the helper methods – the magic – that gems like page-object and site_prism provide are missing from my list. This is on purpose, and is because, after lots of thought, I actually don’t find these useful, as they encourage specifications/steps to be too lower level. I would rather a high level method on the page (eg. login) than exposing my username and password fields and a login button.

A Wikimedia Page Model

Taking those things into consideration: this is the page model I came up with for Wikimedia.

Generic Base Page Class

The generic base page class is what everything else extends. It contains the instantiation code common to all pages, and the class methods needed to define elements and methods (more on these later).

Wikimedia Base Page Class

This page class contains elements and methods that are common to all Wikimedia pages. The ‘logged in user’ example above is a good example of something that is the same on every Wikimedia page, whether you’re on Wikipedia, or Wikimedia Commons etc.

Commons & Wikipedia Base Page Classes

These two classes are placeholders for elements and methods are common to a particular site. At the moment with my limited examples, these don’t contain content.

Commons & Wikipedia Page Classes

These are the actual pages that are representations of pages in Wikimedia. These are in separate modules so they are in different namespaces (you can have a Wikipedia::LogonPage and a Commons::LogonPage).

Some example pages:

class Wikipedia::LogoutPage < Wikipedia::BasePage
  page_url "#{Wikipedia::BASE_URL}/w/index.php?title=Special:UserLogout"
  expected_title "Log out - #{Wikipedia::TITLE}"

  element(:main_content_div) { |b| b.div(id: 'mw-content-text' ) }
  value(:main_content) { |p| p.main_content_div.text }

Here we can see we define a page_url and expected_title, which are used to instantiate the page.

Next we define an element passing in a block of watir-webdriver code for it, and a value by referencing the element we defined before it. Since these element and value methods execute blocks against self, and the class delegates missing methods to our browser, we can refer to either the browser (shown as b) or the page class (shown as p) in our blocks.

class Commons::LoginPage < Commons::BasePage
  page_url "#{Commons::BASE_URL}/w/index.php?title=Special:UserLogin"
  expected_title "Log in / create account - #{Commons::TITLE}"

  value(:logged_in?) { |p| p.logged_in.exists? }

  def login_with username, password
    username_field.set username
    password_field.set password

In this example, we again define the page_url and expected_title, but we have stored the login_elements with the WikimediaBasePage (as they are the same across all the sites) so we include them by specifying login_elements. We have also defined a login_with method that performs actions on our elements.

There are three available methods to define page elements, values and actions, and these all follow the same format of specifying the method name, and passing in a block of watir-webdriver code.

Calling the page objects from Cucumber step definitions

I chose to use Cucumber for the Wikimedia Foundation framework over Chris’s choice of RSpec as I find it easier to specify end-to-end tests in this way. I find the Cucumber step definitions encourage reuse of steps typically used to set up a test (that are often duplicated in RSpec).

I try to stick to calling the exposed methods, values or actions instead of the elements themselves from my Cucumber steps to ensure I am writing them at a high level. An example step using the page above looks like:

Given /^I am logged into Commons$/ do
  visit Commons::LoginPage do |page|
    page.login_with Commons::USERNAME, Commons::PASSWORD
    page.should be_logged_in

The visit and on methods are defined in a Pages module that is mixed into the Cucumber World so these available on all step definitions. As named, the visit method instantiates and visits the page, whereas the on just instantiates it.

module Pages
  def visit page_class, &block
    on page_class, true, &block

  def on page_class, visit=false, &block
    page = @browser, visit page if block


That’s all there really is the rolling your own page objects. I found this excercise useful as it gives me maximum flexibility and allows me to clearly define pages how I want to define them. I appreciate all the great work that Cheezy and Nat have done on their page object gems, if anything these contain great inspirations on how to roll your own custom page objects most suited to your environment and applications.

You can check out my full code here on Github.