AMA: R.Y.O. Page Objects 2.0

Michael Karlovich asks…

Do you have any updated thoughts on rolling your own page objects with Watir? The original post is almost 4 years old but is still the basis (loosely) of every page object framework I’ve built since then.

My response…

Wow, I can’t believe that post is almost four years old. I have also used it as the basis of every page object framework I have built since then.

I recently had a look at our JavaScript (ES2015) page object code, and despite ES2015 not having meta-programming support like Ruby, our classes are remarkably similar to what I was proposing ~4 years ago.

I believe this is because some patterns are classic and therefore almost timeless: they can be applied over and over again in different contexts. There’s a huge amount of negativity towards best practices of late, but I can seriously say that page objects are a best practice for test automation of UI systems. That isn’t saying they will be exactly the same in every context, but there’s a common best-practice pattern there which you most likely should be using.

Page objects, as a pattern, typically:

  • Inherit from a base page object/container which stores common behaviour like:
    • instantiating the object by looking for a known element that defines that page’s existence;
    • optionally allowing a ‘visit’ to the page during instantiation using some defined URL/path; and
    • providing actions and properties common to all pages, for example: waiting for the page, checking the page is displayed, getting the title/URL of the page, and checking cookies and local storage items for that page;
  • Define actions as methods which are ways of interacting with that page (such as logging in);
  • Do not expose internals about the page outside the page – for example they typically don’t expose elements or element selectors which should only be used within actions/methods for that page which are exposed; and
  • Can also be modelled as components for user interfaces that are built using components to give greater reusability of the same components across different pages.
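
To make the list above concrete, here is a minimal Ruby sketch of the pattern. It is an illustration under stated assumptions rather than the code from the original post: the class names, the `url` and `expected` class-level declarations, the example URL and the element ids are all hypothetical, and the `browser` argument is assumed to behave like a Watir browser (`goto`, `element`, `text_field`, `button`, `title`).

```ruby
# Hypothetical base page: shared behaviour for every page object.
class BasePage
  class << self
    attr_reader :page_url, :expected_element

    # Class-level declarations used by subclasses (illustrative names).
    def url(value)
      @page_url = value
    end

    def expected(selector)
      @expected_element = selector
    end
  end

  # Optionally visits the page, then fails fast if the known element is missing.
  def initialize(browser, visit: false)
    @browser = browser
    @browser.goto(self.class.page_url) if visit
    raise "#{self.class.name} is not displayed" unless displayed?
  end

  def displayed?
    @browser.element(self.class.expected_element).present?
  end

  def title
    @browser.title
  end
end

# Hypothetical login page: exposes an action, keeps selectors to itself.
class LoginPage < BasePage
  url 'https://example.com/login'
  expected id: 'login-form'

  def log_in(username, password)
    @browser.text_field(id: 'username').set(username)
    @browser.text_field(id: 'password').set(password)
    @browser.button(id: 'submit').click
  end
end
```

A test would then call something like `LoginPage.new(browser, visit: true).log_in('user', 'password')` and never touch a selector directly.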

The biggest benefit I have found from using page objects as a pattern is more deterministic end-to-end tests: because instantiating a page object checks for a known element, I know I am on that page, so my tests fail more reliably and with a better understanding of what went wrong.

Are there any other pattern attributes you would consider vital for page objects?

AMA: JS vs Ruby

Butch Mayhew asks…

I have noticed you blogging more about JS frameworks. How do these compare to Watir/Ruby? Would you recommend one over the other?

My response…

I had a discussion recently with Chuck van der Linden about this same topic as he has a lot of experience with Watir and is now looking at JavaScript testing frameworks like I have done.

Some Background

WordPress.com built an entirely new UI for managing sites using 100% JavaScript with React for the main UI components. I am responsible for e2e automated tests across this UI, and whilst I originally contemplated, and trialled even, using Ruby, this didn’t make long term sense for WordPress.com where the original WordPress developers are mostly PHP and the newer UI developers are all JavaScript.

Whilst I see merit in both views, I still think having your automated acceptance tests in the same language as your application leads to better maintainability and adoptability.

I still think writing automated acceptance tests in Ruby is much cleaner and nicer than JavaScript Node tests, particularly as Ruby allows meta-programming which means page objects can be implemented really neatly.

The JavaScript/NodeJS landscape is still very immature where people are using various tools/frameworks/compilers and certain patterns or de facto standards haven’t really emerged yet. The whole ES6/ES2015/ES2016 thing is very confusing to newcomers like me, especially on NodeJS where some ES6+ features are supported, but others require something like Babel to compile your code.

But generally with the direction ES is going, writing page objects as classes is much nicer than using functions for everything as in ES5.

Whilst there’s nothing I have found that is better (or even as good) in JavaScript/Mocha/WebDriverJS than Ruby/RSpec/Watir-WebDriver, I still think it’s a better long term decision for WordPress.com to use the JavaScript NodeJS stack for our e2e tests.

Using appium in Ruby for iOS automated functional testing

As previously explained, I recently started on an iOS project and have spent a bit of time comparing iOS automation tools and chose Appium as the superior tool.

The things I really like about Appium are that it is language/framework agnostic as it uses the standard WebDriver JSON Wire Protocol, it doesn’t require any modifications to your app, it supports testing web views (also known as hybrid apps), and it supports Android, which matters since we are concurrently developing an Android application (it also supports OS X and Firefox OS, but we aren’t developing for those yet). There isn’t another iOS automated testing tool, that I know of, that ticks that many boxes for me.

Getting Started

The first thing to do is download the appium.app package from the appium website. I had an issue with the latest version (0.11.2) launching the server which can be resolved by opening the preferences and checking “Override existing sessions”.

You run the server from inside the appium.app which takes your commands and relays them to the iOS simulator. There’s also a very neat ‘inspector’ tool which shows you all the information you need to know about your app and how to identify elements.

Note: there’s currently a problem with Xcode 5.0.1 (the latest version as I write) which means Instruments/UIAutomation won’t work at all. You’ll need to downgrade (uninstall/reinstall) to Xcode 5.0 to get appium to work.

Two Ruby Approaches

This confused me a little to start with, but there are actually two vastly different ways to use appium in Ruby.

1) Use the standard selenium-webdriver gem

If you’re used to using WebDriver, like me, this will be the most straightforward approach (this is the approach I have taken). Appium extends the API to add different gestures by calling execute_script from the driver, so all other commands stay the same (for example, find_element).

2) Use the appium_lib library

There is a Ruby gem appium_lib that has a different API to the selenium-webdriver gem to control appium. I don’t see any massive benefits to this approach besides having an API that is more specific to app testing.

Using Selenium-WebDriver to start appium in ruby

Launching an appium app is as simple as defining some capabilities with a path to the .app file you have generated using Xcode (this gets put into a deep folder, so you can write the location to a file and read it from that file).

[sourcecode language=”ruby”]
require 'selenium-webdriver'

# app_path is the location of the .app bundle generated by Xcode
capabilities = {
  'browserName' => 'iOS',
  'platform' => 'Mac',
  'version' => '6.1',
  'app' => app_path
}
driver = Selenium::WebDriver.for :remote,
  desired_capabilities: capabilities,
  url: "http://127.0.0.1:4723/wd/hub"
[/sourcecode]
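
Reading the app location back from a file could look something like the sketch below. The helper name `read_app_path` and the idea of a plain text file containing only the path are my assumptions for illustration; use whatever your build step actually writes.

```ruby
# Reads the location of the built .app bundle from a text file.
# `path_file` is a hypothetical file that your build step writes the path into.
def read_app_path(path_file)
  app_path = File.read(path_file).strip
  # A .app bundle is a directory, so check it exists before starting a session.
  raise ArgumentError, "No app found at #{app_path}" unless File.directory?(app_path)
  app_path
end
```

The returned value can then be used as the ‘app’ capability when starting the driver.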

Locating elements

Once you’ve launched your app, you’ll be able to use the appium inspector to see the element attributes you can use in appium. Name is a common attribute, and if you find that it’s not being shown, you can set the accessibilityIdentifier property in your Objective-C view code, which will flow through to appium. This makes for much more robust tests than relying on labels or XPath expressions.

[sourcecode language=”ruby” light=”true”]
driver.find_element(:name, "ourMap").displayed?
[/sourcecode]

Enabling location services for appium testing

This got me stuck for a while as there’s quite a bit of conflicting information about appium on how to handle the location services dialog. Whilst you should be able to interact with it as a normal dialog in the latest version of appium, I would rather not see it at all, so I wrote a method to copy a plist file with location services enabled in it to the simulator at the beginning of the test run. It’s quite simple (you can manually copy the clients.plist after manually enabling location services):

[sourcecode language=”ruby” light=”true”]
require 'fileutils'

def copy_location_services_authentication_to_sim
  source = "#{File.expand_path(File.dirname(__FILE__))}/clients.plist"
  destination = "#{File.expand_path('~')}/Library/Application Support/iPhone Simulator/7.0/Library/Caches/locationd"
  # Overwrite any clients.plist already in the simulator's locationd cache
  FileUtils.cp_r(source, destination, :remove_destination => true)
end
[/sourcecode]

Waiting during appium tests

This is exactly the same as selenium-webdriver. There’s an implicit wait, or you can explicitly wait like so:

[sourcecode language=”ruby” light=”true”]
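# Implicit wait: applies to every find_element call
driver.manage.timeouts.implicit_wait = 10
# Explicit wait: poll until the element is displayed, or time out after 30 seconds
wait = Selenium::WebDriver::Wait.new :timeout => 30
wait.until { driver.find_element(:name, 'monkeys').displayed? }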
[/sourcecode]

Mobile gestures

The obvious difference between a desktop web browser and a mobile app is gestures. Appium adds gestures to WebDriver using execute_script. I recommend using the percentage method (0.5 etc) instead of pixel method as it is more resilient to UI change.

For example:

[sourcecode language=”ruby” light=”true”]
driver.execute_script 'mobile: tap', :x => 0.5, :y => 0.5
[/sourcecode]

or

[sourcecode language=”ruby” light=”true”]
b = driver.find_element :name, 'Sign In'
driver.execute_script 'mobile: tap', :element => b.ref
[/sourcecode]
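
A swipe can be wrapped up in the same style. The sketch below assumes the ‘mobile: swipe’ command and its startX/startY/endX/endY/duration arguments as I understand them from this generation of appium; treat those names as assumptions and check them against the version you are running. The helper name `swipe_left` is hypothetical.

```ruby
# Swipes from right to left across the middle of the screen.
# Coordinates are percentages of the screen size (0.0 to 1.0), not pixels.
# The 'mobile: swipe' argument names are assumed, not verified against every appium version.
def swipe_left(driver, duration: 0.5)
  driver.execute_script 'mobile: swipe',
    :startX => 0.9, :startY => 0.5,
    :endX => 0.1, :endY => 0.5,
    :duration => duration
end
```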

Testing Embedded Web Views

The native and web views seamlessly combine so you can use the same find_element method to find either. The appium.app inspector displays the appropriate attributes.

Note: I can’t seem to execute a gesture (e.g. swipe) over a Web View. I don’t know whether this is a bug or a limitation of Appium.

Summary

I have found that using the familiar selenium-webdriver gem with appium has been very powerful and efficient. Being able to open an interactive prompt (pry or irb) and explore your app using the selenium-webdriver library and the appium.app inspector is particularly useful as you can script on the fly. Whilst appium still seems relatively immature, it seems a very promising approach to iOS automation.

Now to get watir-webdriver to work with appium.

Packaging a ruby script as a Windows exe using OCRA

I recently wrote a watir-webdriver ruby script that I needed to be able to distribute to others to run on Windows machines that don’t have ruby installed.

I came across the OCRA gem that allows you to easily generate a windows executable from a ruby script. This packages the ruby interpreter and all dependencies into an executable file.

It was quite straightforward to get it working: you simply install the gem on Windows and run the ocra command with the name of your ruby script.

The only (minor) issues I had were:

  • if you wish to access external files from your executable (such as a config file) just add '$:.unshift File.dirname($0)' at the start of your ruby file
  • if you are using ruby logging, then for some reason you can’t call logger.close as it crashes OCRA, but you can just not close the logger, which is fine
  • for some reason on Windows I needed to explicitly require 'securerandom' to use SecureRandom.uuid, whereas it just worked on Mac OS X
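
Put together, the top of a script that follows these three workarounds might look roughly like this. It is only a sketch: the `start_run` helper and the log message are hypothetical, and the workarounds themselves are simply the ones described above.

```ruby
require 'logger'
require 'securerandom' # required explicitly, as it was needed on Windows

# Make files that sit next to the script (such as a config file) loadable.
$:.unshift File.dirname($0)

# Logs a unique id for this run and returns it.
def start_run(logger)
  run_id = SecureRandom.uuid
  logger.info("Run #{run_id} started")
  run_id
end

start_run(Logger.new($stdout))
# logger.close is deliberately never called, as that crashed the packaged executable.
```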

Once I resolved these it quickly generated an executable which was runnable without any version of ruby installed. Neat.

The webdriver-user-agent gem now supports random user agents

My webdriver-user-agent gem now supports random user agents. This idea belonged to Christoph Pilka, who released the webdriver-user-agent-randomizer gem and suggested that we merge this feature back into the original gem.

Well, I have done it and now you can access this functionality like so:

[sourcecode language=”ruby” light=”true”]
require 'selenium-webdriver'
require 'webdriver-user-agent'
driver = UserAgent.driver(:agent => :random)
driver.execute_script('return navigator.userAgent')
# random agent like "Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:0.9.2) Gecko/20010726 Netscape6/6.1"
[/sourcecode]

See README for full details.