By John Sonmez, March 6, 2011

Getting Up to BAT: Picking a Browser Automation Tool

Now that you have an “Automation Lead” for your BAT (Black-box Automated Testing), it’s time to make a very important decision.

It is time to pick the browser automation tool you are going to use for driving your automation efforts.

Before we can build an automation framework, we need a basic set of tools to build it on.  We want our automation framework to be tailored to our application under test specifically.  Our automation framework will act like a Domain Specific Language (DSL) for testing our application, but we need to build that automation framework on top of something.

What is a browser automation tool?

It is essentially what we will use to drive the web browser.


We need to pick a tool or framework that will easily allow us to interact with the web browser so we don’t have to write that code ourselves.

We will need to be able to click buttons, enter text, select drop downs, and check values on web pages.

Some of the things we will want to consider when choosing a browser automation tool include:

  • Language we can use with the driver
  • Browsers it supports
  • How we interact with the browser (do we use XPath, jQuery, or some other mechanism?)
  • Whether we can access and manipulate everything we need in the browser
  • Speed of execution
  • Ability to execute in parallel
  • Support and future development

Let’s take a quick look at each of these considerations.

What language can we use?

This is an important consideration because you will want to be able to utilize your other programming resources to help with building an automation framework and to eventually create automation tests.

If you pick a browser driver that supports a language your regular programmers don’t like, or don’t know, you may not end up with their support, and you are definitely going to want their support.

What browsers does it support?

This is a tricky topic to discuss.  It might seem that a browser automation tool would need to support all of the browsers your application supports, but that is not really true.

You really have to think about what your goal is with your BATs.  I could certainly devote a whole blog post to this topic, but I will try to summarize my main point here.

BATs should be designed to test functionality of the application, not to test compatibility.  (At least most of your BATs should be, although you might have some that are specifically for compatibility.)  With that goal in mind, a browser driver supporting each and every browser your application supports is not really necessary.

You do need to consider that the browser automation tool should support at least one of your supported browsers, preferably the main one.

How do we interact with the browser?

Different browser drivers use different APIs to communicate with the web browser.

Some tools have APIs that are very low level and allow you to very easily manipulate objects in the web browser, while others are at a much higher level and rely more on your understanding of the browser’s DOM.

For example, one API might expose elements on a web page by their actual element name and let you interact with them directly.
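The two styles might look something like the sketch below.  Every name in it (FakeBrowser, Button, button(), click()) is made up for illustration and is not taken from any real browser automation tool; the fake browser only records clicks so that the snippet can run on its own.

```python
# Hypothetical sketch: FakeBrowser, Button, and the method names below are
# invented for illustration and do not belong to any real automation tool.

class FakeBrowser:
    """Stands in for a real browser; it only records what was clicked."""

    def __init__(self, elements):
        # Maps element ID -> tag name, e.g. {"submit": "button"}.
        self.elements = elements
        self.clicked = []

    # Style 1: the API knows about element types such as buttons.
    def button(self, element_id):
        if self.elements.get(element_id) != "button":
            raise LookupError(f"no button with id {element_id!r}")
        return Button(self, element_id)

    # Style 2: the API is generic and sends a click to whatever has that ID.
    def click(self, element_id):
        if element_id not in self.elements:
            raise LookupError(f"no element with id {element_id!r}")
        self.clicked.append(element_id)


class Button:
    """A typed wrapper around one button on the page."""

    def __init__(self, browser, element_id):
        self.browser = browser
        self.element_id = element_id

    def click(self):
        self.browser.click(self.element_id)


browser = FakeBrowser({"submit": "button", "home": "a"})

# First example: ask for a button by its ID, then click the button object.
browser.button("submit").click()

# Second example: send a "click" to the element with the given ID.
browser.click("home")
```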



In the first example, the API is aware of element types like buttons.  In the second example, the API is more generic and will send a “click” to an element with the ID value you specify.

You’ll have to decide how important the ease of use is to your project at this level.  There are some tradeoffs to consider here.

The lower level the API is, the easier it will be to use, but the more dependent you will be on that API and the more language specific that API will be.

The higher level the API is, the harder it will be to use, but it will free you from depending so much on the API, because you will be interacting more directly with the browser’s DOM.

I prefer lower level APIs because I find that it is much easier to write the framework code on top of these.

You will also want to consider the skill levels of the programmers who will be creating the framework.  If your API is lower level, native language skills are more important.  If it is higher level, DOM and HTML, and perhaps JavaScript skills, are more important.

Can we access what we need?

If your application makes use of JavaScript pop-up dialogs and your browser automation tool doesn’t give you a way to interact with them, you don’t want to find this out when you have already invested significantly in your automation framework.

Different browser automation tools support different browser features; not all of them will allow you to interact with every single browser feature that exists.  As new features are introduced, you could also get stuck with a tool that is no longer being maintained and doesn’t add support for the new features.

You will want to consider things like how much your application uses AJAX or jQuery.  If your application relies heavily on AJAX calls, you will want to select a browser automation tool that makes it easy to interact with them.

Speed of execution

Not all browser automation tools are the same in execution speed either.

This may not be important to you, depending on how you address the issue of concurrency, but you should at least consider that if you have a large number of automated tests, a small difference in execution speed can result in a large difference in total time to run the tests.
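To put rough numbers on this, here is a small back-of-the-envelope calculation.  The figures (2,000 tests, 12 seconds versus 9 seconds per test) are assumptions chosen only to illustrate the point, not measurements of any tool.

```python
def total_minutes(test_count, seconds_per_test):
    """Total minutes for a suite that runs one test at a time."""
    return test_count * seconds_per_test / 60


# Assumed figures, for illustration only.
tests = 2000
slow_tool = total_minutes(tests, 12)  # 12 seconds per test
fast_tool = total_minutes(tests, 9)   # 9 seconds per test

print(f"slower tool: {slow_tool:.0f} minutes")
print(f"faster tool: {fast_tool:.0f} minutes")
print(f"difference:  {slow_tool - fast_tool:.0f} minutes")
```

With these assumed numbers, a difference of 3 seconds per test adds up to 100 minutes for every full run.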

The faster the automated tests can be run, the more valuable they are.  I won’t get into the details here, but there are many reasons why this is true.

The speed at which you can run your tests will also be affected if your automation tool requires you to put pauses into the tests to wait for the browser, instead of responding to events that occur in the browser.
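The difference between a fixed pause and waiting on the browser can be sketched as follows.  This sketch uses simple polling, which is only an approximation of responding to real browser events, and the names wait_until and page_is_ready are hypothetical helpers, not part of any particular tool.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll condition() until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Stand-in for a page that finishes loading after a short delay.
page_ready_at = time.monotonic() + 0.3


def page_is_ready():
    return time.monotonic() >= page_ready_at


# A fixed pause always costs the full 2 seconds, even if the page is ready sooner:
# time.sleep(2)

# A polling wait returns as soon as the condition holds, and fails after 2 seconds.
started = time.monotonic()
wait_until(page_is_ready, timeout=2.0, interval=0.05)
elapsed = time.monotonic() - started
```

Across hundreds of tests, the time saved by not sleeping longer than necessary adds up quickly.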

Ability to execute in parallel

This consideration will depend greatly on the number of tests you have and the speed at which they execute.

At some point (sooner than you think), you will likely find that you can no longer run all of your automated tests in 24 hours.  Perhaps even before this point, you will want to consider running your BATs in parallel to reduce the total time to execute the tests.

Some automation tools have built-in support for this, and others will require you to build your own way to do it.  It is good to at least have an idea of how you are going to achieve parallel execution when considering what browser automation tool to use.
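If you end up building this yourself, one possible starting point is a simple worker pool.  The sketch below uses Python’s standard concurrent.futures module; the three test functions are hypothetical placeholders that sleep instead of driving a browser, and a real setup would also need each test to use its own browser instance.

```python
from concurrent.futures import ThreadPoolExecutor
import time


# Hypothetical tests: each one would normally drive its own browser instance.
def test_login():
    time.sleep(0.2)  # stands in for time spent driving the browser
    return "passed"


def test_search():
    time.sleep(0.2)
    return "passed"


def test_checkout():
    time.sleep(0.2)
    return "passed"


def run_in_parallel(tests, workers):
    """Run each test function on a worker thread and collect results by name."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {test.__name__: pool.submit(test) for test in tests}
        # result() re-raises any exception the test raised on its worker thread.
        return {name: future.result() for name, future in futures.items()}


results = run_in_parallel([test_login, test_search, test_checkout], workers=3)
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

With three workers, the three placeholder tests overlap instead of running back to back, so the whole batch should take roughly as long as one test.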

Support and future development

We are in a period of rapid browser development.  Many things in the web browser world are changing much more rapidly than ever before.  Tools to automate the browser must also change or you could end up in a very bad place.

If your application is going to take advantage of the latest browsers and browser features, you should make sure the automation tool you choose is in active development.

There is nothing worse than investing in an open-source project and then having that project die, resulting in you having to rip it out and replace it with another library.

You can always design your automation framework to be abstracted away as much as possible from the underlying browser driver, but that will be extra work, so consider this point carefully.
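As a rough idea of what that abstraction could look like, the sketch below has the framework code depend on a small interface rather than on a specific tool.  All of the names (BrowserDriver, InMemoryDriver, LoginPage, and the element IDs) are hypothetical, and the in-memory driver is a fake that exists only so the example runs without a browser.

```python
from typing import Protocol


class BrowserDriver(Protocol):
    """The small set of operations the framework needs from any tool."""

    def click(self, element_id: str) -> None: ...
    def enter_text(self, element_id: str, text: str) -> None: ...
    def read_text(self, element_id: str) -> str: ...


class InMemoryDriver:
    """Fake driver for illustration; a real adapter would wrap an actual tool."""

    def __init__(self):
        self.fields = {}
        self.page_text = {"status": "Signed out"}

    def click(self, element_id):
        # Pretend that clicking "login" signs in whoever was typed in.
        if element_id == "login":
            user = self.fields.get("username", "")
            self.page_text["status"] = f"Welcome, {user}"

    def enter_text(self, element_id, text):
        self.fields[element_id] = text

    def read_text(self, element_id):
        return self.page_text[element_id]


class LoginPage:
    """Framework code: depends only on the BrowserDriver interface."""

    def __init__(self, driver: BrowserDriver):
        self.driver = driver

    def log_in_as(self, username: str) -> str:
        self.driver.enter_text("username", username)
        self.driver.click("login")
        return self.driver.read_text("status")


page = LoginPage(InMemoryDriver())
status = page.log_in_as("alice")
print(status)
```

Swapping tools would then mean writing one new adapter class, but every browser feature the framework uses has to be added to the interface first, which is where the extra work comes from.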

Name names, sir!

Nope, I’m not going to do it.

I’m not going to tell you to use WatiN or Watij or Selenium or WebAii or any other browser driver.  I don’t want to put the focus on the tool, since the real focus will be building the automation framework on top of the tool you use to drive the browser.

I would suggest that you try writing a few simple tests in each of the major browser drivers so that you can get a good feel of what the API is like and how it will work with your application.

I will say that I have used most of the major choices out there and there really is no clear winner in my mind.  It really is going to depend on what your environment is like and what kind of application you are testing.

I would also recommend picking a browser driver and sticking to it.  I’ve tried in the past to abstract away the browser driver from the automation framework and while it is possible, it can become quite messy and add quite a bit of overhead to your project.

My only other hints would be to not put too much emphasis on supporting multiple browsers or on using recording tools.  Neither of these things will benefit you much in the long run, because you will find that you will not want to try to run all of your BATs on each browser.  Recording tools will not be nearly as effective as writing your own custom framework (which I will talk about in my next post).

About the author

John Sonmez

John Sonmez is the founder of Simple Programmer and a life coach for software developers. He is the best selling author of the book "Soft Skills: The Software Developer's Life Manual."