Using Selenium::Remote::Driver

Ricky Morse, MGH Biostatistics Center

The Statement

My problem

StudyTRAX is a web-based electronic data capture system. Unfortunately, it has some limitations. For instance, there is no way to download all of the data for a study with one click. Also, patient completion status is calculated on the client, so if you make changes that affect the completion status, you have to visit every patient and every encounter to get these set properly. This is annoying.

My solution

“Aha!” I exclaimed. “I’ll use WWW::Mechanize (or something similar) to automate these tasks.”

Of course, it wasn’t quite that simple…

The Options

LWP and manual management

I could use the LWP family of modules to create my own user agent, download pages, use some of the HTML parsing modules to ensure that I download and follow all links, use some kind of JavaScript engine to run all the dynamic parts…

(At this point, I’m writing a web browser; I have other things I need to do.)


WWW::Mechanize

WWW::Mechanize is one of the oldest Perl modules for automating the web. I had hoped it would do everything I needed, but it didn’t.

In particular, it doesn’t handle JavaScript.

I have this web page that has JavaScript on it, and my Mech program doesn’t work.

That’s because WWW::Mechanize doesn’t operate on the JavaScript. It only understands the HTML parts of the page.



WWW::Scripter

WWW::Scripter is the WWW::Mechanize replacement that handles JavaScript. It uses JE, a Perl-based JavaScript engine, to run the scripts.

It is also slow, memory intensive, and doesn’t handle every case; in particular, mine. (Something about .NET doesn’t work right.)


Since I couldn’t get the Perl-based web-browser engines to work, I looked for embedded browser engines. This was the only one I could find; I couldn’t get it to install on a Mac, let alone Windows.


WWW::Selenium

WWW::Selenium is probably what I should have used. However, it isn’t a separate distribution; it’s part of the Test::WWW::Selenium distribution. Also, I didn’t find it until it was too late. And I was in the middle of a philosophical argument with the world, because all of the online help for Selenium assumes that you are testing a web application, and I’m trying to automate one.


Which brings us to Selenium::Remote::Driver, which is what this talk is about.


(a digression)

What is Selenium?

Selenium automates browsers. That’s it! What you do with that power is entirely up to you. Primarily, it is for automating web applications for testing purposes, but is certainly not limited to just that. Boring web-based administration tasks can (and should!) also be automated as well.


Getting Selenium

Selenium is a Java-based "server". Go to the website and download the "Selenium Server" .jar file.

Or, on a Mac using homebrew, install the selenium-server-standalone formula.

Supported browsers

Selenium supports, out of the box, the HtmlUnit browser (a Java headless browser) and Firefox.

You can download additional software to drive Chrome and IE.


PhantomJS supports a "Selenium" mode, so you could install it instead of Selenium.

Running Selenium

Before we can start using Selenium, the server needs to be running.

If you downloaded the .jar file, just run it using java -jar ….

If you installed using homebrew, run the selenium-server command.


There is a module on CPAN, Selenium::Server, which purports to download and start Selenium running. This may be easier than dealing with downloading and running Selenium yourself.


Basic usage

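#!/usr/bin/env perl
use Modern::Perl '2012';
use Selenium::Remote::Driver;

# Connects to a Selenium server on localhost:4444 by default.
my $d = Selenium::Remote::Driver->new();
$d->get('http://www.example.com/');  # placeholder URL; load a page first
say $d->get_title();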

(future slides will assume the same starting code)
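
With no arguments, the constructor talks to a Selenium server on localhost port 4444 and asks for Firefox. A minimal sketch that spells those defaults out so they are easy to change, assuming the server from the previous slides is already running; the URL is a placeholder, not the site from the talk:

```perl
#!/usr/bin/env perl
use Modern::Perl '2012';
use Selenium::Remote::Driver;

# These three values are the module's documented defaults, written out
# explicitly; change them to point at another server or browser.
my $d = Selenium::Remote::Driver->new(
    remote_server_addr => 'localhost',
    port               => 4444,
    browser_name       => 'firefox',
);

$d->get('http://www.example.com/');   # placeholder URL
say $d->get_title();
say length $d->get_page_source();     # size of the page source, in characters

$d->quit();                           # close the browser and end the session
```

Calling quit() at the end keeps stray browser windows from piling up between runs.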

Now that we have a page loaded, what next?

We can get the title and source code easily ($d->get_title() and $d->get_page_source()).

We can move the mouse around, and click things ($d->mouse_move_to_location() and $d->click()).

But it’s more useful to work with DOM elements.

Finding things

In order to find elements of the DOM using Selenium, you call the find_element() or find_elements() methods of the driver object. These take a locator, and optionally a locator scheme, which specify which element(s) to find.

There are a variety of locator schemes: class, class_name, css, id, link, link_text, name, partial_link_text, tag_name, and xpath.

xpath is the default locator scheme.


# find_elements() (plural) returns every match; find_element() returns only one.
my @links_xpath = $d->find_elements('//a', 'xpath');
my @links_css = $d->find_elements('a', 'css');
my @links_tagname = $d->find_elements('a', 'tag_name');
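
One difference between the two finders is worth a sketch: in the versions of the module I am aware of, find_element() dies when nothing matches, while find_elements() just returns an empty list. The example below guards a single lookup with eval. The URL and the id 'no-such-id' are made up for illustration.

```perl
#!/usr/bin/env perl
use Modern::Perl '2012';
use Selenium::Remote::Driver;

my $d = Selenium::Remote::Driver->new();
$d->get('http://www.example.com/');          # placeholder URL

# find_elements() returns a (possibly empty) list, so counting is safe.
my @links = $d->find_elements('//a');        # xpath is the default scheme
say 'Found ', scalar(@links), ' links';

# find_element() throws if nothing matches, so wrap it in eval.
# 'no-such-id' is a hypothetical id used only to trigger the failure.
my $elt = eval { $d->find_element('no-such-id', 'id') };
if ($elt) {
    say 'Found it: ', $elt->get_tag_name();
}
else {
    say 'Not found';
}

$d->quit();
```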

Using found things

Once you’ve found an element, you can do things with it.

$d->find_element('Perl', 'link_text')->click();

my $search_elt = $d->find_element('div.WikiCustomNav input.WikiSearchInput', 'css');

Sadly, the WebElement methods don’t chain.
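
Because the methods don’t chain, each action on an element is its own statement. A small sketch of typing into a field and submitting it; the URL and the input[name="q"] selector are placeholders rather than the page used in the talk:

```perl
#!/usr/bin/env perl
use Modern::Perl '2012';
use Selenium::Remote::Driver;

my $d = Selenium::Remote::Driver->new();
$d->get('http://www.example.com/search');    # placeholder URL

# Hypothetical search box; adjust the selector to the page being driven.
my $box = $d->find_element('input[name="q"]', 'css');

# One statement per action, since the WebElement methods don't chain.
$box->clear();
$box->send_keys('selenium');
say $box->get_attribute('name');             # read an attribute back
$box->submit();                              # submit the enclosing form

say $d->get_title();                         # title of the page after submitting
$d->quit();
```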

Other notes

XPath has been, for me, the most useful locator scheme. Both Chrome and Safari support the $x() function in their web inspector for testing XPath. There are also Firefox and Chrome extensions for getting XPaths.

To work with <select>s, you send a click() to the option you wish to select. (See later example.)
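
A sketch of that approach, assuming a page with a select named "status" that has an option labelled "Complete"; the URL and both names are invented for illustration:

```perl
#!/usr/bin/env perl
use Modern::Perl '2012';
use Selenium::Remote::Driver;

my $d = Selenium::Remote::Driver->new();
$d->get('http://www.example.com/form');      # placeholder URL

# Locate the <option> itself, not the <select>, then click it.
# The select name "status" and the option text "Complete" are hypothetical.
my $option = $d->find_element(
    '//select[@name="status"]/option[normalize-space(.)="Complete"]',
    'xpath',
);
$option->click();
say 'Option is now selected' if $option->is_selected();

$d->quit();
```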

Other notes

Downloading files is a big annoyance. (See later example.)

Browsers can take time to settle down; the pause() method waits for a specified number of milliseconds (one second if you pass nothing).
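
A fixed pause is either too long or too short, so one alternative is to poll until some condition holds. The wait_until helper below is hypothetical, not part of Selenium::Remote::Driver. It accepts any code reference, so it is shown with a stand-in condition instead of a live browser, and it uses only core modules.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Time::HiRes qw(time sleep);

# Hypothetical helper: call $check repeatedly until it returns a true value
# or $timeout seconds have passed. Returns that value, or undef on timeout.
sub wait_until {
    my ($check, $timeout, $interval) = @_;
    $timeout  //= 10;      # seconds
    $interval //= 0.25;    # seconds between attempts
    my $deadline = time() + $timeout;
    while (1) {
        # eval so that a check that dies (for example find_element on a
        # missing element) counts as "not yet" instead of ending the script.
        my $result = eval { $check->() };
        return $result if $result;
        return if time() >= $deadline;
        sleep($interval);
    }
}

# Stand-in condition: becomes true on the third call.
my $calls = 0;
my $got = wait_until(
    sub { ++$calls >= 3 ? "ready after $calls calls" : 0 },
    5, 0.01,
);
say $got // 'timed out';
```

With a driver, the condition might be sub { $d->find_element('results', 'id') }, where 'results' is a made-up id.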


More capabilities