StudyTRAX is a web-based electronic data capture system. Unfortunately, it has some limitations. For instance, there is no way to download all of the data for a study with one click. Also, patient completion status is calculated on the client, so if you make changes that affect the completion status, you have to visit every patient and every encounter to get these set properly. This is annoying.
“Aha!” I exclaimed. “I’ll use WWW::Mechanize (or something similar) to automate these tasks.”
Of course, it wasn’t quite that simple…
(At this point, I’m writing a web browser; I have other things I need to do.)
WWW::Mechanize is one of the oldest Perl modules for automating the web. I had hoped it would do everything I needed, but it didn’t.
This is the
It is also slow, memory intensive, and doesn’t handle every case; in particular, mine. (Something about .NET doesn’t work right.)
Since I couldn’t get the Perl-based web-browser engines to work, I looked for embedded browser engines. This was the only one I could find; I couldn’t get it to install on a Mac, let alone Windows.
This is probably what I should use. However, it isn’t a separate module; it’s part of the Test::WWW::Selenium distribution. Also, I didn’t find it until it was too late. And I was in the middle of a philosophical argument with the world, because all of the online help for Selenium assumes that you are testing a web application, and I’m trying to automate one.
Which brings us to Selenium::Remote::Driver, which is what this talk is about.
Selenium is a Java-based "server". Go to the website and download the "Selenium Server" .jar file.
Or, on a Mac using homebrew, install the selenium-server-standalone formula.
Selenium supports, out of the box, the HtmlUnit browser (a Java headless browser) and Firefox.
You can download additional software to drive Chrome and IE.
PhantomJS supports a "Selenium" mode, so you could install it instead of Selenium.
Before we can start using Selenium, the server needs to be running.
If you downloaded the .jar file, just run it using
If you installed using homebrew, run the
There is a module on CPAN, Selenium::Server, which purports to download Selenium and start it running. This may be easier than downloading and running Selenium yourself.
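    #!/usr/bin/env perl
    use Modern::Perl '2012';
    use Selenium::Remote::Driver;

    # Connect to the Selenium server running on this machine
    my $d = Selenium::Remote::Driver->new();

    # Load a page and print its title
    $d->get('http://boston-pm.wikispaces.com');
    say $d->get_title();

    # Close the browser and end the session
    $d->quit();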
(future slides will assume the same starting code)
Now that we have a page loaded, what next?
We can get the title and source code easily (
We can move the mouse around, and click things (
But it’s more useful to work with DOM elements.
To find elements of the DOM using Selenium, you call the find_element() or find_elements() methods of the driver object. These take a locator and, optionally, a locator scheme, which together specify which element(s) to find.
There are a variety of locator schemes — xpath, css, id, etc.
xpath is the default locator scheme.
    # find_elements() returns every match; find_element() returns only one
    my @links_xpath   = $d->find_elements('//a', 'xpath');
    my @links_css     = $d->find_elements('a', 'css');
    my @links_tagname = $d->find_elements('a', 'tag_name');
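One difference between the two methods is worth a sketch. As far as I can tell, find_element() croaks when nothing matches, while find_elements() just returns an empty list. The helper below (find_optional() is my own hypothetical name, not part of the module) uses find_elements() to look for an element that may or may not be on the page. It assumes $d is the driver object from the starting code.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first matching element, or undef if
# nothing matches. Relies on find_elements() returning an empty list
# (rather than dying) when there are no matches.
sub find_optional {
    my ($d, $locator, $scheme) = @_;
    $scheme //= 'xpath';    # same default scheme as the driver

    my @found = $d->find_elements($locator, $scheme);
    return $found[0];       # undef when @found is empty
}
```

With a live driver, this would be used as: my $logout = find_optional($d, 'Logout', 'link_text'); and then tested with defined().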
Once you’ve found an element, you can do things with it.
    # Follow the link whose text is "Perl"
    $d->find_element('Perl', 'link_text')->click();

    # Type a query into the wiki's search box and submit it
    my $search_elt = $d->find_element('div.WikiCustomNav input.WikiSearchInput', 'css');
    $search_elt->clear();
    $search_elt->send_keys('Uri');
    $search_elt->submit();
WebElement methods don’t chain.
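Elements can also be read, not just clicked and typed into. Here is a small sketch that collects the text and target of every link on the current page, using the WebElement methods get_text() and get_attribute(). The helper name link_targets() is mine, and $d is assumed to be the driver from the starting code. Note that each element method is called in its own statement, since they don't chain.

```perl
use strict;
use warnings;

# Hypothetical helper: return a list of { text, href } hashrefs,
# one per <a> element that has an href attribute.
sub link_targets {
    my ($d) = @_;

    my @rows;
    for my $link ($d->find_elements('//a[@href]', 'xpath')) {
        my $text = $link->get_text();              # visible link text
        my $href = $link->get_attribute('href');   # link target
        push @rows, { text => $text, href => $href };
    }
    return @rows;
}
```

With a live driver: say "$_->{text} => $_->{href}" for link_targets($d);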
XPath has been, for me, the most useful locator scheme. Both Chrome and Safari support the $x() function in their web inspector for testing XPath. There are also Firefox and Chrome extensions for getting XPaths.
To work with <select>s, you send a click() to the option you wish to select. (See later example.)
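In the meantime, a minimal sketch of the idea: build an XPath that picks out one <option> inside a named <select>, find it, and click it. The helper name select_option() and the assumption that the <select> has a name attribute are mine; $d is the driver from the starting code.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the option with the given visible label
# in the <select> with the given name attribute. Returns the XPath it
# used, which is handy when debugging. Assumes neither argument
# contains a double quote (no XPath escaping is done here).
sub select_option {
    my ($d, $select_name, $label) = @_;

    my $xpath = sprintf '//select[@name="%s"]/option[normalize-space(.)="%s"]',
        $select_name, $label;

    # find_element() croaks if there is no such option
    $d->find_element($xpath, 'xpath')->click();
    return $xpath;
}
```

With a live driver: select_option($d, 'site', 'Boston'); where 'site' and 'Boston' are made-up values.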
Downloading files is a big annoyance. (See later example.)
Browsers can take time to settle down; the pause() method allows you to wait for a specified period of time.
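A fixed pause is either too long or too short, so one option is to poll: look for the element you are waiting on, and pause only while it isn't there yet. The sketch below does that. The helper name wait_for() and its default values are my own; it assumes pause() takes its interval in milliseconds, and that $d is the driver from the starting code.

```perl
use strict;
use warnings;

# Hypothetical helper: poll for an element, pausing between attempts.
# Returns the first matching element, or undef if it never shows up.
sub wait_for {
    my ($d, $locator, $scheme, $tries, $pause_ms) = @_;
    $tries    //= 10;     # number of attempts before giving up
    $pause_ms //= 500;    # pause between attempts, in milliseconds

    for (1 .. $tries) {
        my @found = $d->find_elements($locator, $scheme);
        return $found[0] if @found;
        $d->pause($pause_ms);    # let the browser settle, then retry
    }
    return;    # never appeared
}
```

With a live driver: my $table = wait_for($d, '//table', 'xpath') or die "page never loaded";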