Downloading files with WebDriver

By: on June 11, 2012

We have been using WebDriver (Selenium) for functional testing of web applications. I have personally been involved in using WebDriver on .NET to automate testing of several .NET web applications. But in my spare time, I’ve discovered another use for WebDriver, which is automating interactions with websites.

At its most basic, this is just glorified scraping. But I’ve been discovering that in the years since I last tried this sort of thing, getting programs to interact with human-orientated websites has become much easier. Partly, this is due to WebDriver’s high-level library for matching on elements in a variety of ways. But it’s also because the web is genuinely a bit more semantic than it used to be. These days, if you are interacting with a well-written site, semantic CSS classes abound that make it very easy to pull out the information you need.

The two uses I’ve been playing around with are interacting with a phpBB-driven forum (my program can read posts and respond to simple commands) and exploring a university website to download lecture slides automatically for courses I’m interested in. This second use brought me a problem for which there doesn’t seem to be a wealth of good answers on the web, so I thought I would post about it here.

The problem is that of downloading files with WebDriver. The standard answer you get on StackOverflow and similar sites is that WebDriver can’t download files (there is no standard for interacting with the browser’s save dialogs). You find suggestions that you set up your browser to have a default action for the file types you are interested in, so that the save dialog can be bypassed. This isn’t very satisfactory, especially as the lecture slide files I was interested in didn’t have useful names, so I wanted to be able to use names I had picked up from the other elements on the page.

The other solution is: don’t use WebDriver. Just fire up some tool like wget, or use a library in your language of choice. The problem with this is that the university files are password protected – you need to be logged in. It feels like I will be duplicating a lot of work if I log in under WebDriver, navigate the site, and then have to repeat a bunch of it to download the file.

However, it’s actually very simple. You can pull the cookie state that WebDriver is using out easily, and then just pass that along with your request for the file. This means that you are using the same session as WebDriver, and that everything Just Works.

Here’s a code snippet for how I did this in .NET:

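using System.Linq; // provides the Select extension method used below
using System.Net;
using OpenQA.Selenium;
// Download url to localPath, sending the cookies of WebDriver's current session.
void downloadFile(IWebDriver driver, string url, string localPath)
{
    var client = new WebClient();
    client.Headers[HttpRequestHeader.Cookie] = cookieString(driver);
    client.DownloadFile(url, localPath);
}
// Build a Cookie header value of the form "name1=value1; name2=value2".
string cookieString(IWebDriver driver)
{
    var cookies = driver.Manage().Cookies.AllCookies;
    return string.Join("; ", cookies.Select(c => string.Format("{0}={1}", c.Name, c.Value)));
}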
And you’re done!
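
Since part of the point was to save the slides under names picked up from the page rather than the unhelpful names in the URLs, here is a minimal sketch of how a title might be turned into a local file name. The names SlideNaming and SafeFileName and the example URL are my own hypothetical choices, not part of WebDriver or .NET, and the clean-up rules are only one reasonable option.

```csharp
using System;
using System.IO;
using System.Linq;

static class SlideNaming
{
    // Hypothetical helper: combine a human-readable title taken from the page
    // with the file extension found in the download URL.
    public static string SafeFileName(string title, string url)
    {
        // Read the extension from the URL path only, so a query string
        // such as "?x=1" does not end up in the file name.
        string extension = Path.GetExtension(new Uri(url).AbsolutePath);

        // Replace every character the local file system rejects in a file
        // name (for example '/') with an underscore.
        char[] invalid = Path.GetInvalidFileNameChars();
        string cleaned = new string(title.Trim()
            .Select(ch => invalid.Contains(ch) ? '_' : ch)
            .ToArray());

        return cleaned + extension;
    }
}

// Example: SlideNaming.SafeFileName(" Lecture 1/2 ", "http://example.com/files/abc123.pdf?x=1")
// gives "Lecture 1_2.pdf".
```

With WebDriver you could pass in a link element's Text and its href attribute, then hand the result (joined to a target directory with Path.Combine) to downloadFile as the local path.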


  1. Rexy says:

    Hi, can you convert this in to Java code please.
    i have no idea about .net.

  2. Thanks for the post! I put together an example of a similar approach in Ruby. You can see it here:

    I’ve also covered a couple of other approaches to downloads in Ruby (in case anyone’s interested):

  3. manivas says:

    I am not able to get what your intend to do in

    //cookies.Select(c => string.Format(“{0}={1}”, c.Name, c.Value)

    and also I dont see there is not method called ‘Select’ under ‘AllCookies’.

    Pls help.

  4. niesha says:

    Hello! Hi, Although I do tend to agree with most of what you have said, there are occasions where I need to check the contents of the file and using a hash won’t work because the file contains dynamic data, e.g. references, Ids, dates etc. How do you test for this sort of scenario?

  5. Gourav Bhatt says:

    Thanks buddy it helped me a lot time saving and extra code saving . Thanks

  6. Pranav Doshi says:

    Thanks Martin, This has helped me lot in saving time for downloading whole fine which I already implemented.

  7. José Luis Dunstan Aravena says:

    Thank you very much the code has served perfectly. I was hesitant to use WebClient since I did it for almost a year and now, out of the blue, I could not access the site using the cookie I had rescued using autoIt.
