Using a Headless Browser with ProxyMesh

Headless Browser Advantages

ProxyMesh works well for sending requests through a headless browser, and headless browsing offers real advantages in time savings and efficiency. For example, “real,” or visible, browsers need time to open and to render code and images on screen; with headless browsers, that’s not necessary. In fact, they can start working without waiting for a page to load completely.
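As a rough illustration of those two advantages, the sketch below starts Chrome without a visible window and asks Selenium not to wait for the full page load. It assumes Selenium 4 with Chrome; the helper names and the window size are hypothetical, chosen only for this example.

```python
def headless_chrome_arguments(window_size=(1280, 800)):
    """Return Chrome command-line arguments for a headless session."""
    width, height = window_size
    return [
        "--headless=new",  # run Chrome without a visible window
        "--window-size=%d,%d" % (width, height),  # hypothetical default viewport
    ]


def start_headless_chrome(page_load_strategy="eager"):
    """Start headless Chrome; needs the selenium package and Chrome installed."""
    # Imported lazily so the helper above has no third-party dependencies.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in headless_chrome_arguments():
        options.add_argument(argument)
    # "eager" hands control back once the DOM is ready, without waiting
    # for images and stylesheets to finish loading.
    options.page_load_strategy = page_load_strategy
    return webdriver.Chrome(options=options)
```

Calling start_headless_chrome() returns a driver you can use like any other Selenium webdriver; pass page_load_strategy="normal" if your pages need every resource loaded before you read them.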

You can easily configure your device to gain these advantages. This article offers some tips for configuring ProxyMesh and your device to work with browserless.io.

Selenium Using Python

The code examples below have been adapted from Selenium Chrome Proxy Authentication.

IP Authentication

ProxyMesh allows both basic (username and password) authentication and IP authentication. In many cases the latter is preferred, for example when configuring the Python webdriver for Selenium to use Chrome. However, IP authentication may not work with a service that provides a different IP address for each session. If you have authenticated your IP address to the proxy, you can use the proxy from Python with the Selenium library and ChromeDriver by entering the following code:

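from selenium import webdriver

hostname, port = 'XX.proxymesh.com', 31280  # example values; use your own proxy host and port
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s:%s' % (hostname, port))
driver = webdriver.Chrome(options=chrome_options)  # Selenium 4 keyword; Selenium 3 used chrome_options=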

However, this will not work if the proxy requires you to log in with a username and password, because Chrome’s --proxy-server option does not accept credentials. IP authentication is also impractical with Browserless, which uses a different IP address for each session. In either case, you can configure your device for basic authentication by following the steps in the section below.
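For the IP-authenticated case, a small helper can assemble and validate the --proxy-server argument before it reaches Chrome. This is only a minimal sketch: proxy_server_argument is a hypothetical name, and the default http scheme is an assumption you may need to change for your proxy.

```python
def proxy_server_argument(hostname, port, scheme="http"):
    """Build Chrome's --proxy-server argument from a host and port."""
    port = int(port)  # accept "31280" as well as 31280
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    if not hostname:
        raise ValueError("hostname must not be empty")
    return "--proxy-server=%s://%s:%d" % (scheme, hostname, port)


if __name__ == "__main__":
    # Example host and port taken from this article; replace with your own.
    print(proxy_server_argument("XX.proxymesh.com", 31280))
```

The returned string can be passed directly to chrome_options.add_argument().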

HTTP Proxy Authentication with ChromeDriver in Selenium

The following code configures Selenium with ChromeDriver to use an HTTP proxy that requires username:password authentication.

import os
import zipfile

from selenium import webdriver

PROXY_HOST = 'XX.proxymesh.com'  # rotating proxy
PROXY_PORT = 31280
PROXY_USER = 'proxy-user'
PROXY_PASS = 'proxy-password'

manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version":"22.0.0"
}
"""
background_js = """
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {
            scheme: "http",
            host: "%s",
            port: parseInt(%s)
        },
        bypassList: ["localhost"]
    }
};

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

function callbackFn(details) {
    return {
        authCredentials: {
            username: "%s",
            password: "%s"
        }
    };
}

chrome.webRequest.onAuthRequired.addListener(callbackFn, {urls: ["<all_urls>"]}, ['blocking']);
""" % (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)

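def get_chromedriver(use_proxy=False, user_agent=None):
    """Return a Chrome webdriver, optionally routed through the authenticated proxy."""
    # Imported here so the function is self-contained (Selenium 4 API).
    from selenium.webdriver.chrome.service import Service

    path = os.path.dirname(os.path.abspath(__file__))
    chrome_options = webdriver.ChromeOptions()
    if use_proxy:
        # Package the manifest and background script as a Chrome extension.
        pluginfile = 'proxy_auth_plugin.zip'

        with zipfile.ZipFile(pluginfile, 'w') as zp:
            zp.writestr("manifest.json", manifest_json)
            zp.writestr("background.js", background_js)
        chrome_options.add_extension(pluginfile)
    if user_agent:
        chrome_options.add_argument('--user-agent=%s' % user_agent)
    # Expects the chromedriver binary in the same directory as this script.
    driver = webdriver.Chrome(
        service=Service(os.path.join(path, 'chromedriver')),
        options=chrome_options)
    return driver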

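def main():
    driver = get_chromedriver(use_proxy=True)
    driver.get('https://httpbin.org/ip')
    print(driver.page_source)  # should show the proxy's IP address, not your own
    driver.quit()  # close the browser when finished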

if __name__ == '__main__':
    main()

The get_chromedriver function returns a configured Selenium webdriver that you can use in your application.
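One caveat with the approach above is that the credentials are pasted into JavaScript with plain %s formatting, so a password containing a quote or backslash would break the script. The sketch below is one possible variation, not part of the original example: it escapes each value with json.dumps before writing the extension. The function name build_proxy_extension and its parameters are hypothetical.

```python
import json
import zipfile

MANIFEST = {
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": ["proxy", "tabs", "unlimitedStorage", "storage",
                    "<all_urls>", "webRequest", "webRequestBlocking"],
    "background": {"scripts": ["background.js"]},
    "minimum_chrome_version": "22.0.0",
}

# Placeholders are filled with JSON-encoded values, which are valid JavaScript literals.
BACKGROUND_TEMPLATE = """
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {scheme: "http", host: %(host)s, port: %(port)d},
        bypassList: ["localhost"]
    }
};

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

chrome.webRequest.onAuthRequired.addListener(
    function(details) {
        return {authCredentials: {username: %(user)s, password: %(password)s}};
    },
    {urls: ["<all_urls>"]},
    ["blocking"]
);
"""


def build_proxy_extension(path, host, port, user, password):
    """Write the proxy-auth extension zip to `path` and return the path."""
    background_js = BACKGROUND_TEMPLATE % {
        "host": json.dumps(host),
        "port": int(port),
        "user": json.dumps(user),
        "password": json.dumps(password),
    }
    with zipfile.ZipFile(path, "w") as zp:
        zp.writestr("manifest.json", json.dumps(MANIFEST, indent=4))
        zp.writestr("background.js", background_js)
    return path
```

The resulting file can be passed to chrome_options.add_extension() in place of the zip that get_chromedriver builds.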
