Using a Headless Browser with ProxyMesh
Headless Browser Advantages
ProxyMesh works well for sending requests through a headless browser, and headless browsing offers real savings in time and resources. A “real,” or visible, browser needs time to open a window and draw pages and images on screen; a headless browser skips that step. In fact, it can start working with a page without waiting for it to load completely.
You can easily configure your device to gain these advantages. This article offers some tips for configuring ProxyMesh and your device to work with browserless.io.
Selenium Using Python
The code examples below have been adapted from Selenium Chrome Proxy Authentication.
IP Authentication
ProxyMesh supports both basic (username and password) authentication and IP authentication. In many cases the latter is preferred, for example when configuring the Python webdriver for Selenium to use Chrome, because no credentials have to be passed to the browser. It may not work, however, when your requests come from a service that uses a different IP address for each session, since the address you authenticated will no longer match. If you have authenticated your IP address with the proxy, you can use the proxy from Python with the Selenium library and ChromeDriver by entering the following code:
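from selenium import webdriver

hostname = 'XX.proxymesh.com'  # your ProxyMesh host
port = '31280'

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s:%s' % (hostname, port))
# Recent Selenium releases take `options=`; the older `chrome_options=` keyword was removed.
driver = webdriver.Chrome(options=chrome_options)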
However, this will not work if the proxy requires you to log in with a username and password. That is the case with Browserless: it uses a different IP address for each session, so IP authentication is not an option. In that case, you can configure your device for basic authentication by following the steps in the next section.
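For the IP-authenticated case, the Chrome flags can be assembled by a small helper that also switches on headless mode. This is a minimal sketch, not ProxyMesh's own code: the names build_chrome_args and make_driver are hypothetical, and it assumes a recent Selenium release and a Chrome version that accepts the --headless=new flag.

```python
def build_chrome_args(host, port, headless=True):
    """Return Chrome command-line flags for an IP-authenticated proxy."""
    args = ['--proxy-server=%s:%s' % (host, port)]
    if headless:
        # '--headless=new' selects Chrome's current headless mode.
        args.append('--headless=new')
    return args


def make_driver(host, port, headless=True):
    """Start Chrome with the flags above (hypothetical helper)."""
    # Imported here so build_chrome_args can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(host, port, headless):
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Calling make_driver('XX.proxymesh.com', 31280) would then start a headless Chrome session that sends its traffic through the proxy, provided your IP address has been authenticated.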
HTTP Proxy Authentication with ChromeDriver in Selenium
The following code configures Selenium with ChromeDriver to use an HTTP proxy that requires username:password authentication.
import os
import zipfile

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

PROXY_HOST = 'XX.proxymesh.com'  # rotating proxy
PROXY_PORT = 31280
PROXY_USER = 'proxy-user'
PROXY_PASS = 'proxy-password'

# Manifest for a small Chrome extension that is allowed to set the proxy
# and to answer proxy authentication challenges.
manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version": "22.0.0"
}
"""

# Background script: points Chrome at the proxy and supplies the credentials
# whenever the proxy asks for them.
background_js = """
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {
            scheme: "http",
            host: "%s",
            port: parseInt(%s)
        },
        bypassList: ["localhost"]
    }
};

chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});

function callbackFn(details) {
    return {
        authCredentials: {
            username: "%s",
            password: "%s"
        }
    };
}

chrome.webRequest.onAuthRequired.addListener(
    callbackFn,
    {urls: ["<all_urls>"]},
    ['blocking']
);
""" % (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)


def get_chromedriver(use_proxy=False, user_agent=None):
    path = os.path.dirname(os.path.abspath(__file__))
    chrome_options = webdriver.ChromeOptions()
    if use_proxy:
        # Package the two files above as an extension and load it into Chrome.
        pluginfile = 'proxy_auth_plugin.zip'
        with zipfile.ZipFile(pluginfile, 'w') as zp:
            zp.writestr("manifest.json", manifest_json)
            zp.writestr("background.js", background_js)
        chrome_options.add_extension(pluginfile)
    if user_agent:
        chrome_options.add_argument('--user-agent=%s' % user_agent)
    # Selenium 4 takes the driver path through a Service object and the
    # options through the `options=` keyword.
    driver = webdriver.Chrome(
        service=Service(os.path.join(path, 'chromedriver')),
        options=chrome_options)
    return driver


def main():
    driver = get_chromedriver(use_proxy=True)
    driver.get('https://httpbin.org/ip')


if __name__ == '__main__':
    main()
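One fragile spot in the listing above is that the credentials are pasted into the JavaScript between double quotes, so a password containing a quote or a backslash would produce a broken script. A possible variation, sketched here with the hypothetical names BACKGROUND_TEMPLATE and render_background_js, lets json.dumps produce the quoted and escaped string literals instead. It is an illustration of the idea rather than part of the original example.

```python
import json

# Same background script as above, but the placeholders expect
# ready-made JavaScript literals instead of raw text inside quotes.
BACKGROUND_TEMPLATE = """
var config = {
    mode: "fixed_servers",
    rules: {
        singleProxy: {scheme: "http", host: %(host)s, port: %(port)d},
        bypassList: ["localhost"]
    }
};
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
chrome.webRequest.onAuthRequired.addListener(
    function(details) {
        return {authCredentials: {username: %(user)s, password: %(password)s}};
    },
    {urls: ["<all_urls>"]},
    ['blocking']
);
"""


def render_background_js(host, port, user, password):
    """Fill the template with safely escaped values (hypothetical helper)."""
    # json.dumps returns a quoted, escaped string that is also a valid
    # JavaScript string literal.
    return BACKGROUND_TEMPLATE % {
        'host': json.dumps(host),
        'port': int(port),
        'user': json.dumps(user),
        'password': json.dumps(password),
    }
```

The result of render_background_js(PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS) could be written to background.js in place of the background_js string.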
The get_chromedriver function returns a configured Selenium webdriver that you can use in your application.
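To confirm that traffic really goes through the proxy, you can read back the address that httpbin.org/ip reports. The sketch below uses two hypothetical helpers, parse_origin and current_exit_ip. It assumes that Chrome shows the JSON response inside a pre element, which is how Chrome normally displays plain-text responses.

```python
import json


def parse_origin(body_text):
    """Extract the 'origin' field from the JSON returned by httpbin.org/ip."""
    return json.loads(body_text)['origin']


def current_exit_ip(driver):
    """Load httpbin.org/ip in the given driver and return the reported address."""
    # Imported here so parse_origin can be used without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get('https://httpbin.org/ip')
    # Assumption: Chrome wraps the plain-text JSON response in a <pre> element.
    return parse_origin(driver.find_element(By.TAG_NAME, 'pre').text)
```

With a driver from get_chromedriver(use_proxy=True), current_exit_ip(driver) should return one of the proxy's outgoing addresses rather than your own.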