Scraper Template with Selenium #2: Docker Bridge Network

This article was first published on Python | datawookie, and kindly contributed to python-bloggers.

In the previous post we set up a scraper template which used Selenium on Docker via the host network. Now we’re going to do essentially the same thing but using a bridge network.

Default Network

We’ll start by using Docker’s default bridge network.

docker network ls

NETWORK ID          NAME                DRIVER              SCOPE
ea5ebd23a086        bridge              bridge              local
bb80a2809880        host                host                local
00b74ecbf970        none                null                local

These three networks will always be available: bridge, host and none. We’re only interested in the first one.

Let’s create a Selenium container.

docker run -d --rm --name selenium selenium/standalone-chrome:3.141

cede2a2e6fc279fcb2014f290cc5e324d86f2033d04cca1b2da59c03e121aec5

Now if we inspect the bridge network we’ll see that the selenium container is connected.

docker network inspect bridge

[
    {
        "Name": "bridge",
        "IPAM": {
            "Config": [
                {
                    "Subnet": "172.17.0.0/16",
                    "Gateway": "172.17.0.1"
                }
            ]
        },
        "Containers": {
            "cede2a2e6fc279fcb2014f290cc5e324d86f2033d04cca1b2da59c03e121aec5": {
                "Name": "selenium",
                "MacAddress": "02:42:ac:11:00:02",
                "IPv4Address": "172.17.0.2/16",
                "IPv6Address": ""
            }
        }
    }
]

The above output has been abridged for clarity.

We can see that the gateway between the host and the bridge network has an IP of 172.17.0.1 and that the selenium container is at 172.17.0.2.
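
As a quick sanity check on those numbers, here is a minimal sketch using Python’s standard ipaddress module. The values are copied from the abridged inspect output above; the variable names are just for illustration.

```python
import ipaddress

# Values copied from the abridged `docker network inspect bridge` output.
subnet = ipaddress.ip_network("172.17.0.0/16")
gateway = ipaddress.ip_address("172.17.0.1")
# "IPv4Address" carries the prefix length, so parse it as an interface.
container = ipaddress.ip_interface("172.17.0.2/16")

print(gateway in subnet)            # is the gateway inside the bridge subnet?
print(container.ip in subnet)       # is the selenium container inside it too?
print(container.network == subnet)  # does the /16 suffix name the same network?
print(subnet.num_addresses)         # size of the address space in a /16
```

All three membership checks should come out true, which is just another way of saying that the gateway and the container sit on the same /16 subnet.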

Launching a shell inside the selenium container we can see what the network looks like from its perspective.

root@cede2a2e6fc2:/# ip -br -c a
lo               UNKNOWN        127.0.0.1/8 
eth0@if68        UP             172.17.0.2/16

To get this to work you’ll need to install the iproute2 package on the container.

Okay, now let’s try connecting to the selenium container via the default bridge network. In order to do this we need to use its IP address.

from selenium import webdriver

SELENIUM_URL = "172.17.0.2:4444"

browser = webdriver.Remote(f"http://{SELENIUM_URL}/wd/hub", {'browserName': 'chrome'})

browser.get("https://www.google.com")

print(f"Retrieved URL: {browser.current_url}.")

browser.close()

We have to explicitly specify the IP for the selenium container. Obviously this is not ideal. We cannot be assured that the selenium container will always be at the same IP address, so this will become hard to maintain.
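
One way to avoid hard-coding the address on the default bridge network is to look it up each time. The sketch below is only an illustration, not part of the original template: container_ip() and inspect_network() are hypothetical helper names, the sample JSON uses a shortened container ID, and the lookup would have to run on the host (where the docker command is available) rather than inside a container.

```python
import json
import subprocess


def container_ip(inspect_output, container_name):
    """Return the IPv4 address of a named container, or None if absent."""
    for network in json.loads(inspect_output):
        # "Containers" maps container IDs to their network details.
        for details in (network.get("Containers") or {}).values():
            if details.get("Name") == container_name:
                # "IPv4Address" looks like "172.17.0.2/16"; drop the prefix length.
                return details["IPv4Address"].split("/")[0]
    return None


def inspect_network(network_name="bridge"):
    """Run `docker network inspect` and return its JSON output as text."""
    result = subprocess.run(
        ["docker", "network", "inspect", network_name],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Abridged sample in the same shape as the inspect output shown earlier.
SAMPLE = """
[{"Name": "bridge",
  "Containers": {"cede2a2e6fc2": {"Name": "selenium",
                                   "IPv4Address": "172.17.0.2/16"}}}]
"""

ip = container_ip(SAMPLE, "selenium")
print(f"http://{ip}:4444/wd/hub")
```

Against a live setup you would pass inspect_network() instead of SAMPLE. It works, but it is extra machinery just to find a container, which is part of why the default bridge network is awkward here.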

Stop the existing selenium container.

docker stop selenium

User-Defined Network

We’re able to build a more robust setup if we create a user-defined network.

docker network create --driver bridge google

57a868c4124e4339a35b13dd6125f36835530e561e48a85e26de02e31d44460b

List the Docker networks again.

docker network ls

NETWORK ID          NAME                DRIVER              SCOPE
ea5ebd23a086        bridge              bridge              local
57a868c4124e        google              bridge              local
bb80a2809880        host                host                local
00b74ecbf970        none                null                local

The google network has been added to the list.

Now launch the Selenium container again, but this time using the --network argument to connect it to the google network.

docker run -d --rm --name selenium --network google selenium/standalone-chrome:3.141

f8a0a0dd21f4f27773c5ce260df21cb6d509815b56638ccfd6be5f05dbb8172b

If we inspect the google network then we’ll see the details of the selenium container.

docker network inspect google

[
    {
        "Name": "google",
        "IPAM": {
            "Driver": "default",
            "Options": {},
            "Config": [
                {
                    "Subnet": "172.21.0.0/16",
                    "Gateway": "172.21.0.1"
                }
            ]
        },
        "Containers": {
            "f8a0a0dd21f4f27773c5ce260df21cb6d509815b56638ccfd6be5f05dbb8172b": {
                "Name": "selenium",
                "MacAddress": "02:42:ac:15:00:02",
                "IPv4Address": "172.21.0.2/16",
                "IPv6Address": ""
            }
        }
    }
]

The above output has been abridged for clarity.

On a user-defined network, containers can be located either via IP address or by name (where the name is internally resolved to an IP address via the automatic service discovery capability). This means that, rather than address the selenium container by its IP address, we can simply refer to it by name. This is a much more robust setup: provided we consistently use the same name for this container, the scraper will find it regardless of which IP address it has been assigned.

from selenium import webdriver

SELENIUM_URL = "selenium:4444"

browser = webdriver.Remote(f"http://{SELENIUM_URL}/wd/hub", {'browserName': 'chrome'})

browser.get("https://www.google.com")

print(f"Retrieved URL: {browser.current_url}.")

browser.close()
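
If you’d like the same script to work on both the default and the user-defined network, one option is to read the host from the environment and fall back to the container name. This is a sketch under assumptions rather than part of the template: selenium_endpoint() is a hypothetical helper, and SELENIUM_HOST and SELENIUM_PORT are made-up variable names.

```python
import os


def selenium_endpoint(default_host="selenium", default_port=4444):
    """Build the remote WebDriver URL, allowing overrides from the environment."""
    # SELENIUM_HOST and SELENIUM_PORT are hypothetical variable names.
    host = os.environ.get("SELENIUM_HOST", default_host)
    port = int(os.environ.get("SELENIUM_PORT", default_port))
    return f"http://{host}:{port}/wd/hub"


# With no overrides set, this falls back to the container name.
print(selenium_endpoint())
```

The result could be passed to webdriver.Remote() in place of the f-string above. On the default bridge network you would set SELENIUM_HOST to the container’s IP address; on a user-defined network the default name is enough.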

Scraper Template in Docker with User-Defined Bridge Network

Let’s wrap this up by putting our little scraper into a Docker image.

FROM python:3.8.5-slim AS base

RUN pip3 install selenium==3.141.0

COPY google-selenium-bridge-user-defined.py /

CMD python3 google-selenium-bridge-user-defined.py

Now build the image.

docker build -t google-selenium-bridge-user-defined .

And run it.

docker run --net=google google-selenium-bridge-user-defined

Retrieved URL: https://www.google.com/.

We specified --net=google to ensure that this container is launched onto the google network.

Our setup now has everything covered in the previous post but also keeps all of the networking within Docker, so everything is isolated from the host.

Cleaning Up

It’s always good practice to mop up: stop the selenium container and remove the google network.

docker stop selenium
docker network rm google