One use for "Infrastructure-as-Code" solutions is to set up complex content-distribution networks in a consistent and repeatable manner. In this codelab, we will examine how to deploy a custom set of networks, subnetworks, virtual machines, and firewall rules that implement a scalable web site. The network schema for the lab is shown below. As the figure shows, we will create a network of our own called "networking101
", then establish several subnetworks across 3 different regions. While our server infrastructure will be deployed in subnetworks located in us-east5
and europe-west1
(10.20.0.0/16
and 10.30.0.0/16
in the figure), we will also deploy client machines in a subnetwork located in the us-west1
region to stress test the servers.
Begin by launching Cloud Shell and copying the files from the lab's bucket hosted on Google Cloud Storage:
gsutil cp -r gs://cs430/networking101 .
cd networking101
The lab uses the Deployment Manager service that Google Cloud provides for setting up resources on the platform via specifications. Within the directory, multiple Deployment Manager files are given. The first is the main YAML file shown below. As the file shows, it includes a number of Jinja templates for specifying the different parts of the deployment including its virtual machines, networks, subnetworks, and firewalls. You can think of these templates as libraries of pre-defined configurations that you can create your instances from.
imports:
- path: vm-template.jinja
- path: network-template.jinja
- path: subnetwork-template.jinja
- path: compute-engine-template.jinja
resources:
- name: compute-engine-setup
type: compute-engine-template.jinja
The YAML file specifies the main Jinja template to instantiate as a resource. We will go through its parts in the next steps.
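As an aside, Deployment Manager can expand and stage a configuration without creating any resources. If you want to see what the templates expand to before deploying anything, an optional sketch is to use the --preview flag (if you try it, delete the previewed deployment afterwards so the create command later in the lab succeeds):
# Stage the deployment without creating resources (optional)
gcloud deployment-manager deployments create networking101 --config networking-lab.yaml --preview
# Review the expanded resource list, then remove the preview before continuing
gcloud deployment-manager deployments describe networking101
gcloud deployment-manager deployments delete networking101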
Jinja files declare infrastructure to deploy. A snippet showing the networking parts of the main Jinja file is shown below. The snippet defines a Jinja variable (NETWORK_NAME) that will be used throughout the deployment and sets it to networking101. It then passes the variable as the name for the custom network that we deploy using the network-template.jinja file.
{% set NETWORK_NAME = "networking101" %}
resources:
- name: {{ NETWORK_NAME }}
type: network-template.jinja
The template is shown below and defines a custom network that does not have subnetworks automatically created. As we will show later, we will define our own subnetworks for the regions we wish to deploy into. Because the template was passed "networking101" as its name, that will be the name of the network that is created.
resources:
- name: {{ env["name"] }}
type: compute.v1.network
properties:
autoCreateSubnetworks: false
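Once the deployment has been created later in the lab, you can optionally confirm from Cloud Shell that the network was built in custom (non-auto) subnet mode:
gcloud compute networks describe networking101 --format="value(autoCreateSubnetworks)"
# Should print False for a custom-mode network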
Our deployment will specify the various subnetworks as well. Going back to the main Jinja template, we can find all of these subnetworks declared. For each subnetwork, we first assign it a name (us-west-s1, us-west-s2, etc.) and specify the template to use for instantiation (subnetwork-template.jinja). Then, in the properties section, we specify for each subnetwork the network it attaches to (networking101), the IP address range it will use, and the region it will be located in. The specification for the first two subnetworks is shown below:
resources:
- name: us-west-s1
type: subnetwork-template.jinja
properties:
network: {{ NETWORK_NAME }}
range: 10.10.0.0/16
region: us-west1
- name: us-west-s2
type: subnetwork-template.jinja
properties:
network: {{ NETWORK_NAME }}
range: 10.11.0.0/16
region: us-west1
...
The subnetwork Jinja template instantiates the actual infrastructure using the parameters it has been passed (network, range, region).
resources:
- name: {{ env["name"] }}
type: compute.v1.subnetwork
properties:
ipCidrRange: {{ properties["range"] }}
network: $(ref.{{ properties["network"] }}.selfLink)
region: {{ properties["region"] }}
Finally, the deployment specifies a virtual machine to be created in each subnetwork. For the machines in us-west1, a snippet from the main Jinja template is shown below:
resources:
- name: w1-vm
type: vm-template.jinja
properties:
machineType: e2-micro
zone: us-west1-b
network: {{ NETWORK_NAME }}
subnetwork: us-west-s1
- name: w2-vm
type: vm-template.jinja
properties:
machineType: e2-micro
zone: us-west1-b
network: {{ NETWORK_NAME }}
subnetwork: us-west-s2
ip: 10.11.0.100
The snippet shows that a virtual machine named w1-vm is to be created using an e2-micro machine type in us-west1 and is to be placed on the networking101 network in subnetwork us-west-s1 (as specified in the previous step). The network address will be automatically assigned since it has not been specified. The declaration of w2-vm is similar, but its IP address is explicitly assigned to be 10.11.0.100.
The template takes the parameters passed to it (name, machineType, zone, network, subnetwork, ip) and then creates a Compute Engine instance with the given name and machine type in the given zone, attaches a boot disk built from a Debian 10 image, and places a network interface on the specified network and subnetwork, assigning the given static IP address if one was provided.
The final part of the template specifies, in the machine's "metadata", a startup-script that should be run when the VM is first brought up. As the script shows, each VM will automatically install a set of packages we'll be using in subsequent steps (e.g. traceroute, siege).
resources:
- name: {{ env["name"] }}
type: compute.v1.instance
properties:
zone: {{ properties["zone"] }}
machineType: https://www.googleapis.com/compute/v1/projects/{{ env["project"] }}/zones/{{ properties["zone"] }}/machineTypes/{{ properties["machineType"] }}
disks:
- deviceName: boot
initializeParams:
sourceImage: https://www.googleapis.com/compute/v1/projects/debian-cloud/global/images/family/debian-10
networkInterfaces:
- network: $(ref.{{ properties["network"] }}.selfLink)
subnetwork: $(ref.{{ properties["subnetwork"] }}.selfLink)
{% if properties["ip"] %}
networkIP: {{ properties["ip"] }}
{% endif %}
metadata:
items:
- key: startup-script
value: |
#!/bin/bash
apt-get -y update
apt-get -y install traceroute mtr tcpdump iperf whois host dnsutils siege
Within Cloud Shell, launch the deployment and name it networking101:
gcloud deployment-manager deployments create networking101 --config networking-lab.yaml
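When the command completes, a couple of quick checks confirm what was built:
# Show the deployment and the resources it manages
gcloud deployment-manager deployments describe networking101
# List the VMs that were created from the templates
gcloud compute instances list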
Visit the Compute Engine console, click the ssh button for one of the VMs, and attempt to connect. Did it succeed? Secure-by-default is an important concept in the cloud given there are so many ways to configure systems and so many systems being deployed. In the previous step, although we configured the networks and machines, the default access policy inside a network we create is deny. Thus, all traffic is disallowed by the firewall unless we add rules to explicitly allow it.
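You can see this from Cloud Shell as well: listing the project's firewall rules shows that nothing references the new networking101 network yet (only rules for the project's default network, if one exists):
gcloud compute firewall-rules list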
To remedy this, we will extend our deployment specification with firewall rules that allow the traffic we need. The update will be added to the original YAML file. Open up the file (networking-lab.yaml).
Add the following line to the end of the imports section to include a new template file that we will create for adding firewall rules to the deployment:
- path: firewall-template.jinja
Then, add the following lines at the bottom of the file under the resources section to apply it to our network.
- name: networking-firewall
type: firewall-template.jinja
properties:
network: networking101
The additional specification creates a firewall configuration based on a Jinja template that is then attached to our custom network (networking101). The template itself is shown below. Create the file in the same directory. Since env["name"] in the template resolves to the resource name we used in the YAML file (networking-firewall), the first rule is named networking-firewall-allow-internal; it allows all TCP, UDP, and ICMP traffic amongst the internal interfaces on the network (e.g. 10.0.0.0/8). The second rule, networking-firewall-allow-ssh, allows ssh connections from any network source (0.0.0.0/0). Similarly, the last rule, networking-firewall-allow-icmp, allows ICMP traffic from any network source.
resources:
- name: {{ env["name"] }}-allow-internal
type: compute.v1.firewall
properties:
network: $(ref.{{ properties["network"] }}.selfLink)
sourceRanges: ["10.0.0.0/8"]
allowed:
- IPProtocol: TCP
ports: ["0-65535"]
- IPProtocol: UDP
ports: ["0-65535"]
- IPProtocol: ICMP
- name: {{ env["name"] }}-allow-ssh
type: compute.v1.firewall
properties:
network: $(ref.{{ properties["network"] }}.selfLink)
sourceRanges: ["0.0.0.0/0"]
allowed:
- IPProtocol: TCP
ports: ["22"]
- name: {{ env["name"] }}-allow-icmp
type: compute.v1.firewall
properties:
network: $(ref.{{ properties["network"] }}.selfLink)
sourceRanges: ["0.0.0.0/0"]
allowed:
- IPProtocol: ICMP
We will now update the deployment in-place. Deployment Manager will automatically determine what needs to change and only re-deploy the infrastructure that the new specification requires. Back in Cloud Shell, run the following in the networking101 directory that contains the updated YAML file.
gcloud deployment-manager deployments update networking101 --config networking-lab.yaml
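After the update finishes, the three new rules should appear in the rule listing, and any one of them can be inspected directly (the rule names are prefixed with the resource name from the YAML file):
gcloud compute firewall-rules list
gcloud compute firewall-rules describe networking-firewall-allow-internal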
Visit the networking101 VPC network in the web UI. Then, go back to the Compute Engine console and ssh into each VM. We will now use the sessions on each VM to run the ping command to measure the latency between the different geographic regions. The table below shows the regions and their geographic locations.
Region       | Location
------------ | ------------------------
us-west1     | The Dalles, Oregon, USA
us-east5     | Columbus, Ohio, USA
europe-west1 | Saint Ghislain, Belgium
asia-east1   | Changhua County, Taiwan
Given the locations of the regions and the physical distance between them, we can then calculate the "ideal" latency between them using the speed of light and compare it against the latency measured over the network. Using the ssh sessions, perform a pairwise ping between all 4 regions in which you have deployed VMs. Note that you can either use the name of the VM as an argument or its internal IP address. Examples are shown below for ping commands that perform 3 round-trip measurements to w2-vm (10.11.0.100):
ping -c 3 w2-vm
ping -c 3 10.11.0.100
Location pair              | Ideal latency | Measured latency
-------------------------- | ------------- | ----------------
us-west1 to us-east5       | ~45 ms        |
us-west1 to europe-west1   | ~93 ms        |
us-west1 to asia-east1     | ~114 ms       |
us-east5 to europe-west1   | ~76 ms        |
us-east5 to asia-east1     | ~141 ms       |
europe-west1 to asia-east1 | ~110 ms       |
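As a rough sanity check on the "ideal" column, you can compute a strict lower bound for any pair yourself from the great-circle distance and the propagation speed in optical fiber (about 200,000 km/s, roughly two thirds of the speed of light in vacuum). The sketch below assumes a hypothetical distance of 3,200 km between The Dalles and Columbus; real fiber paths are longer and add switching delay, so both the table's figures and your measurements will be higher than this bound:
# Assumed great-circle distance in km (illustrative value only)
DIST_KM=3200
# Propagation speed in fiber: roughly 200 km per millisecond
FIBER_KM_PER_MS=200
# Ideal round-trip time in milliseconds = 2 * distance / speed
echo "scale=1; 2 * $DIST_KM / $FIBER_KM_PER_MS" | bc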
The current deployment has a fixed number of VM instances. When traffic is light, this can result in idle resources and significant unnecessary cost. When traffic is heavy, it can lead to congestion and a loss of users due to poor performance. In a modern deployment, infrastructure is scaled based on demand. While this scaling is done automatically in serverless deployments, with Compute Engine VMs we can achieve it by creating Managed Instance Groups (us-east5-mig and europe-west1-mig below).
Instance Groups are collections of Compute Engine VMs that are derived from a single template. The number of replicas in the group is varied based on the load being experienced.
In order to route traffic so that it is evenly spread across instances in a group, we must also instantiate a Load Balancer. A load balancer acts as a reverse-proxy, taking in requests and forwarding them to a backend server for processing. To demonstrate this, we will create the two instance groups then set up a scaling policy for each to determine how they will scale out and scale in. Then, we'll instantiate a load balancer to route requests amongst the servers in the two instance groups.
In our previous steps, we added firewall rules to allow ssh and ICMP traffic from all sources. In the next steps, we wish to build a scalable web site. In order for it to be accessible externally, we must also allow HTTP traffic from all sources. Although this could be done via an update to the Deployment Manager specification as before, we will instead demonstrate how it can be done via the gcloud CLI.
The CLI command below does this by creating a firewall rule named networking-firewall-allow-http, specifying that traffic to the HTTP port (80) from all sources (0.0.0.0/0) is allowed, and attaching the rule to the networking101 network. In addition, we associate the target tag http-server with the rule so that it applies to each of the VMs we create to serve our site (each of which will carry that tag).
gcloud compute firewall-rules create networking-firewall-allow-http \
  --allow tcp:80 --network networking101 --source-ranges 0.0.0.0/0 \
  --target-tags http-server
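To confirm that the rule was created with the expected port, source range, and target tag, you can describe it:
gcloud compute firewall-rules describe networking-firewall-allow-http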
In order to scale Compute Engine VMs out with replicas, we must first specify an Instance Template. In this case, we will use a per-region template to differentiate machines created in one region from those created in another. Similar to how VMs were specified in the Deployment Manager Jinja template, the command below creates a template for the us-east5 region called us-east5-template. VMs created from this template are placed on the us-east5 subnetwork and have the http-server tag attached to them to allow incoming HTTP traffic.
gcloud compute instance-templates create "us-east5-template" \ --image-project debian-cloud \ --image debian-10-buster-v20221102 \ --machine-type e2-micro \ --subnet "us-east5" \ --metadata "startup-script-url=gs://cs430/networking101/website/startup.sh" \ --region "us-east5" \ --tags "http-server"
The command also specifies a startup script that should be executed whenever a VM instance is created (startup-script-url set in the metadata of the instance). The startup script is shown below. It installs a simple Apache/PHP server, downloads an index.php file, and obtains the zone information of the instance from the VM's Metadata service (see CS 495 for more on this service). It then substitutes the region information into the index.php file using the Unix stream editor (sed).
#! /bin/bash
apt-get update
apt-get install -y apache2 php
cd /var/www/html
rm index.html -f
rm index.php -f
wget https://storage.googleapis.com/cs430/networking101/website/index.php
META_REGION_STRING=$(curl "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
REGION=`echo "$META_REGION_STRING" | awk -F/ '{print $4}'`
sed -i "s|region-here|$REGION|" index.php
The PHP file is below. The 'region-here' text is replaced by the sed command, which results in the PHP script showing us the region and zone it has been brought up in, along with the IP address of the client and the hostname of the instance (to allow us to differentiate between the various replicas in our deployment).
<?php
$ip = $_SERVER['REMOTE_ADDR'];
// display it back
echo "<h1>Networking 101 Lab</h1>";
echo "<h2>Client IP</h2>";
echo "Your IP address : " . $ip;
echo "<h2>Hostname</h2>";
echo "Server Hostname: " . php_uname("n");
echo "<h2>Server Location</h2>";
echo "Region and Zone: " . "region-here";
?>
Create the second instance template for the lab in europe-west1. As the command shows, it only differs from the first in its region and subnet.
gcloud compute instance-templates create "europe-west1-template" \ --image-project debian-cloud \ --image debian-10-buster-v20221102 \ --machine-type e2-micro \ --subnet "europe-west1" \ --metadata "startup-script-url=gs://cs430/networking101/website/startup.sh" \ --region "europe-west1" \ --tags "http-server"
Then, see that both templates have been created by listing them.
gcloud compute instance-templates list
They should also show up in the web console of Compute Engine under "Instance Templates".
Up until this point, we have mostly relied upon Deployment Manager and the gcloud CLI to configure resources. The configuration can also be done via the web console. For subsequent steps, deploy using the gcloud commands, but view how each step can be done via the web console so you can see how the two map to each other.
Before we create our managed instance groups from templates, we must first specify a health check. Health checks enable GCP to automatically detect non-functioning instances in a managed instance group (e.g. when they crash) so that they can be recreated. This is useful for ensuring high availability for our service. As we are running a web site as our application, we can define a simple HTTP check called instance-health-check on port 80 that probes every 10 seconds and declares failure after 3 failed checks.
A single gcloud command can create this check:
gcloud compute health-checks create http instance-health-check \
  --check-interval=10s \
  --port=80 \
  --timeout=5s \
  --unhealthy-threshold=3
Visit "Compute Engine"=>"Health checks" and create a new health check with the settings specified:
Given our templates, we will first create a simple managed instance group in our European region with a fixed number of instances. The settings are: the europe-west1 region, the europe-west1-template instance template, a fixed size of 3 instances, and autohealing via the health check created earlier (instance-health-check). We can instantiate the group using the gcloud commands given below to create the group and specify its health check.
gcloud compute instance-groups managed create europe-west1-mig \
  --size 3 \
  --region europe-west1 \
  --template europe-west1-template

gcloud compute instance-groups managed update europe-west1-mig \
  --health-check instance-health-check \
  --initial-delay 120 \
  --region europe-west1
Because we want this instance group to eventually serve web requests, we "expose" port 80 and name it http. A load balancer will then route requests to this named port.
gcloud compute instance-groups set-named-ports europe-west1-mig \
  --named-ports=http:80 --region europe-west1
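You can optionally confirm that the named port was recorded on the group:
gcloud compute instance-groups get-named-ports europe-west1-mig --region europe-west1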
The web console of Compute Engine can also be used to configure this group, as shown below:
Leave the rest of the settings at their defaults, but for Autohealing, specify the health check created in the previous step.
In Cloud Shell, we will now create the second managed instance group from the instance templates created previously. Name it us-east5-mig and set it to use autoscaling from 1 to 5 instances with the following settings: the us-east5 region, the us-east5-template instance template, and the same instance-health-check for autohealing.
We will create the deployment via several individual gcloud commands. The first two commands create the group and specify its health check.
gcloud compute instance-groups managed create us-east5-mig \
  --size 1 \
  --region us-east5 \
  --template us-east5-template

gcloud compute instance-groups managed update us-east5-mig \
  --health-check instance-health-check \
  --initial-delay 120 \
  --region us-east5
The next command sets up autoscaling for the instance group to range from 1 to 5 replicas based on load-balancing utilization.
gcloud compute instance-groups managed set-autoscaling us-east5-mig \
  --mode on \
  --region us-east5 \
  --min-num-replicas=1 --max-num-replicas=5 \
  --cool-down-period=45 \
  --scale-based-on-load-balancing \
  --target-load-balancing-utilization=0.8
Finally, as before, we want this instance group to eventually serve web requests, so we "expose" port 80 and name it http for the load balancer we will deploy.
gcloud compute instance-groups set-named-ports us-east5-mig \
  --named-ports=http:80 --region us-east5
An example of how to create the group in the web console with the appropriate settings is shown below. Visit the web console for Compute Engine and click on "Instance Groups" and "Create instance group". Within the UI, configure the settings from above.
Ensure that the groups and their associated instances have been created properly via:
gcloud compute instance-groups list
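To see the individual VMs in each group (their names are auto-generated from the group name) along with their current status, you can also run:
gcloud compute instance-groups managed list-instances europe-west1-mig --region europe-west1
gcloud compute instance-groups managed list-instances us-east5-mig --region us-east5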
Even after the deployment finishes, wait a minute for the instances to fully initialize. Then, go to the web console of Compute Engine and directly visit the web server that has been brought up within us-east5-mig via its external IP address (http://<external IP of the instance>).
You should see the output of the PHP script.
Repeat with an instance from europe-west1-mig. Answer the following questions for your lab notebook.
While we have set up a number of individual web servers for our site, we do not have a way of distributing the requests from clients to them automatically based on load. To do so, we must instantiate a load balancer. Load balancers will accept requests from clients on a single, anycast IP address and forward the request to the most appropriate web server on the backend based on proximity and load. In this step, we specify a backend service consisting of our two instance groups and create a load balancer with a static IP address that will forward requests to them.
The configuration requires a number of gcloud commands. The first set of commands creates a backend service on the HTTP port (webserver-backend-migs) and then adds the two instance groups to it (europe-west1-mig, us-east5-mig).
gcloud compute backend-services create webserver-backend-migs \
  --protocol=http --port-name=http --timeout=30s \
  --health-checks=instance-health-check \
  --global

gcloud compute backend-services add-backend webserver-backend-migs \
  --instance-group=europe-west1-mig --instance-group-region=europe-west1 \
  --balancing-mode=utilization --max-utilization=0.8 \
  --global

gcloud compute backend-services add-backend webserver-backend-migs \
  --instance-group=us-east5-mig --instance-group-region=us-east5 \
  --balancing-mode=rate --max-rate-per-instance=50 \
  --global
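Once both instance groups have been added, you can check that the backend service sees healthy instances (it may take a minute or two for the health checks to pass):
gcloud compute backend-services get-health webserver-backend-migs --global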
Then, we create the load balancer (also known as a URL map), point it to the backend, and create an HTTP proxy that will forward HTTP requests from the load balancer to the backend.
gcloud compute url-maps create webserver-frontend-lb \
  --default-service webserver-backend-migs

gcloud compute target-http-proxies create webserver-proxy \
  --url-map webserver-frontend-lb
Next, we allocate an IPv4 address to use and associate a forwarding rule to take incoming HTTP requests to that address and send them to the HTTP proxy we've created.
gcloud compute addresses create webserver-frontend-ip --ip-version=ipv4 --global

gcloud compute forwarding-rules create webserver-frontend-fwrule \
  --ip-protocol=tcp --ports=80 --address=webserver-frontend-ip \
  --target-http-proxy webserver-proxy \
  --global
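To retrieve the address that was allocated, and to test the site from Cloud Shell once the load balancer has finished programming (a newly created load balancer can take several minutes before it starts returning responses), something like the following works:
# Print the static IP address reserved for the frontend
gcloud compute addresses describe webserver-frontend-ip --global --format="value(address)"
# Then, substituting the printed address:
curl http://<LoadBalancerIP>/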
The configuration via the web console is fairly involved. To begin with, visit "Network services" and create a load balancer configured for HTTP load balancing, specifying that it be Internet-facing (From Internet to my VMs).
Name the load balancer webserver-frontend-lb and sequence through its steps for configuration.
Start with the Backend configuration and create a backend service that the load balancer will forward requests to:
Name the backend webserver-backend-migs. Then, within the "New backend" UI, specify the europe-west1-mig group and port 80. Click "Done" to add it.
Once added, click on "Add backend" and repeat the process for the other instance group.
In the Health Check section, select instance-health-check.
Then click "Create"
In "Frontend configuration", configure the frontend IP address for the load balancer and name it webserver-frontend-ip
, then click "Done"
Finally, click "Create" to create the load balancer.
We have now set up our scalable web site, shown below. Review the diagram to see how each component has been instantiated in the codelab.
Visit "Network Services"=>"Load Balancing", then click on the load balancer you instantiated. Visit the IP address that it handles requests on.
We will now show how our load balancer can direct traffic to our instance groups and how we can scale instance groups based on demand. From your initial Deployment Manager deployment, bring up two ssh sessions: one on w1-vm and one on eu1-vm.
On w1-vm, launch a siege on your load balancer IP address. Note that the command is configured for 250 simultaneous requests. If this load is insufficient to trigger autoscaling, you may need to increase the number of concurrent requests.
# On w1-vm
siege -c 250 http://<LoadBalancerIP>
Visit the web console. Go to "Network Services"=>"Load Balancing" and click on your load balancer (webserver-frontend-lb). Then, click on the "Monitoring" tab. In the Backend dropdown, select webserver-backend-migs as shown below and expand its details. The UI shows traffic sources by region and the backends that they are routed to by the load balancer. Since w1-vm is in the US, traffic is sent to us-east5-mig and the instance group scales up from 1 instance to 5. While those instances are being brought up, the load balancer directs excess requests over to europe-west1-mig, creating significant intercontinental traffic.
Keep this window open for 5-10 minutes as the system adapts to the load and the UI updates.
Note that eventually, traffic is handled mostly by the 5 VMs in us-east5-mig.
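You can also watch the scale-out from Cloud Shell while the siege runs; the group's target size should climb toward 5 (the format expression below assumes the targetSize field of the managed instance group):
gcloud compute instance-groups managed describe us-east5-mig --region us-east5 --format="value(targetSize)"
gcloud compute instance-groups managed list-instances us-east5-mig --region us-east5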
Stop the siege running on w1-vm. Then, go to eu1-vm and launch an identical siege on your load balancer IP address:
# On eu1-vm
siege -c 250 http://<LoadBalancerIP>
Go back to the load balancer monitoring UI and watch as it updates to show traffic now coming from the European region. For web sites, it is ideal to have clients in Europe served by servers in Europe. As the UI will eventually show, requests from eu1-vm are sent to europe-west1-mig and the total traffic shifts away from the servers in us-east5-mig. Using the anycast functionality of the load balancer, this is all done with a single IP address.
When finished, exit out of both w1-vm and eu1-vm.
We have deployed a significant amount of resources to implement our scalable web site. As a result, it is important that we clean up immediately to avoid running up charges unnecessarily.
The following set of commands deletes our load balancing setup:
gcloud compute forwarding-rules delete webserver-frontend-fwrule --global
gcloud compute target-http-proxies delete webserver-proxy
gcloud compute addresses delete webserver-frontend-ip --global
gcloud compute url-maps delete webserver-frontend-lb
gcloud compute backend-services delete webserver-backend-migs --global
Then, delete the managed instance group in us-east5.
gcloud compute instance-groups managed delete us-east5-mig --region us-east5
Do the same for the managed instance group in europe-west1.
gcloud compute instance-groups managed delete europe-west1-mig --region europe-west1
Then, delete the health checks, instance templates, and the firewall rule allowing HTTP.
gcloud compute health-checks delete instance-health-check
gcloud compute instance-templates delete us-east5-template
gcloud compute instance-templates delete europe-west1-template
gcloud compute firewall-rules delete networking-firewall-allow-http
Finally, we delete the initial deployment.
gcloud deployment-manager deployments delete networking101
Visit the web console of Compute Engine and ensure no instances remain running.
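A few final checks from Cloud Shell can also confirm that nothing billable is left behind:
gcloud compute instances list
gcloud compute instance-groups list
gcloud compute addresses list
gcloud deployment-manager deployments list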