One use for "Infrastructure-as-Code" solutions is to set up complex content-distribution networks in a consistent and repeatable manner. In this codelab, we will examine how to deploy a custom set of networks, subnetworks, virtual machines, and firewall rules that implement a scalable web site. The network schema for the lab is shown below. As the figure shows, we will create a network of our own called "networking101", then establish several subnetworks across 3 different regions. While our server infrastructure will be deployed in subnetworks located in us-east5 and europe-west1 (10.20.0.0/16 and 10.30.0.0/16 in the figure), we will also deploy client machines in a subnetwork located in the us-west1 region to stress test the servers.

Begin by launching Cloud Shell and cd into the course repository:

cd cs430-src/10_CDN/

The lab uses Terraform to set up its resources via declarative configuration files. Within this directory, several Terraform files are provided, listed below. The main configuration file (main.tf) specifies the different parts of the deployment, including virtual machines, networks, subnetworks, and firewalls, while the remaining files define variables, outputs, and the startup script the instances run. You can think of these Terraform files as libraries of predefined infrastructure configurations that you can use to create your instances.

main.tf
variables.tf
output.tf
scripts/startup.sh   # used by instances (referenced from main.tf)

The Terraform files together specify the main topology the lab will instantiate as a single deployment. We'll go through the parts in the next steps.

In order for Terraform to deploy resources into your project, it requires elevated permissions on the project (in this lab, the Compute Admin role). As a result, any credentials we issue in this lab are particularly sensitive if exposed and should be removed afterwards.

Next, create a service account for the lab via the CLI.

gcloud iam service-accounts create cdn-lab

Then, attach a policy binding that allows it full access to Compute Engine permissions for the project.

gcloud projects add-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
  --member serviceAccount:cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com \
  --role roles/compute.admin

Finally, issue a service account key that Terraform will use to access project resources, and store it in cdn-lab.json.

gcloud iam service-accounts keys create cdn-lab.json \
  --iam-account cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com

Terraform configurations are written in HCL (HashiCorp Configuration Language), a structured format similar in spirit to JSON and YAML, that declaratively specifies the platform resources to create. The first block to add specifies the cloud provider we are deploying on (Google). Within this block, we must specify the credential file we'll be using (cdn-lab.json) and the Project ID that the resources will be deployed into. Set the project to your own.

main.tf

// Configure the Google Cloud provider
provider "google" {
  credentials = file("cdn-lab.json")
  project     = "<FMI>"
}

Terraform configuration files declare infrastructure to deploy. In this section, we look at how the different resources are defined.

Resource naming

The variable below sets the name of our custom network, networking101. As we will see later, separate Terraform resources define our own subnets for the regions we deploy into. Because the network resource in main.tf uses var.network_name, this variable's value determines the actual name of the network created in Google Cloud.

variables.tf

variable "network_name" {
  type    = string
  default = "networking101"
}
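
The network resource itself lives in main.tf and references this variable for its name. A minimal sketch of what that block looks like is below; the auto_create_subnetworks setting is an assumption, since the lab declares its own subnets explicitly:

# main.tf (sketch): the custom VPC named by var.network_name
resource "google_compute_network" "networking101" {
  name                    = var.network_name
  auto_create_subnetworks = false   # assumption: we declare our own subnets below
}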

Subnetworks

Our deployment will also specify the various subnetworks. In the main Terraform configuration, we can find all of these subnetworks declared. For each subnetwork, we first assign it a name (us-west-s1, us-west-s2, etc.) and define its region and CIDR range. Each subnetwork will be attached to the custom VPC network we created earlier (networking101).

In Terraform, this is done using the google_compute_subnetwork resource block. Each subnetwork block specifies the subnetwork's name, region, IP CIDR range, and the network it attaches to:

main.tf

resource "google_compute_subnetwork" "us_west_s1" {
  name          = "us-west-s1"
  region        = "us-west1"
  ip_cidr_range = "10.10.0.0/16"
  network       = google_compute_network.networking101.self_link
  private_ip_google_access = true
}

resource "google_compute_subnetwork" "us_west_s2" {
  name          = "us-west-s2"
  region        = "us-west1"
  ip_cidr_range = "10.11.0.0/16"
  network       = google_compute_network.networking101.self_link
  private_ip_google_access = true
}

...

These subnetworks will later host the virtual machines that make up the distributed CDN topology.

Virtual machines

Finally, the deployment specifies a virtual machine to be created in each subnetwork. A snippet of the specification is shown below; each VM block sets the instance's name, machine type, zone, boot image, network interface, and startup script:

main.tf

# Debian image family used by all VMs
data "google_compute_image" "debian" {
  project = "debian-cloud"
  family  = "debian-12"
}

# Startup script used by all VMs
locals {
  startup = file("${path.module}/scripts/startup.sh")
}

# w1-vm in us-west1-b on subnetwork us-west-s1
resource "google_compute_instance" "w1_vm" {
  name         = "w1-vm"
  machine_type = "f1-micro"
  zone         = "us-west1-b"

  boot_disk {
    initialize_params { image = data.google_compute_image.debian.self_link }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.us_west_s1.self_link
    access_config {}
  }

  metadata = { "startup-script" = local.startup }
}

The snippet shows that a virtual machine named w1-vm is created as an f1-micro in us-west1-b and attached to the custom network via the us-west-s1 subnetwork (from the previous step). Because no internal network_ip is specified, Google Cloud assigns one automatically.
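
If a fixed internal address were needed instead (for example, the 10.11.0.100 that the later ping examples use for w2-vm), the network_interface block can pin one. This is a sketch rather than part of the lab's main.tf:

network_interface {
  subnetwork = google_compute_subnetwork.us_west_s2.self_link
  network_ip = "10.11.0.100"   # static internal IP within us-west-s2's 10.11.0.0/16 range
  access_config {}             # still request an ephemeral external IP
}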


Virtual machine startup script

When spinning up a virtual machine, we must initialize it so that it can serve our application. The script is set in an instance's metadata and can be found at scripts/startup.sh. The script installs the servers and tools you'll use in later steps (e.g., apache2, traceroute, siege).

startup.sh

#!/usr/bin/env bash
set -euxo pipefail

# Base networking and load-testing tools
apt-get -y update
DEBIAN_FRONTEND=noninteractive apt-get -y install traceroute mtr tcpdump iperf whois host dnsutils siege

# Website provisioning (Apache/PHP page that reports its region)
apt-get install -y apache2 php wget
cd /var/www/html
rm -f index.html index.php
wget https://storage.googleapis.com/cs430/networking101/website/index.php
META_REGION_STRING=$(curl -s "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
REGION=$(echo "$META_REGION_STRING" | awk -F/ '{print $4}')
sed -i "s|region-here|$REGION|" index.php
systemctl enable apache2
systemctl restart apache2

Within Cloud Shell, initialize the Terraform working directory (which downloads the Google provider plugin), then launch the deployment:

terraform init
terraform apply

If prompted for your project ID, enter it. Then review the resources that will be created and enter yes when prompted to approve the deployment.

After the deployment is done, answer the following questions:

Secure-by-default is an important concept in the cloud, given how many ways there are to configure systems and how many systems get deployed. Although we configured the networks and machines in the previous step, the default ingress policy inside a custom network is deny: all incoming traffic is blocked by the firewall unless we add rules that explicitly allow it.

To address this, we will extend our deployment specification with firewall rules that allow the traffic we need. Create the file below in the directory.

(create file) firewall.tf

# firewall.tf
# Requires: resource "google_compute_network" "networking101" { ... }

resource "google_compute_firewall" "networking101_allow_internal" {
  name    = "networking101-allow-internal"
  network = google_compute_network.networking101.name

  source_ranges = ["10.0.0.0/8"]

  allow {
    protocol = "tcp"
    ports    = ["0-65535"]
  }

  allow {
    protocol = "udp"
    ports    = ["0-65535"]
  }

  allow {
    protocol = "icmp"
  }
}

resource "google_compute_firewall" "networking101_allow_ssh" {
  name    = "networking101-allow-ssh"
  network = google_compute_network.networking101.name

  source_ranges = ["0.0.0.0/0"]

  allow {
    protocol = "tcp"
    ports    = ["22"]
  }
}

resource "google_compute_firewall" "networking101_allow_icmp" {
  name    = "networking101-allow-icmp"
  network = google_compute_network.networking101.name

  source_ranges = ["0.0.0.0/0"]

  allow {
    protocol = "icmp"
  }
}

As the file shows, a firewall rule named networking101-allow-internal is created, which allows all TCP, UDP, and ICMP traffic amongst the internal interfaces on the network (i.e., sources within 10.0.0.0/8). The second rule, networking101-allow-ssh, allows SSH connections from any source (0.0.0.0/0). Similarly, the last rule, networking101-allow-icmp, allows ICMP traffic from any source.

We will now update the deployment in-place. Terraform will automatically determine what needs to be updated and only re-deploy infrastructure that the new specification requires. Back in Cloud Shell, run the following in the 10_CDN directory that contains the updated Terraform file.

terraform apply
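
As a general tip, terraform plan previews the pending change set without modifying anything, so you can run it before any apply to confirm that only the new firewall rules will be added:

terraform plan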

Once the apply completes, visit the networking101 VPC network in the web UI.

Then, go back to the Compute Engine console and ssh into each VM. We will now use the sessions on each VM to perform the ping command in order to measure the latency between the different geographic regions. The table below shows the regions and their geographic locations.

Region          Location
us-west1        The Dalles, Oregon, USA
us-east5        Columbus, Ohio, USA
europe-west1    Saint Ghislain, Belgium
asia-east1      Changhua County, Taiwan

Given the locations of the regions and the physical distance between them, we can calculate the "ideal" latency between them using the speed of light and compare it against the latency observed over the network. Using the ssh sessions, perform a pairwise ping between all of the regions in which you have deployed VMs. Note that you can use either the name of the VM or its internal IP address as the argument. Examples are shown below for ping commands that perform 3 round-trip measurements to w2-vm (10.11.0.100):

ping -c 3 w2-vm
ping -c 3 10.11.0.100
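
As a rough sketch of the "ideal" calculation, take us-west1 and us-east5: the great-circle distance between The Dalles and Columbus is roughly 3,200 km (an approximation), and light in optical fiber propagates at roughly 200,000 km/s, so the round-trip floor is about:

# rough round-trip floor = 2 * distance / propagation speed (both values are approximations)
awk 'BEGIN { printf "%.0f ms\n", 2 * 3200 / 200000 * 1000 }'   # ~32 ms

Real fiber paths are longer than the great circle and add routing delay, so both the ideal estimates in the table below and your ping measurements will come in higher than this floor.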

Location pair                 Ideal latency    Measured latency
us-west1 <-> us-east5         ~45 ms
us-west1 <-> europe-west1     ~93 ms
us-west1 <-> asia-east1       ~114 ms
us-east5 <-> europe-west1     ~76 ms
us-east5 <-> asia-east1       ~141 ms
europe-west1 <-> asia-east1   ~110 ms

The current deployment has a fixed number of VM instances. When traffic is light, this can result in idle resources and significant unnecessary cost. When traffic is heavy, this can lead to congestion and a loss of users due to poor performance. In a modern deployment, infrastructure is scaled based on demand. While this scaling is automatically done in serverless deployments, with Compute Engine VMs, we can do this by creating Managed Instance Groups (us-east5-mig, europe-west1-mig below).

Instance Groups are collections of Compute Engine VMs that are derived from a single template. The number of replicas in the group is varied based on the load being experienced.

In order to route traffic so that it is evenly spread across instances in a group, we must also instantiate a Load Balancer. A load balancer acts as a reverse-proxy, taking in requests and forwarding them to a backend server for processing. To demonstrate this, we will create the two instance groups then set up a scaling policy for each to determine how they will scale out and scale in. Then, we'll instantiate a load balancer to route requests amongst the servers in the two instance groups.

In our previous steps, we added firewall rules to allow ssh and ICMP traffic from all sources. In the next steps, we wish to build a scalable web site. In order for it to be accessible externally, we must also allow HTTP traffic from all sources. Although this could be instantiated via an update to the Terraform specifications as before, we will instead demonstrate how this can be done via the gcloud CLI.

The CLI command below does this by creating a firewall rule named networking-firewall-allow-http, specifying that traffic to the HTTP port (80) from all sources (0.0.0.0/0) is allowed, and attaching it to the networking101 network. In addition, we target the rule at the http-server tag so that it applies to every VM we create to serve our site.

gcloud compute firewall-rules create networking-firewall-allow-http \
  --allow tcp:80 --network networking101 --source-ranges 0.0.0.0/0 \
  --target-tags http-server
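
For reference, a sketch of the equivalent rule if it were added to firewall.tf instead (not needed for the lab, since we create it with gcloud here):

resource "google_compute_firewall" "networking_firewall_allow_http" {
  name    = "networking-firewall-allow-http"
  network = google_compute_network.networking101.name

  source_ranges = ["0.0.0.0/0"]
  target_tags   = ["http-server"]

  allow {
    protocol = "tcp"
    ports    = ["80"]
  }
}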

In order to scale Compute Engine VMs out with replicas, we must first specify an Instance Template. In this case, we will use a per-region template to differentiate machines created in one region versus another. Similar to how VMs were specified in the Terraform template, the command below creates a template for the us-east5 region called us-east5-template. VMs created from this template are placed on the us-east5 subnetwork and have the http-server tag attached to them to allow incoming HTTP traffic.

gcloud compute instance-templates create "us-east5-template" \
  --image-project debian-cloud \
  --image-family=debian-12 \
  --machine-type e2-micro \
  --subnet "us-east5" \
  --metadata "startup-script-url=gs://cs430/networking101/website/startup.sh" \
  --region "us-east5" \
  --tags "http-server"

The command also specifies a startup script that should be executed whenever a VM instance is created (startup-script-url set in the metadata of the instance). The startup file is shown below. It installs a simple Apache/PHP server, downloads an index.php file, and obtains the zone information of the instance from the VM instance's Metadata service (See CS 495 for more on this service). It then substitutes the region information into the index.php file using the Unix stream editor command (sed).

gs://.../networking101/website/startup.sh

#! /bin/bash
apt-get update
apt-get install -y apache2 php
cd /var/www/html
rm index.html -f
rm index.php -f
wget https://storage.googleapis.com/cs430/networking101/website/index.php
META_REGION_STRING=$(curl "http://metadata.google.internal/computeMetadata/v1/instance/zone" -H "Metadata-Flavor: Google")
REGION=`echo "$META_REGION_STRING" | awk -F/ '{print $4}'`
sed -i "s|region-here|$REGION|" index.php

The PHP file is below. The 'region-here' text is replaced by the sed command and results in the PHP script showing us the region that it has been brought up in along with the IP address of the client and the hostname of the instance (to allow us to differentiate between various replicas in our deployment).

gs://.../networking101/website/index.php

<?php
  $ip = $_SERVER['REMOTE_ADDR'];
  // display it back
  echo "<h1>Networking 101 Lab</h1>";
  echo "<h2>Client IP</h2>";
  echo "Your IP address : " . $ip;
  echo "<h2>Hostname</h2>";
  echo "Server Hostname: " . php_uname("n");
  echo "<h2>Server Location</h2>";
  echo "Region and Zone: " . "region-here";
?>

Create the second instance template for the lab in europe-west1. As the command shows, it differs from the first only in its region and subnet.

gcloud compute instance-templates create "europe-west1-template" \
  --image-project debian-cloud \
  --image-family=debian-12 \
  --machine-type e2-micro \
  --subnet "europe-west1" \
  --metadata "startup-script-url=gs://cs430/networking101/website/startup.sh" \
  --region "europe-west1" \
  --tags "http-server"

Then, see that both templates have been created by listing them.

gcloud compute instance-templates list

They should also show up in the web console of Compute Engine under "Instance Templates".

Up until this point, we have mostly relied upon Terraform and the gcloud CLI to configure resources. The configuration can also be done via the web console. For subsequent steps, deploy using the gcloud commands, but view how it can be done via the web console so you can see how they map to each other.

Before we create our managed instance groups from the templates, we must first specify a health check. Health checks enable GCP to automatically detect non-functioning instances in a managed instance group so that they can be restarted, which helps ensure high availability for our service. As our application is a web site, we can define a simple HTTP check called instance-health-check on port 80 that probes every 10 seconds and declares an instance unhealthy after 3 consecutive failed checks.

A single gcloud command can create this check:

gcloud compute health-checks create http instance-health-check \
  --check-interval=10s \
  --port=80 \
  --timeout=5s \
  --unhealthy-threshold=3

Given our templates, we will first create a simple managed instance group in our European region with a fixed number of instances. The settings are below.

Name: europe-west1-mig

We can instantiate the group using the gcloud commands given below to create the group and specify its health check.

gcloud compute instance-groups managed create europe-west1-mig \
  --size 2 \
  --region europe-west1 \
  --template europe-west1-template

gcloud compute instance-groups managed update europe-west1-mig \
  --health-check instance-health-check \
  --initial-delay 120 \
  --region europe-west1

Because we want this instance group to eventually serve web requests, we "expose" port 80 and name it http. A load balancer will then route requests to this named port.

gcloud compute instance-groups set-named-ports europe-west1-mig \
  --named-ports=http:80 --region europe-west1

In Cloud Shell, we will now create the second managed instance group, this time from the us-east5 instance template. Name it us-east5-mig and set it to use autoscaling from 1 to 5 instances with the settings below:

Name: us-east5-mig

We will create the deployment via several individual gcloud commands. The first two commands create the group and specify its health check.

gcloud compute instance-groups managed create us-east5-mig \
  --size 1 \
  --region us-east5 \
  --template us-east5-template

gcloud compute instance-groups managed update us-east5-mig \
  --health-check instance-health-check \
  --initial-delay 120 \
  --region us-east5

The next command sets up autoscaling for the instance group to range from 1 to 5 replicas based on load-balancing utilization.

gcloud compute instance-groups managed set-autoscaling us-east5-mig \
  --mode on \
  --region us-east5 \
  --min-num-replicas=1 --max-num-replicas=5 \
  --cool-down-period=45 \
  --scale-based-on-load-balancing \
  --target-load-balancing-utilization=0.8

Finally, as before, we want this instance group to eventually serve web requests so we "expose" port 80 and name it http for the load balancer we will deploy.

gcloud compute instance-groups set-named-ports us-east5-mig \
  --named-ports=http:80 --region us-east5

Ensure that the groups and their associated instances have been created properly via:

gcloud compute instance-groups list

Wait a minute or two after the deployment finishes for the instances to fully initialize. Then, go to the Compute Engine web console and directly visit the web server that has been brought up within us-east5-mig via its external IP address (http://<ExternalIP>).
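
To find the external IP from Cloud Shell instead of the console, you can list the group's instances; this sketch assumes the MIG names its instances with the us-east5-mig prefix (the default base instance name):

gcloud compute instances list --filter="name~^us-east5-mig"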

You should see the output of the PHP script.

Repeat with an instance from europe-west1-mig. Answer the following questions for your lab notebook.

While we have set up a number of individual web servers for our site, we do not have a way of distributing the requests from clients to them automatically based on load. To do so, we must instantiate a load balancer. Load balancers will accept requests from clients on a single, anycast IP address and forward the request to the most appropriate web server on the backend based on proximity and load. In this step, we specify a backend service consisting of our two instance groups and create a load balancer with a static IP address that will forward requests to them.

The configuration requires a number of gcloud commands. The first set of commands creates a backend service (webserver-backend-migs) that targets the http named port, and then adds the two instance groups (europe-west1-mig, us-east5-mig) to it.

gcloud compute backend-services create webserver-backend-migs \
  --protocol=HTTP --port-name=http --timeout=30s \
  --health-checks=instance-health-check \
  --global

gcloud compute backend-services add-backend webserver-backend-migs \
  --instance-group=europe-west1-mig --instance-group-region=europe-west1 \
  --balancing-mode=utilization --max-utilization=0.8 \
  --global

gcloud compute backend-services add-backend webserver-backend-migs \
  --instance-group=us-east5-mig --instance-group-region=us-east5 \
  --balancing-mode=rate --max-rate-per-instance=50 \
  --global

Then, we create the URL map that defines the load balancer's routing, point it to the backend service, and create an HTTP proxy that forwards HTTP requests from the load balancer's frontend to the backend.

gcloud compute url-maps create webserver-frontend-lb \
  --default-service webserver-backend-migs

gcloud compute target-http-proxies create webserver-proxy \
  --url-map webserver-frontend-lb

Next, we allocate an IPv4 address to use and associate a forwarding rule to take incoming HTTP requests to that address and send them to the HTTP proxy we've created.

gcloud compute addresses create webserver-frontend-ip --ip-version=IPV4 --global

gcloud compute forwarding-rules create webserver-frontend-fwrule \
  --ip-protocol=tcp --ports=80 --address=webserver-frontend-ip \
  --target-http-proxy webserver-proxy \
  --global

We have now set up our scalable web site, shown in the diagram below. Review the diagram to see how each component has been instantiated in the codelab.

Visit "Network Services"=>"Load Balancing", then click on the load balancer you instantiated. Visit the IP address that it handles requests on.

If you get an error, you may need to wait several minutes for the load balancer to finish deploying.

Reload the page multiple times.
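
If you prefer the command line, you can fetch the frontend IP and sample the page repeatedly from Cloud Shell; this sketch assumes the index.php output shown earlier:

LB_IP=$(gcloud compute addresses describe webserver-frontend-ip --global --format="get(address)")
for i in $(seq 1 10); do
  curl -s "http://$LB_IP/" | grep -o 'Server Hostname: [^<]*'
done

The hostnames in the responses show which backend instances are serving your requests.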

We will now show how our load balancer can direct traffic to our instance groups and how we can scale instance groups based on demand. From your initial Terraform deployment, bring up two ssh sessions: one on w1-vm and one on eu1-vm.

On w1-vm, launch a siege on your load balancer IP address. Note that the command is configured for 250 simultaneous requests. If this load is insufficient to impact autoscaling, you may need to increase the number of concurrent requests.

# On w1-vm
siege -c 250 http://<LoadBalancerIP>

Visit the web console. Go to "Network Services"=>"Load Balancing" and click on your load balancer (webserver-frontend-lb). Then, click on the "Monitoring" tab. In the Backend dropdown, select webserver-backend-migs as shown below and expand its details. The UI shows traffic sources by region and the backends that the load balancer routes them to. Since w1-vm is in the US, traffic is sent to us-east5-mig and the instance group scales up from 1 instance to 5. While the new instances are still coming up, the load balancer directs some requests over to europe-west1-mig, creating significant intercontinental traffic.

Keep this window open for 5-10 minutes as the system adapts to the load and the UI updates.

Note that eventually, traffic is handled mostly by the 5 VMs in us-east5-mig.
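
You can also watch the group grow from a separate Cloud Shell tab; a sketch using the group's instance listing:

watch -n 30 gcloud compute instance-groups managed list-instances us-east5-mig --region us-east5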

Stop the siege running on w1-vm. Then, go to eu1-vm and launch an identical siege on your load balancer IP address:

# On eu1-vm
siege -c 250 http://<LoadBalancerIP>

Go back to the load balancer monitoring UI, which updates to show traffic now coming from the European region. For web sites, it is ideal to have clients in Europe be served by servers in Europe. As the UI will eventually show, requests from eu1-vm are sent to europe-west1-mig and the total traffic shifts away from the servers in us-east5-mig. Thanks to the anycast functionality of the load balancer, this happens behind a single IP address.

When finished, exit out of both w1-vm and eu1-vm.

We have deployed a significant amount of resources to implement our scalable web site. As a result, it is important that we clean up immediately to avoid running up charges unnecessarily.

The following sets of commands delete our load balancing setup:

gcloud compute forwarding-rules delete webserver-frontend-fwrule --global

gcloud compute target-http-proxies delete webserver-proxy

gcloud compute addresses delete webserver-frontend-ip --global

gcloud compute url-maps delete webserver-frontend-lb

gcloud compute backend-services delete webserver-backend-migs --global

Then, delete the managed instance groups in us-east5 and europe-west1.

gcloud compute instance-groups managed delete us-east5-mig --region us-east5

gcloud compute instance-groups managed delete europe-west1-mig --region europe-west1

Then, delete the health check, instance templates, and the firewall rule allowing HTTP.

gcloud compute health-checks delete instance-health-check

gcloud compute instance-templates delete us-east5-template

gcloud compute instance-templates delete europe-west1-template

gcloud compute firewall-rules delete networking-firewall-allow-http

Next, destroy the initial Terraform deployment.

terraform destroy

Examine the service account key credential in cdn-lab.json and find its KEY_ID (e.g., the private_key_id field). Then, delete the key by running the command below, substituting <KEY_ID> with the value found in the file.
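
Alternatively, you can list the keys for the service account and read the KEY_ID from the output:

gcloud iam service-accounts keys list \
  --iam-account cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com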

gcloud iam service-accounts keys delete <KEY_ID> \
  --iam-account cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com

Then, remove the policy binding added to the service account.

gcloud projects remove-iam-policy-binding ${GOOGLE_CLOUD_PROJECT} \
  --member serviceAccount:cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com \
  --role roles/compute.admin

Finally, delete the service account.

gcloud iam service-accounts delete cdn-lab@${GOOGLE_CLOUD_PROJECT}.iam.gserviceaccount.com

Also delete the JSON key file we created:

rm cdn-lab.json

Visit the web console of Compute Engine and ensure no instances remain running.