Data is often delivered from web APIs in JavaScript Object Notation (JSON) format. One tool that is helpful for quickly parsing JSON is jq. Its query syntax, however, has a learning curve that can be difficult for newcomers. In this exercise, we'll examine the ability of an LLM to analyze a JSON object and generate jq queries from natural-language descriptions.
To begin with, install jq on your VM.
sudo apt install jq
Begin by visiting the Certificate Transparency site at https://crt.sh and looking up the domain kapow.cs.pdx.edu. Doing so accesses the URL https://crt.sh/?q=kapow.cs.pdx.edu and returns the results as an HTML page. The same content can be retrieved in JSON format instead. Within your VM, use the curl command below to download a JSON version of the results to a file.
curl "https://crt.sh/?q=kapow.cs.pdx.edu&output=json" > kapow.json
The JSON returned has a schema that you would normally need to examine and understand before being able to write a jq command that pulls specific information out of it. Upload the JSON output to an LLM, or paste it into a prompt, and ask the LLM to generate a jq command that prints the certificate ID and common name of every entry.
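It can also help to peek at the schema yourself before prompting. The sketch below, run on a small hypothetical stand-in for kapow.json, shows one way to list the field names of the first entry (crt.sh returns a JSON array of objects):

```shell
# Inspect the schema: list the keys of the first element of the array.
# The input here is a hypothetical stand-in for the downloaded kapow.json.
printf '[{"id":1,"common_name":"kapow.cs.pdx.edu","not_after":"2024-01-01"}]' |
jq '.[0] | keys'
```

On the real file, substitute `kapow.json` for the piped sample (e.g. `jq '.[0] | keys' kapow.json`).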
Run the jq command that the LLM outputs on the JSON.
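For reference, a command of the kind an LLM might produce is sketched below. It assumes (as is the case for crt.sh's output) that each entry in the array carries `id` and `common_name` fields; verify this against the file you actually downloaded. It is demonstrated here on a small stand-in for kapow.json:

```shell
# Print the certificate ID and common name of every entry, tab-separated.
# The piped input is a hypothetical stand-in for the downloaded kapow.json.
printf '[{"id":1,"common_name":"kapow.cs.pdx.edu"},
         {"id":2,"common_name":"kapow.cs.pdx.edu"}]' |
jq -r '.[] | "\(.id)\t\(.common_name)"'
```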
Now, ask the LLM to output the certificate ID and common name of every entry directly, without using jq.
Next, with the help of an LLM, produce a list of all unique domain names that crt.sh reports for the domain cs.pdx.edu.
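One shape such an answer might take is sketched below. It relies on the fact that crt.sh lists the names for each certificate in the `name_value` field, one per line within the string; confirm this against the actual output. Against the live service this would look something like `curl -s "https://crt.sh/?q=cs.pdx.edu&output=json" | jq -r '.[].name_value' | sort -u`; it is demonstrated here on a small stand-in for the downloaded JSON:

```shell
# Flatten every name_value field (newline-separated names) and de-duplicate.
# The piped input is a hypothetical stand-in for the crt.sh JSON output.
printf '[{"name_value":"a.cs.pdx.edu\\nb.cs.pdx.edu"},
         {"name_value":"a.cs.pdx.edu"}]' |
jq -r '.[].name_value' | sort -u
```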
One of the benefits of chatbots is that they can provide a more natural interface to search engines. Search engines such as Google come with a rich set of features that advanced users can leverage to zero in on specific content. A summary can be found here. Examples include:
- Exact phrases ("We the people" to search for documents with this phrase)
- File types (filetype:pdf for PDF files, ext:txt for files with a txt filename extension)
- Specific sites or platforms (site:pdx.edu for documents on pdx.edu domains, @twitter for content on the social media platform Twitter)
- URLs containing specific text (inurl:security), pages with titles containing specific text (intitle:security), or pages with specific text (intext:disallow)
- Exclusions (-filetype:pdf removes all results that are PDF files)
- Boolean combinations ((psychology | computer science) & design for sites that match psychology design or computer science design)

Use an LLM to see if it can perform the same function as Google dork generators by having it generate the dorks below.
"VNC Desktop" inurl:5800
Search index restriction files (robots.txt/robot.txt) indicating sensitive directories that should be disallowed.
(inurl:"robot.txt" | inurl:"robots.txt" ) intext:disallow filetype:txt
inurl:phpmyadmin site:*.pdx.edu
SQL database backup files that have been left on a public web server
intitle:"index of" "database.sql.zip"
intitle:"index of" passwords
" -FrontPage-" ext:pwd inurl:(service | authors | administrators | users)
"Not for Public Release" ext:pdf
One of the benefits of using an LLM is its ability to use its broad knowledge base to explain code and commands that a particular user may not understand. Consider the following set of commands for configuring rules using iptables, a network firewall tool for Linux.
iptables -A INPUT -i eth0 -p tcp -m multiport --dports 22,80,443 -m state --state NEW,ESTABLISHED -j ACCEPT
iptables -A OUTPUT -o eth0 -p tcp -m multiport --sports 22,80,443 -m state --state ESTABLISHED -j ACCEPT
Prompt an LLM for a concise summary of the two rules that the commands create.
Then, prompt an LLM for a prompt that could reproduce the commands exactly. Using that prompt as a basis, create a prompt that reproduces the above commands verbatim.
Google Cloud's Compute Engine service allows one to set up virtual machines configured with a variety of operating systems and network configurations. As we have done previously for the WFP1 VM at the beginning of this lab, this can be done via the command-line interface provided by the Google Cloud SDK and its gcloud command.
gcloud compute firewall-rules create pdx-80 \
  --allow=tcp:80 --source-ranges="131.252.0.0/16" \
  --target-tags=pdx-80 --direction=INGRESS

gcloud compute instances create wfp1-vm \
  --machine-type e2-micro --zone us-west1-b \
  --image-project pdx-cs \
  --image wfp1-nofilter

gcloud compute instances add-tags wfp1-vm --tags=pdx-80 --zone us-west1-b
Consider the following nginx configuration for the web server at https://mashimaro.cs.pdx.edu.
server {
    server_name mashimaro.cs.pdx.edu;
    root /var/www/html/mashimaro;
    index index.html;

    location / {
        try_files $uri $uri/ =404;
    }

    listen 443 ssl;
    ssl_certificate /etc/letsencrypt/live/mashimaro/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/mashimaro/privkey.pem;
    include /etc/letsencrypt/options-ssl-nginx.conf;
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
}
server {
    if ($host = mashimaro.cs.pdx.edu) {
        return 301 https://$host$request_uri;
    }

    server_name mashimaro.cs.pdx.edu;
    listen 80;
    return 404;
}
Prompt an LLM for a concise line-by-line summary of the configuration above.
Then, prompt an LLM for a prompt that could reproduce the configuration exactly. Using that prompt as a basis, create a prompt that reproduces the above configuration verbatim.
Terraform and other infrastructure-as-code solutions provide a way of declaratively defining infrastructure that can then be deployed in a reliable, reproducible manner. Consider the Terraform specification file below that deploys a single virtual machine on Google Cloud Platform.
provider "google" {
  credentials = file("tf-lab.json")
  project     = "YOUR_PROJECT_ID"
  region      = "us-west1"
}

resource "google_compute_address" "static" {
  name = "ipv4-address"
}

resource "google_compute_instance" "default" {
  name         = "tf-lab-vm"
  machine_type = "e2-medium"
  zone         = "us-west1-b"

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-2204-jammy-v20240501"
    }
  }

  network_interface {
    network = "default"
    access_config {
      nat_ip = google_compute_address.static.address
    }
  }
}

output "ip" {
  value = google_compute_instance.default.network_interface.0.access_config.0.nat_ip
}
Prompt an LLM for a concise line-by-line summary of the configuration above.
Then, prompt an LLM for a prompt that could reproduce the configuration exactly. Using that prompt as a basis, create a prompt that reproduces the above configuration verbatim.
Docker containers, which provide lightweight, isolated operating-system environments, are often used to deploy services in the cloud. Containers are instantiated from images that are specified and built from a Dockerfile. For beginners, parsing such a file can be difficult, and an LLM can potentially aid in understanding it. Below is a Dockerfile for a multi-stage container build.
FROM python:3.5.9-alpine3.11 as builder
COPY . /app
WORKDIR /app
RUN pip install --no-cache-dir -r requirements.txt && \
    pip uninstall -y pip && \
    rm -rf /usr/local/lib/python3.5/site-packages/*.dist-info README
FROM python:3.5.9-alpine3.11
COPY --from=builder /app /app
COPY --from=builder /usr/local/lib/python3.5/site-packages/ /usr/local/lib/python3.5/site-packages/
WORKDIR /app
ENTRYPOINT ["python3","app.py"]
Prompt an LLM for a concise summary of the configuration above.
Then, prompt an LLM for a prompt that could reproduce the configuration exactly. Using that prompt as a basis, create a prompt that reproduces the above configuration verbatim.
Kubernetes is a system for declaratively specifying infrastructure, deploying it, and maintaining its operation, often using containers and container images. Below is a simple configuration for a web application.
apiVersion: v1
kind: ReplicationController
metadata:
  name: guestbook-replicas
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: guestbook
        tier: frontend
    spec:
      containers:
      - name: guestbook-app
        image: gcr.io/YOUR_PROJECT_ID/gcp_gb
        env:
        - name: PROCESSES
          value: guestbook
        - name: PORT
          value: "8000"
        ports:
        - containerPort: 8000
---
apiVersion: v1
kind: Service
metadata:
  name: guestbook-lb
  labels:
    app: guestbook
    tier: frontend
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 8000
  selector:
    app: guestbook
    tier: frontend
Prompt an LLM for a concise summary of the configuration above.
Then, prompt an LLM for a prompt that could reproduce the configuration exactly. Using that prompt as a basis, create a prompt that reproduces the above configuration verbatim.