06.1: Code Generation

Code generation is one of the more useful tasks a model can do. It's difficult to trust the code it produces without having an idea of what a correct version of the code looks like. In this exercise, a simple Python class that implements a username-password authentication function using a SQLite3 database is shown. Within the class:

A connection is created to a SQLite3 database stored in the file 'users.db' within the class constructor.
If the database does not exist or doesn't contain a users table, a call to the initilizeUsers() method of the class is performed which creates the users table with text fields: username and password. It then calls the addUser() method to add the admin username with the password of 'password123'
An addUser() method is implemented that takes a username and a password and inserts them into the database if the username does not exist in the database.
A checkUser() method is implemented that takes a username and password, retrieves the password for the username from the database, then checks it against the given password. The method returns True if they match, False otherwise.

import sqlite3

DB_FILE = 'users.db'    # file for our Database

class Users():
    def __init__(self):
        self.connection = sqlite3.connect(DB_FILE)
        cursor = self.connection.cursor()
        try:
            cursor.execute("select count(rowid) from users")
        except sqlite3.OperationalError:
            self.initializeUsers()

    def initializeUsers(self):
        cursor = self.connection.cursor()
        cursor.execute("create table users (username text, password text)")
        self.addUser('admin','password123')

    def addUser(self, username, password):
        cursor = self.connection.cursor()
        params = {'username':username}
        cursor.execute("SELECT username FROM users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) == 0:
            params = {'username':username, 'password':password}
            cursor.execute("insert into users (username, password) VALUES (:username, :password)", params)
            self.connection.commit()
            return True
        else:
            return False

    def checkUser(self, username, password):
        params = {'username':username}
        cursor = self.connection.cursor()
        cursor.execute("select password from users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) != 0:
            password_from_db = res.pop()[0]
            if password == password_from_db:
                return True
        return False

The goal of the exercise is to generate a prompt that allows an LLM to produce the code above.

Prompt an LLM to generate a prompt that can produce the code above
Then, in a new chat session, take the generated prompt and ask the LLM to produce the code
Analyze the differences between the code snippets and modify the prompt as needed to produce code as close to the original as possible

For the prompt that produces the original code most closely

Take a screenshot of the prompt and resulting code that includes your OdinId

Unit tests that are built into a program allow one to catch code changes that may break the functionality of the application. For example, consider the code below that implements a square root.

import math

def square_root(n):
    if isinstance(n, int) and n >= 0:
        return math.sqrt(n)
    else:
        raise ValueError("Input must be a positive integer.")

To add unit tests to this code, one could utilize the unittest package in Python and add assertions that should hold on a variety of test cases. An example is shown below

class TestSquareRoot(unittest.TestCase):
    def test_zero(self):
        self.assertEqual(square_root(0), 0.0)

    def test_non_integer(self):
        with self.assertRaises(ValueError):
            square_root(4.5)
        with self.assertRaises(ValueError):
            square_root("string")
        with self.assertRaises(ValueError):
            square_root([4])

    def test_negative_integer(self):
        with self.assertRaises(ValueError):
            square_root(-1)

if __name__ == "__main__":
    unittest.main()

For our password authentication example, we wish to test the expected behavior of the code across a variety of tests to ensure correctness. For example, the code should:

Ensure the default admin user is created with the password 'password123'
Ensures an account that already exists can not be created again
Ensures that one can properly add a new username and password and that they are properly returned from the database when subsequently queried.
Ensures that a username and password pair that is given, is properly checked when given combinations of correct and incorrect values.

While one could generate these tests manually, we will utilize an LLM to do so instead. Perform the following:

Prompt an LLM to instrument the password program with unit tests that can be run to ensure the code's correctness that include the above tests
Ensure the unit tests provide sufficient coverage for the cases described

Then, run the generated password program with the unit tests implemented.

Take a screenshot of the code for the unit tests and the results produced when running them that includes your OdinId

Python versions beyond 3.5 support type annotations in order to give the developer the ability to reason about data types within their programs. Adding type annotations to code written prior to this version is something that can be potentially automated by an LLM. Consider the code below that fetches a URL using the requests package, parses the page using BeautifulSoup, and then returns the page's <title> tag if it exists.

import requests
from bs4 import BeautifulSoup

def getUrlTitle(url):
    resp = requests.get(url)
    title_tag = BeautifulSoup(resp.text, 'html.parser').find('title')
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None

A fully annotated version is shown below with each parameter and return value assigned a type, along with any variable that has been utilized. In addition, the Optional type is used when the return type can be either the given type (e.g. str) or None.

import requests
from bs4 import BeautifulSoup
from typing import Optional 

def getUrlTitle(url: str) -> Optional[str]:  
    resp: requests.Response = requests.get(url)    
    resp.raise_for_status()
    soup: BeautifulSoup = BeautifulSoup(resp.text, 'html.parser')  

    title_tag: Optional[BeautifulSoup.Tag] = soup.find('title')  
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None

With the code generated previously for the password authentication program with the unit tests included:

Prompt an LLM to generate a fully type-annotated version of the program

Examine the code for correctness and ensure it executes correctly.

Take a screenshot of the successful execution of the generated code as well as the annotated code produced for one of the functions that includes your OdinId

The prior password program utilizes cleartext passwords in its implementation instead of a password hash of it. Unfortunately, if the system were compromised, cleartext passwords for every user would be exposed, allowing an adversary to perform credential stuffing. Given the code from the previous step:

Prompt an LLM to convert the program to into one that uses PBKDF2 with SHA-256 using 100,000 iterations to store hashes into the database rather than cleartext passwords
After generating the version, have the LLM produce a unit test that validates the implementation using a username of admin and password of your choice.

Test the implementation.

Take a screenshot of the successful execution of the modified password storage and testing code as well as the results of successful program execution that includes your OdinId

Standards for coding such as PEP8 and static analysis checks such as those done by pylint are often useful in the development process. Automatic these checks via an LLM can reduce the labor required to enforce development rules such techniques specify. Consider a snippet of the web server code that is included in this week's lab:

simple_http_server.py

from flask import Flask, request, jsonify
import os
from werkzeug.utils import secure_filename

app = Flask(__name__)

# Ensure there's a folder to save the uploaded files
UPLOAD_FOLDER = 'uploads'
if not os.path.exists(UPLOAD_FOLDER):
    os.makedirs(UPLOAD_FOLDER)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

@app.route('/upload', methods=['POST'])
def upload_file():
    if 'file' not in request.files:
        return jsonify({'error': 'No file part in the request'}), 400
...

It does not adhere to the PEP8 standard. Rather than manually read the rules for PEP8, instead, go to the repository and copy the program and include it in a prompt to an LLM asking it to make it compliant to PEP8.

Take a screenshot of the modifications made to the code that bring it into compliance that includes your OdinId

LLMs have been successfully used to translate text from one language to another. Since programming languages are just another type of language, one potential use for LLMs is to automatically translate a program to another programming language.

Javascript

In this exercise, we'll translate our original password code written in Python into Javascript. We'll begin by asking an LLM to create a Javascript equivalent for the password program. As part of the prompt, give the LLM some additional instructions to guide its translation such as:

Utilize the sqlite3 module
Add test cases to validate correctness and ensure they run serially
Log all calls to the console and include the calling parameters

Using the above as a guide,

Prompt an LLM to convert the password program from Python to Javascript
Check that the program is correct

Attempt to run the code by bringing up the course VM and installing the latest Node.js version.

sudo apt update -y
sudo apt install nodejs npm -y
sudo npm install -g n
sudo n stable
hash -r

Create a directory to run the application from, and install the Javascript packages that are required.

mkdir js
cd js
npm install sqlite3

Copy the code the LLM produced into the file users.js. Then, run the code.

node users.js

Take a screenshot of the successful execution of the program that includes your OdinId

Typescript

We'll attempt to repeat the exercise using Typescript instead.

Ask an LLM to convert the password program from Python to Typescript

Install the Typescript package

npm install ts-node

Copy the code the LLM produced into the file users.ts. Then, run the npx command to transpile the code to Javascript and execute it.

npx ts-node users.js

Take a screenshot of the successful execution of the program that includes your OdinId

Regular expressions are commonly used for string matching in security applications. One of the problems in generating effective expressions, however, is being able to correctly specify what is needed. LLMs that are not specifically prompted requirements will often generate incomplete or insecure code. In this exercise, you will prompt an LLM to generate a regular expression to match the "localhost" or loopback address. This particular address is often used to set up services that are only available for internal access.

Use an LLM to generate a regular expression to match the localhost address.

Give me a regular expression that matches the localhost address

At a minimum, the result should match 127.0.0.1, the most common form of the address.

Take a screenshot of the generated expression that includes your OdinId

As it turns out, the entire block of 127.0.0.0/8 should match the loopback. For example, 127.0.0.53 is commonly used by the systemd-resolved service as a local DNS stub resolver. Prompt the LLM to generate a regular expression to match all localhost addresses.

Give me a regular expression that matches all localhost addresses

Take a screenshot of the generated expression that includes your OdinId

One problem with LLMs is that they may not understand obscure encodings. On your Kali VM, perform the following:

ping 0177.0.0.1

Prompt the LLM to generate a regular expression to match all encodings of all localhost addresses.

Give me a regular expression that matches all encodings of all localhost addresses

Take a screenshot of the generated expression that includes your OdinId and list the encodings and the IP address versions that are covered by it.

Examine this blog post to see if the regular expression addresses each representation. Then, prompt an LLM with the "right" way to validate the localhost address in Python that includes unit tests for all of the different encodings and versions above.

Take a screenshot of the prompt utilized and the Python code that is produced.