Code generation is one of the more useful tasks a model can do. It's difficult to trust the code it produces without having an idea of what a correct version of the code looks like. In this exercise, a simple Python class that implements a username-password authentication function using a SQLite3 database is shown. Within the class:

import sqlite3

DB_FILE = 'users.db'    # file for our Database

class Users():
    def __init__(self):
        self.connection = sqlite3.connect(DB_FILE)
        cursor = self.connection.cursor()
        try:
            cursor.execute("select count(rowid) from users")
        except sqlite3.OperationalError:
            self.initializeUsers()

    def initializeUsers(self):
        cursor = self.connection.cursor()
        cursor.execute("create table users (username text, password text)")
        self.addUser('admin','password123')

    def addUser(self, username, password):
        cursor = self.connection.cursor()
        params = {'username':username}
        cursor.execute("SELECT username FROM users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) == 0:
            params = {'username':username, 'password':password}
            cursor.execute("insert into users (username, password) VALUES (:username, :password)", params)
            self.connection.commit()
            return True
        else:
            return False

    def checkUser(self, username, password):
        params = {'username':username}
        cursor = self.connection.cursor()
        cursor.execute("select password from users WHERE username=(:username)", params)
        res = cursor.fetchall()
        if len(res) != 0:
            password_from_db = res.pop()[0]
            if password == password_from_db:
                return True
        return False

The goal of the exercise is to generate a prompt that allows an LLM to produce the code above.

For the prompt that produces the original code most closely

Unit tests that are built into a program allow one to catch code changes that may break the functionality of the application. For example, consider the code below that implements a square root.

import math

def square_root(n):
    if isinstance(n, int) and n >= 0:
        return math.sqrt(n)
    else:
        raise ValueError("Input must be a positive integer.")

To add unit tests to this code, one could utilize the unittest package in Python and add assertions that should hold on a variety of test cases. An example is shown below

class TestSquareRoot(unittest.TestCase):
    def test_zero(self):
        self.assertEqual(square_root(0), 0.0)

    def test_non_integer(self):
        with self.assertRaises(ValueError):
            square_root(4.5)
        with self.assertRaises(ValueError):
            square_root("string")
        with self.assertRaises(ValueError):
            square_root([4])

    def test_negative_integer(self):
        with self.assertRaises(ValueError):
            square_root(-1)

if __name__ == "__main__":
    unittest.main()

For our password authentication example, we wish to test the expected behavior of the code across a variety of tests to ensure correctness. For example, the code should:

While one could generate these tests manually, we will utilize an LLM to do so instead. Perform the following:

Then, run the generated password program with the unit tests implemented.

Python versions beyond 3.5 support type annotations in order to give the developer the ability to reason about data types within their programs. Adding type annotations to code written prior to this version is something that can be potentially automated by an LLM. Consider the code below that fetches a URL using the requests package, parses the page using BeautifulSoup, and then returns the page's <title> tag if it exists.

import requests
from bs4 import BeautifulSoup

def getUrlTitle(url):
    resp = requests.get(url)
    title_tag = BeautifulSoup(resp.text, 'html.parser').find('title')
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None

A fully annotated version is shown below with each parameter and return value assigned a type, along with any variable that has been utilized. In addition, the Optional type is used when the return type can be either the given type (e.g. str) or None.

import requests
from bs4 import BeautifulSoup
from typing import Optional 

def getUrlTitle(url: str) -> Optional[str]:  
    resp: requests.Response = requests.get(url)    
    resp.raise_for_status()
    soup: BeautifulSoup = BeautifulSoup(resp.text, 'html.parser')  

    title_tag: Optional[BeautifulSoup.Tag] = soup.find('title')  
    if title_tag and title_tag.text:
        return title_tag.text.strip()
    else:
        return None

With the code generated previously for the password authentication program with the unit tests included:

Examine the code for correctness and ensure it executes correctly.

The prior password program utilizes cleartext passwords in its implementation instead of a password hash of it. Unfortunately, if the system were compromised, cleartext passwords for every user would be exposed, allowing an adversary to perform credential stuffing. Given the code from the previous step:

Test the implementation.

Standards for coding such as PEP8 and static analysis checks such as those done by pylint are often useful in the development process. Automatic these checks via an LLM can reduce the labor required to enforce development rules such techniques specify. Consider a snippet of the web server code that is included in this week's lab:

simple_http_server.py

from flask import Flask, request, jsonify
import os
from werkzeug.utils import secure_filename

app = Flask(__name__)

# Ensure there's a folder to save the uploaded files
UPLOAD_FOLDER = 'uploads'
if not os.path.exists(UPLOAD_FOLDER):
    os.makedirs(UPLOAD_FOLDER)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER

@app.route('/upload', methods=['POST'])
def upload_file():
    if 'file' not in request.files:
        return jsonify({'error': 'No file part in the request'}), 400
...

It does not adhere to the PEP8 standard. Rather than manually read the rules for PEP8, instead, go to the repository and copy the program and include it in a prompt to an LLM asking it to make it compliant to PEP8.

LLMs have been successfully used to translate text from one language to another. Since programming languages are just another type of language, one potential use for LLMs is to automatically translate a program to another programming language.

Javascript

In this exercise, we'll translate our original password code written in Python into Javascript. We'll begin by asking an LLM to create a Javascript equivalent for the password program. As part of the prompt, give the LLM some additional instructions to guide its translation such as:

Using the above as a guide,

Attempt to run the code by bringing up the course VM and installing the latest Node.js version.

sudo apt update -y
sudo apt install nodejs npm -y
sudo npm install -g n
sudo n stable
hash -r

Create a directory to run the application from, and install the Javascript packages that are required.

mkdir js
cd js
npm install sqlite3

Copy the code the LLM produced into the file users.js. Then, run the code.

node users.js

Typescript

We'll attempt to repeat the exercise using Typescript instead.

Install the Typescript package

npm install ts-node

Copy the code the LLM produced into the file users.ts. Then, run the npx command to transpile the code to Javascript and execute it.

npx ts-node users.js

Regular expressions are commonly used for string matching in security applications. One of the problems in generating effective expressions, however, is being able to correctly specify what is needed. LLMs that are not specifically prompted requirements will often generate incomplete or insecure code. In this exercise, you will prompt an LLM to generate a regular expression to match the "localhost" or loopback address. This particular address is often used to set up services that are only available for internal access.

Use an LLM to generate a regular expression to match the localhost address.

Give me a regular expression that matches the localhost address

At a minimum, the result should match 127.0.0.1, the most common form of the address.

As it turns out, the entire block of 127.0.0.0/8 should match the loopback. For example, 127.0.0.53 is commonly used by the systemd-resolved service as a local DNS stub resolver. Prompt the LLM to generate a regular expression to match all localhost addresses.

Give me a regular expression that matches all localhost addresses

One problem with LLMs is that they may not understand obscure encodings. On your Kali VM, perform the following:

ping 0177.0.0.1

Prompt the LLM to generate a regular expression to match all encodings of all localhost addresses.

Give me a regular expression that matches all encodings of all localhost addresses

Examine this blog post to see if the regular expression addresses each representation. Then, prompt an LLM with the "right" way to validate the localhost address in Python that includes unit tests for all of the different encodings and versions above.