In this lab, you will use tools in conjunction with LLMs to automate the detection of vulnerable services and vulnerable source code, then determine whether the results are accurate. To begin, change into the code directory for the exercises and install the packages.
cd cs410g-src
git pull
cd 06*
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt
LLM agents can decide which tools to use based on the context they receive. In this exercise, an agent will use two custom tools to solve a PortSwigger password-authentication level. Go to: https://portswigger.net/web-security/authentication/password-based/lab-username-enumeration-via-different-responses
This is the first level in the password-based authentication challenges found on PortSwigger Academy. Click "Access the Lab" to start the lab.
To run the program:
python3 01_auth_breaker.py
The program will display the two tools it has registered. Now, to use them, fill in the prompt below with the URL of your level.
I want to login to the website with a base url of <YOUR_LEVEL_URL>
The agent will first call the find_login_page tool:
@tool("find_login_page", return_direct=False)
def find_login_page(base_url):
    """The function will try to find the login page url"""
    loader = RecursiveUrlLoader(
        url=base_url,
        max_depth=2,
    )
    docs = loader.load()
    login_page = None
    for doc in docs:
        # Check each crawled document's source URL for a login page
        candidate = doc.metadata["source"]
        if login_url(candidate):
            login_page = candidate
            break
    return login_page
This tool uses the RecursiveUrlLoader introduced in the RAG section of the course to locate a webpage whose URL contains the string "login". It also checks whether any of the returned links yield redirects:
def check_redirects(url):
    """
    Checks a URL for redirects and returns the final URL if it is a login page
    """
    try:
        # Send a request to the URL with allow_redirects set to True
        response = requests.get(url, allow_redirects=True)
        # Check if any redirection has occurred
        if response.history:
            # If redirected, check the final url
            if login_url(response.url):
                return response.url
    except requests.RequestException:
        # Skip URLs that are unreachable or malformed
        pass
    return None
If redirects are returned, it checks those as well. After scanning the website's pages for a login page, the tool returns the URL it finds.
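Both tools call a login_url helper that is not shown above. Based on the behavior described (matching URLs that contain the string "login"), a minimal sketch might look like the following; the exact implementation in the lab code may differ:

```python
def login_url(url):
    """Return True if the URL looks like a login page.

    Hypothetical sketch: the lab's actual helper may use a stricter match.
    """
    return "login" in url.lower()
```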
The ReAct agent will then call the next tool, get_creds:
@tool("get_creds", return_direct=False)
def get_creds(login_url):
    """Given the login page url the function will find the credentials needed to login"""
This function loads a password list and a username list that were provided by the Portswigger level.
password_lines = open("./data/auth-lab-passwords","r").readlines()
username_lines = open("./data/auth-lab-usernames","r").readlines()
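Note that readlines() keeps the trailing newline on each entry, so the lists need stripping before the values can be submitted in a request. A small helper (a sketch, not part of the lab code) could do this:

```python
def load_wordlist(path):
    """Read a wordlist file, stripping newlines and skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]
```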
It then uses the Python requests library to check whether each candidate username/password pair is correct.
If the correct pair is found, it logs in to the website and solves the level. The agent then produces a final answer containing the correct username and password.
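The enumeration loop itself can be sketched as follows, assuming a hypothetical try_login callable that POSTs a candidate pair to the login form (e.g., with requests.post) and returns True on success; the lab's actual code may structure this differently:

```python
import itertools

def find_valid_creds(usernames, passwords, try_login):
    """Try username/password pairs until try_login reports success.

    try_login is an assumed callable that submits the pair to the login
    form and returns True when the response indicates a successful login.
    """
    for user, pwd in itertools.product(usernames, passwords):
        if try_login(user, pwd):
            return user, pwd
    return None
```

In this particular level, the site returns different error messages for invalid usernames versus invalid passwords, so the search can be split into two passes (find the username first, then the password), which is far cheaper than the full cross product sketched above.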
While it may be tempting to use an LLM to perform vulnerability analysis, special-purpose tools are often more appropriate, both in accuracy and in cost. One such tool for performing Static Application Security Testing (SAST) to identify vulnerable Python code is Bandit. Bandit processes Python code, builds an abstract syntax tree (AST) from it, and then runs the appropriate plugins against the AST nodes to identify problematic code snippets. Once Bandit has finished scanning all the files, it generates a report. In this exercise, Bandit is used to analyze a repository to find files with potentially vulnerable code. The summary is then fed to the LLM to automatically generate a patch for each vulnerable file.
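As an illustration of what Bandit's AST plugins detect, the snippet below (an example only, not part of the lab code) contains a classic finding: a shell command built from unsanitized input and run with shell=True, which Bandit's shell-injection check reports at high severity when the command string is not a literal:

```python
import subprocess

def run_listing(user_input):
    # Building a shell command by string concatenation lets an attacker
    # inject arbitrary commands (e.g., user_input = "; rm -rf /").
    # Bandit flags subprocess calls with shell=True on non-literal strings.
    return subprocess.run("ls " + user_input, shell=True,
                          capture_output=True, text=True)
```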
To do so, a program is provided that clones an arbitrary repository and then runs Bandit with flags that restrict the report to vulnerabilities Bandit rates as high severity with high confidence. The function below runs the tool and asks the LLM to summarize its findings, including the line numbers at which the vulnerability appears in each vulnerable file.
def bandit_find_high_severity_files(repo_path):
    # Run Bandit recursively, reporting only high-confidence, high-severity findings
    result = subprocess.run(
        ["bandit", "-r", repo_path, "--confidence-level", "high", "--severity-level", "high"],
        capture_output=True,
        text=True
    )
    bandit_results = result.stdout
    prompt = f"Analyze the results from the Bandit vulnerability scan and return a list of files with high confidence, high severity vulnerabilities in them. For each, include the line numbers they occur in:\n\n{bandit_results}"
    response = llm.invoke(prompt)
    return response.content
One use for Bandit's analysis is to help generate patches for vulnerable files. Consider the code below, which runs the vulnerability analysis on one of the files found in the previous step, then feeds the results along with the file's contents to an LLM to generate a patch.
def patch_file(repo_path):
    # Run Bandit on a single file to obtain its vulnerability report
    result = subprocess.run(
        ["bandit", repo_path],
        capture_output=True,
        text=True
    )
    bandit_results = result.stdout
    file_content = open(repo_path, "r", encoding="utf-8").read()
    prompt = f"You are a skilled patch generator that takes a program from a file and a description of its vulnerabilities and then produces a patch for the program in diff format that fixes the problems in the description.\n\n The contents of the program file are: \n {file_content}\n\n The description of the issues in it are: \n {bandit_results}"
    response = llm.invoke(prompt)
    return response.content
Run the program and point it to the course repository.
python3 02_bandit_patch.py
Select one of the files and have the program generate a patch for it.