Because smart fuzzing can find software errors quickly, it has become increasingly prevalent in software development and testing as a way to secure the programs, libraries, and operating systems that we rely upon. This lab walks you through exercises that use such tools to identify and correct some of the most common and devastating software errors.

What you will build

You will deploy a Compute Engine instance, install Docker, build a Docker container image that has AFL installed, download the AFL exercises (based on Thales Security's excellent tutorial), and use AFL to find vulnerabilities in them including the Heartbleed bug.

What you'll learn

What you'll need

Install Ubuntu 22.04 VM

Install Docker on the VM

sudo apt update
sudo apt install -y docker.io
sudo usermod -a -G docker $(whoami)
sudo su -c "echo kernel.core_pattern=core >> /etc/sysctl.conf"
echo core | sudo tee /proc/sys/kernel/core_pattern
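
The two core_pattern commands above matter because, by default, afl-fuzz refuses to start when the kernel pipes core dumps to an external handler (a pattern that begins with "|"). A minimal sketch for checking the current value follows; the helper name core_pattern_ok is made up for illustration and is not part of AFL.

```shell
# Hypothetical helper (not part of AFL): succeeds when the given
# core_pattern value does not pipe crashes to an external program.
core_pattern_ok() {
    case "$1" in
        '|'*) return 1 ;;
        *)    return 0 ;;
    esac
}

pattern_file=/proc/sys/kernel/core_pattern
if [ -r "$pattern_file" ]; then
    if core_pattern_ok "$(cat "$pattern_file")"; then
        echo "core_pattern is suitable for AFL"
    else
        echo "core_pattern still pipes to a handler; rerun the tee command"
    fi
fi
```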

(Important) Log out and log back in to the VM so that the new docker group membership takes effect

Note: Derived from Thales E-security AFL training

Retrieve course files

git clone https://github.com/wu4f/cs492-src
cd cs492-src/afl

Examine Dockerfile

Read the Dockerfile to see what is included, then set the password for the fuzzer account in the container (to be used when you run sudo).

FROM ubuntu:22.04

# Originally from Michael Macnair
LABEL maintainer="cs492"

# Users
RUN useradd --create-home --shell /bin/bash fuzzer

# AFL + Deps
USER root
RUN apt update && apt upgrade -y
RUN DEBIAN_FRONTEND=noninteractive apt install -y clang llvm-dev git build-essential curl vim nano libssl-dev screen cgroup-tools sudo gcc-multilib gcc gdb tmux afl++

# For sudo for ASAN:
RUN usermod -aG sudo fuzzer
USER fuzzer
WORKDIR /home/fuzzer
COPY . cs492-afl

# See the README - this password is visible to anyone with access to the image
USER root
RUN echo "fuzzer:`cat cs492-afl/password.txt`" | chpasswd && rm cs492-afl/password.txt
RUN chown -R fuzzer:fuzzer cs492-afl
USER fuzzer

Set the password for fuzzer user and build container

Build a local image named cs492-afl

echo "cs492592" > password.txt
docker build . -t cs492-afl

Run container

Run the container in privileged mode (--privileged). The -d flag detaches the container and -i keeps its standard input open so that the shell stays up. Name the container afl for use in subsequent commands.

docker run --privileged --network host --name afl -di cs492-afl

Examine the running container

List the running containers:

docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                     PORTS               NAMES
3d09abf5ec45        afl                 "/bin/bash"         6 minutes ago       Up 2 seconds                                   afl

Examine container image

List the local images:

docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
cs492-afl           latest              309344bef881        4 months ago        919MB

Stop the container

docker stop afl
docker ps -a

Start the container

Start the container by name:

docker start afl

Execute an interactive shell on the container

docker exec -it afl /bin/bash

Launch tmux

tmux lets you run multiple shells inside the container

tmux

In this lab, you will use the compiler's address sanitization feature (ASAN) to get accurate information about memory corruption bugs in a program. The program here contains a trivial buffer overflow. View its output with sanitization off and then with it on.

Compile and run without address sanitization

View the output of the program when run with a benign input:

cd ~/cs492-afl/01_asan
afl-gcc -m32 -o vulnerable vulnerable.c
echo hi | ./vulnerable

View the output using a longer input

echo hihihi | ./vulnerable

Compile and run with address sanitization

View the output of the program when run with a benign input:

AFL_USE_ASAN=1 afl-gcc -m32 -o vulnerable vulnerable.c
echo hi | ./vulnerable

Analyze results

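Answer the following questions by reading the AddressSanitizer report that the program prints (rerun it with the longer input if the benign input does not produce one).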

  1. What is the type of the offending operation?
  2. What is the size of the offending write?
  3. Which area of memory does the error occur in?
  4. Show the source line in vulnerable.c where the error occurs.
  5. Show the source line of the error and propose a fix that takes into account null termination
  6. Recompile and re-run example with address sanitization

With the AFL compiler performing the instrumentation, the AFL fuzzer can be used to automatically find memory corruption errors that lead to crashes. Note that the AFL fuzzer requires that your window be resized to at least 80x25. In this level, you will run the fuzzer on a similar program that has a buffer overflow vulnerability.

Compile and fuzz program

Build the binary:

cd ~/cs492-afl/02_scanf_fuzz
afl-gcc -m32 -o vulnerable vulnerable.c

Then, create a directory for seeding initial input and a directory for output and run the fuzzer on the binary.

mkdir inputs outputs
afl-fuzz -i inputs -o outputs -- ./vulnerable
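
afl-fuzz needs at least one seed file in the input directory before it will start. If this level does not already ship one, a minimal sketch like the following creates it before the afl-fuzz command is run; the file name seed and its contents are arbitrary choices, not something the lab prescribes.

```shell
# Create the directories if they are missing (mkdir -p tolerates existing ones).
mkdir -p inputs outputs

# Add a small seed only when the input directory is empty.
# "seed" and "hi" are arbitrary; any short benign input would do.
if [ -z "$(ls -A inputs)" ]; then
    printf 'hi\n' > inputs/seed
fi

ls inputs
```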

As soon as a single crash is found, press Ctrl-C to stop the fuzzer.

Save the output, then run the fuzzing step a second time.
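
One way to save a run before repeating it is to copy the crash files aside and hex-dump them later. The sketch below assumes the outputs/default/crashes layout used in this lab; the helper names and the run1 directory are hypothetical, and od stands in for xxd when xxd is not installed.

```shell
# Hypothetical helper: copy crash files (named id*) from one directory to another.
save_crashes() {
    mkdir -p "$2"
    for f in "$1"/id*; do
        if [ -e "$f" ]; then cp "$f" "$2"/; fi
    done
}

# Hypothetical helper: hex-dump every crash file in a directory.
dump_crashes() {
    for f in "$1"/id*; do
        [ -e "$f" ] || continue
        echo "== $f"
        if command -v xxd >/dev/null 2>&1; then
            xxd "$f"
        else
            od -An -tx1 -v "$f"
        fi
    done
}

# Example: keep the first run under run1/, fuzz again, then compare.
# save_crashes outputs/default/crashes run1
# dump_crashes run1
# dump_crashes outputs/default/crashes
```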

Analyze results

  1. Show the crashing input in outputs/default/crashes/id* using xxd
  2. Compare the outputs of the two runs and explain why they would be different.
  3. Show the source line that contains the vulnerability.

In this level, you will use the fuzzer to find another vulnerability. The level includes a Makefile.

Compile and fuzz program

Build program via make

cd ~/cs492-afl/03_fmt_fuzz
make

Then, run the fuzzer:

afl-fuzz -i inputs -o outputs -- ./vulnerable

As soon as a single crash is found, press Ctrl-C to stop the fuzzer.

Analyze results

  1. Show the crashing input in outputs/default/crashes/id* using xxd
  2. Show the source line that contains the vulnerability.
  3. Explain how the specific crashing input found by AFL causes the vulnerability to be exploited.
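
Assuming, as the directory name 03_fmt_fuzz suggests, that this level involves a format string, the shell's own printf offers a harmless analogy: input used as the format is interpreted, while input passed as an argument is printed literally. This is only an illustration and is not the code in vulnerable.c.

```shell
# Attacker-controlled text containing conversion specifiers.
input='%d %d'

# Unsafe pattern: the input itself is the format string, so the
# specifiers are interpreted (missing arguments are treated as 0).
printf "$input"
echo

# Safe pattern: a fixed format with the input passed as data.
printf '%s\n' "$input"
```

In C the same mistake is more serious, because specifiers with no matching argument make the function read (or, with %n, write) memory that the programmer never intended to expose.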

Fuzzers instrument paths in a program in order to zero in on problematic input. In this level, two programs (path_based.c and strcmp_based.c) crash on the same input.

One uses a library call that is opaque to the fuzzer, while the other implements the logic within its own source code. Fuzzing finds the crashing input quickly in one of them because its logic is split across many instrumented code paths, but not in the other. Given what you know about the AFL fuzzer, which one might the fuzzer have difficulty with?
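
The difference in feedback can be pictured with a toy model. The sketch below is not the lab's code: the secret string and both functions are hypothetical, and the "branches" count only imitates what an instrumented binary would report to the fuzzer.

```shell
# Toy model only: "secret" and both functions are hypothetical.
secret="crash"

# Models a check written out byte by byte in instrumented code: prints how
# many leading characters match, i.e. how many distinct branches are reached.
branches_reached() {
    input=$1
    depth=0
    while [ "$depth" -lt "${#secret}" ]; do
        # cut -c is 1-indexed
        want=$(printf '%s' "$secret" | cut -c "$((depth + 1))")
        got=$(printf '%s' "$input" | cut -c "$((depth + 1))")
        [ "$want" = "$got" ] || break
        depth=$((depth + 1))
    done
    echo "$depth"
}

# Models a comparison inside an uninstrumented library: the fuzzer only
# learns whether the whole input matched.
opaque_result() {
    if [ "$1" = "$secret" ]; then echo 1; else echo 0; fi
}

for guess in "xxxxx" "cxxxx" "crxxx" "craxx" "crash"; do
    echo "$guess branches=$(branches_reached "$guess") opaque=$(opaque_result "$guess")"
done
```

In the first model every additional correct character produces a new signal that a coverage-guided fuzzer can keep and build on; in the second, all wrong guesses look identical.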

Compile binaries and fuzz path_based program

Build both programs via the Makefile:

cd ~/cs492-afl/04_path_fuzz
make

Fuzz the path_based program to identify the crashing input:

afl-fuzz -i inputs -o outputs -- ./path_based

As soon as a single crash is found, show the following (before exiting the fuzzer):

  1. The number of "total execs" and the total number of paths (indicated by the "corpus count")
  2. Then, show the crashing input in outputs/default/crashes/id* using xxd
  3. Explain the number obtained for "corpus count" based on examining the source code in path_based.c

Fuzz strcmp_based program

Now, run the fuzzer on the other binary for at most twice the total number of executions used in the first run. Terminate the fuzzer at that point whether or not it has found a crash.

afl-fuzz -i inputs -o outputs -- ./strcmp_based

  1. Show the number of paths via the "corpus count" that have been found
  2. Give an explanation as to why the "corpus count" is lower for strcmp_based than for path_based, based on the source code (knowing that the standard C library call is opaque to the fuzzer)
  3. Explain why it is harder for the fuzzer to find the crashing input on one versus the other.
  4. For the difficult program, explain what might happen if the standard C library were compiled with the AFL compiler and the fuzzing redone

With an understanding of how AFL works and how to apply it to find vulnerabilities, we will now examine a more complex program.

Examine program, build it, and run it

cd ~/cs492-afl/05_multi
CC=afl-clang-fast AFL_HARDEN=1 make
echo hi | ./vulnerable

Run the program on the inputs provided in the inputs directory to understand what each command does:

cat inputs/ec ; ./vulnerable < inputs/ec

cat inputs/head ; ./vulnerable < inputs/head

cat inputs/c ; ./vulnerable < inputs/c

Fuzz the program

Run the fuzzer using the inputs directory to seed the search space. Wait until you have found a crashing input for each of the 3 commands. Crashes are located in the outputs directory. A tmux session will be helpful to monitor the output as the fuzzer runs in another session.

afl-fuzz -i inputs -o outputs ./vulnerable

Show the xxd output for each of the three crashing inputs found in the outputs directory.
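
While the fuzzer runs in one tmux pane, a second pane can poll the crash directory. A minimal sketch, assuming the outputs/default/crashes layout shown earlier; count_crashes is a hypothetical helper name.

```shell
# Hypothetical helper: count crash files (named id*) in a directory.
count_crashes() {
    n=0
    for f in "$1"/id*; do
        if [ -e "$f" ]; then n=$((n + 1)); fi
    done
    echo "$n"
}

echo "crashes so far: $(count_crashes outputs/default/crashes)"
```

The count alone does not say which command each crash belongs to, so each file still needs to be inspected with xxd.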

Analyze results

Now, we will apply AFL to automatically find the Heartbleed vulnerability.

Checkout and build vulnerable OpenSSL source

Clone the repository and check out the tag that contains the vulnerability

cd ~/cs492-afl/06_heartbleed

git clone https://github.com/openssl/openssl.git
cd openssl ; git checkout tags/OpenSSL_1_0_1f

Configure and build OpenSSL with ASAN

CC=afl-clang-fast CXX=afl-clang-fast++ ./config -d

AFL_USE_ASAN=1 make

Build program that uses vulnerable OpenSSL library

Examine handshake.cc in the top-level directory. It includes the code that connects the fuzzer to the SSL connection via stdin. After analyzing the file, uncomment the appropriate lines.

After editing the file, compile the handshake program from the 06_heartbleed directory (the parent of openssl).

AFL_USE_ASAN=1 afl-clang++ -g handshake.cc openssl/libssl.a \
   openssl/libcrypto.a -o handshake -I openssl/include -ldl

We will now analyze the bug by examining the SSL record and the heartbeat message embedded into the record (as its record data).

Run the fuzzer

afl-fuzz -i inputs -o outputs -m none ./handshake

Map input to protocol

Examine OpenSSL source code

/* Pull apart the header into the SSL3_RECORD */
rr->type= *(p++);
ssl_major= *(p++);
ssl_minor= *(p++);
version=(ssl_major<<8)|ssl_minor;
n2s(p,rr->length);
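
The parsing above reads one byte each for the type and the two version fields, then a two-byte big-endian length. A minimal shell sketch of the same decoding follows, assuming the record header sits at the very start of the file; record_header and the sample file are hypothetical and the sample bytes are made up.

```shell
# Hypothetical helper: decode the 5-byte record header at the start of a file.
record_header() {
    # od prints the first five bytes as unsigned decimal numbers.
    set -- $(od -An -tu1 -N5 "$1")
    if [ "$#" -lt 5 ]; then
        echo "file is shorter than a record header" >&2
        return 1
    fi
    # n2s reads two bytes in network (big-endian) order.
    echo "type=$1 ssl_major=$2 ssl_minor=$3 length=$(( $4 * 256 + $5 ))"
}

# Made-up sample: type 0x18, version 3.2, length 3, then a 3-byte body.
printf '\030\003\002\000\003\001\100\000' > sample_record.bin
record_header sample_record.bin
```

Running the helper on a crash file gives decimal values that can be checked against the first five bytes of its xxd listing.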

Analyze results

Using the code above and the structure definition given for the HeartbeatMessage, for each of your inputs:

  1. In the xxd listing of the crashing input, identify the values of rr->type, ssl_major, ssl_minor, and rr->length that are used.
  2. Also identify, in the xxd listing of the crashing input, the HeartbeatMessageType (a single byte), the HeartbeatMessage payload_length, and the HeartbeatMessage payload (if any).
  3. Finally, after the data is parsed, a message is sent to the server. The function in the server code that contains the Heartbleed vulnerability is located on line 2553 in ssl/t1_lib.c
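
RFC 6520 lays out a HeartbeatMessage as a one-byte type, a two-byte big-endian payload_length, the payload, and padding. The sketch below decodes those fields under a strong assumption: the file holds a single record whose 5-byte header starts at offset 0, so the message begins at offset 5. heartbeat_fields and the sample bytes are hypothetical.

```shell
# Hypothetical helper: decode a heartbeat message that directly follows a
# 5-byte record header at the start of the file (an assumption about layout).
heartbeat_fields() {
    size=$(wc -c < "$1")
    # Bytes 5..7: message type, then the 2-byte big-endian payload_length.
    set -- $(od -An -tu1 -j5 -N3 "$1")
    if [ "$#" -lt 3 ]; then
        echo "no complete heartbeat header after the record header" >&2
        return 1
    fi
    claimed=$(( $2 * 256 + $3 ))
    present=$(( size - 8 ))
    echo "hb_type=$1 payload_length=$claimed payload_bytes_in_file=$present"
}

# Made-up sample: record header, then type 1 with payload_length 0x4000
# and no payload bytes at all.
printf '\030\003\002\000\003\001\100\000' > sample_heartbeat.bin
heartbeat_fields sample_heartbeat.bin
```

Comparing the claimed payload_length with the number of payload bytes actually present is a useful starting point when reading the function in ssl/t1_lib.c.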

Celebrate! (Or not). Be sure to stop the VM to save $.