3. AFL

Because smart fuzzing can find software errors quickly, it has become increasingly prevalent in software development and testing as a way to secure the programs, libraries, and operating systems that we rely upon. This lab walks you through exercises that use such tools to identify and correct some of the most common and devastating software errors.

What you will build

You will deploy a Compute Engine instance, install Docker, build a Docker container image that has AFL installed, download the AFL exercises (based on Thales Security's excellent tutorial), and use AFL to find vulnerabilities in them including the Heartbleed bug.

What you'll learn

Launching instances on Google Compute Engine
Installing Docker on Linux
Basic commands for downloading, running, and manipulating containers
Using AFL to automatically locate vulnerabilities in source code

What you'll need

A Google Cloud Platform account

Install Ubuntu 22.04 VM

Go to Compute Engine in Google Cloud Console
Create instance
Place in zone us-west1-b
Use the default CPU setting
Select Ubuntu 22.04 for Boot disk

Install Docker on the VM

ssh into instance

Install docker and other tools on the VM and add your username to the docker group

sudo apt update
sudo apt install -y docker.io
sudo usermod -a -G docker $(whoami)

Configure core dumps to be written to a file, both persistently across reboots and for the current session
This must be set on the host OS in order for the container to pick it up

sudo su -c "echo kernel.core_pattern=core >> /etc/sysctl.conf"
echo core | sudo tee /proc/sys/kernel/core_pattern

(Important) Logout and log back into VM

Required for the docker group membership added above to take effect for your user

Note: Derived from Thales E-security AFL training

Retrieve course files

git clone https://github.com/wu4f/cs492-src
cd cs492-src/afl

Examine Dockerfile

Read the Dockerfile to see what is included, then set the password for the fuzzer account in the container (you will need it when you run sudo)

FROM ubuntu:22.04

# Originally from Michael Macnair
LABEL maintainer="cs492"

# Users
RUN useradd --create-home --shell /bin/bash fuzzer

# AFL + Deps
USER root
RUN apt update && apt upgrade -y
RUN DEBIAN_FRONTEND=noninteractive apt install -y clang llvm-dev git build-essential curl vim nano libssl-dev screen cgroup-tools sudo gcc-multilib gcc gdb tmux afl++

# For sudo for ASAN:
RUN usermod -aG sudo fuzzer
USER fuzzer
WORKDIR /home/fuzzer
COPY . cs492-afl

# See the README - this password is visible to anyone with access to the image
USER root
RUN echo "fuzzer:`cat cs492-afl/password.txt`" | chpasswd && rm cs492-afl/password.txt
RUN chown -R fuzzer:fuzzer cs492-afl
USER fuzzer

Set the password for fuzzer user and build container

The resulting local image is named cs492-afl

echo "cs492592" > password.txt
docker build . -t cs492-afl

Run container

Run the container in privileged mode via the "--privileged" flag. The "-di" flags detach the container once it is running and keep its standard input open so that it stays up. Name the container afl for use in subsequent commands.

docker run --privileged --network host --name afl -di cs492-afl

Examine the running container

Done via

docker ps

In the output, find the name of the container by examining the "NAMES" column.

CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                     PORTS               NAMES
3d09abf5ec45        afl                 "/bin/bash"         6 minutes ago       Up 2 seconds                                   afl

Examine container image

Done via

docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
cs492-afl           latest              309344bef881        4 months ago        919MB

Stop the container

Stop the container by name. For a container called "afl", the command is

docker stop afl

Confirm that it is no longer running via "docker ps" with the "-a" flag, which also lists stopped containers

docker ps -a

If you wanted to, you could remove the stopped container by issuing "docker rm afl". Note that while this removes the container, it does not remove the local container image it was derived from (i.e. cs492-afl).
You could then remove the locally stored container image by issuing "docker rmi cs492-afl". This only succeeds once the container that uses the image has been removed, not merely stopped.

Start the container

Start the container by name. The command is:

docker start afl

Note that this only starts the container, but doesn't give you a session on it.

Execute an interactive shell on the container

docker exec -it afl /bin/bash

Launch `tmux`

Used to create multiple shells, each in its own window

tmux

<Ctrl-b> c to create a new window
Note the * in the status line at the bottom denoting the active window
Use <Ctrl-b> n to go to the next window
Use <Ctrl-b> p to go to the previous window

In this lab, you will use the compiler's address sanitization feature (ASAN) to generate accurate information about any memory corruption bug in your program. The program in this level contains a trivial buffer overflow bug. View the output when sanitization is off and when it is on.

Compile and run without address sanitization

View the output of the program when run with a benign input:

cd ~/cs492-afl/01_asan
afl-gcc -m32 -o vulnerable vulnerable.c
echo hi | ./vulnerable

View the output using a longer input

echo hihihi | ./vulnerable

Compile and run with address sanitization

View the output of the program when run with a benign input:

AFL_USE_ASAN=1 afl-gcc -m32 -o vulnerable vulnerable.c
echo hi | ./vulnerable

Note that to save the output, you can redirect standard error (fd 2) to a file and then examine the file (e.g. echo hi | ./vulnerable 2> asan_out.txt).

Analyze results

Answer the following questions by reading the output file.

What is the type of the offending operation?
What is the size of the offending write?
Which area of memory does the error occur in?
Show the source line in vulnerable.c where the error occurs.
Propose a fix for that source line that takes null termination into account
Recompile and re-run example with address sanitization
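
As a point of comparison for the proposed fix, the following is a minimal sketch of a bounded copy that always reserves one byte for the terminating null. It is not taken from vulnerable.c: the function name copy_bounded and its parameters are hypothetical, and the actual fix in the lab depends on how the program reads its input.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from the lab's vulnerable.c.
 * Copies src into dst, which holds dst_size bytes, truncating if
 * needed and always writing a terminating '\0'.
 * Returns the number of characters copied (excluding the '\0'). */
size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return 0;               /* no room for anything, not even '\0' */
    }
    size_t n = strlen(src);
    if (n > dst_size - 1) {
        n = dst_size - 1;       /* reserve the last byte for '\0' */
    }
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

With a 4-byte destination, a call such as copy_bounded(buf, sizeof buf, "hihihi") stores at most three characters plus the terminator, so the write never leaves the buffer.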

With the AFL compiler performing the instrumentation, the AFL fuzzer can be used to automatically find memory corruption errors that lead to crashes. Note that the AFL fuzzer requires that your window be resized to at least 80x25. In this level, you will run the fuzzer on a similar program that has a buffer overflow vulnerability.

Compile and fuzz program

Build the binary:

cd ~/cs492-afl/02_scanf_fuzz
afl-gcc -m32 -o vulnerable vulnerable.c

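Then, create a directory for the initial seed input and a directory for output, and run the fuzzer on the binary. AFL needs at least one seed file in the input directory in order to start. The file name and contents used below are arbitrary placeholders.

mkdir inputs outputs
echo hi > inputs/seed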
afl-fuzz -i inputs -o outputs -- ./vulnerable

As soon as a single crash is found, press <Ctrl-C> to exit the fuzzer.

Show the crashing input in outputs/default/crashes/id* using xxd

Save the output, then repeat the fuzzing step again

Analyze results

Show the crashing input from the second run in outputs/default/crashes/id* using xxd
Compare the outputs of the two runs and explain why they would be different.
Show the source line that contains the vulnerability.
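
The directory name 02_scanf_fuzz suggests that the program reads its input with scanf. Assuming the overflow comes from an unbounded %s conversion (an assumption to verify against vulnerable.c), the sketch below shows how a field width bounds the write. The buffer size, the function name read_word, and the use of sscanf on a string instead of scanf on stdin are all choices made for illustration.

```c
#include <stdio.h>

/* Hypothetical example; the buffer size and names are not from the lab. */
#define WORD_SIZE 8

/* Reads one whitespace-delimited word from input into word.
 * Returns the number of conversions performed (1 on success). */
int read_word(const char *input, char word[WORD_SIZE])
{
    /* "%7s" stores at most 7 characters plus the terminating '\0',
     * which fits WORD_SIZE exactly. An unbounded "%s" would keep
     * writing past the end of word for longer input. */
    return sscanf(input, "%7s", word);
}
```

The width in the format string has to be kept in sync with the buffer size by hand, which is one reason this class of bug is common.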

In this level, you will use the fuzzer to find another vulnerability. The level includes a Makefile.

Compile and fuzz program

Build program via make

cd ~/cs492-afl/03_fmt_fuzz
make

Then, run the fuzzer:

afl-fuzz -i inputs -o outputs -- ./vulnerable

As soon as a single crash is found, press <Ctrl-C> to exit the fuzzer.

Analyze results

Show the crashing input in outputs/default/crashes/id* using xxd
Show the source line that contains the vulnerability.
Explain how the specific crashing input found by AFL causes the vulnerability to be exploited.
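
Assuming, as the directory name 03_fmt_fuzz suggests, that this level involves a format string bug, the sketch below contrasts the unsafe pattern with the safe one. It is a generic illustration and not the lab's code: the function name format_safely is hypothetical, and the unsafe call appears only in a comment.

```c
#include <stdio.h>

/* Hypothetical example, not taken from the lab's vulnerable.c.
 *
 * Unsafe pattern (kept in a comment so it is never compiled):
 *     printf(user_input);
 * Here the input itself is interpreted as the format string, so
 * conversions such as %s or %n inside it make printf read or write
 * through arguments that were never supplied.
 *
 * Safe pattern: the format string is a constant and the input is
 * passed as data. Returns the length of the formatted text. */
int format_safely(char *out, size_t out_size, const char *user_input)
{
    return snprintf(out, out_size, "%s", user_input);
}
```

When the input is passed as data, any conversion specifiers it contains are copied through literally instead of being interpreted.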

Fuzzers instrument paths in a program in order to zero in on problematic input. In this level, there are two programs (path_based.c and strcmp_based.c) that will both crash on the same input.

One uses a library call that is opaque to the fuzzer, while the other implements the logic within its own source code. The fuzzer will find the crashing input quickly in one of them because the check is spread across many code paths that it can observe, but not in the other. Given what you know about the AFL fuzzer, which one might the fuzzer have difficulty with?

Compile binaries and fuzz `path_based` program

Build both programs via the Makefile:

cd ~/cs492-afl/04_path_fuzz
make

Fuzz the path_based program to identify the crashing input:

afl-fuzz -i inputs -o outputs -- ./path_based

As soon as a single crash is found, show the following (before exiting the fuzzer):

The number of "total execs" and the total number of paths (indicated by the "corpus count")
Then, show the crashing input in outputs/default/crashes/id* using xxd
Explain the number obtained for "corpus count" based on examining the source code in path_based.c

Fuzz `strcmp_based` program

Now, run the fuzzer on the other binary for at most twice as many total executions as the first run. Terminate the fuzzer at that point regardless of whether a crash has been found.

afl-fuzz -i inputs -o outputs -- ./strcmp_based

Show the number of paths via the "corpus count" that have been found
Explain why the "corpus count" differs between the two programs based on the source code (knowing that the standard C library call is opaque to the fuzzer)
Explain why it is harder for the fuzzer to find the crashing input on one versus the other.
For the difficult program, explain what might happen if the standard C library were compiled with the AFL compiler and the fuzzing redone
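
The difference between the two styles of check can be sketched as follows. This is not the lab's source: the function names and the magic value "fuzz" are made up, and the comments assume that the C library is not instrumented and that the compiler does not inline the strcmp call.

```c
#include <string.h>

/* Hypothetical checks; the magic value "fuzz" is made up. */

/* Every byte comparison is a separate branch in the instrumented
 * program, so a coverage-guided fuzzer sees a new path each time
 * one more leading byte is correct and can keep that input. */
int check_by_branches(const char *s)
{
    if (s[0] == 'f') {
        if (s[1] == 'u') {
            if (s[2] == 'z') {
                if (s[3] == 'z') {
                    if (s[4] == '\0') {
                        return 1;
                    }
                }
            }
        }
    }
    return 0;
}

/* The byte-by-byte work happens inside the library call. The
 * instrumented caller only sees one branch on the final result,
 * so partially correct inputs look no different from wrong ones. */
int check_by_strcmp(const char *s)
{
    return strcmp(s, "fuzz") == 0;
}
```

Both functions accept exactly the same input. They differ only in how much intermediate progress is visible through coverage feedback.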

With an understanding of how AFL works and how to apply it to find vulnerabilities, we will now examine a more complex program.

Examine program, build it, and run it

Examine vulnerable.c to locate and describe the three vulnerabilities in the file (one for each command)
Build the binary

cd ~/cs492-afl/05_multi
CC=afl-clang-fast AFL_HARDEN=1 make

Run the binary to get usage instructions

echo hi | ./vulnerable

Run the program using the 3 commands implemented (ec, head, and c)

Run the program on the inputs provided in the inputs directory to understand what each command does

cat inputs/ec ; ./vulnerable < inputs/ec

cat inputs/head ; ./vulnerable < inputs/head

cat inputs/c ; ./vulnerable < inputs/c

Fuzz the program

Run the fuzzer using the inputs directory to seed the search space. Wait until you have found a crashing input for each of the 3 commands. Crashes are located in the outputs directory. Running the fuzzer in one tmux window lets you monitor the outputs directory from another.

afl-fuzz -i inputs -o outputs ./vulnerable

If you need to, you can detach the tmux session with <Ctrl-b> d and re-attach at a later time with "tmux attach"

Show the xxd output for each of the three crashing inputs found in the outputs directory.

Analyze results

For each crashing input, associate it with the flaw in the C source code it exercises (from above) and explain how the particular input found exercises it.

Now, we will apply AFL to automatically find the Heartbleed vulnerability

Checkout and build vulnerable OpenSSL source

Clone using the tag containing the vulnerability

cd ~/cs492-afl/06_heartbleed

git clone https://github.com/openssl/openssl.git
cd openssl ; git checkout tags/OpenSSL_1_0_1f

Configure and build OpenSSL with ASAN

CC=afl-clang-fast CXX=afl-clang-fast++ ./config -d

AFL_USE_ASAN=1 make

Build program that uses vulnerable OpenSSL library

Examine handshake.cc in the top-level directory of this level (~/cs492-afl/06_heartbleed). It includes the code that connects the fuzzer's input, supplied via stdin, to the SSL connection. After analyzing the file, uncomment the appropriate lines.

After editing the file, compile the handshake program from that same directory (not from inside the openssl directory).

AFL_USE_ASAN=1 afl-clang++ -g handshake.cc openssl/libssl.a \
   openssl/libcrypto.a -o handshake -I openssl/include -ldl

We will now analyze the bug by examining the SSL record and the heartbeat message embedded into the record (as its record data).

Start by reading the xkcd explanation
Then, read an analysis of the Heartbleed vulnerability including descriptions of the protocol fields that are involved with it here and detailed analysis here
The C data structure for the SSL3 record contains a length field and a pointer to the embedded record

One record that can be embedded is the HeartbeatMessage

As part of the handshake program, an SSL3 record is constructed that contains a HeartbeatMessage. The input includes an SSL3 record type, an SSL3 major version number, an SSL3 minor version number, the record length, a HeartbeatMessageType (request or response), a payload length for the HeartbeatMessage, and finally the payload for the HeartbeatMessage.
A picture of what a malicious client sends to a victim server is below as well as the victim server's response
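
The input layout described above can be sketched in C as follows. The struct and function names are illustrative only and do not come from OpenSSL or handshake.cc. The field order and sizes follow the description above: a 5-byte record header (type, major version, minor version, 2-byte length) followed by a 3-byte heartbeat header (type, 2-byte payload length).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative names only: none of these identifiers come from
 * OpenSSL or from handshake.cc. */
struct parsed_input {
    uint8_t  record_type;        /* SSL3 record type */
    uint8_t  ssl_major;          /* SSL3 major version */
    uint8_t  ssl_minor;          /* SSL3 minor version */
    uint16_t record_length;      /* length of the record data */
    uint8_t  hb_type;            /* HeartbeatMessageType */
    uint16_t hb_payload_length;  /* payload length claimed by sender */
    size_t   bytes_after_header; /* bytes actually present after the
                                    heartbeat header */
};

/* Splits the first 8 bytes of buf into the fields listed above.
 * Multi-byte fields are big-endian (network byte order).
 * Returns 0 on success, or -1 if buf is too short to hold the
 * record header followed by the heartbeat header. */
int parse_input(const uint8_t *buf, size_t len, struct parsed_input *out)
{
    if (len < 8) {
        return -1;
    }
    out->record_type        = buf[0];
    out->ssl_major          = buf[1];
    out->ssl_minor          = buf[2];
    out->record_length      = (uint16_t)((buf[3] << 8) | buf[4]);
    out->hb_type            = buf[5];
    out->hb_payload_length  = (uint16_t)((buf[6] << 8) | buf[7]);
    out->bytes_after_header = len - 8;
    return 0;
}
```

Comparing hb_payload_length with bytes_after_header for a given input shows whether the sender claimed more payload than it actually supplied.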

Run the fuzzer

Fuzz the program 5 times to get 5 different crashing inputs

afl-fuzz -i inputs -o outputs -m none ./handshake

Using xxd, dump each one.

Map input to protocol

Examine OpenSSL source code

On line 325 of ssl/ssl3.h, find the protocol type, in decimal, of the TLS Heartbeat message. Calculate its hexadecimal equivalent and locate it in the crashing input.
As input, an SSL3 record is read in by the function ssl3_get_record() on line 275 of ssl/s3_pkt.c. This is what the handshake program has been programmed to send in from a file.
Examine lines 322-327 to identify how the initial data sent in by the handshake program is parsed into the SSL3_RECORD struct (rr).
Note that n2s() is defined on line 249 of ssl/ssl_locl.h

322                 /* Pull apart the header into the SSL3_RECORD */
323                 rr->type= *(p++);
324                 ssl_major= *(p++);
325                 ssl_minor= *(p++);
326                 version=(ssl_major<<8)|ssl_minor;
327                 n2s(p,rr->length);

Then, examine line 379 and line 400 to identify where the embedded record data (i.e. the HeartbeatMessage in our case) begins
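
To make the n2s() call on line 327 easier to follow, here is a function-style sketch of a "network to short" read: it combines two bytes in big-endian order and advances the caller's pointer past them. OpenSSL implements n2s() as a macro, so this is only an equivalent illustration under that reading. The name read_n2s is hypothetical, and the real definition in ssl/ssl_locl.h remains the reference.

```c
#include <stdint.h>

/* Hypothetical function form of a "network to short" read.
 * Combines the two bytes at *p in big-endian order, then moves *p
 * forward by two so the next read starts after them. */
uint16_t read_n2s(const uint8_t **p)
{
    uint16_t v = (uint16_t)(((unsigned)(*p)[0] << 8) | (unsigned)(*p)[1]);
    *p += 2;
    return v;
}
```

The pointer advance matters when reading the listing above: after the length has been read, p points at the first byte that follows the record header.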

Analyze results

Using the code above and the structure definition given for the HeartbeatMessage, for each of your inputs

Identify in the xxd listing of the crashing input, the values for rr->type, ssl_major, ssl_minor, and rr->length that are used.
Also identify in the xxd listing of the crashing input the HeartbeatMessageType (a single byte), the HeartbeatMessage payload_length, and the Heartbeat Message payload (if any).
Finally, after the data is parsed, a message is sent to the server. The function in the server code that contains the Heartbleed vulnerability is located on line 2553 in ssl/t1_lib.c

Identify the line in which the rogue length is parsed from the client
Identify the line in which the rogue length is then used to perform an unauthorized read in memory.
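
Once both lines are identified, the missing check can be reasoned about with a sketch like the one below. It is not OpenSSL's code: the function and macro names are hypothetical. It relies on the HeartbeatMessage layout from RFC 6520, which has a 1-byte type, a 2-byte payload length, the payload, and at least 16 bytes of padding, all of which must fit inside the record data.

```c
#include <stddef.h>

/* Hypothetical names; the sizes follow RFC 6520. */
#define HB_HEADER_LEN  ((size_t)3)   /* 1-byte type + 2-byte length */
#define HB_MIN_PADDING ((size_t)16)  /* minimum padding per RFC 6520 */

/* Returns 1 if a heartbeat message with the claimed payload_length
 * fits inside record data of record_length bytes, 0 otherwise.
 * A message that fails this check should be discarded instead of
 * being echoed back. */
int heartbeat_length_ok(size_t record_length, size_t payload_length)
{
    return HB_HEADER_LEN + payload_length + HB_MIN_PADDING <= record_length;
}
```

A request whose record data is only a few bytes long but whose payload length claims thousands of bytes fails this check, which is the situation the crashing inputs create.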

Celebrate! (Or not). Be sure to stop the VM to save $.
