Course VM setup

We'll use an agent on the course VM to issue commonly used Linux commands. Log in to your VM via ssh and install the packages needed for the lab if they are not yet installed.

sudo apt update -y
sudo apt install wordlists tcpdump -y
sudo gunzip /usr/share/wordlists/rockyou.txt.gz

Within the repository on the course VM, change into the exercise directory and update the code.

cd cs475-src/07*
git pull

Command agent

Linux ships with a large library of commands that users invoke from the command line. While one typically learns these commands by memorizing them and their flags, language models and generative AI can allow a user to focus on the function being requested rather than the syntax of a particular command.

In this exercise, we will examine a simple command agent skill that allows an agent to generate and execute Linux commands on the machine. Begin by changing into the exercise directory.

cd 01_command

The command skill instructs the agent how to execute commands when it is invoked.

skills/command/SKILL.md

---
name: command
description: Runs an arbitrary command and returns its output
---

# Command

Use this skill to execute a shell command and capture its output.

## Local Script

Run this exact command from `07_CommandConfigGeneration/01_command/`:

```bash
python skills/command/scripts/command.py <command> [args...]
```

The script prints stdout, stderr, and the return code.

## Deterministic Workflow

1. Split the requested command into program and arguments.
2. Run `python skills/command/scripts/command.py <program> [args...]`.
3. Report stdout, stderr, and return code to the user.

The skill includes a simple Python script to execute commands that are produced:

skills/command/scripts/command.py

import subprocess
import sys

result = subprocess.run(sys.argv[1:], capture_output=True, text=True)
print(f"stdout: {result.stdout}\nstderr: {result.stderr}\nreturncode: {result.returncode}")
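Because the script passes `sys.argv[1:]` to `subprocess.run` as an argument list, no shell is involved: pipes, redirects, and globs reach the program as literal strings. The sketch below illustrates this, using the standard-library `shlex.split` for the "split into program and arguments" step. The helper name `run_argv` is hypothetical and is not part of the repository.

```python
import shlex
import subprocess
import sys


def run_argv(argv):
    """Run an argument list the way command.py does (no shell involved)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout, result.stderr, result.returncode


# Split a command string into program and arguments.
argv = shlex.split("ls -l '/tmp/my dir'")
print(argv)  # quoting keeps the path containing a space as one argument

# A child Python process that echoes its arguments stands in for any program.
echo_args = [sys.executable, "-c", "import sys; print(sys.argv[1:])"]
stdout, stderr, returncode = run_argv(echo_args + ["a", "|", "b"])
print(stdout.strip())  # the pipe character arrives as an ordinary argument
print(returncode)
```

In practice this means a request such as `ps aux | grep ssh` cannot be satisfied by a single call to the script unless the agent explicitly invokes a shell (for example, `sh -c '...'`).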

One can create a simple agent that leverages the skill, with its instructions supplied by the user on the command line.

command_client.py

user_instructions = sys.argv[1]

fast = FastAgent("Command Agent", skills_directory="skills")

@fast.agent(
    instruction = user_instructions + "\n\n{{agentSkills}}",
    model = "gemini3flash",
    use_history = True
)

Install the packages.

uv init --bare
uv add -r requirements.txt

Then, run the agent, asking it to execute specific commands associated with processes in Linux when prompted by the user.

uv run command_client.py \
"Answer the user's request by running Linux commands ps, sudo, lsof, or ls via this skill."

Then, test the agent with prompts for performing process forensic analysis in order to evaluate the model's effectiveness.

For each task, examine the command tool call generated.

When searching files for particular content, regular expressions are often used to express advanced string-matching operations. Because the language can be difficult to learn, one use for LLMs is to obviate the need to learn regular expression syntax. Linux's egrep command, a variant of grep (global regular expression print) that uses extended regular expressions, is often used to perform a regular expression search on an input. To demonstrate this, we'll use the RockYou dataset of compromised passwords.

Run the agent with instructions that ask it to utilize a set of tools for performing searches on the file via regular expressions.

uv run command_client.py \
"Answer the user's request by running Linux commands egrep or wc on the password file at /usr/share/wordlists/rockyou.txt via this skill."

Then, query the agent to execute egrep commands that search for words in the rockyou.txt file. Run each query multiple times to check that the regular expressions generated are correct.

For each task, examine the command tool call generated.
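One way to check a generated pattern, beyond rerunning the query, is to test it against a few strings whose expected result is known. The sketch below uses Python's `re` module as a stand-in for egrep; for a simple pattern like this one the two syntaxes agree. The pattern and the sample strings are illustrative assumptions and are not drawn from the lab's prompts.

```python
import re

# Hypothetical candidate produced by the agent for
# "passwords that consist only of digits".
pattern = r"^[0-9]+$"

# Hand-picked samples with known expected outcomes (not lab data).
should_match = ["123456", "000", "7"]
should_not_match = ["abc123", "12 34", "", "password"]

regex = re.compile(pattern)
false_negatives = [s for s in should_match if not regex.search(s)]
false_positives = [s for s in should_not_match if regex.search(s)]

print("false negatives:", false_negatives)
print("false positives:", false_positives)
```

Two empty lists suggest the pattern behaves as intended on these samples; any entry points to a case the generated expression gets wrong.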

Searching for files in Linux from the command line is often done with the find command and its many flags. In the past, administrators had to memorize the flags or read the man pages to craft the appropriate command; an LLM may now obviate the need to do so.

Run the agent with a prompt that asks it to utilize a set of tools that allow it to find particular files in the file system.

uv run command_client.py \
"Answer the user's request by running Linux commands find and sudo via this skill."

Then, query the agent to execute find commands that search for particular files in the file system.

For each task, examine the command tool call generated.
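When judging whether a generated find command returned the right files, an independent cross-check can help. The sketch below roughly mirrors `find ROOT -type f -name PATTERN` in pure Python on a throwaway directory tree. The function name `find_files` and the sample file names are hypothetical, and the comparison only covers these two flags.

```python
import fnmatch
import os
import tempfile


def find_files(root, name_pattern):
    """Roughly mirror `find ROOT -type f -name PATTERN` in pure Python."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            # Skip symlinks so only regular files are reported, as -type f does.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            # -name compares the pattern against the file's base name only.
            if fnmatch.fnmatchcase(filename, name_pattern):
                matches.append(full_path)
    return sorted(matches)


# Build a small temporary tree so the example does not depend on the VM's files.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "etc", "app"))
    for relative in ["etc/app/main.conf", "etc/app/notes.txt", "etc/top.conf"]:
        with open(os.path.join(root, relative), "w") as handle:
            handle.write("example\n")
    found = [os.path.relpath(p, root) for p in find_files(root, "*.conf")]

print(found)
```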

Firewalls and filtering allow one to restrict traffic to and from a machine, both to reduce its attack surface and to limit the impact that a compromise might have on the rest of the network. On Linux, iptables provides a mechanism for specifying rules that are applied to traffic. For this exercise, because an incorrect command might cause us to lose connectivity to the VM, we'll ask the agent only to produce the iptables commands, not to run them.

Run the agent with a prompt that asks it to produce commands that allow it to implement a firewall policy the user requests.

uv run command_client.py \
"Answer the user's prompt by specifying iptables commands that implement what is asked."

Then, query the agent to generate commands that implement a variety of rules using iptables.

For each task, examine the command tool call generated.
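Since the commands in this exercise are produced for review rather than executed, it can help to treat them as data. The sketch below composes two common iptables rules as argument lists and prints them without running anything. The policy shown (allow inbound ssh, then drop other inbound traffic by default) is an illustrative assumption, not one of the lab's prompts.

```python
import shlex

# Illustrative policy: keep ssh reachable, then default-drop other inbound traffic.
# Order matters: applying the DROP policy before the ACCEPT rule could cut off ssh.
rules = [
    ["iptables", "-A", "INPUT", "-p", "tcp", "--dport", "22", "-j", "ACCEPT"],
    ["iptables", "-P", "INPUT", "DROP"],
]

# Dry run: render each rule as a shell-safe string for review, never execute it.
rendered = [shlex.join(rule) for rule in rules]
for line in rendered:
    print(line)
```

A real policy would typically also need rules for loopback traffic and established connections, which is the kind of omission worth looking for when reviewing what the agent produces.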

It is helpful for Linux administrators to be able to examine the log files on a system to see what actions have occurred on it. To this end, we'll use an agent to compose a set of Linux commands for analyzing activity on a system from its audit logs.

Run the agent with a prompt that asks it to utilize a set of commands that allow it to extract information from audit logs on the system.

uv run command_client.py \
"Answer the user's prompt by using sudo to run Linux commands who, last, lastb, lastlog, egrep, head, or tail via this skill."

Then, query the agent to execute Linux commands that perform the forensic queries below:

For each task, examine the command tool call generated.

Another useful function for an administrator is to analyze network traffic going to and from the machine. In this exercise, we'll utilize the agent to perform packet captures using tcpdump in order to answer queries from the user. Run the agent with a prompt that asks it to utilize tcpdump to answer questions about network connections on the machine.

uv run command_client.py \
"You are a tcpdump agent.  Use tcpdump to print the next 20 packets in order to answer the user's question"

Create a second terminal session on the course VM to execute network commands. For each of the queries and commands below, issue the query to the agent then run the network command on the second terminal. Show the results.

Agent query                                             Network command

What destination ports are being accessed currently?    dig www.pdx.edu
What was the destination of the last web request?       curl https://oregonctf.org
What addresses are communicating over ssh?              ssh linux.cs.pdx.edu
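The instruction to "print the next 20 packets" corresponds to tcpdump's `-c` option, which stops the capture after a given number of packets. The sketch below builds such a command as an argument list without executing it. The helper name `tcpdump_argv` and the `icmp` filter are illustrative, and `sudo` is included on the assumption that packet capture requires root on the VM.

```python
import shlex


def tcpdump_argv(packet_count, capture_filter=""):
    """Build a tcpdump argument list; the command is built, not executed."""
    # -n: do not resolve addresses to names; -c N: exit after N packets.
    argv = ["sudo", "tcpdump", "-n", "-c", str(packet_count)]
    # tcpdump accepts the filter expression as trailing arguments.
    argv += shlex.split(capture_filter)
    return argv


print(shlex.join(tcpdump_argv(20)))
print(shlex.join(tcpdump_argv(20, "icmp")))
```

Comparing the filter expression the agent chooses for each query against the traffic generated in the second terminal shows whether the capture could have seen the packets of interest.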

Linux administrators might be interested in whether or not packages and services they are running are vulnerable to attack. In this exercise, we'll examine the utility of two ways of performing this task. The first is to use our prior approach to allow an agent to compose a set of Linux commands for doing so. Run the agent with a prompt that asks it to utilize a set of commands that allow it to determine whether a system is vulnerable.

uv run command_client.py \
"Answer the user's prompt by using sudo to run Linux commands systemctl, dpkg-query, and apt via this skill."

Then, query the agent to see what commands it generates and the results it produces when given the following prompts:

For each task, examine the command tool call generated.

We can instead try to customize the behavior of an agent by creating custom tools that utilize these commands and only allow the LLM to call them in order to answer the prompts. An example is provided in the repository. Change into the exercise directory and install the packages.

cd ../02_package
uv init --bare
uv add -r requirements.txt
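The idea of restricting the model to purpose-built tools can be sketched as a function that always runs one fixed command and exposes only a narrow, validated parameter. The names below (`package_version_argv`, `PACKAGE_NAME`) are hypothetical and are not taken from the repository's implementation; the format string uses dpkg-query's `${Package}` and `${Version}` fields.

```python
import re

# Conservative pattern for Debian package names:
# lowercase letters, digits, '+', '-', and '.', starting with an alphanumeric.
PACKAGE_NAME = re.compile(r"^[a-z0-9][a-z0-9+.-]+$")


def package_version_argv(package):
    """Return the fixed dpkg-query command for one package, or raise ValueError."""
    if not PACKAGE_NAME.fullmatch(package):
        raise ValueError(f"not a valid package name: {package!r}")
    # The program and flags are fixed; the model controls only the package name.
    return ["dpkg-query", "-W", "-f=${Package} ${Version}\\n", package]


print(package_version_argv("openssh-server"))

try:
    package_version_argv("openssh-server; rm -rf /")
except ValueError as error:
    print("rejected:", error)
```

Compared with the general command skill, a tool of this shape trades flexibility for predictability: the model cannot choose the program or its flags, only the value of one checked argument.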

For each skill that has been implemented for the agent, include in your notebook:

Run the agent in the repository that implements these tools to answer queries.

uv run package_client.py

Repeat the prompts:

Compare the results to the prior version and answer the following questions for your lab notebook.