07.2: Command agents

Linux ships with a large library of commands that users invoke from the command line. While one typically learns to use them by memorizing each command and its flags, language models and generative AI can allow a user to focus on the function being requested rather than the syntax of a particular command.

Consider the following MCP server that is configured to execute commands in the terminal based on instructions given by the user and returns a JSON object containing the results. The server implements tools asynchronously, allowing a client to execute calls concurrently.

command_mcp_server.py

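import asyncio
import shlex
import subprocess

from fastmcp import FastMCP  # assumed import path; use the package pinned in requirements.txt

mcp = FastMCP("Command")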

@mcp.tool("command")
async def command(command: str):
    """Runs an arbitrary Linux command"""
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(
        None,
        lambda: subprocess.run(
            shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
    )
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    }
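Two properties of the tool above can be checked in isolation. First, shlex.split tokenizes the command and no shell is involved, so operators such as | reach the program as literal arguments instead of building a pipeline. Second, run_in_executor moves each blocking subprocess.run call to a worker thread, so several calls can be in flight at once. The sketch below mirrors the tool body in a hypothetical helper named run_command (it is not part of the exercise repository) and uses the current Python interpreter as a stand-in for a Linux command.

```python
import asyncio
import shlex
import subprocess
import sys


async def run_command(command: str) -> dict:
    """Hypothetical mirror of the tool body: tokenize, then run without a shell."""
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(
        None,  # default thread pool
        lambda: subprocess.run(
            shlex.split(command),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        ),
    )
    return {
        "stdout": result.stdout,
        "stderr": result.stderr,
        "returncode": result.returncode,
    }


async def run_many(commands: list[str]) -> list[dict]:
    """Start every command at once and collect the results in input order."""
    return await asyncio.gather(*(run_command(c) for c in commands))


if __name__ == "__main__":
    # The pipe is just another argument because no shell interprets it.
    print(shlex.split("ps aux | grep sshd"))

    py = shlex.quote(sys.executable)
    for r in asyncio.run(run_many([f"{py} -c 'print(1)'", f"{py} -c 'print(2)'"])):
        print(r)
```

If the model generates a command containing a pipe or redirection, expect the tool call to fail or behave unexpectedly; this is worth noting when examining the generated tool calls.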

One can create a simple MCP client to invoke the tool with a particular command based on the prompt given by the user.

command_mcp_client.py

@fast.agent(
    instruction = "You are a Linux command agent.  Generate a command based on the user's request and return the results of execution directly back.",
    model = "gpt-4.1",
    servers = ["command"],
    use_history = True
)

Change into the exercise directory and install the packages.

cd 01_command
uv init
uv add -r requirements.txt

Then, run the agent, asking it to execute specific commands associated with processes in Linux when prompted by the user.

uv run command_mcp_client.py \
"Answer the user's request by running Linux commands ps, sudo, lsof, or ls via the tools given."

Then, test the agent with prompts for performing process forensic analysis in order to evaluate the model's effectiveness.

Using sudo, find the names of all processes with open network connections using lsof
Show the path of the executable file used to launch sshd
Show all processes whose original binaries have been deleted
Show the current working directory of the Python process
Find the PID of the Python process, then use /proc to show its environment
Show all processes which have the current working directory of /tmp
Show all scheduled tasks

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.
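When judging whether a generated command is correct, it can help to have an independent reference for the same information. Several of the prompts above are answered by files under /proc, and the sketch below reads them directly. The helper names (proc_cwd, proc_exe, proc_environ) are made up for this illustration, and the code assumes a Linux /proc filesystem. Reading entries for other users' processes generally requires root.

```python
import os
from pathlib import Path


def proc_cwd(pid: int) -> str:
    """Current working directory of a process, from the /proc/<pid>/cwd symlink."""
    return os.readlink(f"/proc/{pid}/cwd")


def proc_exe(pid: int) -> str:
    """Executable path of a process, from the /proc/<pid>/exe symlink.

    The kernel appends ' (deleted)' when the original binary has been unlinked.
    """
    return os.readlink(f"/proc/{pid}/exe")


def proc_environ(pid: int) -> dict[str, str]:
    """Parse the NUL-separated KEY=VALUE entries in /proc/<pid>/environ."""
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


if __name__ == "__main__":
    pid = os.getpid()  # inspect this script's own process
    print(proc_cwd(pid))
    print(proc_exe(pid))
    print(sorted(proc_environ(pid))[:5])
```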

Regular expressions are often used to express advanced string-matching operations when searching files for particular content. Because the syntax can be difficult to learn, one use for LLMs is to obviate the need to learn it. Linux's egrep command, a variant of grep (global regular expression print) that uses extended regular expressions, is often used to perform a regular expression search on an input. To demonstrate this, we'll use the RockYou dataset of compromised passwords for this exercise.

Run the agent with instructions that ask it to utilize a set of tools for performing searches on the file via regular expressions.

uv run command_mcp_client.py \
"Answer the user's request by running Linux commands egrep or wc on the password file at /usr/share/wordlists/rockyou.txt via the tools given."

Then, query the agent to execute commands using Linux's egrep that search for words in the rockyou.txt file. Run each command multiple times to validate the generated regular expressions for correctness.

How many passwords begin with the letter a
How many passwords begin with the number 1
How many passwords consist of only alphabetic characters
How many passwords are made up of exactly 6 numbers
List all passwords in rockyou.txt that begin with abcde and end with the number 9

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.
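One way to validate a generated regular expression is to count matches with a second implementation and compare the totals. The sketch below uses Python's re module for this. Its syntax overlaps with the extended regular expressions used by egrep but is not identical, so this is a sanity check rather than a proof of equivalence; sticking to constructs such as ^, $, [0-9], and {n} keeps the two comparable. The function names, the sample words, and the pattern are all invented for illustration and are not taken from the tasks or from rockyou.txt.

```python
import re


def count_matches(pattern: str, lines: list[str]) -> int:
    """Count lines in which the pattern matches anywhere, like egrep -c."""
    regex = re.compile(pattern)
    return sum(1 for line in lines if regex.search(line))


def count_in_file(pattern: str, path: str) -> int:
    """Same count over a file; undecodable bytes are replaced, not fatal."""
    regex = re.compile(pattern)
    with open(path, encoding="utf-8", errors="replace") as f:
        return sum(1 for line in f if regex.search(line.rstrip("\n")))


# Made-up sample words for illustration only.
sample = ["password1", "letmein", "qwerty12", "hunter2", "2fast4u99"]

if __name__ == "__main__":
    # Illustrative pattern: words that end in two digits.
    print(count_matches(r"[0-9]{2}$", sample))  # prints 2
```

If the count from count_in_file disagrees with the count reported by the agent's egrep command, one of the two expressions (or the file handling) deserves a closer look.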

Searching for files from the Linux command line is often done with the find command and its many flags. In the past, administrators had to memorize the flags or read the man pages to craft the appropriate command; an LLM may now obviate the need to do so.

Run the agent with a prompt that asks it to utilize a set of tools that allow it to find particular files in the file system.

uv run command_mcp_client.py \
"Answer the user's request by running Linux commands find and sudo via the tools given."

Then, query the agent to execute commands using Linux's find that search for particular files in the file system.

Find all setuid and setgid programs in /bin
Find all ELF executables in /bin
Find all files modified in the last day in /etc
Find all files created in the last day in /etc

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.
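The properties these prompts ask about are visible through a file's metadata and first bytes, which gives a way to cross-check what a generated find command returns. The sketch below tests the setuid and setgid permission bits, the four-byte ELF magic number, and the modification time. All function names are hypothetical, and the code is a rough reference rather than a replacement for find.

```python
import os
import stat
import time


def is_setid(path: str) -> bool:
    """True for a regular file with the setuid or setgid bit set."""
    mode = os.lstat(path).st_mode
    return stat.S_ISREG(mode) and bool(mode & (stat.S_ISUID | stat.S_ISGID))


def is_elf(path: str) -> bool:
    """True for a regular file that starts with the ELF magic bytes."""
    if not stat.S_ISREG(os.lstat(path).st_mode):
        return False
    try:
        with open(path, "rb") as f:
            return f.read(4) == b"\x7fELF"
    except OSError:
        return False  # unreadable files are skipped


def modified_within(path: str, seconds: float) -> bool:
    """True if the file's modification time is within the last `seconds`."""
    return time.time() - os.lstat(path).st_mtime <= seconds


def walk_files(root: str):
    """Yield the path of every directory entry that is not a directory."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            yield os.path.join(dirpath, name)


if __name__ == "__main__":
    setid = [p for p in walk_files("/bin") if is_setid(p)]
    print(len(setid), "setuid/setgid files under /bin")
```

On distributions where /bin is a symbolic link to /usr/bin, find does not follow a symlink given as its starting point by default, so a generated command may return only /bin itself; the Python walk above does descend into it, which makes the discrepancy easy to spot.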

Firewalls and filtering allow one to restrict traffic to and from a machine, reducing its attack surface and limiting the impact that a compromise might have on the rest of the network. On Linux, iptables provides a mechanism for specifying rules that are applied to traffic. For this exercise, because an incorrectly executed command might cause us to lose connectivity to the VM, we'll only ask the agent to generate the iptables commands rather than run them.

Run the agent with a prompt that asks it to produce commands that allow it to implement a firewall policy the user requests.

uv run command_mcp_client.py \
"Answer the user's prompt by specifying iptables commands that implement what is asked."

Then, query the agent to generate commands that implement a variety of rules using iptables.

What command allows ssh traffic from 131.252.220.0/24?
What command allows incoming connections to ports 80 and 443 from 131.252.0.0/16?
What command allows outgoing connections to a MySQL database server at 10.0.0.10?
What command denies all traffic from 1.1.1.1?

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.
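Since these commands are not executed, one has to reason about whether the address ranges in a generated rule cover what was asked. The sketch below uses Python's standard ipaddress module to show what the CIDR blocks in the queries above contain. The test addresses and the helper name in_network are arbitrary choices for illustration.

```python
import ipaddress

# CIDR blocks taken from the queries above.
ssh_sources = ipaddress.ip_network("131.252.220.0/24")
web_sources = ipaddress.ip_network("131.252.0.0/16")


def in_network(source: str, network: ipaddress.IPv4Network) -> bool:
    """True if the source address falls inside the given network."""
    return ipaddress.ip_address(source) in network


if __name__ == "__main__":
    print(ssh_sources.num_addresses)                  # 256 addresses in a /24
    print(web_sources.num_addresses)                  # 65536 addresses in a /16
    print(in_network("131.252.220.66", ssh_sources))  # True
    print(in_network("131.252.221.1", ssh_sources))   # False: outside the /24
    print(in_network("131.252.221.1", web_sources))   # True: inside the /16
```

A rule whose source range is wider or narrower than the requested block (for example, a /16 where a /24 was asked for) is a common way for a generated command to look plausible while implementing a different policy.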

It is helpful for Linux administrators to be able to examine the log files on a system to see what actions have occurred on it. To this end, we'll use an agent to compose a set of Linux commands for analyzing activity on a system from its audit logs.

Run the agent with a prompt that asks it to utilize a set of commands that allow it to extract information from audit logs on the system.

uv run command_mcp_client.py \
"Answer the user's prompt by using sudo to run Linux commands who, last, lastb, lastlog, egrep, head, or tail via the tools given."

Then, query the agent to execute Linux commands that perform the forensic queries below:

Find all currently logged in users
Find the last 5 successful logins
Find the last 5 unsuccessful logins and the IP addresses they came from

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.

Another useful function for an administrator is to analyze network traffic going to and from the machine. In this exercise, we'll utilize the agent to perform packet captures using tcpdump in order to answer queries from the user. Run the agent with a prompt that asks it to utilize tcpdump to answer questions about network connections on the machine.

uv run command_mcp_client.py \
"You are a tcpdump agent.  Use tcpdump to print the next 20 packets in order to answer the user's question"

Create a second terminal session on the course VM to execute network commands. For each of the queries and commands below, issue the query to the agent then run the network command on the second terminal. Show the results.

Agent query	Network command
What destination ports are being accessed currently?	`dig www.pdx.edu`
What was the destination of the last web request?	`curl https://oregonctf.org`
What addresses are communicating over ssh?	`ssh linux.cs.pdx.edu`

For each query, take a screenshot of the results that includes your OdinId.

Linux administrators might be interested in whether or not packages and services they are running are vulnerable to attack. In this exercise, we'll examine the utility of two ways of performing this task. The first is to use our prior approach to allow an agent to compose a set of Linux commands for doing so. Run the agent with a prompt that asks it to utilize a set of commands that allow it to determine whether a system is vulnerable.

uv run command_mcp_client.py \
"Answer the user's prompt by using sudo to run Linux commands systemctl, dpkg-query, and apt via the tools given."

Then, query the agent to see what commands it generates and the results it produces when given the following prompts:

What is the ssh server being run on this machine?
Does the current ssh server being run have any vulnerabilities associated with it?

For each task, examine the command tool call generated.

Include a table of the successful commands generated for each task in your notebook.

We can instead try to customize the behavior of an agent by creating custom tools that utilize these commands and only allow the LLM to call them in order to answer the prompts. An example is provided in the repository. Change into the exercise directory and install the packages.

cd ../02_package
uv init
uv add -r requirements.txt

Consider the three tools that have been implemented in an MCP server for querying packages.

package_mcp_server.py

@mcp.tool("list_running_services")
def list_running_services():
    """Retrieves the list of running services on the host"""
    result = subprocess.run(['systemctl', 'list-units', '--type=service', '--state=running'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return result.stdout

@mcp.tool("list_installed_packages")
def list_installed_packages():
    """Retrieves the list of installed packages and their versions on the host"""
    result = subprocess.run(['dpkg-query', '-W', '-f=${Package} ${Version}\n'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return result.stdout

@mcp.tool("list_package_vulnerabilities")
def is_package_vulnerable(package_name: str):
    """Look up vulnerabilities of a package across all versions given the name of the package given as a string.  Find authors of package updates."""
    result = subprocess.run(['apt', 'changelog', package_name], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
    return result.stdout
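Each of these tools returns raw text, which the LLM must interpret on its own. To get a feel for what that text contains, the sketch below parses the "name version" lines that the dpkg-query format string above produces and pulls CVE identifiers out of changelog text. The function names and all sample strings are invented for illustration; the CVE identifiers shown are placeholders, not real entries.

```python
import re

# CVE identifiers have the form CVE-<year>-<four or more digits>.
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def parse_packages(dpkg_output: str) -> dict[str, str]:
    """Map package name to version from lines of the form 'name version'."""
    packages = {}
    for line in dpkg_output.splitlines():
        name, sep, version = line.partition(" ")
        if sep:  # skip blank or malformed lines
            packages[name] = version
    return packages


def extract_cves(changelog: str) -> list[str]:
    """Return the distinct CVE identifiers mentioned in changelog text."""
    return sorted(set(CVE_PATTERN.findall(changelog)))


if __name__ == "__main__":
    # Placeholder data, not real package or CVE information.
    sample_packages = "examplepkg 1.2-3\notherpkg 0.1\n"
    sample_changelog = (
        "examplepkg (1.2-3) stable; urgency=high\n"
        "  * Fix buffer overflow (CVE-2099-0001)\n"
        "  * Backport fix for CVE-2099-12345 and CVE-2099-0001\n"
    )
    print(parse_packages(sample_packages))
    print(extract_cves(sample_changelog))
```

Note that a CVE appearing in a changelog usually marks a fix that has been applied, so its presence alone does not show that the installed version is vulnerable. This is worth keeping in mind when evaluating the agent's answers.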

Run the agent in the repository that implements these tools to answer queries.

uv run 02_package.py

Repeat the prompts:

What is the ssh server being run on this machine?
Does the current ssh server being run have any vulnerabilities associated with it?

Compare the results to the prior version and answer the following questions for your lab notebook.

Did the custom tools allow the agent to provide a more accurate answer?