Agents have the ability to write their own plans, execute parts of them, and update the plans during execution, making them extremely powerful tools for automating tasks. LangChain comes with support for building agents. In this lab, we'll experiment with a number of agents, leverage built-in tools that allow them to perform tasks, and write custom tools for them to utilize. We will leverage the ReAct agent, whose description can be found here. ReAct combines reasoning and action in order to handle a user's request.

Setup

Within the repository, create a virtual environment, activate it, and then install the packages required.

cd cs410g-src
git pull
cd 04*
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt

LangChain's documentation for their ReAct implementation can be found here. We'll be using LangChain's ReAct prompt template to drive our agent (langchain-ai/react-agent-template). The developer typically provides the initial instructions for the agent, which are then spliced into the base prompt template. For example, consider the code below.

from langchain import hub
base_prompt = hub.pull("langchain-ai/react-agent-template")
prompt = base_prompt.partial(
             instructions="Answer the user's request utilizing at most 8 tool calls"
         )
print(prompt.template)

After splicing in the instructions to the template, the output is the prompt shown below.

Answer the user's request utilizing at most 8 tool calls

TOOLS:
------
You have access to the following tools:
{tools}

To use a tool, please use the following format:
```
Thought: Do I need to use a tool? Yes
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
```

When you have a response to say to the Human, or if you do not need to use a tool, you MUST use the format:
```
Thought: Do I need to use a tool? No
Final Answer: [your response here]
```

Begin!

Previous conversation history:
New input: {input}
{agent_scratchpad}

Although there are other ways of implementing agents, we will be using this ReAct prompt and agent in our examples.
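Under the hood, the agent executor drives a loop around this format: the model emits the Thought/Action/Action Input lines, the executor runs the named tool and appends the resulting Observation to the scratchpad, and the loop repeats until a Final Answer appears. A toy, LLM-free sketch of that loop (illustrative only, not LangChain's actual implementation):

```python
def run_react(llm_step, tools, user_input):
    """Minimal ReAct driver: call the LLM until it emits a Final Answer."""
    scratchpad = ""
    while True:
        reply = llm_step(user_input, scratchpad)
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()
        # Parse the Action / Action Input lines the prompt format requires
        action = reply.split("Action:", 1)[1].split("\n", 1)[0].strip()
        arg = reply.split("Action Input:", 1)[1].split("\n", 1)[0].strip()
        # Run the named tool and record the Observation in the scratchpad
        scratchpad += f"{reply}\nObservation: {tools[action](arg)}\n"

# Demo with a fake two-step LLM and a single "double" tool
steps = iter([
    "Thought: Do I need to use a tool? Yes\nAction: double\nAction Input: 21",
    "Thought: Do I need to use a tool? No\nFinal Answer: 42",
])
print(run_react(lambda query, scratchpad: next(steps),
                {"double": lambda x: str(int(x) * 2)},
                "what is 21 doubled?"))  # 42
```

The real AgentExecutor adds output parsing, error handling, and iteration limits, but the core control flow is the same.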

Agents have the ability to plan out and execute sequences of tasks in order to answer user queries. A simple, albeit dangerous, example is shown below that utilizes a Python REPL (Read-Eval-Print Loop) tool to execute arbitrary Python code. By combining an LLM's ability to generate code with a tool to execute it, a user is able to write and run programs in English. The code instantiates the tool and adds it to the tools list as the agent's only tool. It then initializes a prompt that instructs the agent to use only the Python REPL to produce a response.

from langchain.agents import AgentExecutor, create_react_agent
from langchain_experimental.tools import PythonREPLTool
tools = [PythonREPLTool()]

instructions = """You are an agent designed to write and execute python code to
answer questions.  You have access to a python REPL, which you can use to execute
python code.  If you get an error, debug your code and try again.  Only use the
output of your code to answer the question.  You might know the answer without
running any code, but you should still run the code to get the answer.  If it does
not seem like you can write code to answer the question, just return 'I don't know'
as the answer.
"""
base_prompt = hub.pull("langchain-ai/react-agent-template")
prompt = base_prompt.partial(instructions=instructions)

From this, the ReAct agent is instantiated with the prompt, the tool, and the LLM. It is then used to execute the user's request.

agent = create_react_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# agent_executor.invoke({"input":"What is the 5th smallest prime number squared?"})

Run the Python REPL agent in the repository. The code implements an interactive interface that uses Python to perform the tasks the user enters.

python3 01_tools_python.py
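The interactive interface is essentially a read-eval loop that hands each line the user types to the agent executor. A hypothetical sketch of such a loop (the actual script may differ), using a stand-in executor so it can run without an LLM:

```python
class EchoExecutor:
    """Stand-in for the AgentExecutor so the loop can run without an LLM."""
    def invoke(self, inputs):
        return {"output": f"echo: {inputs['input']}"}

def repl(agent_executor, read=input):
    """Read requests until 'exit'/'quit' or EOF, handing each to the agent."""
    outputs = []
    while True:
        try:
            line = read(">> ")
        except EOFError:
            break
        if line.strip().lower() in ("exit", "quit"):
            break
        outputs.append(agent_executor.invoke({"input": line})["output"])
    return outputs

lines = iter(["What is 2+2?", "exit"])
print(repl(EchoExecutor(), read=lambda prompt: next(lines)))
# ['echo: What is 2+2?']
```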

Ask the agent the following requests multiple times and examine the output traces for variability and correctness.

Agents can utilize any collection of tools to perform actions such as retrieving data from the Internet or interacting with the local machine. LangChain comes with an extensive library of tools to choose from. Consider the program snippet below that uses the SerpApi (Google Search Engine results), LLM-Math, Wikipedia, and Terminal (i.e. shell command) tools to construct an agent. Note that, like the Python REPL tool, the Terminal tool is a dangerous one to include in the application, requiring us to explicitly allow its use.

tools = load_tools(["serpapi", "llm-math","wikipedia","terminal"], llm=llm, allow_dangerous_tools=True)

base_prompt = hub.pull("langchain-ai/react-agent-template")
prompt = base_prompt.partial(instructions="Answer the user's request utilizing at most 8 tool calls")

agent = create_react_agent(llm,tools,prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# agent_executor.invoke(
#     {"input": "What percent of electricity is consumed by data centers in the US?"}
# )

Run the program.

python 02_tools_builtin.py

Within its interactive loop, ask the agent to perform the following tasks and see which tools and reasoning approaches it utilizes to answer the prompt.

Toolkits are collections of tools that are designed to be used together for specific tasks. One useful toolkit is the Natural Language API toolkit (NLAToolkit). Many web applications now utilize backend REST APIs to handle client requests. To support automatically producing code that interacts with such APIs, the OpenAPI standard allows an API developer to publish a machine-readable specification of their API interface. Consider a snippet of the OpenAPI specification for the xkcd comic strip below. It contains the URL of the server hosting the APIs as well as endpoint paths for handling two API requests: one to fetch the current comic and one to fetch a specific comic given its comicId.

openapi: 3.0.0
info:
  description: Webcomic of romance, sarcasm, math, and language.
  title: XKCD
  version: 1.0.0
externalDocs:
  url: https://xkcd.com/json.html
paths:
  /info.0.json:
    get:
      description: |
        Fetch current comic and metadata.
  . . .
  "/{comicId}/info.0.json":
    get:
      description: |
        Fetch comics and metadata by comic id.
      parameters:
        - in: path
          name: comicId
          required: true
          schema:
            type: number
   . . .
servers:
  - url: http://xkcd.com/
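For reference, the endpoints above return JSON metadata with fields such as num, title, and img that the generated tools parse. A stdlib sketch using an abbreviated, made-up response (the comic data shown is illustrative, not a live fetch):

```python
import json

# Abbreviated, illustrative response from the /info.0.json endpoint
sample = ('{"num": 2000, "title": "xkcd Phone 2000", '
          '"img": "https://imgs.xkcd.com/comics/xkcd_phone_2000.png"}')
comic = json.loads(sample)
print(comic["num"], comic["title"])
```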

We can utilize the NLAToolkit to access these APIs given the OpenAPI specification. To begin with, however, we'll need to patch its current version.

sed -i '15a\from langchain.chains.api.openapi.chain import OpenAPIEndpointChain' ./env/lib/python3.*/site-packages/langchain_community/agent_toolkits/nla/tool.py

The toolkit generates a tool for each API endpoint. For interacting with the xkcd API, the code below generates the tools from the site's OpenAPI specification given by its URL. It then instantiates an agent using the tools.

from langchain.agents import create_react_agent, AgentExecutor
from langchain_community.agent_toolkits import NLAToolkit
from langchain_google_genai import GoogleGenerativeAI
from langchain import hub

llm = GoogleGenerativeAI(model="gemini-pro")

toolkit = NLAToolkit.from_llm_and_url(llm,
    "https://raw.githubusercontent.com/APIs-guru/unofficial_openapi_specs/master/xkcd.com/1.0.0/openapi.yaml"
)
tools = toolkit.get_tools()

base_prompt = hub.pull("langchain-ai/react-agent-template")
prompt = base_prompt.partial(instructions="Answer the user's request")

agent = create_react_agent(llm,tools,prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
# agent_executor.invoke({"input":"What's the latest xkcd?"})

The program in the repository provides an interactive shell for querying the xkcd API with the code above. It also shows the tools that are generated by the NLAToolkit for use by the ReAct agent.

python 03_toolkits_nlapi.py

Ask the questions below to test the agent and its tools.

Another useful toolkit is the SQLDatabaseToolkit, which provides a set of SQL functions to support querying a database using natural language. The code below shows a simple program that takes a SQLite3 database provided as an argument and handles queries against it driven by the user's prompts.

from langchain.agents import create_sql_agent
from langchain_community.agent_toolkits import SQLDatabaseToolkit
from langchain.sql_database import SQLDatabase
from langchain_google_genai import GoogleGenerativeAI
from langchain.agents import AgentExecutor
import sys

database = sys.argv[1]
llm = GoogleGenerativeAI(model="gemini-pro",temperature=0)
db = SQLDatabase.from_uri(f"sqlite:///{database}")
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    verbose=True
)
# agent_executor.invoke("How many users are there in this database?")
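For context, the toolkit's tools ultimately issue ordinary SQL against the SQLite file, starting with schema discovery. A stdlib sketch of the table-listing step, run against a throwaway in-memory database (illustrative of the idea, not the toolkit's exact queries):

```python
import sqlite3

# Build a throwaway in-memory database so the discovery query has a table to find
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, passhash TEXT)")

# The table-listing step: enumerate table names from sqlite_master
tables = [name for (name,) in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # ['users']
```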

Using this program, we'll interact with two databases that reside in the repository: metactf_users.db and roadrecon.db (as provided here).

metactf_users.db

The program in the repository provides an interactive shell for querying a database you supply it. Run the program using the metactf_users.db database that is given. It shows the tools and the database being used.

python 04_toolkits_sql.py db_data/metactf_users.db

This database is used to store usernames and password hashes for the MetaCTF sites such as cs205.oregonctf.org. Examine the database via the following prompts. Note that you may need to run the query multiple times as the execution of the agent is typically not deterministic.

roadrecon.db

A more complex database is also included at roadrecon.db. Unless the model being used has a capable reasoning engine, it might be more difficult to use prompts to access the information sought. Run the SQLAgent program using this database as an argument.

python 04_toolkits_sql.py db_data/roadrecon.db

As before, examine the database via the following prompts.

In the SQLAgent examples, the agent must generate a plan for querying the SQL database while starting with no knowledge of the database schema. As a result, it can generate any number of strategies to handle a user's query, updating its plan as queries fail. While being explicit in one's query can help the SQLAgent avoid calls that fail, augmenting it with knowledge we already have about the database can avoid errors and reduce the number of calls to the database and the underlying LLM. Custom tools allow the developer to implement such database-specific functions for the agent to use. There are multiple ways of implementing custom tools; in this exercise, we will begin with the most concise, but most limited, approach.

@tool decorator

LangChain provides a decorator for turning a Python function into a tool that an agent can utilize as part of its execution. Key to the definition of the function is a Python docstring that the agent can utilize to understand when and how to call the function. Because each tool is annotated with a description of what it is to be used for, the LLM is able to forward calls to the appropriate tool based on what the user's query asks. To show this approach, we revisit the SQLAgent program by creating custom tools for handling specific queries from the user on the metactf_users.db database.

The first tool we create, fetch_users, fetches all users in the database using our prior knowledge of the database's schema. This tool allows the agent to obtain the users without having to query for the column in the schema first, as it typically would need to do otherwise.

from langchain_core.tools import tool
import ast, json

@tool
def fetch_users():
    """Useful when you want to fetch the users in the database.  Takes no arguments.  Returns a list of usernames in JSON."""
    res = db.run("SELECT username FROM users;")
    result = [el for sub in ast.literal_eval(res) for el in sub]
    return json.dumps(result)
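The ast.literal_eval step is needed because db.run returns rows as a string representation of Python tuples. A standalone illustration using a made-up result string:

```python
import ast, json

# db.run() returns query results as a string of Python tuples, e.g.:
res = "[('admin',), ('demo0',)]"

rows = ast.literal_eval(res)                 # parse back into a list of tuples
result = [el for sub in rows for el in sub]  # flatten the one-element tuples
print(json.dumps(result))                    # ["admin", "demo0"]
```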

The second tool we create, fetch_users_pass, fetches a particular user's password hash from the database. As before, we use knowledge of the schema to ensure the exact SQL query we require is produced. Note that the code has a security flaw in it.

@tool
def fetch_users_pass(username):
    """Useful when you want to fetch a password hash for a particular user.  Takes a username as an argument.  Returns a JSON string"""
    res = db.run(f"SELECT passhash FROM users WHERE username = '{username}';")
    result = [el for sub in ast.literal_eval(res) for el in sub]
    return json.dumps(result)

Given these custom tools, we can instantiate a SQL agent that includes them as extra tools so that they can be utilized in our queries.

toolkit = SQLDatabaseToolkit(db=db,llm=llm)
agent_executor = create_sql_agent(
    llm=llm,
    toolkit=toolkit,
    extra_tools=[fetch_users, fetch_users_pass],
    verbose=True
)
# agent_executor.invoke("What are the password hashes for admin and demo0?")

Run the agent using the metactf_users.db database.

python 05_tools_custom_decorator.py db_data/metactf_users.db

Test it with queries that utilize both the custom and built-in tools of the SQLAgent.

The last example intentionally contained a SQL injection vulnerability. As a result, when the LLM was prompted to look for the canonical SQL injection user "admin' OR '1'='1", it simply passed the string directly to the vulnerable fetch_users_pass function, which then included it in the final SQL query string. One way of eliminating this vulnerability is to perform input validation on the data with which the LLM invokes the tool. This is often done via the Pydantic data validation package. LangChain integrates Pydantic throughout its classes and provides support within the @tool decorator for performing input validation using it. Revisiting the custom tool example, we can modify the fetch_users_pass tool to include a Pydantic class definition, FetchUsersPassInput, against which the input passed to the tool is validated.
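Before looking at the validated version, it helps to see concretely what the injection payload does to the query string. A standalone illustration of the flaw (build_query is a stand-in for the interpolation inside fetch_users_pass):

```python
def build_query(username: str) -> str:
    # Vulnerable: the username is interpolated directly into the SQL string
    return f"SELECT passhash FROM users WHERE username = '{username}';"

print(build_query("admin' OR '1'='1"))
# SELECT passhash FROM users WHERE username = 'admin' OR '1'='1';
```

The quotes in the payload close the string literal early, turning the WHERE clause into a tautology that matches every row.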

from langchain_core.pydantic_v1 import BaseModel, Field, root_validator

class FetchUsersPassInput(BaseModel):
    username: str = Field(description="Should be an alphanumeric string")
    @root_validator
    def is_alphanumeric(cls, values: dict) -> dict:
        if not values.get("username").isalnum():
            raise ValueError("Malformed username")
        return values

@tool("fetch_users_pass", args_schema=FetchUsersPassInput, return_direct=True)
def fetch_users_pass(username):
    """Useful when you want to fetch a password hash for a particular user.  Takes a username as an argument.  Returns a JSON string"""
    res = db.run(f"SELECT passhash FROM users WHERE username = '{username}';")
    result = [el for sub in ast.literal_eval(res) for el in sub]
    return json.dumps(result)

Run the agent using the metactf_users.db database.

python 06_tools_custom_pydantic.py db_data/metactf_users.db

Test the version using the SQL injection query and see how the injection has now been prevented.

Putting it all together, the next agent employs built-in tools, custom tools, and input validation to implement an application for performing DNS queries. The agent has two custom tools. The first performs a DNS resolution, returning an IPv4 address given a well-formed DNS hostname as input.

import dns.resolver, dns.reversename
import validators
class LookupNameInput(BaseModel):
    hostname: str = Field(description="Should be a hostname such as www.google.com")
    @root_validator
    def is_dns_address(cls, values: dict) -> dict:
        if validators.domain(values.get("hostname")):
            return values
        raise ValueError("Malformed hostname")

@tool("lookup_name",args_schema=LookupNameInput, return_direct=True)
def lookup_name(hostname):
    """Given a DNS hostname, it will return its IPv4 addresses"""
    result = dns.resolver.resolve(hostname, 'A')
    res = [ r.to_text() for r in result ]
    return res[0]

The second performs a reverse DNS lookup, returning a DNS name given a well-formed IPv4 address as input.

import re

class LookupIPInput(BaseModel):
    address: str = Field(description="Should be an IP address such as 208.91.197.27 or 143.95.239.83")
    @root_validator
    def is_ip_address(cls, values: dict) -> dict:
        if re.match(r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$", values.get("address")):
            return values
        raise ValueError("Malformed IP address")

@tool("lookup_ip", args_schema=LookupIPInput, return_direct=True)
def lookup_ip(address):
    """Given an IP address, returns names associated with it"""
    n = dns.reversename.from_address(address)
    result = dns.resolver.resolve(n, 'PTR')
    res = [ r.to_text() for r in result ]
    return res[0]
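As an aside, the hand-written IPv4 regex can be replaced by the standard library's ipaddress module, which is easier to get right. A sketch of an equivalent check:

```python
import ipaddress

def is_ipv4(address: str) -> bool:
    """Validate a dotted-quad IPv4 address without a regex."""
    try:
        ipaddress.IPv4Address(address)
        return True
    except ValueError:
        return False

print(is_ipv4("208.91.197.27"))  # True
print(is_ipv4("999.1.1.1"))      # False
```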

The custom tools are then added alongside built-in tools for searching and the terminal to instantiate the agent.

tools = load_tools(["serpapi", "terminal"], allow_dangerous_tools=True) + [lookup_name, lookup_ip]

agent = create_react_agent(llm,tools,prompt)

Run the agent and test the tools.

python3 07_tools_custom_agent.py

LangSmith is a service hosted by LangChain meant to make debugging, tracing, and information management easier when developing applications using LangChain. LangSmith is especially useful when creating agents, which can have complex interactions and reasoning loops that are hard to debug. In order to use LangSmith, an API key is needed. First, sign up for an account on LangChain's site using your Portland State University account. Then, visit your account's settings and click on "Create API key".

On the Linux VM, set an environment variable that contains the value of the API key.

export LANGCHAIN_API_KEY="<FMI>"

Note that you can add this to your .bashrc file to automatically set the key each time you log in.

Instrument code

Now that the logistics are out of the way, LangSmith logging functionality can be easily incorporated into LangChain applications. In the previous custom agent code, we have placed instrumentation code to enable the tracing of the application, but have commented it out. Revisit the code for the agent, then find and uncomment the code below. The code instantiates the LangSmith client and sets environment variables to enable tracing and identify the logging endpoint for a particular project on your LangChain account.

from langsmith import Client
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = "LangSmith Introduction"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
client = Client()

After enabling LangSmith tracing, run the program.

python 07_tools_custom_agent.py

Enter a query. Then, after execution, navigate to the LangSmith projects tab in the UI; the project "LangSmith Introduction" should appear in the projects list. Select the project and examine the execution logs for the query. Navigate the interface to see the trace information it has collected. Such information will be useful as you build your own agents.