Bring up an SSH session on your Linux VM. Clone the course repository, create a Python virtual environment, activate it, and then install the packages needed to use LangChain with Google's models. LangChain is a Python framework that makes it easy for developers to build applications on top of language models.

git clone https://github.com/wu4f/cs410g-src
cd cs410g-src/01*
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt

Examine the 01_gemini.py program. The program instantiates the model and then calls its invoke() method with a prompt to generate a completion. The invoke() call blocks until the entire response has been generated before returning the result.

from langchain_google_genai import GoogleGenerativeAI
llm = GoogleGenerativeAI(model="gemini-pro")
response = llm.invoke("Write me a haiku about Portland State University")
print(response)

Ensure the appropriate environment variable is set to specify your API key. Then, run the program and view the results the model returns.
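
The langchain_google_genai package typically reads the key from the GOOGLE_API_KEY environment variable; assuming that is the variable your setup uses, set it before running the program:

export GOOGLE_API_KEY="<your-api-key>"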

python3 01_gemini.py

Next, examine the Python snippet below. Instead of using the model's invoke() method, it uses its stream() method, which returns results to the program as they are generated. As longer responses are produced, the model passes back partial results in a generator that is iteratively printed in the for loop.

from langchain_google_genai import GoogleGenerativeAI
llm = GoogleGenerativeAI(model="gemini-pro")

for chunks in llm.stream("Write me a short story about a rabbit"):
    print(chunks,end="")

We can wrap the execution of the model with a simple interactive shell that accepts input in a loop and sends it to the language model. The shell exits when a blank line is given.

while True:
    line = input("llm>> ")
    if line:
        for chunks in llm.stream(line):
            print(chunks,end="")
        print("")
    else:
        break

Run the program and experiment with the model. Note that this particular program does not store prior conversational messages, as one would expect a chatbot to do.

python3 02_gemini_interact.py
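
As noted above, the program sends each line of input to the model on its own, with no memory of earlier turns. A minimal, hypothetical sketch of how prior messages could be carried forward is to accumulate the conversation in a string and resend it with each query (the course program does not do this):

history = ""
while True:
    line = input("llm>> ")
    if not line:
        break
    history += f"User: {line}\n"
    reply = ""
    # Resend the accumulated conversation so the model sees prior turns
    for chunk in llm.stream(history + "Assistant:"):
        print(chunk, end="")
        reply += chunk
    print("")
    history += f"Assistant: {reply}\n"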

LangChain provides a simple way to access a subset of the Gemini model's features via API calls. If one wishes to access the full feature set of a model, as well as leverage the computing infrastructure of Google Cloud to help run applications, one can do so via Vertex AI. In this step, we will do so using a Python Jupyter notebook. From the Google Cloud Platform console, navigate to the Vertex AI Dashboard, enable the recommended APIs, then continue to Colab Enterprise (a managed Jupyter notebook service).

On your local machine, clone the course repository.

git clone https://github.com/wu4f/cs410g-src

Within Colab Enterprise, click on the upload button and upload the Python notebook 03_gemini.ipynb from the repository. Then, connect it to a runtime environment. This will create a GPU-enabled Compute Engine instance that will run your notebook.

Run all cells in the notebook and ensure you are able to query the Gemini Pro model.
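
The notebook's cells may differ in detail, but querying Gemini through the Vertex AI Python SDK looks roughly like the sketch below; the project ID and region are placeholders for your own values.

import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholders: substitute your own project ID and region
vertexai.init(project="your-project-id", location="us-west1")

model = GenerativeModel("gemini-pro")
response = model.generate_content("Write me a haiku about Portland State University")
print(response.text)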

We can programmatically access our Ollama models from Python as well. If it is not still running, bring up the Ollama server for your project. Then, within the Compute Engine interface, make a note of its external IP address (e.g. 35.230.93.16 below).
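
Assuming the gcloud CLI is configured for your project, the external IP addresses of your instances can also be listed from the command line:

gcloud compute instances list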

Consider the snippet below. It instantiates an Ollama model, given the name of a model that has been installed on the Ollama server (e.g. llama2) and the location of the server's API endpoint, specified by the IP address in the base_url parameter.

from langchain_community.llms import Ollama
llm = Ollama( model="llama2", base_url="http://35.230.93.16:11434" )
for chunks in llm.stream("Write me a short story about a rabbit"):
    print(chunks,end="")

On your Linux VM, run the program below that implements a simple interactive shell for querying your Ollama server, supplying the IP address of your Ollama server and the model on the server you want to query as arguments. Test out the functionality of the models you've installed.

python3 04_ollama.py <OllamaIPAddress> <OllamaModel>
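
The program reads the server address and model name from the command line before entering a prompt loop. A minimal sketch of how it might be structured (the actual 04_ollama.py may differ) is:

import sys
from langchain_community.llms import Ollama

# Command-line arguments: Ollama server IP address and model name
ip_address, model_name = sys.argv[1], sys.argv[2]
llm = Ollama(model=model_name, base_url=f"http://{ip_address}:11434")

# Interactive prompt loop; exits on a blank line
while True:
    line = input("llm>> ")
    if not line:
        break
    for chunk in llm.stream(line):
        print(chunk, end="")
    print("")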

One of the benefits of LangChain is that it provides a unified interface for integrating models into an application. This allows one to swap different models in and out and compare their behavior and performance for a given application. One useful LangChain module for doing so is the Model Laboratory. With it, one can instantiate a range of models and automatically run a given prompt across all of them, then analyze each model's results. The snippet below shows an example using models pulled from HuggingFace, OpenAI, and Google.

from langchain.model_laboratory import ModelLaboratory

from langchain_community.llms import HuggingFaceEndpoint
from langchain_google_genai import GoogleGenerativeAI
from langchain_community.llms import HuggingFaceHub
from langchain_openai import OpenAI

llms = [
    HuggingFaceEndpoint(repo_id="mistralai/Mistral-7B-Instruct-v0.2", max_new_tokens=128),
    HuggingFaceHub(repo_id="google/flan-t5-small"),
    HuggingFaceHub(repo_id="google/gemma-2b"),
    OpenAI(temperature=0),
    GoogleGenerativeAI(model="models/text-bison-001"),
    GoogleGenerativeAI(model="gemini-pro")
]
model_lab = ModelLaboratory.from_llms(llms)
model_lab.compare("Who is the president of the United States?")

Examine the program provided in the repository that iteratively queries a set of LLMs with your prompts. Edit the file to select the models you want to compare, then run the program and examine how the models compare across a variety of prompts.

python3 05_multi.py

There are a number of models that can now analyze and produce images and video. In this set of exercises, we'll use a variety of examples to see how we can leverage these tools programmatically to perform a range of tasks. The code below shows how one can leverage the Gemini Vision model API from LangChain. In the code, as part of the chat message, we include structured input that encodes the URL of the image we want to analyze and the prompt we'd like the model to answer about the image.

from langchain_core.messages import HumanMessage
from langchain_google_genai import ChatGoogleGenerativeAI

llm = ChatGoogleGenerativeAI(model="gemini-pro-vision")

def query_llm(llm, query, image_url):
  message = HumanMessage(
      content=[
          {
              "type": "text",
              "text": query,
          },
          {   "type": "image_url",
              "image_url": image_url
          }
      ]
  )
  response = llm.invoke([message])
  return response

prompt = "What is going on in this image?"
image_url = 'https://portswigger.net/cms/images/91/43/e4e5-article-popovers-hidden-inputs.png'
print(query_llm(llm, prompt, image_url))

The repository includes an interactive application that tells you what is going on in an image, given its URL. Run it and test the model's ability to analyze images.

python3 06_gemini_vision.py

Accessing model functionality, especially for multimedia content, can be done more efficiently on the cloud platform itself. Within Vertex AI, go back to Colab Enterprise, click on the upload button, and upload the Python notebook 07_gemini_vision.ipynb from the repository. Then, connect it to a runtime environment. This will create a GPU-enabled Compute Engine instance that will run your notebook.

Run all cells in the notebook and ensure you are able to query the Gemini Pro Vision model. The notebook will walk you through the analysis of images downloaded via their URL as well as video stored in a storage bucket.
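
For the video portion, a cell in the notebook might resemble the sketch below, which uses the Vertex AI SDK to pass a video stored in a Cloud Storage bucket to the model. The bucket path is a placeholder, and the notebook's actual code may differ.

from vertexai.generative_models import GenerativeModel, Part

model = GenerativeModel("gemini-pro-vision")

# Placeholder gs:// URI; substitute the bucket and object the notebook uses
video = Part.from_uri("gs://your-bucket/your-video.mp4", mime_type="video/mp4")

response = model.generate_content([video, "What is going on in this video?"])
print(response.text)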