This project contains a conversational agent that can help you optimize your BigQuery queries. You can provide a BigQuery job ID, and the agent will analyze the query and provide optimization recommendations.
Follow these steps to set up and run the project.
- Git
- Google Cloud SDK (with the `gcloud` and `bq` CLIs authenticated)
- uv (a fast Python package installer and resolver)
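Before continuing, a quick way to verify that each prerequisite is installed and that `gcloud` is authenticated:

```bash
# Quick check that the prerequisites are installed and on your PATH
git --version
gcloud --version
bq version
uv --version

# Confirm gcloud has an active, authenticated account
gcloud auth list
```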
a. Clone the repository:

```bash
git clone https://github.com/NucleusEngineering/bq-optimization-agent
cd bq-optimization-agent
```

b. Configure Environment Variables:
Create a `.env` file by copying the example file:

```bash
cp .env.example .env
```

Then open the `.env` file and set your Google Cloud Project ID:

```bash
GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
```

After setting your project ID, load the environment variables by running:

```bash
source .env
```
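Optionally, you can confirm the variable is set and point the `gcloud` CLI at the same project. This is not required by the guide, just a quick sanity check:

```bash
# Confirm the variable was loaded by `source .env`
echo "$GOOGLE_CLOUD_PROJECT"

# Optional: make the gcloud CLI default to the same project
gcloud config set project "$GOOGLE_CLOUD_PROJECT"
```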
c. Install dependencies and run checks:
The `make install` command will create a virtual environment and install all the necessary dependencies. The `make check` command will run linting and static analysis to ensure the code is clean.
```bash
make install
make check
```

First, we need to make sure the Vertex AI API is enabled in our project:

```bash
gcloud services enable aiplatform.googleapis.com
```
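If you want to confirm the API is active before moving on, you can list the enabled services and look for Vertex AI:

```bash
# Should print a line for aiplatform.googleapis.com if the API is enabled
gcloud services list --enabled | grep aiplatform
```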
Now that Vertex AI is enabled, you can choose one of the following workflows:
This is the easiest way to interact with the agent.
```bash
make web
```

This command starts a web-based chat interface where you can provide a BigQuery job ID and get optimization recommendations.
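To have something to paste into the chat, you need a BigQuery job ID. One way to find one, assuming the `bq` CLI is authenticated against your project, is to list your recent jobs:

```bash
# List the ten most recent BigQuery jobs in the current project;
# the jobId column contains the IDs you can give to the agent.
bq ls -j -n 10

# Show the full details (including the query text) of a specific job
bq show -j <job_id>
```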
This workflow is for evaluating the agent's performance.
a. Prepare the evaluation dataset:
```bash
make evaluation_prep
```

This command runs a series of unoptimized queries against BigQuery to generate job IDs. These job IDs are then used to create the `evaluation_dataset.json` file, which is needed for the evaluation process.
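If you want to peek at what was generated, you can pretty-print the dataset. The path below is an assumption (the guide only says the dataset is built from the template in `eval_resources/`), so adjust it to wherever the file lands in your checkout:

```bash
# Pretty-print the generated dataset (path assumed; adjust if it differs)
uv run python -m json.tool eval_resources/evaluation_dataset.json
```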
b. Run the evaluation:
```bash
make evaluate
```

This command runs the evaluation script (`evaluate_agent.py`). The script uses the `evaluation_dataset.json` file to test the agent's ability to optimize queries and provides a summary of the results.
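Per the project structure section below, evaluation output lands in `eval_results/` and logs in `logs/`, so a quick way to look at the most recent run is:

```bash
# Show the newest files written by the evaluation run
ls -lt eval_results/ | head
ls -lt logs/ | head
```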
Congrats! You have successfully built and evaluated a BigQuery agent using the Agent Development Kit (ADK).
You now understand how to set up an ADK agent with BigQuery tools and measure its performance using custom evaluation metrics.
However, agent evaluation is an iterative process.
To improve the evaluation results further, you can tweak the system instructions or try out different models and their parameters.
Have fun improving your agent!
The primary way to interact with this project is through the Makefile commands.
- `make help`: Lists all available commands with a short description.
- `make install`: Creates a virtual environment and installs all the necessary dependencies.
- `make check`: Runs a series of checks to ensure code quality and consistency (linting, type checking, dependency checks).
- `make test`: Runs the test suite using `pytest`.
- `make web`: Starts a web-based chat interface to interact with the agent.
- `make evaluation_prep`: Prepares the evaluation dataset by running unoptimized queries to generate job IDs.
- `make evaluate`: Runs the agent evaluation script.
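Taken together, a full setup-to-evaluation pass can be chained from these targets (the same commands documented above, just run in sequence):

```bash
# Full pass: install, quality checks, tests, then the evaluation workflow
make install && make check && make test
make evaluation_prep && make evaluate
```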
Here is a brief overview of the main directories in this project:
- `bq_agent_app/`: Contains the core application code for the BigQuery Optimization Agent.
- `eval_resources/`: Holds all resources needed for evaluating the agent. This includes:
  - `evaluation_dataset.template.json`: A template for the evaluation dataset.
  - `populate_evaluation_dataset.sh`: A script to populate the evaluation dataset with real job IDs.
  - `eth_benchmarking_queries/`: A directory containing a set of Ethereum-related BigQuery queries used for benchmarking the agent's optimization capabilities.
    - The root of this directory contains the unoptimized SQL queries.
    - `evaluation/`: Contains evaluation versions of the queries, scripts to run them, and explanations of the optimizations.
    - `schemas/`: Contains the JSON schema definitions for the public Ethereum BigQuery tables used in the queries.
- `eval_results/`: Stores the output results from the evaluation runs.
- `logs/`: Contains log files generated by the various scripts, such as `populate_evaluation_dataset.sh` and `evaluate_agent.py`.