This is an LLM agent that predicts the outcome of a professional Counter-Strike 2 match. It analyzes recent news for each team, along with their stats, to produce a prediction.
To get started, you first need to clone the code and install the dependencies.
$ git clone git@github.com:luizcieslak/cs2-match-prediction.git
$ cd cs2-match-prediction
$ pnpm install

Then create a `.env` file in the root of the project from the `.env.example` file and tweak it to your needs.
There's support for models besides OpenAI's, as long as they are compatible with the OpenAI API.
You can run a match analysis with the following:
pnpm run start --home Vitality --away Legacy --bestof 1
To see all the available config options from the CLI, run pnpm run start --help.
Or, if you want to run several match predictions in a batch, add the matches you want to predict to src/repos/matches/repo.ts in the following format:

`new Match(team1, team2, bestOf)`

Then you can run the agent with:
$ pnpm run start
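For example, the batch list might look roughly like the sketch below. Only the `new Match(team1, team2, bestOf)` format comes from above; the variable name and the exact file contents are illustrative.

```ts
// src/repos/matches/repo.ts (sketch -- only the Match constructor format is taken from above)
const MATCHES = [
	new Match('Vitality', 'Legacy', 1), // best-of-1
	new Match('FaZe', 'NAVI', 3), // best-of-3
];
```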
It may take several minutes to complete: the agent analyzes many articles and matches, and each call to the LLM may take upwards of 30 seconds to complete. So be patient.
To test its accuracy, I've used the Blast Austin 2025 Major Championship and predicted its outcome using several LLM models. You can check the results here.
Its current accuracy is 58.3% on the championship-stage advancements, and around 65% on the individual matches played during the major.
At a high level, the agent crawls a bunch of statistics and news articles on HLTV's website. For each match, it feeds the LLM the stats and news relevant to that match and has it predict a winner.
The data retrieval is done by scraping HLTV pages (built using Playwright).
The data analysis is done by the LLM.
The general division here is that anything that can be done more-or-less deterministically should be done in code, falling back to the LLM only for specific tasks that are fuzzy, non-deterministic, and don't lend themselves to code because of their ambiguous or difficult-to-code nature.
The scrapers know how to retrieve:
- The articles mentioning a given CS2 team. Example for the FaZe team.
- Overview stats for a CS2 team. Example for the NAVI team.
- Event history for a CS2 team. Example for the NAVI team.
- Previous matchups between two teams. Example for FaZe versus NAVI.
- Map pool stats for a CS2 team. Example for The MongolZ team.
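As a rough, hypothetical sketch of the kind of data the scrapers hand back (field names are illustrative; the real entities in src/ may differ):

```ts
// Illustrative shapes only -- the real entities in src/ may differ.

// Overview stats scraped from a team's HLTV page.
interface TeamOverview {
	kdaRatio: number;
	winRate: number; // 0..1 over the scraped period
}

// One entry of a team's map pool stats.
interface MapPoolEntry {
	map: string; // e.g. 'Mirage'
	winRate: number;
	timesPlayed: number;
}

// One scraped news article mentioning the team.
interface Article {
	title: string;
	url: string;
	body: string;
}
```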
The agents know how to:
- Analyze News: Take a news article and extract:
  - A summary of the article.
  - The key elements that could help the team win: member trades, stats, and results.

  Whenever we provide an article for the agent to summarize, we also tell it which team to look for in order to extract the elements above.

- Predict a Winner: Given the two teams playing against each other (in a championship context, if provided), the number of maps they will play, and each team's stats (KDA ratio, win rate, event history, matchup history, and map pool stats) along with any relevant news we can find for them, analyze all of this data to predict who will win and which maps will be played.
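As a hypothetical sketch of what those two agents hand back (field names are illustrative, not the repo's actual schema):

```ts
// Illustrative output shapes for the two agents; the real schemas in src/ may differ.

// What the news-analysis agent extracts from a single article, for one team.
interface NewsAnalysis {
	summary: string; // short summary of the article
	keyElements: string[]; // roster changes, stats, and results that could help the team win
}

// What the match-prediction agent returns.
interface MatchPrediction {
	winner: string; // predicted winning team
	maps: string[]; // maps expected to be played
}
```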
This is heavily based on a fork of Steve Krenzel's pick-ems LLM agent, which originally predicts winners for NFL games. The code was modified to handle CS2 matches, but most of the architecture is the same. The main changes are:

- Patchright instead of the default Playwright, which is required to scrape HLTV's website (🙏 thank you HLTV for letting us scrape your website; your content rocks, and for decades it's been the go-to place for everything Counter-Strike).
- Cached article analysis in JSON files (so my OpenAI billing doesn't skyrocket).
- Feeding each round's results back in as another stat for the subsequent rounds.

There are a few key architectural patterns we use in this repo: one for data retrieval, one for working with LLMs, and one for managing the flow of data between the two.
Web scraping can be a messy business, so we attempt to hide the browser from the
rest of our code as quickly as possible. The end goal is to basically access
the content from the various webpages the same way we would access data from
an API or a database. To that end, we use a Data Mapper Pattern
where each kind of data (e.g. Article, Match, Team, etc...) has a Repo
and an Entity. The Entity contains all of the fields of data that we want
that domain object to have. The Repo is how we retrieve and access the data.
The repos are structured in such a way that if you just use the objects
without peeking behind the scenes, you'd have no idea that you weren't
querying a database of some sort (though, an admittedly slow database).
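A minimal sketch of that pattern (the class names and page-parsing details are illustrative, assuming Patchright exposes the same chromium API as Playwright):

```ts
import { chromium } from 'patchright'; // drop-in replacement for Playwright (see above)

// Entity: the fields we want the rest of the code to see.
class Team {
	constructor(
		public readonly name: string,
		public readonly winRate: number,
	) {}
}

// Repo: owns the browser and the parsing; callers just get entities back.
class TeamRepo {
	async getFromPage(url: string): Promise<Team> {
		const browser = await chromium.launch();
		try {
			const page = await browser.newPage();
			await page.goto(url);
			// ...locate and parse the team's stats on the page here...
			const name = await page.title();
			const winRate = 0.5; // placeholder; real parsing omitted
			return new Team(name, winRate);
		} finally {
			await browser.close();
		}
	}
}
```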
Bridging the gap between nice, well-defined data and fuzzy natural language can be a bit tricky. To help address this, we rely on OpenAI's ability to call tools/functions. We don't actually care about the tool itself: we pass OpenAI exactly one tool, force it to use that tool, and the only bit of the tool we care about is its parameters. This is the data we are seeking from the LLM. The tool definition is just a means for us to provide OpenAI a nice JSON Schema that shapes the data the LLM returns to us.
Note: The LLM may return something that doesn't match our schema, but with GPT-4 this is exceedingly uncommon (at least for our use case).
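A sketch of that trick using the OpenAI Node SDK (the tool name, model, and schema below are placeholders, not the repo's actual ones):

```ts
import OpenAI from 'openai';

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function main() {
	// Exactly one tool, and we force the model to call it;
	// the only thing we care about is the arguments it fills in.
	const completion = await openai.chat.completions.create({
		model: 'gpt-4o', // placeholder model name
		messages: [
			{ role: 'user', content: 'Who wins: Team A or Team B? ...stats and news here...' },
		],
		tools: [
			{
				type: 'function',
				function: {
					name: 'report_prediction', // hypothetical tool name
					description: 'Report the predicted winner of the match.',
					parameters: {
						type: 'object',
						properties: { winner: { type: 'string' } },
						required: ['winner'],
					},
				},
			},
		],
		tool_choice: { type: 'function', function: { name: 'report_prediction' } },
	});

	// The "result" is just the JSON arguments the model produced for our single tool.
	const call = completion.choices[0].message.tool_calls?.[0];
	const args = JSON.parse(call?.function.arguments ?? '{}');
	console.log(args.winner);
}

main();
```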
Each agent is broken up into 3 parts: Prompt, Schema, Tool.
- The `Prompt` instructs the LLM on what its task is and how it should achieve it.
- The `Schema` instructs the LLM on what data we expect it to return to us.
- The `Tool` retrieves any data the LLM might need for its task, calls the LLM with the prompt, data, and schema, and wraps it all in a nice function that abstracts away the details. A user can call a tool just like any other function and be oblivious to the fact that the `string` or `boolean` they got back required the processing power of 1,000 remote GPUs.
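In code, an agent might be organized roughly like this (a sketch with made-up names and stubbed-out repos; the repo's actual modules will differ):

```ts
// Sketch of the Prompt / Schema / Tool split -- every name here is hypothetical.

// Stand-ins for the real data-retrieval repos and LLM call described elsewhere in this README.
const teamRepo = { stats: async (name: string) => ({ name, winRate: 0.5 }) };
const newsRepo = { forTeam: async (name: string) => [`recent article about ${name}`] };
async function callLLM(prompt: string, data: unknown, schema: object) {
	// In the real agent this would be the forced single-tool OpenAI call shown above.
	return { winner: 'TBD', maps: [] as string[] };
}

// Prompt: what the task is and how to approach it.
const PROMPT = `You are predicting the winner of a professional CS2 match.
Weigh the stats and news provided, then report your prediction.`;

// Schema: the shape of the data we expect back (JSON Schema).
const SCHEMA = {
	type: 'object',
	properties: {
		winner: { type: 'string' },
		maps: { type: 'array', items: { type: 'string' } },
	},
	required: ['winner', 'maps'],
};

// Tool: gathers data, calls the LLM, and hides all of it behind a plain async function.
async function predictWinner(home: string, away: string) {
	const data = {
		home: await teamRepo.stats(home),
		away: await teamRepo.stats(away),
		news: [...(await newsRepo.forTeam(home)), ...(await newsRepo.forTeam(away))],
	};
	return callLLM(PROMPT, data, SCHEMA);
}
```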
One other pattern we use, which may be non-obvious, is that we wrap the tool's schema in a parent schema that has a field called `analysis` and a field called `conclusion`. The `conclusion` field isn't particularly notable and simply maps to the tool's schema. The `analysis` field is the notable one here. Importantly, it is generated first by the LLM and is a way for the LLM to "think" out loud and reference that thinking in the subsequent generation of the data in the `conclusion`.
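A sketch of that wrapper (the helper name is made up; the real repo may build it differently):

```ts
// Hypothetical helper: wrap any tool schema so the LLM writes its reasoning first.
function withAnalysis(conclusionSchema: object) {
	return {
		type: 'object',
		properties: {
			// Generated first: a free-form scratchpad for chain-of-thought style reasoning.
			analysis: {
				type: 'string',
				description: 'Think out loud about the data before answering.',
			},
			// Generated second: the actual answer, in the shape the tool asked for.
			conclusion: conclusionSchema,
		},
		required: ['analysis', 'conclusion'],
	};
}
```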
A lot of LLM techniques, like Chain-of-Thought (CoT), require that the LLM generate a sequence of tokens that it can then reference for its final answer. If your Prompt to the Tool asks the LLM to use
any strategy like this, and you don't give it a scratchpad of sorts to write
in, then it won't be able to use that pattern. So this analysis field is a
scratchpad that we offer to the LLM to write whatever it needs to before
answering our prompt.
tl;dr: If you ask the LLM to return true or false, and you instruct it to think carefully about the choice beforehand, but you don't give it the space to do that thinking, then it won't be able to and will just return a boolean without having thought carefully about it.
This pattern meaningfully improves the quality of the returned payloads.
Finally, we've got data retrieval and we've got agents, but how do we think about the interplay between the two? We draw inspiration from the Model-View-Controller (MVC) pattern.
In this case, our:
- `Models` map to `Repos`
- `Views` map to `Prompts`
- `Controllers` map to `Tools`
The Tool is the thing that coordinates retrieving data from the Repos and
then rendering that data into a Prompt prior to sending it to our "client", the
LLM.
The analogy is not perfect, and starts to stretch under scrutiny, but as a rough guide on how to think about the division of labor between the components, I find it useful.