This repo documents Propel's work building a SNAP Eval: an evaluation for benchmarking and assessing the capabilities of large language models (LLMs) on questions about the Supplemental Nutrition Assistance Program (SNAP), also known as food stamps.
You can find our release of 25 illustrative SNAP eval cases in this Google Sheet.
You can run the eval test cases from that spreadsheet with the Promptfoo tool, using the promptfooconfig.yaml file in the included folder.
Note that you will need to add API keys for the 3 providers referenced in that config file (OpenAI, Anthropic, and Google).
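As a rough illustration of the run step above, here is a minimal Python sketch that checks for the three API keys and then invokes Promptfoo. It rests on several assumptions that are not confirmed by this repo: that the keys are supplied as the environment variables `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `GOOGLE_API_KEY` (Promptfoo's usual defaults) rather than written into the config file, that Node.js and `npx` are installed, and that you run it from the folder containing promptfooconfig.yaml. The helper names `missing_keys` and `build_command` are hypothetical.

```python
import os
import subprocess
import sys

# Assumed variable names (Promptfoo's usual defaults for these providers).
# The repo's promptfooconfig.yaml may expect the keys elsewhere, e.g. inline.
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env):
    """Return the required key names that are absent or empty in env."""
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def build_command(config_path="promptfooconfig.yaml"):
    """Build the Promptfoo CLI invocation for the given config file."""
    return ["npx", "promptfoo@latest", "eval", "-c", config_path]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        # Stop early with a readable message instead of failing mid-eval.
        sys.exit("Missing API keys: " + ", ".join(missing))
    # Run from the folder that contains promptfooconfig.yaml.
    sys.exit(subprocess.run(build_command()).returncode)
```

Checking the keys up front is only a convenience: without it, a missing key would typically surface as provider errors partway through the eval run.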
Have feedback, questions, or a contribution? Open an issue in this repo or email Dave Guarino at dave.guarino@joinpropel.com