Pinned
- openai/evals: Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.
- openai/mle-bench: MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering.
- summarizing-from-human-feedback: Implementation of OpenAI's "Learning to Summarize from Human Feedback".
- bitblaster-16 (Python): BitBlaster-16 is a 16-bit computer built from scratch using only NAND gates and data flip-flops as primitives! :)
- fermi-poker (Python): Want to get better at making estimates under uncertainty? No? Well, now you can!
- self-taught-critiquer (Python): Reducing the time to create critique-writing models by 100-1000x on n-digit arithmetic problems by getting the model to learn from its own generated outputs.
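The primitive-composition idea behind BitBlaster-16, building every gate from NAND alone, can be illustrated with a minimal sketch. This is not code from the repository; the function names and the 1-bit integer encoding are assumptions for illustration only.

```python
# Hypothetical sketch: deriving standard logic gates from NAND,
# the single combinational primitive BitBlaster-16 starts from.
# Bits are modeled as the ints 0 and 1.

def nand(a: int, b: int) -> int:
    """The only primitive gate: true unless both inputs are true."""
    return 1 - (a & b)

def not_(a: int) -> int:
    # NAND with both inputs tied together inverts.
    return nand(a, a)

def and_(a: int, b: int) -> int:
    # AND is just an inverted NAND.
    return not_(nand(a, b))

def or_(a: int, b: int) -> int:
    # De Morgan: a OR b == NOT(NOT a AND NOT b) == NAND(NOT a, NOT b).
    return nand(not_(a), not_(b))

def xor(a: int, b: int) -> int:
    # Classic four-NAND XOR construction.
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

def half_adder(a: int, b: int) -> tuple[int, int]:
    """(sum, carry) for one bit; chaining these yields multi-bit adders."""
    return xor(a, b), and_(a, b)
```

From here the usual ladder follows: half adders compose into full adders, full adders chain into a 16-bit ALU, and the data flip-flop primitive supplies the state for registers and memory.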