elaborate

biocypher · Feb 17, 2024 · 56009ee
1 parent ab558b5
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/content/30.discussion.md b/content/30.discussion.md
@@ -19,7 +19,7 @@ As such, a framework is a necessary step towards the objective and reproducible
 We prevent data leakage from the benchmark datasets into the training data of new models by encryption, which is essential for the sustainability of the benchmark as new models are released.
 The living benchmark will be updated with new questions and tasks as they arise in the community.
 
-The benchmark's results provide a starting point for understanding why some models perform differently than expected.
+The benchmark's results provide convenient selection criteria and a starting point for understanding why some models perform differently than expected.
 For instance, the benchmark allowed immediate flagging of the drop in performance from the older (0613) to the newer (0125) version of gpt-4.
 It also identified a range of pre-trained open-source models suitable for our uses, most notably, the openhermes-2.5 model in 4- or 5-bit quantisation.
 This model is a fine-tuned (on GPT-4-generated data) variant of Mistral 7B v0.1, whose vanilla variants perform considerably worse in our benchmarks.