Commit

simplify sentence
slobentanzer committed Feb 17, 2024
1 parent d09aa6d commit ab558b5
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions content/30.discussion.md
@@ -7,12 +7,12 @@ Inspired by the productivity of open source libraries such as LangChain [@langch
To keep the framework effective and sustainable, we reuse existing open-source libraries and tools while adapting the advancements from the wider LLM community to the biomedical domain.
The transparency we emphasise at every step of the framework is essential to a sustainable application of LLMs in biomedical research and beyond [@doi:10.1038/d41586-024-00029-4].

-To facilitate efficient human-AI interaction, a "lingua franca" is required; symbolic representations of concepts are required at least at the surface level of the conversation [@doi:10.1609/aaai.v36i11.21488].
+Efficient human-AI interaction may require a "lingua franca": symbolic representations of concepts at least at the surface level of the conversation [@doi:10.1609/aaai.v36i11.21488].
We enable interaction with LLMs on a symbolic level by providing ontological grounding via the synergy of BioChatter with BioCypher KGs.
The configuration of BioCypher KGs allows the user to specify the contextual domain by mapping KG concepts to existing ontologies and custom terminology.
This way, we guarantee an overlap in the contextual understanding of user and LLM despite the generic nature of most pre-trained models.

-We take particular care to guarantee robustness and objective evaluation of LLM behaviour and their performance in interaction with other parts of the framework.
+We emphasise robustness and objective evaluation of LLM behaviour and performance in interaction with other parts of the framework.
We achieve this goal by implementing a living benchmarking framework that allows the automated evaluation of LLMs, prompts, and other components ([https://biochatter.org/benchmark/](https://biochatter.org/benchmark/)).
Even the most recent biomedicine-specific benchmarking efforts are small-scale manual approaches that do not consider the full matrix of possible combinations of components, and many benchmarks are performed by accessing web interfaces of LLMs, which obfuscates important parameters such as model version and temperature [@biollmbench].
As such, a framework is a necessary step towards the objective and reproducible evaluation of LLMs.
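
The paragraph above on ontological grounding states that the configuration of BioCypher KGs maps KG concepts to existing ontologies and custom terminology, giving user and LLM a shared symbolic vocabulary. A minimal conceptual sketch of such a mapping follows; the label-to-class table and the `ground` helper are hypothetical illustrations of the idea, not BioCypher's actual schema configuration format or API.

```python
# Conceptual sketch only: mapping KG node labels to ontology classes so that
# user and LLM share a symbolic vocabulary for the contextual domain.
# The mapping below is hypothetical, not a real BioCypher schema configuration.
SCHEMA = {
    "protein": {"ontology_class": "biolink:Protein", "source": "UniProt"},
    "compound": {"ontology_class": "biolink:SmallMolecule", "source": "ChEMBL"},
    "side effect": {"ontology_class": "biolink:PhenotypicFeature", "source": "SIDER"},
}


def ground(term: str) -> str:
    """Resolve a user-supplied term to its mapped ontology class, if any."""
    entry = SCHEMA.get(term.lower())
    return entry["ontology_class"] if entry else "unmapped"


if __name__ == "__main__":
    for term in ("Protein", "compound", "gene"):
        print(f"{term} -> {ground(term)}")
```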
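The benchmarking paragraph above argues for evaluating the full matrix of component combinations while recording parameters such as model version and temperature. A minimal sketch of how such a matrix could be enumerated is given below; the component lists and the `run_case` stub are hypothetical placeholders, not the project's actual benchmark harness.

```python
# Hypothetical sketch of a combinatorial benchmark matrix; component names
# and the scoring stub are placeholders, not BioChatter's benchmark code.
from itertools import product

models = ["gpt-3.5-turbo-0613", "gpt-4-0613", "llama-2-13b-chat"]  # example model versions
prompts = ["naive", "kg_schema_aware"]                             # example prompt variants
temperatures = [0.0, 0.7]


def run_case(model: str, prompt: str, temperature: float) -> float:
    """Placeholder scoring function; a real benchmark would query the model."""
    return 0.0


if __name__ == "__main__":
    # Enumerate every combination and record the parameters alongside the score,
    # so that model version and temperature are not obfuscated.
    for model, prompt, temperature in product(models, prompts, temperatures):
        score = run_case(model, prompt, temperature)
        print(f"{model}\t{prompt}\ttemperature={temperature}\tscore={score}")
```

Keeping the exact model version and temperature next to each score is what makes the evaluation reproducible, in contrast to benchmarks run through web interfaces where these parameters are hidden.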
