Popular repositories
- scribegoat2 Public
Open-source medical LLM safety evaluation pipeline with reproducible benchmarks and high-risk clinical failure analysis.
- lostbench Public
Standalone benchmark for multi-turn safety persistence in medical LLM conversations. Measures recommendation monotonicity under sustained patient pressure.
Language: Python
- openem-corpus Public
The AI-native emergency medicine knowledge base. Agent-compiled, physician-verified, grep-friendly.
Language: Python
- safeshift Public
Does making the model faster make it less safe? Safety degradation benchmarking under inference optimization.
Language: Python
- radslice Public
Multimodal radiology LLM benchmark across CT, MRI, X-ray, and ultrasound.
Language: Python
Repositories
- GOATnote-Inc/openem-corpus Public
The AI-native emergency medicine knowledge base. Agent-compiled, physician-verified, grep-friendly.
- GOATnote-Inc/healthcraft Public
HEALTHCRAFT RL Training Environment: adapts the CORECRAFT architecture to emergency medicine.
- GOATnote-Inc/lostbench Public
Standalone benchmark for multi-turn safety persistence in medical LLM conversations. Measures recommendation monotonicity under sustained patient pressure.
- GOATnote-Inc/scribegoat2 Public
Open-source medical LLM safety evaluation pipeline with reproducible benchmarks and high-risk clinical failure analysis.
- GOATnote-Inc/safeshift Public
Does making the model faster make it less safe? Safety degradation benchmarking under inference optimization.