Release/2.6.0 #1184

Merged
merged 75 commits into from
Mar 10, 2025

Conversation

chakravarthik27
Collaborator

@chakravarthik27 chakravarthik27 commented Mar 9, 2025

📢 Highlights

We are excited to introduce the latest LangTest release, which brings a set of improvements designed to streamline model evaluation and improve overall performance:

  • 🛠 De-biasing Data Augmentation:
    We’ve integrated de-biasing techniques into our data augmentation process, ensuring more equitable and representative model assessments.

  • 🔄 Evaluation with Structured Outputs:
    LangTest now supports structured output APIs for both OpenAI and Ollama, offering greater flexibility and precision when processing model responses.

  • 🏥 Confidence Testing with Med Halt Tests:
    Introducing med halt tests for confidence evaluation, enabling more robust insights into your LLMs’ reliability under diverse conditions.

  • 📖 Expanded Task Support for JSL LLM Models:
    QA and Summarization tasks are now fully supported for JSL LLM models, enhancing their capabilities for real-world applications.

  • 🔒 Security Enhancements:
    Critical vulnerabilities and security issues have been addressed, reinforcing LangTest's overall stability and safety.

  • 🐛 Resolved Bugs:
    We’ve fixed issues with templatic augmentation to ensure consistent, accurate, and reliable outputs across your workflows.
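
To make the structured-output highlight more concrete, here is a minimal, library-independent sketch of what evaluating a structured response can look like: the model is asked to return JSON matching a fixed schema, and the evaluator validates that shape before comparing the answer. This is not LangTest's API. The names `Answer`, `parse_structured`, and `is_pass` are hypothetical, and the raw strings stand in for responses that would come from an OpenAI or Ollama structured-output call.

```python
import json
from dataclasses import dataclass


@dataclass
class Answer:
    """Hypothetical schema the model is asked to follow."""
    answer: str
    confidence: float


def parse_structured(raw: str) -> Answer:
    """Parse a raw JSON response and validate it against the Answer schema."""
    payload = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(payload, dict):
        raise ValueError("response must be a JSON object")
    missing = {"answer", "confidence"} - payload.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    try:
        confidence = float(payload["confidence"])
    except (TypeError, ValueError) as exc:
        raise ValueError("confidence must be a number") from exc
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return Answer(answer=str(payload["answer"]), confidence=confidence)


def is_pass(raw: str, expected: str) -> bool:
    """Treat malformed output as a failure; otherwise compare answers case-insensitively."""
    try:
        parsed = parse_structured(raw)
    except ValueError:
        return False
    return parsed.answer.strip().lower() == expected.strip().lower()


if __name__ == "__main__":
    # Simulated model responses, not real API output.
    print(is_pass('{"answer": "Yes", "confidence": 0.9}', "yes"))
    print(is_pass("Yes, I think so.", "yes"))
```

The design point is that a free-text reply that merely contains the right answer still fails, because the evaluator can rely on the schema instead of fragile string matching.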

…curity-issues

chore: update certifi, idna, zipp versions and add extras in poetry.lock
…mentation-due-to-openai

fix(bug): update model handling in OpenAI and AzureOpenAI configurations
…-deepseek

Feature/add integration to deepseek
…-tests-for-robust-model-evaluation

Feature/implement med halt tests for robust model evaluation
…tion-supports-the-ollama-provider

feat: add support for generating templates using Ollama provider
…ting-bug-fixes-in-260-rc-version

fixes: resolving the bugs 2_6_0rc versions
…ting-bug-fixes-in-260-rc-version

fix: better handling of extra model params in Harness
@chakravarthik27 chakravarthik27 self-assigned this Mar 9, 2025
@chakravarthik27 chakravarthik27 merged commit cbfdc33 into main Mar 10, 2025
3 checks passed

3 participants