This is a research initiative by Intuit AI Research focused on fundamental research in LLM response quality assessment, LLM trustworthiness, and robustness. Our work in this space includes:
- Hallucination detection and mitigation: SAC3 (Semantic-aware Cross-check Consistency) and DCR-Consistency (Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models)
- Uncertainty quantification: SPUQ (Perturbation-Based Uncertainty Quantification for Large Language Models)
- [2024.10] DCR paper accepted at EMNLP 2024
- [2024.04] Intuit presents innovative approach to Quantify LLM Uncertainty at EACL 2024
- [2024.04] SPUQ arXiv link available.
- [2024.02] SAC3 work presented at AI for Production, organized by the MLOps community!
- [2024.01] DCR-Consistency arXiv link available.
- [2023.12] Intuit AI Research Debuts Novel Approach to Reliable Hallucination Detection in Black Box Language Models at EMNLP 2023
- [2023.11] SAC3 arXiv link available.
- [2023.10] SAC3 paper accepted at EMNLP 2023
SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency
Semantic-aware cross-check consistency (SAC3) is a novel sampling-based hallucination detection method that expands on the principle of self-consistency checking. It detects both question-level and model-level hallucinations by adding two mechanisms: semantically equivalent question perturbation and cross-model response consistency checking.
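To make the idea concrete, here is a minimal sketch (not the repository's actual API) of a SAC3-style check: the original question and semantically equivalent paraphrases are sent to the target model and to a second verifier model, and the disagreement rate across the collected answers is used as the hallucination signal. The `ModelFn` interface, the exact-match comparison, and the function name `sac3_style_check` are illustrative assumptions.

```python
from typing import Callable, List

# Hypothetical model interface: takes a question, returns an answer string.
# In practice this would wrap a call to an actual LLM API.
ModelFn = Callable[[str], str]

def sac3_style_check(
    question: str,
    paraphrases: List[str],   # semantically equivalent rephrasings of `question`
    target_model: ModelFn,    # the model being evaluated
    verifier_model: ModelFn,  # a second model for cross-model checking
) -> float:
    """Return an inconsistency score in [0, 1]; higher suggests hallucination."""
    queries = [question] + paraphrases
    # Collect answers from both models for the original question and its
    # semantically equivalent perturbations.
    answers = [target_model(q) for q in queries] + [verifier_model(q) for q in queries]

    reference = answers[0]  # target model's answer to the original question
    # Fraction of remaining answers that disagree with the reference answer.
    # (A real implementation would judge semantic equivalence, not exact match.)
    disagreements = sum(
        1 for a in answers[1:] if a.strip().lower() != reference.strip().lower()
    )
    return disagreements / max(len(answers) - 1, 1)
```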
DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models
DCR-Consistency is a novel framework that uses LLM agents to detect and mitigate inconsistencies, i.e., hallucinations. It takes advantage of an LLM's strength in semantic understanding while circumventing known pitfalls such as relatively poor performance in math. For more details, please see our paper.
Given a `reference` as the ground truth and a `candidate` to evaluate, it outputs a numeric score in [0, 1] indicating consistency, where 0 means no sentence in the `candidate` is consistent with the `reference` and 1 means full consistency. It also outputs a list of `reasons` explaining why this score was given. Better yet, based on these `reasons`, it can improve the `candidate` and mitigate the detected inconsistencies.
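As a rough, non-official illustration of the divide-conquer idea, the sketch below splits the candidate into sentences, asks a placeholder LLM judge whether each sentence is consistent with the reference, and returns the fraction of consistent sentences as the score along with per-sentence reasons. The `JudgeFn` callable and its assumed "yes/no: reason" output format are inventions for this example, not the framework's actual interface.

```python
from typing import Callable, List, Tuple

# Hypothetical judge: given the reference and one candidate sentence, returns a
# string such as "yes: supported by the reference" or "no: contradicts ...".
JudgeFn = Callable[[str, str], str]

def dcr_style_score(reference: str, candidate: str, judge: JudgeFn) -> Tuple[float, List[str]]:
    """Return (consistency score in [0, 1], list of per-sentence reasons)."""
    # Divide: split the candidate into sentences (naive split for illustration).
    sentences = [s.strip() for s in candidate.split(".") if s.strip()]
    if not sentences:
        return 1.0, []

    reasons: List[str] = []
    consistent = 0
    # Conquer: check each sentence against the reference independently.
    for sentence in sentences:
        verdict = judge(reference, sentence)
        reasons.append(f"{sentence} -> {verdict}")
        if verdict.lower().startswith("yes"):
            consistent += 1

    # Reason/aggregate: the score is the fraction of consistent sentences.
    return consistent / len(sentences), reasons
```

In a mitigation step, the collected reasons could be fed back to the model to rewrite the sentences flagged as inconsistent.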
SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models
SPUQ is an LLM uncertainty calibration algorithm. For a given LLM, it provides a confidence score for each query. Experiments show that this confidence score is correlated with generation accuracy, and therefore provides a useful on-the-fly metric for evaluating LLM responses.
The details of the approach are documented in our paper published at the EACL 2024 conference.
The basic idea is to check whether an LLM gives a significantly different answer when we ask the same question in a slightly different way. If it does, we assume the LLM is not confident about its answer. SPUQ perturbs the input (including the prompt and the temperature) to get multiple outputs, and then aggregates those outputs to obtain the final confidence score. This allows SPUQ to address both epistemic (via perturbation) and aleatoric (via sampling) uncertainty, and it provides better calibration than some existing methods.
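The following is a minimal, illustrative sketch (not the published SPUQ implementation) of the perturb-and-aggregate idea: the prompt wording and the temperature are perturbed, several outputs are sampled, and pairwise agreement among the outputs is aggregated into a confidence score. The `GenerateFn` interface, the paraphrased prompt list, and the token-overlap similarity are assumptions made only for this example.

```python
from itertools import combinations
from typing import Callable, List, Sequence

# Hypothetical generation interface: (prompt, temperature) -> answer string.
GenerateFn = Callable[[str, float], str]

def token_overlap(a: str, b: str) -> float:
    """Crude similarity: Jaccard overlap of word sets (stand-in for a real matcher)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def spuq_style_confidence(
    prompt: str,
    paraphrased_prompts: List[str],                   # perturbed versions of the prompt
    generate: GenerateFn,
    temperatures: Sequence[float] = (0.3, 0.7, 1.0),  # temperature perturbations
) -> float:
    """Return a confidence score in [0, 1]; higher means more self-consistent."""
    # Perturb the input: vary both the prompt wording and the sampling temperature,
    # then collect one output per (prompt, temperature) pair.
    outputs = [
        generate(p, t)
        for p in [prompt] + list(paraphrased_prompts)
        for t in temperatures
    ]
    if len(outputs) < 2:
        return 1.0
    # Aggregate: average pairwise agreement across all sampled outputs.
    pairs = list(combinations(outputs, 2))
    return sum(token_overlap(a, b) for a, b in pairs) / len(pairs)
```

Averaging pairwise agreement is only one simple aggregation choice; any similarity measure between outputs could be substituted.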