[MODULE] - GPT-based cross-encoder #357

@jhoetter

Description

Please describe the module you would like to add to bricks
Information retrieval is a 2-step process:

  • basic similarity search with bi-encoders retrieves the top-100 candidates out of millions of documents extremely fast, but 100 candidates are still too many
  • to narrow these down to the top-5 candidates, you can apply binary classification ("Is this fact relevant for the query?") and then rank by confidence.

I would like to enable users to do this via GPT-3.5-Turbo. We found that the following prompt works well (a minimal usage sketch follows the prompt):

Take a breath. You are assessing the relevance of question-fact pairs.
If a fact is directly related to the topic of the question (e.g. directly or even by implying consequences), it is "Relevant".
If there is no connection, it is "Irrelevant". In case of doubt, the fact is "Irrelevant".

        Fact: {fact}
        Question: {question}

Determine the relevance. Give a score from 0 to 100 for this (100 would be a straight answer to the question). 
Answer ONLY with the score itself (i.e. a number between 0 and 100).
If you answer with more than one number between 0 and 100, I will not process your output!
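
For reference, a minimal sketch of how this prompt could be wired up for re-ranking. It assumes the openai>=1.0 Python client and an OPENAI_API_KEY environment variable; the helper names score_relevance and rerank are illustrative only and not part of bricks.

```python
# Hypothetical sketch: score fact-question pairs with GPT-3.5-Turbo and
# re-rank the bi-encoder candidates. Assumes openai>=1.0 and OPENAI_API_KEY.
import re
from openai import OpenAI

client = OpenAI()

PROMPT_TEMPLATE = """Take a breath. You are assessing the relevance of question-fact pairs.
If a fact is directly related to the topic of the question (e.g. directly or even by implying consequences), it is "Relevant".
If there is no connection, it is "Irrelevant". In case of doubt, the fact is "Irrelevant".

        Fact: {fact}
        Question: {question}

Determine the relevance. Give a score from 0 to 100 for this (100 would be a straight answer to the question).
Answer ONLY with the score itself (i.e. a number between 0 and 100).
If you answer with more than one number between 0 and 100, I will not process your output!"""


def score_relevance(fact: str, question: str) -> int:
    """Return a 0-100 relevance score for a single fact-question pair."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "user", "content": PROMPT_TEMPLATE.format(fact=fact, question=question)}
        ],
        temperature=0,
    )
    answer = response.choices[0].message.content
    # Be defensive: extract the first number even if the model adds extra text.
    match = re.search(r"\d{1,3}", answer or "")
    return min(int(match.group()), 100) if match else 0


def rerank(question: str, candidates: list[str], top_k: int = 5) -> list[str]:
    """Re-rank the top-100 bi-encoder candidates down to the top-k facts."""
    scored = [(score_relevance(fact, question), fact) for fact in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fact for _, fact in scored[:top_k]]
```

The same call works against Azure OpenAI by pointing the client at an Azure endpoint instead of api.openai.com.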

Do you already have an implementation?
See the prompt and sketch above; it applies to both OpenAI and Azure OpenAI endpoints.

Additional context
This requires an API key; for cognition this is highly relevant, but it should not be part of the wizard setup if the user doesn't provide an API key.
