Add chat categorization dataset #98

hinthornw · 2023-12-01T00:35:46Z

This task is meant to test a couple of things:

Classification -> on common categories where the model is expected to perform well (e.g., sentiment and toxicity, the latter of which is currently always 0)
Structured JSON output -> the schema is nested, which confused some of the smaller 7B models I tested but works fine for llama 32b code instruct (and OAI/Anthropic)

It also includes a couple of common constructs, such as enums.
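To make the "nested schema with enums" point concrete, here is a minimal sketch of what such a target structure and a strict parser for model output could look like. It uses only the standard library and is not the contents of schema.py: the class names, the `topic` and `quality` fields, the sentiment labels, and the 0-5 toxicity range are all assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Sentiment(str, Enum):
    # Hypothetical label set; the PR only states that sentiment is classified.
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
    POSITIVE = "positive"


@dataclass
class ResponseQuality:
    # Nested object: one level below the top-level record.
    sentiment: Sentiment
    toxicity: int  # assumed integer scale from 0 to 5


@dataclass
class ChatClassification:
    topic: str  # hypothetical free-text field
    quality: ResponseQuality


def parse_classification(raw: str) -> ChatClassification:
    """Parse a model's JSON output into the nested structure.

    Raises ValueError for malformed JSON, unknown enum values, or an
    out-of-range toxicity score, and KeyError for missing fields.
    """
    data = json.loads(raw)
    quality = data["quality"]
    toxicity = quality["toxicity"]
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(toxicity, bool) or not isinstance(toxicity, int):
        raise ValueError("toxicity must be an integer")
    if not 0 <= toxicity <= 5:
        raise ValueError("toxicity must be between 0 and 5")
    return ChatClassification(
        topic=str(data["topic"]),
        quality=ResponseQuality(
            sentiment=Sentiment(quality["sentiment"]),  # ValueError if unknown
            toxicity=toxicity,
        ),
    )
```

A model that flattens the nesting (for example, emitting `sentiment` at the top level) fails with a KeyError here, which is roughly the kind of mistake a nested schema is meant to surface.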

TODO: Clean up notebooks and add more analysis

langchain_benchmarks/extraction/tasks/chat_extraction/schema.py

langchain_benchmarks/extraction/tasks/chat_extraction/__init__.py

hinthornw added 2 commits November 30, 2023 16:29

Update dataset

5df9cb5

Updating outputs

c4d8593

eyurtsev approved these changes Dec 1, 2023

langchain_benchmarks/extraction/tasks/chat_extraction/schema.py (outdated, resolved)

eyurtsev reviewed Dec 1, 2023

langchain_benchmarks/extraction/tasks/chat_extraction/__init__.py (outdated, resolved)

hinthornw added 8 commits November 30, 2023 21:02

Update

bfd8476

Add doc

b3651a1

Casing

dc3959c

Update linnk

b4602de

fmt

c1cfb19

fmt

eae504a

Delete langchain_benchmarks/extraction/utils.py

a632d40

format

89e8b6c

hinthornw merged commit 5ffdbb5 into main Dec 1, 2023

hinthornw deleted the wfh/clc-extr branch December 1, 2023 18:17

3 participants
