What would it take to 100x the context window?

I've been thinking about what could be done once large language models can operate on phenomenally large contexts, and wondering what it might actually take to get there.

I also realised this repo has a ton of really bright people in orbit who actually understand, at the brass-tacks level, what might be involved.

Assuming it's really desirable, what hacks could be done to get there?

- What options are there?
- How much space or time would be involved?
- Would the models need full retraining?
- What about for 10x?
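
To frame the space/time question, here is a rough back-of-envelope sketch, assuming plain dense self-attention with a standard key/value cache and no tricks at all. The model shape (32 layers, 32 KV heads, head dimension 128, fp16) and the 4096-token baseline are made-up illustrative numbers, not taken from any model in this repo, and the function names are hypothetical.

```python
# Back-of-envelope scaling for dense attention. All defaults are illustrative assumptions.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128, bytes_per_elem=2):
    """Memory for cached keys and values: 2 tensors (K and V) per layer, per token."""
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem


def attention_flops(seq_len, n_layers=32, d_model=4096):
    """Approximate FLOPs for the QK^T and attention-times-V matmuls over a full sequence.

    Each of the two matmuls costs about 2 * seq_len^2 * d_model per layer.
    """
    return 4 * n_layers * seq_len ** 2 * d_model


if __name__ == "__main__":
    base = 4096  # assumed baseline context length
    for factor in (1, 10, 100):
        n = base * factor
        gib = kv_cache_bytes(n) / 1024 ** 3
        rel_flops = attention_flops(n) / attention_flops(base)
        print(f"{factor:>3}x ({n} tokens): KV cache ~{gib:.0f} GiB, "
              f"attention FLOPs x{rel_flops:.0f}")
```

Under these assumptions the cache grows linearly with context length while the attention matmuls grow quadratically, so 10x and 100x are quite different asks: the memory side scales by the same factor, the compute side by its square.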

🙏

What would it take to 100x the context window? #799
