Closed
Description
Thinking about what could be done when large language models can operate on phenomenally large context, and wondering what it might actually take to get there.
And realised this repo has a ton of really bright people in orbit, who actually understand brass tacks what might be involved.
Assuming it's really desirable, what hacks could be done to get there?
- What options are there?
- How much space or time would be involved?
- Would the models need full retraining?
- What about for 10x?
🙏