Language models that create tools, and AI efficiency and democratization #5
Replies: 10 comments

---

**Customization**

- https://twitter.com/johnjnay/status/1637843926840164353 — supervised fine-tuning on your tasks; RL w/ your reward model (RM); prompting w/ context
- RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation (Microsoft / Wuhan), https://arxiv.org/abs/2303.12570
- OpenChatKit, https://www.together.xyz/blog/openchatkit, featuring customization recipes to fine-tune the model and an extensible retrieval system for live-updating answers
- ChatGPT plugins, https://openai.com/blog/chatgpt-plugins
- Copilot for Docs, https://githubnext.com/projects/copilot-for-docs; compare https://github.com/context-labs/autodoc

**Tool use**

- Toolformer implementations: https://github.com/conceptofmind/toolformer (official), https://github.com/lucidrains/toolformer-pytorch
- Tool Learning with Foundation Models (Tsinghua), https://arxiv.org/abs/2304.08354, https://github.com/OpenBMB/BMTools (Auto-GPT and BabyAGI support)
- Augmented Language Models: a Survey, LeCun et al., https://arxiv.org/abs/2302.07842
- TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs (Microsoft), https://arxiv.org/abs/2303.16434
- Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP (Stanford), https://github.com/stanfordnlp/dsp
- Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback (Microsoft / Columbia U), https://arxiv.org/abs/2302.12813
- ART: Automatic multi-step reasoning and tool-use for large language models (UWash, Microsoft, UCI, Allen, Meta), https://arxiv.org/abs/2303.09014
- The surprising ease and effectiveness of AI in a loop, Matt Webb, https://interconnected.org/home/2023/03/16/singularity
- A simple Python implementation of the ReAct pattern for LLMs, Simon Willison, https://til.simonwillison.net/llms/python-react-pattern

**Tool building (coding)**

- Planning with Large Language Models for Code Generation, Tenenbaum and MIT-IBM Watson, https://arxiv.org/abs/2303.05510
- ViperGPT: Visual Inference via Python Execution for Reasoning (Columbia U), https://viper.cs.columbia.edu/
- Reflexion: an autonomous agent with dynamic memory and self-reflection (MIT / Northeastern), https://arxiv.org/abs/2303.11366, blog post
- Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (Waterloo / UCSB / Google, Nov 2022), https://github.com/wenhuchen/Program-of-Thoughts
- Symbolic Knowledge Distillation: from General Language Models to Commonsense Models, Yejin Choi et al., https://arxiv.org/abs/2110.07178

**AI efficiency**

- In AI, is bigger always better?, Anil Ananthaswamy, https://www.nature.com/articles/d41586-023-00641-w
- CoLT5: Faster Long-Range Transformers with Conditional Computation (Google), https://arxiv.org/abs/2303.09752

**Algorithm optimization**

- EvoPrompting: Language Models for Code-Level Neural Architecture Search (NYU / Google Brain), https://arxiv.org/abs/2302.14838
- Symbolic Discovery of Optimization Algorithms, Quoc Le et al. (Google / UCLA), https://github.com/lucidrains/lion-pytorch, https://arxiv.org/abs/2302.06675

**Theorem proving**

- Baldur: Whole-Proof Generation and Repair with Large Language Models, First, Rabe, Ringer and Brun, https://arxiv.org/abs/2303.04910
- Magnushammer: A Transformer-based Approach to Premise Selection, Albert Jiang, Szegedy, Yuhuai Wu et al., https://arxiv.org/abs/2303.04488
- ProofNet: Autoformalizing and Formally Proving Undergraduate-Level Mathematics (new dataset), Zhangir Azerbayev, Ayers, Avigad et al., https://github.com/zhangir-azerbayev/ProofNet

**Future of mathematics**

- Some thoughts on automation and mathematical research, Akshay Venkatesh, Nov 2021, https://www.math.ias.edu/~akshay/research/IASEssay.pdf
- Terence Tao's integration of ChatGPT into his daily workflow, picked up by Chinese news outlets

**More**

- Learning to Compress Prompts with Gist Tokens, Jesse Mu et al., https://arxiv.org/abs/2304.08467
- GPT-4: The Bitterer Lesson, Alberto Romero, https://thealgorithmicbridge.substack.com/p/gpt-4-the-bitterer-lesson
- A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT (Vanderbilt), https://arxiv.org/abs/2302.11382, and ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design (Vanderbilt), https://arxiv.org/abs/2303.07839
- Impact of Code Language Models on Automated Program Repair (Alberta / Purdue), https://arxiv.org/abs/2302.05020
- Program synthesis in chip design, https://hub.baai.ac.cn/view/24406 (Chinese)

---

I did reply to your twitter thread btw: https://twitter.com/0xDist/status/1631883008918835201. Maybe I am also shadow-banned 😅 I coincidentally came across your twitter thread and this thread separately, but I fully agree with both.

---

Thanks for your reply @distbit0! I wrote the post on Feb 14, before you replied :) I like the neuroscience articles you linked to, but I'm not sure what inspiration to draw from them ... I had written some responses to your tweets, but too much is going on these days; when I get around to polishing them a bit, I'll post them here.

---

**LLM agents**

- https://github.com/jtmuller5/The-HustleGPT-Challenge (inspired Auto-GPT and BabyAGI)
- https://github.com/Torantulino/Auto-GPT (>100k stars, implements code execution/improvement)
- Foundation Models for Decision Making: Problems, Methods, and Opportunities (Google / Berkeley / MIT), https://arxiv.org/abs/2303.04129
- https://github.com/hwchase17/langchain (recently got a $10M investment): Agents
- https://github.com/eumemic/ai-legion
- Lean 4 agent
- Minecraft agents: Voyager (creates tools), GITM

**Reinforcement learning**

- Reward Design with Language Models (Stanford / DeepMind), uses the GPT-3 API; the proxy reward function is more akin to Constitutional AI than RLHF, https://arxiv.org/abs/2303.00001, news
- Vision-Language Models (Flamingo) as Success Detectors (DeepMind), https://arxiv.org/abs/2303.07280
- Reinforcement Learning from Passive Data via Latent Intentions (Berkeley), https://arxiv.org/abs/2304.04782
- Reinforcement Learning for Language Models, Yoav Goldberg, https://gist.github.com/yoavg/6bff0fecd65950898eba1bb321cfbd81

**Active learning**

- Internet Explorer: Targeted Representation Learning on the Open Web, Deepak Pathak et al. (CMU / Berkeley), https://internet-explorer-ssl.github.io/
- Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need, LeCun et al., https://arxiv.org/abs/2303.15256

**Iterated improvement**

- Examples of AI Improving AI, Thomas Woodside, https://ai-improving-ai.safe.ai/

**AI for science**

- AI for Science: An Emerging Agenda (Cambridge / Madison / Tübingen), https://arxiv.org/abs/2303.04217
- Emergent autonomous scientific research capabilities of large language models, https://arxiv.org/abs/2304.05332, tweet (John Nay)
- ChemCrow: Augmenting large-language models with chemistry tools, https://arxiv.org/abs/2304.05376, tweet (Jim Fan)

---

Also some more: https://github.com/stars/distbit0/lists/agentic-ai

---

Also add Jarvis to the list: |

---

Thanks! I classified HuggingGPT under the "tool use" section; it operates in four stages and doesn't have the recursive, unbounded nature of Auto-GPT-like agents. |

---

hey @alreadydone would you be up for a half-hour call soon, by any chance? Would be awesome to talk. https://cal.com/distbit/30min?duration=30 No problem if you are too busy. I should have more code + writing on these topics very soon, which I will link to here anyway, if that is ok with you.

---

**Datasets for fine-tuning**

- https://github.com/yaodongC/awesome-instruction-dataset
- Toolformer / plugin / APIs
- StackLLaMA: A hands-on guide to train LLaMA with RLHF, https://huggingface.co/blog/stackllama

**Context length**

- Scaling Transformer to 1M tokens and beyond with RMT (2M tokens), https://arxiv.org/abs/2304.11062

---

In response to https://leanprover-community.github.io/archive/stream/208328-IMO-grand-challenge/topic/Current.20status.3F.html#320982506
@Mario Carneiro I think models like Toolformer that offload/outsource some capabilities to external (symbolic) tools will drive down at least the inference cost, if not the training cost as well. The paper reports competitive zero-shot performance at a much lower parameter count. (Many people also mention Langchain's MRKL systems in connection with it.) RETRO performs well on knowledge-intensive tasks at a much lower parameter count by retrieving from a database rather than a toolbox. And there are architecture innovations like RWKV-LM that are less resource-intensive than transformers but may prove to be just as powerful (they just released an RNN with 14B parameters).
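To make the offloading idea concrete, here is a minimal sketch (my own toy harness, not the Toolformer paper's implementation) of the pattern: the LM emits inline markers like `[Calculator(2+3*4)]`, and a lightweight dispatcher executes the named tool and splices the result back into the text, so the model itself never has to do the arithmetic.

```python
import re

def calculator(expr: str) -> str:
    # Restrict eval to arithmetic characters for safety in this toy demo.
    if not re.fullmatch(r"[0-9+\-*/(). ]+", expr):
        raise ValueError(f"unsupported expression: {expr}")
    return str(eval(expr))

# Registry of external (symbolic) tools the model may call.
TOOLS = {"Calculator": calculator}

def run_tool_calls(text: str) -> str:
    """Replace every [Tool(args)] marker with the tool's output."""
    def dispatch(m: re.Match) -> str:
        name, args = m.group(1), m.group(2)
        return TOOLS[name](args)
    return re.sub(r"\[(\w+)\((.*?)\)\]", dispatch, text)

print(run_tool_calls("The answer is [Calculator(2+3*4)]."))
# → The answer is 14.
```

The tool names and marker syntax here are assumptions for illustration; the point is that a small symbolic component absorbs work that would otherwise require model capacity.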
Next-generation models may be trained not just to use tools, but also to create tools and organize them into a codebase (think of long-term memory, or crystallized intelligence), and they're going to read books and papers and grow mathlib for us automatically (auto-formalization). Being able to execute programs and observe their outputs serves as a strong form of grounding (Yoav Goldberg, 5th paragraph), similar to formalization of mathematics. Drori et al.'s PNAS paper and Program-aided Language Models from CMU already do program synthesis in Python, but do not store the programs and build upon them; as we know, to prove a big theorem it's often necessary to have lots of intermediate lemmas in place.
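A hypothetical sketch (names and design mine) of what "storing programs and building upon them" could look like: generated helper functions are kept in a persistent codebase and loaded into scope for later programs, the way intermediate lemmas support a bigger theorem.

```python
# Persistent store of generated tools: name -> source code.
codebase: dict[str, str] = {}

def store_tool(name: str, source: str) -> None:
    """Add a generated function to the persistent codebase."""
    codebase[name] = source

def run_with_codebase(program: str) -> dict:
    """Execute a new program with all stored tools in scope."""
    env: dict = {}
    for source in codebase.values():
        exec(source, env)   # load earlier tools ("lemmas")
    exec(program, env)      # run the new program on top of them
    return env

# A first generation writes a small helper ...
store_tool("double", "def double(x):\n    return 2 * x\n")
# ... and a later generation builds on it without rederiving it.
env = run_with_codebase("result = double(double(10))")
print(env["result"])  # → 40
```

In a real system the stored programs would of course be vetted, versioned, and tested before reuse; the sketch only shows the accumulation mechanism.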
As the model continually learns, it might gradually optimize the programs in its codebase (think of AlphaTensor), including other models it creates and stores there, so more efficient models could eventually emerge in the codebase, the original model could offload most of its capabilities to them, and we could just use the most efficient one. The original model probably needs adaptive computation time, though, since otherwise offloading can't possibly save time even if it improves accuracy. Data-center cooling, neural architecture search, and chip design are other places where a positive feedback loop for AI efficiency could form.
ChatGPT has shown that reinforcement learning can be a very effective way of aligning language models to our purposes, and a codebase has a naturally associated dependency graph, which could potentially help with credit assignment: the actions that wrote the most popular functions/theorems in the codebase/library could be rewarded. As a bonus, PageRank is easy to compute on a DAG.
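A small sketch (graph and scoring scheme are my own illustration) of why PageRank-style credit assignment is easy on a DAG: process nodes in topological order and push each node's credit down to its dependencies in a single pass, with no power iteration needed.

```python
from graphlib import TopologicalSorter

# deps[node] = things the node depends on (e.g. lemmas a theorem uses).
deps = {
    "big_theorem": ["lemma_a", "lemma_b"],
    "lemma_a": ["lemma_b"],
    "lemma_b": [],
}

def credit_scores(deps, damping=0.85):
    score = {n: 1.0 for n in deps}  # base credit for every node
    # static_order() yields dependencies first; reverse it so we start
    # from the "users" and flow credit down to what they rely on.
    for node in reversed(list(TopologicalSorter(deps).static_order())):
        for dep in deps[node]:
            score[dep] += damping * score[node] / len(deps[node])
    return score

scores = credit_scores(deps)
# lemma_b is used by both lemma_a and big_theorem, so it earns the most.
assert scores["lemma_b"] > scores["lemma_a"] > scores["big_theorem"]
```

On a general graph with cycles PageRank needs iteration to converge; acyclicity is what makes the one-pass version possible.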
I wrote down some thoughts in this tweet thread a few weeks ago,
but apparently I was talking into thin air 😂 Are my ideas too crazy, or too well known and already actively pursued in major labs @Geoffrey Irving @Timothee Lacroix? Or maybe I was just shadow-banned by Twitter. (Unfortunately it seems OpenAI's Lean paper authors have all left ...) I think these ideas overlap with those of memorizing transformers from @Yuhuai Tony Wu, @Christian Szegedy, et al., which also use non-differentiable memory, but access it via kNN lookup of vector embeddings. Since LMs are supposed to be good at languages, I think it's natural to let them establish and follow naming conventions for the new concepts they invent and interact with their memory via tokens, but I don't have the experience to judge which approach is more practical.
As for democratization of access, I'm generally optimistic: I'm pretty impressed that I can now run Stable Diffusion (SD) to create 512x512 images with just 4GB of VRAM on my laptop. And it may not be difficult to adapt LMs to our particular purposes: for example, thanks to LoRA it now suffices to fine-tune only 3MB of the SD model's parameters on custom data. Open Assistant is now crowd-sourcing training data to create a ChatGPT clone, and this paradigm of distributed data generation combined with centralized training can be traced back to Leela Zero / Leela Chess Zero. It's worth noting that people are building distributed training infrastructure as well, but it may take time to mature.
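A toy numerical illustration (mine, not the LoRA paper's code) of why LoRA adapters are so small: instead of updating a d x d weight matrix W, one trains only a rank-r update B @ A with r << d, keeping W frozen.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 1024, 4  # hidden size and adapter rank (illustrative values)

W = rng.standard_normal((d, d))   # frozen pretrained weight
A = rng.standard_normal((r, d))   # trainable, r x d
B = np.zeros((d, r))              # trainable, d x r (zero-init: no change at start)

def forward(x):
    # Adapted layer: W x + B (A x); only A and B would receive gradients.
    return W @ x + B @ (A @ x)

full = W.size            # parameters in the frozen matrix
lora = A.size + B.size   # parameters actually fine-tuned
print(f"trainable fraction: {lora / full:.4%}")  # well under 1%
```

With r = 4 and d = 1024 the adapter is 8192 parameters against roughly a million frozen ones, which is the same effect that lets a few MB of SD parameters capture a custom style.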
Volunteer computing has a long history, and people have donated massive amounts of compute to projects like GIMPS (the Great Internet Mersenne Prime Search), or Fishtest, and through BOINC. So it's just a matter of popularizing mathematics to get more computing resources directed to mathlib. When AI is capable of creating new mathematics, it will draw even more volunteer resources: it's much more exciting to see long-standing conjectures being solved than to see an Elo rating improve monotonically (pun intended). Considering how quickly LMs can turn natural language into code and vice versa (already useful in reverse engineering), I think the formal-informal gap could be fairly small for machines.
Time is on our side. Conversational LMs will revolutionize education, and people around the world will get equal access to the most sophisticated mathematical ideas. Advanced mathematics will come out of the brains of geniuses and out of ivory towers to reach more people than ever. (mathlib is already democratizing access to every detail of every proof and definition in it, which you don't normally get in textbooks, and LMs will be able to parse and summarize the Lean code and explain the big picture.) As automation progresses and people work fewer hours, they will have more spare time to devote to mathematics, among other pursuits. Gladly, Microsoft's CEO seems to share the same vision:
Exciting times are ahead!
(Mandatory disclaimer: ChatGPT did not engage in the writing of this essay.)