Falcon LLM is a cutting-edge family of open-source, causal decoder-only large language models developed by the Technology Innovation Institute (TII) in Abu Dhabi, UAE. These models are designed to perform diverse natural language processing tasks with remarkable efficiency and scalability. Falcon stands out for its fully transparent development process, making it an ideal choice for both academic and commercial applications.
- Open Source: Released under the permissive Apache 2.0 license.
- Scalability: Models ranging from lightweight Falcon-1B to enterprise-grade Falcon-180B.
- Training Data: Pretrained on 5 trillion tokens from the high-quality RefinedWeb dataset.
- Performance: Strong results on benchmarks like MMLU and HELM; the largest models are reported to perform on par with PaLM-2 and close behind GPT-4.
- Applications: Text generation, code creation, conversational AI, and more.
The Falcon team experimented with many architectural combinations and found that the choices below worked best for them. Their guiding design philosophy was not only to improve model performance but also to keep the model design scalable and cost- and memory-efficient.
Falcon models are built on an enhanced decoder-only transformer architecture. Key innovations include:
- Flash Attention: Optimized for faster and more memory-efficient computations.
- Rotary Positional Embeddings (RoPE): Encodes positions by rotating query and key vectors, blending absolute and relative positional information and improving generalization to longer sequences.
- Multi-Query Attention: Reduces memory and computation overhead by sharing key and value vectors across attention heads.
- Parallel Attention and Feed-Forward Layers: Boosts inference and training efficiency.
These enhancements ensure that Falcon LLM achieves state-of-the-art performance while remaining cost-effective.
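Multi-query attention is the easiest of these to see in code. The sketch below is a minimal, illustrative PyTorch implementation, not TII's actual code: every query head attends over a single shared key/value head, so the key/value cache is roughly n_heads times smaller during decoding.

import torch
import torch.nn.functional as F

def multi_query_attention(x, w_q, w_kv, n_heads):
    # x: (batch, seq, d_model). w_q projects to all query heads;
    # w_kv projects to ONE shared key/value head -- the multi-query trick.
    b, t, d = x.shape
    head_dim = d // n_heads
    q = (x @ w_q).view(b, t, n_heads, head_dim).transpose(1, 2)   # (b, heads, t, head_dim)
    k, v = (x @ w_kv).split(head_dim, dim=-1)                     # each (b, t, head_dim)
    k = k.unsqueeze(1)                                            # (b, 1, t, head_dim), shared across heads
    v = v.unsqueeze(1)
    scores = (q @ k.transpose(-2, -1)) / head_dim ** 0.5          # shared K broadcasts over query heads
    attn = F.softmax(scores, dim=-1)                              # causal masking omitted for brevity
    return (attn @ v).transpose(1, 2).reshape(b, t, d)

# Toy usage: 8 query heads, but only one K/V head would need to be cached.
x = torch.randn(1, 16, 512)
out = multi_query_attention(x, torch.randn(512, 512), torch.randn(512, 2 * 64), n_heads=8)
print(out.shape)  # torch.Size([1, 16, 512])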
Falcon LLM has demonstrated leading performance in various evaluations:
- AI2 Reasoning Challenge (ARC): Grade-school multiple choice science questions.
- HellaSwag: Commonsense reasoning around everyday events.
- MMLU: Multiple-choice questions in 57 subjects (professional & academic).
- TruthfulQA: Tests the model’s ability to separate fact from an adversarially-selected set of incorrect statements.
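Results on these benchmarks can be reproduced locally with EleutherAI's lm-evaluation-harness. The command below is a hedged sketch: it assumes the harness is installed (pip install lm-eval) and that these task names match your harness version, which may differ.

lm_eval --model hf \
    --model_args pretrained=tiiuae/falcon-7b \
    --tasks arc_challenge,hellaswag,truthfulqa_mc2 \
    --batch_size 8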
- Set up a Python environment with the Hugging Face Transformers library.
- Download a Falcon model (1B, 7B, 10B, 11B, 40B, or 180B) via Hugging Face:
pip install transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-7b")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-7b")
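With the model and tokenizer loaded as above, a short generation looks like the following. This is a minimal sketch; the prompt and decoding parameters are illustrative, not recommended settings.

inputs = tokenizer("Falcon LLM is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))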
- Run the model with the Transformers pipeline helper:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="tiiuae/Falcon3-1B-Base")

# Generate text
prompt = "Once upon a time"
result = pipe(prompt, max_length=1000, num_return_sequences=1)

# Print the generated text
print(result[0]['generated_text'])
- Run test scripts or interact with the model in a Python session.
- Download and install the Ollama environment from the Ollama Library.
- Use the provided instructions to set up Falcon for your operating system, then pull and run the model (a concrete example follows this list):
ollama pull <falcon-model>
ollama run <falcon-model>
- Engage with the model interactively for text generation or other NLP tasks.
- Run test scripts or interact with the model in an Ollama session.
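As a concrete example, assuming a Falcon entry such as falcon3 is published in the Ollama library (check the library page for the model names and tags actually available):

ollama pull falcon3
ollama run falcon3 "Write a short poem about the desert."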
- Download and install the Ollama environment from the Ollama Library.
- Use the provided instructions to set up Falcon for your operating system:
ollama pull <falcon-model>
- Set up Open WebUI as the interface for the Falcon model. Open WebUI can be installed using pip, the Python package installer. Before proceeding, ensure you're using Python 3.11 to avoid compatibility issues.
- Install Open WebUI: Open your terminal and run the following command to install Open WebUI:
pip install open-webui
- Running Open WebUI: After installation, you can start Open WebUI by executing:
open-webui serve
- This will start the Open WebUI server, which you can access at http://localhost:8080
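- Open WebUI looks for a local Ollama server by default; if your Ollama instance runs on a different host or port, it can usually be pointed at it via the OLLAMA_BASE_URL environment variable (verify against the Open WebUI documentation for your version), for example:
OLLAMA_BASE_URL=http://127.0.0.1:11434 open-webui serve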
- Engage with the model interactively for text generation or other NLP tasks.
For further details or queries, please visit the Falcon LLM Website.