
Liquid Foundation Models [LFMs]


This is an attempt at an open-source implementation of LFMs. It is obviously not the official repository, since the official models are closed source. The papers I am using as references are linked below. Discover more about the model in the original article.

Installation

$ pip3 install -U lfm-torch

Usage

import torch
from lfm_torch.model import LFModel
from loguru import logger

# Instantiate and test the model
if __name__ == "__main__":
    batch_size, seq_length, embedding_dim = 32, 128, 512
    token_dim, channel_dim, expert_dim, adapt_dim, num_experts = (
        embedding_dim,
        embedding_dim,
        embedding_dim,
        128,
        4,
    )
    model = LFModel(
        token_dim, channel_dim, expert_dim, adapt_dim, num_experts
    )

    input_tensor = torch.randn(
        batch_size, seq_length, embedding_dim
    )  # 3D text tensor
    output = model(input_tensor)
    logger.info(f"Model forward pass complete. Output shape: {output.shape}")

Liquid Transformer

A novel neural architecture combining Liquid Neural Networks, Transformer attention mechanisms, and Mixture of Experts (MoE) for enhanced adaptive processing and dynamic state updates. It is very experimental and early! We're working on a training script here. It still needs a real tokenizer (such as Llama's), but it's getting there. If you can help with this, let me know.

Architecture Overview

flowchart TB
    subgraph "Liquid Transformer"
        Input["Input Sequence"] --> TL["Transformer Layer"]
        
        subgraph "Transformer Layer"
            direction TB
            MHA["Multi-Head Attention"] --> LC["Liquid Cell"]
            LC --> MOE["Mixture of Experts"]
            MOE --> LN["Layer Norm + Residual"]
        end
        
        subgraph "Liquid Cell Details"
            direction LR
            HS["Hidden State"] --> WH["W_h Linear"]
            Input2["Input"] --> WI["W_in Linear"]
            WH --> Add((+))
            WI --> Add
            Add --> Act["Activation"]
            Act --> LN2["LayerNorm"]
            LN2 --> DO["Dropout"]
        end
        
        subgraph "MoE Details"
            direction TB
            Input3["Input"] --> Gate["Gating Network"]
            Input3 --> E1["Expert 1"]
            Input3 --> E2["Expert 2"]
            Input3 --> E3["Expert N"]
            Gate --> Comb["Weighted Combination"]
            E1 --> Comb
            E2 --> Comb
            E3 --> Comb
        end
        
        TL --> Output["Output Sequence"]
    end
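The "Liquid Cell Details" subgraph above boils down to a simple recurrent update: the current input and the previous hidden state each go through their own linear projection, the two are summed, and the result passes through an activation, LayerNorm, and dropout. The following minimal sketch illustrates that update; the class name, activation choice, and dropout rate are assumptions for illustration, not the exact code in lfm_torch.liquid_t_moe.

import torch
import torch.nn as nn

class LiquidCellSketch(nn.Module):
    """Illustrative liquid cell: h_new = Dropout(LayerNorm(act(W_in x + W_h h)))."""

    def __init__(self, input_dim: int, hidden_dim: int, dropout: float = 0.1):
        super().__init__()
        self.w_in = nn.Linear(input_dim, hidden_dim)  # "W_in Linear" branch
        self.w_h = nn.Linear(hidden_dim, hidden_dim)  # "W_h Linear" branch
        self.act = nn.GELU()                          # "Activation" (choice assumed)
        self.norm = nn.LayerNorm(hidden_dim)          # "LayerNorm"
        self.dropout = nn.Dropout(dropout)            # "Dropout"

    def forward(self, x: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
        # Sum the projected input and the projected hidden state, then
        # normalize and regularize, mirroring the diagram above.
        return self.dropout(self.norm(self.act(self.w_in(x) + self.w_h(h))))

The full LiquidTransformer module (attention, liquid cell, and MoE combined) is used as follows: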
import torch
from loguru import logger

from lfm_torch.liquid_t_moe import LiquidTransformer

# Example usage
if __name__ == "__main__":
    seq_len, batch_size, embed_size = 10, 2, 64
    num_heads, num_experts, expert_size, num_layers = 8, 4, 64, 6

    # Create the model
    model = LiquidTransformer(embed_size, num_heads, num_experts, expert_size, num_layers)

    # Example input tensor
    x = torch.randn(seq_len, batch_size, embed_size)

    # Forward pass
    output = model(x)
    logger.info(f"Model output shape: {output.shape}")

Citations

License

This project is licensed under the MIT License. See the LICENSE file for details.
