
Latent Recurrent Depth Language Model

Welcome to the Latent Recurrent Depth Language Model repository! This project provides an implementation of a deep language model that combines latent recurrent architectures with modern attention mechanisms. The model is designed for efficient sequence modeling and language understanding tasks.


Table of Contents

  • Overview
  • Features
  • Directory Structure
  • Installation
  • Usage
  • Model Architecture
  • Push to Hub
  • Contributing
  • License

Overview

This repository implements a novel language modeling architecture that leverages:

  • Latent Recurrent Blocks: To capture long-term dependencies.
  • Multi-Head Attention: For modeling complex interactions between tokens.
  • Deep Stacking of Model Blocks: To achieve depth and expressivity in the network.

The project is modularized to separate concerns such as data handling, tokenization, model definition, training pipelines, and inference utilities. This makes it easy to experiment with different configurations and extend the model.
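
To make the core idea concrete, here is a minimal, illustrative sketch of depth recurrence: a single weight-tied block applied repeatedly to a latent state, adding effective depth without adding parameters. The block choice and loop count here are assumptions for illustration, not the repository's exact code:

    import torch
    import torch.nn as nn

    # Illustrative only: depth recurrence with a weight-tied block.
    dim, steps = 256, 4
    block = nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True)

    h = torch.randn(2, 64, dim)   # (batch, seq_len, dim) latent state
    for _ in range(steps):        # same weights reused at every iteration
        h = block(h)              # effective depth grows with `steps`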


Features

  • Custom Dataset Processing: Easily preprocess and load your text data using dataset.py.
  • Flexible Training Pipeline: Train the model with configurable options using trainer.py and pipeline.py.
  • Inference Utilities: Generate sequences or test model predictions with scripts in the Inference/ directory.
  • Model Hub Integration: Push trained models to popular hubs using push_to_hub.py.
  • Modular Model Design: Separate model components in the Model/ directory including:
    • latent_Recurrent.py
    • recurrent_Block.py
    • prelude_Block.py
    • codaBlock.py
    • multi_head_Attention.py

Directory Structure

LatentRecurrentDepthLM/
├── README.md
├── LICENSE
├── dataset.py
├── pipeline.py
├── push_to_hub.py
├── tokenizer.py
├── trainer.py
├── Inference/
│   ├── One_word.py
│   ├── Squence_Generator.py
│   └── locally.py
└── Model/
    ├── codaBlock.py
    ├── latent_Recurrent.py
    ├── model.py
    ├── multi_head_Attention.py
    ├── prelude_Block.py
    └── recurrent_Block.py
  • Root Files: Core utilities for data processing, training, tokenization, and hub integration.
  • Inference/: Contains scripts for common inference scenarios:
    • One_word.py: Predicts a single word (one token at a time); useful for quick testing.
    • Squence_Generator.py: Generates longer sequences autoregressively (see the sketch below).
    • locally.py: Runs inference locally.
  • Model/: Contains the model definitions and components that build the architecture.
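
As a rough illustration, autoregressive generation usually follows a loop like the one below. The model and tokenizer interfaces shown are assumptions; the actual scripts in Inference/ may differ in detail:

    import torch

    @torch.no_grad()
    def generate(model, tokenizer, prompt, max_new_tokens=50, temperature=1.0):
        # Hypothetical sketch: assumes `model` maps token ids (B, T) to
        # next-token logits (B, T, vocab) and `tokenizer` has encode/decode.
        ids = torch.tensor([tokenizer.encode(prompt)])
        for _ in range(max_new_tokens):
            logits = model(ids)[:, -1, :] / temperature       # last position only
            probs = torch.softmax(logits, dim=-1)
            next_id = torch.multinomial(probs, num_samples=1)  # sample one token
            ids = torch.cat([ids, next_id], dim=1)
        return tokenizer.decode(ids[0].tolist())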

Installation

  1. Clone the Repository:

    git clone https://github.com/codewithdark-git/LatentRecurrentDepthLM.git
    cd LatentRecurrentDepthLM
  2. Create a Virtual Environment (Optional but Recommended):

    python -m venv venv
    source venv/bin/activate   # On Windows use `venv\Scripts\activate`
  3. Install Dependencies:

    Install the required Python packages. For example, if using pip:

    pip install -r requirements.txt

    Note: If a requirements.txt is not provided, ensure you have the following installed:

    • Python 3.7+
    • PyTorch
    • NumPy
    • (Any other library required by your specific implementation)
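
If you need to create requirements.txt yourself, a minimal version might look like this (package set and versions are illustrative guesses, not pinned by the repository):

    torch>=1.9
    numpy
    huggingface_hub   # likely needed by push_to_hub.py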

Usage

Data Preparation

Use dataset.py to preprocess your text data.
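
The exact interface lives in dataset.py; as a rough illustration, a next-token dataset in PyTorch typically looks like the following (class and field names are hypothetical):

    import torch
    from torch.utils.data import Dataset

    class TextDataset(Dataset):
        """Hypothetical stand-in for dataset.py: fixed-length token windows."""
        def __init__(self, token_ids, seq_len=128):
            self.token_ids = token_ids
            self.seq_len = seq_len

        def __len__(self):
            return len(self.token_ids) - self.seq_len

        def __getitem__(self, i):
            chunk = self.token_ids[i : i + self.seq_len + 1]
            x = torch.tensor(chunk[:-1])   # input tokens
            y = torch.tensor(chunk[1:])    # next-token targets
            return x, y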

Training

Start training the model by running the pipeline. You can adjust hyperparameters and training configurations in pipeline.py and trainer.py.
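
For orientation, the skeleton of such a training loop is sketched below; names and options are illustrative and may not match trainer.py exactly:

    import torch
    from torch.utils.data import DataLoader

    def train(model, dataset, epochs=3, lr=3e-4, batch_size=16):
        # Generic next-token training loop (illustrative, not trainer.py's API).
        loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        loss_fn = torch.nn.CrossEntropyLoss()
        for epoch in range(epochs):
            for x, y in loader:
                logits = model(x)                                   # (B, T, vocab)
                loss = loss_fn(logits.flatten(0, 1), y.flatten())   # token-level CE
                opt.zero_grad()
                loss.backward()
                opt.step()
            print(f"epoch {epoch}: loss {loss.item():.4f}")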


Model Architecture

The model architecture is composed of several custom blocks:

  • latent_Recurrent.py & recurrent_Block.py: Implement the recurrent components for sequence modeling.
  • prelude_Block.py & codaBlock.py: Serve as the input and output blocks, respectively, to preprocess input tokens and postprocess model outputs.
  • multi_head_Attention.py: Implements multi-head attention mechanisms that allow the model to focus on different parts of the input simultaneously.
  • model.py: Combines all these components into a cohesive model that can be trained and evaluated.

The modular design allows for easy experimentation with different configurations and architectures.
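
To show how these blocks could fit together, here is a minimal sketch of the prelude → recurrent core → coda layout described above. All names and dimensions are illustrative assumptions; see Model/model.py for the actual implementation:

    import torch
    import torch.nn as nn

    class LatentRecurrentLM(nn.Module):
        """Illustrative sketch, not the repository's exact model."""
        def __init__(self, vocab_size, dim=256, num_heads=8, recurrences=4):
            super().__init__()
            self.prelude = nn.Embedding(vocab_size, dim)      # input block
            self.core = nn.TransformerEncoderLayer(
                d_model=dim, nhead=num_heads, batch_first=True
            )                                                 # weight-tied block
            self.recurrences = recurrences
            self.coda = nn.Linear(dim, vocab_size)            # output block

        def forward(self, input_ids):
            h = self.prelude(input_ids)
            for _ in range(self.recurrences):  # recurrent depth
                h = self.core(h)
            return self.coda(h)                # next-token logits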


Push to Hub

To share your trained model with the community or deploy it on a model hub, use the push_to_hub.py script.
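
For reference, uploading a model directory with the huggingface_hub library generally looks like the following; the repo id and folder path are placeholders, and push_to_hub.py may wrap this differently:

    from huggingface_hub import HfApi

    api = HfApi()  # requires `huggingface-cli login` or an HF_TOKEN
    api.create_repo(repo_id="your-username/latent-recurrent-depth-lm", exist_ok=True)
    api.upload_folder(
        folder_path="checkpoints/latest",   # local directory with model files
        repo_id="your-username/latent-recurrent-depth-lm",
        repo_type="model",
    )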

Contributing

Contributions are welcome! If you have suggestions, bug fixes, or improvements, please open an issue or submit a pull request.

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/your-feature).
  3. Commit your changes (git commit -am 'Add new feature').
  4. Push to the branch (git push origin feature/your-feature).
  5. Create a new Pull Request.

License

This project is licensed under the terms of the MIT License.


Happy Modeling!
