Liang-Team/Sequenzo

Sequenzo: Fast, scalable, and intuitive social sequence analysis in Python

Sequenzo is a high-performance Python package designed for social sequence analysis. It is built to analyze any sequence of categorical events, from individual career paths and migration patterns to corporate growth and urban development. Whether you are working with people, places, or policies, Sequenzo helps uncover meaningful patterns efficiently.

Sequenzo outperforms traditional R-based tools in social sequence analysis, delivering faster processing, especially on large-scale datasets. You do not need big data to benefit, though: Sequenzo is designed to support sequence analysis at any scale and to make complex methods accessible to everyone.

🚀 Explore the official documentation at sequenzo.yuqi-liang.tech
with tutorials, practical examples, and API references to help you get started quickly.

📖 Available in English and Chinese, our docs are written to be approachable, practical, and easy to follow.

✨ Be part of the Sequenzo community

Join our Discord channel to discuss ideas, get help, and be the first to hear about upcoming Sequenzo versions, tutorials, and workshops.

➡️ https://discord.gg/3bMDKRHW

Target Users

Sequenzo is designed for:

  • Quantitative researchers in sociology, demography, political science, economics, management, etc.
  • Data scientists, data analysts, and business analysts working on trajectory/time-series clustering
  • Educators teaching courses involving social sequence data
  • Users familiar with R packages such as TraMineR who want a Python-native alternative

Why Choose Sequenzo?

🚀 High Performance

Leverages Python’s computational power to achieve 8× faster processing than traditional R-based tools like TraMineR.

🎯 Easy-to-Use API

Designed with simplicity in mind: intuitive functions streamline complex sequence analysis without compromising flexibility.

🌍 Flexible for Any Scenario

Perfect for research, policy, and business, enabling seamless analysis of categorical data and its evolution over time.

Platform Compatibility

Sequenzo provides pre-built Python wheels for maximum compatibility — no need to compile from source.

| Platform | Architecture | Python Versions | Status |
| --- | --- | --- | --- |
| macOS | Intel & Apple Silicon (64-bit) | 3.9, 3.10, 3.11, 3.12 | ✅ Pre-built wheel |
| Windows | AMD64 (64-bit) | 3.9, 3.10, 3.11, 3.12 | ✅ Pre-built wheel |
| Linux (glibc) | x86_64 (standard Linux) | 3.9, 3.10, 3.11, 3.12 | ✅ Pre-built wheel |
| Linux (musl) | x86_64 (Alpine Linux) | 3.9, 3.10, 3.11, 3.12 | ✅ Pre-built wheel |

What do these terms mean?

  • macosx_arm64 (macOS): One wheel supports Apple Silicon Macs.
  • macosx_x86_64 (macOS): One wheel supports Intel Macs.
  • manylinux2014_x86_64 (glibc-based Linux): Compatible with most mainstream Linux distributions (e.g., Ubuntu, Debian, CentOS).
  • musllinux_1_2 (musl-based Linux): For lightweight Alpine Linux environments, common in Docker containers.
  • AMD64 (Windows): Standard 64-bit Windows system architecture.

All of these wheels are pre-built and available on PyPI — so pip install sequenzo should work on supported platforms, without needing a compiler.
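
If you are unsure which of these wheels applies to your machine, a short script can report it. The following is a minimal sketch using only the Python standard library; the helper name `expected_wheel_tag` is hypothetical, and the mapping simply mirrors the table and term list above (it treats any non-glibc Linux as musl, which is a simplification).

```python
import platform
import sys

# Python versions listed in the compatibility table above.
SUPPORTED_PYTHONS = {(3, 9), (3, 10), (3, 11), (3, 12)}


def expected_wheel_tag(system, machine, libc=""):
    """Map an OS / CPU / libc combination to the wheel family named in this README.

    Returns None when no pre-built wheel is listed for that combination.
    """
    system, machine = system.lower(), machine.lower()
    if system == "darwin":
        return {"arm64": "macosx_arm64", "x86_64": "macosx_x86_64"}.get(machine)
    if system == "windows":
        return "AMD64" if machine in ("amd64", "x86_64") else None
    if system == "linux" and machine == "x86_64":
        # Simplification: anything that is not glibc is assumed to be musl.
        return "manylinux2014_x86_64" if libc == "glibc" else "musllinux_1_2"
    return None


# Report what this interpreter would most likely receive from PyPI.
tag = expected_wheel_tag(platform.system(), platform.machine(), platform.libc_ver()[0])
python_listed = sys.version_info[:2] in SUPPORTED_PYTHONS
print(f"Expected wheel family: {tag}")
print(f"Python {sys.version_info[0]}.{sys.version_info[1]} listed in the table: {python_listed}")
```

If the script prints `None` for the wheel family, pip will probably not find a pre-built wheel for your platform.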

Windows (win32) and Linux (i686) are dropped due to:

  • Extremely low usage in modern systems (post-2020)
  • Memory limitations (≤ 4GB) unsuitable for scientific computing workloads
  • Increasing incompatibility with packages such as numpy, scipy, and pybind11
  • Frequent build failures and maintenance overhead in CI/CD pipelines

Installation

If you haven't installed Python, please follow Yuqi's tutorial about how to set up Python and your virtual environment.

Once Python is installed, we highly recommend using PyCharm as your IDE (Integrated Development Environment — the place where you open your folder and files to work with Python), rather than Visual Studio. PyCharm has excellent built-in support for managing virtual environments, making your workflow much easier and more reliable.

In PyCharm, please make sure to select a virtual environment using Python 3.9, 3.10, or 3.11 as these versions are fully supported by sequenzo.

Then open the built-in terminal by clicking the Terminal icon in the left sidebar (usually near the bottom). It looks like a small command-line window.

Once it’s open, type the following to install sequenzo:

pip install sequenzo

If you run into issues with the installation, it might be because you have both Python 2 and Python 3 installed on your computer. In that case, try using pip3 instead of pip to install the package.

pip3 install sequenzo
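
If it is still unclear which Python your pip belongs to, one common workaround is to call pip through the interpreter itself (`python -m pip`) and then check that the package is visible to that interpreter. The sketch below uses only the standard library; the helper names are hypothetical and it does not run the installation for you.

```python
import sys
from importlib import metadata


def pip_install_command(package):
    """Build a pip command bound to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", package]


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Print the command to run in the terminal, then report what is installed.
print("Install with:", " ".join(pip_install_command("sequenzo")))
print("Installed sequenzo version:", installed_version("sequenzo"))
```

Running this inside your PyCharm virtual environment shows whether sequenzo was installed into that environment or into a different Python.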

Optional R Integration

Before running ward.D hierarchical clustering, Sequenzo now checks the system environment variables to see whether R is available.

If R is missing, Sequenzo displays a prompt with specific installation instructions and does not attempt to download fastcluster. If R is available but fastcluster is missing, Sequenzo downloads fastcluster automatically.

Sequenzo supports advanced Ward clustering methods that require R integration. If you need to use the ward_d clustering method, install with R support:

pip install sequenzo[r]

This will install the optional rpy2 dependency, which provides Python-R interoperability. Note that R must also be installed on your system for rpy2 to work.
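
To see in advance whether your machine is ready for the R-backed method, you can run a quick check yourself. The sketch below is not Sequenzo's internal code: the function names are hypothetical, looking for R on PATH or in the `R_HOME` variable is an assumption about what "available" means, and `ward_d_setup_action` only restates the order of checks described above.

```python
import importlib.util
import os
import shutil


def r_is_available():
    """True if R can be located via PATH or the R_HOME environment variable."""
    on_path = shutil.which("R") is not None or shutil.which("Rscript") is not None
    return on_path or bool(os.environ.get("R_HOME"))


def rpy2_is_available():
    """True if the optional rpy2 dependency can be imported."""
    return importlib.util.find_spec("rpy2") is not None


def ward_d_setup_action(r_available, fastcluster_available):
    """Restate the documented order: check R first, then fastcluster."""
    if not r_available:
        return "prompt user to install R"
    if not fastcluster_available:
        return "download fastcluster"
    return "ready"


print("R available:", r_is_available())
print("rpy2 available:", rpy2_is_available())
```

If R is reported as unavailable, install R first; installing rpy2 alone is not enough.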

For more information about the latest stable release and required dependencies, please refer to PyPI.

Documentation

Explore the full Sequenzo documentation here. Even though the documentation website is still under construction, you can already find some useful information there.

Where to start on the documentation website?

  • New to Sequenzo or social sequence analysis? Begin with "About Sequenzo" → "Quickstart Guide" for a smooth introduction.
  • Got your own data? After going through "About Sequenzo" and "Quickstart Guide", you are ready to dive in and start analyzing.
  • Looking for more? Check out our example datasets and tutorials to deepen your understanding.

For Chinese users, additional video tutorials by Yuqi are available on Bilibili.

Join the Community

💬 Have a question or found a bug?

Please submit an issue on GitHub Issues, following these instructions.

  • We will respond as quickly as possible.
  • For small requests, we aim to fix the bug or implement the feature within one week of our first response.
  • Timelines may vary depending on how many requests we receive.

🌟 Enjoying Sequenzo?

Support the project by starring ⭐ the GitHub repo and spreading the word!

🛠 Interested in contributing?

Check out our contribution guide for more details (work in progress).

  • Write code? Submit a pull request to enhance Sequenzo.
  • Testing? Try Sequenzo and share your feedback. Every suggestion counts!

If you're contributing or debugging, use:

pip install -r requirements/requirements-3.10.txt  # Or matching your Python version

For standard installation, use:

pip install .  # Uses pyproject.toml
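
If you switch between Python versions, a small helper can print the requirements file for the interpreter you are running. This is only a sketch: the function name is hypothetical, and it assumes every version follows the same `requirements-<major>.<minor>.txt` naming as the 3.10 file shown above.

```python
import sys
from pathlib import Path


def requirements_file(version=None):
    """Return the assumed requirements path for a (major, minor) Python version."""
    major, minor = version or sys.version_info[:2]
    return Path("requirements") / f"requirements-{major}.{minor}.txt"


# Show the file that would match the current interpreter.
print("pip install -r", requirements_file().as_posix())
```

Check that the printed file actually exists in the repository before passing it to pip.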

Team

Paper Authors

Package Contributors

Coding contributors:

Documentation contributors:

Others

  • With special thanks to our initial testers (alphabetically ordered): Joji Chia, Kass Gonzalez, Sinyee Lu, Sohee Shin
  • Website and related technical support: Mactavish
  • Sequence data sources compilation - History: Jingrui Chen
  • Visual design consultant: Changyu Yi

Acknowledgements