
PKU-Alignment

We love sharing and open-source work, and we aim to make AI safer.

PKU-Alignment Team

Large language models (LLMs) have immense potential in the field of general intelligence, but they also come with significant risks. As a research team at Peking University, we focus on alignment techniques for LLMs, such as safety alignment, to enhance model safety and reduce toxicity.

You are welcome to follow our AI safety projects:

Pinned

  1. omnisafe Public

    JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 939 stars · 132 forks

  2. safety-gymnasium Public

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 394 stars · 52 forks

  3. safe-rlhf Public

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1.3k stars · 119 forks

  4. Safe-Policy-Optimization Public

    NeurIPS 2023: Safe Policy Optimization: A benchmark repository for safe reinforcement learning algorithms

    Python · 328 stars · 45 forks

Repositories

Showing 10 of 16 repositories
  • align-anything Public

    Align Anything: Training All-modality Model with Feedback

    Python · 238 stars · 43 forks · Apache-2.0 · Updated Nov 10, 2024
  • .github Public
    Updated Nov 9, 2024
  • ProgressGym Public

    NeurIPS 2024 Datasets and Benchmarks Track (Spotlight): Alignment with a millennium of moral progress.

    Python · 11 stars · 3 forks · MIT · Updated Nov 9, 2024
  • Aligner2024.github.io Public

    HTML · 1 fork · Updated Oct 31, 2024
  • omnisafe Public

    JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

    Python · 939 stars · 132 forks · Apache-2.0 · Updated Oct 15, 2024
  • safe-sora Public

    SafeSora is a human preference dataset designed to support safety alignment research in the text-to-video generation field, aiming to enhance the helpfulness and harmlessness of Large Vision Models (LVMs).

    Python · 25 stars · 5 forks · Updated Aug 20, 2024
  • safe-rlhf Public

    Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback

    Python · 1,345 stars · 119 forks · Apache-2.0 · Updated Jun 13, 2024
  • llms-resist-alignment Public

Repository for the paper "Language Models Resist Alignment"

    Python · 4 stars · Updated Jun 9, 2024
  • aligner Public (forked from cby-pku/aligner)

    Achieving Efficient Alignment through Learned Correction

    Python · 6 forks · Updated Jun 7, 2024
  • safety-gymnasium Public

    NeurIPS 2023: Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

    Python · 394 stars · 52 forks · Apache-2.0 · Updated May 14, 2024
