Merge pull request #2 from microsoft/hjiang/MInference
PreRelease: v0.1.0 Co-authored-by: Yucheng Li <liyucheng09@gmail.com> Co-authored-by: Chengruidong Zhang <chengzhang@microsoft.com>
Showing 90 changed files with 14,295 additions and 28 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,47 @@ | ||
name: "\U0001F41B Bug Report" | ||
description: Submit a bug report to help us improve MInference | ||
title: "[Bug]: " | ||
labels: ["bug"] | ||
|
||
body: | ||
- type: textarea | ||
id: description | ||
attributes: | ||
label: Describe the bug | ||
description: A clear and concise description of what the bug is. | ||
placeholder: What went wrong? | ||
- type: textarea | ||
id: reproduce | ||
attributes: | ||
label: Steps to reproduce | ||
description: | | ||
Steps to reproduce the behavior: | ||
1. Step 1 | ||
2. Step 2 | ||
3. ... | ||
4. See error | ||
placeholder: How can we replicate the issue? | ||
- type: textarea | ||
id: expected_behavior | ||
attributes: | ||
label: Expected Behavior | ||
description: A clear and concise description of what you expected to happen. | ||
placeholder: What should have happened? | ||
- type: textarea | ||
id: logs | ||
attributes: | ||
label: Logs | ||
description: If applicable, add logs or screenshots to help explain your problem. | ||
placeholder: Add logs here | ||
- type: textarea | ||
id: additional_information | ||
attributes: | ||
label: Additional Information | ||
description: | | ||
- MInference Version: <!-- Specify the MInference version (e.g., v0.1.0) --> | ||
- Operating System: <!-- Specify the OS (e.g., Windows 10, Ubuntu 20.04) --> | ||
- Python Version: <!-- Specify the Python version (e.g., 3.8) --> | ||
- Related Issues: <!-- Link to any related issues here (e.g., #1) --> | ||
- Any other relevant information. | ||
placeholder: Any additional details |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
blank_issues_enabled: true |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
name: "\U0001F680 Feature request" | ||
description: Submit a proposal/request for a new MInference feature | ||
labels: ["feature request"] | ||
title: "[Feature Request]: " | ||
|
||
body: | ||
- type: textarea | ||
id: problem_description | ||
attributes: | ||
label: Is your feature request related to a problem? Please describe. | ||
description: A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] | ||
placeholder: What problem are you trying to solve? | ||
|
||
- type: textarea | ||
id: solution_description | ||
attributes: | ||
label: Describe the solution you'd like | ||
description: A clear and concise description of what you want to happen. | ||
placeholder: How do you envision the solution? | ||
|
||
- type: textarea | ||
id: additional_context | ||
attributes: | ||
label: Additional context | ||
description: Add any other context or screenshots about the feature request here. | ||
placeholder: Any additional information |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
name: "\U0001F31F General Question" | ||
description: File a general question | ||
title: "[Question]: " | ||
labels: ["question"] | ||
|
||
body: | ||
- type: textarea | ||
id: description | ||
attributes: | ||
label: Describe the issue | ||
description: A clear and concise description of what the question is. | ||
placeholder: Details of your question. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
# What does this PR do? | ||
|
||
<!-- | ||
Congratulations! You've made it this far! You're not quite done yet though. | ||
Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflects the extent of your awesome contribution. | ||
Then, please replace this with a description of the change and which issue is fixed (if applicable). Please also include relevant motivation and context. List any dependencies (if any) that are required for this change. | ||
Once you're done, someone will review your PR shortly (see the section "Who can review?" below to tag some potential reviewers). They may suggest changes to make the code even better. If no one reviewed your PR after a week has passed, don't hesitate to post a new comment @-mentioning the same persons---sometimes notifications get lost. | ||
--> | ||
|
||
<!-- Remove if not applicable --> | ||
|
||
Fixes # (issue) | ||
|
||
|
||
## Before submitting | ||
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). | ||
- [ ] Was this discussed/approved via a GitHub issue? Please add a link | ||
to it if that's the case. | ||
- [ ] Did you make sure to update the documentation with your changes? | ||
- [ ] Did you write any new necessary tests? | ||
|
||
|
||
## Who can review? | ||
|
||
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag | ||
members/contributors who may be interested in your PR. | ||
|
||
<!-- Your PR will be replied to more quickly if you can figure out the right person to tag with @ | ||
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**. | ||
Please tag fewer than 3 people. | ||
MInference: | ||
- general: @iofu728, @liyucheng09, @Starmys, and @mydmdm | ||
- kernel related: @Starmys | ||
--> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
# This workflow will build and upload a Python package using Twine when a release is published | ||
# Conda-forge bot will pick up new PyPI version and automatically create new version | ||
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions#publishing-to-package-registries | ||
|
||
name: Release | ||
|
||
on: | ||
push: | ||
tags: | ||
- v* | ||
|
||
# Needed to create release and upload assets | ||
permissions: | ||
contents: write | ||
|
||
jobs: | ||
release: | ||
# Retrieve tag and create release | ||
name: Create Release | ||
runs-on: ubuntu-latest | ||
outputs: | ||
upload_url: ${{ steps.create_release.outputs.upload_url }} | ||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v4 | ||
|
||
- name: Extract branch info | ||
shell: bash | ||
run: | | ||
echo "release_tag=${GITHUB_REF#refs/*/}" >> $GITHUB_ENV | ||
- name: Create Release | ||
id: create_release | ||
uses: "actions/github-script@v6" | ||
env: | ||
RELEASE_TAG: ${{ env.release_tag }} | ||
with: | ||
github-token: "${{ secrets.GITHUB_TOKEN }}" | ||
script: | | ||
const script = require('.github/workflows/scripts/create_release.js') | ||
await script(github, context, core) | ||
wheel: | ||
name: Build Wheel | ||
runs-on: ${{ matrix.os }} | ||
needs: release | ||
|
||
strategy: | ||
fail-fast: false | ||
matrix: | ||
os: ['ubuntu-20.04'] | ||
python-version: ['3.8', '3.9', '3.10', '3.11'] | ||
pytorch-version: ['2.3.0'] # Must be the most recent version that meets requirements-cuda.txt. | ||
cuda-version: ['11.8', '12.1'] | ||
|
||
steps: | ||
- name: Checkout | ||
uses: actions/checkout@v4 | ||
|
||
- name: Setup ccache | ||
uses: hendrikmuhs/ccache-action@v1.2 | ||
with: | ||
create-symlink: true | ||
key: ${{ github.job }}-${{ matrix.python-version }}-${{ matrix.cuda-version }} | ||
|
||
- name: Set up Linux Env | ||
if: ${{ runner.os == 'Linux' }} | ||
run: | | ||
bash -x .github/workflows/scripts/env.sh | ||
- name: Set up Python | ||
uses: actions/setup-python@v4 | ||
with: | ||
python-version: ${{ matrix.python-version }} | ||
|
||
- name: Install CUDA ${{ matrix.cuda-version }} | ||
run: | | ||
bash -x .github/workflows/scripts/cuda-install.sh ${{ matrix.cuda-version }} ${{ matrix.os }} | ||
- name: Install PyTorch ${{ matrix.pytorch-version }} with CUDA ${{ matrix.cuda-version }} | ||
run: | | ||
bash -x .github/workflows/scripts/pytorch-install.sh ${{ matrix.python-version }} ${{ matrix.pytorch-version }} ${{ matrix.cuda-version }} | ||
- name: Build wheel | ||
shell: bash | ||
env: | ||
CMAKE_BUILD_TYPE: Release # do not compile with debug symbol to reduce wheel size | ||
run: | | ||
bash -x .github/workflows/scripts/build.sh ${{ matrix.python-version }} ${{ matrix.cuda-version }} | ||
wheel_name=$(ls dist/*whl | xargs -n 1 basename) | ||
asset_name=${wheel_name//"linux"/"manylinux1"} | ||
echo "wheel_name=${wheel_name}" >> $GITHUB_ENV | ||
echo "asset_name=${asset_name}" >> $GITHUB_ENV | ||
- name: Upload Release Asset | ||
uses: actions/upload-release-asset@v1 | ||
env: | ||
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
with: | ||
upload_url: ${{ needs.release.outputs.upload_url }} | ||
asset_path: ./dist/${{ env.wheel_name }} | ||
asset_name: ${{ env.asset_name }} | ||
asset_content_type: application/* | ||
- name: Store the distribution packages | ||
uses: actions/upload-artifact@v4 | ||
with: | ||
name: ${{ env.asset_name }} | ||
path: ./dist/${{ env.wheel_name }} | ||
|
||
# publish-to-pypi: | ||
# name: >- | ||
# Publish Python 🐍 distribution 📦 to PyPI | ||
# if: startsWith(github.ref, 'refs/tags/') # only publish to PyPI on tag pushes | ||
# needs: wheel | ||
# runs-on: ubuntu-latest | ||
# permissions: | ||
# id-token: write # IMPORTANT: mandatory for trusted publishing | ||
|
||
# steps: | ||
# - name: Download all the dists | ||
# uses: actions/download-artifact@v4 | ||
# with: | ||
# path: dist/ | ||
# - name: Pick the whl files | ||
# run: for file in dist/*;do mv $file ${file}1; done && cp dist/*/*.whl dist/ && rm -rf dist/*.whl1 && rm -rf dist/*+cu* | ||
# - name: Display structure of downloaded files | ||
# run: ls -R dist/ | ||
# - name: Publish distribution 📦 to PyPI | ||
# uses: pypa/gh-action-pypi-publish@release/v1 |
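
The workflow above derives the release tag from `GITHUB_REF` and renames the built wheel before uploading it. The following is a minimal sketch of those two string transformations, written in portable shell so it can be tried outside of GitHub Actions. The function names `release_tag_from_ref` and `asset_name_from_wheel` and the example wheel filename are hypothetical and do not appear in the commit; `sed` stands in for the bash-only `${wheel_name//"linux"/"manylinux1"}` expansion.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit) that mirror the workflow's
# string handling so it can be checked locally.

# Strip the shortest prefix matching "refs/*/", like ${GITHUB_REF#refs/*/}.
release_tag_from_ref() {
    ref="$1"
    printf '%s\n' "${ref#refs/*/}"
}

# Replace every "linux" with "manylinux1". The workflow does this with a
# bash pattern substitution; sed is used here as a portable equivalent.
asset_name_from_wheel() {
    printf '%s\n' "$1" | sed 's/linux/manylinux1/g'
}

# Example inputs (the wheel filename is made up for illustration).
release_tag_from_ref "refs/tags/v0.1.0"
asset_name_from_wheel "example-0.1.0-cp310-cp310-linux_x86_64.whl"
```

Because the prefix match is the shortest one, a ref such as `refs/heads/feature/x` would yield `feature/x` rather than `x`.
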
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
#!/bin/bash | ||
|
||
python_executable=python$1 | ||
cuda_home=/usr/local/cuda-$2 | ||
|
||
# Update paths | ||
export PATH=${cuda_home}/bin:$PATH | ||
export LD_LIBRARY_PATH=${cuda_home}/lib64:$LD_LIBRARY_PATH | ||
|
||
# Install requirements | ||
$python_executable -m pip install wheel packaging | ||
$python_executable -m pip install flash_attn triton | ||
|
||
# Limit the number of parallel jobs to avoid OOM | ||
export MAX_JOBS=1 | ||
# Make sure release wheels are built for the following architectures | ||
export TORCH_CUDA_ARCH_LIST="7.0 7.5 8.0 8.6 8.9 9.0+PTX" | ||
# Build | ||
$python_executable setup.py bdist_wheel --dist-dir=dist |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
// Uses GitHub's API to create the release and wait for the result. | ||
// We use a JS script because the GitHub CLI returns immediately and provides no way to wait for the release to be created. | ||
|
||
module.exports = async (github, context, core) => { | ||
try { | ||
const response = await github.rest.repos.createRelease({ | ||
draft: false, | ||
generate_release_notes: true, | ||
name: process.env.RELEASE_TAG, | ||
owner: context.repo.owner, | ||
prerelease: true, | ||
repo: context.repo.repo, | ||
tag_name: process.env.RELEASE_TAG, | ||
}); | ||
|
||
core.setOutput('upload_url', response.data.upload_url); | ||
} catch (error) { | ||
core.setFailed(error.message); | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,23 @@ | ||
#!/bin/bash | ||
|
||
# Replace '.' with '-' ex: 11.8 -> 11-8 | ||
cuda_version=$(echo $1 | tr "." "-") | ||
# Removes '-' and '.' ex: ubuntu-20.04 -> ubuntu2004 | ||
OS=$(echo $2 | tr -d ".\-") | ||
|
||
# Installs CUDA | ||
wget -nv https://developer.download.nvidia.com/compute/cuda/repos/${OS}/x86_64/cuda-keyring_1.1-1_all.deb | ||
sudo dpkg -i cuda-keyring_1.1-1_all.deb | ||
rm cuda-keyring_1.1-1_all.deb | ||
sudo apt -qq update | ||
sudo apt -y install cuda-${cuda_version} cuda-nvcc-${cuda_version} cuda-libraries-dev-${cuda_version} | ||
sudo apt clean | ||
|
||
# Test nvcc | ||
PATH=/usr/local/cuda-$1/bin:${PATH} | ||
nvcc --version | ||
|
||
# Log gcc, g++, c++ versions | ||
gcc --version | ||
g++ --version | ||
c++ --version |
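
The script above normalizes its two arguments before building the apt package names and the keyring URL. As a rough sketch of that normalization, assuming the same inputs the workflow matrix passes (`11.8` or `12.1`, and `ubuntu-20.04`), the helpers below reproduce the `tr` calls in isolation. The function names are hypothetical; the URL pattern is the one used in the script.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit) isolating the argument
# normalization done at the top of the CUDA install script.

# "11.8" -> "11-8": apt package names use '-' instead of '.'.
cuda_pkg_suffix() {
    printf '%s\n' "$1" | tr '.' '-'
}

# "ubuntu-20.04" -> "ubuntu2004": the NVIDIA repo path drops '-' and '.'.
os_repo_name() {
    printf '%s\n' "$1" | tr -d '.-'
}

# Build the keyring URL for a given runner OS label.
cuda_keyring_url() {
    printf 'https://developer.download.nvidia.com/compute/cuda/repos/%s/x86_64/cuda-keyring_1.1-1_all.deb\n' \
        "$(os_repo_name "$1")"
}

cuda_pkg_suffix "11.8"
os_repo_name "ubuntu-20.04"
cuda_keyring_url "ubuntu-20.04"
```

Keeping the transformations in small functions makes it easy to check a new matrix entry locally before it reaches the runner.
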
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,56 @@ | ||
#!/bin/bash | ||
|
||
# This file installs common linux environment tools | ||
|
||
export LANG=C.UTF-8 | ||
|
||
# python_version=$1 | ||
|
||
sudo apt-get update && \ | ||
sudo apt-get install -y --no-install-recommends \ | ||
software-properties-common | ||
|
||
sudo apt-get install -y --no-install-recommends \ | ||
build-essential \ | ||
apt-utils \ | ||
ca-certificates \ | ||
wget \ | ||
git \ | ||
vim \ | ||
libssl-dev \ | ||
curl \ | ||
unzip \ | ||
unrar \ | ||
cmake \ | ||
net-tools \ | ||
sudo \ | ||
autotools-dev \ | ||
rsync \ | ||
jq \ | ||
openssh-server \ | ||
tmux \ | ||
screen \ | ||
htop \ | ||
pdsh \ | ||
openssh-client \ | ||
lshw \ | ||
dmidecode \ | ||
util-linux \ | ||
automake \ | ||
autoconf \ | ||
libtool \ | ||
pciutils \ | ||
libpci-dev \ | ||
libaio-dev \ | ||
libcap2 \ | ||
libtinfo5 \ | ||
fakeroot \ | ||
devscripts \ | ||
debhelper \ | ||
nfs-common | ||
|
||
# Remove GitHub bloat files to free up disk space | ||
sudo rm -rf "/usr/local/share/boost" | ||
sudo rm -rf "$AGENT_TOOLSDIRECTORY" | ||
sudo rm -rf "/usr/share/dotnet" |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
#!/bin/bash | ||
|
||
python_executable=python$1 | ||
pytorch_version=$2 | ||
cuda_version=$3 | ||
|
||
# Install torch | ||
$python_executable -m pip install numpy pyyaml scipy ipython mkl mkl-include ninja cython typing pandas typing-extensions dataclasses setuptools && conda clean -ya | ||
$python_executable -m pip install torch==${pytorch_version}+cu${cuda_version//./} --extra-index-url https://download.pytorch.org/whl/cu${cuda_version//./} | ||
|
||
# Print version information | ||
$python_executable --version | ||
$python_executable -c "import torch; print('PyTorch:', torch.__version__)" | ||
$python_executable -c "import torch; print('CUDA:', torch.version.cuda)" | ||
$python_executable -c "from torch.utils import cpp_extension; print (cpp_extension.CUDA_HOME)" |
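
The install script above builds the PyTorch requirement and the extra index URL from the CUDA version by dropping the dot (`12.1` becomes `cu121`). Here is a small sketch of that construction, assuming the matrix values from the release workflow. The function names are hypothetical, and `tr -d '.'` is used as a portable stand-in for the bash-only `${cuda_version//./}` expansion.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the commit) showing how the pip
# requirement and index URL are derived from the version arguments.

# "12.1" -> "cu121"
cuda_tag() {
    printf 'cu%s\n' "$(printf '%s' "$1" | tr -d '.')"
}

# torch_requirement <pytorch_version> <cuda_version>
torch_requirement() {
    printf 'torch==%s+%s\n' "$1" "$(cuda_tag "$2")"
}

# torch_index_url <cuda_version>
torch_index_url() {
    printf 'https://download.pytorch.org/whl/%s\n' "$(cuda_tag "$1")"
}

torch_requirement "2.3.0" "12.1"
torch_index_url "12.1"
```

Printing the two strings before calling `pip` can help confirm that a given matrix combination resolves to the intended wheel variant.
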