add readme example & fix peer access (#13)
* add readme example & fix peer access

* use bidirectional peer access

* CI:add publish to testpypi

* CI:change trigger

---------

Co-authored-by: Your Name <[email protected]>
Co-authored-by: luzhan <[email protected]>
3 people authored Apr 16, 2024
1 parent 75e468c commit 51f5730
Showing 5 changed files with 140 additions and 5 deletions.
105 changes: 105 additions & 0 deletions .github/workflows/publish-test.yml
@@ -0,0 +1,105 @@
# This workflow will upload a Python Package to Release asset
# For more information see: https://help.github.com/en/actions/language-and-framework-guides/using-python-with-github-actions

name: Publish to Test PyPI

on:
  push:
    branches:
      - main

# Needed to create release and upload assets
permissions:
  contents: write


jobs:
  setup-version:
    runs-on: ubuntu-latest
    steps:
      - name: Generate version number
        run: |
          VERSION_HASH=$(date +"%Y%m%d%H%M%S")
          echo "Generated version hash: $VERSION_HASH"
          echo $VERSION_HASH > version.txt
      - name: Upload version number as artifact
        uses: actions/upload-artifact@v2
        with:
          name: version
          path: version.txt

  wheel:
    name: Build Wheel
    runs-on: ${{ matrix.os }}
    permissions: write-all

    strategy:
      fail-fast: false
      matrix:
        os: ['ubuntu-20.04']
        python-version: ['3.8', '3.9', '3.10', '3.11']
        cuda-version: ['11.7']

    steps:
      - name: Checkout
        uses: actions/checkout@v3

      # - name: Set up Linux Env
      #   if: ${{ runner.os == 'Linux' }}
      #   run: |
      #     bash -x .github/workflows/scripts/env.sh

      # https://github.com/orgs/community/discussions/26313
      - name: Download version value artifact
        uses: actions/download-artifact@v2
        with:
          name: version
          path: artifact

      - name: Free disk space
        run: |
          sudo rm -rf /usr/local/cuda-* /opt/cuda
          sudo rm -rf /usr/local/cuda
          bash -x .github/workflows/scripts/free-disk-space.sh
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: ${{ matrix.python-version }}

      - name: Install CUDA ${{ matrix.cuda-version }}
        run: |
          bash -x .github/workflows/scripts/cuda-install.sh ${{ matrix.cuda-version }} ${{ matrix.os }}
      - name: Build wheel
        shell: bash
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install build
          VERSION_HASH=$(cat artifact/version.txt)
          MOEINF_VERSION=0.0.1dev${VERSION_HASH} BUILD_OPS=1 python -m build --wheel
          wheel_name=$(ls dist/*whl | xargs -n 1 basename)
          asset_name=${wheel_name//"linux"/"manylinux1"}
          echo "wheel_name=${wheel_name}" >> $GITHUB_ENV
          echo "asset_name=${asset_name}" >> $GITHUB_ENV

      # only build source when the python version is 3.8
      - name: Build Source
        if: ${{ matrix.python-version == '3.8' }}
        run: |
          VERSION_HASH=$(cat artifact/version.txt)
          MOEINF_VERSION=0.0.1dev${VERSION_HASH} python -m build --sdist
      - name: Rename wheel
        run: |
          mv dist/${{ env.wheel_name }} dist/${{ env.asset_name }}
      # (Danielkinz): This last step will publish the .whl to pypi. Warning: untested
      - name: Publish package
        uses: pypa/gh-action-pypi-publish@release/v1.8
        with:
          repository-url: https://test.pypi.org/legacy/
          skip-existing: true
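
The string handling in the "Build wheel" and "Rename wheel" steps can be sketched in Python. This is only an illustration of what the shell lines compute: the helper names `dev_version` and `asset_name` and the sample wheel filename are assumptions made for this sketch and do not appear in the workflow.

```python
from datetime import datetime, timezone


def dev_version(now: datetime, base: str = "0.0.1") -> str:
    """Mirror MOEINF_VERSION=0.0.1dev$(date +"%Y%m%d%H%M%S")."""
    return f"{base}dev{now.strftime('%Y%m%d%H%M%S')}"


def asset_name(wheel_name: str) -> str:
    """Mirror the bash substitution ${wheel_name//"linux"/"manylinux1"},
    which replaces every occurrence of "linux" in the filename."""
    return wheel_name.replace("linux", "manylinux1")


# Fixed timestamp so the result is reproducible; CI would use the current time.
stamp = datetime(2024, 4, 16, 12, 30, 0, tzinfo=timezone.utc)
version = dev_version(stamp)

# Illustrative filename only; the real name is whatever `python -m build` emits.
sample_wheel = "moe_infinity-0.0.1.dev20240416123000-cp38-cp38-linux_x86_64.whl"

print(version)
print(asset_name(sample_wheel))
```

Under PEP 440 a version written as `0.0.1dev<timestamp>` normalizes to `0.0.1.dev<timestamp>`, which is why the sample filename carries the extra dot.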
2 changes: 1 addition & 1 deletion README.md
@@ -137,7 +137,7 @@ CUDA_VISIBLE_DEVICES=0,1 python script.py
We provide a simple example to run inference on a Huggingface LLM model. The script will download the model checkpoint and run inference on the specified input text. The output will be printed to the console.

```bash
-CUDA_VISIBLE_DEVICES=0 python example/interface_example.py --model_name_or_path "mistralai/Mixtral-8x7B-Instruct-v0.1" --offload_dir <your local path on SSD>
+CUDA_VISIBLE_DEVICES=0 python examples/interface_example.py --model_name_or_path "mistralai/Mixtral-8x7B-Instruct-v0.1" --offload_dir <your local path on SSD>
```

## Release Plan
12 changes: 9 additions & 3 deletions core/prefetch/archer_prefetch_handle.cpp
@@ -34,12 +34,18 @@ ArcherPrefetchHandle::ArcherPrefetchHandle(const std::string& prefix,
     ARCHER_LOG_INFO("Device count ", device_count);

     for (int i = 0; i < device_count; i++) {
-        cudaSetDevice(i);
         for (int j = 0; j < device_count; j++) {
-            if (i != j) { cudaDeviceEnablePeerAccess(j, 0); }
+            if (i != j) {
+                int can_access = 0;
+                cudaDeviceCanAccessPeer(&can_access, i, j);
+                if (can_access == 1) {
+                    cudaSetDevice(i);
+                    cudaDeviceEnablePeerAccess(j, 0);
+                }
+            }
         }
     }

     ARCHER_LOG_INFO("Enabled peer access for all devices");
 }
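
The control flow of the new loop (visit every ordered device pair, skip self-pairs, and enable access only where the capability check succeeds) can be modelled without a GPU. The sketch below is a plain-Python illustration, not the project's code: `peer_pairs` and the `can_access` predicate are hypothetical stand-ins for the loop and for `cudaDeviceCanAccessPeer`, and the three-device topology is invented.

```python
from typing import Callable, List, Tuple


def peer_pairs(device_count: int,
               can_access: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """Return the ordered (device, peer) pairs for which access would be enabled."""
    pairs = []
    for i in range(device_count):
        for j in range(device_count):
            # Skip self-pairs and any pair the capability check rejects.
            if i != j and can_access(i, j):
                pairs.append((i, j))
    return pairs


# Hypothetical topology: devices 0 and 1 are linked, device 2 is isolated.
linked = {(0, 1), (1, 0)}
print(peer_pairs(3, lambda i, j: (i, j) in linked))
```

Because both `(i, j)` and `(j, i)` are visited, access is requested in each direction separately, which matches the "bidirectional peer access" note in the commit message.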

24 changes: 24 additions & 0 deletions examples/readme_example.py
@@ -0,0 +1,24 @@
import torch
import os
from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration
from moe_infinity import MoE

user_home = os.path.expanduser('~')

checkpoint = 'TheBloke/Mixtral-8x7B-v0.1-GPTQ'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

config = {
    "offload_path": os.path.join(user_home, "moe-infinity"),
    "device_memory_ratio": 0.75, # 75% of the device memory is used for caching, change the value according to your device memory size on OOM
}

model = MoE(checkpoint, config)

input_text = "translate English to German: How old are you?"
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda:0")

output_ids = model.generate(input_ids)
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(output_text)
2 changes: 1 addition & 1 deletion setup.py
@@ -76,7 +76,7 @@ def read_readme() -> str:
 # install all files in the package, rather than just the egg
 setup(
     name='moe_infinity',
-    version='0.0.1',
+    version=os.getenv('MOEINF_VERSION', '0.0.1'),
     packages=find_packages(exclude=['op_builder', 'op_builder.*', 'moe_infinity.ops.core.*']),
     package_data={
         'moe_infinity.ops.prefetch': ['**/*.so'],
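
The fallback behaviour of the new `version=` line can be checked in isolation. In this minimal sketch the wrapper name `resolve_version` is an assumption made for illustration; only the `os.getenv('MOEINF_VERSION', '0.0.1')` call comes from the diff.

```python
import os


def resolve_version(default: str = "0.0.1") -> str:
    """Same lookup as setup.py: use MOEINF_VERSION when set, else the default."""
    return os.getenv("MOEINF_VERSION", default)


# Local build: the variable is unset, so the default is used.
os.environ.pop("MOEINF_VERSION", None)
print(resolve_version())

# CI build: the workflow exports a timestamped development version.
os.environ["MOEINF_VERSION"] = "0.0.1dev20240416123000"
print(resolve_version())
```

`os.getenv` returns the default only when the variable is unset; a variable that is set but empty yields an empty string, so a build script would need to leave `MOEINF_VERSION` undefined rather than blank to get `0.0.1`.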