Showing 98 changed files with 3,312 additions and 1,805 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
# What does this PR do? | ||
|
||
<!-- | ||
Please briefly describe your change, including what problem the change fixes, and any context | ||
necessary for understanding the change | ||
--> | ||
|
||
# What issue(s) does this change relate to? | ||
|
||
<!-- | ||
Please include any issues related to this pull request, including 'Fixes' if the issue is resolved | ||
by this pull request. | ||
Example: | ||
- Fixes #42 | ||
- Related to #1234 | ||
--> | ||
|
||
# Before submitting | ||
- [ ] Have you read the [contributor guidelines](https://github.com/databricks/megablocks/blob/dev/CONTRIBUTING.md)? | ||
- [ ] Is this change a documentation change or typo fix? If so, skip the rest of this checklist. | ||
- [ ] Was this change discussed/approved in a GitHub issue first? It is much more likely to be merged if so. | ||
- [ ] Did you update any related docs and document your change? | ||
- [ ] Did you update any related tests and add any new tests related to your change? (see [testing](https://github.com/databricks/megablocks/blob/dev/CONTRIBUTING.md#running-tests)) | ||
- [ ] Did you run the tests locally to make sure they pass? | ||
- [ ] Did you run `pre-commit` on your change? (see the `pre-commit` section of [prerequisites](https://github.com/databricks/megablocks/blob/dev/CONTRIBUTING.md#prerequisites)) | ||
|
||
<!-- | ||
Thanks so much for contributing to MegaBlocks! We really appreciate it :) | ||
--> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,41 @@ | ||
name: Code Quality Checks | ||
on: | ||
  push: | ||
    branches: | ||
    - main | ||
    - release/** | ||
  pull_request: | ||
    branches: | ||
    - main | ||
    - release/** | ||
  workflow_call: | ||
  workflow_dispatch: | ||
# Cancel old runs when a new commit is pushed to the same branch, unless the branch is main | ||
concurrency: | ||
  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }} | ||
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }} | ||
defaults: | ||
  run: | ||
    working-directory: . | ||
jobs: | ||
  code-quality: | ||
    runs-on: ubuntu-latest # TODO: switch to linux-ubuntu-latest later | ||
    timeout-minutes: 30 | ||
    strategy: | ||
      matrix: | ||
        python_version: | ||
        - "3.11" | ||
        pip_deps: | ||
        - "[dev]" | ||
    steps: | ||
    - uses: actions/checkout@v3 | ||
    - name: Get composite run steps repository | ||
      uses: actions/checkout@v3 | ||
      with: | ||
        repository: mosaicml/ci-testing | ||
        ref: v0.1.2 | ||
        path: ./ci-testing | ||
    - uses: ./ci-testing/.github/actions/code-quality | ||
      with: | ||
        python_version: ${{ matrix.python_version }} | ||
        pip_deps: ${{ matrix.pip_deps }} |
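The `concurrency` block in this workflow can be read as two small rules: runs are grouped by workflow name plus pull request number (falling back to the git ref when there is no pull request), and an in-progress run in the same group is cancelled unless the ref is `main`. A minimal Python sketch of that reading follows. The function names are hypothetical, and the code only models the two expressions, not how GitHub schedules runs.

```python
from typing import Optional


def concurrency_group(workflow: str, pr_number: Optional[int], ref: str) -> str:
    """Model `${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}`."""
    # In GitHub expressions, `||` yields the right operand when the left one is falsy,
    # which is the case for a plain push that has no pull request attached.
    key = str(pr_number) if pr_number else ref
    return f"{workflow}-{key}"


def cancel_in_progress(ref: str) -> bool:
    """Model `${{ github.ref != 'refs/heads/main' }}`."""
    return ref != "refs/heads/main"


# A pull request run is keyed by its PR number.
print(concurrency_group("Code Quality Checks", 123, "refs/pull/123/merge"))
# A push to main is keyed by the ref and is never cancelled by a newer run.
print(concurrency_group("Code Quality Checks", None, "refs/heads/main"))
print(cancel_in_progress("refs/heads/main"))
```

Under this reading, two pushes to the same pull request share a group, so the older run is cancelled, while consecutive pushes to `main` each run to completion.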
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,30 +15,32 @@ concurrency: | |
 jobs: | ||
   pytest-gpu: | ||
     name: ${{ matrix.name }} | ||
-    runs-on: ubuntu-latest # todo: switch to linux-ubuntu-latest later | ||
+    if: github.repository_owner == 'databricks' | ||
+    runs-on: ubuntu-latest # TODO: switch to linux-ubuntu-latest later | ||
     strategy: | ||
       fail-fast: false | ||
       matrix: | ||
         include: | ||
-        - name: "python3.11-pytorch2.3.1-gpus1" | ||
+        - name: "python3.11-pytorch2.4.0-gpus1" | ||
           gpu_num: 1 | ||
           python_version: 3.11 | ||
-          container: mosaicml/pytorch:2.3.1_cu121-python3.11-ubuntu20.04 | ||
-        - name: "python3.11-pytorch2.3.1-gpus2" | ||
+          container: mosaicml/pytorch:2.4.0_cu124-python3.11-ubuntu20.04 | ||
+        - name: "python3.11-pytorch2.4.0-gpus2" | ||
           gpu_num: 2 | ||
           python_version: 3.11 | ||
-          container: mosaicml/pytorch:2.3.1_cu121-python3.11-ubuntu20.04 | ||
+          container: mosaicml/pytorch:2.4.0_cu124-python3.11-ubuntu20.04 | ||
     steps: | ||
     - name: Run PR GPU tests | ||
-      uses: mosaicml/ci-testing/.github/actions/[email protected].0 | ||
+      uses: mosaicml/ci-testing/.github/actions/[email protected].2 | ||
       with: | ||
         name: ${{ matrix.name }} | ||
         container: ${{ matrix.container }} | ||
         python_version: ${{ matrix.python_version }} | ||
         gpu_num: ${{ matrix.gpu_num }} | ||
         git_repo: databricks/megablocks | ||
         pip_deps: "[all,testing]" | ||
-        pytest_command: "coverage run -m pytest tests" # todo: remove tests from pytest tests when we delete all tests outside of MegaBlocks repo | ||
+        pytest_command: "coverage run -m pytest tests" | ||
+        # TODO: remove tests from pytest tests when we delete all tests in the MegaBlocks dir | ||
         pytest_markers: "gpu" | ||
         composer_package_name: mosaicml # Required as Composer is built from source | ||
         mcloud_timeout: 3600 | ||
|
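One detail worth noting in the matrix: `python_version: 3.11` is unquoted here, while the code quality workflow quotes it as `"3.11"`. An unquoted scalar of this shape is read as a number, which happens to be harmless for 3.11 but would not be for a version such as 3.10. The sketch below uses Python's `float` as a stand-in for YAML number parsing; the helper name is hypothetical and this is an illustration of the pitfall, not of how GitHub Actions parses workflows.

```python
def number_round_trip(scalar: str) -> str:
    """Parse an unquoted scalar as a float and render it back (stand-in for YAML parsing)."""
    return str(float(scalar))


# 3.11 survives the round trip, so the current matrix entries behave as intended.
print(number_round_trip("3.11"))
# 3.10 loses its trailing zero, which would select a different Python version.
print(number_round_trip("3.10"))
```

Quoting the version, as the code quality workflow does, avoids the issue if a version with a trailing zero is ever added.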
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,118 @@ | ||
# Copyright 2024 Databricks authors | ||
# SPDX-License-Identifier: Apache-2.0 | ||
|
||
default_language_version: | ||
  python: python3 | ||
repos: | ||
# - repo: local | ||
#   hooks: | ||
#   - id: pyright | ||
#     name: pyright | ||
#     entry: pyright | ||
#     language: node | ||
#     types: [python] | ||
#     pass_filenames: false | ||
#     args: [--warnings] | ||
#     additional_dependencies: ["[email protected]"] | ||
- repo: https://github.com/astral-sh/ruff-pre-commit | ||
  rev: v0.2.2 | ||
  hooks: | ||
  - id: ruff | ||
    args: [--fix, --exit-non-zero-on-fix] | ||
- repo: https://github.com/google/yapf | ||
  rev: v0.32.0 | ||
  hooks: | ||
  - id: yapf | ||
    name: yapf | ||
    description: A formatter for Python files. | ||
    entry: yapf | ||
    args: [-i, -vv, -p] # inplace | ||
    language: python | ||
    types: [python] | ||
    additional_dependencies: | ||
    - toml | ||
- repo: https://github.com/hadialqattan/pycln | ||
  rev: v2.1.2 | ||
  hooks: | ||
  - id: pycln | ||
    args: [. --all] | ||
- repo: https://github.com/pycqa/isort | ||
  hooks: | ||
  - id: isort | ||
  rev: 5.12.0 | ||
- repo: https://github.com/pre-commit/pre-commit-hooks | ||
  rev: v4.3.0 | ||
  hooks: | ||
  - id: check-added-large-files | ||
  - id: check-ast | ||
  - id: check-builtin-literals | ||
  - id: check-case-conflict | ||
  - id: check-docstring-first | ||
  - id: check-executables-have-shebangs | ||
  - id: check-json | ||
  - id: check-shebang-scripts-are-executable | ||
  - id: pretty-format-json | ||
    args: | ||
    - --autofix | ||
    - --no-sort-keys | ||
    - --indent=4 | ||
    - --no-ensure-ascii | ||
  - id: check-merge-conflict | ||
  - id: check-symlinks | ||
  - id: check-toml | ||
  - id: check-vcs-permalinks | ||
  - id: check-xml | ||
  - id: check-yaml | ||
  - id: debug-statements | ||
  - id: destroyed-symlinks | ||
  - id: double-quote-string-fixer | ||
  - id: end-of-file-fixer | ||
  - id: fix-byte-order-marker | ||
  - id: mixed-line-ending | ||
  - id: trailing-whitespace | ||
- repo: https://github.com/Lucas-C/pre-commit-hooks | ||
  rev: v1.5.4 | ||
  hooks: | ||
  - id: insert-license | ||
    args: | ||
    - --license-filepath | ||
    - .pre-commit/FILE_HEADER | ||
    - --comment-style | ||
    - "#" | ||
    - --allow-past-years | ||
    types: [python] | ||
- repo: https://github.com/PyCQA/docformatter | ||
  rev: v1.5.0 | ||
  hooks: | ||
  - id: docformatter | ||
    args: [--in-place, --wrap-summaries=120, --wrap-descriptions=120] | ||
- repo: https://github.com/PyCQA/pydocstyle | ||
  rev: 6.1.1 | ||
  hooks: | ||
  - id: pydocstyle | ||
    name: pydocstyle | ||
    entry: pydocstyle | ||
    language: python | ||
    types: [python] | ||
    exclude: (.ci|.github) | ||
    additional_dependencies: | ||
    - toml | ||
- repo: https://github.com/adrienverge/yamllint.git | ||
  rev: v1.28.0 | ||
  hooks: | ||
  - id: yamllint | ||
    name: yamllint | ||
    description: This hook runs yamllint. | ||
    entry: yamllint | ||
    language: python | ||
    types: [file, yaml] | ||
- repo: https://github.com/trufflesecurity/trufflehog.git | ||
  rev: v3.40.0 | ||
  hooks: | ||
  - id: trufflehog | ||
    name: secret scan | ||
    exclude: tests/horrible_strings.py | ||
    entry: trufflehog filesystem ./ | ||
    args: | ||
    - --only-verified | ||
    - --fail |
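A note on the pydocstyle hook's `exclude: (.ci|.github)`: pre-commit treats `exclude` as a Python regular expression and applies it with `re.search`, so the unescaped dots match any character and the pattern can match anywhere in a path. The sketch below compares the pattern as written with a stricter variant. The stricter pattern and the sample paths are assumptions for illustration and are not part of this commit.

```python
import re

# Pattern exactly as written in the hook: "." matches any character.
LOOSE = re.compile(r"(.ci|.github)")
# Hypothetical stricter alternative: literal dots, anchored to a top-level directory.
STRICT = re.compile(r"^(\.ci|\.github)/")


def is_excluded(pattern: "re.Pattern[str]", path: str) -> bool:
    """Mirror pre-commit's behaviour of searching the pattern anywhere in the path."""
    return pattern.search(path) is not None


# Hypothetical file paths, chosen to show where the two patterns differ.
for path in [".github/workflows/pr-gpu.yaml", "megablocks/foci.py", "megablocks/layers/moe.py"]:
    print(path, is_excluded(LOOSE, path), is_excluded(STRICT, path))
```

With the loose pattern, a file such as the hypothetical `megablocks/foci.py` is skipped as well, because `oci` satisfies `.ci`. Whether that matters depends on the file names in the repository.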
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
Copyright 2024 Databricks | ||
SPDX-License-Identifier: Apache-2.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
yaml-files: | ||
- "*.yaml" | ||
- "*.yml" | ||
- .yamllint | ||
|
||
ignore: | | ||
  wandb | ||
rules: | ||
  braces: | ||
    forbid: non-empty | ||
  brackets: | ||
    forbid: false | ||
  colons: enable | ||
  commas: enable | ||
  comments: enable | ||
  comments-indentation: enable | ||
  document-end: | ||
    present: false | ||
  document-start: | ||
    present: false | ||
  empty-lines: enable | ||
  empty-values: disable | ||
  hyphens: enable | ||
  indentation: | ||
    spaces: 2 | ||
    indent-sequences: false | ||
    check-multi-line-strings: false | ||
  key-duplicates: enable | ||
  key-ordering: disable | ||
  line-length: | ||
    max: 120 | ||
    allow-non-breakable-words: true | ||
    allow-non-breakable-inline-mappings: true | ||
  new-line-at-end-of-file: enable | ||
  new-lines: enable | ||
  octal-values: enable | ||
  quoted-strings: | ||
    quote-type: double | ||
    required: false | ||
  trailing-spaces: enable | ||
  truthy: disable |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
# Contributing to MegaBlocks | ||
|
||
Thanks for considering contributing to MegaBlocks! | ||
|
||
Issues tagged with [good first issue](https://github.com/mosaicml/megablocks/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) are great options to start contributing. | ||
|
||
If you have questions, join us on [Slack](https://join.slack.com/t/mosaicml-community/shared_invite/zt-w0tiddn9-WGTlRpfjcO9J5jyrMub1dg) -- we'll be happy to help you! | ||
|
||
We welcome contributions for bug fixes, new efficient methods you'd like to contribute to the community, or new models and datasets! | ||
|
||
## Prerequisites | ||
|
||
To set up the development environment on your local machine, run the commands below. | ||
|
||
1\. Install the dependencies needed for testing and linting the code: | ||
|
||
<!--pytest.mark.skip--> | ||
```bash | ||
pip install -e '.[all]' | ||
``` | ||
|
||
2\. Configure [pre-commit](https://pre-commit.com/), which automatically formats code before | ||
each commit: | ||
|
||
<!--pytest.mark.skip--> | ||
```bash | ||
pre-commit install | ||
``` | ||
|
||
## Submitting a Contribution | ||
|
||
To submit a contribution: | ||
|
||
1\. Fork a copy of the [MegaBlocks](https://github.com/databricks/megablocks) library to your own account. | ||
|
||
2\. Clone your fork locally and add the upstream MegaBlocks repository as a remote: | ||
|
||
<!--pytest.mark.skip--> | ||
```bash | ||
git clone git@github.com:<github_id>/megablocks.git | ||
cd megablocks | ||
git remote add upstream https://github.com/databricks/megablocks.git | ||
``` | ||
|
||
3\. Create a branch and make your proposed changes. | ||
|
||
<!--pytest.mark.skip--> | ||
```bash | ||
git checkout -b cool-new-feature | ||
``` | ||
|
||
4\. When you are ready, open a pull request against the MegaBlocks repository! | ||
|
||
## Pull request (PR) guidelines | ||
|
||
We have some rough guidelines that will make your PR easier to review and more likely to be merged smoothly. Please don't let uncertainty or difficulty with any of these things stop you from opening a PR! We are happy to help you through them :) | ||
* Self-contained title and description. Please include a concise title and clear PR description. The title should allow someone to understand what the PR changes or does at a glance. The description should allow someone to understand the contents of the PR _without_ looking at the code. | ||
* If the PR affects output that is displayed to a user of MegaBlocks (e.g. console logging or experiment tracker reporting), please include screenshots showing what the new output looks like. UX is important! | ||
* Include tests. If you are fixing a bug, please add a test that would've caught the bug. If you are adding a new feature, please add unit tests that test the various components of the feature, and also a test that tests the full functionality of the feature. | ||
* Please consider whether your changes affect the example notebooks or large parts of the code base, and run the daily tests locally if so (`pytest -m 'daily and not remote and not gpu and not vision and not doctest'`) | ||
* `pre-commit` should help you handle formatting and linting, but please do make sure you have it installed as described [above](#prerequisites). | ||
|
||
## Configuring README Code Snippets | ||
|
||
MegaBlocks uses [pytest-codeblocks](https://github.com/nschloe/pytest-codeblocks) to test all example code snippets. The pytest-codeblocks repository explains how to annotate code snippets with HTML comments that map to most `pytest` marks. For example, if a snippet cannot run in a plain test environment (say, because it requires model training on a GPU), apply the skip mark (`<!--pytest.mark.skip-->`) so that it is not executed. | ||
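To make the annotation convention concrete, the sketch below scans a Markdown string and reports, for each fenced block, whether the skip comment sits directly before it. This is a simplified illustration and not pytest-codeblocks' actual implementation; the function name and the sample document are hypothetical.

```python
from typing import List, Tuple

FENCE = "`" * 3  # built indirectly so this example can itself live inside a fenced block
SKIP_MARK = "<!--pytest.mark.skip-->"


def find_code_blocks(markdown: str) -> List[Tuple[str, bool]]:
    """Return (language, skipped) for every fenced code block in `markdown`."""
    blocks: List[Tuple[str, bool]] = []
    previous = ""   # last non-empty line seen outside a fence
    inside = False  # whether the scanner is currently inside a fenced block
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith(FENCE):
            if not inside:
                # Opening fence: record its language and whether a skip mark precedes it.
                blocks.append((stripped[len(FENCE):], previous == SKIP_MARK))
            inside = not inside
            previous = ""
        elif not inside and stripped:
            previous = stripped
    return blocks


sample = "\n".join([
    SKIP_MARK,
    FENCE + "bash",
    "pre-commit install",
    FENCE,
    "",
    FENCE + "python",
    "print('hello')",
    FENCE,
])
print(find_code_blocks(sample))
```

In the sample, the `bash` block carries the skip mark and would not be executed, while the `python` block would be collected as a test.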
|
||
## Running Tests | ||
|
||
To test your changes locally, run: | ||
|
||
* `make test` # run CPU tests | ||
* `make test-gpu` # run GPU tests | ||
* `cd docs && make doctest` # run doctests | ||
|
||
Some of our checks test distributed training as well. To test these, run: | ||
|
||
* `make test-dist WORLD_SIZE=2` # run 2-cpu distributed tests | ||
* `make test-dist-gpu WORLD_SIZE=2` # run 2-gpu distributed tests | ||
|
||
These tests run with the `composer` launcher. We also support `WORLD_SIZE=1`, which would run the tests with the `composer` launcher on a single device. | ||
|
||
See the [Makefile](/Makefile) for more information. | ||
|
||
If you want to run the pre-commit hooks manually (they check code formatting, linting, and docstring style), run `pre-commit run --all-files`. | ||
|
||
### Docker | ||
|
||
To run the tests in the provided docker containers: | ||
|
||
* `docker pull mosaicml/composer` (or an alternative image like `mosaicml/composer:latest_cpu`) | ||
* `docker run --rm -v ./:/megablocks --user $(id -u):$(id -g) -it mosaicml/composer` | ||
* From inside the container: | ||
  * `cd /megablocks` | ||
  * `pip install -e .` | ||
  * `pytest <args>` or `make <args>` to run the desired tests | ||
|
||
|
||
## Code Style & Typing | ||
|
||
See the [MegaBlocks Style Guide](/STYLE_GUIDE.md) for guidelines on how to structure and format your code. | ||
|
||
MegaBlocks aims to annotate all functions with type hints (introduced in | ||
[PEP 484](https://peps.python.org/pep-0484/)). Don't worry if you are not a Python typing expert; | ||
open the pull request, and we'll help you get the code into shape. |
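As a small illustration of the annotation style this section asks for, the sketch below annotates a function's parameters and return value, plus one local variable. The helper is hypothetical and is not taken from the MegaBlocks code base.

```python
from typing import Dict, List


def count_tokens_per_expert(expert_indices: List[int], num_experts: int) -> Dict[int, int]:
    """Count how many tokens were routed to each expert (hypothetical helper)."""
    # The local variable carries an annotation too, using variable-annotation syntax.
    counts: Dict[int, int] = {expert: 0 for expert in range(num_experts)}
    for index in expert_indices:
        counts[index] += 1
    return counts


print(count_tokens_per_expert([0, 2, 2], num_experts=3))
```

Annotations like these are what a static type checker reads, so missing or imprecise ones are usually easy to adjust during review.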