Add fire finetuning #553

Draft
wants to merge 762 commits into
base: master
762 commits
973d868
Add specific vizier running file for optimization
Jun 13, 2024
a4bb9c8
Add working run_vizier modification of run_exp
Jun 13, 2024
c3b56c7
Convert boolean action to new format for compat
Jun 13, 2024
f832845
Remove temporary test file
Jun 15, 2024
f524534
Streamline run_vizier.py
Jun 15, 2024
a8a955d
Remove leading spaces from train.py
Jun 15, 2024
ce306f9
Add configuration json file to out_dir
Jun 15, 2024
2a14bd8
Add saving of best validation loss and iter file
Jun 15, 2024
f050079
Copy meta.pkl to out_dir
Jun 15, 2024
f13233c
Add check for meta.pkl from out_dir to sample.py
Jun 15, 2024
5fc97ef
Add fast method for obtaining best validation loss
Jun 15, 2024
fc9c93f
Supress warnings in the run_vizier
Jun 15, 2024
8b27d8e
Add comments to ckpt saving and end action list
Jun 15, 2024
8673a3a
Add --fast option for inspect checkpoints
Jun 15, 2024
ee8f4df
Add scripts compat with python-codes-25k
Jun 15, 2024
447ae47
Merge pull request #186 from klei22/add_vizier_optimization
gkielian Jun 16, 2024
6196fc9
Merge branch 'add_scripts_for_python_codes_dataset' into HEAD
Jun 16, 2024
dcfb5bc
Merge pull request #187 from klei22/add_more_datasets
gkielian Jun 16, 2024
48a36cc
Remove duplicate block
gkielian Jun 17, 2024
641401e
Merge pull request #188 from gkielian/main
klei22 Jun 17, 2024
aeca808
Add options random init mean and std to train.py
Jun 18, 2024
6277d35
Clean imports section of model.py
Jun 18, 2024
45a5b2e
Add polymorphic interface for linear variations
Jun 18, 2024
6e70517
Simplify linear wrapper to inheritance based
Jun 18, 2024
df5bd39
Merge branch 'add_kan_and_hyperparams' into origin_main
Jun 18, 2024
01aef4b
Don't set WPE if not selected
Jun 18, 2024
0e9ac28
Merge pull request #189 from klei22/add_linear_wrapper_and_kan_feedback
gkielian Jun 18, 2024
e72f663
Fix bug separating shuffle moveset with moveset
Jun 16, 2024
881d7a6
Add trial argparse arg
Jun 16, 2024
1c76156
Upgrade progress bar
Jun 16, 2024
0777f5b
Add create_datasets.sh for experiments
Jun 22, 2024
40070e8
Add latest iteration of parameter exploration
Jun 22, 2024
5516e22
Add quantized krmsnorm
Jun 23, 2024
b3ae142
Add exploration for vizier
Jun 23, 2024
0c81ccf
Add option for no quantization
Jun 23, 2024
f4ef50f
Update ints to categories
Jun 23, 2024
a081c0e
Fix configuration param names
Jun 24, 2024
78c1165
Add fix for exploration json specifying krmsnorm
Jun 24, 2024
939e7cd
Add initial nan detection to train.py
Jun 24, 2024
437dd7e
Merge pull request #191 from klei22/rubiks_cube_improvements_2
gkielian Jun 24, 2024
4b079e1
golden gen for decoder
Buck008 Jun 25, 2024
03698f7
update for golden gen
Buck008 Jun 28, 2024
d17fdb3
Add scripts compatible with the Newswire Dataset
Jul 1, 2024
083f37c
Add snac_converter.py for mp3 <-> text conversion
Jul 4, 2024
a75244d
Update prepare.py to include direct numeric tokens
Jul 5, 2024
bc9d8df
Add optional token boundary to sample.py
Jul 5, 2024
7530e41
Update sample.py to use write instead of append
Jul 5, 2024
d0b9ed5
Add option for snac processing all files in a dir
Jul 5, 2024
c6cc4fe
Add prepare.py softlink to snac folder
Jul 5, 2024
178fe66
Add example of how to create a listening sample
Jul 5, 2024
ba4dec5
Add helper files
Jul 5, 2024
468e21e
Update get_dataset.sh with tokenization
Jul 5, 2024
c551d5f
Add sampling and training scripts
Jul 5, 2024
cead197
Add progress bars to audio directory processing
Jul 5, 2024
ad7192b
Add split_mp3s.py to help reduce mp3 size for gpu
Jul 5, 2024
f5d725c
Organize imports
Jul 5, 2024
584ab7a
Add sample.sh and train.sh for easy demo of vc
Jul 5, 2024
9196ee6
Add README.md
Jul 5, 2024
3644b37
Remove stray char from script
Jul 9, 2024
99e820b
Add dependency installation instructions
Jul 9, 2024
5c44f27
Merge pull request #198 from klei22/add_snac_tokenization
gkielian Jul 9, 2024
3395f2d
Update get_dataset script to use jq
Jul 9, 2024
66a73be
Check for download directory before saving
Jul 9, 2024
c6cb631
Remove unused bash setting +x
Jul 9, 2024
2bdaa38
Merge pull request #196 from klei22/add_newswire
gkielian Jul 9, 2024
21d6ee4
Merge pull request #194 from Buck008/golden_gen
gkielian Jul 9, 2024
cfde3e9
Update train.py
gkielian Jul 9, 2024
d8a6b4b
Merge pull request #192 from klei22/add_quantized_krmsnorm
gkielian Jul 9, 2024
e9be482
Add scripts compatible wtih smollm-corpus
Jul 27, 2024
d6215a3
Substantive commit on implementing Mixture of Experts (MoE) architecture
djlisbonne Jul 29, 2024
104e3d9
Update typing extensions module
Jul 29, 2024
eea13a9
Merge pull request #204 from klei22/update_typing_extensions
klei22 Jul 29, 2024
cd761ae
Altering Block class to use MoE layer instead of basic MLP
djlisbonne Jul 29, 2024
28235b3
Small change to cmd line flags
djlisbonne Jul 29, 2024
4f713b3
More MoE flags and slight renaming
djlisbonne Jul 30, 2024
69a1ab7
Bug fix
djlisbonne Jul 30, 2024
33854fd
added kRMSNormWithRecompute module
mmoffatt2 Jul 30, 2024
6f46df9
added gpt_conf
mmoffatt2 Jul 30, 2024
7bad0c3
Comment updates
djlisbonne Jul 30, 2024
bd68e6e
Store and load support for GPTConfig
djlisbonne Jul 30, 2024
4744d67
Merge branch 'save_gptconf' into add_moe
djlisbonne Jul 30, 2024
6e9aeda
got rid of unneccesary krmsnorm module
mmoffatt2 Jul 31, 2024
b931379
moved quantize embedding code to new PR
mmoffatt2 Jul 31, 2024
6bd98a6
Adding argparse arg in train.py to load and save params to json
djlisbonne Jul 31, 2024
2736855
Merge branch 'save_gptconf' into add_moe
djlisbonne Jul 31, 2024
4c2f911
Added partial code for snac tokens
xinyixuu Jul 31, 2024
a4a19cf
got rid of unnecessary linear quant
mmoffatt2 Aug 1, 2024
94a34a9
initial feedback changes
mmoffatt2 Aug 2, 2024
23ece16
Added submodule
xinyixuu Aug 2, 2024
6661182
Update path to submodule
xinyixuu Aug 2, 2024
8c5aa0c
update minor issue on sample_whisper_snac.py
xinyixuu Aug 2, 2024
3ce65e4
Resolved merge conflict in data/snac/sample_whisper_snac.py
xinyixuu Aug 2, 2024
d41bbbb
Update .gitmodules
xinyixuu Aug 2, 2024
e81ed92
Added submodule
xinyixuu Aug 2, 2024
6b4a662
added bash script to install whisper.cpp
xinyixuu Aug 2, 2024
9929ad3
Moved moe router into dedicated variations file
djlisbonne Aug 2, 2024
db2921a
Merge pull request #208 from mmoffatt2/krmsnorm-recompute
gkielian Aug 3, 2024
dccaf76
Replace counters with flag for recomputed
gkielian Aug 3, 2024
238b0b7
Merge pull request #209 from djlisbonne/save_gptconf
gkielian Aug 3, 2024
31e032b
Merge pull request #203 from klei22/add_compat_with_smollm_corpus
gkielian Aug 3, 2024
02b7537
Fix bug with MoE layer frequency
djlisbonne Aug 4, 2024
f818f9a
add test files for debug
xinyixuu Aug 5, 2024
99fcad2
add bash script that run the whisper
xinyixuu Aug 6, 2024
71faffc
Add guards to sample_whisper_snac.py
gkielian Aug 6, 2024
f572743
Add graceful end of file handling
gkielian Aug 6, 2024
196c4f0
Add formatting for stdout
gkielian Aug 6, 2024
4ecc0cb
Add small formatting edits
gkielian Aug 6, 2024
f51c438
Add tempfix for compile error from rv64 flags
gkielian Aug 6, 2024
e3e7acc
Merge branch 'master' into quantization_embedding
gkielian Aug 6, 2024
3ac1d49
Merge pull request #210 from mmoffatt2/quantization_embedding
gkielian Aug 6, 2024
61b478a
Merge pull request #215 from gkielian/add_krmsnorm_recompute_flag
klei22 Aug 6, 2024
f757848
Merge pull request #1 from gkielian/add_snac_tokens_patch
xinyixuu Aug 6, 2024
bd9fc88
Merge pull request #2 from xinyixuu/master
xinyixuu Aug 6, 2024
8e847ae
updated to add snac patch
xinyixuu Aug 6, 2024
6cca569
Delete data/snac/.tmux.conf
gkielian Aug 6, 2024
fcb8735
Merge pull request #212 from xinyixuu/add_snac_tokens
gkielian Aug 6, 2024
4c63f46
Remove temp wav file
gkielian Aug 6, 2024
dc6cfb9
Add .wav and .mp3 files to .gitignore
gkielian Aug 6, 2024
9ce521e
Move whisper_snac.sh to snac dir and adjust paths
gkielian Aug 7, 2024
7ed18f9
Moved MoE layer construction to create_shared_param_group to save ove…
djlisbonne Aug 7, 2024
7d238c3
Removed print statements
djlisbonne Aug 7, 2024
5191acd
Merge pull request #205 from djlisbonne/add_moe
gkielian Aug 7, 2024
424d360
moved activation quantization to new PR
mmoffatt2 Aug 5, 2024
9fdd822
undo gpt_conf changes
mmoffatt2 Aug 5, 2024
80e4fee
updated act names
mmoffatt2 Aug 8, 2024
669377c
moved quantized linears to new PR
mmoffatt2 Aug 1, 2024
00b0f2e
got rid of prints
mmoffatt2 Aug 1, 2024
ad2dbf3
initial PR feedback
mmoffatt2 Aug 2, 2024
3623baa
removed warmup_iters changes
mmoffatt2 Aug 2, 2024
089cfa1
added quantization_warmup_iters
mmoffatt2 Aug 2, 2024
16032aa
added quantization_dict
mmoffatt2 Aug 7, 2024
82479c1
added new quantization to embedding
mmoffatt2 Aug 8, 2024
eb0eac4
added new quantization to embedding
mmoffatt2 Aug 8, 2024
10526d5
Merge pull request #218 from gkielian/add_snac_patches_2
klei22 Aug 8, 2024
c7ebec2
Add Model Parameter Section
gkielian Aug 9, 2024
a008fa7
Trim trailing spaces
gkielian Aug 9, 2024
a886556
Add torchinfo to requirements_cpu
gkielian Aug 9, 2024
5f8c620
Add torchinfo installation to README.md
gkielian Aug 9, 2024
9ce06df
changed act inouts to input
mmoffatt2 Aug 9, 2024
cfd8bdf
added linear_variants and quant_methods lists
mmoffatt2 Aug 9, 2024
2ef1c57
Fix typo in logging group
gkielian Aug 9, 2024
b8e43e0
Merge pull request #214 from mmoffatt2/quantization_linear
gkielian Aug 9, 2024
ee5b012
Merge branch 'master' into quantization_activations
gkielian Aug 9, 2024
90f1f91
Update train.py
gkielian Aug 9, 2024
0906b16
Update model.py
gkielian Aug 9, 2024
9e1206a
Merge pull request #216 from mmoffatt2/quantization_activations
gkielian Aug 9, 2024
c12522e
Merge pull request #223 from gkielian/add_model_param_section
klei22 Aug 9, 2024
6cafe38
Add implementation of Rotary Embeddings
Aug 11, 2024
556d99c
Add Rope Length option to Symmetrical Rope
Aug 12, 2024
8c29fef
Add Rope length to Standard Rope
Aug 12, 2024
e99d875
combined linear and activation PRs
mmoffatt2 Aug 8, 2024
151f57c
Add polling to inspect_ckpts.py
Aug 12, 2024
3ed497a
added functionality to save quantized weights/activations
mmoffatt2 Aug 9, 2024
05e86bc
quantized linear/activation feedback
mmoffatt2 Aug 9, 2024
dcd75bc
Got rid of merge mistakes
mmoffatt2 Aug 12, 2024
68249a5
Removing inspect print statement when polling
Aug 12, 2024
7f6ea79
Update rope_sweep with random seed
Aug 12, 2024
ca0dd1f
modified whisper_snac.sh file to run the whole process to get the re…
xinyixuu Aug 12, 2024
a6d4667
Adjust snac whisper script names and paths
gkielian Aug 12, 2024
e890fe0
Merge pull request #3 from gkielian/add_snac_tokens_over_dir
xinyixuu Aug 14, 2024
61b7876
Adjust try catch block and formatting
gkielian Aug 14, 2024
ebca13f
Merge pull request #4 from gkielian/add_snac_tokens_over_dir
xinyixuu Aug 14, 2024
d122ad5
Merge pull request #225 from klei22/add_rope_embeddings
gkielian Aug 14, 2024
a01bb72
added update_activations bool
mmoffatt2 Aug 14, 2024
76516f7
Add ReLUMax variation and sweep
gkielian Aug 14, 2024
90b05a7
refactored fake_quantize_act code
mmoffatt2 Aug 14, 2024
293cc60
fixed bug in ordering of fake_quantize_act
mmoffatt2 Aug 15, 2024
7a0b04f
added more granular qk and pv input act quantization
mmoffatt2 Aug 15, 2024
8706d6c
moved activation buffer creation to a function
mmoffatt2 Aug 15, 2024
bbc09a5
moved all train.py statistics functions to new folder
mmoffatt2 Aug 15, 2024
b957e7e
fixed plot_statistics argument
mmoffatt2 Aug 15, 2024
b08ffcd
moved if statement inside create_statistics
mmoffatt2 Aug 15, 2024
52a15d1
Merge pull request #230 from mmoffatt2/statistics_util
gkielian Aug 16, 2024
eac0c67
Merge pull request #224 from mmoffatt2/quantization_save_weights
gkielian Aug 16, 2024
c1db268
Merge pull request #227 from xinyixuu/add_snac_tokens
gkielian Aug 17, 2024
ee5ad20
Merge pull request #229 from mmoffatt2/matmul_quantization_granularity
gkielian Aug 17, 2024
9beea4b
Merge branch 'master' into add_relumax
klei22 Aug 17, 2024
d3de6f3
Merge pull request #231 from gkielian/add_relumax
klei22 Aug 17, 2024
2483864
Add model printing before training run
gkielian Aug 17, 2024
24ff05e
Adjust plotting to be optional
gkielian Aug 17, 2024
7d2e241
Add logging and plotting options to gptconf
gkielian Aug 17, 2024
3f61101
Trim ending spaces in train.py and gptconf.py
gkielian Aug 17, 2024
0c5be9f
Add ConSmaxV2 and add conditional io logging
gkielian Aug 17, 2024
d03d988
Add formatted model printing functions
gkielian Aug 17, 2024
74b4880
Merge branch 'master' of https://github.com/chipGPT/nanoGPT
gkielian Aug 17, 2024
7a57222
Trim settings for ConSvaxV2 to only defaults
gkielian Aug 19, 2024
54badaa
Support for importing & translating models from HF
djlisbonne Aug 19, 2024
e9e41bc
Cleanup
djlisbonne Aug 20, 2024
3464d3b
Temp remove of resume_gpt_model
gkielian Aug 20, 2024
bfb268c
Merge pull request #233 from djlisbonne/hf_from_pretrained_fix
gkielian Aug 20, 2024
1d4f8a3
Add printing of model summary to sample.py
gkielian Aug 20, 2024
e484371
Add abbreviations for config and output files
gkielian Aug 20, 2024
265ce88
Add gating args for logging statistics
gkielian Aug 20, 2024
26076a8
Add sweep for ConSmaxV2
gkielian Aug 20, 2024
9a16913
Add printout for model param names
gkielian Aug 20, 2024
dd0a852
Add means for logging per head via tensorboard/csv
gkielian Aug 21, 2024
6925207
Add softmax overflow recompute test
gkielian Aug 21, 2024
9e76d1a
added custom_gpt code
mmoffatt2 Aug 21, 2024
b903c04
Initial commit
djlisbonne Aug 21, 2024
273d9cc
simplifying train.py init_from branching and added support for overri…
djlisbonne Aug 21, 2024
c42bffc
Require xmax_guess set if overflow recompute
gkielian Aug 21, 2024
541d31d
Make overflow recompute false by default
gkielian Aug 21, 2024
fd97d3c
changed softmax to softplus
mmoffatt2 Aug 21, 2024
4b4f415
Add initial code emulating hardware
gkielian Aug 21, 2024
2fd45cf
Add latest train and pickel for testing
gkielian Aug 21, 2024
2671c15
Update kv_group to default as none
gkielian Aug 21, 2024
f4d3b29
Merge branch 'master' of https://github.com/chipGPT/nanoGPT
gkielian Aug 21, 2024
b073506
removed test_train.py
mmoffatt2 Aug 22, 2024
df9d975
Merge pull request #237 from mmoffatt2/huggingface_model
gkielian Aug 22, 2024
355149c
added file for uploading to huggingfacehub
mmoffatt2 Aug 22, 2024
4206edf
moved sample to its own file
mmoffatt2 Aug 22, 2024
0c94f78
Merge branch 'ReaLLMASIC:master' into huggingface_model
mmoffatt2 Aug 22, 2024
b9e20db
Merge pull request #238 from mmoffatt2/huggingface_model
gkielian Aug 22, 2024
a5e7592
update to sample.py to properly load pretrained GPT2 model
djlisbonne Aug 22, 2024
ae45d40
Add option to get sample inference after each val
gkielian Aug 22, 2024
02eb15c
Removed exits
djlisbonne Aug 23, 2024
e97b2fd
Adjust imports for inference compatibility
gkielian Aug 23, 2024
4f65cbb
Add start tokens option
gkielian Aug 23, 2024
c4be2b1
Add test script and README for goldenbrick
gkielian Aug 23, 2024
7006be5
Add note and modification for weight export
gkielian Aug 23, 2024
3eaaa69
Update to state_dict translation to correctly assign q,k,v matrices f…
djlisbonne Aug 23, 2024
eb8be6a
Merge pull request #240 from djlisbonne/gptconfig_fix
gkielian Aug 23, 2024
6eaa5f4
Merge branch 'master' into add_numpy_hw_test
klei22 Aug 24, 2024
4dc821a
Remove duplicate save file
klei22 Aug 24, 2024
55864d5
Restore statistic_plots.py
klei22 Aug 24, 2024
a65c5c3
Merge pull request #236 from gkielian/add_numpy_hw_test
klei22 Aug 24, 2024
57ff63e
Merge branch 'master' into add_training_sample_option
klei22 Aug 24, 2024
6c602e9
Merge pull request #241 from gkielian/add_training_sample_option
klei22 Aug 24, 2024
937ffae
Add progress bar to train.py
gkielian Aug 25, 2024
57b68f1
Merge pull request #243 from gkielian/add_progress_bar
klei22 Aug 25, 2024
6645172
Move notebooks to colab folder
Aug 25, 2024
c4ce6ed
Remove data_augmentation folder
Aug 25, 2024
474b73b
Remove data augmentation in favor of HF apis
Aug 25, 2024
405a276
Add original nanoGPT as module instead of hardcopy
Aug 25, 2024
8c8eb9a
Add llm.c as a submodule
Aug 25, 2024
be8d426
Clean images no longer used in README
Aug 25, 2024
fbf5988
Merge pull request #244 from klei22/organize_folders
gkielian Aug 25, 2024
5423961
Add softmax sweep to benchmark softmaxes v context
Aug 25, 2024
da5cc53
v2: manually adding +1 in log_rel & log_pos.
Mars-Cat2023 Aug 25, 2024
515bb90
v3: Adding one argument: –fire_log_bias
Mars-Cat2023 Aug 25, 2024
cddf59d
Add option to just do forward, for testing inference
Aug 25, 2024
8ec274b
v4: Adding 5 new arguments: –-fire_num_hidden_layers, mlp_width, init…
Mars-Cat2023 Aug 25, 2024
61833cd
Update train.py
Mars-Cat2023 Aug 26, 2024
7a414a7
Merge pull request #246 from Mars-Cat2023/FIRE
gkielian Aug 26, 2024
f4c0781
Merge pull request #245 from klei22/add_softmax_context_benchmark
gkielian Aug 26, 2024
cbf3d95
Fixed One Bug in FIRE - PR #246 v4
Mars-Cat2023 Aug 29, 2024
981c8dd
Add MLP Expansion factor control and sweep
gkielian Aug 31, 2024
863c54d
Merge pull request #251 from Mars-Cat2023/FIRE
gkielian Sep 2, 2024
37ca368
Merge pull request #252 from gkielian/add_mlp_expansion_factor
klei22 Sep 3, 2024
5a7528b
Add code for finetuning with FIRE
gkielian Sep 6, 2024
35 changes: 35 additions & 0 deletions .github/workflows/cpu-basic-install-prepare-train-inf-test.yml
@@ -0,0 +1,35 @@
name: Basic Pytorch Installation, Data Prep, CPU Training, CPU Inference
on: [push, pull_request]
jobs:
  Install-Dependencies_Data-Prep_CPU-Training_CPU-Inference:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "${{ github.repository }} repository has been cloned to the runner."
      - run: echo "Currently on ${{ github.ref }} branch"
      - name: ls of directory
        run: |
          ls ${{ github.workspace }}
      # Caching pip dependencies
      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements_cpu.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install CPU Dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
          python3 -m pip install numpy transformers datasets tiktoken wandb tqdm tensorboard
          python3 -m pip install -r requirements_cpu.txt
      - name: Run Small Network on CPU
        run: |
          python3 data/shakespeare_char/prepare.py
          python3 train.py --out_dir=out --device=cpu --eval_interval=2 --log_interval=1 --block_size=2 --batch_size=2 --n_layer=2 --n_head=2 --n_kv_group=2 --n_embd=16 --max_iters=3 --lr_decay_iters=2 --dropout=0.0
      - name: Run CPU Inference
        run: |
          python3 sample.py --device=cpu --out_dir="out"

33 changes: 33 additions & 0 deletions .github/workflows/cpu-test-all-activation.yml
@@ -0,0 +1,33 @@
name: Install Then Test All activations
on: [push, pull_request]
jobs:
  Install-And-Test-Activations:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "${{ github.repository }} repository has been cloned to the runner."
      - run: echo "Currently on ${{ github.ref }} branch"
      - name: ls of directory
        run: |
          ls ${{ github.workspace }}
      # Caching pip dependencies
      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements_cpu.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install CPU Dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
          python3 -m pip install numpy transformers datasets tiktoken wandb tqdm tensorboard
          python3 -m pip install -r requirements_cpu.txt
      - name: Test all activation variations
        run: |
          python3 data/shakespeare_char/prepare.py
          cd tests
          source test_all_activation_variations_cpu.sh

33 changes: 33 additions & 0 deletions .github/workflows/cpu-test-all-softmax.yml
@@ -0,0 +1,33 @@
name: Install Then Test All Softmaxes
on: [push, pull_request]
jobs:
  Install-And-Test-Softmax:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "${{ github.repository }} repository has been cloned to the runner."
      - run: echo "Currently on ${{ github.ref }} branch"
      - name: ls of directory
        run: |
          ls ${{ github.workspace }}
      # Caching pip dependencies
      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements_cpu.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install CPU Dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
          python3 -m pip install numpy transformers datasets tiktoken wandb tqdm tensorboard
          python3 -m pip install -r requirements_cpu.txt
      - name: Test all softmax variations
        run: |
          python3 data/shakespeare_char/prepare.py
          cd tests
          source test_all_softmax_variations_cpu.sh

33 changes: 33 additions & 0 deletions .github/workflows/cpu-test-gqa.yml
@@ -0,0 +1,33 @@
name: Install Then Test GQA Variations
on: [push, pull_request]
jobs:
  Install-And-Test-GQA:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "${{ github.repository }} repository has been cloned to the runner."
      - run: echo "Currently on ${{ github.ref }} branch"
      - name: ls of directory
        run: |
          ls ${{ github.workspace }}
      # Caching pip dependencies
      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements_cpu.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install CPU Dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
          python3 -m pip install numpy transformers datasets tiktoken wandb tqdm tensorboard
          python3 -m pip install -r requirements_cpu.txt
      - name: Test GQA variations
        run: |
          python3 data/shakespeare_char/prepare.py
          cd tests
          source test_gqa_variations_cpu.sh

32 changes: 32 additions & 0 deletions .github/workflows/cpu-test-run-exp.yml
@@ -0,0 +1,32 @@
name: Install Then Test Run Experiments script
on: [push, pull_request]
jobs:
  Install-And-Test-Run-Exp:
    runs-on: ubuntu-latest
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4
      - run: echo "${{ github.repository }} repository has been cloned to the runner."
      - run: echo "Currently on ${{ github.ref }} branch"
      - name: ls of directory
        run: |
          ls ${{ github.workspace }}
      # Caching pip dependencies
      - name: Cache pip dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements_cpu.txt') }}
          restore-keys: |
            ${{ runner.os }}-pip-
      - name: Install CPU Dependencies
        run: |
          python3 -m pip install --upgrade pip
          python3 -m pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
          python3 -m pip install numpy transformers datasets tiktoken wandb tqdm tensorboard
          python3 -m pip install -r requirements_cpu.txt
      - name: Test run_experiments script
        run: |
          python3 data/shakespeare_char/prepare.py
          python3 run_experiments.py --config explorations/config_cpu.json

25 changes: 17 additions & 8 deletions .gitignore
@@ -1,10 +1,19 @@
.DS_Store
.idea
.ipynb_checkpoints/
.vscode
# folders
__pycache__/
*.bin
logs/
csv_logs/

# file extensions
*.pkl
*.pt
*.pyc
input.txt
*.bin
*.txt

# audio file extensions
*.wav
*.mp3

# checkpoint directories
out*/
.aider*

venv/*
9 changes: 9 additions & 0 deletions .gitmodules
@@ -0,0 +1,9 @@
[submodule "data/template/whisper.cpp"]
	path = data/template/whisper.cpp
	url = https://github.com/ggerganov/whisper.cpp.git
[submodule "modules/nanoGPT"]
	path = modules/nanoGPT
	url = https://github.com/karpathy/nanoGPT
[submodule "modules/llm.c"]
	path = modules/llm.c
	url = https://github.com/karpathy/llm.c.git
116 changes: 116 additions & 0 deletions Contributing_Features.md
@@ -0,0 +1,116 @@
# How to add new features

This is a guide for adding a new feature to the search space.

# TOC

* [Step 1 Add new variation](#step-1-add-new-variation)
* [Step 2 Adjust model.py](#step-2-adjust-modelpy)
* [Step 3 Add a config within the model.py](#step-3-add-a-config-within-the-modelpy)
* [Step 4 Add an argparse argument for the train.py](#step-4-add-an-argparse-argument-for-the-trainpy)
* [Step 5 Create configuration json in exploration folder](#step-5-create-configuration-json-in-exploration-folder)
* [Other Parameter Groups](#other-parameter-groups)
* [Ideas](#ideas)

## Step 1 Add new variation


If the variation falls into one of the following categories, add it to the
appropriate file, or create a new file in the variations folder:

```
variations/
├── activation_variations.py
├── normalization_variations.py
├── position_encoding_variations.py
└── softmax_variations.py
```
Some changes, such as reordering parts of the network, may need to be made
directly in the `model.py` file.

## Step 2 Adjust model.py

Import the new variation:
```
from variations.softmax_variations import YourSoftmaxVariation
```

Then add it to the appropriate section of `model.py`:
```
if self.softmax_variant_attn == "yournewvariation":
    self.softmax_layer = YourSoftmaxVariation(config)
```
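
As more variants are added, the `if` chain above can grow long. One possible alternative, sketched here with plain-Python stand-ins rather than real `nn.Module` classes, is a dictionary lookup keyed on the config string. The names `SOFTMAX_VARIANTS`, `build_softmax`, and `DummyConfig` are hypothetical and are not part of the repository:

```python
from dataclasses import dataclass


@dataclass
class DummyConfig:
    # Hypothetical stand-in for GPTConfig; only the field used here.
    softmax_variant_attn: str = "yournewvariation"


class YourSoftmaxVariation:
    # Stand-in for a class defined in variations/softmax_variations.py.
    def __init__(self, config):
        self.config = config


# Map each config string to the class that implements it.
SOFTMAX_VARIANTS = {
    "yournewvariation": YourSoftmaxVariation,
}


def build_softmax(config):
    """Look up and instantiate the variant named in the config."""
    try:
        variant_cls = SOFTMAX_VARIANTS[config.softmax_variant_attn]
    except KeyError:
        raise ValueError(f"unknown softmax variant: {config.softmax_variant_attn!r}")
    return variant_cls(config)


layer = build_softmax(DummyConfig())
print(type(layer).__name__)
```

With this layout, adding a variant means adding one entry to the dictionary, and an unrecognized name fails with an explicit error instead of silently leaving the layer unset.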

## Step 3 Add a config within the model.py

Open up `model.py` and add your new configuration within the `GPTConfig`
dataclass:

```python
@dataclass
class GPTConfig:
    block_size: int = 1024
    vocab_size: int = 50304 # GPT-2 vocab_size of 50257, padded up to nearest m
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    dropout: float = 0.0

    # Your New Setting
    new_variation_setting: bool = True
```
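
To see how the new field behaves, here is a minimal sketch using a trimmed-down copy of the dataclass. The `model_args` dictionary is an assumed stand-in for the values that `train.py` collects from the command line:

```python
from dataclasses import asdict, dataclass


@dataclass
class GPTConfig:
    # Trimmed-down copy for illustration; the real class has more fields.
    block_size: int = 1024
    n_layer: int = 12
    # Your New Setting
    new_variation_setting: bool = True


# The default applies when the setting is not passed in.
default_config = GPTConfig()
print(default_config.new_variation_setting)

# Assumed stand-in for the arguments collected by train.py.
model_args = {"block_size": 256, "new_variation_setting": False}
config = GPTConfig(**model_args)
print(asdict(config))
```

Giving the field a default keeps existing call sites that do not know about the new setting working unchanged.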

## Step 4 Add an argparse argument for the train.py


Open up `train.py` and add your new feature to the model group inside the `parse_args` function,
using the form that matches its type:

For boolean values:
```python
model_group.add_argument('--use_faster_inference', default=True, action=argparse.BooleanOptionalAction)
```

For string values (e.g. for selection between several types of a module):
```python
model_group.add_argument("--softmax_variant", type=str, default="softermax", choices=["constantmax", "polymax", "strongermax", "softermax", "sigsoftmax", "sigsoftmax_base2"])
```

For numeric values:
```python
model_group.add_argument("--block_size", type=int, default=256)
```
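
The three forms can be tried together in a small standalone parser. This is a sketch only: the parser is built from scratch here rather than taken from `train.py`, and the list of choices is shortened.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    model_group = parser.add_argument_group("model_group")
    # Boolean flag: accepts --use_faster_inference and --no-use_faster_inference.
    model_group.add_argument("--use_faster_inference", default=True,
                             action=argparse.BooleanOptionalAction)
    # String flag restricted to a fixed set of choices.
    model_group.add_argument("--softmax_variant", type=str, default="softermax",
                             choices=["constantmax", "polymax", "softermax"])
    # Numeric flag.
    model_group.add_argument("--block_size", type=int, default=256)
    return parser.parse_args(argv)


args = parse_args(["--no-use_faster_inference", "--softmax_variant", "polymax"])
print(args.use_faster_inference, args.softmax_variant, args.block_size)
```

Note that `argparse.BooleanOptionalAction` requires Python 3.9 or newer, and that a value outside `choices` makes the parser exit with an error message.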

## Step 5 Create configuration json in exploration folder

`cd` into the `explorations` folder and copy a template for a new exploration sweep.

Run the sweep with `run_experiments.py` from the repo root, specifying your
config file.

```
python3 run_experiments.py --config explorations/config.json --output_dir out_test
```

This will automatically timestamp and label your TensorBoard logs,
create direct CSV logs, and save output checkpoints into the specified folder.
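
Conceptually, a sweep expands lists of candidate values into one training run per combination. The sketch below illustrates that idea only: the JSON layout and the `expand_sweep` and `to_command` helpers are assumptions for illustration, not the actual format read by `run_experiments.py`, so check an existing template in the folder before writing your own.

```python
import itertools
import json

# Hypothetical sweep description: each key maps to a list of values to try.
sweep_json = """
{
  "n_layer": [2, 4],
  "n_head": [2, 4]
}
"""


def expand_sweep(sweep):
    """Yield one dict of settings per combination of the listed values."""
    keys = list(sweep)
    for values in itertools.product(*(sweep[k] for k in keys)):
        yield dict(zip(keys, values))


def to_command(settings):
    """Format one combination as a train.py command line."""
    flags = [f"--{name}={value}" for name, value in settings.items()]
    return ["python3", "train.py", *flags]


runs = list(expand_sweep(json.loads(sweep_json)))
for settings in runs:
    print(" ".join(to_command(settings)))
```

Because the number of runs is the product of the list lengths, it is worth keeping each list short when exploring several settings at once.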

## Other Parameter Groups

`train.py` is parameterized with argparse into three groups:

1. `model_group` - these are automatically added to a config and sent to model.py
2. `training_group` - only used by train.py
3. `logging_group` - also only used by train.py (specifically for logging)

Anything added to the model group is sent to `model.py` automatically, which
makes adding a new feature really just a two-step process.
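
A minimal sketch of that flow follows. It records which arguments were registered on the model group and forwards only those into the config. The `model_dests` list and the trimmed `GPTConfig` are assumptions for illustration; `train.py` may collect the group's arguments differently.

```python
import argparse
from dataclasses import dataclass


@dataclass
class GPTConfig:
    # Trimmed-down stand-in for the dataclass in model.py.
    block_size: int = 1024
    new_variation_setting: bool = True


parser = argparse.ArgumentParser()
model_group = parser.add_argument_group("model_group")
training_group = parser.add_argument_group("training_group")

# add_argument returns the Action, whose .dest is the attribute name.
model_dests = [
    model_group.add_argument("--block_size", type=int, default=256).dest,
    model_group.add_argument("--new_variation_setting", default=True,
                             action=argparse.BooleanOptionalAction).dest,
]
training_group.add_argument("--max_iters", type=int, default=3)

args = parser.parse_args(["--block_size", "64", "--max_iters", "10"])

# Only model-group values reach the config; max_iters stays in train.py.
model_args = {name: getattr(args, name) for name in model_dests}
config = GPTConfig(**model_args)
print(config)
```

Keeping the training and logging arguments out of `model_args` means the config dataclass only needs fields for settings the model itself reads.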

## Ideas

In addition to scanning perplexity results for different settings:

- Reinforcement Loops - adding gymnasium to optimize parameters
- Training Loop - Generating output with sample.py, augmenting it, then feeding it back as training data.
- Monitoring of hyperparameters - e.g. gamma and beta values for constantmax

30 changes: 30 additions & 0 deletions HW/SA/Makefile
@@ -0,0 +1,30 @@
TESTBENCH = ../SA/tb/tb.sv
SIM_FILES = ../SA/define.vh ../SA/verilog/fadd.sv ../SA/verilog/fmul.sv ../SA/verilog/PE.sv ../SA/verilog/SA.sv

VV = vcs
VVOPTS = +v2k +vc -sverilog -timescale=1ns/1ps +vcs+lic+wait +multisource_int_delays +lint=TFIPC-L \
         +neg_tchk +incdir+$(VERIF) +plusarg_save +overlap +warn=noSDFCOM_UHICD,noSDFCOM_IWSBA,noSDFCOM_IANE,noSDFCOM_PONF -full64 -cc gcc +libext+.v+.vlib+.vh

ifdef WAVES
VVOPTS += +define+DUMP_VCD=1 +memcbk +vcs+dumparrays +sdfverbose
endif

ifdef GUI
VVOPTS += -gui
endif

all: clean sim

clean:
	rm -f ucli.key
	rm -f sim
	rm -f sim_synth
	rm -fr sim.daidir
	rm -rf *.log
	rm -fr csrc

sim: clean
	$(VV) -o $@ $(VVOPTS) -debug_access+all $(SIM_FILES) $(TESTBENCH) -kdb -R -gui | tee sim_result.txt

dve: $(SIM_FILES) $(TESTBENCH)
	$(VV) $(VVOPTS) -lncurses $^ -debug_access+all -kdb -o $@ -R -gui
16 changes: 16 additions & 0 deletions HW/SA/define.vh
@@ -0,0 +1,16 @@
`ifndef _DEFINE_SVH_
`define _DEFINE_SVH_
package DEFINE_PKG;

`define DIMENSION 4
`define M_W 23
`define EXP_W 8
`define BIT_W 32
`define MULT_W (`M_W+`M_W+2)
`define EXP_MAX (2**(`EXP_W-1)+2**(`EXP_W)-3)

`define N_TESTS 100000

endpackage

`endif
22 changes: 22 additions & 0 deletions HW/SA/tb/TestCasesGeneratorMultiplication.py
@@ -0,0 +1,22 @@
import bitstring
import random

span = 10000000
iteration = 100000

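def ieee754(flt):
    # Pack a Python float into a 32-bit IEEE 754 single-precision bit array.
    b = bitstring.BitArray(float=flt, length=32)
    return b

with open("TestVectorMultiply", "w") as f:

    for i in range(iteration):
        a = ieee754(random.uniform(-span, span))
        b = ieee754(random.uniform(-span, span))
        # Multiply the rounded single-precision operands, then round the product.
        ab = ieee754(a.float * b.float)

        # One test vector per line: <a>_<b>_<a*b>, each as hex.
        f.write(a.hex +"_" + b.hex + "_" + ab.hex + "\n")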



##############END OF PROGRAM###########################################################