v0.7.0: SDK 3.3, Whisper on 1 IPU, MT5, transformers 4.29
What's Changed
- Optimum has been updated to support Poplar SDK 3.3.
- A new feature in that SDK is the `poptorch.cond` operation, which enables conditional compute. This enabled us to implement some new optimisations.
- Using the new `cond` operation we are able to fit the Whisper-tiny encoder and decoder on a single IPU. To enable this, pass the option `use_cond_encoder` to Whisper's `parallelize` method.
- Added the option for cross-attention KV caching in Whisper, also using the `cond` op. To enable this, pass the option `use_cross_cache` to Whisper's `parallelize` method.
- We added support for the MT5 model for summarisation and translation tasks.
- The version of `transformers` has been updated to 4.29. One of the things this enables in Optimum is Whisper timestamp decoding.
- Added `optimum.graphcore.models.whisper.WhisperProcessorTorch` - a faster, drop-in replacement for `transformers.WhisperProcessor`.
- The `pod_type` argument, which was deprecated in 0.6.1, has been removed.
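To illustrate the idea behind conditional compute, here is a plain-Python analogue of a `cond`-style operation. This is only a sketch of the semantics, not the `poptorch.cond` API: the branch functions and the driving loop are hypothetical stand-ins, but they show why selecting one branch at runtime lets an encoder and a decoder share the same IPU.

```python
# Plain-Python analogue of a conditional-compute op: only the selected
# branch executes, so two mutually exclusive stages can be mapped onto
# the same hardware resources. All names here are illustrative.

def cond(pred, then_fn, else_fn, inputs):
    """Run then_fn(*inputs) when pred is true, otherwise else_fn(*inputs)."""
    return then_fn(*inputs) if pred else else_fn(*inputs)

def encode(xs):
    # stand-in for an encoder pass
    return [x * 2 for x in xs]

def decode(xs):
    # stand-in for a decoder step
    return [x + 1 for x in xs]

# A Whisper-style loop runs the encoder once, then the decoder on every
# subsequent step; the predicate picks which stage executes this step.
tokens = [1, 2, 3]
out = cond(True, encode, decode, (tokens,))   # encoder pass -> [2, 4, 6]
out = cond(False, encode, decode, (out,))     # decoder pass -> [3, 5, 7]
```

The real operation runs on-device, so the branch choice does not require a host round trip; the sketch above only mirrors the control flow.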
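The cross-attention KV caching mentioned above rests on a general observation: in an encoder-decoder model, the cross-attention keys and values depend only on the encoder output, so they can be computed once and reused at every decoding step. A minimal toy sketch (illustrative helper names, not the Optimum API):

```python
# Toy sketch of cross-attention KV caching. The keys/values are derived
# from the encoder output alone, so we project them once and reuse the
# cache for every decoder query instead of recomputing per step.

def project(states, weight):
    # stand-in for a linear projection of encoder states
    return [s * weight for s in states]

def decode_step(query, k_cache, v_cache):
    # toy "attention": weight each cached value by key * query
    scores = [k * query for k in k_cache]
    total = sum(scores) or 1.0
    return sum(s / total * v for s, v in zip(scores, v_cache))

encoder_out = [0.5, 1.0, 1.5]
k_cache = project(encoder_out, 2.0)  # computed once, before decoding
v_cache = project(encoder_out, 3.0)  # computed once, before decoding

# every decode step reuses the same cache
outputs = [decode_step(q, k_cache, v_cache) for q in (0.1, 0.2)]
```

In the release itself this reuse is implemented with the `cond` op, which lets the projection run only on the first step while later steps skip straight to the cached tensors.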
Commits
- Fixing links to API references by @jayniep-gc in #391
- Do not override replicated_tensor_sharding in the IPUConfig by @kundaMwiza in #393
- Preserve the set padding idx in SerializedEmbedding by @kundaMwiza in #395
- Add MT5 by @kundaMwiza in #392
- deberta/translation/summarization notebook fixes by @kundaMwiza in #396
- MT5 notebooks: prefix exec cache with mt5 by @kundaMwiza in #397
- Flan-T5 Notebook Formatting Tweaks by @hmellor in #398
- Add cross KV caching by @katalinic-gc in #329
- Beam search adjustment by @katalinic-gc in #394
- Updating Whisper notebook so it uses new SDK and new features by @lukem-gc in #399
- Add `padding_idx` to appropriate embedding split by @hmellor in #403
- Bump transformers to 4.29.2 by @katalinic-gc in #389
- Fix Whisper processor torch with transformers 4.29.2 bump by @katalinic-gc in #405
- Fix Stable Diffusion notebooks by @hmellor in #408
- Add IPU support for HF pipelines to Whisper by @paolot-gc in #368
- Throw error if kwargs isn't empty by end of init by @hmellor in #406
- Add Whisper pipeline tests by @katalinic-gc in #409
- Enable fine-tuning of `whisper-tiny` by @hmellor in #400
- Fix issue where exe cache dir was set too late by @hmellor in #411
- Enable generation tests by @kundaMwiza in #407
- Add Seq2Seq trainer test by @kundaMwiza in #404
- Use the generation config to control generation by @katalinic-gc in #410
- Add support for Whisper timestamp decoding with on-device generation by @katalinic-gc in #413
- Fix IPUWhisperTimeStampLogitsProcessor for beam search by @katalinic-gc in #414
- Remove usage of deprecated config: `pod_type` by @hmellor in #416
- Fix `matmul_proportion` `ManagedAttribute` usage by @hmellor in #415
- Enable Whisper encoder and decoder to run on 1 IPU by @katalinic-gc in #418
- Enable replication with on device text generation by @katalinic-gc in #420
- Update doc workflows by @regisss in #417
- Update whisper pipeline example for latest features by @katalinic-gc in #421
- Fix text encoder for SD with 4.29 bump by @katalinic-gc in #424
- Use the faster whisper feature extractor in whisper pipelines by @katalinic-gc in #423
- Remove engine references from SD pipelines by @katalinic-gc in #422
- Add support for `whisper-small` fine-tuning by @hmellor in #426
- Use index select for whisper position embedding for better tile utili… by @katalinic-gc in #435
- Print execution time of each example test by @kundaMwiza in #440
- SplitProjection layer: Add output channels serialization mode by @kundaMwiza in #438
- 3.3 Examples CI Fixes by @jimypbr in #443
- Support T5EncoderModel for t5-based embedding models by @alex-coniasse in #437
- Integrate whisper large into the existing notebook by @alex-coniasse in #441
- Bump SDK version to 3.3 in the github workflows by @jimypbr in #444
- Update examples requirements for sdk3.3 by @jimypbr in #434
Full Changelog: v0.6.1...v0.7.0