v0.7.0: SDK 3.3, Whisper on 1 IPU, MT5, transformers 4.29
What's Changed
- Optimum has been updated to support Poplar SDK 3.3.
- A new feature in that SDK is the `poptorch.cond` operation, which enables conditional compute. This enabled us to implement some new optimisations.
- Using the new `cond` operation we are able to fit the Whisper-tiny encoder and decoder on a single IPU. To enable this, pass the option `use_cond_encoder` to Whisper's `parallelize` method.
- Added the option for cross-attention KV caching in Whisper, also using the `cond` op. To enable this, pass the option `use_cross_cache` to Whisper's `parallelize` method.
- We added support for the MT5 model for summarisation and translation tasks.
- The version of `transformers` has been updated to 4.29. One of the things this enables in Optimum is Whisper timestamp decoding.
- Added `optimum.graphcore.models.whisper.WhisperProcessorTorch` - a faster, drop-in replacement for `transformers.WhisperProcessor`.
- The `pod_type` argument, which was deprecated in 0.6.1, has been removed.
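To illustrate the idea behind conditional compute, here is a plain-Python analogue of a `cond`-style operation. This is only a sketch of the semantics, not the `poptorch.cond` API: the branch functions and the driving loop are hypothetical stand-ins, but they show why selecting one branch at runtime lets an encoder and a decoder share the same IPU.

```python
# Plain-Python analogue of a conditional-compute op: only the selected
# branch executes, so two mutually exclusive stages can be mapped onto
# the same hardware resources. All names here are illustrative.

def cond(pred, then_fn, else_fn, inputs):
    """Run then_fn(*inputs) when pred is true, otherwise else_fn(*inputs)."""
    return then_fn(*inputs) if pred else else_fn(*inputs)

def encode(xs):
    # stand-in for an encoder pass
    return [x * 2 for x in xs]

def decode(xs):
    # stand-in for a decoder step
    return [x + 1 for x in xs]

# A Whisper-style loop runs the encoder once, then the decoder on every
# subsequent step; the predicate picks which stage executes this step.
tokens = [1, 2, 3]
out = cond(True, encode, decode, (tokens,))   # encoder pass -> [2, 4, 6]
out = cond(False, encode, decode, (out,))     # decoder pass -> [3, 5, 7]
```

The real operation runs on-device, so the branch choice does not require a host round trip; the sketch above only mirrors the control flow.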
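The cross-attention KV caching mentioned above rests on a general observation: in an encoder-decoder model, the cross-attention keys and values depend only on the encoder output, so they can be computed once and reused at every decoding step. A minimal toy sketch (illustrative helper names, not the Optimum API):

```python
# Toy sketch of cross-attention KV caching. The keys/values are derived
# from the encoder output alone, so we project them once and reuse the
# cache for every decoder query instead of recomputing per step.

def project(states, weight):
    # stand-in for a linear projection of encoder states
    return [s * weight for s in states]

def decode_step(query, k_cache, v_cache):
    # toy "attention": weight each cached value by key * query
    scores = [k * query for k in k_cache]
    total = sum(scores) or 1.0
    return sum(s / total * v for s, v in zip(scores, v_cache))

encoder_out = [0.5, 1.0, 1.5]
k_cache = project(encoder_out, 2.0)  # computed once, before decoding
v_cache = project(encoder_out, 3.0)  # computed once, before decoding

# every decode step reuses the same cache
outputs = [decode_step(q, k_cache, v_cache) for q in (0.1, 0.2)]
```

In the release itself this reuse is implemented with the `cond` op, which lets the projection run only on the first step while later steps skip straight to the cached tensors.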
Commits
- Fixing links to API references by @jayniep-gc in #391
- Do not override replicated_tensor_sharding in the IPUConfig by @kundaMwiza in #393
- Preserve the set padding idx in SerializedEmbedding by @kundaMwiza in #395
- Add MT5 by @kundaMwiza in #392
- deberta/translation/summarization notebook fixes by @kundaMwiza in #396
- MT5 notebooks: prefix exec cache with mt5 by @kundaMwiza in #397
- Flan-T5 Notebook Formatting Tweaks by @hmellor in #398
- Add cross KV caching by @katalinic-gc in #329
- Beam search adjustment by @katalinic-gc in #394
- Updating Whisper notebook so it uses new SDK and new features by @lukem-gc in #399
- Add `padding_idx` to appropriate embedding split by @hmellor in #403
- Bump transformers to 4.29.2 by @katalinic-gc in #389
- Fix Whisper processor torch with transformers 4.29.2 bump by @katalinic-gc in #405
- Fix Stable Diffusion notebooks by @hmellor in #408
- Add IPU support for HF pipelines to Whisper by @paolot-gc in #368
- Throw error if kwargs isn't empty by end of init by @hmellor in #406
- Add Whisper pipeline tests by @katalinic-gc in #409
- Enable fine-tuning of `whisper-tiny` by @hmellor in #400
- Fix issue where exe cache dir was set too late by @hmellor in #411
- Enable generation tests by @kundaMwiza in #407
- Add Seq2Seq trainer test by @kundaMwiza in #404
- Use the generation config to control generation by @katalinic-gc in #410
- Add support for Whisper timestamp decoding with on-device generation by @katalinic-gc in #413
- Fix IPUWhisperTimeStampLogitsProcessor for beam search by @katalinic-gc in #414
- Remove usage of deprecated config: `pod_type` by @hmellor in #416
- Fix `matmul_proportion` `ManagedAttribute` usage by @hmellor in #415
- Enable Whisper encoder and decoder to run on 1 IPU by @katalinic-gc in #418
- Enable replication with on device text generation by @katalinic-gc in #420
- Update doc workflows by @regisss in #417
- Update whisper pipeline example for latest features by @katalinic-gc in #421
- Fix text encoder for SD with 4.29 bump by @katalinic-gc in #424
- Use the faster whisper feature extractor in whisper pipelines by @katalinic-gc in #423
- Remove engine references from SD pipelines by @katalinic-gc in #422
- Add support for `whisper-small` fine-tuning by @hmellor in #426
- Use index select for whisper position embedding for better tile utili… by @katalinic-gc in #435
- Print execution time of each example test by @kundaMwiza in #440
- SplitProjection layer: Add output channels serialization mode by @kundaMwiza in #438
- 3.3 Examples CI Fixes by @jimypbr in #443
- Support T5EncoderModel for t5-based embedding models by @alex-coniasse in #437
- Integrate whisper large into the existing notebook by @alex-coniasse in #441
- Bump SDK version to 3.3 in the github workflows by @jimypbr in #444
- Update examples requirements for sdk3.3 by @jimypbr in #434
Full Changelog: v0.6.1...v0.7.0