This repository surveys papers on adapter and prompting methods for speech processing.
- ICASSP 2023 Tutorial Information
- Adapters for Speech Processing
- Prompting for Speech Processing
- Reprogramming and Prompting
- Parameter Efficient Learning Methods
- Contact
- At ICASSP 2023, we will give a tutorial on parameter-efficient learning for speech processing and natural language processing. I (Kai-Wei Chang) will cover adapters and prompts for speech processing.
- Title: Parameter-Efficient Learning for Speech and Language Processing: Adapters, Prompts, and Reprogramming
- Conference: ICASSP 2023
- Website: ICASSP 2023 - Tutorials
- Parameter-Efficient Learning for Speech Processing Slides
- Pin-Yu Chen (IBM Research)
- Hung-yi Lee (National Taiwan University)
- Chao-Han Huck Yang (Georgia Institute of Technology)
- Kai-Wei Chang (National Taiwan University)
- Cheng-Han Chiang (National Taiwan University)
Title | Authors | Modality | Task | Link |
---|---|---|---|---|
Differentially Private Adapters for Parameter Efficient Acoustic Modeling | Chun-Wei Ho et al. | Speech | Keyword Spotting | Interspeech 2023 |
Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | Haoyu Tang et al. | Speech | ASR | arXiv 2023 |
A Parameter-Efficient Learning Approach to Arabic Dialect Identification with Pre-Trained General-Purpose Speech Model | Srijith Radhakrishnan et al. | Speech | Dialect Identification | Interspeech 2023 |
CHAPTER: Exploiting Convolutional Neural Network Adapters for Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | arXiv 2022 |
Parameter Efficient Transfer Learning for Various Speech Processing Tasks | Shinta Otake et al. | Speech | [Multiple] | arXiv 2022 |
Parameter-efficient transfer learning of pre-trained Transformer models for speaker verification using adapters | Junyi Peng et al. | Speech | Speaker Verification | arXiv 2022 |
Exploring Efficient-tuning Methods in Self-supervised Speech Models | Zih-Ching Chen et al. | Speech | [Multiple] | SLT 2022 |
DRAFT: A Novel Framework to Reduce Domain Shifting in Self-supervised Learning and Its Application to Children’s ASR | Ruchao Fan, Abeer Alwan | Speech | ASR | Interspeech 2022 |
Speaker adaptation for Wav2vec2 based dysarthric ASR | Murali Karthick Baskar et al. | Speech | ASR | Interspeech 2022 |
Adaptive multilingual speech recognition with pretrained models | Ngoc-Quan Pham et al. | Speech | ASR | Interspeech 2022 |
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning | Samuel Kessler et al. | Speech | ASR | ICASSP 2022 |
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition | Bethan Thomas et al. | Speech | ASR | ICASSP 2022 |
Scaling End-to-End Models for Large-Scale Multilingual ASR | Bo Li et al. | Speech | ASR | ASRU 2021 |
Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning | Wenxin Hou et al. | Speech | ASR | ICASSP 2021 |
Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition | Wenxin Hou et al. | Speech | ASR | TASLP 2021 |
Lightweight Adapter Tuning for Multilingual Speech Translation | Hang Le et al. | Speech | Speech Translation | ACL-IJCNLP 2021 |
Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech | Katrin Tomanek et al. | Speech | ASR | EMNLP 2021 |
Multilingual Speech Recognition with Self-Attention Structured Parameterization | Yun Zhu et al. | Speech | ASR | Interspeech 2020 |
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model | Anjuli Kannan et al. | Speech | ASR | Interspeech 2019 |
Title | Authors | Modality | Task | Link |
---|---|---|---|---|
Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization | Puyuan Peng et al. | Speech | [Multiple] | Interspeech 2023 |
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition | Chao-Han Huck Yang et al. | Speech | ASR | ICASSP 2023 |
SpeechPrompt v2: Prompt Tuning for Speech Classification Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | arXiv 2023 |
Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision | Eugene Kharitonov et al. | Text & Speech | TTS | arXiv 2023 |
Describing emotions with acoustic property prompts for speech emotion recognition | Hira Dhamyal et al. | Text & Speech | ER | arXiv 2022 |
PromptTTS: Controllable Text-to-Speech with Text Descriptions | Zhifang Guo et al. | Text & Speech | TTS | arXiv 2022 |
Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Classification | Hao Yen et al. | Speech | Spoken Command Recognition | arXiv 2022 |
WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models | Heting Gao et al. | Text & Speech | SLU | Interspeech 2022 |
An Exploration of Prompt Tuning on Generative Spoken Language Model for Speech Processing Tasks | Kai-Wei Chang et al. | Speech | [Multiple] | Interspeech 2022 |
For more information about reprogramming and prompting for large pre-trained models, please refer to the "awesome-neural-reprogramming-acoustic-prompting" repository. This topic was also covered in the ICASSP 2022 tutorial by Dr. Pin-Yu Chen and Dr. Huck Yang.
- GitHub Resource: awesome-neural-reprogramming-acoustic-prompting
- Tutorial Video: ICASSP 22 Tutorial, "Neural Model Reprogramming and Prompting for Speech Modeling," Huck Yang
Title | Authors | Link |
---|---|---|
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models | Elad Ben Zaken et al. | ACL 2022 |
Towards a Unified View of Parameter-Efficient Transfer Learning | Junxian He et al. | ICLR 2022 |
LoRA: Low-Rank Adaptation of Large Language Models | Edward J. Hu et al. | ICLR 2022 |
Parameter-Efficient Transfer Learning for NLP | Neil Houlsby et al. | ICML 2019 |
We thank Kuang-Chen Peng, Tzu-Han Lin, and Fabian Ritter for their invaluable contributions to the initial collection.
This repository is maintained by Kai-Wei Chang ([email protected]) and Zih-Ching Chen. Feel free to contact us or open a pull request.