Wei-Ning Hsu
Facebook AI Research (FAIR)
Verified email at csail.mit.edu
Title
Cited by
Year
HuBERT: Self-supervised speech representation learning by masked prediction of hidden units
WN Hsu, B Bolte, YHH Tsai, K Lakhotia, R Salakhutdinov, A Mohamed
IEEE/ACM Transactions on Audio, Speech, and Language Processing 29, 3451-3460, 2021
Cited by: 2633 | Year: 2021
Data2vec: A general framework for self-supervised learning in speech, vision and language
A Baevski, WN Hsu, Q Xu, A Babu, J Gu, M Auli
International Conference on Machine Learning, 1298-1312, 2022
Cited by: 818 | Year: 2022
An unsupervised autoregressive model for speech representation learning
YA Chung, WN Hsu, H Tang, J Glass
INTERSPEECH, 2019
Cited by: 454 | Year: 2019
Unsupervised learning of disentangled and interpretable representations from sequential data
WN Hsu, Y Zhang, J Glass
Thirty-first Conference on Neural Information Processing Systems (NeurIPS), 2017
Cited by: 422 | Year: 2017
On generative spoken language modeling from raw audio
K Lakhotia, E Kharitonov, WN Hsu, Y Adi, A Polyak, B Bolte, TA Nguyen, ...
Transactions of the Association for Computational Linguistics 9, 1336-1354, 2021
Cited by: 310 | Year: 2021
Hierarchical generative modeling for controllable speech synthesis
WN Hsu, Y Zhang, RJ Weiss, H Zen, Y Wu, Y Wang, Y Cao, Y Jia, Z Chen, ...
Seventh International Conference on Learning Representations (ICLR), 2019
Cited by: 305* | Year: 2019
Unsupervised speech recognition
A Baevski, WN Hsu, A Conneau, M Auli
Advances in Neural Information Processing Systems 34, 27826-27839, 2021
Cited by: 302 | Year: 2021
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
A Polyak, Y Adi, J Copet, E Kharitonov, K Lakhotia, WN Hsu, A Mohamed, ...
INTERSPEECH, 2021
Cited by: 282 | Year: 2021
Learning audio-visual speech representation by masked multimodal cluster prediction
B Shi, WN Hsu, K Lakhotia, A Mohamed
arXiv preprint arXiv:2201.02184, 2022
Cited by: 266 | Year: 2022
Robust wav2vec 2.0: Analyzing Domain Shift in Self-Supervised Pre-Training
WN Hsu, A Sriram, A Baevski, T Likhomanenko, Q Xu, V Pratap, J Kahn, ...
INTERSPEECH, 2021
Cited by: 244 | Year: 2021
Lingvo: a modular and scalable framework for sequence-to-sequence modeling
J Shen, P Nguyen, Y Wu, Z Chen, MX Chen, Y Jia, A Kannan, T Sainath, ...
arXiv preprint arXiv:1902.08295, 2019
Cited by: 209 | Year: 2019
Scaling speech technology to 1,000+ languages
V Pratap, A Tjandra, B Shi, P Tomasello, A Babu, S Kundu, A Elkahky, ...
Journal of Machine Learning Research 25 (97), 1-52, 2024
Cited by: 208 | Year: 2024
Active learning by learning
WN Hsu, HT Lin
Proceedings of the AAAI Conference on Artificial Intelligence 29 (1), 2015
Cited by: 201 | Year: 2015
Learning Latent Representations for Speech Generation and Transformation
WN Hsu, Y Zhang, J Glass
INTERSPEECH, 1273-1277, 2017
Cited by: 185 | Year: 2017
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation
WN Hsu, Y Zhang, J Glass
2017 IEEE automatic speech recognition and understanding workshop (ASRU), 16-23, 2017
Cited by: 173 | Year: 2017
Voicebox: Text-guided multilingual universal speech generation at scale
M Le, A Vyas, B Shi, B Karrer, L Sari, R Moritz, M Williamson, V Manohar, ...
Advances in neural information processing systems 36, 2024
Cited by: 168 | Year: 2024
Direct speech-to-speech translation with discrete units
A Lee, PJ Chen, C Wang, J Gu, S Popuri, X Ma, A Polyak, Y Adi, Q He, ...
arXiv preprint arXiv:2107.05604, 2021
Cited by: 150 | Year: 2021
Semi-supervised training for improving data efficiency in end-to-end speech synthesis
YA Chung, Y Wang, WN Hsu, Y Zhang, RJ Skerry-Ryan
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 141 | Year: 2019
Textless speech-to-speech translation on real data
A Lee, H Gong, PA Duquenne, H Schwenk, PJ Chen, C Wang, S Popuri, ...
arXiv preprint arXiv:2112.08352, 2021
Cited by: 129 | Year: 2021
Disentangling correlated speaker and noise for speech synthesis via data augmentation and adversarial factorization
WN Hsu, Y Zhang, RJ Weiss, YA Chung, Y Wang, Y Wu, J Glass
ICASSP 2019-2019 IEEE International Conference on Acoustics, Speech and …, 2019
Cited by: 128 | Year: 2019