Yupan Huang

Cited by

	All	Since 2019
Citations	806	806
h-index	9	9
i10-index	9	9

Citations per year (chart): 2020: 13; 2021: 49; 2022: 180; 2023: 408; 2024: 152

Public access (based on funding mandates): 5 articles available, 1 article not available

Co-authors

Bei Liu, Microsoft Research. Verified email at microsoft.com
Jianlong Fu, Microsoft Research. Verified email at microsoft.com
Furu Wei, Partner Research Manager, Microsoft Research. Verified email at microsoft.com
Lei Cui, Microsoft Research Asia. Verified email at microsoft.com
Qi Dai, Microsoft Research. Verified email at microsoft.com
Nigel Collier, Professor of Natural Language Processing, University of Cambridge. Verified email at cam.ac.uk


Yupan Huang

Sun Yat-sen University

Verified email at mail2.sysu.edu.cn

Multimodal AI, Computer Vision, Natural Language Processing


Title	Authors	Venue	Cited by	Year
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking	Y Huang, T Lv, L Cui, Y Lu, F Wei	Proceedings of the 30th ACM International Conference on Multimedia, 2022	282	2022
Seeing out of the box: End-to-end pre-training for vision-language representation learning	Z Huang, Z Zeng, Y Huang*, B Liu, D Fu, J Fu	Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2021	250	2021
Probing inter-modality: Visual parsing with self-attention for vision-and-language pre-training	H Xue, Y Huang, B Liu, H Peng, J Fu, H Li, J Luo	Advances in Neural Information Processing Systems 34, 4514-4528, 2021	79	2021
Decoupling localization and classification in single shot temporal action detection	Y Huang, Q Dai, Y Lu	2019 IEEE International Conference on Multimedia and Expo (ICME), 1288-1293, 2019	57	2019
Unifying multimodal transformer for bi-directional image and text generation	Y Huang, H Xue, B Liu, Y Lu	Proceedings of the 29th ACM International Conference on Multimedia, 1138-1147, 2021	56	2021
Reinforced short-length hashing	X Liu, X Nie, Q Dai, Y Huang, L Lian, Y Yin	IEEE Transactions on Circuits and Systems for Video Technology 31 (9), 3655-3668, 2020	21	2020
Textdiffuser: Diffusion models as text painters	J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei	Advances in Neural Information Processing Systems 36, 2024	17	2024
Kosmos-2.5: A Multimodal Literate Model	T Lv, Y Huang, J Chen, L Cui, S Ma, Y Chang, S Huang, W Wang, ...	arXiv preprint arXiv:2309.11419, 2023	16	2023
Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models	Y Huang, Z Meng, F Liu, Y Su, N Collier, Y Lu	arXiv preprint arXiv:2308.16463, 2023	12	2023
A picture is worth a thousand words: A unified system for diverse captions and rich images generation	Y Huang, B Liu, J Fu, Y Lu	Proceedings of the 29th ACM International Conference on Multimedia, 2792-2794, 2021	8	2021
Be specific, be clear: Bridging machine and human captions by scene-guided transformer	Y Huang, Z Zeng, Y Lu	Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia …, 2021	5	2021
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering	J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei	arXiv preprint arXiv:2311.16465, 2023	3	2023


Articles 1–12
