Tengchao Lv (吕腾超)
Microsoft Research Asia
Verified email at microsoft.com
Title
Cited by
Year
LayoutLMv2: Multi-modal pre-training for visually-rich document understanding
Y Xu, Y Xu, T Lv, L Cui, F Wei, G Wang, Y Lu, D Florencio, C Zhang, ...
arXiv preprint arXiv:2012.14740, 2020
Cited by 528 · 2020
Language is not all you need: Aligning perception with language models
S Huang, L Dong, W Wang, Y Hao, S Singhal, S Ma, T Lv, L Cui, ...
Advances in Neural Information Processing Systems 36, 72096-72109, 2023
Cited by 437 · 2023
LayoutLMv3: Pre-training for Document AI with unified text and image masking
Y Huang, T Lv, L Cui, Y Lu, F Wei
Proceedings of the 30th ACM International Conference on Multimedia, 4083-4091, 2022
Cited by 433 · 2022
TrOCR: Transformer-based optical character recognition with pre-trained models
M Li, T Lv, J Chen, L Cui, Y Lu, D Florencio, C Zhang, Z Li, F Wei
Proceedings of the AAAI Conference on Artificial Intelligence 37 (11), 13094 …, 2023
Cited by 401 · 2023
DiT: Self-supervised pre-training for document image transformer
J Li, Y Xu, T Lv, L Cui, C Zhang, F Wei
Proceedings of the 30th ACM International Conference on Multimedia, 3530-3539, 2022
Cited by 157 · 2022
Hierarchical attention prototypical networks for few-shot text classification
S Sun, Q Sun, K Zhou, T Lv
Proceedings of the 2019 conference on empirical methods in natural language …, 2019
Cited by 147 · 2019
LayoutXLM: Multimodal pre-training for multilingual visually-rich document understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
arXiv preprint arXiv:2104.08836, 2021
Cited by 128 · 2021
TextDiffuser: Diffusion models as text painters
J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei
Advances in Neural Information Processing Systems 36, 2024
Cited by 91 · 2024
Document AI: Benchmarks, Models and Applications
L Cui, Y Xu, T Lv, F Wei
CCL 2021, 2021
Cited by 83* · 2021
XFUND: a benchmark dataset for multilingual visually rich form understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
Findings of the Association for Computational Linguistics: ACL 2022, 3214-3224, 2022
Cited by 65 · 2022
Kosmos-2.5: A multimodal literate model
T Lv, Y Huang, J Chen, Y Zhao, Y Jia, L Cui, S Ma, Y Chang, S Huang, ...
arXiv preprint arXiv:2309.11419, 2023
Cited by 42 · 2023
TextDiffuser-2: Unleashing the power of language models for text rendering
J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei
European Conference on Computer Vision, 386-402, 2025
Cited by 38 · 2025
XDoc: Unified pre-training for cross-format document understanding
J Chen, T Lv, L Cui, C Zhang, F Wei
arXiv preprint arXiv:2210.02849, 2022
Cited by 14 · 2022
VT-SSum: A benchmark dataset for video transcript segmentation and summarization
T Lv, L Cui, M Vasilijevic, F Wei
arXiv preprint arXiv:2106.05606, 2021
Cited by 11 · 2021
A simple yet effective learnable positional encoding method for improving document transformer model
G Wang, Y Lu, L Cui, T Lv, D Florencio, C Zhang
Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022 …, 2022
Cited by 9 · 2022
Adversarial Knowledge Stimulated Contrastive Prompting for Few-shot Language Learners
K Zheng, Q Sun, Y Yang, T Lv, Y Pi, C Zhao, F Xu, Q Zhang
Findings of the Association for Computational Linguistics: ACL 2023, 13495-13507, 2023
2023
TextDiffuser: Diffusion Models as Text Painters
J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei
https://arxiv.org/pdf/2305.10855.pdf
Articles 1–17