Yiheng Xu
Title
Cited by
Year
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
Y Xu, M Li, L Cui, S Huang, F Wei, M Zhou
Proceedings of the 26th ACM SIGKDD International Conference on Knowledge …, 2020
Cited by: 808; Year: 2020
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Y Xu, Y Xu, T Lv, L Cui, F Wei, G Wang, Y Lu, D Florencio, C Zhang, ...
Proceedings of the 59th Annual Meeting of the Association for Computational …, 2020
Cited by: 527; Year: 2020
DocBank: A Benchmark Dataset for Document Layout Analysis
M Li, Y Xu, L Cui, S Huang, F Wei, Z Li, M Zhou
Proceedings of the 28th International Conference on Computational …, 2020
Cited by: 207; Year: 2020
Graph Convolutional Networks with Markov Random Field Reasoning for Social Spammer Detection
Y Wu, D Lian, Y Xu, L Wu, E Chen
Proceedings of the AAAI Conference on Artificial Intelligence 34 (01), 1054-1061, 2020
Cited by: 206; Year: 2020
DiT: Self-Supervised Pre-training for Document Image Transformer
J Li, Y Xu, T Lv, L Cui, C Zhang, F Wei
Proceedings of the 30th ACM International Conference on Multimedia, 2022
Cited by: 157; Year: 2022
LayoutXLM: Multimodal Pre-training for Multilingual Visually-Rich Document Understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
arXiv preprint arXiv:2104.08836, 2021
Cited by: 127; Year: 2021
Document AI: Benchmarks, Models and Applications
L Cui, Y Xu, T Lv, F Wei
arXiv preprint arXiv:2111.08609, 2021
Cited by: 82; Year: 2021
XFUND: a benchmark dataset for multilingual visually rich form understanding
Y Xu, T Lv, L Cui, G Wang, Y Lu, D Florencio, C Zhang, F Wei
Findings of the Association for Computational Linguistics: ACL 2022, 3214-3224, 2022
Cited by: 65; Year: 2022
LayoutReader: Pre-training of Text and Layout for Reading Order Detection
Z Wang, Y Xu, L Cui, J Shang, F Wei
Proceedings of the 2021 Conference on Empirical Methods in Natural Language …, 2021
Cited by: 65; Year: 2021
MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding
J Li, Y Xu, L Cui, F Wei
Proceedings of the 60th Annual Meeting of the Association for Computational …, 2022
Cited by: 57; Year: 2022
OpenAgents: An open platform for language agents in the wild
T Xie, F Zhou, Z Cheng, P Shi, L Weng, Y Liu, TJ Hua, J Zhao, Q Liu, C Liu, ...
arXiv preprint arXiv:2310.10634, 2023
Cited by: 54; Year: 2023
Lemur: Harmonizing natural language and code for language agents
Y Xu, H Su, C Xing, B Mi, Q Liu, W Shi, B Hui, F Zhou, Y Liu, T Xie, ...
The Twelfth International Conference on Learning Representations (ICLR 2024), 2024
Cited by: 53*; Year: 2024
OSWorld: Benchmarking multimodal agents for open-ended tasks in real computer environments
T Xie, D Zhang, J Chen, X Li, S Zhao, R Cao, TJ Hua, Z Cheng, D Shin, ...
arXiv preprint arXiv:2404.07972, 2024
Cited by: 45; Year: 2024
In-context learning with many demonstration examples
M Li, S Gong, J Feng, Y Xu, J Zhang, Z Wu, L Kong
arXiv preprint arXiv:2302.04931, 2023
Cited by: 17; Year: 2023
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Y Xu, Z Wang, J Wang, D Lu, T Xie, A Saha, D Sahoo, T Yu, C Xiong
arXiv preprint arXiv:2412.04454, 2024
Year: 2024
Reading order detection in a document
L Cui, XU Yiheng, Y Xu, F Wei, Z Wang
US Patent App. 18/563,002, 2024
Year: 2024
Articles 1–16