TrOCR: Transformer-based optical character recognition with pre-trained models M Li, T Lv, J Chen, L Cui, Y Lu, D Florencio, C Zhang, Z Li, F Wei Proceedings of the AAAI Conference on Artificial Intelligence 37 (11), 13094 …, 2023 | 422 | 2023 |
Scene Text Telescope: Text-Focused Scene Image Super-Resolution J Chen, B Li, X Xue Computer Vision and Pattern Recognition (CVPR 2021), 2021 | 125 | 2021 |
TextDiffuser: Diffusion Models as Text Painters J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei Neural Information Processing Systems (NeurIPS 2023), 2024 | 87 | 2024 |
Benchmarking Chinese text recognition: Datasets, baselines, and an empirical study J Chen, H Yu, J Ma, M Guan, X Xu, X Wang, S Qu, B Li, X Xue arXiv preprint arXiv:2112.15093, 2021 | 63 | 2021 |
Zero-Shot Chinese Character Recognition with Stroke-Level Decomposition J Chen, B Li, X Xue International Joint Conference on Artificial Intelligence (IJCAI 2021), 2021 | 57 | 2021 |
Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution J Chen, H Yu, J Ma, B Li, X Xue Association for the Advancement of Artificial Intelligence (AAAI 2022), 2022 | 52 | 2022 |
Kosmos-2.5: A Multimodal Literate Model T Lv*, Y Huang*, J Chen*, L Cui*, S Ma, Y Chang, S Huang, W Wang, ... arXiv preprint arXiv:2309.11419, 2023 | 39 | 2023 |
TextDiffuser-2: Unleashing the Power of Language Models for Text Rendering J Chen, Y Huang, T Lv, L Cui, Q Chen, F Wei European Conference on Computer Vision (ECCV 2024 Oral), 2024 | 35 | 2024 |
MT-TransUNet: Mediating Multi-Task Tokens in Transformers for Skin Lesion Segmentation and Classification J Chen, J Chen, Z Zhou, B Li, A Yuille, Y Lu arXiv preprint arXiv:2112.01767, 2021 | 23 | 2021 |
XDoc: Unified Pre-training for Cross-Format Document Understanding J Chen, T Lv, L Cui, C Zhang, F Wei Empirical Methods in Natural Language Processing (EMNLP-Findings 2022), 2022 | 14 | 2022 |
Chinese character recognition with radical-structured stroke trees H Yu, J Chen, B Li, X Xue Machine Learning 113 (6), 3807-3827, 2024 | 12 | 2024 |
LLMs Meet Multimodal Generation and Editing: A Survey Y He, Z Liu, J Chen, Z Tian, H Liu, X Chi, R Liu, R Yuan, Y Xing, W Wang, ... arXiv preprint arXiv:2405.19334, 2024 | 11 | 2024 |
TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization KT Pham, J Chen, Q Chen ACM Multimedia 2024, 2024 | | 2024 |