Multi-Grained Vision Language Pre-Training: Aligning Texts with Visual Concepts
Y Zeng, X Zhang, H Li
arXiv preprint arXiv:2111.08276, 2021. Cited by 209.

What Matters in Training a GPT4-Style Language Model with Multimodal Inputs?
Y Zeng, H Zhang, J Zheng, J Xia, G Wei, Y Wei, Y Zhang, T Kong
arXiv preprint arXiv:2307.02469, 2023. Cited by 35.

X²-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
Y Zeng, X Zhang, H Li, J Wang, J Zhang, W Zhou
arXiv preprint arXiv:2211.12402, 2022. Cited by 32.

Jointly Optimizing State Operation Prediction and Value Generation for Dialogue State Tracking
Y Zeng, JY Nie
arXiv preprint arXiv:2010.14061, 2020. Cited by 21.

Make Pixels Dance: High-Dynamic Video Generation
Y Zeng, G Wei, J Zheng, J Zou, Y Wei, Y Zhang, H Li
arXiv preprint arXiv:2311.10982, 2023. Cited by 17.

A Simple and Efficient Multi-Task Learning Approach for Conditioned Dialogue Generation
Y Zeng, JY Nie
Proceedings of the 2021 Conference of the North American Chapter of the …, 2021. Cited by 15.

Cross-View Language Modeling: Towards Unified Cross-Lingual Cross-Modal Pre-training
Y Zeng, W Zhou, A Luo, X Zhang
arXiv preprint arXiv:2206.00621, 2022. Cited by 14.

VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
W Zhou, Y Zeng, S Diao, X Zhang
arXiv preprint arXiv:2205.15237, 2022. Cited by 11.

Multi-domain dialogue state tracking based on state graph
Y Zeng, JY Nie
arXiv preprint arXiv:2010.11137, 2020. Cited by 11.

An Investigation of Suitability of Pre-Trained Language Models for Dialogue Generation – Avoiding Discrepancies
Y Zeng, JY Nie
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 …, 2021. Cited by 8.