Biography
I am currently a Research Scientist at Shanghai AI Lab, working with Dr. Conghui He. My research focuses on Intelligent Document Understanding, Multimodal Large Language Models, and Data-Centric AI.
I am the Head of R&D for the MinerU project, an open-source toolkit for high-quality document parsing that has garnered over 40k stars on GitHub. MinerU supports both traditional model pipelines and advanced multimodal large model approaches.
Prior to joining Shanghai AI Lab, I was engaged in data algorithm research at SenseTime Group Inc. (2020-2022). I obtained my Ph.D. from the University of Chinese Academy of Sciences in 2020. Between 2018 and 2019, I participated in the National Natural Science Foundation of China’s joint Ph.D. training program at the University of Central Florida, under the supervision of Professors Yongdong Zhang and Guo-Jun Qi.
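Join Us
We are always looking for students who have a strong interest in research, are self-motivated, and have a strong sense of responsibility. Open positions include:
- Algorithm Intern
- Young Researcher
- Joint Ph.D. Student (students who wish to pursue a Ph.D. at the lab are welcome to apply; both direct-entry and regular Ph.D. tracks are accepted)
Let's do impactful work together. Please send your resume to: wangbin@pjlab.org.cn or ictwangbin@gmail.com.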
🔥 News
2025:
- 2025.06: 🎉🎉 OHR, LEGION and Chimera are accepted by ICCV 2025.
- 2025.02: 🎉🎉 OmniDocBench and CDM are accepted by CVPR 2025.
- 2025.01: 🎉🎉 GeoX and OmniCorpus are accepted by ICLR 2025.
2024:
- 2024.09: 🎉🎉 InternLM-XComposer2-4KHD is accepted by NeurIPS 2024.
- 2024.07: 🔥🔥🔥 has received 3500+ GitHub stars within one month.
- 2024.07: 🔥🔥🔥 has received 4200+ GitHub stars and ranked #1 on the GitHub Trending list.
- 2024.07: 🎉🎉 CLIP-Parrot-Bias is accepted by ECCV 2024 (Oral).
- 2024.02: 🎉🎉 OPERA is accepted by CVPR 2024.
- 2023.12: 🎉🎉 VIGC is accepted by AAAI 2024.
- 2023.12: 🎉🎉 One paper is accepted by IJAEOG 2024.
- 2023.08: 🎉🎉 DropQueries is accepted by TMM 2023.
- 2023.08: 🎉🎉 V3Det is accepted by ICCV 2023 (Oral).
🚀 Project

PDF-Extract-Kit: A Comprehensive Toolkit for High-Quality PDF Content Extraction
📝 Publications

Parrot Captions Teach CLIP to Spot Text
Yiqi Lin*, Conghui He*, Alex Jinpeng Wang*, Bin Wang*, Weijia Li, Mike Zheng Shou

VIGC: Visual Instruction Generation and Correction
Bin Wang, Fan Wu, Xiao Han, Jiahui Peng, Huaping Zhong, Pan Zhang, Xiaoyi Dong, Weijia Li, Wei Li, Jiaqi Wang, Conghui He

Dinghao Yang*, Bin Wang*, Weijia Li, Conghui He
IJAEOG 2024

DropQueries: A Simple Way to Discover Comprehensive Segment Representations
Haojie Ding, Bin Wang, Guoliang Kang, Weijia Li, Conghui He, Yao Zhao, and Yunchao Wei
TMM 2023

V3Det: Vast Vocabulary Visual Detection Dataset
Jiaqi Wang, Pan Zhang, Tao Chu, Yuhang Cao, Yujie Zhou, Tong Wu, Bin Wang, Conghui He, and Dahua Lin

Boundary perception guidance: A scribble-supervised semantic segmentation approach
Bin Wang, Guojun Qi, Sheng Tang, Tianzhu Zhang, Yunchao Wei, Linghui Li, and Yongdong Zhang
IJCAI 2019

Spatiotemporal Breast Mass Detection Network (MD-Net) in 4D DCE-MRI Images
Lixi Deng, Sheng Tang, Huazhu Fu, Bin Wang, and Yongdong Zhang
MICCAI 2019

Automated pulmonary nodule detection: High sensitivity with few candidates
Bin Wang, Guojun Qi, Sheng Tang, Liheng Zhang, Lixi Deng, and Yongdong Zhang
MICCAI 2018
🎖 Honors and Awards
- 2020.06, Zhu Li Yuehua Outstanding Ph.D. Student Scholarship, Chinese Academy of Sciences (CAS).
- 2016.09, Won 3rd place in the ILSVRC 2016 VID task (Object Detection from Video).
🏢 Work Experience
- 2020.07 - 2022.08, Researcher, SenseTime, Shenzhen, China.
📖 Education
- 2015.09 - 2020.06, Ph.D., University of Chinese Academy of Sciences, Beijing, China.
- 2013.09 - 2015.06, M.S., Beijing Jiaotong University, Beijing, China.