Qiyuan.Tech

Research and development of AGI for the benefit of mankind.

Chinese CLIP: Contrastive Vision-Language Pretraining in Chinese

CLIP [1] has been hugely influential in vision and multimodal representation learning. It serves not only as a foundation model but also as a bridge between vision and language, and it has triggered a wave of research across fields, especially text-to-image generation. However, we find that applications, especially cross-modal retrieval, need a language-specific CLIP, and no open-source Chinese CLIP with good performance exists. We therefore launched this project to advance Chinese multimodal representation learning....

December 24, 2022 · 4 min · 850 words · Qiyuan.Tech