CODER（ECCV-2022）

Introduction

This is the official source code for the paper CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval (ECCV 2022).

Abstract: Image-Text Retrieval (ITR) is challenging in bridging visual and lingual modalities. Contrastive learning has been adopted by most prior arts. Besides the limited number of negative image-text pairs, the capability of contrastive learning is restricted by manually weighting negative pairs as well as unawareness of external knowledge. In this paper, we propose our novel Coupled Diversity-Sensitive Momentum Contrastive Learning (CODER) for improving cross-modal representation. Firstly, a novel diversity-sensitive contrastive learning (DCL) architecture is invented. We introduce dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and diversity-sensitiveness is achieved by adaptive negative pair weighting. Furthermore, two branches are designed in CODER. One learns instance-level embeddings from image/text, and it also generates pseudo online clustering labels for its input image/text based on their embeddings. Meanwhile, the other branch learns to query from a commonsense knowledge graph to form concept-level descriptors for both modalities. Afterwards, both branches leverage DCL to align the cross-modal embedding spaces, while an extra pseudo clustering label prediction loss is utilized to promote concept-level representation learning for the second branch. Extensive experiments conducted on two popular benchmarks, i.e. MSCOCO and Flickr30K, validate that CODER remarkably outperforms the state-of-the-art approaches.
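To make the "dynamic dictionary" and "negative pair weighting" ideas in the abstract more concrete, here is a minimal, generic MoCo-style sketch in plain Python. It is not the code in this repository and does not reproduce the paper's DCL loss: the function names, the temperature `tau`, the momentum `m`, and the optional `weights` argument (a placeholder for whatever adaptive weighting is applied to negatives) are all assumptions made for illustration.

```python
import math
from collections import deque


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def momentum_update(key_params, query_params, m=0.999):
    """Exponential moving average update: key <- m * key + (1 - m) * query."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def info_nce(query, positive, negatives, tau=0.07, weights=None):
    """InfoNCE loss for one query, one positive key and a list of negative keys.

    `weights` is a hypothetical per-negative weight list; all ones gives the
    standard, unweighted loss.
    """
    if weights is None:
        weights = [1.0] * len(negatives)
    pos = math.exp(dot(query, positive) / tau)
    neg = sum(w * math.exp(dot(query, n) / tau) for w, n in zip(weights, negatives))
    return -math.log(pos / (pos + neg))


# Dynamic dictionary: a fixed-size FIFO queue, so the oldest key is dropped
# automatically once the queue is full.
queue = deque(maxlen=3)
for key in ([1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]):
    queue.append(key)

query = [1.0, 0.0]
positive = [1.0, 0.0]
plain_loss = info_nce(query, positive, list(queue))
upweighted_loss = info_nce(query, positive, list(queue), weights=[2.0, 2.0, 2.0])
```

In this toy setting, giving the queued negatives a larger weight increases the loss, which is the lever an adaptive weighting scheme would adjust per pair.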

The results on the MSCOCO and Flickr30K datasets:

	Image-to-Text			Text-to-Image
Dataset	R@1	R@5	R@10	R@1	R@5	R@10	R@sum
MSCOCO	82.1	96.6	98.8	65.5	91.5	96.2	530.6
Flickr30k	83.2	96.5	98.0	63.1	87.1	93.0	520.9
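For readers unfamiliar with the metrics in the table, the sketch below shows how Recall@K and R@sum are conventionally computed from a similarity matrix. It is a simplified illustration, not the evaluation code in this repository: it assumes a one-to-one pairing where candidate `i` is the ground truth for query `i`, whereas the real benchmarks pair each image with several captions. The function names and the toy matrix are hypothetical.

```python
def recall_at_k(sim, k):
    """Percentage of queries whose ground-truth candidate is ranked in the top k.

    sim[i][j] is the similarity between query i and candidate j; the ground
    truth for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Candidate indices sorted by descending similarity.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(sim)


def transpose(sim):
    """Swap the roles of queries and candidates."""
    return [list(col) for col in zip(*sim)]


def r_sum(sim, ks=(1, 5, 10)):
    """Sum of Recall@K over both retrieval directions."""
    image_to_text = [recall_at_k(sim, k) for k in ks]
    text_to_image = [recall_at_k(transpose(sim), k) for k in ks]
    return sum(image_to_text) + sum(text_to_image)


# Toy 3x3 image-by-text similarity matrix (made-up values).
sim = [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.5],
    [0.1, 0.2, 0.7],
]
i2t_r1 = recall_at_k(sim, 1)
t2i_r1 = recall_at_k(transpose(sim), 1)
total = r_sum(sim, ks=(1, 2))
```

With the default `ks=(1, 5, 10)`, `r_sum` adds up the same six quantities that appear as columns in the table.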

Requirements and Installation

You can configure the running environment with:

pip install -r requirements.txt

We recommend the following dependencies:

Python 3.7
NumPy 1.19
PyTorch 1.8
transformers 2.1.0
TensorBoard
torchtext 0.4.0
torchvision 0.9.0
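If you want to confirm that an environment matches the versions listed above, a small check along the following lines may help. This is only a sketch: the PyPI distribution names in `RECOMMENDED` (for example `torch` for PyTorch) and the helper names are assumptions, and `importlib.metadata` requires Python 3.8 or newer, which is later than the Python 3.7 recommended here.

```python
from importlib import metadata  # standard library from Python 3.8 onwards

# Recommended versions from the list above, keyed by assumed PyPI names.
RECOMMENDED = {
    "numpy": "1.19",
    "torch": "1.8",
    "transformers": "2.1.0",
    "torchtext": "0.4.0",
    "torchvision": "0.9.0",
}


def matches(installed, recommended):
    """True if `installed` starts with the dotted components of `recommended`."""
    want = recommended.split(".")
    return installed.split(".")[: len(want)] == want


def check(recommended=RECOMMENDED, get_version=metadata.version):
    """Return {name: (installed_version_or_None, matches_recommendation)}."""
    report = {}
    for name, want in recommended.items():
        try:
            have = get_version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed
        report[name] = (have, have is not None and matches(have, want))
    return report
```

Calling `check()` in the target environment returns one entry per package, so mismatches can be spotted before training starts.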

Download data

Download the dataset files. We use the image features created by SCAN, which can be downloaded here. All the data needed to reproduce the experiments in the paper, including image features, text, vocabularies and concept annotation files, can be downloaded from:

wget "https://pan.baidu.com/s/1ATcSpcOxn6CJCHvYL0ap-A?pwd=duxh"

Checkpoints

The checkpoints of our trained models can be downloaded from:

wget "https://pan.baidu.com/s/1otO_LB5RSNH235HNkYJZvQ?pwd=qp7b"

Extract runs.tar.gz to get the trained model files for the Flickr30K dataset, and put the extracted runs folder in the root directory of this repository.
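The extraction step can also be scripted. The sketch below uses Python's standard `tarfile` module and assumes, based on the description above, that the archive unpacks to a top-level `runs` folder; the function name `extract_runs` is hypothetical. Only extract archives you trust.

```python
import tarfile
from pathlib import Path


def extract_runs(archive="runs.tar.gz", repo_root="."):
    """Extract the checkpoint archive into repo_root and return the runs folder."""
    root = Path(repo_root)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(root)
    runs = root / "runs"
    if not runs.is_dir():
        # The archive did not contain the expected top-level folder.
        raise FileNotFoundError(f"expected a 'runs' folder under {root}")
    return runs
```

Running `extract_runs()` from the repository root leaves the checkpoints under `runs`, where the evaluation script is expected to find them.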

Training

Train on MSCOCO dataset:

python train_mine_coco_CODER.py

Train on Flickr30K dataset:

python train_mine_f30k_CODER.py

Evaluate

Test on Flickr30K dataset:

python eval_mine_f30k_CODER.py

Reference

If this repo is useful for your research, please cite our paper:

@inproceedings{wang2022coder,
  title={{CODER}: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval},
  author={Wang, Haoran and He, Dongliang and Wu, Wenhao and Xia, Boyang and Yang, Min and Li, Fu and Yu, Yunlong and Ji, Zhong and Ding, Errui and Wang, Jingdong},
  booktitle={European Conference on Computer Vision},
  pages={700--716},
  year={2022},
  organization={Springer}
}

@inproceedings{Wang2020CVSE,
  title={Consensus-Aware Visual-Semantic Embedding for Image-Text Matching},
  author={Wang, Haoran and Zhang, Ying and Ji, Zhong and Pang, Yanwei and Ma, Lin},
  booktitle={ECCV},
  year={2020}
}

License

MIT License

