Graph Matching Benchmark

pygmtools also provides a protocol to fairly compare existing deep graph matching algorithms across different datasets and experiment settings. The Benchmark module offers a unified data interface and an evaluation platform for these datasets.

If you are interested in the performance and the full deep learning pipeline, please refer to our ThinkMatch project.

Evaluation Metrics and Results

Our evaluation metrics include matching_precision (p), matching_recall (r) and f1_score (f1). To measure the reliability of an evaluation result, we also define coverage (cvg) for each class in the dataset as the number of evaluated pairs in the class divided by the number of all possible pairs in the class. A larger coverage therefore means a more reliable result.
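Concretely, the three metrics can be computed from a predicted and a ground-truth 0/1 matching matrix. The following is a minimal NumPy sketch (the function name matching_metrics is ours and is independent of the Benchmark API):

```python
import numpy as np

def matching_metrics(pred_perm: np.ndarray, gt_perm: np.ndarray):
    """Compute matching precision, recall and F1 from two 0/1 matching matrices.

    pred_perm[i, j] == 1 means node i in graph 1 is matched to node j in graph 2.
    """
    true_pos = float(np.sum(pred_perm * gt_perm))   # matches found in both
    n_pred = float(np.sum(pred_perm))               # all predicted matches
    n_gt = float(np.sum(gt_perm))                   # all ground-truth matches
    p = true_pos / n_pred if n_pred > 0 else 0.0
    r = true_pos / n_gt if n_gt > 0 else 0.0
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return p, r, f1

# Without outliers both matrices are full permutations, so p == r == f1.
gt = np.eye(4)
pred = np.eye(4)[[0, 1, 3, 2]]  # two of the four nodes are mismatched
print(matching_metrics(pred, gt))  # -> (0.5, 0.5, 0.5)
```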

An example of an evaluation result (p == r == f1 because this evaluation does not involve partial matching/outliers):

Matching accuracy
Car: p = 0.8395±0.2280, r = 0.8395±0.2280, f1 = 0.8395±0.2280, cvg = 1.0000
Duck: p = 0.7713±0.2255, r = 0.7713±0.2255, f1 = 0.7713±0.2255, cvg = 1.0000
Face: p = 0.9656±0.0913, r = 0.9656±0.0913, f1 = 0.9656±0.0913, cvg = 0.2612
Motorbike: p = 0.8821±0.1821, r = 0.8821±0.1821, f1 = 0.8821±0.1821, cvg = 1.0000
Winebottle: p = 0.8929±0.1569, r = 0.8929±0.1569, f1 = 0.8929±0.1569, cvg = 0.9662
average accuracy: p = 0.8703±0.1767, r = 0.8703±0.1767, f1 = 0.8703±0.1767
Evaluation complete in 1m 55s
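The ± values in the report read as the mean and standard deviation of each metric over the evaluated pairs in a class. Assuming that interpretation, the per-class aggregation amounts to the following (toy scores of our own, not real benchmark output):

```python
import numpy as np

# Toy per-pair F1 scores for one class (illustrative values only).
f1_scores = np.array([0.6, 0.9, 1.0, 0.85])

# One line of the per-class report: mean ± standard deviation over pairs.
print(f'f1 = {f1_scores.mean():.4f}±{f1_scores.std():.4f}')
```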

Available Datasets

Each dataset can be downloaded and unzipped automatically, but you can also download a dataset yourself; just make sure it is in the right path.

PascalVOC-Keypoint Dataset

  1. Download VOC2011 dataset and make sure it looks like data/PascalVOC/TrainVal/VOCdevkit/VOC2011

  2. Download the keypoint annotations for VOC2011 from the Berkeley server or Google Drive and make sure it looks like data/PascalVOC/annotations

  3. Download the train/test split file and make sure it looks like data/PascalVOC/voc2011_pairs.npz

Please cite the following papers if you use the PascalVOC-Keypoint dataset:

  @article{everingham2010pascal,
    title={The pascal visual object classes (voc) challenge},
    author={Everingham, Mark and Van Gool, Luc and Williams, Christopher KI and Winn, John and Zisserman, Andrew},
    journal={International Journal of Computer Vision},
    year={2010}
  }

  @inproceedings{bourdev2009poselets,
    title={Poselets: Body part detectors trained using 3d human pose annotations},
    author={Bourdev, L. and Malik, J.},
    booktitle={International Conference on Computer Vision},
    year={2009}
  }

Willow-Object-Class Dataset

  1. Download the Willow-ObjectClass dataset

  2. Unzip the dataset and make sure it looks like data/WillowObject/WILLOW-ObjectClass

Please cite the following paper if you use the Willow-Object-Class dataset:

  @inproceedings{cho2013learning,
    author={Cho, Minsu and Alahari, Karteek and Ponce, Jean},
    title={Learning Graphs to Match},
    booktitle={International Conference on Computer Vision},
    year={2013}
  }

CUB2011 Dataset

  1. Download the CUB-200-2011 dataset.

  2. Unzip the dataset and make sure it looks like data/CUB_200_2011/CUB_200_2011

Please cite the following report if you use the CUB2011 dataset:

  @techreport{WahCUB_200_2011,
    title={{The Caltech-UCSD Birds-200-2011 Dataset}},
    author={Wah, C. and Branson, S. and Welinder, P. and Perona, P. and Belongie, S.},
    year={2011},
    institution={California Institute of Technology},
    number={CNS-TR-2011-001}
  }

IMC-PT-SparseGM Dataset

  1. Download the IMC-PT-SparseGM dataset from Google Drive or Baidu Drive (code: 0576)

  2. Unzip the dataset and make sure it looks like data/IMC_PT_SparseGM/annotations

Please cite the following paper if you use the IMC-PT-SparseGM dataset:

  @article{jin2021image,
    title={Image Matching across Wide Baselines: From Paper to Practice},
    author={Jin, Yuhe and Mishkin, Dmytro and Mishchuk, Anastasiia and Matas, Jiri and Fua, Pascal and Yi, Kwang Moo and Trulls, Eduard},
    journal={International Journal of Computer Vision},
    year={2021}
  }

SPair-71k Dataset

  1. Download the SPair-71k dataset

  2. Unzip the dataset and make sure it looks like data/SPair-71k

Please cite the following papers if you use the SPair-71k dataset:

  @article{min2019spair,
    title={SPair-71k: A Large-scale Benchmark for Semantic Correspondence},
    author={Juhong Min and Jongmin Lee and Jean Ponce and Minsu Cho},
    journal={arXiv preprint arXiv:1908.10543},
    year={2019}
  }

  @inproceedings{min2019hyperpixel,
    title={Hyperpixel Flow: Semantic Correspondence with Multi-layer Neural Features},
    author={Juhong Min and Jongmin Lee and Jean Ponce and Minsu Cho},
    booktitle={International Conference on Computer Vision},
    year={2019}
  }

API Reference

See the API doc of Benchmark module and the API doc of datasets for details.

File Organization

  • pygmtools.dataset includes 5 dataset classes, which automatically download each dataset, process it into a json file, and save the training and testing splits.

  • pygmtools.benchmark includes the Benchmark class, which fetches data from the json file and evaluates prediction results.

  • pygmtools.dataset_config contains the default dataset settings, mostly dataset paths and classes.


import pygmtools as pygm
from pygmtools.benchmark import Benchmark

# Define a Benchmark instance on PascalVOC.
bm = Benchmark(name='PascalVOC', sets='train',
               obj_resize=(256, 256), problem='2GM')

# Randomly fetch data and ground truth.
data_list, gt_dict, _ = bm.rand_get_data(cls=None, num=2)
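A solver evaluated on such pairs typically produces a soft matching score matrix, while evaluation expects a 0/1 permutation matrix. A common way to discretize is linear assignment (the Hungarian algorithm); below is a SciPy sketch with toy values and function names of our own, not the Benchmark API:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def discretize(score: np.ndarray) -> np.ndarray:
    """Turn a soft matching score matrix into a 0/1 permutation matrix
    by solving a linear assignment problem (Hungarian algorithm)."""
    rows, cols = linear_sum_assignment(-score)  # negate to maximize total score
    perm = np.zeros_like(score)
    perm[rows, cols] = 1.0
    return perm

# A toy soft matching for 3 keypoints; ground truth is the identity matching.
score = np.array([[0.9, 0.1, 0.0],
                  [0.2, 0.7, 0.1],
                  [0.1, 0.3, 0.6]])
pred = discretize(score)

# Fraction of correctly matched keypoints:
print(np.sum(pred * np.eye(3)) / 3)  # -> 1.0
```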