Graph Matching Benchmark
pygmtools also provides a protocol to fairly compare existing deep graph matching algorithms under different datasets & experiment settings.
The Benchmark
module provides a unified data interface and an evaluating platform for different datasets.
If you are interested in the performance and the full deep learning pipeline, please refer to our ThinkMatch project.
Evaluation Metrics and Results
Our evaluation metrics include matching_precision (p), matching_recall (r) and f1_score (f1). Also, to measure the reliability of the evaluation result, we define coverage (cvg) for each class in the dataset as the number of evaluated pairs in the class/number of all possible pairs in the class. Therefore, larger coverage refers to higher reliability.
An example of evaluation result (p==r==f1
because this evaluation does not involve partial matching/outliers):
Matching accuracy
Car: p = 0.8395±0.2280, r = 0.8395±0.2280, f1 = 0.8395±0.2280, cvg = 1.0000
Duck: p = 0.7713±0.2255, r = 0.7713±0.2255, f1 = 0.7713±0.2255, cvg = 1.0000
Face: p = 0.9656±0.0913, r = 0.9656±0.0913, f1 = 0.9656±0.0913, cvg = 0.2612
Motorbike: p = 0.8821±0.1821, r = 0.8821±0.1821, f1 = 0.8821±0.1821, cvg = 1.0000
Winebottle: p = 0.8929±0.1569, r = 0.8929±0.1569, f1 = 0.8929±0.1569, cvg = 0.9662
average accuracy: p = 0.8703±0.1767, r = 0.8703±0.1767, f1 = 0.8703±0.1767
Evaluation complete in 1m 55s
Available Datasets
Dataset can be automatically downloaded and unzipped, but you can also download the dataset yourself, and make sure it in the right path.
PascalVOC-Keypoint Dataset
Download VOC2011 dataset and make sure it looks like
data/PascalVOC/TrainVal/VOCdevkit/VOC2011
Download keypoint annotation for VOC2011 from Berkeley server or google drive and make sure it looks like
data/PascalVOC/annotations
Download the train/test split file and make sure it looks like
data/PascalVOC/voc2011_pairs.npz
Please cite the following papers if you use PascalVOC-Keypoint dataset:
@article{EveringhamIJCV10,
title={The pascal visual object classes (voc) challenge},
author={Everingham, Mark and Van Gool, Luc and Williams, Christopher KI and Winn, John and Zisserman, Andrew},
journal={International Journal of Computer Vision},
volume={88},
pages={303–338},
year={2010}
}
@inproceedings{BourdevICCV09,
title={Poselets: Body part detectors trained using 3d human pose annotations},
author={Bourdev, L. and Malik, J.},
booktitle={International Conference on Computer Vision},
pages={1365--1372},
year={2009},
organization={IEEE}
}
Willow-Object-Class Dataset
Download Willow-ObjectClass dataset
Unzip the dataset and make sure it looks like
data/WillowObject/WILLOW-ObjectClass
Please cite the following paper if you use Willow-Object-Class dataset:
@inproceedings{ChoICCV13,
author={Cho, Minsu and Alahari, Karteek and Ponce, Jean},
title = {Learning Graphs to Match},
booktitle = {International Conference on Computer Vision},
pages={25--32},
year={2013}
}
CUB2011 Dataset
Download CUB-200-2011 dataset.
Unzip the dataset and make sure it looks like
data/CUB_200_2011/CUB_200_2011
Please cite the following report if you use CUB2011 dataset:
@techreport{CUB2011,
Title = {{The Caltech-UCSD Birds-200-2011 Dataset}},
Author = {Wah, C. and Branson, S. and Welinder, P. and Perona, P. and Belongie, S.},
Year = {2011},
Institution = {California Institute of Technology},
Number = {CNS-TR-2011-001}
}
IMC-PT-SparseGM Dataset
Download the IMC-PT-SparseGM dataset from google drive or baidu drive (code: 0576)
Unzip the dataset and make sure it looks like
data/IMC_PT_SparseGM/annotations
Please cite the following papers if you use IMC-PT-SparseGM dataset:
@article{JinIJCV21,
title={Image Matching across Wide Baselines: From Paper to Practice},
author={Jin, Yuhe and Mishkin, Dmytro and Mishchuk, Anastasiia and Matas, Jiri and Fua, Pascal and Yi, Kwang Moo and Trulls, Eduard},
journal={International Journal of Computer Vision},
pages={517--547},
year={2021}
}
SPair-71k Dataset
Download SPair-71k dataset
Unzip the dataset and make sure it looks like
data/SPair-71k
Please cite the following papers if you use SPair-71k dataset:
@article{min2019spair,
title={SPair-71k: A Large-scale Benchmark for Semantic Correspondence},
author={Juhong Min and Jongmin Lee and Jean Ponce and Minsu Cho},
journal={arXiv prepreint arXiv:1908.10543},
year={2019}
}
@InProceedings{min2019hyperpixel,
title={Hyperpixel Flow: Semantic Correspondence with Multi-layer Neural Features},
author={Juhong Min and Jongmin Lee and Jean Ponce and Minsu Cho},
booktitle={ICCV},
year={2019}
}
API Reference
See the API doc of Benchmark module and the API doc of datasets for details.
File Organization
dataset.py
: The file includes 5 dataset classes, used to automatically download the dataset and process the dataset into a json file, and also save the training set and the testing set.benchmark.py
: The file includes Benchmark class that can be used to fetch data from the json file and evaluate prediction results.dataset_config.py
: The default dataset settings, mostly dataset path and classes.
Example
import pygmtools as pygm
from pygm.benchmark import Benchmark
# Define Benchmark on PascalVOC.
bm = Benchmark(name='PascalVOC', sets='train',
obj_resize=(256, 256), problem='2GM',
filter='intersection')
# Random fetch data and ground truth.
data_list, gt_dict, _ = bm.rand_get_data(cls=None, num=2)
Running Time Evaluation
Overall Comparison
Charts below illustrate the results of our experimental investigation into the efficiency of some pygmtools
solvers, comparing execution time among different backends and against previous packages (ZAC_GM
for classic solvers and Multiway
for multigraph solvers).
Experiments of pygmtools
were conducted on both CPU and GPU to explore the acceleration of CUDA for graph matching problems, and the existing packages were executed by both Matlab and Octave. Also examined are the variance of computation time with different input graph sizes and the dissimilar trends in time increments on different devices and backends. These information combined provide rich indication in hope that you can select a preferable backend and determine the necessity of enabling CUDA for specific problem scales.

Original Results
We provide the original data of our time tests here. Input affinity matrices are randomly generated with a fixed batchsize of 64 and the execution times have been averaged across 50 runs, with the first run of each test configuration excluded to mitigate initialization biases.
Note
All experiments were performed on a consistent platform of Linux Ubuntu 20.04 with Python 3.9.17 and the latest compatible versions of the numerical backends listed as follows. Runtime discrepancy shall occur due to different platform, package version, CUDA version, hardware configuration, etc.
numpy==1.24.3
torch==2.0.1
jittor==1.3.8.5
paddlepaddle-gpu==2.5.1.post116
tensorflow==2.13.0
mindspore-gpu==1.10.0
sinkhorn
Num_nodes |
100 |
200 |
300 |
400 |
500 |
---|---|---|---|---|---|
Numpy |
0.0774 |
0.3446 |
0.8339 |
2.5503 |
2.6804 |
Jittor |
0.0949 |
0.3758 |
0.9787 |
1.8182 |
2.2302 |
Jittor(gpu) |
0.0247 |
0.0404 |
0.0579 |
0.0505 |
0.0664 |
Mindspore |
0.6812 |
1.59 |
3.3054 |
6.1207 |
9.0556 |
Mindspore(gpu) |
0.6437 |
0.9862 |
1.6007 |
2.5595 |
4.0663 |
Paddle |
0.5251 |
2.1498 |
5.1901 |
9.196 |
13.6967 |
Paddle(gpu) |
0.0516 |
0.0706 |
0.0853 |
0.0976 |
0.1299 |
PyTorch |
0.0215 |
0.0826 |
0.2163 |
0.5345 |
0.8254 |
PyTorch(gpu) |
0.0253 |
0.0536 |
0.0682 |
0.0901 |
0.1193 |
Tensorflow |
0.2674 |
0.4068 |
0.7785 |
1.2411 |
1.3815 |
Tensorflow(gpu) |
0.3364 |
0.3946 |
0.3461 |
0.3532 |
0.3891 |
ZAC_GM(matlab) |
0.0168 |
0.0547 |
0.085 |
0.1935 |
0.3495 |
ZAC_GM(octave) |
0.4838 |
0.5544 |
0.7245 |
0.9855 |
1.3513 |
rrwm
Num_nodes |
10 |
20 |
30 |
40 |
50 |
---|---|---|---|---|---|
Numpy |
0.0672 |
0.2686 |
0.5725 |
1.4909 |
2.6849 |
Jittor |
0.1888 |
1.4861 |
6.4132 |
19.3547 |
48.8096 |
Jittor(gpu) |
1.233 |
1.2734 |
1.367 |
1.4691 |
1.3938 |
Mindspore |
8.996 |
9.654 |
10.4509 |
13.1224 |
15.8549 |
Mindspore(gpu) |
24.4615 |
24.6842 |
26.9007 |
28.3129 |
28.6972 |
Paddle |
0.5027 |
1.0883 |
2.0557 |
4.0692 |
9.7136 |
Paddle(gpu) |
2.5013 |
5.7596 |
2.8423 |
2.4371 |
2.4668 |
PyTorch |
0.1416 |
0.3493 |
0.6419 |
1.552 |
3.0807 |
PyTorch(gpu) |
1.2784 |
1.9233 |
2.4289 |
1.4122 |
1.3357 |
Tensorflow |
7.2743 |
7.4241 |
8.3388 |
8.7745 |
10.4648 |
Tensorflow(gpu) |
12.171 |
12.2908 |
13.1086 |
13.837 |
14.833 |
ZAC_GM(matlab) |
0.1013 |
0.6335 |
1.6572 |
4.9238 |
15.9601 |
ZAC_GM(octave) |
2.5925 |
2.9179 |
4.2427 |
16.9212 |
25.8831 |
sm
Num_nodes |
10 |
20 |
30 |
40 |
50 |
---|---|---|---|---|---|
Numpy |
0.0005 |
0.0068 |
0.0188 |
0.0552 |
0.1289 |
Jittor |
0.0812 |
1.2849 |
6.1659 |
19.394 |
47.4565 |
Jittor(gpu) |
0.0763 |
0.0866 |
0.1207 |
0.1709 |
0.3336 |
Mindspore |
0.1917 |
0.2764 |
0.4744 |
0.813 |
1.8217 |
Mindspore(gpu) |
0.6202 |
0.6836 |
0.7277 |
0.6812 |
0.9488 |
Paddle |
0.008 |
0.0117 |
0.0393 |
0.0933 |
0.2295 |
Paddle(gpu) |
0.0244 |
0.0245 |
0.0255 |
0.0388 |
0.0438 |
PyTorch |
0.0032 |
0.0075 |
0.0266 |
0.0768 |
0.2101 |
PyTorch(gpu) |
0.0117 |
0.013 |
0.0124 |
0.0211 |
0.0295 |
Tensorflow |
0.0621 |
0.0644 |
0.0772 |
0.1182 |
0.2152 |
Tensorflow(gpu) |
0.0607 |
0.0645 |
0.081 |
0.1185 |
0.1751 |
ZAC_GM(matlab) |
0.0278 |
0.0465 |
0.1008 |
0.2831 |
1.0303 |
ZAC_GM(octave) |
0.0979 |
0.1256 |
0.2433 |
0.5288 |
1.4496 |
cao
Num_nodes |
5 |
10 |
15 |
20 |
25 |
---|---|---|---|---|---|
Numpy |
0.0657 |
0.4048 |
1.2281 |
8.7141 |
16.6373 |
Jittor |
0.1652 |
0.3034 |
0.9381 |
2.528 |
5.8438 |
Jittor(gpu) |
1.5175 |
1.4686 |
1.3623 |
1.6918 |
1.621 |
Paddle |
0.4363 |
0.645 |
1.1394 |
2.2024 |
4.1144 |
Paddle(gpu) |
3.983 |
4.0496 |
3.5733 |
3.9038 |
3.566 |
PyTorch |
0.1973 |
0.2583 |
0.4398 |
0.7367 |
1.5745 |
PyTorch(gpu) |
1.8465 |
1.7263 |
1.877 |
1.819 |
1.4205 |
Multiway(matlab) |
0.1618 |
0.5324 |
0.9494 |
1.2673 |
1.3074 |
Multiway(octave) |
4.3373 |
12.7005 |
28.5129 |
32.5435 |
41.7382 |
mgm_floyd
Num_nodes |
10 |
20 |
30 |
40 |
50 |
---|---|---|---|---|---|
Numpy |
1.4202 |
2.1465 |
13.2036 |
33.611 |
70.1147 |
Jittor |
0.2715 |
2.1916 |
10.1757 |
30.9919 |
82.4783 |
Jittor(gpu) |
1.4565 |
1.411 |
1.2093 |
1.5656 |
2.0477 |
Paddle |
0.5789 |
1.6008 |
4.0375 |
9.4972 |
20.0651 |
Paddle(gpu) |
3.6822 |
3.5828 |
3.5775 |
3.4221 |
3.8406 |
PyTorch |
0.2361 |
0.5059 |
1.4968 |
3.7183 |
8.6155 |
PyTorch(gpu) |
1.9935 |
1.7762 |
1.5959 |
1.8578 |
2.2322 |
Multiway(matlab) |
0.5324 |
1.2673 |
1.6997 |
2.8531 |
3.8514 |
Multiway(octave) |
12.7005 |
32.5435 |
47.9178 |
59.411 |
77.5208 |
gamgm
Num_nodes |
50 |
100 |
150 |
200 |
250 |
---|---|---|---|---|---|
Numpy |
1.3975 |
6.4284 |
16.7187 |
37.5795 |
55.3014 |
Jittor |
0.5642 |
2.4157 |
6.3132 |
13.4241 |
27.8963 |
Jittor(gpu) |
2.121 |
2.6745 |
4.2369 |
6.688 |
10.3406 |
Paddle |
1.2089 |
6.3663 |
18.9799 |
43.4897 |
74.0847 |
Paddle(gpu) |
16.5545 |
10.959 |
25.8544 |
28.4271 |
32.6698 |
PyTorch |
0.3582 |
1.6374 |
3.9013 |
7.7074 |
12.0233 |
PyTorch(gpu) |
1.2207 |
1.4522 |
2.719 |
5.0526 |
7.7791 |
Multiway(matlab) |
3.8514 |
8.8039 |
18.0332 |
23.6242 |
31.9381 |
Multiway(octave) |
77.5208 |
208.5821 |
256.3383 |
308.7697 |
326.1246 |