
class pygmtools.benchmark.Benchmark(name, sets, obj_resize=(256, 256), problem='2GM', filter='intersection', **args)[source]

The Benchmark module provides a unified data interface and an evaluating platform for different datasets.

  • name – str, dataset name; currently supported: 'PascalVOC', 'WillowObject', 'IMC_PT_SparseGM', 'CUB2011', 'SPair71k'

  • sets – str, problem set, 'train' for training set and 'test' for test set

  • obj_resize – tuple, (default: (256, 256)) resized object size

  • problem – str, (default: '2GM') problem type, '2GM' for 2-graph matching and 'MGM' for multi-graph matching

  • filter – str, (default: 'intersection') node filter; 'intersection' retains only the common nodes, 'inclusion' (2GM only) filters one graph so that its nodes are a subset of the other graph's nodes, and 'unfiltered' retains all nodes in all graphs

  • args – keyword settings for specific dataset
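
The three filter modes can be illustrated on plain lists of keypoint labels. This is a minimal sketch of the semantics described above, not the library's implementation: the helper filter_nodes and the label lists are hypothetical, and the choice to filter the first graph under 'inclusion' is an assumption.

```python
def filter_nodes(kpts_a, kpts_b, mode="intersection"):
    """Return the keypoint labels kept in each of two graphs under `mode`.

    Hypothetical helper for illustration; not part of pygmtools.
    """
    common = set(kpts_a) & set(kpts_b)
    if mode == "intersection":
        # Keep only the nodes that appear in both graphs.
        return ([k for k in kpts_a if k in common],
                [k for k in kpts_b if k in common])
    if mode == "inclusion":
        # Filter only the first graph so its nodes are a subset of the second.
        # Which graph gets filtered is an assumption of this sketch.
        return [k for k in kpts_a if k in common], list(kpts_b)
    if mode == "unfiltered":
        # Keep every node in both graphs.
        return list(kpts_a), list(kpts_b)
    raise ValueError(f"unknown filter mode: {mode!r}")


a = ["head", "tail", "left_wing", "foot"]
b = ["head", "left_wing", "beak", "tail"]
for mode in ("intersection", "inclusion", "unfiltered"):
    print(mode, filter_nodes(a, b, mode))
```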


compute_img_num(classes)[source]

Compute the number of images in the specified classes.

  • classes – list of dataset classes


list of numbers of images in each class

compute_length(cls=None, num=2)[source]

Compute the number of image combinations in the specified class.

  • cls – int or str, class of expected data. None for all classes

  • num – int, number of images in each image ID list; for example, 2 for two-graph matching problem


length of combinations

eval(prediction, classes, verbose=False)[source]

Evaluate test results and compute matching accuracy and coverage.

  • prediction – list, prediction result, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]

  • classes – list of evaluated classes

  • verbose – bool, whether to print the result


evaluation result in each class and their averages, including p, r, f1 and their standard deviation and coverage

eval_cls(prediction, cls, verbose=False)[source]

Evaluate test results and compute matching accuracy and coverage on one specified class.

  • prediction – list, prediction result on one class, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]

  • cls – str, evaluated class

  • verbose – bool, whether to print the result


evaluation result on the specified class, including p, r, f1 and their standard deviation and coverage
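
As a rough illustration of the reported metrics, the sketch below computes p, r and f1 for a single predicted permutation matrix against its ground truth. It assumes the usual definitions (correct matches divided by predicted matches, and by ground-truth matches) and uses nested lists in place of np.array or scipy.sparse; the helper match_scores is hypothetical, and coverage is not modeled here.

```python
def match_scores(pred, gt):
    """Precision, recall and F1 for one predicted permutation matrix.

    `pred` and `gt` are 0/1 matrices of equal shape, given as nested lists.
    Hypothetical helper for illustration; not part of pygmtools.
    """
    # A match is correct when both matrices have a 1 in the same cell.
    correct = sum(p * g for p_row, g_row in zip(pred, gt)
                  for p, g in zip(p_row, g_row))
    n_pred = sum(map(sum, pred))  # number of predicted matches
    n_gt = sum(map(sum, gt))      # number of ground-truth matches
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gt if n_gt else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


gt = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 1]]
pred = [[1, 0, 0],   # correct match
        [0, 0, 1],   # wrong match
        [0, 0, 0]]   # node left unmatched
p, r, f1 = match_scores(pred, gt)
print(p, r, f1)
```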

get_data(ids, test=False, shuffle=True)[source]

Fetch a data pair, or multiple pairs, by image ID for training or testing.

  • ids – list of image IDs, usually from train.json or test.json

  • test – bool, whether the fetched data is used for test; if true, this function will not return ground truth

  • shuffle – bool, whether to shuffle the order of keypoints


data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs
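
The key layout of perm_mat_dict can be sketched with plain Python. The example assumes each fetched item carries a list of keypoint labels and that a ground-truth entry is 1 where two keypoints share a label; build_perm_mat_dict is a hypothetical helper, and nested lists stand in for the scipy.sparse matrices that the library returns.

```python
from itertools import combinations


def build_perm_mat_dict(kpt_labels):
    """Build {(i, j): matrix} for every pair i < j of label lists.

    matrix[r][c] is 1 when keypoint r of item i and keypoint c of item j
    carry the same label, else 0. Hypothetical helper for illustration.
    """
    perm_mat_dict = {}
    for i, j in combinations(range(len(kpt_labels)), 2):
        perm_mat_dict[(i, j)] = [
            [int(a == b) for b in kpt_labels[j]] for a in kpt_labels[i]
        ]
    return perm_mat_dict


labels = [
    ["head", "tail", "beak"],  # keypoints of ids[0]
    ["beak", "head", "tail"],  # keypoints of ids[1], in a shuffled order
    ["tail", "beak", "head"],  # keypoints of ids[2]
]
perm_mats = build_perm_mat_dict(labels)
print(sorted(perm_mats))   # keys index positions in ids, not the IDs themselves
print(perm_mats[(0, 1)])   # ground truth for the pair (ids[0], ids[1])
```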

get_id_combination(cls=None, num=2)[source]

Get the image combinations and the number of combinations in the specified class.

  • cls – int or str, class of expected data. None for all classes

  • num – int, number of images in each image ID list; for example, 2 for 2GM


id_combination_list: list of combinations of image ids

length: length of combinations
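
A minimal sketch of how such combinations and their count relate, assuming combinations are unordered and drawn within a single class. The helper id_combinations and the ids_by_class mapping are hypothetical and only mirror the described return values.

```python
from itertools import combinations
from math import comb


def id_combinations(ids_by_class, cls=None, num=2):
    """Return (id_combination_list, length) for one class or for all classes.

    `ids_by_class` maps a class name to its list of image IDs (hypothetical
    input). The list holds one sub-list of combinations per selected class,
    and length is the total number of combinations.
    """
    classes = list(ids_by_class) if cls is None else [cls]
    combo_list = [list(combinations(ids_by_class[c], num)) for c in classes]
    length = sum(len(combos) for combos in combo_list)
    return combo_list, length


ids_by_class = {
    "Car": ["c1", "c2", "c3", "c4"],
    "Duck": ["d1", "d2", "d3"],
}
car_combos, car_length = id_combinations(ids_by_class, cls="Car", num=2)
all_combos, all_length = id_combinations(ids_by_class, cls=None, num=2)
print(car_length, comb(4, 2))  # the count matches the binomial coefficient
print(all_length)
```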

rand_get_data(cls=None, num=2, test=False, shuffle=True)[source]

Randomly fetch data for training or testing. Implemented by calling the get_data function.

  • cls – int or str, class of expected data. None for a random class

  • num – int, number of images; for example, 2 for 2GM

  • test – bool, whether the fetched data is used for test; if true, this function will not return ground truth

  • shuffle – bool, whether to shuffle the order of keypoints


data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs


rm_gt_cache(last_epoch=False)[source]

Remove ground truth cache.

  • last_epoch – bool, whether this epoch is the last epoch; if true, the cache directory is also removed