Benchmark

class pygmtools.benchmark.Benchmark(name, sets, obj_resize=(256, 256), problem='2GM', filter='intersection', **args)

The Benchmark module provides a unified data interface and an evaluating platform for different datasets.

Parameters
  • name – str, dataset name; currently supported: 'PascalVOC', 'WillowObject', 'IMC_PT_SparseGM', 'CUB2011', 'SPair71k'

  • sets – str, problem set, 'train' for training set and 'test' for test set

  • obj_resize – tuple, (default: (256, 256)) resized object size

  • problem – str, (default: '2GM') problem type, '2GM' for 2-graph matching and 'MGM' for multi-graph matching

  • filter – str, (default: 'intersection') filter of nodes, 'intersection' refers to retaining only common nodes; 'inclusion' is only for 2GM and refers to filtering only one graph to make its nodes a subset of the other graph, and 'unfiltered' refers to retaining all nodes in all graphs

  • args – keyword settings for specific dataset
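The three values of the filter parameter can be pictured with plain Python lists of keypoint labels. The sketch below only illustrates the semantics described above and is not the library's implementation; the function filter_keypoints, the label names, and the choice of the first graph as the one reduced under 'inclusion' are all assumptions made for illustration.

```python
def filter_keypoints(labels_a, labels_b, mode="intersection"):
    """Return the keypoint labels kept for two graphs under a filter mode.

    Hypothetical helper, not part of pygmtools.
    """
    set_a, set_b = set(labels_a), set(labels_b)
    if mode == "intersection":
        # Keep only the nodes present in both graphs.
        common = set_a & set_b
        kept_a = [k for k in labels_a if k in common]
        kept_b = [k for k in labels_b if k in common]
    elif mode == "inclusion":
        # Assumption: the first graph is the one reduced, so that its
        # nodes become a subset of the second graph's nodes.
        kept_a = [k for k in labels_a if k in set_b]
        kept_b = list(labels_b)
    elif mode == "unfiltered":
        # Keep every node in both graphs.
        kept_a, kept_b = list(labels_a), list(labels_b)
    else:
        raise ValueError(f"unknown filter mode: {mode!r}")
    return kept_a, kept_b


# Made-up keypoint labels for two images of the same class.
graph_a = ["head", "tail", "left_wing"]
graph_b = ["head", "tail", "beak"]

for mode in ("intersection", "inclusion", "unfiltered"):
    print(mode, filter_keypoints(graph_a, graph_b, mode))
```

Under 'intersection' both graphs end up with the same node set, which is why it is the only mode that guarantees a one-to-one matching between all nodes.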

Note

The ground truth cache is saved only when the parameter sets is 'test', so the functions eval() and eval_cls() can be used only with the 'test' set.

compute_img_num(classes)

Compute the number of images in the specified classes.

Parameters

classes – list of dataset classes

Returns

list of numbers of images in each class

compute_length(cls=None, num=2)

Compute the number of image combinations in the specified class.

Parameters
  • cls – int or str, class of expected data. None for all classes

  • num – int, number of images in each image ID list; for example, 2 for two-graph matching problem

Returns

length of combinations
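To give a feel for how this length grows with num, the sketch below counts unordered combinations of a few made-up image IDs using only the standard library. It assumes combinations are unordered and drawn from the images of a single class; the actual count for a given dataset may differ, so treat this as an illustration rather than the library's computation.

```python
import math
from itertools import combinations

# Hypothetical image IDs belonging to one class.
image_ids = ["img_a", "img_b", "img_c", "img_d"]

# num=2 corresponds to two-graph matching: every unordered pair of images.
pairs = list(combinations(image_ids, 2))

# num=3 would correspond to groups of three images.
triples = list(combinations(image_ids, 3))

# The number of combinations is the binomial coefficient C(n, num).
assert len(pairs) == math.comb(len(image_ids), 2)
assert len(triples) == math.comb(len(image_ids), 3)

print(len(pairs))    # 6
print(len(triples))  # 4
```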

eval(prediction, classes, verbose=False, rm_gt_cache=True)

Evaluate test results and compute matching accuracy and coverage.

Parameters
  • prediction – list, prediction result, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]

  • classes – list of evaluated classes

  • verbose – bool, whether to print the result

  • rm_gt_cache – bool, whether to remove ground truth cache

Returns

evaluation result in each class and their averages, including p, r, f1 and their standard deviation and coverage
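For readers unfamiliar with p, r and f1 in a matching context, the sketch below computes them for one predicted matching against its ground truth. It assumes the usual definitions (precision = correct matches / predicted matches, recall = correct matches / ground truth matches, f1 = their harmonic mean) and uses nested lists of 0/1 in place of np.array; the function matching_scores is hypothetical and is not the library's code.

```python
def matching_scores(pred, gt):
    """Precision, recall and F1 for two 0/1 matrices given as nested lists.

    Hypothetical helper, not part of pygmtools.
    """
    # A match is correct when both matrices have a 1 at the same position.
    correct = sum(p * g
                  for pred_row, gt_row in zip(pred, gt)
                  for p, g in zip(pred_row, gt_row))
    n_pred = sum(map(sum, pred))  # number of predicted matches
    n_gt = sum(map(sum, gt))      # number of ground truth matches

    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gt if n_gt else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Ground truth: node i of the first graph matches node i of the second.
gt = [[1, 0, 0],
      [0, 1, 0],
      [0, 0, 1]]

# Prediction: one correct match, one wrong match, one node left unmatched.
pred = [[1, 0, 0],
        [0, 0, 1],
        [0, 0, 0]]

precision, recall, f1 = matching_scores(pred, gt)
print(precision, recall, f1)
```

Here one of the two predicted matches is correct and one of the three ground truth matches is recovered, so precision is 1/2 and recall is 1/3.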

Note

If there are duplicate data pairs in prediction, this function evaluates only the first occurrence and expects it to be the first fetched pair as well. It is therefore recommended to build prediction in order and not to shuffle it.
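One way to guard against this is to scan the prediction list for repeated pairs before evaluating. The helper below is a minimal sketch, assuming each entry follows the dictionary layout shown in the parameter description; find_duplicate_pairs and the image IDs are hypothetical, and the permmat values are left as None because only the ids and cls fields matter for the check.

```python
def find_duplicate_pairs(prediction):
    """Return the (ids, cls) keys that occur more than once, in first-seen order.

    Hypothetical helper, not part of pygmtools.
    """
    seen = set()
    duplicates = []
    for entry in prediction:
        key = (tuple(entry["ids"]), entry["cls"])
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Placeholder entries; a real prediction would carry a permutation matrix.
prediction = [
    {"ids": ("img_a", "img_b"), "cls": "Car", "permmat": None},
    {"ids": ("img_a", "img_c"), "cls": "Car", "permmat": None},
    {"ids": ("img_a", "img_b"), "cls": "Car", "permmat": None},
]

print(find_duplicate_pairs(prediction))
```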

Note

Ground truth cache is saved when data pairs are fetched, and should be removed after evaluation. Make sure all data pairs are evaluated at once, i.e., prediction should contain all fetched data pairs.

eval_cls(prediction, cls, verbose=False)

Evaluate test results and compute matching accuracy and coverage on one specified class.

Parameters
  • prediction – list, prediction result on one class, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]

  • cls – str, evaluated class

  • verbose – bool, whether to print the result

Returns

evaluation result on the specified class, including p, r, f1 and their standard deviation and coverage

Note

As with eval, if there are duplicate data pairs in prediction, this function evaluates only the first occurrence and expects it to be the first fetched pair as well. It is therefore recommended to build prediction in order and not to shuffle it.

Note

This function does not automatically remove the ground truth cache. You can, however, manually call the class function rm_gt_cache to remove the ground truth cache after evaluation.

get_data(ids, test=False, shuffle=True)

Fetch a data pair, or several pairs, by image ID for training or testing.

Parameters
  • ids – list of image IDs, usually taken from train.json or test.json

  • test – bool, whether the fetched data is used for test; if true, this function will not return ground truth

  • shuffle – bool, whether to shuffle the order of keypoints

Returns

data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs
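The keys of perm_mat_dict are positions in ids, not the image IDs themselves. The sketch below shows how such keys map back to image IDs for three made-up IDs. It assumes the dictionary holds one entry per unordered pair of positions, as the example keys above suggest, and it does not call the library.

```python
from itertools import combinations

# Hypothetical image IDs, in the order they were passed to get_data.
ids = ["img_a", "img_b", "img_c"]

# One key per unordered pair of positions in ids.
index_pairs = list(combinations(range(len(ids)), 2))

# Translate each positional key back to the pair of image IDs it refers to.
key_to_ids = {(i, j): (ids[i], ids[j]) for i, j in index_pairs}

for key, pair in key_to_ids.items():
    print(key, "->", pair)
```

With two IDs there is a single key, (0, 1); with three IDs the keys are (0, 1), (0, 2) and (1, 2).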

get_id_combination(cls=None, num=2)

Get the combinations of images, and the number of combinations, in the specified class.

Parameters
  • cls – int or str, class of expected data. None for all classes

  • num – int, number of images in each image ID list; for example, 2 for 2GM

Returns

id_combination_list: list of combinations of image ids

length: length of combinations

rand_get_data(cls=None, num=2, test=False, shuffle=True)

Randomly fetch data for training or testing. Implemented by calling the get_data function.

Parameters
  • cls – int or str, class of expected data. None for random class

  • num – int, number of images; for example, 2 for 2GM

  • test – bool, whether the fetched data is used for test; if true, this function will not return ground truth

  • shuffle – bool, whether to shuffle the order of keypoints

Returns

data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs

rm_gt_cache(last_epoch=False)

Remove the ground truth cache. It is recommended to call this function after evaluation.

Parameters

last_epoch – bool, whether this is the last epoch; if true, the cache directory is also removed, and no more data should be evaluated afterwards