Benchmark

class pygmtools.benchmark.Benchmark(name, sets, obj_resize=(256, 256), problem='2GM', filter='intersection', **args)

The Benchmark module provides a unified data interface and an evaluation platform for different datasets.

Parameters

name – str, dataset name; currently supported: 'PascalVOC', 'WillowObject', 'IMC_PT_SparseGM', 'CUB2011', 'SPair71k'
sets – str, problem set, 'train' for training set and 'test' for test set
obj_resize – tuple, (default: (256, 256)) resized object size
problem – str, (default: '2GM') problem type, '2GM' for 2-graph matching and 'MGM' for multi-graph matching
filter – str, (default: 'intersection') node filter; 'intersection' retains only the nodes common to all graphs; 'inclusion' is available only for 2GM and filters one graph so that its nodes are a subset of the other graph's nodes; 'unfiltered' retains all nodes in all graphs
args – dataset-specific keyword settings
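
The three filter modes can be illustrated on plain lists of keypoint labels. This is only a minimal sketch and not the library's implementation: `filter_nodes` and the label names are hypothetical, and for 'inclusion' it assumes that the first graph is the one being filtered.

```python
def filter_nodes(labels_a, labels_b, mode="intersection"):
    """Return the node labels kept in each of two graphs under a filter mode."""
    set_a, set_b = set(labels_a), set(labels_b)
    if mode == "intersection":
        # Keep only the nodes that appear in both graphs.
        return ([x for x in labels_a if x in set_b],
                [x for x in labels_b if x in set_a])
    if mode == "inclusion":
        # Assumption: the first graph is filtered, the second is left as-is,
        # so the first graph's nodes become a subset of the second's.
        return [x for x in labels_a if x in set_b], list(labels_b)
    if mode == "unfiltered":
        # Keep every node in both graphs.
        return list(labels_a), list(labels_b)
    raise ValueError(f"unknown filter mode: {mode}")


# Hypothetical keypoint labels for two images of the same class.
a = ["head", "tail", "left_wing"]
b = ["head", "left_wing", "right_wing"]

kept_intersection = filter_nodes(a, b, "intersection")
kept_inclusion = filter_nodes(a, b, "inclusion")
kept_unfiltered = filter_nodes(a, b, "unfiltered")
```

With 'intersection' both graphs end up with the same node set, while 'inclusion' and 'unfiltered' can leave unmatched nodes in at least one graph.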

Note

The ground truth cache is saved only when the parameter sets is 'test', so the functions eval() and eval_cls() can be used only with the 'test' set.

compute_img_num(classes)

Compute the number of images in the specified classes.

Parameters: classes – list of dataset classes
Returns: list of numbers of images in each class

compute_length(cls=None, num=2)

Compute the number of image combinations in the specified class.

Parameters

cls – int or str, class of expected data. None for all classes
num – int, number of images in each image ID list; for example, 2 for two-graph matching problem

Returns

length of combinations
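
As a rough illustration of what this count looks like, the sketch below assumes the length is the number of unordered ways to pick num images from one class. That is an assumption rather than something the text above states, and datasets that ship predefined pairs may be counted differently. `count_combinations` is a hypothetical helper, not part of the library.

```python
import math


def count_combinations(num_images, num=2):
    """Number of unordered ways to choose `num` images out of `num_images`."""
    return math.comb(num_images, num)


# A class with 4 images yields 6 pairs (num=2);
# a class with 6 images yields 20 triples (num=3).
pairs = count_combinations(4, num=2)
triples = count_combinations(6, num=3)
```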

eval(prediction, classes, verbose=False, rm_gt_cache=True)

Evaluate test results and compute matching accuracy and coverage.

Parameters

prediction – list, prediction result, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]
classes – list of evaluated classes
verbose – bool, whether to print the result
rm_gt_cache – bool, whether to remove ground truth cache

Returns

evaluation result for each class and the averages over classes, including precision (p), recall (r) and F1 score (f1), their standard deviations, and coverage
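
To make the reported quantities concrete, here is a minimal sketch of precision, recall and F1 for a single predicted matching against its ground truth. It uses the standard definitions and plain nested lists in place of np.array or scipy.sparse; the library's exact computation may differ, and `match_scores` is a hypothetical helper.

```python
def match_scores(pred, gt):
    """Precision, recall and F1 for two 0/1 matching matrices (nested lists)."""
    # Matches present in both the prediction and the ground truth.
    correct = sum(p * g for p_row, g_row in zip(pred, gt)
                  for p, g in zip(p_row, g_row))
    n_pred = sum(map(sum, pred))  # number of predicted matches
    n_gt = sum(map(sum, gt))      # number of ground truth matches
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gt if n_gt else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Ground truth is the identity matching; the prediction gets 1 of 3 nodes right.
gt = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
pred = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
p, r, f1 = match_scores(pred, gt)
```

For full permutation matrices of equal size, precision and recall coincide; they diverge once nodes are left unmatched, for example under the 'unfiltered' setting.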

Note

If there are duplicate data pairs in prediction, this function will evaluate only the first pair and expects that this pair is also the first fetched pair. Therefore, it is recommended that prediction is built in an ordered manner and not shuffled.
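
One way to check a prediction list for duplicates before evaluation is sketched below. It keeps the first occurrence of each (ids, cls) pair and preserves order. `drop_duplicate_pairs` and the image IDs are hypothetical, nested lists stand in for the permutation matrices, and this is a pre-check rather than a description of what eval does internally.

```python
def drop_duplicate_pairs(prediction):
    """Keep the first entry for each (ids, cls) pair, preserving list order."""
    seen = set()
    unique = []
    for entry in prediction:
        key = (tuple(entry["ids"]), entry["cls"])
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


# Hypothetical predictions; the third entry repeats the first data pair.
prediction = [
    {"ids": ("img_a", "img_b"), "cls": "Car", "permmat": [[1, 0], [0, 1]]},
    {"ids": ("img_a", "img_c"), "cls": "Car", "permmat": [[0, 1], [1, 0]]},
    {"ids": ("img_a", "img_b"), "cls": "Car", "permmat": [[0, 1], [1, 0]]},
]
unique = drop_duplicate_pairs(prediction)
has_duplicates = len(unique) != len(prediction)
```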

Note

Ground truth cache is saved when data pairs are fetched, and should be removed after evaluation. Make sure all data pairs are evaluated at once, i.e., prediction should contain all fetched data pairs.

eval_cls(prediction, cls, verbose=False)

Evaluate test results and compute matching accuracy and coverage on one specified class.

Parameters

prediction – list, prediction result on one class, like [{'ids': (id1, id2), 'cls': cls, 'permmat': np.array or scipy.sparse}, ...]
cls – str, evaluated class
verbose – bool, whether to print the result

Returns

evaluation result on the specified class, including precision (p), recall (r) and F1 score (f1), their standard deviations, and coverage

Note

If there are duplicate data pairs in prediction, this function will evaluate only the first pair and expects that this pair is also the first fetched pair. Therefore, it is recommended that prediction is built in an ordered manner and not shuffled. This is the same behavior as the function eval.

Note

This function will not automatically remove the ground truth cache. However, you can still manually call the class function rm_gt_cache to remove the ground truth cache after evaluation.

get_data(ids, test=False, shuffle=True)

Fetch one or more data pairs by image ID for training or testing.

Parameters

ids – list of image IDs, usually from train.json or test.json
test – bool, whether the fetched data is used for testing; if true, this function will not return ground truth
shuffle – bool, whether to shuffle the order of keypoints

Returns

data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs
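
The keys of perm_mat_dict are positions in ids rather than the image IDs themselves. The sketch below shows how such keys map back to image IDs, assuming one entry per index pair (i, j) with i < j. The example above only shows (0,1) and (0,2), so the full key set is an assumption, and `pair_keys` and the IDs are hypothetical.

```python
from itertools import combinations


def pair_keys(ids):
    """Map each index pair (i, j), i < j, to the image IDs it refers to."""
    return {(i, j): (ids[i], ids[j])
            for i, j in combinations(range(len(ids)), 2)}


# Hypothetical image IDs for a 3-graph problem.
ids = ["img_a", "img_b", "img_c"]
keys = pair_keys(ids)
# keys[(0, 2)] is the pair (ids[0], ids[2]).
```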

get_id_combination(cls=None, num=2)

Get the combinations of images in the specified class, together with the number of combinations.

Parameters

cls – int or str, class of expected data. None for all classes
num – int, number of images in each image ID list; for example, 2 for 2GM

Returns

id_combination_list: list of combinations of image ids

length: length of combinations
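
The sketch below shows one plausible shape for these return values, built with itertools.combinations over a hypothetical mapping from class name to image IDs. It returns a single flat list, which is an assumption: the library may group combinations per class or use predefined pairs for some datasets. `id_combinations` is not a library function.

```python
from itertools import combinations


def id_combinations(ids_by_class, cls=None, num=2):
    """Enumerate `num`-sized ID combinations within one class, or all classes."""
    classes = list(ids_by_class) if cls is None else [cls]
    combos = [
        combo
        for name in classes
        for combo in combinations(ids_by_class[name], num)
    ]
    return combos, len(combos)


# Hypothetical dataset index: class name -> image IDs.
ids_by_class = {"Car": ["c1", "c2", "c3"], "Duck": ["d1", "d2"]}

car_combos, car_length = id_combinations(ids_by_class, cls="Car")
all_combos, all_length = id_combinations(ids_by_class)
```

Combinations are formed within a class only, so no pair mixes images from different classes.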

rand_get_data(cls=None, num=2, test=False, shuffle=True)

Randomly fetch data for training or testing. Implemented by calling the get_data function.

Parameters

cls – int or str, class of expected data. None for random class
num – int, number of images; for example, 2 for 2GM
test – bool, whether the fetched data is used for testing; if true, this function will not return ground truth
shuffle – bool, whether to shuffle the order of keypoints

Returns

data_list: list of data, like [{'img': np.array, 'kpts': coordinates of kpts}, ...]

perm_mat_dict: ground truth, like {(0,1):scipy.sparse, (0,2):scipy.sparse, ...}, (0,1) refers to data pair (ids[0],ids[1])

ids: list of image IDs

rm_gt_cache(last_epoch=False)

Remove the ground truth cache. It is recommended to call this function after evaluation.

Parameters: last_epoch – bool, whether this is the last epoch; if true, the cache directory is also removed, and no more data should be evaluated