mrec.evaluation Package

preprocessing Module

class mrec.evaluation.preprocessing.SplitCreator(test_size, normalize=False, discard_zeros=False, sample_before_thresholding=False)

Bases: object

Split ratings for a user randomly into train and test groups. Only items with positive scores will be included in the test group.

Parameters:

test_size : float

If test_size >= 1 this specifies the absolute number of items to put in the test group; if test_size < 1 then this specifies the test proportion.

normalize : bool (default: False)

If True, scale training scores for each user to have unit norm.

discard_zeros : bool (default: False)

If True then discard items with zero scores; if False then retain them in the training group. This should normally be False, as such items have been seen (even if not liked), so the training set should include them in order to determine which items are actually novel at recommendation time.

sample_before_thresholding : bool (default: False)

If True then consider any item seen by the user for inclusion in the test group, even though only items with a positive score will be selected. If the input includes items with zero scores, the test set may therefore be smaller than the requested size for some users, even though they have apparently seen enough items.

Methods

handle(vals)
num_train(vals)
pos_neg_vals(vals)
split(vals)
stratified_split(vals)
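
The split semantics above can be illustrated with a short sketch. This is not the library's implementation: the assumption that a user's ratings arrive as (item, score) pairs, the function name and the return convention are all illustrative.

    import numpy as np

    def illustrative_split(vals, test_size, discard_zeros=False, normalize=False):
        """Sketch of the split semantics for a single user's (item, score) pairs."""
        positive = [(item, score) for item, score in vals if score > 0]
        # test_size >= 1 is an absolute count, test_size < 1 a proportion
        if test_size >= 1:
            num_test = int(test_size)
        else:
            num_test = int(round(test_size * len(positive)))
        order = np.random.permutation(len(positive))
        test = [positive[i] for i in order[:num_test]]
        test_items = set(item for item, _ in test)
        train = [(item, score) for item, score in vals if item not in test_items]
        if discard_zeros:
            train = [(item, score) for item, score in train if score != 0]
        if normalize:
            # scale the user's training scores to unit norm
            norm = np.sqrt(sum(score ** 2 for _, score in train))
            if norm > 0:
                train = [(item, score / norm) for item, score in train]
        return train, test
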
class mrec.evaluation.preprocessing.TSVParser(thresh=0, binarize=False, delimiter='\t')

Bases: object

Parses tsv input: user, item, score.

Parameters:

thresh : float (default: 0)

Set scores below this to zero.

binarize : bool (default: False)

If True, set all non-zero scores to 1.

Methods

parse(line)
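
A minimal sketch of the thresholding and binarization described above, assuming each line is a tab-separated user, item, score triple; the function name and return format are illustrative, not the parser's actual API.

    def illustrative_parse(line, thresh=0, binarize=False, delimiter='\t'):
        """Parse one 'user<TAB>item<TAB>score' line, applying thresh/binarize."""
        user, item, score = line.rstrip('\n').split(delimiter)[:3]
        score = float(score)
        if score < thresh:
            score = 0.0          # scores below the threshold are zeroed
        if binarize and score != 0:
            score = 1.0          # all remaining non-zero scores become 1
        return user, item, score

    # e.g. illustrative_parse('7\t42\t3.5', thresh=4) -> ('7', '42', 0.0)
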

metrics Module

Metrics to evaluate recommendations:

* with hit rate, following e.g. Karypis lab SLIM and FISM papers
* with prec@k and MRR

mrec.evaluation.metrics.compute_hit_rate(recommended, known)
mrec.evaluation.metrics.compute_main_metrics(recommended, known)
mrec.evaluation.metrics.evaluate(model, train, users, get_known_items, compute_metrics)
mrec.evaluation.metrics.generate_metrics(get_known_items, compute_metrics)
class mrec.evaluation.metrics.get_known_items_from_csr_matrix(data)

Bases: object

class mrec.evaluation.metrics.get_known_items_from_dict(data)

Bases: object

class mrec.evaluation.metrics.get_known_items_from_thresholded_csr_matrix(data, min_value)

Bases: object

mrec.evaluation.metrics.hit_rate(predicted, true, k)

Compute hit rate, i.e. recall@k assuming a single test item.

Parameters:

predicted : array like

Predicted items.

true : array like

Containing the single true test item.

k : int

Measure hit rate@k.

Returns:

hitrate : int

1 if true is amongst predicted, 0 if not.
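
The computation reduces to checking whether the single test item appears among the top-k predictions; a minimal sketch (not the library source):

    def hit_rate_sketch(predicted, true, k):
        """Return 1 if the single true item is in the top-k predictions, else 0."""
        return 1 if true[0] in list(predicted)[:k] else 0
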

mrec.evaluation.metrics.prec(predicted, true, k, ignore_missing=False)

Compute precision@k.

Parameters:

predicted : array like

Predicted items.

true : array like

True items.

k : int

Measure precision@k.

ignore_missing : bool (default: False)

If True then measure precision only up to rank len(predicted), even if this is less than k; otherwise assume that the missing predictions were all incorrect.

Returns:

prec@k : float

Precision at k.
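
A sketch of the documented behaviour, including the ignore_missing denominator; illustrative only, not the library source:

    def prec_sketch(predicted, true, k, ignore_missing=False):
        """Precision@k over the top-k predictions."""
        top_k = list(predicted)[:k]
        hits = len(set(top_k) & set(true))
        # ignore_missing: score only the predictions actually made;
        # otherwise missing predictions count as incorrect
        denominator = len(top_k) if ignore_missing else k
        return float(hits) / denominator if denominator else 0.0
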

mrec.evaluation.metrics.print_report(models, metrics)

Call this to print out the metrics returned by run_evaluation().

mrec.evaluation.metrics.retrain_recommender(model, dataset)
mrec.evaluation.metrics.rr(predicted, true)

Compute Reciprocal Rank.

Parameters:

predicted : array like

Predicted items.

true : array like

True items.

Returns:

rr : float

Reciprocal of rank at which first true item is found in predicted.

Notes

We’ll under-report this, as our predictions are truncated.
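
A sketch of reciprocal rank as described (not the library source); the truncation note above corresponds to the fall-through return of 0:

    def rr_sketch(predicted, true):
        """Reciprocal of the rank of the first true item found in predicted."""
        true_set = set(true)
        for rank, item in enumerate(predicted, start=1):
            if item in true_set:
                return 1.0 / rank
        return 0.0  # no true item appears in the (truncated) predictions
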

mrec.evaluation.metrics.run_evaluation(models, retrain, get_split, num_runs, evaluation_func)

This is the main entry point to run an evaluation.

Supply functions to retrain model, to get a new split of data on each run, to get known items from the test set, and to compute the metrics you want:

- retrain(model, dataset) should retrain model
- get_split() should return train_data, test_users, test_data
- evaluation_func(model, users, test) should return a dict of metrics

A number of suitable functions are already available in the module.
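
A skeleton showing how these pieces might fit together, using only the signatures documented on this page; my_model, its recommend() call, make_split() and the assumption that test maps each user to their held-out items are placeholders for your own code:

    from mrec.evaluation.metrics import (run_evaluation, retrain_recommender,
                                         print_report, prec, rr)

    def get_split():
        # placeholder: return (train_data, test_users, test_data) however you build them
        return make_split()

    def evaluation_func(model, users, test):
        # must return a dict of metric name -> value
        totals = {'prec@5': 0.0, 'mrr': 0.0}
        for u in users:
            recommended = model.recommend(u)  # placeholder for your model's ranked item list
            known = test[u]                   # assumes test maps user -> held-out items
            totals['prec@5'] += prec(recommended, known, 5)
            totals['mrr'] += rr(recommended, known)
        return {name: total / len(users) for name, total in totals.items()}

    models = [my_model]  # placeholder recommender instance(s)
    metrics = run_evaluation(models, retrain_recommender, get_split,
                             num_runs=3, evaluation_func=evaluation_func)
    print_report(models, metrics)
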

mrec.evaluation.metrics.sort_metrics_by_name(names)