mrec.examples Package

train Module

Train an item similarity model in parallel on an IPython cluster. We assume a shared filesystem (as you’ll have when running locally or on an AWS cluster launched with StarCluster), so that data does not have to be passed between the controller and the worker engines; passing it can cause out-of-memory (OOM) problems on the controller.

You can specify multiple training sets and the model will learn a separate similarity matrix for each input dataset: this makes it easy to generate data for cross-validated evaluation.
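
As a rough illustration of the one-model-per-training-set idea, the sketch below expands a glob pattern into training files and pairs each with its own output path. The helper name plan_training_jobs and the '.model' suffix are assumptions made for this example, not part of mrec; the real output names come from the filename_conventions module.

```python
import glob
import os


def plan_training_jobs(pattern, outdir):
    """Hypothetical helper: pair each training file matching `pattern`
    with a separate output path under `outdir`."""
    trainfiles = sorted(glob.glob(pattern))
    jobs = []
    for trainfile in trainfiles:
        # Assumed naming: the '.model' suffix is illustrative only.
        outfile = os.path.join(outdir, os.path.basename(trainfile) + '.model')
        jobs.append((trainfile, outfile))
    return jobs
```

With one training file per cross-validation fold, a pattern such as 'u.data.train.*' would yield one job, and so one similarity matrix, per fold.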

mrec.examples.train.main()

predict Module

Make and evaluate recommendations in parallel on an IPython cluster, using models that have previously been trained and saved to file. We assume a shared filesystem (as you’ll have when running locally or on an AWS cluster launched with StarCluster), so that data does not have to be passed between the controller and the worker engines; passing it can cause out-of-memory (OOM) problems on the controller.

You can specify multiple training sets / models and separate recommendations will be output and evaluated for each of them: this makes it easy to run a cross-validated evaluation.
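
The shared-filesystem assumption means a task only needs to carry file paths and a range of users, never the data itself. The following is a minimal sketch of that idea, not the actual create_tasks implementation; the function name make_tasks and its arguments (num_users, users_per_task) are hypothetical.

```python
def make_tasks(modelfile, trainfile, testfile, outdir, num_users, users_per_task):
    """Hypothetical sketch: return one task per block of users.

    Each task holds only file paths plus a half-open user range
    [start, end), so the controller never ships data to the engines.
    """
    if users_per_task <= 0:
        raise ValueError('users_per_task must be positive')
    tasks = []
    for start in range(0, num_users, users_per_task):
        end = min(num_users, start + users_per_task)
        tasks.append((modelfile, trainfile, testfile, outdir, start, end))
    return tasks
```

Each worker would then load the model and data from the shared filesystem itself and process only its own user range. How many users fit in one task presumably depends on the memory budget, which is what the mb_per_task argument below suggests.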

mrec.examples.predict.create_tasks(modelfile, input_format, trainfile, test_input_format, testfile, item_feature_format, featurefile, outdir, mb_per_task, done, evaluator)
mrec.examples.predict.estimate_users_per_task(mb_per_task, input_format, trainfile, modelfile)
mrec.examples.predict.find_done(outdir)
mrec.examples.predict.get_dataset_size(input_format, datafile)
mrec.examples.predict.main()
mrec.examples.predict.process(view, opts, modelfile, trainfile, testfile, featurefile, outdir, evaluator)

evaluate Module

Evaluate precomputed recommendations for one or more training/test sets. Test and recommendation files must follow the naming conventions, which are defined relative to the training filepaths.

mrec.examples.evaluate.main()

filename_conventions Module

File naming conventions:

  • training files must contain ‘train’ in their filename.
  • the corresponding test files must have the same filepaths, but with ‘test’ in place of ‘train’ in their filenames.
  • models, similarity matrices and recommendations will be written to filenames based on the training file.
mrec.examples.filename_conventions.get_factorsdir(trainfile, outdir)
mrec.examples.filename_conventions.get_modelfile(trainfile, outdir)
mrec.examples.filename_conventions.get_modelsdir(trainfile, outdir)
mrec.examples.filename_conventions.get_recsdir(trainfile, outdir)
mrec.examples.filename_conventions.get_recsfile(trainfile, outdir)
mrec.examples.filename_conventions.get_simsdir(trainfile, outdir)
mrec.examples.filename_conventions.get_simsfile(trainfile, outdir)
mrec.examples.filename_conventions.get_sortedfile(infile, outdir)
mrec.examples.filename_conventions.get_splitfile(infile, outdir, split_type, i)
mrec.examples.filename_conventions.get_testfile(trainfile)