irspack.recommenders.BPRFMRecommender

class irspack.recommenders.BPRFMRecommender(X_train_all, n_components=128, item_alpha=1e-09, user_alpha=1e-09, loss='bpr', n_threads=None, train_epochs=128)[source]

Bases: BaseRecommenderWithEarlyStopping, BaseRecommenderWithUserEmbedding, BaseRecommenderWithItemEmbedding

A LightFM wrapper for our interface.

This will create a LightFM instance by

fm = LightFM(
    no_components=n_components,
    item_alpha=item_alpha,
    user_alpha=user_alpha,
    loss=loss,
)

and run fm.fit_partial(X, num_threads=self.n_threads) to train for a single epoch.

Parameters:
  • X_train_all (csr_matrix) – Input interaction matrix.

  • n_components (int) – The dimension of the latent factors. Defaults to 128.

  • item_alpha (float) – The regularization coefficient for item factors. Defaults to 1e-9.

  • user_alpha (float) – The regularization coefficient for user factors. Defaults to 1e-9.

  • loss (str) – Specifies the loss function type of LightFM. Must be one of {“bpr”, “warp”}. Defaults to “bpr”.

  • train_epochs (int) – Number of training epochs. Defaults to 128.

  • n_threads (Optional[int]) – Specifies the number of threads to use for the computation. If None, the environment variable "IRSPACK_NUM_THREADS_DEFAULT" will be looked up, and if the variable is not set, it will be set to os.cpu_count(). Defaults to None.
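The fallback order described for n_threads can be sketched as follows. This is a minimal illustration of the documented behavior, not the library's own code: resolve_n_threads is a hypothetical helper, and falling back to 1 when os.cpu_count() returns None is an assumption.

```python
import os
from typing import Optional


def resolve_n_threads(n_threads: Optional[int] = None) -> int:
    """Hypothetical helper that mirrors the documented fallback order."""
    if n_threads is not None:
        # An explicit argument always wins.
        return n_threads
    env_value = os.environ.get("IRSPACK_NUM_THREADS_DEFAULT")
    if env_value is not None:
        # Next, the environment variable is consulted.
        return int(env_value)
    # Finally, fall back to the CPU count (assumed to be 1 if it is unknown).
    return os.cpu_count() or 1
```

Setting IRSPACK_NUM_THREADS_DEFAULT once in the environment is therefore enough to cap the thread count for every recommender constructed with n_threads=None.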

__init__(X_train_all, n_components=128, item_alpha=1e-09, user_alpha=1e-09, loss='bpr', n_threads=None, train_epochs=128)[source]
Parameters:
  • X_train_all (Union[csr_matrix, csc_matrix]) –

  • n_components (int) –

  • item_alpha (float) –

  • user_alpha (float) –

  • loss (str) –

  • n_threads (Optional[int]) –

  • train_epochs (int) –

Methods

__init__(X_train_all[, n_components, ...])

default_suggest_parameter(trial, fixed_params)

from_config(X_train_all, config)

get_item_embedding()

Get item embedding vectors.

get_score(index)

Compute the item recommendation score for a subset of users.

get_score_block(begin, end)

Compute the score for a block of the users.

get_score_cold_user(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

get_score_cold_user_remove_seen(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

get_score_from_item_embedding(user_indices, ...)

get_score_from_user_embedding(user_embedding)

Compute the item score from user embedding.

get_score_remove_seen(user_indices)

Compute the item score and mask the items in the training set.

get_score_remove_seen_block(begin, end)

Compute the score for a block of the users, and mask the items in the training set.

get_user_embedding()

Get user embedding vectors.

learn()

Learns and returns itself.

learn_with_optimizer(evaluator, trial[, ...])

Learning procedures with early stopping and pruning.

load_state()

run_epoch()

save_state()

start_learning()

tune(data, evaluator[, n_trials, timeout, ...])

Perform the optimization step.

tune_with_study(study, data, evaluator[, ...])

Attributes

default_tune_range

fm

X_train_all: sps.csr_matrix

The matrix to feed into recommender.

get_item_embedding()[source]

Get item embedding vectors.

Returns:

The latent vector representation of items. Its number of rows is equal to the number of the items.

Return type:

ndarray

get_score(index)[source]

Compute the item recommendation score for a subset of users.

Parameters:
  • index (ndarray) – The indices that define the subset of users.

Returns:

The item scores. Its shape will be (len(index), self.n_items)

Return type:

ndarray

get_score_block(begin, end)[source]

Compute the score for a block of the users.

Parameters:
  • begin (int) – where the evaluated user block begins.

  • end (int) – where the evaluated user block ends.

Returns:

The item scores. Its shape will be (end - begin, self.n_items)

Return type:

ndarray

get_score_cold_user(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

Parameters:

X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.

Returns:

Computed item scores for users. Its shape is equal to that of X.

Return type:

ndarray

get_score_cold_user_remove_seen(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix. The score will then be masked by the input.

Parameters:

X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.

Returns:

Computed and masked item scores for users. Its shape is equal to that of X.

Return type:

ndarray

get_score_from_user_embedding(user_embedding)[source]

Compute the item score from user embedding. Mainly used in cold-start scenarios.

Parameters:

user_embedding (ndarray) – Latent user representation obtained elsewhere.

Returns:

The score array. Its shape will be (user_embedding.shape[0], self.n_items)

Return type:

DenseScoreArray
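The shape relationship above can be illustrated with plain NumPy. The sketch assumes the score is a simple inner product between user and item vectors; any bias terms the wrapped LightFM model may add are ignored, so this is not necessarily what the method computes internally. All array names and values are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_components = 5, 3

# Stand-ins for the array returned by get_item_embedding() and for
# user vectors obtained elsewhere; the values here are random.
item_embedding = rng.normal(size=(n_items, n_components))
user_embedding = rng.normal(size=(2, n_components))

# Assumed scoring rule: inner product of user and item vectors.
scores = user_embedding @ item_embedding.T

# One row per user vector, one column per item.
print(scores.shape)
```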

get_score_remove_seen(user_indices)

Compute the item score and mask the items in the training set. Masked items will have the score -inf.

Parameters:

user_indices (ndarray) – Specifies the subset of users.

Returns:

The masked item scores. Its shape will be (len(user_indices), self.n_items)

Return type:

ndarray
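The masking step can be sketched with NumPy alone. The example below uses a small dense array as a stand-in for the sparse training matrix and made-up scores; it only illustrates the documented effect (seen items receive -inf) and is not the library's implementation.

```python
import numpy as np

# Toy training interactions for 3 users and 4 items
# (a dense stand-in for the sparse X_train_all).
X_train = np.array([
    [1, 0, 0, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
# Made-up scores for all users, one row per user.
scores = np.arange(12, dtype=np.float64).reshape(3, 4)

user_indices = np.array([0, 2])
masked = scores[user_indices].copy()
# Items with a nonzero entry in the training matrix get -inf.
masked[X_train[user_indices] != 0] = -np.inf

# The best-scoring item per user is now an unseen one,
# provided the user has at least one unseen item.
top_item = masked.argmax(axis=1)
```

Because masked entries are -inf, any top-k selection over the returned rows skips items the user already interacted with.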

get_score_remove_seen_block(begin, end)

Compute the score for a block of the users, and mask the items in the training set. Masked items will have the score -inf.

Parameters:
  • begin (int) – where the evaluated user block begins.

  • end (int) – where the evaluated user block ends.

Returns:

The masked item scores. Its shape will be (end - begin, self.n_items)

Return type:

ndarray

get_user_embedding()[source]

Get user embedding vectors.

Returns:

The latent vector representation of users. Its number of rows is equal to the number of the users.

Return type:

ndarray

learn()

Learns and returns itself.

Returns:

The model after fitting process.

Parameters:

self (R) –

Return type:

R

learn_with_optimizer(evaluator, trial, max_epoch=128, validate_epoch=5, score_degradation_max=5)

Learning procedures with early stopping and pruning.

Parameters:
  • evaluator (Optional[evaluation.Evaluator]) – The evaluator to measure the score.

  • trial (Optional[Trial]) – The current optuna trial under the study (if any).

  • max_epoch (int) – Maximal number of epochs. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 128.

  • validate_epoch (int) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

  • score_degradation_max (int) – Maximum number of allowed score degradations. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

Return type:

None
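How max_epoch, validate_epoch and score_degradation_max might interact is sketched below. This is a hypothetical loop written from the parameter descriptions, not the library's code: train_with_early_stopping, its callbacks, and the choice to reset the degradation counter on improvement are all assumptions, and Optuna pruning and state saving are left out.

```python
from typing import Callable, List, Optional


def train_with_early_stopping(
    run_epoch: Callable[[], None],
    validate: Callable[[], float],
    max_epoch: int = 128,
    validate_epoch: int = 5,
    score_degradation_max: int = 5,
) -> Optional[int]:
    """Hypothetical loop; returns the epoch with the best validation score."""
    best_score = float("-inf")
    best_epoch: Optional[int] = None
    n_degradation = 0
    for epoch in range(1, max_epoch + 1):
        run_epoch()
        if epoch % validate_epoch != 0:
            continue  # validate only every `validate_epoch` epochs
        score = validate()
        if score > best_score:
            best_score, best_epoch = score, epoch
            n_degradation = 0  # assumption: the counter resets on improvement
        else:
            n_degradation += 1
            if n_degradation >= score_degradation_max:
                break  # too many validations without improvement
    return best_epoch


# Toy run: the validation score peaks at the second validation, then decays.
validation_scores = iter([0.1, 0.3, 0.2, 0.2, 0.1])
epochs_run: List[int] = []
best = train_with_early_stopping(
    run_epoch=lambda: epochs_run.append(1),
    validate=lambda: next(validation_scores),
    max_epoch=100,
    validate_epoch=5,
    score_degradation_max=2,
)
```

In this toy run, training stops well before max_epoch because two consecutive validations fail to beat the best score.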

classmethod tune(data, evaluator, n_trials=20, timeout=None, data_suggest_function=None, parameter_suggest_function=None, fixed_params={}, random_seed=None, prunning_n_startup_trials=10, max_epoch=128, validate_epoch=5, score_degradation_max=5, logger=None)

Perform the optimization step. optuna.Study object is created inside this function.

Parameters:
  • data (Optional[Union[csr_matrix, csc_matrix]]) – The training data. You can also provide training data that depends on the tunable parameters by passing data_suggest_function. In that case, data must be None.

  • evaluator (evaluation.Evaluator) – The validation evaluator that measures the performance of the recommenders.

  • n_trials (int) – The number of expected trials (including pruned ones). Defaults to 20.

  • timeout (Optional[int]) – If set to some value (in seconds), the study will exit after that time period. Note that running trials are not interrupted, though. Defaults to None.

  • data_suggest_function (Optional[Callable[[Trial], Union[csr_matrix, csc_matrix]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns training data. Defaults to None.

  • parameter_suggest_function (Optional[Callable[[Trial], Dict[str, Any]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns Dict[str, Any] (i.e., some keyword arguments of the recommender class). If None, cls.default_suggest_parameter will be used for the parameter suggestion. Defaults to None.

  • fixed_params (Dict[str, Any]) – Fixed parameters passed to recommenders during the optimization procedure. This will replace the suggested parameter (either by cls.default_suggest_parameter or parameter_suggest_function). Defaults to dict().

  • random_seed (Optional[int]) – The random seed to control optuna.samplers.TPESampler. Defaults to None.

  • prunning_n_startup_trials (int) – n_startup_trials argument passed to the constructor of optuna.pruners.MedianPruner. Defaults to 10.

  • max_epoch (int) – The maximal number of epochs for the training. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 128.

  • validate_epoch (int, optional) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

  • score_degradation_max (int, optional) – Maximum number of allowed score degradations. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

  • logger (Optional[Logger]) –

Returns:

A tuple that consists of

  1. A dict containing the best parameters. This dict can be passed to the recommender as **kwargs.

  2. A pandas.DataFrame that contains the history of optimization.

Return type:

Tuple[Dict[str, Any], DataFrame]