irspack.recommenders.IALSRecommender

class irspack.recommenders.IALSRecommender(X_train_all, n_components=20, alpha0=0.0, reg=0.001, nu=1.0, confidence_scaling='none', epsilon=1.0, init_std=0.1, solver_type='CG', max_cg_steps=3, ialspp_subspace_dimension=64, loss_type='IALSPP', nu_star=None, random_seed=42, n_threads=None, train_epochs=16, prediction_time_max_cg_steps=5, prediction_time_ialspp_iteration=7)[source]

Bases: BaseRecommenderWithEarlyStopping, BaseRecommenderWithUserEmbedding, BaseRecommenderWithItemEmbedding

Implementation of implicit Alternating Least Squares (iALS) or Weighted Matrix Factorization (WMF).

By default, it tries to minimize the following loss:

\[\frac{1}{2} \sum _{u, i \in S} c_{ui} (\mathbf{u}_u \cdot \mathbf{v}_i - 1) ^ 2 + \frac{\alpha_0}{2} \sum_{u, i} (\mathbf{u}_u \cdot \mathbf{v}_i) ^ 2 + \frac{\text{reg}}{2} \left( \sum_u (\alpha_0 I + N_u) ^ \nu || \mathbf{u}_u || ^2 + \sum_i (\alpha_0 U + N_i) ^ \nu || \mathbf{v}_i || ^2 \right)\]

where \(S\) denotes the set of all pairs where \(X_{ui}\) is non-zero.
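
As a rough illustration, the loss above can be evaluated directly on a tiny dense example. This is only a sketch under stated assumptions, not the library's implementation: it takes \(c_{ui} = X_{ui}\) (i.e. confidence_scaling="none"), reads \(I\) and \(U\) as the numbers of items and users, and reads \(N_u\) and \(N_i\) as the numbers of non-zero entries per user and per item. The function name ials_loss is hypothetical.

```python
def ials_loss(X, U, V, alpha0=0.0, reg=1e-3, nu=1.0):
    """Evaluate the default loss for a small dense example.

    X is a list of rows (users x items); U and V are lists of latent vectors.
    Assumes c_ui = X_ui and that N_u / N_i count non-zero entries.
    """
    n_users, n_items = len(X), len(X[0])

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    observed = 0.0   # sum over S of c_ui * (u.v - 1)^2
    all_pairs = 0.0  # sum over every (u, i) of (u.v)^2
    for u in range(n_users):
        for i in range(n_items):
            s = dot(U[u], V[i])
            all_pairs += s ** 2
            if X[u][i] != 0:
                observed += X[u][i] * (s - 1.0) ** 2

    # Frequency-scaled L2 penalty on user and item vectors.
    n_u = [sum(1 for x in row if x != 0) for row in X]
    n_i = [sum(1 for u in range(n_users) if X[u][i] != 0) for i in range(n_items)]
    penalty = sum(
        (alpha0 * n_items + n_u[u]) ** nu * dot(U[u], U[u]) for u in range(n_users)
    )
    penalty += sum(
        (alpha0 * n_users + n_i[i]) ** nu * dot(V[i], V[i]) for i in range(n_items)
    )
    return 0.5 * observed + 0.5 * alpha0 * all_pairs + 0.5 * reg * penalty


X = [[1, 0], [0, 1]]
U = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
loss = ials_loss(X, U, V, alpha0=0.5, reg=0.1, nu=1.0)
```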

See the seminal paper:

By default it uses a conjugate gradient descent version:

The loss above is slightly different from the original version. See the following paper for the loss used here

Parameters:
  • X_train_all (Union[scipy.sparse.csr_matrix, scipy.sparse.csc_matrix]) – Input interaction matrix.

  • n_components (int, optional) – The dimension for latent factor. Defaults to 20.

  • alpha0 (float, optional) – The “unobserved” weight.

  • reg (float, optional) – Regularization coefficient for both user & item factors. Defaults to 1e-3.

  • nu (float, optional) – Controls the frequency-based regularization introduced in the paper “Revisiting the Performance of iALS on Item Recommendation Benchmarks”. Defaults to 1.0.

  • confidence_scaling (str, optional) –

    Specifies how to compute the confidence \(c_{ui}\). Must be either “none” or “log”. If “none”, a non-zero (not necessarily 1) \(X_{ui}\) yields

    \[c_{ui} = A + X_{ui}\]

    If “log”,

    \[c_{ui} = A + \log (1 + X_{ui} / \epsilon )\]

    The constant \(A\) above will be 0 if loss_type is "IALSPP", \(\alpha_0\) if loss_type is "ORIGINAL".

    Defaults to “none”.

  • epsilon (float, optional) – The \(\epsilon\) parameter for log-scaling described above. Has no effect if confidence_scaling is “none”. Defaults to 1.0.

  • init_std (float, optional) – Standard deviation of the normal distribution used for initialization. The actual std of each user/item vector component is scaled by 1 / n_components ** .5. Defaults to 0.1.

  • solver_type ("CHOLESKY" | "CG" | "IALSPP", optional) – Which solver to use. Defaults to “CG”.

  • max_cg_steps (int, optional) – Maximal number of conjugate gradient descent steps during training. Defaults to 3. Used only when solver_type == "CG". Increasing this parameter brings the result closer to that of the Cholesky decomposition method (i.e., solver_type == "CHOLESKY"), but training will take longer.

  • ialspp_subspace_dimension (int, optional) – The subspace dimension of iALS++ (ignored if the solver_type is not “IALSPP”). If this value is 1, specialized implementation described in Fast Matrix Factorization for Online Recommendation with Implicit Feedback will be used instead. Defaults to 64.

  • loss_type (Literal["IALSPP", "ORIGINAL"], optional) – Specifies the subtle difference between iALS++ vs Original Loss.

  • nu_star (Optional[float], optional) – If not None, used as the reference scale for nu described in the “Revisiting…” paper. Defaults to None.

  • random_seed (int, optional) – The random seed to initialize the parameters.

  • n_threads (Optional[int], optional) – Specifies the number of threads to use for the computation. If None, the environment variable "IRSPACK_NUM_THREADS_DEFAULT" will be looked up, and if the variable is not set, it will be set to os.cpu_count(). Defaults to None.

  • train_epochs (int, optional) – Maximal number of epochs. Defaults to 16.

  • prediction_time_max_cg_steps (int, optional) – Maximal number of conjugate gradient descent steps at prediction time, i.e., when a user unseen during training is given as a history matrix. Defaults to 5.

  • prediction_time_ialspp_iteration (int) –
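
The confidence_scaling rule described in the parameter list can be written out as a small function. This is a sketch of the two documented formulas only; the function name confidence is hypothetical and is not part of irspack.

```python
import math


def confidence(x_ui, confidence_scaling="none", epsilon=1.0,
               alpha0=0.0, loss_type="IALSPP"):
    """Return c_ui for a non-zero interaction value x_ui."""
    # The constant A is 0 for the "IALSPP" loss and alpha0 for "ORIGINAL".
    a = 0.0 if loss_type == "IALSPP" else alpha0
    if confidence_scaling == "none":
        return a + x_ui
    if confidence_scaling == "log":
        return a + math.log(1.0 + x_ui / epsilon)
    raise ValueError('confidence_scaling must be "none" or "log"')
```

With “log” scaling, large interaction counts are damped: an interaction value of 100 contributes a confidence of about 4.6 rather than 100 when epsilon is 1.0.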

Examples

>>> from irspack import IALSRecommender, rowwise_train_test_split, Evaluator
>>> from irspack.utils.sample_data import mf_example_data
>>> X = mf_example_data(100, 30, random_state=1)
>>> X_train, X_test = rowwise_train_test_split(X, random_state=0)
>>> rec = IALSRecommender(X_train)
>>> rec.learn()
>>> evaluator=Evaluator(X_test)
>>> print(evaluator.get_scores(rec, [20]))
OrderedDict([('hit@20', 1.0), ('recall@20', 0.9003412698412698), ('ndcg@20', 0.6175493479217139), ('map@20', 0.3848785870622406), ('precision@20', 0.3385), ('gini_index@20', 0.0814), ('entropy@20', 3.382497875272383), ('appeared_item@20', 30.0)])
__init__(X_train_all, n_components=20, alpha0=0.0, reg=0.001, nu=1.0, confidence_scaling='none', epsilon=1.0, init_std=0.1, solver_type='CG', max_cg_steps=3, ialspp_subspace_dimension=64, loss_type='IALSPP', nu_star=None, random_seed=42, n_threads=None, train_epochs=16, prediction_time_max_cg_steps=5, prediction_time_ialspp_iteration=7)[source]
Parameters:
  • X_train_all (Union[csr_matrix, csc_matrix]) –

  • n_components (int) –

  • alpha0 (float) –

  • reg (float) –

  • nu (float) –

  • confidence_scaling (str) –

  • epsilon (float) –

  • init_std (float) –

  • solver_type (typing_extensions.Literal[CG, CHOLESKY, IALSPP]) –

  • max_cg_steps (int) –

  • ialspp_subspace_dimension (int) –

  • loss_type (typing_extensions.Literal[IALSPP, ORIGINAL]) –

  • nu_star (Optional[float]) –

  • random_seed (int) –

  • n_threads (Optional[int]) –

  • train_epochs (int) –

  • prediction_time_max_cg_steps (int) –

  • prediction_time_ialspp_iteration (int) –

Return type:

None

Methods

__init__(X_train_all[, n_components, ...])

compute_item_embedding(X)

Given unknown items' interactions with known users, computes the latent factors of the items by least squares (fixing user embeddings).

compute_user_embedding(X)

Given unknown users' interactions with known items, computes the latent factors of the users by least squares (fixing item embeddings).

default_suggest_parameter(trial, fixed_params)

from_config(X_train_all, config)

get_item_embedding()

Get item embedding vectors.

get_score(user_indices)

Compute the item recommendation score for a subset of users.

get_score_block(begin, end)

Compute the score for a block of the users.

get_score_cold_user(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

get_score_cold_user_remove_seen(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

get_score_from_item_embedding(user_indices, ...)

get_score_from_user_embedding(user_embedding)

Compute the item score from user embedding.

get_score_remove_seen(user_indices)

Compute the item score and mask the items in the training set.

get_score_remove_seen_block(begin, end)

Compute the score for a block of the users, and mask the items in the training set.

get_user_embedding()

Get user embedding vectors.

learn()

Learns and returns itself.

learn_with_optimizer(evaluator, trial[, ...])

Learning procedures with early stopping and pruning.

load_state()

run_epoch()

save_state()

start_learning()

tune(data, evaluator[, n_trials, timeout, ...])

Perform the optimization step.

tune_doubling_dimension(data, evaluator, ...)

Perform tuning gradually doubling n_components.

tune_with_study(study, data, evaluator[, ...])

Attributes

default_tune_range

trainer_as_ials

X_train_all: sps.csr_matrix

The matrix to feed into recommender.

compute_item_embedding(X)[source]

Given unknown items’ interactions with known users, computes the latent factors of the items by least squares (fixing user embeddings).

Parameters:

X (Union[csr_matrix, csc_matrix]) – The interaction history of the new items. X.shape[0] must be equal to self.n_users.

Return type:

ndarray

compute_user_embedding(X)[source]

Given unknown users’ interactions with known items, computes the latent factors of the users by least squares (fixing item embeddings).

Parameters:

X (Union[csr_matrix, csc_matrix]) – The interaction history of the new users. X.shape[1] must be equal to self.n_items.

Return type:

ndarray
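
To illustrate what solving for a user vector with fixed item embeddings involves, here is a minimal pure-Python sketch of a truncated conjugate gradient solve for a single user. It is not irspack's implementation: it assumes the "IALSPP" loss with \(c_{ui} = X_{ui}\), and the function name solve_user_vector and its arguments are hypothetical. It solves the normal equations \((\sum_i (\alpha_0 + c_{ui}) \mathbf{v}_i \mathbf{v}_i^T + \lambda \mathbf{1}) \mathbf{u} = \sum_i c_{ui} \mathbf{v}_i\) with \(\lambda = \text{reg} \cdot (\alpha_0 I + N_u)^\nu\).

```python
def solve_user_vector(x_row, V, alpha0=0.1, reg=1e-3, nu=1.0, max_cg_steps=5):
    """Approximate one user's latent vector by a few CG steps.

    x_row: the user's interactions (one value per item, 0 if unobserved).
    V: item embeddings as a list of equal-length vectors.
    """
    k = len(V[0])
    n_observed = sum(1 for c in x_row if c != 0)
    lam = reg * (alpha0 * len(V) + n_observed) ** nu
    weights = [alpha0 + c for c in x_row]  # weight on (u . v_i)^2 per item

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def apply_A(p):
        # A p = sum_i w_i v_i (v_i . p) + lam * p
        out = [lam * pj for pj in p]
        for w, v in zip(weights, V):
            coef = w * dot(v, p)
            for j in range(k):
                out[j] += coef * v[j]
        return out

    b = [sum(c * v[j] for c, v in zip(x_row, V)) for j in range(k)]
    u = [0.0] * k
    r = b[:]  # residual b - A u for the initial guess u = 0
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_cg_steps):
        if rs < 1e-20:  # already converged
            break
        Ap = apply_A(p)
        step = rs / dot(p, Ap)
        u = [uj + step * pj for uj, pj in zip(u, p)]
        r = [rj - step * aj for rj, aj in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [rj + (rs_new / rs) * pj for rj, pj in zip(r, p)]
        rs = rs_new
    return u
```

In exact arithmetic CG converges in at most as many steps as the latent dimension, so a small step limit trades accuracy for speed; this is the trade-off that max_cg_steps and prediction_time_max_cg_steps expose.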

get_item_embedding()[source]

Get item embedding vectors.

Returns:

The latent vector representation of items. Its number of rows is equal to the number of the items.

Return type:

ndarray

get_score(user_indices)[source]

Compute the item recommendation score for a subset of users.

Parameters:

user_indices (ndarray) – The indices that define the subset of users.

Returns:

The item scores. Its shape will be (len(user_indices), self.n_items)

Return type:

ndarray

get_score_block(begin, end)[source]

Compute the score for a block of the users.

Parameters:
  • begin (int) – where the evaluated user block begins.

  • end (int) – where the evaluated user block ends.

Returns:

The item scores. Its shape will be (end - begin, self.n_items)

Return type:

ndarray
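
Block-wise scoring is useful when the full score matrix would not fit in memory. The helper below is a hypothetical sketch, not part of irspack, that generates (begin, end) pairs covering all users. It assumes end is exclusive, which is inferred from the documented output shape (end - begin, self.n_items).

```python
def iter_user_blocks(n_users, block_size):
    """Yield (begin, end) pairs covering range(n_users) in contiguous blocks."""
    if block_size < 1:
        raise ValueError("block_size must be positive")
    for begin in range(0, n_users, block_size):
        yield begin, min(begin + block_size, n_users)


# Intended use with a trained recommender `rec` (not executed here):
# for begin, end in iter_user_blocks(rec.n_users, 1000):
#     scores = rec.get_score_block(begin, end)
```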

get_score_cold_user(X)[source]

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.

Parameters:

X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.

Returns:

Computed item scores for users. Its shape is equal to X.

Return type:

ndarray

get_score_cold_user_remove_seen(X)

Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix. The score will then be masked by the input.

Parameters:

X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.

Returns:

Computed & masked item scores for users. Its shape is equal to X.

Return type:

ndarray

get_score_from_user_embedding(user_embedding)[source]

Compute the item score from user embedding. Mainly used for cold-start scenario.

Parameters:

user_embedding (ndarray) – Latent user representation obtained elsewhere.

Returns:

The score array. Its shape will be (user_embedding.shape[0], self.n_items)

Return type:

DenseScoreArray
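
For a matrix factorization model such as this one, a natural reading of "score from user embedding" is the inner product of each user vector with each item vector, matching the \(\mathbf{u}_u \cdot \mathbf{v}_i\) term of the loss. The sketch below makes that assumption explicit; the function name is hypothetical and plain lists stand in for arrays.

```python
def score_from_user_embedding(user_embedding, item_embedding):
    """Return the matrix of inner products, shape (n_users, n_items)."""
    return [
        [sum(a * b for a, b in zip(u, v)) for v in item_embedding]
        for u in user_embedding
    ]


user_embedding = [[1.0, 2.0]]                           # one cold-start user
item_embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # three items
scores = score_from_user_embedding(user_embedding, item_embedding)
```

The result has one row per row of user_embedding and one column per item, in line with the documented shape (user_embedding.shape[0], self.n_items).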

get_score_remove_seen(user_indices)

Compute the item score and mask the items in the training set. Masked items will have the score -inf.

Parameters:

user_indices (ndarray) – Specifies the subset of users.

Returns:

The masked item scores. Its shape will be (len(user_indices), self.n_items)

Return type:

ndarray
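
The masking step can be pictured as follows. This is an illustrative sketch using plain lists rather than the library's arrays, and the function name mask_seen is hypothetical: every position where the user already has an interaction gets the score -inf, so it can never be ranked among the recommendations.

```python
def mask_seen(scores, X_seen):
    """Return a copy of scores with -inf wherever X_seen is non-zero."""
    neg_inf = float("-inf")
    return [
        [neg_inf if seen != 0 else s for s, seen in zip(score_row, seen_row)]
        for score_row, seen_row in zip(scores, X_seen)
    ]


scores = [[0.9, 0.2, 0.5]]
X_seen = [[1, 0, 0]]  # the user has already interacted with item 0
masked = mask_seen(scores, X_seen)
```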

get_score_remove_seen_block(begin, end)

Compute the score for a block of the users, and mask the items in the training set. Masked items will have the score -inf.

Parameters:
  • begin (int) – where the evaluated user block begins.

  • end (int) – where the evaluated user block ends.

Returns:

The masked item scores. Its shape will be (end - begin, self.n_items)

Return type:

ndarray

get_user_embedding()[source]

Get user embedding vectors.

Returns:

The latent vector representation of users. Its number of rows is equal to the number of the users.

Return type:

ndarray

learn()

Learns and returns itself.

Returns:

The model after fitting process.

Parameters:

self (R) –

Return type:

R

learn_with_optimizer(evaluator, trial, max_epoch=128, validate_epoch=5, score_degradation_max=5)

Learning procedures with early stopping and pruning.

Parameters:
  • evaluator (Optional[evaluation.Evaluator]) – The evaluator to measure the score.

  • trial (Optional[Trial]) – The current optuna trial under the study (if any).

  • max_epoch (int) – Maximal number of epochs. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 128.

  • validate_epoch (int) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

  • score_degradation_max (int) – Maximal number of allowed score degradation. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.

Return type:

None
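
The interplay of max_epoch, validate_epoch and score_degradation_max can be sketched as a generic early-stopping loop. This is an assumption-laden illustration rather than the library's code: it counts consecutive validations that fail to improve on the best score and stops once that count reaches score_degradation_max. The callables run_epoch and validate are hypothetical stand-ins, and optuna pruning is omitted.

```python
def train_with_early_stopping(run_epoch, validate, max_epoch=128,
                              validate_epoch=5, score_degradation_max=5):
    """Run epochs, validating periodically; return (best_epoch, best_score)."""
    best_score = float("-inf")
    best_epoch = None
    degradations = 0
    for epoch in range(1, max_epoch + 1):
        run_epoch()
        if epoch % validate_epoch != 0:
            continue  # validate only every `validate_epoch` epochs
        score = validate()
        if score > best_score:
            best_score, best_epoch, degradations = score, epoch, 0
        else:
            degradations += 1
            if degradations >= score_degradation_max:
                break  # too many non-improving validations in a row
    return best_epoch, best_score
```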

classmethod tune(data, evaluator, n_trials=20, timeout=None, data_suggest_function=None, parameter_suggest_function=None, fixed_params={}, random_seed=None, prunning_n_startup_trials=10, max_epoch=16, validate_epoch=1, score_degradation_max=3, logger=None)[source]

Perform the optimization step. optuna.Study object is created inside this function.

Parameters:
  • data (Optional[Union[csr_matrix, csc_matrix]]) – The training data. You can also provide tunable parameter dependent training data by providing data_suggest_function. In that case, data must be None.

  • evaluator (evaluation.Evaluator) – The validation evaluator that measures the performance of the recommenders.

  • n_trials (int) – The number of expected trials (including pruned ones). Defaults to 20.

  • timeout (Optional[int]) – If set to some value (in seconds), the study will exit after that time period. Note, however, that running trials are not interrupted. Defaults to None.

  • data_suggest_function (Optional[Callable[[Trial], Union[csr_matrix, csc_matrix]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns training data. Defaults to None.

  • parameter_suggest_function (Optional[Callable[[Trial], Dict[str, Any]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns Dict[str, Any] (i.e., some keyword arguments of the recommender class). If None, cls.default_suggest_parameter will be used for the parameter suggestion. Defaults to None.

  • fixed_params (Dict[str, Any]) – Fixed parameters passed to recommenders during the optimization procedure. This will replace the suggested parameter (either by cls.default_suggest_parameter or parameter_suggest_function). Defaults to dict().

  • random_seed (Optional[int]) – The random seed to control optuna.samplers.TPESampler. Defaults to None.

  • prunning_n_startup_trials (int) – n_startup_trials argument passed to the constructor of optuna.pruners.MedianPruner.

  • max_epoch (int) – The maximal number of epochs for the training. If iterative learning procedure is not available, this parameter will be ignored.

  • validate_epoch (int, optional) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 1.

  • score_degradation_max (int, optional) – Maximal number of allowed score degradation. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 3.

  • logger (Optional[Logger]) –

Returns:

A tuple that consists of

  1. A dict containing the best parameters. This dict can be passed to the recommender as **kwargs.

  2. A pandas.DataFrame that contains the history of optimization.

Return type:

Tuple[Dict[str, Any], DataFrame]
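
The statement that fixed_params replaces suggested parameters can be illustrated with an ordinary dict merge. This sketch only shows the precedence rule described above; the helper name merge_params and the sample values are hypothetical, and how irspack combines the two dicts internally is not shown here.

```python
def merge_params(suggested, fixed_params):
    """Combine suggested and fixed parameters; fixed values win on conflict."""
    return {**suggested, **fixed_params}


suggested = {"n_components": 64, "alpha0": 0.05, "reg": 0.002}
fixed_params = {"n_components": 32}
params = merge_params(suggested, fixed_params)
# `params` is the kind of dict that can be passed to the recommender
# as **kwargs, e.g. IALSRecommender(X_train, **params).
```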

classmethod tune_doubling_dimension(data, evaluator, initial_dimension, maximal_dimension, storage=None, study_name_prefix=None, n_trials_initial=40, n_trials_following=20, n_startup_trials_initial=10, n_startup_trials_following=5, max_epoch=16, validate_epoch=1, score_degradation_max=3, neighborhood_scale=3.0, suggest_function_initial=None, random_seed=None)[source]

Perform tuning gradually doubling n_components. Typically, with the initial n_components, the search will be more exhaustive, and with larger n_components, less exploration will be done around previously found parameters. This strategy is described in Revisiting the Performance of iALS on Item Recommendation Benchmarks.

Parameters:
  • initial_dimension (int) – The initial dimension.

  • maximal_dimension (int) – The maximal (inclusive) dimension to be tried.

  • storage (Optional[RDBStorage]) – The storage where multiple optuna.Study will be created corresponding to the various dimensions. If None, all Study will be created in-memory.

  • study_name_prefix (Optional[str]) – The prefix for the names of optuna.Study. For dimension d, the full name of the Study will be “{study_name_prefix}_{d}”. If None, we will use a random string for this prefix.

  • n_trials_initial (int) – The number of trials for the initial dimension.

  • n_trials_following (int) – The number of trials for the following dimensions.

  • n_startup_trials_initial (int) – Passed on to n_startup_trials argument of optuna.pruners.MedianPruner in the initial optuna.Study. Defaults to 10.

  • n_startup_trials_following (int) – Passed on to n_startup_trials argument of optuna.pruners.MedianPruner in the following optuna.Study. Defaults to 5.

  • neighborhood_scale (float) – alpha0 and reg parameters will be searched within the log-uniform range [previous_dimension_result / neighborhood_scale, previous_dimension_result * neighborhood_scale]. Defaults to 3.0.

  • suggest_overwrite_initial – Overwrites the suggestion parameters in the initial optuna.Study. Defaults to [].

  • random_seed (Optional[int]) – The random seed to control optuna.samplers.TPESampler. Defaults to None.

  • data (Union[csr_matrix, csc_matrix]) –

  • evaluator (Evaluator) –

  • max_epoch (int) –

  • validate_epoch (int) –

  • score_degradation_max (int) –

  • suggest_function_initial (Optional[Callable[[Trial], Dict[str, Any]]]) –

Returns:

A tuple that consists of
  1. A dict containing the best parameters. This dict can be passed to the recommender as **kwargs.

  2. A pandas.DataFrame that contains the history of optimization for all dimensions.

Return type:

Tuple[Dict[str, Any], DataFrame]
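
The doubling schedule and the neighborhood search range described above can be sketched as two small helpers. Both names are hypothetical. The behavior when maximal_dimension is not reachable by exact doubling (here: stop at the last dimension that does not exceed it) is an assumption, not something the documentation states.

```python
def doubling_dimensions(initial_dimension, maximal_dimension):
    """List the dimensions tried: initial, 2x, 4x, ... up to the maximum (inclusive)."""
    if initial_dimension < 1:
        raise ValueError("initial_dimension must be positive")
    dims = []
    d = initial_dimension
    while d <= maximal_dimension:
        dims.append(d)
        d *= 2
    return dims


def neighborhood_range(previous_best, neighborhood_scale=3.0):
    """Return the (low, high) bounds searched around a previous best value."""
    return previous_best / neighborhood_scale, previous_best * neighborhood_scale
```

For example, with initial_dimension=16 and maximal_dimension=128 the dimensions tried would be 16, 32, 64 and 128, and a previous best reg of 0.3 with neighborhood_scale=3.0 would give a search range of roughly 0.1 to 0.9.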