irspack.recommenders.SLIMRecommender
- class irspack.recommenders.SLIMRecommender(X_train_all, alpha=0.05, l1_ratio=0.01, positive_only=True, n_iter=100, tol=0.0001, top_k=None, n_threads=None)
Bases: BaseSimilarityRecommender
SLIM with ElasticNet-type loss function:
\[\mathrm{loss} = \frac{1}{2} ||X - XB|| ^2 _F + \frac{\alpha (1 - l_1) U}{2} ||B|| ^2 _F + \alpha l_1 U |B|\]
The implementation relies on a simple (parallelized) cyclic-coordinate descent method.
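To make the objective concrete, the following NumPy sketch evaluates the loss on small dense arrays. It takes l_1 to be l1_ratio, and it assumes that U is the number of users (rows of X) and that |B| is the entry-wise L1 norm; neither is stated in this reference. `slim_elasticnet_loss` is a hypothetical helper for illustration and is not part of irspack.

```python
import numpy as np


def slim_elasticnet_loss(X, B, alpha=0.05, l1_ratio=0.01):
    """Evaluate the ElasticNet-type SLIM loss for dense X (users x items) and B (items x items)."""
    n_users = X.shape[0]  # assumption: U is the number of users
    residual = X - X @ B
    fit_term = 0.5 * np.sum(residual ** 2)
    l2_term = 0.5 * alpha * (1.0 - l1_ratio) * n_users * np.sum(B ** 2)
    # assumption: |B| is the sum of absolute values of all entries
    l1_term = alpha * l1_ratio * n_users * np.sum(np.abs(B))
    return fit_term + l2_term + l1_term


X = np.array([[1.0, 0.0], [0.0, 1.0]])
B_zero = np.zeros((2, 2))
B_swap = np.array([[0.0, 1.0], [1.0, 0.0]])

loss_at_zero = slim_elasticnet_loss(X, B_zero)  # only the reconstruction term survives
loss_at_swap = slim_elasticnet_loss(X, B_swap)  # all three terms contribute
```

With B set to zero both penalty terms vanish, so the loss reduces to half the squared Frobenius norm of X.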
- Parameters:
X_train_all (csr_matrix) – Input interaction matrix.
alpha (float) – Determines the strength of L1/L2 regularization (see above). Defaults to 0.05.
l1_ratio (float) – Determines the strength of L1 regularization relative to alpha. Defaults to 0.01.
positive_only (bool) – Whether we constrain the weight matrix to be non-negative. Defaults to True.
n_iter (int) – The number of coordinate-descent iterations. Defaults to 100.
tol (float) – Tolerance for the coordinate-descent iteration: if the maximal parameter change within a single coordinate-descent iteration is smaller than this value, the iteration terminates. Defaults to 1e-4.
top_k (Optional[int]) – Specifies the maximal number of allowed non-zero coefficients per item. Defaults to None.
n_threads (Optional[int]) – Specifies the number of threads to use for the computation. If None, the environment variable "IRSPACK_NUM_THREADS_DEFAULT" will be looked up, and if the variable is not set, it will be set to os.cpu_count(). Defaults to None.
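The fallback order described for n_threads can be sketched with the standard library alone. `resolve_n_threads` is a hypothetical helper written for this illustration; it mirrors the documented order but is not irspack's actual code.

```python
import os


def resolve_n_threads(n_threads=None, env_var="IRSPACK_NUM_THREADS_DEFAULT"):
    """Hypothetical helper: explicit value, then environment variable, then CPU count."""
    if n_threads is not None:
        return n_threads
    env_value = os.environ.get(env_var)
    if env_value is not None:
        return int(env_value)
    # os.cpu_count() may return None on some platforms, so fall back to 1
    return os.cpu_count() or 1
```

An explicit argument always wins, so `resolve_n_threads(3)` ignores the environment entirely.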
- __init__(X_train_all, alpha=0.05, l1_ratio=0.01, positive_only=True, n_iter=100, tol=0.0001, top_k=None, n_threads=None)
- Parameters:
X_train_all (Union[csr_matrix, csc_matrix]) –
alpha (float) –
l1_ratio (float) –
positive_only (bool) –
n_iter (int) –
tol (float) –
top_k (Optional[int]) –
n_threads (Optional[int]) –
Methods
__init__(X_train_all[, alpha, l1_ratio, ...])
default_suggest_parameter(trial, fixed_params)
from_config(X_train_all, config)
get_score(user_indices) – Compute the item recommendation score for a subset of users.
get_score_block(begin, end) – Compute the score for a block of the users.
get_score_cold_user(X) – Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.
get_score_cold_user_remove_seen(X) – Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.
get_score_remove_seen(user_indices) – Compute the item score and mask the item in the training set.
get_score_remove_seen_block(begin, end) – Compute the score for a block of the users, and mask the items in the training set.
learn() – Learns and returns itself.
learn_with_optimizer(evaluator, trial[, ...]) – Learning procedures with early stopping and pruning.
tune(data, evaluator[, n_trials, timeout, ...]) – Perform the optimization step.
tune_with_study(study, data, evaluator[, ...])
Attributes
W – The computed item-item similarity weight matrix.
default_tune_range
- property W: Union[csr_matrix, csc_matrix, ndarray]
The computed item-item similarity weight matrix.
- X_train_all: sps.csr_matrix
The matrix to feed into recommender.
- get_score(user_indices)
Compute the item recommendation score for a subset of users.
- Parameters:
user_indices (ndarray) – The index defines the subset of users.
- Returns:
The item scores. Its shape will be (len(user_indices), self.n_items)
- Return type:
ndarray
- get_score_block(begin, end)
Compute the score for a block of the users.
- Parameters:
begin (int) – where the evaluated user block begins.
end (int) – where the evaluated user block ends.
- Returns:
The item scores. Its shape will be (end - begin, self.n_items)
- Return type:
ndarray
- get_score_cold_user(X)
Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix.
- Parameters:
X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.
- Returns:
Computed item scores for users. Its shape is equal to that of X.
- Return type:
ndarray
- get_score_cold_user_remove_seen(X)
Compute the item recommendation score for unseen users whose profiles are given as another user-item relation matrix. The score will then be masked by the input.
- Parameters:
X (Union[csr_matrix, csc_matrix]) – The profile user-item relation matrix for unseen users. Its number of rows is arbitrary, but the number of columns must be self.n_items.
- Returns:
Computed and masked item scores for users. Its shape is equal to that of X.
- Return type:
ndarray
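As an illustration of cold-user scoring, the sketch below assumes that a similarity-based recommender scores users as the product of their profile matrix with the item-item weight matrix W, and that masked entries are set to -inf (as documented for get_score_remove_seen). Dense NumPy arrays stand in for the sparse matrices the real methods accept, and `score_cold_users` is a hypothetical function, not irspack's implementation.

```python
import numpy as np


def score_cold_users(X_profile, W, remove_seen=False):
    """Score unseen users from their profiles; optionally mask items already in the profile."""
    scores = (X_profile @ W).astype(float)  # shape: (n_new_users, n_items)
    if remove_seen:
        # assumption: masked entries receive -inf, as for the training-set variants
        scores[X_profile != 0] = -np.inf
    return scores


# Hypothetical 3-item weight matrix with a zero diagonal
W = np.array([
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.1],
    [0.2, 0.1, 0.0],
])
# Two unseen users: the first touched item 0, the second touched items 1 and 2
X_profile = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0],
])

raw_scores = score_cold_users(X_profile, W)
masked_scores = score_cold_users(X_profile, W, remove_seen=True)
```

Both results have the same shape as X_profile, which matches the documented return shape.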
- get_score_remove_seen(user_indices)
Compute the item score and mask the item in the training set. Masked items will have the score -inf.
- Parameters:
user_indices (ndarray) – Specifies the subset of users.
- Returns:
The masked item scores. Its shape will be (len(user_indices), self.n_items)
- Return type:
ndarray
- get_score_remove_seen_block(begin, end)
Compute the score for a block of the users, and mask the items in the training set. Masked items will have the score -inf.
- Parameters:
begin (int) – where the evaluated user block begins.
end (int) – where the evaluated user block ends.
- Returns:
The masked item scores. Its shape will be (end - begin, self.n_items)
- Return type:
ndarray
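The block variants are convenient when the full score matrix would not fit in memory. The sketch below walks over users in half-open blocks [begin, end), which is consistent with the documented shape (end - begin, self.n_items), and keeps only the best unseen items per user. `score_block_remove_seen` is a hypothetical dense stand-in for get_score_remove_seen_block, assuming scores are computed as the training block times W.

```python
import numpy as np


def score_block_remove_seen(X_train, W, begin, end):
    """Hypothetical stand-in for get_score_remove_seen_block on dense arrays."""
    block = X_train[begin:end]
    scores = (block @ W).astype(float)
    scores[block != 0] = -np.inf  # items seen in training can never rank first
    return scores


def top_items_in_blocks(X_train, W, block_size=2, n_recommend=1):
    """Collect the top-scoring unseen items per user, one user block at a time."""
    n_users = X_train.shape[0]
    results = []
    for begin in range(0, n_users, block_size):
        end = min(begin + block_size, n_users)
        scores = score_block_remove_seen(X_train, W, begin, end)
        # argsort on the negated scores lists the best items first
        results.append(np.argsort(-scores, axis=1)[:, :n_recommend])
    return np.vstack(results)


W = np.array([
    [0.0, 0.5, 0.2],
    [0.5, 0.0, 0.1],
    [0.2, 0.1, 0.0],
])
X_train = np.eye(3)  # user i has interacted only with item i
top_items = top_items_in_blocks(X_train, W, block_size=2, n_recommend=1)
```

Only one block of scores is held in memory at a time; the per-user top items are small enough to stack at the end.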
- learn()
Learns and returns itself.
- Returns:
The model after the fitting process.
- Parameters:
self (R) –
- Return type:
R
- learn_with_optimizer(evaluator, trial, max_epoch=128, validate_epoch=5, score_degradation_max=5)
Learning procedures with early stopping and pruning.
- Parameters:
evaluator (Optional[evaluation.Evaluator]) – The evaluator to measure the score.
trial (Optional[Trial]) – The current optuna trial under the study (if any).
max_epoch (int) – Maximal number of epochs. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 128.
validate_epoch (int) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.
score_degradation_max (int) – Maximal number of allowed score degradations. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.
- Return type:
None
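How max_epoch, validate_epoch and score_degradation_max can interact is easiest to see in a generic early-stopping loop. The following is a sketch under assumptions: `train_with_early_stopping`, `run_epoch` and `validate` are hypothetical names, higher scores are assumed to be better, and irspack's own loop may differ in detail. As noted above, these parameters are ignored when no iterative learning procedure is available.

```python
def train_with_early_stopping(
    run_epoch,
    validate,
    max_epoch=128,
    validate_epoch=5,
    score_degradation_max=5,
):
    """Run up to max_epoch epochs, validating every validate_epoch epochs."""
    best_score = float("-inf")
    best_epoch = 0
    degradations = 0
    for epoch in range(1, max_epoch + 1):
        run_epoch()
        if epoch % validate_epoch != 0:
            continue  # no validation scheduled for this epoch
        score = validate()
        if score > best_score:
            best_score, best_epoch, degradations = score, epoch, 0
        else:
            degradations += 1
            if degradations >= score_degradation_max:
                break  # too many validations without improvement
    return best_epoch, best_score


# Toy run: validation scores are replayed from a fixed list
validation_scores = iter([0.1, 0.3, 0.2, 0.25, 0.5])
epochs_run = []
best_epoch, best_score = train_with_early_stopping(
    run_epoch=lambda: epochs_run.append(1),
    validate=lambda: next(validation_scores),
    max_epoch=10,
    validate_epoch=1,
    score_degradation_max=2,
)
```

In the toy run, training stops after the second consecutive validation that fails to beat the best score, so the later value 0.5 is never reached.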
- classmethod tune(data, evaluator, n_trials=20, timeout=None, data_suggest_function=None, parameter_suggest_function=None, fixed_params={}, random_seed=None, prunning_n_startup_trials=10, max_epoch=128, validate_epoch=5, score_degradation_max=5, logger=None)
Perform the optimization step. An optuna.Study object is created inside this function.
- Parameters:
data (Optional[Union[csr_matrix, csc_matrix]]) – The training data. You can also supply training data that depends on tunable parameters by providing data_suggest_function. In that case, data must be None.
evaluator (evaluation.Evaluator) – The validation evaluator that measures the performance of the recommenders.
n_trials (int) – The number of expected trials (including pruned ones). Defaults to 20.
timeout (Optional[int]) – If set to some value (in seconds), the study will exit after that time period. Note that trials already running are not interrupted, though. Defaults to None.
data_suggest_function (Optional[Callable[[Trial], Union[csr_matrix, csc_matrix]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns training data. Defaults to None.
parameter_suggest_function (Optional[Callable[[Trial], Dict[str, Any]]]) – If not None, this must be a function which takes optuna.Trial as its argument and returns Dict[str, Any] (i.e., some keyword arguments of the recommender class). If None, cls.default_suggest_parameter will be used for the parameter suggestion. Defaults to None.
fixed_params (Dict[str, Any]) – Fixed parameters passed to recommenders during the optimization procedure. This will replace the suggested parameter (either by cls.default_suggest_parameter or parameter_suggest_function). Defaults to dict().
random_seed (Optional[int]) – The random seed to control optuna.samplers.TPESampler. Defaults to None.
prunning_n_startup_trials (int) – n_startup_trials argument passed to the constructor of optuna.pruners.MedianPruner. Defaults to 10.
max_epoch (int) – The maximal number of epochs for the training. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 128.
validate_epoch (int, optional) – The frequency of validation score measurement. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.
score_degradation_max (int, optional) – Maximal number of allowed score degradations. If iterative learning procedure is not available, this parameter will be ignored. Defaults to 5.
logger (Optional[Logger]) –
- Returns:
A tuple that consists of:
A dict containing the best parameters. This dict can be passed to the recommender as **kwargs.
A pandas.DataFrame that contains the history of optimization.
- Return type:
Tuple[Dict[str, Any], DataFrame]
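The statement that fixed_params replace the suggested parameters can be pictured as a dictionary merge in which fixed values win. This is a minimal sketch of that precedence rule only; `merge_parameters` and the parameter values are hypothetical, and the actual merging inside tune is not shown in this reference.

```python
def merge_parameters(suggested, fixed_params):
    """Fixed parameters take precedence over suggested ones with the same key."""
    return {**suggested, **fixed_params}


suggested = {"alpha": 0.2, "l1_ratio": 0.05}      # hypothetical suggestion for one trial
fixed_params = {"l1_ratio": 0.01, "n_iter": 50}   # hypothetical fixed values
final_params = merge_parameters(suggested, fixed_params)
```

The best-parameter dict returned by tune is documented as usable in the same keyword form, for example `SLIMRecommender(X_train_all, **best_params)`.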