Utils & Metrics¶
_clip¶
- pyldl.algorithms.utils._clip(func)¶
_reduction¶
- pyldl.algorithms.utils._reduction(func)¶
binaryzation¶
- pyldl.algorithms.utils.binaryzation(D: ndarray, method='threshold', param: any = None) → ndarray¶
Transform a label distribution matrix into a logical label matrix.
- Parameters:
D (np.ndarray) – Label distribution matrix (shape: \([n,\, l]\)).
method ({'threshold', 'topk'}, optional) –
Binaryzation method, defaults to ‘threshold’. The options are ‘threshold’ and ‘topk’; for details, see:
[BIN-KWT+24] Zhiqiang Kou, Jing Wang, Jiawei Tang, Yuheng Jia, Boyu Shi, and Xin Geng. Exploiting multi-label correlation in label distribution learning. In Proceedings of the International Joint Conference on Artificial Intelligence, 4326–4334. 2024. URL: https://doi.org/10.24963/ijcai.2024/478.
param (any, optional) – Parameter of the binaryzation method, defaults to None. If None, the default value is 0.5 for ‘threshold’ and \(\lfloor l / 2 \rfloor\) for ‘topk’.
- Returns:
Logical label matrix (shape: \([n,\, l]\)).
- Return type:
np.ndarray
kl_divergence¶
- pyldl.algorithms.utils.kl_divergence(D, D_pred)¶
Kullback-Leibler divergence. It is defined as:
\[\text{KLD}(\boldsymbol{u}, \, \boldsymbol{v}) = \sum^l_{j=1}u_j \ln \frac{u_j}{v_j}\text{.}\]
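As an illustration of the definition above, the following is a minimal NumPy sketch, not the library's implementation. The name `kl_divergence_sketch` and the `eps` clipping (used here only to avoid `log(0)`) are assumptions made for this example.

```python
import numpy as np

def kl_divergence_sketch(u, v, eps=1e-12):
    """Row-wise KLD(u, v) = sum_j u_j * ln(u_j / v_j).

    Hypothetical helper: clipping to [eps, 1] is an assumption of this sketch.
    """
    u = np.clip(np.asarray(u, dtype=float), eps, 1.0)
    v = np.clip(np.asarray(v, dtype=float), eps, 1.0)
    return np.sum(u * np.log(u / v), axis=-1)

D = np.array([[0.5, 0.3, 0.2]])       # ground-truth label distribution
D_pred = np.array([[0.4, 0.4, 0.2]])  # predicted label distribution
kld = kl_divergence_sketch(D, D_pred)
```

The divergence is zero when the two distributions coincide and is not symmetric in its arguments.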
pairwise_cosine¶
- pyldl.algorithms.utils.pairwise_cosine(X: ndarray | Tensor, Y: ndarray | Tensor | None = None, mode: str = 'similarity') → ndarray | Tensor¶
Pairwise cosine similarity or distance, depending on mode.
- Parameters:
X (Union[np.ndarray, tf.Tensor]) – Matrix \(\boldsymbol{X}\) (shape: \([m_{\boldsymbol{X}},\, n]\)).
Y (Union[np.ndarray, tf.Tensor], optional) – Matrix \(\boldsymbol{Y}\) (shape: \([m_{\boldsymbol{Y}},\, n]\)), defaults to None.
mode (str) – Defaults to ‘similarity’. The options are ‘similarity’ and ‘distance’.
- Returns:
Pairwise cosine similarity or distance, according to mode (shape: \([m_{\boldsymbol{X}},\, m_{\boldsymbol{Y}}]\)).
- Return type:
Union[np.ndarray, tf.Tensor]
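The shapes involved can be seen in a small NumPy-only sketch. It is not the library's code: the name `pairwise_cosine_sketch`, the choice of \(\boldsymbol{Y} = \boldsymbol{X}\) when `Y` is None, and the definition of distance as one minus similarity are all assumptions of this example.

```python
import numpy as np

def pairwise_cosine_sketch(X, Y=None, mode="similarity"):
    """Cosine similarity (or distance) between every row of X and every row of Y.

    Hypothetical helper; rows are assumed to be nonzero.
    """
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)  # assumption: Y defaults to X
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)   # unit-length rows
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    sim = Xn @ Yn.T                                     # shape [m_X, m_Y]
    if mode == "similarity":
        return sim
    if mode == "distance":
        return 1.0 - sim                                # assumption: distance = 1 - similarity
    raise ValueError("mode must be 'similarity' or 'distance'")

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
S = pairwise_cosine_sketch(X)
```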
pairwise_euclidean¶
- pyldl.algorithms.utils.pairwise_euclidean(X: ndarray | Tensor, Y: ndarray | Tensor | None = None) → ndarray | Tensor¶
Pairwise Euclidean distance.
- Parameters:
X (Union[np.ndarray, tf.Tensor]) – Matrix \(\boldsymbol{X}\) (shape: \([m_{\boldsymbol{X}},\, n]\)).
Y (Union[np.ndarray, tf.Tensor], optional) – Matrix \(\boldsymbol{Y}\) (shape: \([m_{\boldsymbol{Y}},\, n]\)), if None, \(\boldsymbol{Y} = \boldsymbol{X}\), defaults to None.
- Returns:
Pairwise Euclidean distance (shape: \([m_{\boldsymbol{X}},\, m_{\boldsymbol{Y}}]\)).
- Return type:
Union[np.ndarray, tf.Tensor]
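A minimal NumPy sketch of the same computation is given below for illustration. The name `pairwise_euclidean_sketch` is hypothetical, and the expansion \(\Vert x - y \Vert^2 = \Vert x \Vert^2 + \Vert y \Vert^2 - 2 x^\top y\) is one common way to vectorize it, not necessarily what the library does.

```python
import numpy as np

def pairwise_euclidean_sketch(X, Y=None):
    """Euclidean distance between every row of X and every row of Y (Y = X if None)."""
    X = np.asarray(X, dtype=float)
    Y = X if Y is None else np.asarray(Y, dtype=float)
    # Squared distances via ||x||^2 + ||y||^2 - 2 x.y, shape [m_X, m_Y].
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    # Clamp tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(sq, 0.0))

X = np.array([[0.0, 0.0], [3.0, 4.0]])
dist = pairwise_euclidean_sketch(X)
```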
proj¶
soft_thresholding¶
- pyldl.algorithms.utils.soft_thresholding(A: ndarray, tau: float) → ndarray¶
Soft thresholding operation. It is defined as \(\text{soft}(\boldsymbol{A}, \, \tau) = \text{sgn}(\boldsymbol{A}) \odot \max\lbrace \lvert \boldsymbol{A} \rvert - \tau, 0 \rbrace\), where \(\odot\) denotes element-wise multiplication.
- Parameters:
A (np.ndarray) – Matrix \(\boldsymbol{A}\).
tau (float) – \(\tau\).
- Returns:
The result of soft thresholding operation.
- Return type:
np.ndarray
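The definition of \(\text{soft}(\boldsymbol{A}, \, \tau)\) translates almost directly into NumPy. The following is an illustrative sketch under that definition; the name `soft_thresholding_sketch` is hypothetical.

```python
import numpy as np

def soft_thresholding_sketch(A, tau):
    """Element-wise sgn(A) * max(|A| - tau, 0)."""
    A = np.asarray(A, dtype=float)
    return np.sign(A) * np.maximum(np.abs(A) - tau, 0.0)

A = np.array([[3.0, -0.5], [-2.0, 1.0]])
S = soft_thresholding_sketch(A, 1.0)
```

Entries whose magnitude does not exceed \(\tau\) are set to zero, and the remaining entries are shrunk toward zero by \(\tau\).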
solvel21¶
- pyldl.algorithms.utils.solvel21(A: ndarray, tau: float) → ndarray¶
This approach is proposed in paper [CY14].
The solution to the optimization problem \(\mathop{\arg\min}_{\boldsymbol{X}} \frac{1}{2} \Vert \boldsymbol{X} - \boldsymbol{A} \Vert_\text{F}^2 + \tau \Vert \boldsymbol{X} \Vert_{2,\,1}\) is given by the following formula:
\[\begin{split}\vec{x}_{\bullet j}^{\ast} = \left\{ \begin{aligned} & \frac{\Vert \vec{a}_{\bullet j} \Vert - \tau}{\Vert \vec{a}_{\bullet j} \Vert} \vec{a}_{\bullet j}, & \tau \le \Vert \vec{a}_{\bullet j} \Vert \\ & 0, & \text{otherwise} \end{aligned} \right.\text{,}\end{split}\] where \(\vec{x}_{\bullet j}\) is the \(j\)-th column of matrix \(\boldsymbol{X}\), and \(\vec{a}_{\bullet j}\) is the \(j\)-th column of matrix \(\boldsymbol{A}\).
- Parameters:
A (np.ndarray) – Matrix \(\boldsymbol{A}\).
tau (float) – \(\tau\).
- Returns:
The solution to the optimization problem.
- Return type:
np.ndarray
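The column-wise shrinkage formula above can be sketched in NumPy as follows. This is an illustration of the formula only, not the library's code, and the name `solve_l21_sketch` is hypothetical.

```python
import numpy as np

def solve_l21_sketch(A, tau):
    """Shrink each column of A toward zero by tau in Euclidean norm."""
    A = np.asarray(A, dtype=float)
    norms = np.linalg.norm(A, axis=0)      # one norm per column
    scale = np.zeros_like(norms)
    keep = norms > tau                     # columns with norm <= tau become zero
    scale[keep] = (norms[keep] - tau) / norms[keep]
    return A * scale                       # broadcasts the scale over columns

A = np.array([[3.0, 0.3],
              [4.0, 0.4]])
X = solve_l21_sketch(A, 1.0)
```

In this example the first column has norm 5 and is scaled by 0.8, while the second column has norm 0.5 and is zeroed out.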
svt¶
- pyldl.algorithms.utils.svt(A: ndarray, tau: float) → ndarray¶
Singular value thresholding (SVT) is proposed in paper [CCS10].
The solution to the optimization problem \(\mathop{\arg\min}_{\boldsymbol{X}} \frac{1}{2} \Vert \boldsymbol{X} - \boldsymbol{A} \Vert_\text{F}^2 + \tau \Vert \boldsymbol{X} \Vert_{\ast}\) is given by \(\boldsymbol{U} \max \lbrace \boldsymbol{\Sigma} - \tau, 0 \rbrace \boldsymbol{V}^\top\), where \(\boldsymbol{A} = \boldsymbol{U} \boldsymbol{\Sigma} \boldsymbol{V}^\top\) is the singular value decomposition of matrix \(\boldsymbol{A}\).
- Parameters:
A (np.ndarray) – Matrix \(\boldsymbol{A}\).
tau (float) – \(\tau\).
- Returns:
The solution to the optimization problem.
- Return type:
np.ndarray
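The expression \(\boldsymbol{U} \max \lbrace \boldsymbol{\Sigma} - \tau, 0 \rbrace \boldsymbol{V}^\top\) can be sketched with `numpy.linalg.svd`. The name `svt_sketch` is hypothetical, and the code is an illustration of the formula rather than the library's implementation.

```python
import numpy as np

def svt_sketch(A, tau):
    """Singular value thresholding: shrink the singular values of A by tau."""
    A = np.asarray(A, dtype=float)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)    # soft-threshold the singular values
    return (U * s_shrunk) @ Vt             # equivalent to U @ diag(s_shrunk) @ Vt

A = np.diag([3.0, 0.5])
X = svt_sketch(A, 1.0)
```

Singular values below \(\tau\) vanish, so the result typically has lower rank than \(\boldsymbol{A}\).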
artificial¶
- pyldl.utils.artificial(X, a=1.0, b=0.5, c=0.2, d=1.0, w1=array([[4., 2., 1.]]), w2=array([[1., 2., 4.]]), w3=array([[1., 4., 2.]]), lambda1=0.01, lambda2=0.01)¶
download_dataset¶
- pyldl.utils.download_dataset(name, dataset_path)¶
emphasize¶
- pyldl.utils.emphasize(D, rate=0.5, **kwargs)¶
load_dataset¶
- pyldl.utils.load_dataset(name, dir='dataset')¶
make_ldl¶
- pyldl.utils.make_ldl(n_samples=200, **kwargs)¶
plot_artificial¶
- pyldl.utils.plot_artificial(n_samples=50, model=None, file_name=None, **kwargs)¶
random_missing¶
- pyldl.utils.random_missing(D, missing_rate=0.9, weighted=False)¶
accuracy¶
- pyldl.metrics.accuracy(y, y_pred)¶
canberra¶
- pyldl.metrics.canberra(D, D_pred)¶
Canberra distance. It is defined as:
\[\text{Can.}(\boldsymbol{u}, \, \boldsymbol{v}) = \sum^l_{j=1}\frac{\left\vert u_j - v_j \right\vert}{u_j + v_j}\text{.}\]
chebyshev¶
- pyldl.metrics.chebyshev(D, D_pred)¶
Chebyshev distance. It is defined as:
\[\text{Cheby.}(\boldsymbol{u}, \, \boldsymbol{v}) = \max_j \left\vert u_j - v_j \right\vert\text{.}\]
clark¶
- pyldl.metrics.clark(D, D_pred)¶
Clark distance. It is defined as:
\[\text{Clark}(\boldsymbol{u}, \, \boldsymbol{v}) = \sqrt{\sum^l_{j=1}\frac{\left( u_j - v_j \right)^2}{\left( u_j + v_j \right)^2}}\text{.}\]
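The Canberra, Chebyshev, and Clark definitions above can be compared side by side in a short NumPy sketch. The function names are hypothetical, and the code assumes strictly positive entries so that no denominator \(u_j + v_j\) is zero.

```python
import numpy as np

def canberra_sketch(u, v):
    """sum_j |u_j - v_j| / (u_j + v_j)."""
    return np.sum(np.abs(u - v) / (u + v), axis=-1)

def chebyshev_sketch(u, v):
    """max_j |u_j - v_j|."""
    return np.max(np.abs(u - v), axis=-1)

def clark_sketch(u, v):
    """sqrt(sum_j (u_j - v_j)^2 / (u_j + v_j)^2)."""
    return np.sqrt(np.sum((u - v) ** 2 / (u + v) ** 2, axis=-1))

u = np.array([0.5, 0.3, 0.2])  # ground-truth distribution
v = np.array([0.4, 0.4, 0.2])  # predicted distribution
can, cheby, clk = canberra_sketch(u, v), chebyshev_sketch(u, v), clark_sketch(u, v)
```

All three are distances, so smaller values indicate a closer match and identical distributions give zero.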
cosine¶
- pyldl.metrics.cosine(D, D_pred)¶
Cosine similarity. It is defined as:
\[\text{Cosine}(\boldsymbol{u}, \, \boldsymbol{v}) = \frac{\sum^l_{j=1}u_j v_j}{\sqrt{\sum^l_{j=1}u_j^2}\sqrt{\sum^l_{j=1}v_j^2}}\text{.}\]
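For illustration, the cosine similarity above can be written in NumPy as follows. The name `cosine_sketch` is hypothetical, and both vectors are assumed to be nonzero.

```python
import numpy as np

def cosine_sketch(u, v):
    """sum_j u_j v_j / (||u|| * ||v||), computed along the last axis."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return np.sum(u * v, axis=-1) / (
        np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    )

u = np.array([0.5, 0.3, 0.2])
v = np.array([0.4, 0.4, 0.2])
cos_uv = cosine_sketch(u, v)
```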
dpa¶
- pyldl.metrics.dpa(D, D_pred)¶
error_probability¶
- pyldl.metrics.error_probability(D, D_pred)¶
Error probability. It is defined as:
\[\text{Err. prob.}(\boldsymbol{u}, \, \boldsymbol{v}) = 1 - u_{\arg\max(\boldsymbol{v})}\text{.}\]
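The indexing in \(u_{\arg\max(\boldsymbol{v})}\) may be easier to follow in code. The sketch below is illustrative only: the name `error_probability_sketch` is hypothetical, and it returns one value per sample without any averaging.

```python
import numpy as np

def error_probability_sketch(D, D_pred):
    """1 - u_{argmax(v)} for each row u of D and matching row v of D_pred."""
    D = np.asarray(D, dtype=float)
    D_pred = np.asarray(D_pred, dtype=float)
    top = np.argmax(D_pred, axis=1)             # predicted top label per sample
    return 1.0 - D[np.arange(D.shape[0]), top]  # true description degree of that label

D = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3]])
D_pred = np.array([[0.6, 0.3, 0.1],
                   [0.2, 0.3, 0.5]])
err = error_probability_sketch(D, D_pred)
```

In the second row the predicted top label is the third one, whose true description degree is 0.3, so the error probability is 0.7.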
euclidean¶
- pyldl.metrics.euclidean(D, D_pred)¶
fidelity¶
- pyldl.metrics.fidelity(D, D_pred)¶
intersection¶
- pyldl.metrics.intersection(D, D_pred)¶
Intersection similarity. It is defined as:
\[\text{Int.}(\boldsymbol{u}, \, \boldsymbol{v}) = \sum^l_{j=1} \min\left(u_j, \, v_j\right)\text{.}\]
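A one-line NumPy sketch of the intersection similarity follows; the name `intersection_sketch` is hypothetical.

```python
import numpy as np

def intersection_sketch(u, v):
    """sum_j min(u_j, v_j), computed along the last axis."""
    return np.sum(np.minimum(u, v), axis=-1)

u = np.array([0.5, 0.3, 0.2])
v = np.array([0.4, 0.4, 0.2])
inter = intersection_sketch(u, v)
```

For two distributions that each sum to one, the value lies in \([0, 1]\) and equals one only when they coincide.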
kendall¶
- pyldl.metrics.kendall(D, D_pred)¶
Kendall’s rank correlation coefficient. It is defined as:
\[\text{Ken.}(\boldsymbol{u}, \, \boldsymbol{v}) = \frac{2 \sum_{j < k} \text{sgn}(u_j - u_k) \text{sgn}(v_j - v_k) }{l (l-1)}\text{.}\]
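The pairwise sum in the definition above can be sketched for a single pair of vectors as follows. The name `kendall_sketch` is hypothetical, and the code follows the formula literally without any correction for ties.

```python
import numpy as np

def kendall_sketch(u, v):
    """2 * sum_{j<k} sgn(u_j - u_k) * sgn(v_j - v_k) / (l * (l - 1))."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    l = u.shape[0]
    j, k = np.triu_indices(l, k=1)  # all index pairs with j < k
    s = np.sign(u[j] - u[k]) * np.sign(v[j] - v[k])
    return 2.0 * np.sum(s) / (l * (l - 1))

u = np.array([0.5, 0.3, 0.2])
v = np.array([0.6, 0.1, 0.3])
tau_uv = kendall_sketch(u, v)
```

Here two of the three label pairs are ordered the same way in both vectors and one is reversed, giving \((2 - 1) / 3\).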
match_m¶
- pyldl.metrics.match_m(D, D_pred, m=None)¶
max_roc_auc¶
- pyldl.metrics.max_roc_auc(D, D_pred)¶
mean_absolute_error¶
- pyldl.metrics.mean_absolute_error(D, D_pred, mode='macro')¶
mean_squared_error¶
- pyldl.metrics.mean_squared_error(D, D_pred, mode='macro')¶
precision¶
- pyldl.metrics.precision(y, y_pred)¶
score¶
- pyldl.metrics.score(target: ndarray, pred: ndarray, metrics: list | None = None, return_dict: bool = False)¶
sensitivity¶
- pyldl.metrics.sensitivity(y, y_pred)¶
sorensen¶
- pyldl.metrics.sorensen(D, D_pred)¶
spearman¶
- pyldl.metrics.spearman(D, D_pred)¶
Spearman’s rank correlation coefficient. It is defined as:
\[\text{Spear.}(\boldsymbol{u}, \, \boldsymbol{v}) = 1 - \frac{6 \sum_{j=1}^{l} (\rho(u_j) - \rho(v_j))^2 }{l(l^2 - 1)}\text{,}\] where \(\rho\) is the rank of the element in the vector.
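The rank-based formula above can be sketched for a single pair of vectors as follows. The names `rank_sketch` and `spearman_sketch` are hypothetical, and the ranking assumes there are no tied values.

```python
import numpy as np

def rank_sketch(x):
    """Ranks 1..l in ascending order of value (assumes no ties)."""
    return np.argsort(np.argsort(x)) + 1

def spearman_sketch(u, v):
    """1 - 6 * sum_j (rho(u_j) - rho(v_j))^2 / (l * (l^2 - 1))."""
    l = len(u)
    d = rank_sketch(u) - rank_sketch(v)
    return 1.0 - 6.0 * np.sum(d ** 2) / (l * (l ** 2 - 1))

u = np.array([0.5, 0.3, 0.2])
v = np.array([0.6, 0.1, 0.3])
rho_uv = spearman_sketch(u, v)
```

With these inputs the ranks are (3, 2, 1) and (3, 1, 2), so the squared rank differences sum to 2.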
specificity¶
- pyldl.metrics.specificity(y, y_pred)¶
squared_chi2¶
- pyldl.metrics.squared_chi2(D, D_pred)¶
top_k¶
- pyldl.metrics.top_k(D, D_pred, k=None, mode='f1_score')¶
youden_index¶
- pyldl.metrics.youden_index(y, y_pred)¶
zero_one_loss¶
- pyldl.metrics.zero_one_loss(D, D_pred)¶
0/1 loss. It is defined as:
\[\text{0/1 loss}(\boldsymbol{u}, \, \boldsymbol{v}) = 1 - \delta(\arg\max(\boldsymbol{u}), \, \arg\max(\boldsymbol{v}))\text{,}\] where \(\delta\) is the Kronecker delta function.
References¶
Laurent Condat. Fast projection onto the simplex and the l1 ball. Mathematical Programming, 158(1):575–585, 2016. URL: https://doi.org/10.1007/s10107-015-0946-6.
[CY14] Jinhui Chen and Jian Yang. Robust subspace segmentation via low-rank representation. IEEE Transactions on Cybernetics, 44(8):1432–1445, 2014. URL: https://doi.org/10.1109/TCYB.2013.2286106.
[CCS10] Jian-Feng Cai, Emmanuel J Candès, and Zuowei Shen. A singular value thresholding algorithm for matrix completion. SIAM Journal on Optimization, 20(4):1956–1982, 2010. URL: https://doi.org/10.1137/080738970.