Add Multivariate Distance Covariance metric and the corresponding test of independence #62
base: develop
Changes from 20 commits
b7b0653
68a7cfa
af0568d
21f4b8a
d44fb7d
946397c
62152ef
3207181
3b6fde1
8890215
0b73538
19a71f6
db98194
d1b4050
a41bdb8
0496e5a
0718f7e
b25130d
7b55726
c89131e
974c84e
3d25d06
bb32079
3f3824c
1a55352
eaced17
6b1ae7e
d90b4a3
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -41,6 +41,12 @@ | |||||
| array_namespace, | ||||||
| numpy_namespace, | ||||||
| ) | ||||||
| ##Additional module for Multivariate dcov test-------------------------------------------------------------- | ||||||
| from scipy.special import gammaln | ||||||
| import math | ||||||
|
|
||||||
| from dcor._rowwise import rowwise | ||||||
| ##------------------------------------------------------------------------------------- | ||||||
|
|
||||||
| Array = TypeVar("Array", bound=ArrayType) | ||||||
|
|
||||||
|
|
@@ -1169,3 +1175,131 @@ def distance_correlation_af_inv( | |||||
| compile_mode=compile_mode, | ||||||
| ), | ||||||
| ) | ||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
| """ | ||||||
| A Statistically and Numerically Efficient Independence Test Based on | ||||||
| Random Projections and Distance Covariance | ||||||
|
|
||||||
| :cite:`b-dcov_random_projection`. | ||||||
|
|
||||||
| References | ||||||
| ---------- | ||||||
| .. bibliography:: ../refs.bib | ||||||
|
Owner
Use
Author
I would really appreciate it if you could guide me on how to do that. I searched for these names on the internet, but I am not certain how to do it. A step-by-step explanation would be very helpful.
Owner
Just use You can use this docstring as an example:
Author
I implemented the exact format that you mentioned.
||||||
| :labelprefix: B | ||||||
| :keyprefix: b- | ||||||
| """ | ||||||
|
|
||||||
|
|
||||||
| def gamma_ratio(p): | ||||||
| """ | ||||||
| Parameters | ||||||
|
Owner
We prefer Google style docstrings over NumPy style ones.
Owner
You closed the conversation without changing it. Please don't do that.
Author
I have replaced ''' with """.
Owner
Google style docstrings, as opposed to NumPy style docstrings, use a colon to specify sections such as
||||||
| ---------- | ||||||
| p : is the dimension of the data | ||||||
|
|
||||||
| Returns | ||||||
| ------- | ||||||
| TYPE float | ||||||
|
|
||||||
| This function evaluates the gamma ratio, which is | ||||||
| required to calculate the constants C_p and C_q (in function u_dist_cov_sqr_mv()) | ||||||
|
|
||||||
| """ | ||||||
|
|
||||||
| return np.exp(gammaln((p+1) / 2) - gammaln(p / 2)) | ||||||
|
|
||||||
|
|
||||||
|
|
||||||
| def rndm_projection(X, p): | ||||||
| """ | ||||||
| Parameters | ||||||
| ---------- | ||||||
| X : N x p, array of arrays | ||||||
| where, p: number of dimensions (p >= 1) and N: number of samples | ||||||
| p : number of dimensions (p >= 1) | ||||||
|
|
||||||
|
Palash123-4 marked this conversation as resolved.
|
||||||
| Returns | ||||||
| ------- | ||||||
| X_new : an array of size N | ||||||
| DESCRIPTION: Random projection of multivariate array | ||||||
| """ | ||||||
|
|
||||||
| # X_std = multivariate_normal.rvs( np.zeros(p), np.identity(p), size = 1) | ||||||
| X_std = np.random.standard_normal(p) | ||||||
|
|
||||||
| X_norm = np.linalg.norm(X_std) | ||||||
| U_sphere = np.array(X_std) / X_norm # Normalize X_std | ||||||
|
|
||||||
| if p > 1: | ||||||
| X_new = U_sphere @ X.T | ||||||
| else: | ||||||
| X_new = U_sphere * X | ||||||
| return X_new | ||||||
|
|
||||||
|
|
||||||
| def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ='mergesort'): | ||||||
|
Owner
Suggested change
Please add double quotes everywhere.
Author
I made the changes.
||||||
| """ | ||||||
| Parameters | ||||||
| ---------- | ||||||
| X : N x p, array of arrays, where p > 1 | ||||||
| Y : N x q, array of arrays, where q >= 1 | ||||||
| where p and q: number of dimensions of variable X and Y, respectively and N: number of samples | ||||||
|
|
||||||
| n_projs : Number of projections (integer type), optional | ||||||
| DESCRIPTION. The default is 500.(paper suggests: n_projs < N/logN, larger n_projs provides better results) | ||||||
| method : fast computation method either 'mergesort' or 'avl', optional | ||||||
| DESCRIPTION. The default is 'mergesort'. | ||||||
|
|
||||||
| Returns | ||||||
| ------- | ||||||
| omega_bar : Float type | ||||||
| DESCRIPTION: Produce fastly computed unbiased distance covariance between X and Y | ||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
| Examples: | ||||||
| >>> import numpy as np | ||||||
| >>> import dcor | ||||||
| >>> from scipy.stats import multivariate_normal | ||||||
| >>> mean_vector = [2, 3, 5, 3, 2, 1] | ||||||
| >>> matrix_size = 6 | ||||||
| >>> np.random.seed(123) # in order to achieve reproducible results | ||||||
| >>> A = 0.5 * np.random.rand(matrix_size, matrix_size) | ||||||
| >>> B = np.dot(A, A.transpose()) | ||||||
| >>> n_samples = 3000 | ||||||
| >>> mv = multivariate_normal( mean = mean_vector, cov = B) | ||||||
| >>> X = mv.rvs(size = n_samples, random_state = 123) | ||||||
| >>> X1 = X.T[:4] | ||||||
| >>> X2 = X.T[4:] | ||||||
| >>> print(f"Computing fast distance covariance = {u_dist_cov_sqr_mv(X1.T, X2.T)}") | ||||||
| """ | ||||||
|
|
||||||
| n_samples = np.shape(X)[0] | ||||||
| p = np.shape(X)[1] | ||||||
| if Y.T.ndim == 1: | ||||||
| q = 1 | ||||||
| else: | ||||||
| q = np.shape(Y)[1] | ||||||
|
|
||||||
| sqrt_pi_value = math.sqrt(math.pi) | ||||||
| C_p = sqrt_pi_value * gamma_ratio(p) | ||||||
| C_q = sqrt_pi_value * gamma_ratio(q) | ||||||
|
|
||||||
|
|
||||||
| X_proj = np.empty(( n_projs, n_samples)) | ||||||
| Y_proj = np.empty(( n_projs, n_samples)) | ||||||
|
|
||||||
| for i in range(n_projs): | ||||||
|
Palash123-4 marked this conversation as resolved.
|
||||||
| X_proj[i, :] = rndm_projection(X, p) | ||||||
| Y_proj[i, :] = rndm_projection(Y, q) | ||||||
| pass | ||||||
|
|
||||||
| omega_ = rowwise(u_distance_covariance_sqr, | ||||||
| X_proj, Y_proj, rowwise_mode = method) | ||||||
| omega_bar = C_p * C_q * np.mean(omega_) | ||||||
|
|
||||||
| return omega_bar | ||||||
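For readers following the diff, the estimator in `u_dist_cov_sqr_mv` can be sketched without dcor internals: project both samples onto random unit directions, compute the unbiased squared distance covariance of each pair of 1-D projections, average the results, and scale by C_p * C_q. The version below is only an illustration under stated assumptions: it uses a plain O(n^2) U-centering formula in place of the fast `rowwise` call in the PR, and every function name in it (`u_centered`, `u_dcov_sqr_1d`, `projection_constant`, `rp_dcov_sqr`) is hypothetical and not part of dcor.

```python
import numpy as np
from scipy.special import gammaln


def u_centered(d):
    """U-center a symmetric pairwise distance matrix (hypothetical helper)."""
    n = d.shape[0]
    row = d.sum(axis=1, keepdims=True) / (n - 2)
    col = d.sum(axis=0, keepdims=True) / (n - 2)
    total = d.sum() / ((n - 1) * (n - 2))
    centered = d - row - col + total
    np.fill_diagonal(centered, 0.0)  # U-centered matrices have a zero diagonal
    return centered


def u_dcov_sqr_1d(x, y):
    """Unbiased squared distance covariance of two 1-D samples, O(n^2)."""
    n = x.shape[0]
    a = u_centered(np.abs(x[:, None] - x[None, :]))
    b = u_centered(np.abs(y[:, None] - y[None, :]))
    return (a * b).sum() / (n * (n - 3))


def projection_constant(p):
    """C_p = sqrt(pi) * Gamma((p + 1) / 2) / Gamma(p / 2), via log-gamma."""
    return np.sqrt(np.pi) * np.exp(gammaln((p + 1) / 2) - gammaln(p / 2))


def rp_dcov_sqr(x, y, n_projs=50, seed=0):
    """Random-projection estimate of the squared distance covariance."""
    rng = np.random.default_rng(seed)
    # Accept 1-D inputs by treating them as a single column.
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    p, q = x.shape[1], y.shape[1]

    # Normalized Gaussian vectors are uniform on the unit sphere.
    u = rng.standard_normal((n_projs, p))
    u /= np.linalg.norm(u, axis=1, keepdims=True)
    v = rng.standard_normal((n_projs, q))
    v /= np.linalg.norm(v, axis=1, keepdims=True)

    x_proj = x @ u.T  # shape (n_samples, n_projs)
    y_proj = y @ v.T
    stats = [u_dcov_sqr_1d(x_proj[:, k], y_proj[:, k]) for k in range(n_projs)]
    return projection_constant(p) * projection_constant(q) * np.mean(stats)
```

Drawing all directions in one call and projecting with a single matrix product avoids the per-projection Python loop around `rndm_projection`. Passing a seed to `np.random.default_rng` keeps results reproducible without touching NumPy's global random state.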
The docstring needs to be inside the corresponding function.

I completed the task.
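As an illustration of the review remarks on docstrings (Google style, double quotes, docstring placed inside the function it documents), `gamma_ratio` from the diff might look like the sketch below. The wording of the docstring is an assumption, not the text that was eventually merged; the function body is unchanged from the diff.

```python
import numpy as np
from scipy.special import gammaln


def gamma_ratio(p):
    """Evaluate the ratio Gamma((p + 1) / 2) / Gamma(p / 2).

    The ratio is needed for the constants C_p and C_q used in
    ``u_dist_cov_sqr_mv``. Working with log-gamma values avoids
    overflow for large dimensions.

    Args:
        p: Dimension of the data (p >= 1).

    Returns:
        The gamma ratio as a float.
    """
    return np.exp(gammaln((p + 1) / 2) - gammaln(p / 2))
```

Google style marks each section with a name followed by a colon (`Args:`, `Returns:`) and indents its entries, whereas NumPy style underlines the section name with dashes.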