PCA using sklearn

Ask Time: 2018-10-31T02:15:59 Author: MAtennis9

I have a large input matrix, size (20, 20000) and am trying to perform PCA using the sklearn Python package. Here, 20 refers to 20 subjects, and 20,000 refers to 20,000 features. Below is sample code:

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(1)
X = rng.randn(20, 20000)
X.shape

>> (20, 20000)

pca = PCA(n_components=21)
pca.fit(X)
X_pca = pca.transform(X)
print("Original shape: ", X.shape)
print("Transformed shape: ", X_pca.shape)

>> Original shape: (20, 20000)
>> Transformed shape: (20, 20)

Using PCA, am I not able to get back more components than I have samples (here, 20 subjects)? Why is the number of PCA components limited by the number of samples?

Author: MAtennis9. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article: https://stackoverflow.com/questions/53070481/pca-using-sklearn

John Rouhana :

This has more to do with the PCA implementation than with sklearn, but:

    if n_samples <= n_features:
        maxn_pc = n_samples - 1
    else:
        maxn_pc = n_features

Namely, if your number of samples (n) is less than or equal to the number of features (f), the greatest number of non-trivial components you can extract is n-1, because centering the data removes one degree of freedom. Otherwise, the greatest number of non-trivial components is f.
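The n-1 limit can be checked numerically. The following is a minimal sketch using plain NumPy rather than sklearn's PCA class. It assumes a smaller feature count (2000 instead of 20000) to keep it fast, and the tolerance used to decide which singular values count as non-zero is an illustrative choice, not something taken from sklearn.

```python
import numpy as np

rng = np.random.RandomState(1)
n_samples, n_features = 20, 2000  # assumption: fewer features than in the question, for speed
X = rng.randn(n_samples, n_features)

# PCA subtracts the per-feature mean before decomposing the data.
# After centering, the rows sum to the zero vector, so one degree of freedom is lost.
X_centered = X - X.mean(axis=0)

# Singular values of the centered data; their squares are proportional
# to the variance explained by each principal component.
singular_values = np.linalg.svd(X_centered, compute_uv=False)

# Illustrative tolerance for "numerically zero" (an assumption, not sklearn's rule).
tol = singular_values.max() * max(X_centered.shape) * np.finfo(float).eps
n_nontrivial = int(np.sum(singular_values > tol))

print("Singular values returned:", len(singular_values))
print("Non-trivial components:  ", n_nontrivial)
```

With 20 samples, the decomposition returns 20 singular values, but the last one is zero up to floating-point error, so only 19 components carry any variance.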

2018-10-30T18:27:45
