Before the actual content, we discussed the course groupwork. Each group
should submit their predictions to Kaggle as described in the
assignment text.
Next, we studied a demo of projecting 2D data to 1D with different
degrees of separation. The below animation recaps the idea: a good
projection makes the classes nicely distinct in the 1D projected space.
This is also reflected in the Fisher score: a larger score means better
separation. Luckily we do not need to try all directions (like in the
demo); the LDA formula finds the best direction with one line of code.
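To make the demo idea concrete, here is a minimal sketch (not from the lecture material) that sweeps over candidate 1D directions on synthetic two-class Gaussian data and picks the one with the highest Fisher score; the data and the fisher_score helper are my own illustrative choices:

import numpy as np

def fisher_score(X1, X2, w):
    # Project both classes onto direction w and measure separation:
    # (difference of projected means)^2 / (sum of projected variances)
    p1, p2 = X1 @ w, X2 @ w
    return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

# Synthetic two-class 2D data (illustrative only)
rng = np.random.default_rng(0)
X1 = rng.multivariate_normal([0, 0], [[2, 1], [1, 1]], size=200)
X2 = rng.multivariate_normal([2, 2], [[2, 1], [1, 1]], size=200)

# Brute-force search over projection directions, as in the demo
angles = np.linspace(0, np.pi, 180)
directions = np.column_stack([np.cos(angles), np.sin(angles)])
scores = [fisher_score(X1, X2, w) for w in directions]
best_w = directions[np.argmax(scores)]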
The LDA projection vector can be found in two ways: 1) solve a generalized eigenvalue problem, or 2) use the simpler closed-form formula:
w = (S1 + S2)^(-1) (m1 - m2),
with S1 and S2 the covariance matrices of the two classes and m1 and m2 the respective class means. It was also mentioned that if the covariance matrices are identity matrices, w simplifies to the difference of the class means (m1 - m2), i.e., the direction connecting the two centers of mass. In that case the classes look circular. However, if the distributions are elliptic, this is not enough, and the covariance term is needed to correct the direction.
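Continuing the sketch above, the closed-form direction takes only a couple of lines of NumPy (again an illustration, assuming the X1 and X2 arrays defined earlier):

# Closed-form LDA direction: w = (S1 + S2)^(-1) (m1 - m2)
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S1 = np.cov(X1, rowvar=False)
S2 = np.cov(X2, rowvar=False)
w = np.linalg.solve(S1 + S2, m1 - m2)   # solve a linear system instead of inverting explicitly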
It should be noted that LDA assumes that the class distributions are Gaussian. In real life this is seldom the case, and the severity of the violation depends on the application: LDA may or may not work, so you just need to experiment.
In the eigenvector-based approach, we discussed the use of LDA for dimensionality reduction. The more commonly used dimensionality reduction technique, PCA, compresses the data onto the directions of maximum variance. However, a large variance does not necessarily mean importance for classification. LDA therefore provides an alternative: it finds the subset of directions that maximize class separation instead of variance, as the sketch below illustrates.
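As a rough illustration of the difference, this sketch compares the two reductions with scikit-learn; the Iris data and the two-component choice are my own example, not something used on the course:

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# PCA keeps the directions of maximum variance and ignores the labels
X_pca = PCA(n_components=2).fit_transform(X)

# LDA keeps the directions that maximize class separation
# (at most n_classes - 1 components, here 2 for 3 classes)
X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)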
