Friday, 25 January 2019

Jan 24: More on the LDA classifier

Before the actual content, we discussed the course groupwork. Each group should submit their predictions to Kaggle as described in the assignment text.

Next, we studied a demo of projecting 2D data to 1D with different degrees of separation. The animation below recaps the idea: a good projection makes the classes clearly distinct in the 1D projected space. This is also reflected in the Fisher score: a bigger score means better separation. Luckily we do not need to try all directions (as in the demo), but can use the LDA formula to find the best direction with one line of code.


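To make the idea concrete, here is a minimal NumPy sketch (not the actual course demo) that sweeps over candidate projection directions and computes the Fisher score, J(w) = (m1 - m2)^2 / (s1^2 + s2^2), of the projected 1D data; the synthetic two-class data is only an assumption for illustration.

    import numpy as np

    def fisher_score(X1, X2, w):
        # Project both classes onto direction w and measure separation:
        # squared distance of projected means over the sum of projected variances.
        w = w / np.linalg.norm(w)
        p1, p2 = X1 @ w, X2 @ w
        return (p1.mean() - p2.mean()) ** 2 / (p1.var() + p2.var())

    # Synthetic 2D two-class data (illustrative only).
    rng = np.random.default_rng(0)
    X1 = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], 200)
    X2 = rng.multivariate_normal([2, 2], [[1.0, 0.8], [0.8, 1.0]], 200)

    # Sweep over projection angles like in the demo: the direction with the
    # largest Fisher score gives the clearest class separation in 1D.
    angles = np.linspace(0, np.pi, 180, endpoint=False)
    scores = [fisher_score(X1, X2, np.array([np.cos(a), np.sin(a)])) for a in angles]
    best = angles[int(np.argmax(scores))]
    print("best projection angle (deg):", round(float(np.degrees(best)), 1))
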
The LDA projection vector can be found in two ways: 1) solve a generalized eigenvalue problem, or 2) use the simpler formula:

             w = (S1 + S2)^(-1) (m1 - m2),

with S1 and S2 the covariance matrices of the two classes and m1 and m2 the respective class means. It was also mentioned that if the covariance matrices are identity matrices, the vector w reduces to the difference of the class means (m1 - m2), i.e., the direction connecting the two centers of mass. In that case the class distributions look circular. However, if the distributions are elliptic, this direction alone is not enough and the covariance term is needed to correct it.
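
As a minimal sketch of the formula (again with synthetic data standing in for the demo's, so an assumption rather than the lecture code), the closed-form direction really is about one line of NumPy:

    import numpy as np

    rng = np.random.default_rng(0)
    X1 = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], 200)  # class 1
    X2 = rng.multivariate_normal([2, 2], [[1.0, 0.8], [0.8, 1.0]], 200)  # class 2

    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)   # class means
    S1 = np.cov(X1, rowvar=False)               # class 1 covariance
    S2 = np.cov(X2, rowvar=False)               # class 2 covariance

    # The "one line": w = (S1 + S2)^(-1) (m1 - m2), solved without forming the inverse.
    w = np.linalg.solve(S1 + S2, m1 - m2)
    w = w / np.linalg.norm(w)                   # only the direction matters
    print("LDA projection direction:", w)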

It should be noted that LDA assumes that the class distributions are Gaussian. In real life this is seldom exactly the case, and how much the violation matters depends on the data (LDA may or may not work; you just need to experiment).

In connection with the eigenvector-based approach, we discussed the use of LDA for dimensionality reduction. The more commonly used dimensionality reduction technique, PCA, compresses the data onto the directions of maximum variance. However, a large variance does not necessarily mean that a direction is important for classification. Therefore, LDA can provide an alternative, as it finds a set of directions that maximize class separation (instead of variance).
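
As a hedged sketch of this comparison (the scikit-learn API and the digits dataset are illustrative choices, not anything specified in the lecture):

    from sklearn.datasets import load_digits
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    X, y = load_digits(return_X_y=True)

    # PCA: unsupervised, keeps the directions of maximum variance.
    X_pca = PCA(n_components=2).fit_transform(X)

    # LDA: supervised, keeps directions that maximize class separation
    # (at most n_classes - 1 of them).
    X_lda = LinearDiscriminantAnalysis(n_components=2).fit_transform(X, y)

    print(X_pca.shape, X_lda.shape)  # same shape, very different projections

The main practical limit is that LDA can produce at most n_classes - 1 components, whereas PCA is limited only by the data dimension.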
