(1) Todd K. Moon, Electrical and Computer Engineering Department, Utah State University, Logan, Utah;
(2) Jacob H. Gunther, Electrical and Computer Engineering Department, Utah State University, Logan, Utah.
Abstract and 1 Introduction and Background
2 Statistical Parsing and Extracted Features
7 Conclusions, Discussion, and Future Work
A. A Brief Introduction to Statistical Parsing
B. Dimension Reduction: Some Mathematical Details
This material is drawn from [27]. The trace of Sw measures how tightly the features of each class cluster around their respective centroids.
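As a small illustration of this quantity, the sketch below computes tr(Sw) as the sum of squared distances from each feature vector to the centroid of its own class. The toy data and the function names `centroid` and `trace_within_scatter` are hypothetical and chosen only for illustration; plain Python lists keep the sketch free of dependencies.

```python
# Hypothetical toy data (not from the paper): class label -> list of feature vectors.
classes = {
    "A": [[1.0, 2.0], [3.0, 2.0]],
    "B": [[6.0, 6.0], [6.0, 8.0], [6.0, 7.0]],
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def trace_within_scatter(classes):
    """Sum of squared distances from each vector to its own class centroid."""
    total = 0.0
    for vectors in classes.values():
        c = centroid(vectors)
        for a in vectors:
            total += sum((x - cx) ** 2 for x, cx in zip(a, c))
    return total

tr_sw = trace_within_scatter(classes)
```

A small value of `tr_sw` indicates that each class sits close to its own centroid.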
Note that Sw, being a sum of n outer products, generically has rank min(n, m). In the work here, the dimension m of the feature vectors is very large (m > n), so that rank(Sw) = n < m and the m × m matrix Sw is singular.
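The singularity can be checked on a toy case with more features than samples. The sketch below is a minimal illustration, assuming n = 2 samples from a single class and m = 3 features; the data and the helper names `within_scatter` and `det3` are hypothetical. It only verifies that the determinant vanishes and does not compute the rank itself.

```python
# Hypothetical toy case (not from the paper): n = 2 samples of one class, m = 3 features.
samples = [[1.0, 0.0, 2.0], [3.0, 4.0, 0.0]]

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def within_scatter(vectors):
    """Sum of outer products of the deviations from the centroid (an m x m matrix)."""
    c = centroid(vectors)
    m = len(c)
    S = [[0.0] * m for _ in range(m)]
    for a in vectors:
        d = [x - cx for x, cx in zip(a, c)]
        for r in range(m):
            for s in range(m):
                S[r][s] += d[r] * d[s]
    return S

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

S_w = within_scatter(samples)
det_sw = det3(S_w)  # vanishes because there are fewer samples than features
```

A vanishing determinant means that Sw cannot be inverted directly, which is why criteria that rely on the inverse of Sw need special treatment in this setting.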
Similarly, tr(Sb) measures the total distance between the cluster centroids and the overall centroid.
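A minimal sketch of this between-cluster measure follows. It assumes the common convention in which each class contributes the squared distance from its centroid to the overall centroid, weighted by the number of samples in the class; that weighting, the toy data, and the function names are assumptions made for illustration.

```python
# Hypothetical toy data (not from the paper): class label -> list of feature vectors.
classes = {
    "A": [[1.0, 2.0], [3.0, 2.0]],
    "B": [[6.0, 6.0], [6.0, 8.0], [6.0, 7.0]],
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def trace_between_scatter(classes):
    """Size-weighted squared distances from class centroids to the overall centroid."""
    all_vectors = [a for vectors in classes.values() for a in vectors]
    c = centroid(all_vectors)
    total = 0.0
    for vectors in classes.values():
        ci = centroid(vectors)
        # Assumption: each class is weighted by its number of samples.
        total += len(vectors) * sum((x - y) ** 2 for x, y in zip(ci, c))
    return total

tr_sb = trace_between_scatter(classes)
```

A large value of `tr_sb` indicates that the class centroids are well separated from the overall centroid.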
Cluster quality is measured by the degree to which tr(Sw) is small and tr(Sb) is large.
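For orientation, a classical criterion of this kind from the discriminant-analysis literature is sketched below. It is offered as an assumption about the general form of such a measure, not as the exact expression used in this work; G denotes a linear transformation applied to the feature vectors, and the inverse is assumed to exist.

```latex
J(G) = \operatorname{tr}\!\left( \left( G^{T} S_w G \right)^{-1} G^{T} S_b \, G \right)
```

Larger values of J(G) correspond to clusters that are tight relative to their separation. Because the expression relies on an inverse, it is not directly usable when the within-cluster scatter is singular in the transformed space.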
To express the algorithm, the scatter matrices Sw, Sb and Sm are written in terms of three factor matrices Hw, Hb and Hm.
That is, Hw, Hb and Hm form factors of the respective scatter matrices.
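The factorization can be illustrated with a short sketch. It assumes the convention common in the discriminant-analysis literature: the columns of Hw are deviations of each vector from its class centroid, the columns of Hb are deviations of each class centroid from the overall centroid scaled by the square root of the class size, and the columns of Hm are deviations of each vector from the overall centroid, so that each scatter matrix equals its factor times the factor's transpose. These definitions, the toy data, and the helper `gram` are assumptions for illustration.

```python
import math

# Hypothetical toy data (not from the paper): class label -> list of feature vectors.
classes = {
    "A": [[1.0, 2.0], [3.0, 2.0]],
    "B": [[6.0, 6.0], [6.0, 8.0], [6.0, 7.0]],
}

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def gram(columns):
    """Return H times H-transpose, where H has the given vectors as its columns."""
    m = len(columns[0])
    return [[sum(col[r] * col[s] for col in columns) for s in range(m)]
            for r in range(m)]

all_vectors = [a for vectors in classes.values() for a in vectors]
c = centroid(all_vectors)

# Build the columns of the assumed factors Hw, Hb and Hm.
hw_cols, hb_cols = [], []
for vectors in classes.values():
    ci = centroid(vectors)
    hw_cols += [[x - y for x, y in zip(a, ci)] for a in vectors]
    hb_cols.append([math.sqrt(len(vectors)) * (x - y) for x, y in zip(ci, c)])
hm_cols = [[x - y for x, y in zip(a, c)] for a in all_vectors]

S_w, S_b, S_m = gram(hw_cols), gram(hb_cols), gram(hm_cols)
```

Under these assumed definitions the three matrices satisfy Sm = Sw + Sb entry by entry, which gives a quick consistency check on any implementation of the factors.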
The algorithm for computing G, adapted from Algorithm 1 of [27], is shown below.
This paper is available on arXiv under a CC BY 4.0 license.