Detection and classification of defect patterns on semiconductor wafers

IIE Transactions, Dec, 2006 by Chih-Hsuan Wang, Way Kuo, Halima Bensmail

3.3. Principle of the Gaussian EM algorithm

Gaussian-mixture-based models are commonly used as a basis for cluster analysis, and Gaussian EM algorithms based on maximum likelihood or maximum a posteriori (MAP) estimation are popular and powerful tools. Bensmail and Celeux (1996) provided a good alternative, called eigenvalue decomposition discriminant analysis, and used it to analyze 14 discrimination models. Moreover, Bensmail et al. (1997) studied a stochastic Gaussian clustering approach. The Gaussian EM model assumes that the population of interest consists of G different subpopulations. Observations $x_1, x_2, \ldots, x_n$ in $\mathbb{R}^d$ (n is the number of observations and d denotes the input dimension) are assumed to arise from a random vector with a joint probability density of:

$$f(x; \theta) = \sum_{k=1}^{G} p_k f_k(x; \theta_k), \tag{3}$$

where G is the number of components and $p_k$ is the probability that an observation belongs to the kth component. These mixing probabilities satisfy $p_k \geq 0$ and $\sum_{k=1}^{G} p_k = 1$. If the mixture kernel $f_{\gamma_i}(x_i; \theta_k)$ is multivariate normal (MVN), where $\theta_k = (\mu_k, \Sigma_k)$ and $\gamma_i = k$ if $x_i$ belongs to the kth component, the density function based on its mean $\mu_k$ and covariance $\Sigma_k$ can be computed as:

$$f_k(x_i \mid \mu_k, \Sigma_k) = \frac{\exp\left\{-\frac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\right\}}{(2\pi)^{d/2} |\Sigma_k|^{1/2}} \tag{4}$$
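Equations (3) and (4) can be evaluated directly. The following is a minimal NumPy sketch, not the authors' implementation; the function names `mvn_density` and `mixture_density` and the two-component parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def mvn_density(x, mu, sigma):
    """Equation (4): multivariate normal density at a single point x."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Solve sigma @ z = diff instead of forming the explicit inverse.
    quad = diff @ np.linalg.solve(sigma, diff)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * quad) / norm

def mixture_density(x, weights, mus, sigmas):
    """Equation (3): weighted sum of the G component densities."""
    return sum(p * mvn_density(x, m, s) for p, m, s in zip(weights, mus, sigmas))

# Hypothetical two-component mixture in d = 2 (illustrative values only).
weights = [0.6, 0.4]
mus = [np.zeros(2), np.array([3.0, 0.0])]
sigmas = [np.eye(2), np.array([[2.0, 0.0], [0.0, 0.5]])]
value = mixture_density(np.zeros(2), weights, mus, sigmas)
```

Here `value` is the mixture density at the origin; the first component dominates because the origin is its mean.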

Owing to the geometry of the MVN distribution, the covariance matrix can be decomposed as $\Sigma_k = \lambda_k D_k A_k D_k^T$ (Bensmail and Celeux, 1996), where the superscript T denotes the matrix transpose, $\lambda_k$ is the largest eigenvalue of the covariance (controlling the hyper-volume occupied by cluster k through $\lambda_k^d |A_k|$), $D_k$ is the matrix of its eigenvectors (which determines the orientation of the principal components of cluster k) and $A_k$ is a diagonal matrix (which determines the shape of the covariance). With the diagonal entries of $A_k$ sorted in descending order ($A_k = \mathrm{diag}\{\alpha_{1k}, \ldots, \alpha_{dk}\}$, $1 = \alpha_{1k} > \alpha_{2k} > \cdots > \alpha_{dk} > 0$), the kth cluster tends to be hyper-spherical if all diagonal elements $\alpha_{jk}$ are of similar magnitude, whereas it appears as a line if $\alpha_{2k} \ll 1 = \alpha_{1k}$. Thus, the Gaussian kernel naturally covers two categories of defect patterns, namely the linear scratch and elliptic zone patterns. In brief, all the geometric features (shape, volume, orientation) of the mixture components are summarized by the covariance matrix $\Sigma_k$. Common instances include $\Sigma_k = \lambda I$ (I is the identity matrix), where all clusters are spherical and of the same size; $\Sigma_k = \Sigma$, constant across clusters, where all clusters have the same geometry but need not be spherical; and unrestricted $\Sigma_k$, where each cluster may have a different geometry (see Bensmail et al., 1997). The classical approach maximizes the likelihood:

[FIGURE 4 OMITTED]

$$L(\theta, \gamma \mid x) = \prod_{i=1}^{n} f_{\gamma_i}(x_i; \theta_{\gamma_i}).$$
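Returning briefly to the covariance decomposition into volume, orientation and shape described before this likelihood, the factors can be recovered numerically from an eigendecomposition. The sketch below assumes that the scale factor is the largest eigenvalue, so that the leading shape entry equals 1; the helper name `decompose_covariance` and the rotated test covariance are hypothetical.

```python
import numpy as np

def decompose_covariance(sigma):
    """Return (lam, D, A) with sigma = lam * D @ A @ D.T and A[0, 0] == 1."""
    eigvals, eigvecs = np.linalg.eigh(sigma)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # re-sort in descending order
    eigvals, D = eigvals[order], eigvecs[:, order]
    lam = eigvals[0]                           # largest eigenvalue sets the scale
    A = np.diag(eigvals / lam)                 # shape matrix with leading entry 1
    return lam, D, A

# Hypothetical elongated cluster: variances 4 and 0.04, rotated by 30 degrees.
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
sigma = R @ np.diag([4.0, 0.04]) @ R.T

lam, D, A = decompose_covariance(sigma)
reconstructed = lam * D @ A @ D.T
second_shape_entry = A[1, 1]  # far below 1 suggests a line-like (scratch) cluster
```

A second shape entry close to 1 would instead indicate a roughly spherical cluster, matching the distinction drawn in the text between scratch and zone patterns.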

Under the assumption of an MVN distribution, the likelihood then becomes:

$$L(\theta, \gamma) \propto \prod_{k=1}^{G} \prod_{i \in G_k} |\Sigma_k|^{-1/2} \exp\left\{-\frac{1}{2}(x_i - \mu_k)^T \Sigma_k^{-1} (x_i - \mu_k)\right\}. \tag{5}$$
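The logarithm of Equation (5) is easier to work with numerically than the product itself. The sketch below is an illustration under stated assumptions rather than the authors' code: it reads $G_k$ as the set of observations whose label is k, and the function name `classification_loglik` and the four-point data set are hypothetical. It also checks that, for a shared spherical covariance, the log-likelihood depends on the labels only through the within-cluster sum of squares.

```python
import numpy as np

def classification_loglik(X, labels, mus, sigmas):
    """Log of Equation (5), up to an additive constant."""
    total = 0.0
    for k, (mu, sigma) in enumerate(zip(mus, sigmas)):
        diff = X[labels == k] - mu                  # rows are x_i - mu_k for cluster k
        _, logdet = np.linalg.slogdet(sigma)        # log |Sigma_k|
        # Sum of the quadratic forms diff_i^T Sigma_k^{-1} diff_i over the cluster.
        quad = np.einsum("ij,ij->", diff @ np.linalg.inv(sigma), diff)
        total += -0.5 * len(diff) * logdet - 0.5 * quad
    return total

# Hypothetical data: two tight groups of two points each in d = 2.
X = np.array([[0.0, 0.0], [0.2, -0.1], [3.0, 3.1], [2.9, 3.0]])
labels = np.array([0, 0, 1, 1])
mus = [X[labels == k].mean(axis=0) for k in range(2)]

# Shared spherical covariance lam * I (lam is an illustrative value).
lam = 0.5
sigmas = [lam * np.eye(2)] * 2
loglik = classification_loglik(X, labels, mus, sigmas)

# Closed form for the spherical case, in terms of the within-cluster sum of squares.
sse = sum(((X[labels == k] - mus[k]) ** 2).sum() for k in range(2))
spherical = -0.5 * X.shape[0] * X.shape[1] * np.log(lam) - sse / (2.0 * lam)
```

Because `loglik` and `spherical` agree, maximizing the classification likelihood over the labels in this special case amounts to minimizing the within-cluster sum of squares.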

In comparison, the traditional K-means method is equivalent to maximizing the MVN classification likelihood when every cluster shares the same covariance matrix and that matrix is proportional to the identity matrix ($\Sigma_k = \lambda I$). The expectation and maximization steps of the EM algorithm are now briefly described.


 

Content provided in partnership with Thompson Gale