Document image analysis by probabilistic network and circuit diagram extraction

Informatica, Oct, 2005 by Andras Barta, Istvan Vajk

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (5)

T is an operator that performs an orthogonal linear transformation on the image bases. The parameters of the transformation are stored in the parameter vector [r.sub.i]. The image bases may be parameterized by an attribute vector [a.sub.i]. Since features belong to parameterized feature classes, the [a.sub.i] vector is needed to identify their parameters. This description defines a tree structure. The tree is constructed from its nodes and a library, where the library is a list of common, frequently used image bases. Figure 2 illustrates the object tree and the library.

[FIGURE 2 OMITTED]

Visual information is inherently spatially ordered, so the tree is defined to represent these spatial relationships. The transformation T has three components: displacement, rotation and scaling. The four parameters of the transformation of node i are placed in a reference vector

[r.sub.i] = [[x.sup.r.sub.i] [s.sup.r.sub.i] [[phi].sup.r.sub.i]] (6)

where [x.sup.r.sub.i] = [[x.sup.r.sub.i] [y.sup.r.sub.i]] is the position of the image element in the coordinate system of its parent node, [s.sup.r.sub.i] is the scaling parameter and [[phi].sup.r.sub.i] is the rotation angle. The object can be reconstructed from the object tree, the object library and the image coordinate system. Each picture element or feature is represented in its own local coordinate system. Since only two-dimensional objects are used, the scale factor is the same for both axes. Each image base is defined in a unit coordinate system and stored in the library. When the image of an object is reconstructed, the image base is transformed from the library to a new coordinate system, which can be described by the vector [i.sub.i] = [[x.sub.i] [s.sub.i] [[phi].sub.i]]. This coordinate system is calculated from the [r.sub.i] reference vector of the node and the image coordinate system of the parent node, [i.sub.i-1]. The reconstruction is recursive and iterates through the tree.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (7)

With this reconstruction algorithm the tree representation of an object can be compared against the image.
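The recursive reconstruction described above can be illustrated with a short sketch. Equation (7) is not reproduced in this copy, so the composition rule below is an assumption: it is the standard way of chaining two-dimensional similarity transforms (translate, scale, rotate) and may differ in detail from the authors' formula. The names `Frame`, `Node`, `compose` and `reconstruct` are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Frame:
    """A coordinate system [x, y, s, phi]: position, scale, rotation angle."""
    x: float
    y: float
    s: float
    phi: float


@dataclass
class Node:
    """Tree node holding its reference vector r_i relative to the parent."""
    name: str
    ref: Frame
    children: list = field(default_factory=list)


def compose(parent: Frame, ref: Frame) -> Frame:
    """Assumed rule: express a child's reference vector in image coordinates.

    The child's position is rotated and scaled by the parent frame and then
    shifted by the parent's position; scales multiply and angles add.
    """
    c, s = math.cos(parent.phi), math.sin(parent.phi)
    x = parent.x + parent.s * (c * ref.x - s * ref.y)
    y = parent.y + parent.s * (s * ref.x + c * ref.y)
    return Frame(x, y, parent.s * ref.s, parent.phi + ref.phi)


def reconstruct(node: Node, parent: Frame = Frame(0.0, 0.0, 1.0, 0.0)) -> dict:
    """Walk the tree recursively and return the image frame of every node."""
    frame = compose(parent, node.ref)
    frames = {node.name: frame}
    for child in node.children:
        frames.update(reconstruct(child, frame))
    return frames


# Usage: a root element with one child placed one unit along the root's x axis.
child = Node("child", Frame(1.0, 0.0, 0.5, 0.0))
root = Node("root", Frame(10.0, 0.0, 2.0, math.pi / 2), [child])
frames = reconstruct(root)
```

In this sketch every image base would be drawn from its unit coordinate system into the frame computed for its node, so the whole object follows from the root frame and the stored reference vectors.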

4.2 Network Parameters

The network structure is determined by the conditional probabilities p(y | x), where x and y are nodes of the network. Conditionally independent nodes are not connected by an edge. To define the network, the p(y | x) parameters have to be calculated. These parameters can be estimated from experimental training data. In our case of document processing, the network is trained on circuit diagrams. Here it is assumed that the image bases of an object description are independent. The probability parameters [[theta].sub.i,j] are learned as relative frequencies. It can be shown that the distribution of the [[theta].sub.i,j] parameters is a Dirichlet distribution [22], [4]. The conditional probabilities of the network can be described by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (8)

where [n.sub.k] is the number of times node k occurs in the sample data and n = [[summation].sup.L.sub.k=1] [n.sub.k] is the sample size. For positive integer arguments the [GAMMA](x) function reduces to the factorial function, [GAMMA](x) = (x-1)!. The parameters of the Dirichlet distribution correspond to the physical probabilities and the relative frequencies,
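The relative-frequency estimate and the Gamma function identity mentioned above can be checked numerically with a short sketch. Equation (8) is not reproduced in this copy, so the density below is an assumption: it is the standard Dirichlet form with the counts [n.sub.k] as parameters, which matches the symbols defined in the text but may not be the authors' exact expression. The function names are hypothetical.

```python
import math


def relative_frequencies(counts):
    """Estimate theta_k as n_k / n, where n is the sample size."""
    n = sum(counts)
    return [n_k / n for n_k in counts]


def dirichlet_density(theta, counts):
    """Assumed standard form:
    Gamma(n) / prod_k Gamma(n_k) * prod_k theta_k ** (n_k - 1).

    Log-gamma is used so that large counts do not overflow.
    """
    n = sum(counts)
    log_norm = math.lgamma(n) - sum(math.lgamma(n_k) for n_k in counts)
    log_kernel = sum((n_k - 1) * math.log(t) for n_k, t in zip(counts, theta))
    return math.exp(log_norm + log_kernel)


# Usage: three node types observed 2, 1 and 1 times in the training data.
counts = [2, 1, 1]
theta = relative_frequencies(counts)
density = dirichlet_density(theta, counts)

# Gamma(x) = (x - 1)! for positive integers, for example Gamma(5) = 4!.
gamma_matches_factorial = math.isclose(math.gamma(5), math.factorial(4))
```

The mean of a Dirichlet distribution with parameters [n.sub.k] is [n.sub.k]/n, which is why the relative frequencies are a natural point estimate of the probability parameters.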


 


Content provided in partnership with Thompson Gale