Document image analysis by probabilistic network and circuit diagram extraction

Informatica, Oct, 2005 by Andras Barta, Istvan Vajk

Step 5: The probability of the child modes propagates upward as new evidence. The upward probabilities are combined to calculate the probability of root b.

Step 6: Only the high probability nodes are processed, the others are neglected. This is true for both the upward and downward object hypotheses. This process creates a structure with several root nodes. These root nodes can be an input to a next level of recognition step. The root nodes are either subcomponents that the algorithm will grow further or they are the final solutions.

These root nodes are the final solution only for our processing. They can be the input to a higher level processing in which the electrical connections of the circuit diagram are interpreted. Since the number of root nodes quantifies the state of the circuit diagram processing, it can be used to evaluate the performance of the method. The search method is adaptive and local; only certain area of the image is processed at a time. This is advantageous for images with noise or clutter.

Several simulations have been performed to evaluate the method. The algorithm is programmed in Matlab object oriented environment. The circuit diagrams processed and the components are extracted in a few seconds on a regular PC. Figure 8 shows the results of the simulations. The processing have been run 20 times and the mean and standard deviation of the result is calculated and plotted.

5.2 Complexity of the Algorithm

For a general Bayesian network the worst case computational complexity is n([2l.sup.2] 2l [lk.sub.m]) , where n is the number of nodes and l is the number of possible states of the nodes and [k.sub.m] is the maximum number of children [22]. The complexity calculation in our case is different, since it is necessary to calculate also the r spatial transformation parameter vector for every node. The algorithm proceeds by creating upward and downward object hypotheses. The processing of the circuit diagram involves identifying the circuit components as projected library bases. This identification is performed by projections and comparisons; therefore it is unavoidable of having a minimum overhead of calculations. The algorithm is evaluated by the overhead complexity which is calculated from the complexities of the failed and successful object hypotheses. The average wasted calculation of creating one upward ([x.sub.i] [left arrow] y) hypothesis is

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.]

The algorithmic complexity of calculating one upward hypothesis is [c.sup.u.sub.p]. This value is constant for every node and it is the sum of the complexity of calculating the spatial transformation and the probability updating, [k.sup.y.sub.xi] is the number of the child nodes of node [x.sub.i] with the same library index as y. For example if y is a line and [x.sub.i] is an opamp then [K.sup.line.sub.opamp] = 4 that is they can be connected four different ways. The [mu]y term is added to account for the symmetries of the objects. [I.sub.0] (p([x.sub.i][y)) is an indicator function; its value is 0 if p([x.sub.i]|y) = 0 and 1 otherwise. The complexity, due to the sparse coding does not depend on the library size. It depends only on L.sup.u.sub.y], the number of nonzero upward conditional probability values of node y. The right hypothesis is found with p([x.sub.i] [y) probability therefore the wasted effort is proportional to 1-p([x.sub.i] | y). The calculation for every hypothesis propagates downward. The algorithmic complexity of calculating the downward hypothesis for one node is [c.sup.d.sub.H] . This value is constant for every node and it is the sum of the complexity of calculating the spatial projection, the image base comparison and the probability updating. During the downward propagation the projected library base is compared against an image element. In case of failed hypotheses the result of the comparisons are large error term which causes the downward calculations to abort. The numbers of these failed node comparisons are different for every object, but for the complexity estimation an average [sigma] value is used (its typical value is 2-3). The average wasted complexity of a failed object hypothesis is

 

BNET TalkbackShare your ideas and expertise on this topic

Please add your comment:

  1. You are currently: a Guest |
  2.  

Basic HTML tags that work in comments are: bold (<b></b>), italic (<i></i>), underline (<u></u>), and hyperlink (<a href></a)

advertisement
advertisement
  • Click Here
  • Click Here
  • Click Here
  • Click Here
advertisement
Click Here

Content provided in partnership with Thompson Gale