Document image analysis by probabilistic network and circuit diagram extraction

Informatica, Oct, 2005 by Andras Barta, Istvan Vajk

The paper presents a hierarchical object recognition system for document processing. It is based on a spatial tree structure representation and a Bayesian framework. The image components are built up from lower level image components stored in a library, and the tree representations of the objects are assembled from these components. A probabilistic framework is used in order to obtain robust behaviour. The method is able to convert general circuit diagrams to their components and store them in a hierarchical data structure. The paper presents a simulation for extracting the components of sample circuit diagrams.

Summary (Povzetek): An object recognition system for document processing is presented.

Keywords: Document processing, Bayesian network, Object recognition

1 Introduction

Optical technology has gone through significant development during the past few years. Still, enormous quantities of documents exist only in printed form, which makes them difficult to store and access. Automatic document processing should be able to provide a solution: the documents should be digitized and their information extracted and stored in a format that retains its structure. Several good solutions exist for document processing and analysis, but their efforts are mainly focused on character recognition tasks. This paper addresses a special document processing application, the interpretation of circuit diagrams. Many old blueprints of electrical equipment are sitting on shelves. Converting them to a meaningful digital representation would make it possible to search and retrieve them by content.

In this paper we present a method to convert general circuit diagrams to their components and store them in a hierarchical data structure. The task of an object recognition system is to represent images by a set of image bases. In this research a hierarchical structure of bases is selected in order to be able to represent the complexity of the circuits.

Circuit reconstruction has to be performed at several levels. At the lowest level the image pixels are processed and low level image objects (edges, lines and arcs) are extracted. At the middle level the basic circuit components are constructed from these elements. At the highest level the electrical connections of the components are interpreted. This paper deals mainly with the middle level. Many papers investigate low level image processing algorithms; for example, Heath [19] provides a good comparison of the most frequently used edge detection methods. Rosin [21] investigates ellipse fitting and also compares some of the methods. Arc extraction is also well treated in the literature [24]. At the highest level the electrical interpretation is strongly application dependent and it is not treated here.

In image processing the selection of the data structure is an important and open question. Generally, the structural relationships of the object components can be captured by graphs [7]. In many vision applications, however, a simpler data structure is sufficient to represent the image components. In this research a tree structure is used. Tree structures are widely applied for image processing tasks. In many cases object recognition is treated as a tree isomorphism problem [13], [14]: the tree of the object is created and compared against a library tree. In our research a different approach is used: the tree is identified by an adaptive process, and only those image components are processed that are necessary for growing the tree.
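As a minimal sketch of the kind of hierarchical data structure discussed here, the following Python fragment assembles a component tree from a small library. The library contents, the names `Node`, `LIBRARY`, `grow` and `count_leaves`, and the decomposition of a resistor into lines and a rectangle are illustrative assumptions, not taken from the paper. The sketch shows only the tree representation; it does not model the adaptive, probabilistic growth process.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One image component; its children are the lower level parts it is built from."""
    label: str
    children: List["Node"] = field(default_factory=list)


# Hypothetical library: each composite component lists its lower level parts.
# Labels that do not appear as keys (here "line") are treated as primitives.
LIBRARY: Dict[str, List[str]] = {
    "resistor": ["line", "rectangle", "line"],
    "rectangle": ["line", "line", "line", "line"],
}


def grow(label: str) -> Node:
    """Expand a label into a tree using the library; primitives become leaves."""
    parts = LIBRARY.get(label, [])
    return Node(label, [grow(part) for part in parts])


def count_leaves(node: Node) -> int:
    """Count the primitive (leaf) components below a node."""
    if not node.children:
        return 1
    return sum(count_leaves(child) for child in node.children)


tree = grow("resistor")
```

In this toy library the resistor tree has three children, and the rectangle child expands further into four line primitives, so the hierarchy has two levels below the root.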

For robust image and document processing systems a probabilistic approach is desirable. Since the appearance of objects varies from image to image, a probabilistic model is needed to represent this variation. Another reason for using a probabilistic description is to quantify the knowledge that is collected about an object during recognition; this is the belief interpretation of probability. A Bayesian network addresses both needs, and it is used for the implementation because it provides a probabilistic representation, a data structure to store the extracted information, and an inference algorithm. Another significant advantage of the network representation is that the operating code and the data are completely separated. Many methods are based on probabilistic trees. Pearl presented a tree based belief network inference with linear complexity [11]. Dynamic tree structures are gaining popularity because of their better object representation capabilities [1], [16]. Markov random field models present good solutions for low level image processing applications [12], but they lack hierarchical object representation capabilities.

In the next section the related literature is reviewed in more detail. The extracted information consists of two components: the library that contains the image bases, and the coding of the input circuit diagram. In order to be able to encode images the system has to go through a two phase learning process: first the image bases of the library are learned, then the network parameters.

In Section 3 a few issues related to image coding are investigated. Section 4 treats the theoretical background that is used for creating the document processing system; Bayesian networks, network parameter learning and the creation of the visual vocabulary are investigated there. Section 5 shows how network inference can be implemented for circuit diagram extraction, and also presents a simulation for extracting the components of sample circuit diagrams. Section 6 explores the possibility of using the presented method for integrated document processing. Finally, the last section concludes the paper by raising some issues related to extending the method to other applications.
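The belief interpretation of probability mentioned in this section can be illustrated with a single application of Bayes' rule. The sketch below is an assumption-laden toy example, not the inference algorithm of the paper: the function name `update_belief`, the two candidate component classes, and all numerical values are hypothetical.

```python
from typing import Dict


def update_belief(prior: Dict[str, float], likelihood: Dict[str, float]) -> Dict[str, float]:
    """Return the posterior P(class | evidence) from P(class) and P(evidence | class)."""
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence has zero probability under every class")
    return {c: value / total for c, value in unnormalized.items()}


# Hypothetical numbers: equal prior belief in two component classes, and an
# observed feature that is more likely to occur on a resistor.
prior = {"resistor": 0.5, "capacitor": 0.5}
likelihood = {"resistor": 0.8, "capacitor": 0.2}
posterior = update_belief(prior, likelihood)
```

Each new piece of evidence can be folded in the same way by using the current posterior as the next prior, which is one way to read the statement that the probabilities quantify the knowledge collected about an object during recognition.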

 

