Enhancing a biomedical information extraction system with dictionary mining and context disambiguation

IBM Journal of Research and Development, Sep-Nov 2004 by Mukherjea, S, Subramaniam, L V, Chanda, G, Sankararaman, S, et al.

12. L. Hirschman, J. C. Park, J. Tsujii, L. Wong, and C. H. Wu, "Accomplishments and Challenges in Literature Data Mining for Biology," Bioinform. 18, No. 12, 1553-1561 (2002).

13. C. Nobata, N. Collier, and J. Tsujii, "Automatic Term Identification and Classification," Proceedings of the 5th Natural Language Processing Pacific Rim Symposium, Beijing, China, 1999, pp. 369-374.

14. D. Yarowsky, "Unsupervised Word Sense Disambiguation Rivaling Supervised Methods," Proceedings of the 33rd Annual Meeting of the Association for Computational Linguistics, Cambridge, MA, 1995, pp. 189-196.

15. V. Hatzivassiloglou, P. A. Duboue, and A. Rzhetsky, "Disambiguating Proteins, Genes, and RNA in Text: A Machine Learning Approach," Bioinform. 1, No. 1, 1-10 (2001).

16. P. Resnik and D. Yarowsky, "A Perspective on Word Sense Disambiguation Methods and Their Evaluation," Proceedings of the Association for Computational Linguistics (ACL/SIGLEX) Workshop, Washington, DC, 1997, pp. 79-87.

17. P. Zweigenbaum and N. Grabar, "A Contribution of Medical Terminology to Medical Language Processing Resources: Experiments in Morphological Knowledge Acquisition from Thesauri," Proceedings of the International Medical Information Association (IMIA-WG6) Conference, 1999, pp. 131-141.

18. Genia Corpus; see http://www-tsujii.is.s.u-tokyo.ac.jp/~genia/topics/Corpus/.

19. E. Fredkin, "TRIE Memory," Commun. ACM 3, 490-500 (1960).

20. W. A. Gale, K. W. Church, and D. Yarowsky, "One Sense Per Discourse," Proceedings of the DARPA Speech and Natural Language Workshop, Harriman, NY, 1992, pp. 233-237.

21. J. Thomas, D. Milward, C. Ouzounis, S. Pulman, and M. Carroll, "Automatic Extraction of Protein Interactions from Scientific Abstracts," Proceedings of the Pacific Symposium on Biocomputing, Hawaii, 2000, pp. 541-551.

22. A. Bairoch and R. Apweiler, "The SWISS-PROT Protein Sequence Databank and Its New Supplement TrEMBL," Nucleic Acids Res. 25, 31-36 (1997).

Received October 15, 2003; accepted for publication January 6, 2004

Sougata Mukherjea IBM Research Division, IBM India Research Laboratory, Block I, Indian Institute of Technology (IIT), Hauz Khas, New Delhi 110016 (smukherj@in.ibm.com). Dr. Mukherjea is a Research Staff Member in the IBM India Research Laboratory. He received his bachelor's degree from Jadavpur University, Calcutta, his M.S. degree from Northeastern University, Boston, and his Ph.D. degree from the Georgia Institute of Technology, Atlanta (all in computer science). Before joining IBM, he held research and software architect positions in Silicon Valley companies including NEC USA, BEA Systems, and Verity. His research interests include information visualization and retrieval and applications of text mining in areas such as Web search and bioinformatics.

L. Venkata Subramaniam IBM Research Division, IBM India Research Laboratory, Block I, Indian Institute of Technology (IIT), Hauz Khas, New Delhi 110016 (lvsubram@in.ibm.com). Dr. Subramaniam has been a Research Staff Member in the IBM India Research Laboratory since 1998. He received his bachelor's degree in electronics and communication engineering from the P. E. S. College of Engineering, his M.S. degree in electrical engineering from Washington University, St. Louis, and his Ph.D. degree in electronics from the Indian Institute of Technology, Delhi. His research interests include unstructured information management, statistical natural language processing, machine learning, text mining, and the application of these technologies.


 

