Extending, trimming and fusing WordNet for technical documents

Research output: Chapter in Book / Report / Conference proceeding › Conference contribution › Academic

Abstract

This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, showing considerable improvement (7%-11%). The condensed hierarchies can be used as browse interfaces to the documents, complementing retrieval.
Original language: English
Title of host publication: Proceedings on NAACL-2001 workshop on WordNet and other lexical resources applications, extensions and customizations, Pittsburgh, USA, June 2001
Publisher: The Association for Computational Linguistics
Publication status: Published - 2001
