| 
 Web Corpora for Information Management

 We investigate the use of corpora with connectivity information (hyperlinks) for information management applications in specific domains. We will build a web corpus for the language technology domain, consisting of a database of documents (with a full-text index and meta-information) and a database of the hyperlinks between those documents. As a starting point for collecting the web corpus, we use the database of categorised web pages from LT-World. The information management applications include summarisation, categorisation, clustering, information extraction (discovery of relations), information retrieval, terminology extraction, and definition mining.
 |