Journal Article

Domain adaptation for semantic role labeling in the biomedical domain

Daniel Dahlmeier and Hwee Tou Ng

in Bioinformatics

Volume 26, issue 8, pages 1098-1104
Published in print April 2010 | ISSN: 1367-4803
Published online February 2010 | e-ISSN: 1460-2059 | DOI:



Motivation: Semantic role labeling (SRL) is a natural language processing (NLP) task that extracts a shallow meaning representation from free-text sentences. Several efforts to create SRL systems for the biomedical domain have been made during the last few years. However, state-of-the-art SRL relies on manually annotated training instances, which are rare and expensive to prepare. In this article, we address SRL for the biomedical domain as a domain adaptation problem to leverage existing SRL resources from the newswire domain.

Results: We evaluate the performance of three recently proposed domain adaptation algorithms for SRL. Our results show that by using domain adaptation, the cost of developing an SRL system for the biomedical domain can be reduced significantly. Using domain adaptation, our system can achieve 97% of the performance with as few as 60 annotated target-domain abstracts.

Availability: Our BioKIT system, which performs SRL in the biomedical domain as described in this article, is implemented in Python and C and runs under the Linux operating system. BioKIT and the domain adaptation software are available for download. The BioProp corpus is available from the Linguistic Data Consortium.


Journal Article.  5264 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
