The TAP classifier has been tightly integrated with the RASP toolkit, making it easy to run experiments that find the optimal set of feature types and instances for a particular classification task, whether at the document, passage, sentence or (sub)sentence level. However, in many real-world applications it is not possible to train a classifier in a fully supervised fashion because the available data is only partially or noisily labelled.

A significant part of the research undertaken with the toolkit has explored bootstrapping and other semi-supervised techniques that circumvent the need for large quantities of well-annotated training data. In areas such as anonymisation (Medlock, 2006) and biomedical named entity recognition (Vlachos et al., 2006) we have been successful in bootstrapping accurate classifiers from text automatically annotated with RASP.
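
As an illustration of the general idea only, the following is a minimal sketch of self-training, one common form of bootstrapping. It does not use or reproduce the TAP or RASP interfaces, and it is not the procedure of the cited work: the naive Bayes model, the function names (`train`, `predict`, `self_train`), the confidence threshold of 0.6 and the toy data are all assumptions made for the example. A small labelled seed set stands in for the automatically annotated text; the classifier repeatedly labels the unlabelled pool and adds only its confident predictions to the training set.

```python
from collections import Counter
import math


def train(labelled):
    """Fit per-class token counts for a multinomial naive Bayes model."""
    counts = {}
    docs = Counter()
    for tokens, label in labelled:
        counts.setdefault(label, Counter()).update(tokens)
        docs[label] += 1
    vocab = {t for c in counts.values() for t in c}
    return counts, docs, vocab


def predict(model, tokens):
    """Return (best_label, posterior probability of that label)."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    log_scores = {}
    for label, c in counts.items():
        denom = sum(c.values()) + len(vocab)  # add-one smoothing
        score = math.log(docs[label] / total_docs)
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                score += math.log((c[t] + 1) / denom)
        log_scores[label] = score
    # Normalise in a numerically stable way to obtain posteriors.
    m = max(log_scores.values())
    exp = {lab: math.exp(s - m) for lab, s in log_scores.items()}
    z = sum(exp.values())
    best = max(exp, key=exp.get)
    return best, exp[best] / z


def self_train(seed, unlabelled, threshold=0.6, max_rounds=5):
    """Grow the training set with confident predictions (hypothetical helper)."""
    labelled = list(seed)
    pool = list(unlabelled)
    for _ in range(max_rounds):
        model = train(labelled)
        confident, rest = [], []
        for tokens in pool:
            label, p = predict(model, tokens)
            (confident if p >= threshold else rest).append((tokens, label))
        if not confident:  # nothing new was learned; stop early
            break
        labelled.extend(confident)
        pool = [tokens for tokens, _ in rest]
    return train(labelled), labelled


# Toy data, invented for this example.
seed = [(["protein", "gene"], "bio"), (["name", "address"], "anon")]
unlabelled = [["gene", "expression"], ["address", "phone"], ["phone", "number"]]
model, labelled = self_train(seed, unlabelled)
```

In this toy run the instance `["phone", "number"]` shares no tokens with the seed set and is left unlabelled in the first round; it is picked up in the second round only because `["address", "phone"]` was added in the first. The threshold controls the usual trade-off in such schemes: a lower value grows the training set faster but admits more label noise.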