For instance, if you are interested in finding nuggets of information like 'Google acquired YouTube' or 'The gene Toll interacts with the adaptor proteins DmMyD88', these can be expressed in a wide variety of ways:
- The NYT today announced Google's acquisition of video hosting service, YouTube;
- Toll has been found by Chen et al. (2006) to interact with the adaptor proteins DmMyD88.
So a useful step, possibly in conjunction with open-source named entity recognizers, is to parse text containing the entities and relations of interest and to recover the grammatical relations (GRs) which hold between these entities and the surrounding words. For the sentences above, RASP will output the GRs:
- possessive(acquisition, Google), indirect-object(acquisition, of), object(of, YouTube), and
- subject(interact, Toll), indirect-object(interact, with), object(with, DmMyD88)
amongst others, and similar or identical GRs will be output from text expressing the same information. Thus RASP helps discover relevant text with high precision, going well beyond the capabilities of keyword search.
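As a rough illustration of why GRs are easier to match than raw strings, the sketch below checks whether a set of GRs links two entities through a head word and a preposition. It assumes the GRs have already been copied by hand into Python tuples; it does not call RASP or read RASP's actual output format, and the names `GR` and `expresses` are hypothetical helpers introduced only for this example.

```python
from typing import Iterable, NamedTuple


class GR(NamedTuple):
    """One grammatical relation, e.g. subject(interact, Toll)."""
    relation: str
    head: str
    dependent: str


def expresses(grs: Iterable[GR], head: str, agent: str, prep: str, target: str) -> bool:
    """Return True if the GRs link `agent` and `target` via `head` and `prep`.

    The agent may attach to the head as a subject (verbal head) or as a
    possessive (nominal head); this pairing is an assumption made for the
    two example sentences only.
    """
    gr_set = set(grs)
    agent_ok = any(GR(rel, head, agent) in gr_set for rel in ("subject", "possessive"))
    return (
        agent_ok
        and GR("indirect-object", head, prep) in gr_set
        and GR("object", prep, target) in gr_set
    )


# GRs transcribed by hand from the two example sentences in the text.
acquisition_grs = [
    GR("possessive", "acquisition", "Google"),
    GR("indirect-object", "acquisition", "of"),
    GR("object", "of", "YouTube"),
]
interact_grs = [
    GR("subject", "interact", "Toll"),
    GR("indirect-object", "interact", "with"),
    GR("object", "with", "DmMyD88"),
]

print(expresses(acquisition_grs, "acquisition", "Google", "of", "YouTube"))
print(expresses(interact_grs, "interact", "Toll", "with", "DmMyD88"))
```

The same three-relation pattern matches both sentences, even though one uses a noun ("acquisition") and the other a verb ("interact"); the intervening words ("by Chen et al. (2006)", "video hosting service") do not affect the match.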