Thanks to Geoff Bilder’s excellent presentation at UKSG this April, I finally caught up with Nature Publishing Group’s Open Text Mining Interface (OTMI) and with why it could help transform the way STM literature is used.
I was fortunate enough to talk early in the morning while people were still lively (talk is here), and there were several questions afterwards, both in the Q&A and later during the breaks and the poster session at the end of the workshop. Some of the questions and observations are listed below:
1. General. There seems to be too little appreciation that OTMI is being proposed as a standard framework and methodology for disclosing subscription full text for text mining. That is, most of the features are parametrized, and it is up to individual publishers to determine, e.g., whether a snippet is a paragraph or a phrase, whether snippets are randomized or not, etc.
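To make the point about parametrization concrete, here is a rough sketch in Python of what "publisher-chosen" snippet settings could look like. It is not OTMI's actual format or API: the function `make_snippets` and its parameters `unit`, `shuffle`, and `seed` are all hypothetical names invented for illustration, and a naive sentence split stands in for "phrase".

```python
import random
import re


def make_snippets(text, unit="paragraph", shuffle=False, seed=None):
    """Split full text into snippets whose size and order are configurable.

    All names here are hypothetical; this is not part of OTMI itself.
    """
    if unit == "paragraph":
        # Paragraphs are assumed to be separated by one or more blank lines.
        parts = re.split(r"\n\s*\n", text)
    elif unit == "sentence":
        # Naive split: break on whitespace that follows '.', '!' or '?'.
        parts = re.split(r"(?<=[.!?])\s+", text)
    else:
        raise ValueError(f"unknown unit: {unit!r}")

    # Drop empty pieces and surrounding whitespace.
    snippets = [p.strip() for p in parts if p.strip()]
    if shuffle:
        # A publisher may choose to randomize the order of the snippets.
        random.Random(seed).shuffle(snippets)
    return snippets


article = (
    "Gene A binds protein B. The effect is strong.\n\n"
    "We repeated the assay twice."
)
print(make_snippets(article, unit="paragraph"))
print(make_snippets(article, unit="sentence", shuffle=True, seed=1))
```

The same article yields two paragraph-sized snippets in reading order under the first call, and three sentence-sized snippets in shuffled order under the second. Which combination a publisher picks is exactly the kind of decision the framework leaves open.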
Of course, OTMI is far from being the only game in town as regards text mining or semantic enrichment of STM literature. Here are a few links to other interesting projects I’ve been looking at recently:
- WikiProteins (see also Key biology databases go wiki in Nature 445 (691) – subscription required)
- Project Prospect (RSC’s semantic enrichment project)
- uBioRSS (I blogged on this a few days ago here)
- SWAN Project (Semantic Web Applications in Neuroscience)
- Neurocommons (“an Open Source knowledge management platform for biological research”)