Tuesday, March 27, 2012

BHL and GBIF as biomedical databases

When I think of the Biodiversity Heritage Library (BHL) or GBIF I tend to think of taxonomy and biodiversity. Folk wisdom has it that BHL is full of old books, mostly pre-1923. Great for finding old taxonomic names, or nice artwork, but not exactly "modern" biology. GBIF is mainly about displaying organism distributions based on museum specimens, the primary data of taxonomic research. Again, great stuff, but aren't museums simply full of dead stuff that people have collected and forgotten about?

But BHL has a lot more post-1923 content than I suspect most people realise (several museum or society journals have 21st century issues in BHL's archives, for example). Continuing the theme of linking BHL and GBIF content, as part of a forthcoming project on taxonomic names (to be made available "real soon now") I stumbled across this 1976 paper in BHL (now in BioStor):

Monograph on "Lithoglyphopsis" aperta, the snail host of Mekong River Schistosomiasis by Davis et al..

Malacologia157576inst 0263

This paper has been indexed in PubMed (PMID:948206, but as far as I'm aware, BHL (and BioStor) has the only digital copy of this paper. (As a side note, wouldn't it be great if PubMed could link to BHL content?).

The article page in BioStor shows a map derived from the OCR text, showing a two localities:

Mekong

Below the map are the specimen codes I've automatically extracted from the OCR text, linked to the corresponding records in GBIF, which are georeferenced (e.g., ANSP Malacology 330925).

If we joined these things up just a little more, we could do some useful things. For example, what if a researcher searching in PubMed for schistosomiasis in South East Asia could find the Davis et al. paper, and then go to BHL or BioStor to read it? What if a researcher looking at gastropod distributions in the Mekong River in the GBIF portal could see that BHL had publications on diseases associated with these organisms (as well as their taxonomy and biology). We could also traverse the link from GBIF to BHL to PubMed and provide a direct route from distribution maps to biomedical literature.

It seems there's scope for trying to connect BHL, GBIF, and PubMed, and that BHL and GBIF may have important roles to play in providing access to basic information about organisms that have a serious impact on human populations.