Thursday, September 05, 2013

"Lost Branches on the Tree of Life" - why must the answer be enforcing behaviour?

Bryan Drew and colleagues have published a piece in PLoS Biology bemoaning the lack of databased phylogenies:

Drew, B. T., Gazis, R., Cabezas, P., Swithers, K. S., Deng, J., Rodriguez, R., Katz, L. A., et al. (2013). Lost Branches on the Tree of Life. PLoS Biology, 11(9), e1001636. doi:10.1371/journal.pbio.1001636 (see also blog post Dude, Where’s My Data?)

This is an old problem (see for example "Towards a Taxonomically Intelligent Phylogenetic Database" doi:10.1038/npre.2007.1028.1), but alas the solution proposed by Drew et al. is also old:

"Optimally, all peer-reviewed journals that publish phylogenetic datasets should require deposition (and activation for public access) of alignments and trees prior to publication, and these trees and alignments will include the same characters and taxa (and taxon names) as in the published study."

In my opinion, as soon as you start demanding that people do something you've lost the argument, and you're relying on power ("you don't get to publish with us unless you do 'x'"). This is also lazy. In a talk I gave to the NSF AVATOL meeting I argued that this is the wrong approach: when building shared resources, carrots are better than sticks.

In that talk I used the example of Mendeley, which built an incredibly valuable resource (a bibliography of academic research in the cloud that they sold for $US 100M) by providing a service that meets people's needs ("where's that damn PDF again?"). No browbeating, no "you must do this", just clever social engineering.

So, my challenge to the phylogenetics community (and the authors of "Lost Branches on the Tree of Life" in particular) is to stop resorting to bullying people, and ask instead how you could make it a no-brainer for people to share their trees. In other words, build something people actually need and will be inspired to contribute to.