True scientific revolution: radical sharing
Ready for the future? “Warming Ocean Threatens Sea Life” is it.
It’s not the climate change content of the article on which I’m focused at the moment; that’s a topic for another day. More singular, and worthy of our attention now, is that this article is the first in Scientific American to be accompanied by its own IPython notebook.
Have a question about how the author performed his statistical reductions, or exactly what was in his datasets? It’s all there. You can review or extend the calculations yourself. So can the five, or fifty thousand, other people in the world qualified to judge the analysis. This is how science deserves to be done.
I recognize every generation imagines its own shufflings to be unimagined leaps. In this case, perhaps, the conceit is justified. The science of recent decades has been deeply plagued by faulty and improperly analyzed data. The most certain fix for this defect is well-organized, transparent teamwork or collective action: “open-source science”. IPython, a fascinating project of research neuroscientist Fernando Pérez, already has a strong record of achievement in encouraging scientists and engineers to share not just their ideas, but details of data technique.
IPython isn’t alone; such other open-source products as Sage, TeX, R, the Open Genomics Engine, and arXiv have great stories to tell, and a few proprietary products, including SAS and MATLAB, have nurtured scientifically significant ecosystems of shared add-ons.
Science is generally recognized as one of humanity’s great achievements. This is despite, not because of, a rather pathetic record of managing the data and collaborative opportunities on which folklore holds science is built. We don’t yet know how good the wisely-tooled alternative can be.
What does this matter for working DevOps? Perhaps not much, at least not immediately. It does suggest a few tips, though:
- There’s really no excuse for static presentations. I recognize conventional corporate culture puts a premium on colorful and visually-rounded charts. Long-term value comes, though, when live data are liberated and visualized constructively, often in ways their original custodians didn’t imagine.
- It’s a great time to learn new tools. Just in the last day, I’ve come across announcements for courses in IPython-based computational fluid dynamics, “Data Science using R”, “Computational Electronics”, “Quantitative Cladistics and Use of TNT”, and much more.
- However low you estimate the quality of data “out there”, you’re probably aiming too high. (Nearly) everyone who seeks to verify specific scientific results reports that the data are dirtier than imagined.
- Look for ways to amplify your results with open source.