TechBlog: Git: The reproducibility tool scientists love to hate

{credit}PLOS Comput Biol, 12, e1004668 (2016){/credit}

Early in his graduate career, John Blischak found himself creating figures for his advisor’s grant application.

Blischak was using the programming language R to generate the figures, and as he iterated and optimized his code, he ran into a familiar problem: Determined not to lose his work, he gave each new version a different filename (analysis_1, analysis_2, and so on) but failed to document how the versions had evolved.

“I had no idea what had changed between them,” says Blischak, who is now a postdoctoral scholar at the University of Chicago. “If the professor were to come back and say, ‘which version did you use to create this figure?’ I would have had no idea.”

Later, while attending a workshop on basic research computing skills, he discovered a better approach: Git.
