Travelling the petabyte highway is harder than it looks

For the past year, the Large Hadron Collider (LHC), the world’s most powerful particle accelerator, has been gathering data at an astonishing rate. Just as astonishing is what happens to those data after they are gathered: they are fed through a global computing system known as the Grid and distributed to literally thousands of physicists all over the world. The Grid is pretty much the only way that the masses of data produced by the collider can be processed. Without it, the LHC would be an elaborate performance art project on the French-Swiss border.

About three months ago, I had what seemed at the time to be a good idea. I would follow a small quantity of data through the Grid to see where it went and how it was used. “How hard could it be?” I thought. The data would all be in tidy little files, the files would have time stamps, locations and so on. Piece of cake.

As I learned the hard way, that’s not how the Grid works. While it’s true that files are given dates and times, figuring out where a packet of data came from is a lot more difficult than just reading the file name. Data are copied, split, copied again and deleted. Bits and pieces travel all over the Grid, often without any input from a human user. In fact, as explained to me by Andrew Sansum, the Grid is so vast and complicated that attempts to model it using the Grid have actually failed.

It took a lot of hard work from more than half a dozen physicists and IT professionals, but we did eventually track some of the data from the first run of 2010 all the way to Chicago, Illinois. Unfortunately, the piece was too short to name everyone involved, but I’d like to especially thank Ian Bird and Frédéric Hemmer from CERN, who provided a lot of the details about how things work there; Alastair Dewhurst and Brian Davies at Rutherford Lab, who tracked the data passing through their bit of the Grid; and grad students Imai Jen-La Plante and Eric Feng at the University of Chicago, who helped me figure out what the data had been used for.

Many others were involved, including some named in the story. Hopefully, despite the pain, the story will give you a sense of how the coolest piece of scientific software in the world actually works!

If you’d like to watch the Grid in action, I’d suggest checking out this excellent Grid Dashboard from CERN (requires Google Earth). For real-time monitoring, there’s also a nice Java application written by the CMS group’s GridPP members at Imperial College in London.
