For my recent Toolbox on 3D genome visualization tools, Nils Gehlenborg at Harvard Medical School clued me into two interesting pieces of software. One, HiGlass, was included in my article; a related tool, HiPiler, was not. But that doesn’t mean it’s not worth talking about.
HiPiler, says Gehlenborg, “is a tool to visualize individual features in [Hi-C] maps.” Think of it like a digital photo application that can find and extract all the faces in your image library*.
Tools like HiGlass allow researchers to navigate and explore chromatin contact matrices, data from experiments such as Hi-C that indirectly reveal how chromosomes fold. Sometimes, though, researchers need to focus on specific structural features, such as chromosomal loops or domains. Using bioinformatic algorithms, researchers can query chromatin-contact datasets to find all such structures. But these algorithms may return thousands of hits. How is a researcher to study them?
Enter HiPiler.
Developed by Gehlenborg’s lab in collaboration with the lab of Hanspeter Pfister at the John A. Paulson School of Engineering and Applied Sciences at Harvard, HiPiler excises these features and presents them on a canvas as miniature snippets — basically tiny segments of the full-sized contact map, which users can then organize, sort, and cluster based on parameters such as noise or location. Among other things, users can stack related snippets into a pile (hence the tool’s name) and study them either in aggregate or individually.
“We’re no longer bound to this matrix and having to navigate from one part of the matrix to another part,” Gehlenborg explains. “You can just grab whatever part you’re interested in, extract it, literally cut it out, and then look at these cutouts.”
The software maintains the link between each snippet and its origin using an integrated HiGlass viewer, which presents the snippets in the context of the original data matrix.
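For readers who think in code, the cut-out-and-sort idea can be sketched in a few lines of Python. This is a toy illustration, not HiPiler's implementation: the `Snippet` class, the `extract_snippet` and `variability` functions, and the synthetic matrix are all hypothetical names and data, and plain variance stands in here for whatever noise measure the real tool uses.

```python
from dataclasses import dataclass
from statistics import pvariance


@dataclass
class Snippet:
    """A square cutout that remembers where it came from (hypothetical type)."""
    row: int      # row of the cutout's centre in the full matrix
    col: int      # column of the cutout's centre in the full matrix
    pixels: list  # list of rows, each a list of contact values


def extract_snippet(matrix, row, col, half_width):
    """Cut a (2 * half_width + 1)-wide square centred on (row, col)."""
    top, bottom = row - half_width, row + half_width
    left, right = col - half_width, col + half_width
    if top < 0 or left < 0 or bottom >= len(matrix) or right >= len(matrix[0]):
        raise ValueError("snippet would extend past the edge of the matrix")
    pixels = [matrix[r][left:right + 1] for r in range(top, bottom + 1)]
    return Snippet(row, col, pixels)


def variability(snippet):
    """Assumed stand-in for a noise score: variance of all values in the cutout."""
    values = [value for line in snippet.pixels for value in line]
    return pvariance(values)


# Synthetic 6 x 6 "contact map": values fall off with distance from the diagonal.
toy_matrix = [[max(0, 10 - abs(i - j)) for j in range(6)] for i in range(6)]

# The user supplies the locations; the code only cuts them out.
locations = [(2, 3), (1, 1), (4, 1)]
snippets = [extract_snippet(toy_matrix, r, c, half_width=1) for r, c in locations]

# Sort the cutouts from least to most variable, keeping their origins attached.
ranked = sorted(snippets, key=variability)
for s in ranked:
    print((s.row, s.col), round(variability(s), 3))
```

Because each cutout carries its own `row` and `col`, it can always be traced back to its place in the full matrix, which is the role the integrated HiGlass viewer plays in the real tool.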
According to Gehlenborg, HiPiler fills a void in the Hi-C data analysis toolbox. Previously, researchers who wanted to compare, say, the performance of different feature-finding algorithms had to manually generate and compare hundreds or thousands of static screenshots. “In that effort,” he says, “HiPiler is extremely useful, because they can directly get all the tools they need to inspect whatever regions they’re identifying and whatever algorithm they might be developing.”
HiPiler’s strength, Gehlenborg continues, lies in revealing unexpected patterns. “There are certainly things that we might not be aware of, simply because we’ve never really had a view like the one that HiPiler is providing. And that’s where we’re now working with our collaborators on applying this approach to additional datasets.”
Bioinformatician Leonid Mirny, at the Massachusetts Institute of Technology in Cambridge, is one such collaborator. Mirny uses HiPiler in his research into chromatin folding, a process that involves the protein CTCF. Using ChIP-seq, which maps the locations of DNA-binding proteins to the genome, Mirny can identify all the genomic locations at which CTCF is found. He can use HiPiler to collect snippets for every location and aggregate them to see what the ‘average’ CTCF site looks like: the equivalent, he says, of asking Google Maps what the typical beach looks like.
He can also study the snippets individually and group them into classes to discover the features that make certain sites unique, like finding that while some beaches face west, others face north, south, or east.
“That’s what HiPiler would show you: what kind of structures are there that you don’t see in the average map.”
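The 'average versus individual' distinction is easy to see in a small sketch. The Python below is only an illustration under stated assumptions: `average_pile` and `deviation_from_pile` are hypothetical helper names, the three 2 x 2 snippets are made-up numbers rather than real Hi-C data, and mean absolute difference is one arbitrary way to score how far a snippet sits from the pile.

```python
def average_pile(snippets):
    """Element-wise mean of equally sized square snippets (lists of rows)."""
    if not snippets:
        raise ValueError("need at least one snippet to build a pile")
    count = len(snippets)
    size = len(snippets[0])
    return [
        [sum(s[r][c] for s in snippets) / count for c in range(size)]
        for r in range(size)
    ]


def deviation_from_pile(snippet, pile):
    """Mean absolute difference between one snippet and the pile average."""
    size = len(pile)
    total = sum(
        abs(snippet[r][c] - pile[r][c])
        for r in range(size)
        for c in range(size)
    )
    return total / (size * size)


# Made-up snippets: two look alike, one has a different pattern.
snippets = [
    [[4, 0], [0, 4]],
    [[4, 0], [0, 4]],
    [[1, 3], [3, 1]],
]

pile = average_pile(snippets)
deviations = [deviation_from_pile(s, pile) for s in snippets]

# The snippet that differs most from the average is a candidate for its own class.
outlier_index = max(range(len(snippets)), key=lambda i: deviations[i])
print(pile, deviations, outlier_index)
```

The pile summarizes what the sites have in common, while the per-snippet deviations flag the ones worth pulling out and inspecting on their own.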
Gehlenborg’s team have created a HiPiler demo at hipiler.higlass.io, as well as a video explaining how the software works. Their paper on HiPiler was accepted for publication in IEEE Transactions on Visualization and Computer Graphics, and will be presented at a data visualization conference in Phoenix, Arizona, in October.
*That’s not a perfect analogy, as HiPiler cannot actually find features; users have to tell it which regions to extract. But you get the idea.
Jeffrey Perkel is Technology Editor, Nature.
Image: screenshot/Jeffrey Perkel
Update (2017-09-11): The post has been updated to reflect the fact that HiPiler was a collaboration between Gehlenborg’s lab and that of Hanspeter Pfister.