Glycoscience: a tea party no longer

Later this year or early next, Richard Cummings plans to launch The Human Glycome Project. The launch will take place during a workshop that he is currently organizing, which is open to scientists from near and far. The workshop is slated to be held at the Radcliffe Institute for Advanced Study at Harvard University. Also in the works is a Harvard-based center for glycoscience that reaches out to potential collaborators at all Boston-area universities and academic medical centers.

Cummings, who hails from Alabama and who moved from Emory University School of Medicine to Harvard Medical School last fall, loves glycans, which are the ubiquitous carbohydrates made by all cells, and which can be linked to lipids or proteins. Both in humans and in a variety of animal species, the universe of glycolipids and glycoproteins is extraordinary, he says.

In Cummings’ box of plans is the development of a human reference glycome, so that the growing research community committed to these macromolecules can explore the diversity of the human glycome and develop methods and standards with which to do so. He also envisions comparative glycomics, the comparison of human, porcine and bovine glycomes to tease out differences and similarities. “It wasn’t possible before, really,” he says. But dreaming big in glycoscience is now becoming possible.

Glycobiology has been hampered by complicated methods, which his and other labs have been addressing over the years. In his recent work, published in the June issue of Nature Methods, the Cummings lab uses household bleach to release glycans from tissue and cells. He started this research at Emory School of Medicine and continued at his new lab at Harvard Medical School. He also directs the Center for Functional Glycomics, a virtual center that he already led at Emory and that is funded by the National Institutes of Health to explore protein-glycan interactions and to develop new tools and technologies to explore glycoconjugate functions.

When people now stop by the Cummings lab, they can, for example, leave with four grams of carbohydrates in a 50 ml tube full of white powder. “Those are all the carbohydrate structures in the pig lung,” he says. With this material on hand, scientists can use nuclear magnetic resonance techniques for glycan analysis.

Cummings and his team want to enable more labs around the world to study glycoscience by shipping material to colleagues upon request.

Glycans are difficult to synthesize, but now it is possible to harvest them from natural sources such as eggs, meat or plants. “We can make them at such large scale now, we’re going to just give them away,” he says. Once purified, glycans can be archived, printed on microarrays to explore glycan recognition by lectins, antibodies, bacteria or viruses, or sequenced with mass spectrometry, nuclear magnetic resonance techniques or other methods.

As researchers become aware of the role of carbohydrates in health and disease, the field of glycoscience is broadening, says Cummings. Glycans are being recognized as one of the four major classes of macromolecules, alongside nucleic acids, proteins and lipids.

In the 1970s and 1980s, this field was just getting its start and was considered merely another part of biochemistry. When carbohydrate researchers got together at meetings, it was more like “tea parties” with 50 to 100 attendees, says Cummings. Glycoscience was far from the spotlight. The community began using the term glycobiology, which Raymond Dwek coined in 1985 and which resonated with researchers. And then, Cummings says, “all of us kind of chose the term glycomics at some point to distinguish ourselves scientifically from proteomics and the other ‘omics.”

Studying glycan function preceded the study of carbohydrate structure, says Cummings, a situation not unlike that in molecular biology. For example, work by the chemist Linus Pauling on sickle cell disease occurred before the responsible mutation had been identified and before it was possible to sequence DNA. “We really didn’t know the gene until years later,” says Cummings. The molecular biology arena exploded when it became possible to clone and to synthesize oligonucleotides. “We’re at that point now in glycoscience,” he says.

These days it’s increasingly difficult for scientists to overlook glycans, says Cummings. Access and collaboration are what is needed next to grow the field now that researchers are more than willing to, as he says, “dip their little toes in the glycoscience waters.” That being said, he does still hear disparaging comments about glycoscience, but he takes the remarks as a matter of pride. “So you can think of glycans as being like that little awkward kid on the playground who grew up to be a sizable individual whom no one bullies anymore.”

Sharing data to advance structural biology

In our May editorial, we highlight two new archives for raw data: one for X-ray crystallography (the Structural Biology Data Grid, or SBDG) and one for cryo-EM (EMPIAR). These archives join the long-established Biological Magnetic Resonance Data Bank, or BMRB (which hosts biomolecular NMR spectral data), as important resources that will facilitate greater transparency and accelerate progress in structural biology.

Note that neither archive is intended as a “data dump”: datasets in the SBDG must be tied to a journal publication and must be sponsored by the principal investigator of the work, and datasets in EMPIAR must be tied to an Electron Microscopy Data Bank (EMDB) EM density map entry.

Though we do not mandate raw X-ray or cryo-EM data deposition at this time, we applaud these efforts and welcome feedback from the structural biology community about how well these archives are serving community needs.

Methods and probes for cleared tissue: an imperfect table

In the March issue of Nature Methods, the technology feature explores some ways that labs are optimizing probes to image cleared tissue. As we interviewed scientists, we learned about published work and ongoing unpublished experiences. Here is a snapshot of how some probes work with some clearing methods. It’s an imperfect table and is likely to evolve as research continues. We welcome your comments. We know there are different viewpoints and varying experiences and we hope it will be helpful to others to hear about them.

We wish this could be a wiki page; then again, that might deprive you of your nighttime rest as you edit one another’s entries. This way, your comments can be seen by all.

An imperfect table about some labels that have been used in cleared tissue
Probe Labeling shown in large samplesb Labeling shown in small samples Does not work well?
Genetically encoded fluorescent proteins 3DISCO (signal quenched after a few days), CLARITY, CUBIC, ExM (variant)a, PACT/PARS (signal retained 6 months and longer)a, sDISCOa BABB (signal quenched after a few hours), ClearT2, ExM (variant)a , ExM/ePACTa , ScaleA2, ScaleS, SeeDB,
Sucrose, TDE
Immunolabels 3DISCO, CLARITY, iDISCO, iDISCO+a, iSeeDB, PACT/PARS, Sucrose, SWITCH BABB, ClearT, ClearT2, CUBIC, SeeDB, SeeDB2a (in press) Sucrose, TDE, ExM
Specific nucleic acid detection EDC-CLARITY CLARITY, ExM (variant)a, PACT/PARS
Dyes and stains
Congo Red 3DISCO
Lipophilic dyes such as DiI, Sudan Black PACT/PARS (Sudan Black), SWITCH (DiI) ClearT, ScaleS, SeeDB
Nuclear stains such as DAPI, DRAQ5, SYTO and PI CUBIC, ExM (variant)a, PACT/PARS, SWITCH SeeDB, SeeDB2a
SNAP-tags with SiR probes PACT/PARS, BABB, Scale SeeDB

Clearing method: Solvent-based; simple immersion; hyperhydration; hydrogel-based.

aUnpublished information

bIndicates larger samples such as whole organs

Sources: E. Boyden, MIT; K. Chung, MIT; K. Deisseroth, Stanford University; H-U. Dodt, Vienna University of Technology/Medical University of Vienna; V. Gradinaru, Caltech; P. Heppenstall, EMBL; T. Imai, RIKEN; K. Johnsson, EPFL; J. Lichtman, D. Richardson, Harvard University; A. Miyawaki, RIKEN; M. Tessier-Lavigne, N. Renier, Rockefeller University.

Glossary of some tissue clearing agent acronyms

3DISCO: Three-dimensional imaging of solvent-cleared organs

sDISCO: Stabilized three-dimensional imaging of solvent-cleared organs

BABB: Benzyl alcohol and benzyl benzoate

CLARITY: Clear Lipid-exchanged Acrylamide-hybridized Rigid Imaging/Immunostaining/In situ hybridization-compatible Tissue-hYdrogel

CUBIC: Clear unobstructed brain imaging cocktails and computational analysis

EDC-CLARITY: 1-Ethyl-3-(3-dimethylaminopropyl)carbodiimide-CLARITY

ePACT: PACT-based expansion clearing

iDISCO: Immunolabeling-enabled 3D imaging of solvent-cleared organs

iDISCO+: Immunolabeling-enabled 3D imaging of solvent-cleared organs plus

PACT: Passive Clear Lipid-exchanged Acrylamide-hybridized Rigid Imaging/Immunostaining/In situ hybridization-compatible Tissue-hYdrogel

PARS: Perfusion-assisted agent release in situ

SeeDB: See Deep Brain

Spalteholz’s preparation: Benzyl benzoate/methyl salicylate

TDE: 2,2′-thiodiethanol

Some lab resource pages

Chung lab resources – Literature, protocols, videos and discussion pages from the MIT lab of Kwanghun Chung related to SWITCH, electrostochastic transport and CLARITY.

CLARITY Resources – Protocols, literature, data and videos related to CLARITY, developed in the lab of Karl Deisseroth at Stanford University. Includes links to the CLARITY Wiki and CLARITY Forum.

Expansion microscopy resources – Literature and protocols related to expansion microscopy developed in the MIT lab of Edward Boyden.

iDISCO resources – Literature, protocols and information about validated antibodies related to iDISCO.

SeeDB Resources – Protocols, literature and videos related to SeeDB, developed in the RIKEN lab of Takeshi Imai.

Genomics at top-speed: Q&A with Stephen Kingsmore

Stephen Kingsmore

Biomedical researcher Stephen Kingsmore is on the move. He has just taken on his new post running the new Rady Pediatric Genomics and Systems Medicine Institute, which is part of Rady Children’s Hospital in San Diego. He is leaving Children’s Mercy Hospital (CMH) in Kansas City where he founded the Center for Pediatric Genomic Medicine.

Kingsmore, along with colleagues at CMH, has also just published a method called STAT-seq, in which the team performs whole genome sequencing and analysis in 26 hours.

As Neil Miller, CMH’s director of informatics, explains, CMH is in the process of making most of the downstream characterization and interpretation software behind the STAT-seq pipeline freely available. The team also plans to make its warehouse of genetic variants available and wants to launch a software-as-a-service offering so that people without IT infrastructure can use these tools.

Nature Methods spoke to Kingsmore, and what follows is an edited version of the conversation.

Q: To better diagnose and make treatment decisions about seriously ill babies in intensive care you have found a way to sequence whole genomes of parents and their newborn and analyze them in 26 hours. The babies might be having unexplained seizures, parents are deeply upset. Does this involve a lot of people doing the analysis? Or is much of the analysis automated? Just thinking there might be a new world of jobs opening up.

Stephen Kingsmore: The analysis and interpretation are highly automated. That being said, there will be a new world of jobs opening up as folk like us scale up to meet the needs of local populations.

There are 9 million people living in the Rady catchment area, so we foresee a need for 25,000 parent or child genomes a year! That’s a very large number of new genetic counselors.

Q: How do you validate that the genomic analysis is right, especially under these high-pressure circumstances with high stakes? Sanger sequencing?

S.K.: Yes, Sanger sequencing or another appropriate confirmatory test, depending on the type of mutation.

There may also be a need for functional validation, since all that Sanger does is to say that the letter code is correct – it doesn’t speak to whether the mutation is actually causing disease.

Q: Structural variants are complicated to find, making for time-consuming analysis, but they play a role in many diseases. What is needed to make them part of speedy genome analysis?

S.K.: Yes, this is a key need. We need robust, fast methods for finding structural variants genome wide. Microarrays don’t pick up small structural variants or complex variants, like inversions.

This will be a race between longer read or longer insert whole genome sequencing and newer methods such as offerings from companies such as 10X Genomics and BioNano.

We then need to integrate the two types of variant information so we get a full picture of variations. And all of that can happen with the ease of interpretation and speed now possible for whole genome sequencing.

Q: There are a number of fast computational analysis pipelines such as Churchill, SpeedSeq and now yours. Is this officially a race of speed demons? One of them, SpeedSeq, describes genome analysis in 13 hours. How can these tools, and their sensitivity and specificity, be compared?

S.K.: We need a bake-off! I’m biased, but I think ours is fastest with its genome analysis that takes between one and one and a half hours. It has the highest sensitivity and specificity for nucleotide variants, and the smallest IT footprint for local implementation.

However, ours is not yet in the public domain nor yet available on a software-as-a-service basis, and does not yet have fully integrated structural variant calls. We hope to rectify these things by the end of the year.

Q: In your new study you use proprietary hardware by a company called Edico Genome; others are using open source software. Do people need to decide on belonging to the open or closed club when they want to try to implement what you are doing?

S.K.: Edico is a mix of hardware and software. The overall cost, I think, is significantly lower than traditional compute plus freeware. We are strongly focused on making freeware versions of the software described in the manuscript available by the end of the year.

That being said, there are some excellent commercial software options, and Genomics England has gone that route after their bake-off for the 100k Genome Project. So yes, people should really step back at this juncture and think critically about their needs over the next two to three years.

Q: The data, particularly on children’s genomes, are sensitive but also of great interest to researchers. How do you work out data-sharing schemes with these data?

S.K.: This is a delicate balancing act. There is great value for researchers to be able to re-analyze genomes together with structured clinical data, especially where a diagnosis was not evident.

We like the secure NIH database of Genotypes and Phenotypes (dbGaP) route, which balances the need for confidentiality, even of de-identified data, with the needs of researchers and funding agencies.