From the archives (2004): Large-scale structural variation in the human genome

Scherer_Lee

{credit}Iafrate et al. Nature Genetics 2004{/credit}

During the past 25 years, Nature Genetics has been lucky to publish many exciting papers, more than a few of which can be described as “landmark” papers—publications that have had a dramatic and long-lasting impact on a field. In 2004, the Journal published such a study by Stephen Scherer, Charles Lee and colleagues (Iafrate et al.) in which they reported 255 loci across the human genome containing large structural variants.

In 2017, the idea that there exist large numbers of structural variants in the genome (such as rearrangements, deletions and insertions) that differ from person to person is an established fact. But in 2004, this was not the prevailing wisdom. Prof. Scherer has already written an excellent essay at The Winnower about the study and its importance to the field, so I won’t recap it in detail here; I will simply encourage you to read the piece.

Charles Lee wrote us about the study by email. “I saw a talk by Dr. Dan Pinkel at the 2002 ASHG meeting where he presented his latest array CGH findings,” he remembers. “In his talk, one of the slides showed the array CGH results of a trisomy 18 patient and Dan remarked how cleanly his array platform performed, especially for the other chromosomes. But in fact, I (and others, I’m sure) could see that there were actually occasional clones that deviated from the expected log2 ratio of 0. During the question period, I sheepishly asked him about these clones. I really didn’t mean to criticize his platform, but I think that he took it that way. Those “blips” bothered me and when I returned to Boston, John Iafrate (who was a postdoc with me at the time) began our own array CGH experiments. Ironically, there were several other groups that were way ahead of us with respect to technical expertise and experience with array CGH, but it could be that they considered these “blips” as technical artifacts – without biological implications.”

Prof. Lee added, “In late 2003, I gave a talk at the University of Toronto and met Stephen Scherer in person for the first time. In a casual conversation, we realized that we were both using the same 1 MB chromosome microarray platform from Spectral Genomics and that we were both seeing these recurrent ‘blips’ in our data.”

Stephen Scherer also corresponded with us by email about the study and the mutual decision to collaborate with the Lee lab. “We were both fresh enough to look beyond what others were calling ‘noise’ to realize these aberrations represented intermediate and gene-level copy number variation.”

“Many of us suspected it was there,” he said of the large-scale variation they uncovered, “based on the fact there were lots of smaller indels and that 0.6% of the population carried cytogenetic alterations. We kind of predicted it in our chromosome 7 mapping and sequence paper, but only at the chromosomal level.”

ng1416-F1

Circles to the right of each chromosome ideogram show the number of individuals with copy gains (blue) and losses (red) for each clone among 39 unrelated, healthy control individuals. Green circles to the left indicate known genome sequence gaps within 100 kb of the clone, or segmental duplications known to overlap the clone, as compared to the Human Recent Segmental Duplication Browser. Cytogenetic band positions are shown to the left. {credit}Fig. 1 from Iafrate et al. 2004{/credit}

The study by Iafrate et al. was published on August 1, 2004. Exactly one week prior, a very similar study by Michael Wigler and colleagues (Sebat et al.) was published in Science. The methods used by the two groups were different, but the findings and implications were consistent with each other. “Charles and I were happy to see the Wigler paper,” said Prof. Scherer, “because nobody believed our results.” Prof. Lee added, “This was one of the most difficult papers for me to publish. The reviewers were very skeptical. We had to keep providing more and more validation data, and one of the reviewers even commented that s/he did not believe that the paper was worthy of being an article and we had to shorten the paper into a Brief Communication. At the end, Reviewer #2, who was persistently negative, wrote: ‘… I still feel hesitant about publication of this work in Nature Genetics… and I still doubt the importance and novelty of their work.’” Prof. Scherer remembers similar levels of skepticism in the community. “Prior to publication I was showing the data at talks, including one at Michigan where they were trying to recruit me, and I remember getting trashed. People in my own department were mostly the same.”

[I looked up the referee reports and internal notes from the review process and Prof. Lee is correct that at least one of the reviewers was very skeptical about the impact of the study. However, I do want to note the very unusual fact, at least by today’s standards, that the study was published a little more than 2 months after initial submission, according to our records. I wish this was more common!]

After publication, however, the importance of the studies was immediately clear, at least to those working most closely in the field. Nigel Carter contributed a News and Views article in Nature Genetics about the studies. He wrote, “This unexpected level of LCV [large-scale copy-number variation] forces us to re-evaluate our view of the structure of the normal human genome.”

However, Prof. Lee remembers some ongoing skepticism about the work. “For more than 18 months after the paper was published, I had trouble getting grant funding for continuing my work in human copy number variation. Some comments that I received included, ‘If this was real, the Human Genome Project would have found it.’ I am embarrassed to say that I was forced to write for smaller grants on other topics and when funded, did everything I could to complete the projects using less money and use the ‘extra’ funds for my human copy number variation interests. It was very, very frustrating.”

In 2007, Science announced Human Genetic Variation as the Breakthrough of the Year. “When I saw this article in Science,” Prof. Lee said, “I felt like there was finally some widespread acceptance of our findings in the general scientific community.”

“However, this came with different issues.” For example, he often received the response from the GWAS community that structural variation is interesting, but it is too difficult to incorporate into GWAS. “So, most association studies continued to focus on SNPs, which is a problem that persists to this very day.”

The findings in Iafrate et al. were based on what is, by today’s standards, a fairly small sample: 55 individuals profiled by array comparative genomic hybridization on an array covering ~12% of the genome (the study in Science reported results from 20 individuals using representational oligonucleotide microarray analysis). However, the impact on the field was anything but small. Part of the legacy of the studies was the establishment of the Database of Genomic Variants (originally the Genome Variation Database) that has now collected over 550,000 CNVs. The discovery that so many structural variants are present in our genomes, even in healthy individuals, opened up an entire field of study to understand the function of these variants, and much is still to be discovered (see for example a recent study on the impact of structural variation on human gene expression).

Prof. Scherer summed up the impact of the studies this way: “If you remember the fights between the public Human Genome Project and Celera Genomics, and them finger-pointing to the errors in each other’s assemblies, in many cases these were due to CNV and other structural variations. They had no idea these CNV variants existed. It was really the 2004 Nature Genetics and Science papers, coincident, pure discovery, that opened the eyes of the community and it took some longer than others to believe it.”

From the archives (1995): Guidelines for interpreting and reporting linkage results

In 1995, Nature Genetics published a report by Eric Lander and Leonid Kruglyak, recommending clear statistical guidelines for reporting linkage results for complex traits. The paper had an immediate impact, setting the bar for what could or could not be called “significant” in the literature. Although originally focused on human genetic linkage studies, the guidelines set forth by Lander & Kruglyak influenced fields from model organism genetics to plant genetics, and eventually genome-wide association studies (GWAS).

The mid-1990s was a very exciting time in genetics. The Human Genome Project had recently been announced and advances like microsatellite linkage maps of the human genome and multiplex sequencing technology were now available. Mapping genes underlying complex phenotypes was now a real possibility, and human geneticists were busy prospecting for genetic gold. However, as Lander & Kruglyak cautioned in their paper, the lack of clear guidelines could foster a spate of false positive reports that would, if left unchecked, discredit the nascent field (for example, see this 1993 paper in Nature Genetics finding no evidence for a previously-reported linkage region for manic depressive illness).

On the other hand, setting too high a bar for reporting significance would mean missing many true signals where they exist, an equally dangerous proposition for a new field. As explained in the paper, “striking the right balance requires both a mathematical understanding of how positive results will occur just by chance and a value judgment about the relative costs of false positives and false negatives.” The paper then outlines the mathematical and statistical arguments in favor of the standards we now all know and love.

Capture

{credit}Lander & Kruglyak, Nature Genetics 1995{/credit}

I spoke with Leonid Kruglyak, co-author of this landmark paper, to get a sense of the context in which this paper came about, and the impact it had on the field at the time of publication. He first explained that it was finally possible to conduct genome-wide linkage studies with hundreds of individuals, allowing linkage mapping methods to be applied to complex traits (for example, this genome-wide screen for schizophrenia susceptibility genes published in the same issue). However, unlike Mendelian genes, there was no clue as to “how many signals there should be, or what their expected sizes were.” Thus, the need for a statistical framework.

This need was recognized as well by the Journal. As Prof Kruglyak recalls, Kevin Davies (founding editor of Nature Genetics) originally commissioned this work as a News & Views article, but it then evolved into a more extensive piece as its implications became clear. However, as he remembers, there was still a very strict deadline for the paper as it had to make the next issue (and these were still the days of hard-copy submissions). At the time, Prof Kruglyak was a young postdoc, so it fell to him to rush to the main FedEx office in downtown Boston before closing time, to make sure the manuscript got to the printer on time.

Prior to submitting the final text, Lander & Kruglyak produced some of the “original preprints”, sending a copy of the paper by snail mail or email to “everyone we knew in statistical genetics”, for comments and suggestions. After all, these guidelines would affect quite a lot of people and “signals that people would like to be results might not be real results anymore”.

Presentation1

{credit}Curtis, Nature Genetics 1996{/credit}

Following publication, “the reactions came in essentially two flavors,” Prof Kruglyak recalls. There were those who thanked the authors, saying that someone really needed to do this. Others were less enthused. “They said, ‘you’re standing in the way of progress and making it harder to publish.’” In fact, Nature Genetics published two letters to the editor arguing that the proposed genome-wide significance threshold was too strict, or that at the very least additional discussion was warranted before these guidelines were adopted (see the letters here and here, and the authors’ reply here). Personally, I agree with the overall sentiment of Lander & Kruglyak as summed up in this portion of their reply: “The correspondents (all trained statisticians) argue that there is no need for guidelines because everyone should be able to interpret the genomewide significance of pointwise P values on their own. In our view, this is naïve. Most geneticists are not statisticians, and rules of thumb can be extremely helpful in promoting sensible discussion.”

The legacy of this paper is clear to anyone familiar with GWAS. “The GWAS community learned a lot from that whole experience [of false positive linkage reports],” says Prof Kruglyak. “There were many serious statistical geneticists involved [in the GWAS field] from the beginning, with a lot of carryover from the linkage era to the GWAS era.”

“Guidelines are not just ‘external gatekeepers’”, he noted. They are not just there to tell you what you can and can’t publish. “You know what they say, the easiest person to fool is yourself.” These guidelines were developed to help researchers understand their own findings better and decide which are worth following up. “You can often make up a plausible story, but how strong is the evidence?”

Woolly mammoth hemoglobin brought to life: From the archives (2010)

Combarelles-mammouth

{credit}Cave painting: Mammouth gravé de la grotte des Combarelles (Dordogne, France){/credit}

As part of the ongoing celebration of the last 25 years of Nature Genetics, the editors are each choosing a few papers from our archives that we want to highlight. My first pick is a paper from Kevin Campbell, Alan Cooper and colleagues on their structure-function analysis of woolly mammoth hemoglobin, published in May 2010.

I’ve picked this one to highlight because, well, who doesn’t love woolly mammoths?

The authors compared the gene sequences of the adult-expressed α- and β-like globin genes from extant elephant species (African and Asian elephants) and from a 43,000-year-old Siberian mammoth specimen first reported in Science. They found that the mammoth β-like genes (designated HBB/HBD by the authors) had 3 amino acid-altering substitutions compared to the extant species.

To test the effects of these protein-coding differences, the authors then “resurrected” the mammoth hemoglobin protein by expressing the mammoth sequence in E. coli and testing its O2 affinity at different ambient temperatures. They found that the O2 affinity of the recreated mammoth hemoglobin is less affected by temperature than that of modern-day elephants. The detailed structure-function analysis reported by Campbell et al. offered us a rare glimpse into the evolutionary process that shaped an extinct organism.

25 years of Nature Genetics

 

This April marks the 25th anniversary of the first issue of Nature Genetics, and I think it’s safe to say that the field of genetics has come quite a long way. In 1992, we were still nearly a decade away from the draft human genome sequence, “omics” was not yet a word in common usage, and CRISPR/Cas9 gene editing wasn’t even a pipe dream.

Most of the content in our current issue would possibly have seemed like far-fetched science fiction to geneticists in 1992. Take for instance the new-and-improved domestic goat genome assembly reported on page 643 of this issue, for which multiple, relatively new technologies were employed to create one of the most complete and contiguous genome assemblies to date. However, as the News & Views by Kim Worley exemplifies, science marches on. While the geneticists of the past might have marveled at the possibility of a whole-genome shotgun assembly (indeed, a major advance reported in that first issue was a new technology allowing for automated sequencing of 106 kb), Worley refers to the scientists of the present who are “frustrated with the highly fragmented genome sequences available for most species.”

Still, many things have remained the same.

Taking a look back at the very first editorial published in the journal, much of the journal’s mission in 1992 is still applicable to 2017. Take this passage:

“Researchers should not be dismayed that developments like this are widely reported in the general press. That is merely a measure of the widespread compassionate interest in inheritable disease. Who can be but flattered by such public testimony to the importance of a field of research?

“The research community’s interest, rather, is that there should also be a wide general understanding that the identification of an aberrant gene does not imply that there is a cure at hand for the condition for which it is responsible. […] The elucidation of the mechanisms by which genes determine the behaviour of the cells that carry them will be a general preoccupation in the years ahead. Nature Genetics intends to play its part in the publication of this important research, and also of course, in classical genetics that throws light on the human genome.”

NG1992

{credit}doi:10.1038/ng0492-1{/credit}

While there is no denying that important medical advances have been enabled by the identification of disease genes, it is still painfully true that simply finding the gene does not directly lead to a cure on its own. Thus, both the identification of new disease-causing genetic alterations and studies that bring new mechanistic understanding of how a given mutation gives rise to disease are still core to the journal’s scope and aims.

The focus of the journal, as can be seen from this first editorial, was very much on human genetics at the beginning. Model organisms were considered just that, models for human biology. One of the major changes in the journal since that time has been our expansion to genetics (and genomics) more broadly, as represented by the many reference genomes and population genetics studies published for other organisms.

Too many landmarks to count

The editorial published in this month’s issue highlights a few selected articles from among our more than 5,000 research publications over the years. These are obviously a restricted set of examples, and they are by no means the “best” papers, as such a ranking system would be ill-advised and ultimately useless. But the papers selected cover a wide range (though not all) of the sub-fields represented by the journal. This list includes landmark papers in human genome mapping (Kong et al. 2002) and cataloging of genetic variation (Iafrate et al. 2004); statistical methods that helped drive an entire field of research (Price et al. 2006); Mendelian disease gene discoveries that shed new light on biological mechanisms (Amir et al. 1999); key advances in the field of epigenetics (Heintzman et al. 2007); and advances in crop plant improvement (Ren et al. 2005).

We invite you to take a trip down memory lane and revisit these and other landmark papers from our archives. As a part of the celebration of 25 years of Nature Genetics, the editors will be blogging throughout April to highlight some of our past content.

A brief history of Nature Genetics

Nature Genetics was launched as the first of the Nature Research journals (if we ignore the very brief existence of Nature New Biology and Nature Physical Science in the early 1970s and the earlier version of Nature Biotechnology, Bio/Technology, published first in 1983).

While the history of genetics as a field is far more interesting than the history of a single journal, the occasion of our 25th anniversary has us thinking about our roots. For our 15th anniversary, founding editor Kevin Davies contributed a guest editorial telling the story of how Nature Genetics came about. I highly recommend that you check it out, if you haven’t seen it before.

Another feature of our 15th birthday celebration was the Question of the year. What would you do if the $1,000 genome were a reality today? To read the nearly 50 replies we received from leaders in the field, see the Question of the Year special here: https://go.nature.com/2mTMKBf.

The next 25 years

Just as researchers in 1992 would have been very unlikely to predict the many breakthroughs that have occurred in genetics over the past 25 years, we have no idea where the next 25 years will take us. The goals will remain the same: to elucidate the mechanisms by which the genetic material produces the many phenotypic variations we see in nature and to identify the causes of (and, more hopefully, cures for) human genetic disease.

That said, let’s take a stab at looking toward the future. What do you think will be the next major breakthrough in genetics? What will the field of genetics look like in another 25 years? Tell us below in the comments.

25 years from now, I hope to still be watching as geneticists make some of the greatest discoveries in biology. And I am confident that Nature Genetics will be there, playing its small role in announcing those discoveries to the world.

 

October issue cover: What’s going on here?

Oct

{credit}Convergent cabbages by Keyong Chang{/credit}

For all of October, we at Nature Genetics have been admiring the lovely cabbages on our cover. The image, created by photographer Keyong Chang, was contributed by the authors of the study on page 1218 of the issue.

But what is the story behind these pretty green cabbages?

Xiaowu Wang, corresponding author of the study, gave us a behind-the-scenes look at the process that led to the picture on our cover.

The image conveys the main idea of the study, namely that Brassica oleracea (cabbage, left) and Brassica rapa (Chinese cabbage, right) have taken similar evolutionary paths to arrive at their similar, but distinct, appearances. During domestication, farmers selected for cabbages of both species to have the large, leafy heads for which they are known. As shown in the study, the farmers were unknowingly selecting for orthologous genes in these two species.

Learning every way to break a gene

From Fig. 1 in Majithia et al. Nature Genetics 2016


Finding the genetic cause of a disease—a mutation or genetic variant—is a lot like looking for a needle in a haystack. Except in the case of exome sequencing, it’s not always clear what a needle even looks like.

When a clinician finds a protein-altering variant in a gene known to cause disease, it could be the cause of the patient’s disease…or it could be nothing. This is the definition of a variant of uncertain significance (VUS). VUS’s often stay unknown unless someone puts in the time and resources to functionally characterize the variant. For obvious reasons, functional characterization of each individual VUS of every gene implicated in human disease is not practical.

One way to determine which variants cause disease and which don’t is to look at the DNA of many healthy individuals. If a healthy person carries a variant, it is unlikely to cause disease. The Exome Aggregation Consortium (ExAC) is the largest such effort—over 60,000 people have contributed their exome sequences to ExAC and thus provided a unique resource for clinicians to de-prioritize specific VUS’s as causes of disease.

But what if your variant is too rare or isn’t in ExAC?

Amit Majithia, David Altshuler and colleagues from the Broad Institute of Harvard and MIT developed a different strategy, published last week in Nature Genetics. The authors chose a gene, PPARG, that can cause Mendelian lipodystrophy when mutated. Some variants in this gene can also increase the risk of developing type 2 diabetes. PPARG also has many VUS’s in the population.

To find out which variants are likely to be pathogenic, the authors constructed a library of all 9,595 possible protein-altering variants of PPARG and tested them in pools using a functional assay in human macrophages. Cell pools that showed a positive result were sequenced and the numbers of each variant were counted, allowing the authors to assign a function score to each variant. These scores were then used, along with known benign and pathogenic variants from the prior literature, to train a machine-learning algorithm that could then classify each variant as pathogenic or not. The classifier found 6 new likely pathogenic variants, which were then validated through additional tests.
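The counting-and-scoring step described above can be illustrated with a small Python sketch. It is not the authors’ pipeline: the log2-enrichment score, the pseudocount, the assumption that lower scores mean loss of function, and the simple midpoint threshold (standing in for the machine-learning classifier) are all assumptions made here for illustration, and every variant name and read count is hypothetical.

```python
import math


def function_scores(input_counts, sorted_counts, pseudocount=1.0):
    """Log2 enrichment of each variant in the assay-positive pool
    relative to the input library (an assumed scoring scheme)."""
    n = len(input_counts)
    in_total = sum(input_counts.values()) + pseudocount * n
    out_total = sum(sorted_counts.get(v, 0) for v in input_counts) + pseudocount * n
    scores = {}
    for variant, n_in in input_counts.items():
        f_in = (n_in + pseudocount) / in_total
        f_out = (sorted_counts.get(variant, 0) + pseudocount) / out_total
        scores[variant] = math.log2(f_out / f_in)
    return scores


def fit_threshold(scores, benign, pathogenic):
    """Midpoint between the mean scores of known benign and known
    pathogenic variants; a stand-in for a trained classifier."""
    mean_benign = sum(scores[v] for v in benign) / len(benign)
    mean_pathogenic = sum(scores[v] for v in pathogenic) / len(pathogenic)
    return (mean_benign + mean_pathogenic) / 2.0


def classify(scores, threshold):
    """Call a variant likely pathogenic if its score falls below the
    threshold (assumes pathogenic variants reduce assay activity)."""
    return {
        v: "likely pathogenic" if s < threshold else "likely benign"
        for v, s in scores.items()
    }


# Hypothetical read counts for four made-up variants.
input_counts = {"known_benign_1": 100, "known_pathogenic_1": 100,
                "vus_1": 100, "vus_2": 100}
sorted_counts = {"known_benign_1": 180, "known_pathogenic_1": 10,
                 "vus_1": 15, "vus_2": 170}

scores = function_scores(input_counts, sorted_counts)
threshold = fit_threshold(scores, ["known_benign_1"], ["known_pathogenic_1"])
calls = classify(scores, threshold)
```

In this toy example the two variants with known effects calibrate the threshold, which is then applied to the two variants of uncertain significance. This mirrors the role that known benign and pathogenic variants play in training the classifier in the study.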

Summary of strategy used by Majithia et al.


The strategy used by Majithia et al. could potentially be used by other researchers to study other proteins implicated in disease. Although it is somewhat of a brute-force approach to test every possible variant, the use of a pooled functional assay and computational classifier increases both the efficiency and accuracy of the result. We asked Dr. Majithia to tell us a little more about this study.

Author Q&A:

Why did you choose to focus your study on PPARG?

In the diabetes community PPARG is a very important and storied gene. It has been linked to both common type 2 diabetes and rare familial forms. PPARG is also the target of multiple FDA approved drugs to treat diabetes. So on one hand, PPARG has been studied in humans and in the lab for two decades. On the other hand, we showed in 2014 (Majithia et al. PNAS) that even though PPARG had been sequenced in humans for so many years, we had only scratched the surface of the possible missense mutations that people in the general population carry. Most of these mutations were benign, some strongly increased diabetes risk, and it took laboratory experiments with each and every mutation to sort them out. So PPARG was the perfect test case for our prospective experimental approach to test all possible missense mutations: it is relevant to common and rare genetic disease, has mutations of known function we could use for validation of our method, and has many unknown mutations, i.e. variants of uncertain significance that need to be functionally characterized.

Do you think that a similar strategy could be employed for other genes with many variants of unknown significance in the population? What would be the major challenges for applying this strategy to other genes?

Absolutely. A major purpose of this study was to demonstrate proof-of-concept that other investigators and clinicians could utilize for VUS in other genes/diseases. In principle there are three challenges in applying a prospective experimental approach to generate a “lookup table” for missense VUS: 1) making every possible missense mutation, 2) building an assay with the scale and throughput to study every missense mutation and 3) connecting the lab experiments to what happens in people (i.e. phenotypes).

  1. The mutation synthesis technique we use, which was pioneered by Tarjei Mikkelsen, applies to any gene. Other groups like Jay Shendure’s in Seattle have independently developed methods to mutate genes at scale and now companies like TWIST Biosciences offer high quality mutation libraries for any gene. This is no longer a barrier to entry.
  2. Genes have myriad functions and so building an appropriately scaled, high throughput readout is a gene by gene process. No single assay can test the function of every gene, but there are partially generalizable strategies that can be used to study certain classes of genes. The strategy we used for PPARG, combining reporter gene expression and FACS, could be deployed for any gene that activates transcription. In fact we are taking this approach with another diabetes relevant transcription factor, HNF1A.
  3. Establishing clinical relevance of saturation mutagenesis data is a critical step. In our study we set a criterion for our assay, that in order for us to be able to discriminate variants of “unknown” significance, we should be able to accurately discriminate variants of “known” significance. For PPARG we benefitted from decades of research that had resulted in a series of missense variants with known function and disease effect. Many genes do not have such “allelic series” but with the increasingly widespread use of exome/genome sequencing our knowledge of allelic series for genes is rapidly growing.

From your perspective, what was the most surprising aspect of the study?

To independently prove the findings from our “lookup table” our collaborators in Cambridge took a series of VUS from patients referred to their clinic and tested them in single variant assays that are the standard for PPARG functional testing. In one case, our “lookup table” had a major discrepancy from the single variant assay. Careful follow-up work by our Cambridge colleagues resolved that the single variant assay that has been used for decades was actually incorrect and the “lookup table” made the correct call. We were surprised that, in this case, our new, high throughput assay had higher fidelity with human biology and it introduces a degree of nuance into what we consider “gold standard” when discrepancies inevitably arise.

How do you see your findings impacting on future research?

We hope this study will have its largest impact as a proof-of-concept for other genetic diseases and VUS interpretation. Our method is capable of improvement. For example, it does not assess missense effects on gene splicing, but others in the community are developing complementary methods that will overcome this (for example, see this paper from Gregory Findlay et al. published in Nature). We are particularly excited that our lookup table approach opens the door to systematically characterizing drug-by-genotype interactions that could be useful to guide treatment.

 

Spreadsheets have misprints – it is known

by Myles Axton

Normally we do not re-examine supplementary information in this detail, but there is a common minor problem that systematically affects a small number of gene IDs within long lists of gene names copied into spreadsheets in the supplementary tables of many articles. We suggest checking for this problem before submitting tables to journals. It is easy to see the altered gene names by sorting the column in a separate version of the file and then searching for the misspelled name to correct it in the replacement version intended for publication.
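As one way to run this check before submission, here is a minimal Python sketch that flags suspicious entries in a column of gene names. It assumes the alteration in question is the familiar spreadsheet auto-conversion of symbols such as SEPT2 into dates like “2-Sep”, or of long numeric-looking identifiers into scientific notation. The function name and the two patterns are illustrative choices, not a complete list of possible corruptions.

```python
import re

# Month abbreviations as they typically appear in auto-converted dates.
_MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"

# Matches values such as "2-Sep" or "Sep-02" (assumed date-style conversions).
DATE_LIKE = re.compile(
    r"^(\d{1,2}-(?:%s)|(?:%s)-\d{1,2})$" % (_MONTHS, _MONTHS),
    re.IGNORECASE,
)

# Matches values such as "2.31E+13" (assumed scientific-notation conversions).
SCI_LIKE = re.compile(r"^\d+(\.\d+)?E\+?\d+$", re.IGNORECASE)


def find_suspect_names(names):
    """Return (index, value) pairs for entries that look like converted
    dates or numbers rather than gene symbols."""
    suspects = []
    for i, name in enumerate(names):
        text = str(name).strip()
        if DATE_LIKE.match(text) or SCI_LIKE.match(text):
            suspects.append((i, text))
    return suspects


# Hypothetical column copied from a supplementary table.
column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13", "MARCH1", "SEPT2"]
suspects = find_suspect_names(column)
```

Each flagged index points back to the row in the original table, so the intended gene name can be restored by hand in the version destined for publication.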

Example of the Excel formatting issue


The authors of this paper claim that gene names in a large proportion of papers reporting gene expression data have this problem. Here we list the supplementary tables they identified in the journal prior to 2015 and from the first nine months of 2016 that we found to contain one or more misprints. We think that these mistakes do not prevent reuse of the datasets provided and, as stated in the accompanying editorial “Legible ledgers”, we do not propose to publish formal Corrigenda for the supplementary tables of these articles.

 

August issue cover: What’s going on here?

Rhinopithecus bieti

{credit}Yong-cheng Long{/credit}

This month’s cover image is inspired by the paper on page 947 reporting the reference genome sequence of the black snub-nosed monkey, the second snub-nosed monkey genome paper published in Nature Genetics. The golden snub-nosed monkey genome was published in 2014.

In their paper, Li Yu and colleagues present the de novo genome sequence assembly of Rhinopithecus bieti as well as whole genome resequencing of all four other snub-nosed monkey species. All five species are among the world’s most endangered primate species. Three species, R. bieti, R. roxellana and R. strykeri, live at very high altitudes—above 3,000 meters. R. bieti lives exclusively on the Yunnan and Tibetan plateaus. The other two species, R. brelichi and R. avunculus, inhabit lowland regions. The authors compared the genome sequences between these species to identify genomic regions showing evidence of positive selection that could be related to living at high altitudes.

The photograph on the cover image was taken by one of the study’s co-authors, Dr. Yong-Cheng Long, who was profiled by the Nature Conservancy for his work on conservation of R. bieti (also called the Yunnan golden monkey by the locals). We asked Dr. Long to tell us a little about the monkey shown in the picture.

“The monkey is [a] male, whose name is ‘Big Guy’, and he is feeding on some leaves,” he said by email. “The Big Guy used to have 4 wives (about 6 years ago) and now has only 2, as he is getting old and is not strong enough to hold all of them because the females are more likely to find a strong shoulder to cry on.”

Dr. Long said there are 57 R. bieti individuals in the habituated “Yunnan snuby” group, which is open to the public. Because many of the individuals in the area are fully habituated to human presence, it is not difficult to get photographs of them. The group is only a small portion of the largest natural monkey troop (approximately 1,000 in total) in the world. Dr. Long emphasized the impact that illegal poaching has had on the monkeys. “This species has been endangered by human’s killing, and the monkeys can certainly survive once the killing is stopped.” In China, 2016 is the Year of the Monkey, and it has turned out to also be a lucky year for these particular monkeys. “We found the monkey group has boomed,” said Dr. Long. “12 of the 57 are the infants born this year.”

Nature Genetics office mascot

The lead author of the study, Dr. Yu, became interested in studying these species because of his focus on conservation genetics of endangered mammals distributed in Yunnan Province, China. This is one of the core regions of biodiversity in the world. “The most notable among the endangered mammals distributed in Yunnan Province is R. bieti, which is found exclusively on Yunnan and Tibetan Plateau,” said Dr. Yu by email. “It is unique in that it is the only primate having a red mouth like most humans, which [is why it’s called] one of the most beautiful animals.” Dr. Yu also noted that it is the highest altitude-dwelling nonhuman primate. It can survive in very cold and hypoxic environments that other primates cannot tolerate. “So, I was deeply attracted by this mysterious and interesting species, and was eager to come to understand it.”

 

We at Nature Genetics are also celebrating the Chinese Year of the Monkey. Our office mascot is this golden snub-nosed monkey (right), which was produced for marketing purposes in China (I snagged one during a recent visit to the Shanghai office). Scanning a barcode on the monkey’s rear end (left) will take you to the publication of the R. roxellana (golden snub-nosed monkey) genome paper.

 

 

July issue cover: What’s going on here?

This month’s cover features the inspiring block-like karst mountains of the Li River between Guilin and Yangshuo in Guangxi province. The image was inspired by a study in this month’s issue reporting deep sequencing of the MHC region in individuals of Han Chinese ancestry. The study represents an important resource for the study of immune-related disorders in Asian populations. It also identifies loci associated with risk of psoriasis, thus demonstrating the power of this resource.

Besides being a beautiful view of the mountains in Guangxi province, the image brings to mind the peaks seen in many types of genomic data, such as Sanger sequencing traces and ChIP-seq peaks.

Our own chief editor, Myles Axton, did first-hand research leading to the selection of this cover image. As he found, the Yulong River in Yangshuo is less muddy than the Li River and better for swimming and sightseeing from bamboo rafts (arrow indicates NG editor in the field).

Yulong River{credit}Myles Axton{/credit}

 

Myles holding a 20 yuan note with drawing of karst mountains.{credit}Myles Axton{/credit}

June issue cover: What’s going on here?

Carrot canang sari by Rachel Meyer

As June comes to a close, it’s time to look back at our June issue and ask “what’s going on here?” with the cover image. As you may have guessed, the image is related to the publication of the carrot genome sequence in this month’s issue.

The cover image was provided by Rachel Meyer, a scientist who was not a co-author of the genome paper. Dr. Meyer was previously a postdoctoral researcher with Michael Purugganan at NYU and is an AAAS Science and Technology Policy fellow. She is also a co-founder of Shoots & Roots in New York.

Dr. Meyer gave us the following information about the carrot canang sari on the June cover:

Celebrating the recent availability of rainbow carrots year-round in Washington DC, I cut them in various ways and laid them out in a public dirt plot between the sidewalk and the street that was still bare because Spring had barely started and planting was far from beginning. The cold kept the carrots nicely preserved for three days. The installation took about 6 hours, and the design itself was lifted from a Persian carpet, sharing an origin with some of the earliest domesticated carrots. I had no intention to leave the installation there but people in the busy U-street/Shaw district, coming home late at night from the bars, would stop and photograph it, and even some of the suits interrupted their morning power walks to work to investigate it. After a few days, to my surprise it was not rats, but a middle-aged man who had decimated the carrots for a meal.

Shelby Ellison, an author of the carrot genome article this cover references, did this research as part of her NSF Plant Genome Postdoctoral Fellowship. We were in the same class of Fellows together and became friends because we would look for cool restaurants around DC together during our brief visits to NSF for annual Plant Genome meetings. I’m grateful to be able to explore the subject of her science through installation.

For more about the carrot genome paper, see our previous blog post, featuring Q&A with the corresponding author.