Cuddly Koala Genomics

Rebecca Johnson

The genome assembly of the koala is reported in a paper published online in Nature Genetics. This high-quality genome represents the most complete genome sequence for a marsupial to date. The data give insight into the highly specialized koala diet, consisting of eucalyptus leaves, and provide information that may be useful for combating infectious disease.

Koalas are a vulnerable species, and part of the aim of the project was to use the genomic data to inform conservation efforts. We spoke with lead author Rebecca Johnson to get some background on this work:


How did the koala genome project come to be?

The genome project started as a small group of Australian researchers (from the Australian Museum, University of the Sunshine Coast and University of Sydney) who were enthusiastic about koala conservation and using genomics to manage populations and diseases. We partnered up with colleagues at the Ramaciotti Centre at the University of New South Wales (UNSW) who were enthusiastic to try out their new sequencing equipment on a ‘de novo mammal sized genome’. This hadn’t been done before in Australia.

We decided to take a bit of a risk and announce to the world in 2013 that we were establishing the Koala Genome Consortium and sequencing the genome. This was a very effective way of getting our project on the scientific horizon but then the pressure was on us to deliver! Fortunately for me (and the koala) one of my biggest career risks (announcing the genome well ahead of time) has resulted in a brilliant collaboration of scientists producing a high quality genome with many exciting outcomes and applications.

 

What do you think were the most interesting or surprising findings that came out of the genome data?

So many interesting things have come out of this work, so it is difficult for me to pinpoint one in particular. However, as a conservation geneticist I’m particularly fond of the conservation genomics work, particularly the historical population reconstruction which infers what koala populations would have looked like through evolutionary time. It was a little surprising to discover that koalas underwent such a dramatic decrease in population size 30-40kya, which was around the time many of the megafauna were experiencing extinction in Australia. Another surprise was that the three koalas used for this analysis are from two quite geographically separate locations (~600 km apart) but both suggest a dramatic reduction in population size indicative of widespread pressures across the continent.

Having this ‘deep-time’ perspective on koala populations, combined with the contemporary population work we did as part of this study, we have a long-term understanding of koalas in the landscape (i.e. the importance of long-term regional gene flow). Conservation management efforts can now be based on this holistic knowledge rather than a single genetic snapshot taken in time.

 

What are the biggest threats to the koalas now?

The koala is now classified as ‘vulnerable’ due to habitat loss and widespread disease. Threats to koalas are multifaceted, with the biggest primarily due to loss and fragmentation of habitat, urbanization, climate change and disease. Current estimates put the number of koalas in Australia at only 329,000 animals (range 144,000-605,000), and a continuing decline is predicted unless measures are put in place to arrest this decline.

 

How do you envision that this genomic information can aid conservation efforts?

The benefit of the genome to conservation efforts is widespread. The population diversity information presented in our work provides the impetus for a conservation management strategy to maintain gene flow regionally while incorporating the genetic legacy of biogeographic barriers. We have also identified the huge contrast in genome-wide levels of diversity across the northern and southern populations of koalas which will be factored into future decision making. The importance of genetic diversity indices for koala conservation has been included in the recently released NSW koala strategy so we will be focusing on highlighting the genetically healthy koala populations and ensuring they maintain regional gene flow. If more intensive measures such as translocations are required (for example from the genetically diverse populations to the genetically depauperate populations), we now have the tools and data to inform those decisions.

The immune gene repertoire we report as part of the genome is also being used directly in efforts to understand the response of koalas to diseases such as chlamydia and the koala retrovirus (KoRV). Several of our collaborators on this work are involved in very important work developing and trialing vaccines for both chlamydia and KoRV. The genome affords the ability to understand which immune genes are up- or down-regulated in response to disease or treatment and provides the platform for future therapies to be tailored to the genome level.

 

What is it like working with koalas? Do you have any good stories that you would like to share?

It never gets tiring working with koalas and it was not difficult at all to bring collaborators on board to work on this project!

Koalas are notoriously chilled out animals (spending most of their time sleeping or eating), although my friends and colleagues who wrangle them in the field do report how unpleasant it is to be on the receiving end of their extremely sharp claws and nippy diprotodon teeth!

As part of sequencing the genome, our efforts to extract suitable quality DNA from koala blood were unsuccessful (possibly because they have a high lipid content in their blood). The only way we could get suitable quality DNA was to wait for an animal to be euthanized so we could access tissues suitable for genome and transcriptome work. Our two females were euthanized because they had advanced untreatable chlamydia. It is an extremely sobering experience to be involved in these necropsies because you can see the ravages of the disease on the body. While these moments are very tough, they also inspire you to work harder to ensure we are producing the best possible science to conserve this amazing species.

 

For more video information, please see:

 

https://www.youtube.com/watch?v=tcMCni28nNo&t=4s

Sea lamprey genomics


Jeramiah Smith

The sea lamprey (Petromyzon marinus) is an important model in evolutionary biology. It was discovered in 2009 (https://www.pnas.org/content/106/27/11212.long) that the genome of the sea lamprey undergoes extensive programmed genome rearrangement during development, where ~0.5 Gb (around 20%) of DNA is eliminated from the genome. The somatic tissues contain smaller genomes and only the germ cells retain the full complement of genetic material. The genome of the sea lamprey had been sequenced previously from the blood and liver, so only the somatic genome has been thoroughly characterized (https://www.nature.com/articles/ng.2568).

Smith et al., Nature Genetics, 2018


In a paper published this week in Nature Genetics, Jeramiah Smith and colleagues report the germline genome sequence of the sea lamprey. Using a combination of shotgun and long-read sequencing integrated with scaffolding data and a meiotic map, the authors assembled a high-quality genome with near-chromosome-level contiguity. This allowed them to identify hundreds of genes that were systematically eliminated from the genome during development. Comparative analysis showed that mouse homologues of these genes are often marked by repressive complexes, indicating parallel strategies for programmed development.

We spoke with lead author Jeramiah Smith from the University of Kentucky to get some background on this research:

  • What inspired you to sequence the germline of the sea lamprey?

I have worked with lamprey for years. I originally got involved with lamprey because it holds a special place in the vertebrate tree of life that sheds light on the common ancestor of all vertebrates. That was the motivation for the first lamprey genome project, which sequenced DNA from blood and liver cells. Once we started working with lamprey we found out that the genome was much more complex than we ever anticipated. This included the fact that the genome changes its sequence content in a reproducible manner over the course of its normal development: something we call programmed genome rearrangement. The amount of DNA that is eliminated from sea lamprey is more than is present in some entire fish genomes, roughly half a billion bases. For me, this finding was the major inspiration behind sequencing the germline genome.

 

  • What do you think were the most surprising or interesting findings to come out of the sequencing?

There were quite a few, but the strong overlap between programmed genome rearrangement and Polycomb-mediated silencing was near the top. The other was the rather strong evidence suggesting that some chromosomes, including chromosomes carrying the HOX genes, appear to have duplicated rather recently and seemingly independently from the rest of the genome. It’s a really strange genome.

 

  • Can you comment on programmed genetic elimination as a developmental strategy versus Polycomb-mediated silencing? 

Polycomb-mediated silencing arose deep in our evolutionary history, and is even present in unicellular organisms. We know that lamprey possesses homologs of all human Polycomb genes, but also uses programmed elimination. The difference between programmed elimination and other mechanisms of gene silencing is that programmed elimination is essentially irreversible, given that the DNA is physically removed. This means that the genes can never be expressed after an embryonic cell lineage has undergone elimination. Other silencing mechanisms are generally reversible, meaning that gene expression can be reactivated. In some cases reactivation is important, for example in the context of development and regeneration. But in other cases activation of genes in the wrong tissue can cause diseases, such as cancer. Lamprey seems to know which genes should never be reactivated outside of the germline.

 

  • What is the most challenging part about working with sea lamprey?

The genome! Aside from undergoing complex changes during development, it also contains a large amount of repetitive DNA and a lot of sequence polymorphism. These features present substantial challenges for assembly and downstream analyses, but we’ve found that they can also be useful tools. We’ve used the abundance of sequence polymorphisms as a tool for mapping genes in lamprey and we now think that some classes of repeats are going to be critical for our future work aimed at figuring out how eliminated DNA is identified and packaged in the early embryos. Lampreys also only breed once a year and take from 5 to maybe 20 years to mature. This makes some experiments impossible, but lamprey researchers are very creative and the community has figured out how to get a lot done in this system.

  • What organisms would you like to see sequenced in the future to help resolve the evolutionary relationships of vertebrates?

There are so many! Hagfish are going to be critical. They are another deep lineage that provides important perspective on vertebrate evolution and also happen to undergo programmed DNA elimination. There are two other deep lamprey lineages that I think will be important as well. Those species live in the southern hemisphere and diverged from sea lamprey around 300 million years ago, as opposed to the roughly 600 million year divergence between lampreys and other vertebrates. A lot of evolution can happen over 600 million years and these species should help bridge that gap. Salamanders and other amphibians are also going to fill important gaps and teach us a lot about the way vertebrate genomes evolve and function. It also seems certain that new sequencing technologies are going to give us better genomes for other important species that have already been sequenced (e.g. amphioxus, sharks and shark relatives, and even sea lamprey). Finally, I think the zebra finch germline genome will be really interesting. They seem to have recently evolved something similar to lamprey’s programmed eliminations, and have a chromosome that’s unique to their germline. I’d really like to know what’s on that chromosome.

25 years of Nature Genetics

 

This April marks the 25th anniversary of the first issue of Nature Genetics, and I think it’s safe to say that the field of genetics has come quite a long way. In 1992, we were still nearly a decade away from the draft human genome sequence, “omics” was not yet a word in common usage, and CRISPR/Cas9 gene editing wasn’t even a pipe dream.

Most of the content in our current issue would possibly have seemed like far-fetched science fiction to geneticists in 1992. Take for instance the new-and-improved domestic goat genome assembly reported on page 643 of this issue, for which multiple, relatively new technologies were employed to create one of the most complete and contiguous genome assemblies to date. However, as the News & Views by Kim Worley exemplifies, science marches on. While the geneticists of the past might have marveled at the possibility of a whole-genome shotgun assembly (indeed, a major advance reported in that first issue was a new technology allowing for automated sequencing of 106 kb), Worley refers to the scientists of the present who are “frustrated with the highly fragmented genome sequences available for most species.”

Still, many things have remained the same.

Taking a look back at the very first editorial published in the journal, much of the journal’s mission in 1992 is still applicable to 2017. Take this passage:

“Researchers should not be dismayed that developments like this are widely reported in the general press. That is merely a measure of the widespread compassionate interest in inheritable disease. Who can be but flattered by such public testimony to the importance of a field of research?

“The research community’s interest, rather, is that there should also be a wide general understanding that the identification of an aberrant gene does not imply that there is a cure at hand for the condition for which it is responsible. […] The elucidation of the mechanisms by which genes determine the behaviour of the cells that carry them will be a general preoccupation in the years ahead. Nature Genetics intends to play its part in the publication of this important research, and also of course, in classical genetics that throws light on the human genome.”

Credit: doi:10.1038/ng0492-1

While there is no denying that important medical advances have been enabled by the identification of disease genes, it is still painfully true that simply finding the gene does not directly lead to a cure on its own. Thus, both the identification of new disease-causing genetic alterations and studies that bring new mechanistic understanding of how a given mutation gives rise to disease are still core to the journal’s scope and aims.

The focus of the journal, as can be seen from this first editorial, was very much on human genetics at the beginning. Model organisms were considered just that, models for human biology. One of the major changes in the journal since that time has been our expansion to genetics (and genomics) more broadly, as represented by the many reference genomes and population genetics studies published for other organisms.

Too many landmarks to count

The editorial published in this month’s issue highlights a few selected articles from among our more than 5,000 research publications over the years. These are obviously a restricted set of examples, and they are by no means the “best” papers, as such a ranking system would be ill-advised and ultimately useless. But the papers selected cover a wide range (though not all) of the sub-fields represented by the journal. This list includes landmark papers in human genome mapping (Kong et al. 2002) and cataloging of genetic variation (Iafrate et al. 2004); statistical methods that helped drive an entire field of research (Price et al. 2006); Mendelian disease gene discoveries that shed new light on biological mechanisms (Amir et al. 1999); key advances in the field of epigenetics (Heintzman et al. 2007); and advances in crop plant improvement (Ren et al. 2005).

We invite you to take a trip down memory lane and revisit these and other landmark papers from our archives. As a part of the celebration of 25 years of Nature Genetics, the editors will be blogging throughout April to highlight some of our past content.

A brief history of Nature Genetics

Nature Genetics was launched as the first of the Nature Research journals (if we ignore the very brief existence of Nature New Biology and Nature Physical Science in the early 1970s and the earlier version of Nature Biotechnology, Bio/Technology, published first in 1983).

While the history of genetics as a field is far more interesting than the history of a single journal, the occasion of our 25th anniversary has us thinking about our roots. For our 15th anniversary, founding editor Kevin Davies contributed a guest editorial telling the story of how Nature Genetics came about. I highly recommend that you check it out, if you haven’t seen it before.

Another feature of our 15th birthday celebration was the Question of the Year: What would you do if the $1,000 genome were a reality today? To read the nearly 50 replies we received from leaders in the field, see the Question of the Year special here: https://go.nature.com/2mTMKBf.

The next 25 years

Just as researchers in 1992 would have been very unlikely to predict the many breakthroughs that have occurred in genetics over the past 25 years, we have no idea where the next 25 years will take us. The goals will remain the same: to elucidate the mechanisms by which the genetic material produces the many phenotypic variations we see in nature and to identify the causes of (and, more hopefully, cures for) human genetic disease.

That said, let’s take a stab at looking toward the future. What do you think will be the next major breakthrough in genetics? What will the field of genetics look like in another 25 years? Tell us below in the comments.

25 years from now, I hope to still be watching as geneticists make some of the greatest discoveries in biology. And I am confident that Nature Genetics will be there, playing its small role in announcing those discoveries to the world.

 

Farm to Genomes: African Rice

Meyer et al., Nature Genetics, 2016


Rice is one of the most important crops on the planet, responsible for feeding billions of people. Given this global significance, studying rice in different geographies can be useful and aid in harnessing genetic diversity underlying particular traits and adaptations favorable to different environments. African rice (Oryza glaberrima Steud.) is mainly grown in sub-Saharan Africa and known for its stress tolerance. In a new article this week in Nature Genetics, Michael Purugganan and colleagues report the whole genome re-sequencing of 93 African rice landraces from various regions of Western coastal and sub-Saharan Africa. They create a genome-wide SNP map and through comparative genomic analysis study the domestication and population history of African rice. They use their map to perform GWAS for salt tolerance and find 11 significantly associated regions, highlighting the value of this unique genetic resource.

Meyer et al., Nature Genetics, 2016


By studying various regions with distinct environments, the authors were able to get clues about adaptation and geographic spread of the populations. They focused on coastal Senegal and inland Togo, which have higher and lower levels of soil salinity, respectively, and interviewed farmers in the region to understand the agricultural practices they employ in each region. The knowledge of the farmers helped to inform the genetic analysis and contributed to the model of African rice domestication and dispersal.

You can watch some of the interviews with the farmers here:

African rice farmers- interviews

Additionally, we spoke with authors Michael Purugganan and Rachel Meyer to get some background on this research.

Why do you think that rice is understudied in Africa compared to other places?

MP: I think it’s because it is not widely grown, unlike its Asian counterpart which has pretty much taken over the world.  But there definitely is more interest in African rice as breeders are trying to figure out how to increase food production in Africa, as well as to try to see what genes in African rice can be used to improve Asian rice.

RM: There is a lot of great research on improving Asian rice for African farmers that is being done by brilliant AfricaRice scientists, and they are working hard on the social science side too. But there are so many challenges that Africa disproportionately faces – particularly climate variation – that demand ramping up rice research. There is insufficient support for programs that integrate crop experiments and trials into the different farmlands. A better connection between scientists and small-scale farmers would really help farmers adopt new varieties too, because there is sometimes resistance to trying new ones.

How did you choose which samples to include in your analysis?

RM: Recognizing that a lot of NGO work encouraging farmers to grow Asian rice ramped up in the ’80s and ’90s, we took advantage of the germplasm largely donated in the ’70s to the West Africa Rice Development Association, which was duplicated and available through IRRI (International Rice Research Institute). We chose accessions with the most metadata available, preferring ones with georeferenced location and a cultivar name. It wasn’t until later that we realized water tables far inland were high in salinity, so we just tried to make sure we had a fair number of samples within 250 km of the coast, or along rivers connecting to the ocean.

Were you surprised by any of your findings?

MP: There definitely were a few surprises in the data, but the big revelation for me was the long time for the population bottleneck that led to domestication.  We found from the genomic data that it may have taken more than 10,000 years of steady population decline before full-blown domesticated African rice shows up in the archaeological record.  This suggests the possibility that humans were already cultivating or managing its ancestor for thousands of years, and I think if this pattern holds for other domesticated crop species it will change our thinking on how domestication has taken place.

RM: I was surprised we got nice GWAS results with so few samples, and even more surprised that we saw several of those exhibiting signatures of geographic selection. We were lucky to find a broad distribution of traits in the landraces we chose to sequence, for we had made the DNA libraries ahead of the phenotyping experiments.

What was it like to meet and talk with the farmers?

RM: It was one of the highlights of my life to meet the farmers! I’m grateful to have gotten a glimpse of their heritage, their pride, and their struggles. We were all so impressed with the generosity of women, in particular, to help each other. We were also shocked by how many farms are run by the elderly; their children don’t see farming as profitable and many have left. For the three of us in the field, it made us think hard about how we can give back to the communities that gave us their time. I hope that crop science, publicity (like this blog) and policy changes can raise the profile of the small-scale farmer.

In each interview, the farmers also had a chance to interview us, and that part was especially interesting. Several asked really good questions about African and Asian rice domestication. You could see the cultural value of the basic science.

You chose to focus on salinity tolerance as a trait particularly relevant to farming in Africa.  In what ways do you see your results being used for crop improvement?

RM: One of the authors, from AfricaRice, Dr. Kofi Bimpong, had actually been working on salt tolerance separately as well, and has two graduate students collecting African rice landraces in Casamance. If from this paper we can consider that domestication possibly occurred in the Inner Niger Delta region and also in the West, then these collecting efforts are all the more important because they are from a center of origin, promising more genetic variation than people would have ever estimated. If you look through the available germplasm there is so little that has been collected or studied from Casamance. It’s tricky collecting there, for there is social unrest, and landmines. Hats off to the young graduate students, Mamadou Sock and Bathe Diop, doing that fieldwork; I’m sure there is a lot of discovery to be made with those collections, and more promising salt tolerant landraces to integrate into breeding programs.

In addition, our results suggesting many of the salt tolerance genes are shared in both rice species make them more valuable to explore in other crops.  Shared adaptive mechanisms are especially fascinating to evolutionary biologists and are powerful assets of the breeder’s toolbox.

June issue cover: What’s going on here?

Carrot canang sari by Rachel Meyer


As June comes to a close, it’s time to look back at our June issue and ask “what’s going on here?” with the cover image. As you may have guessed, the image is related to the publication of the carrot genome sequence in this month’s issue.

The cover image was provided by Rachel Meyer, a scientist who was not a co-author of the genome paper. Dr. Meyer was previously a postdoctoral researcher with Michael Purugganan at NYU and is an AAAS Science and Technology Policy fellow. She is also a co-founder of Shoots & Roots in New York.

Dr. Meyer gave us the following information about the carrot canang sari on the June cover:

Celebrating the recent availability of rainbow carrots year-round in Washington DC, I cut them in various ways and laid them out in a public dirt plot between the sidewalk and the street that was still bare because Spring had barely started and planting was far from beginning. The cold kept the carrots nicely preserved for three days. The installation took about 6 hours, and the design itself was lifted from a Persian carpet, sharing an origin with some of the earliest domesticated carrots. I had no intention to leave the installation there but people in the busy U-street/Shaw district, coming home late at night from the bars, would stop and photograph it, and even some of the suits interrupted their morning power walks to work to investigate it. After a few days, to my surprise it was not rats, but a middle-aged man who had decimated the carrots for a meal.

Shelby Ellison, an author of the carrot genome article this cover references, did this research as part of her NSF Plant Genome Postdoctoral Fellowship. We were in the same class of Fellows together and became friends because we would look for cool restaurants around DC together during our brief visits to NSF for annual Plant Genome meetings. I’m grateful to be able to explore the subject of her science through installation.

For more about the carrot genome paper, see our previous blog post, featuring Q&A with the corresponding author.

Cancer clones- mixing and spreading


McPherson et al., Nature Genetics 2016

The trajectory of tumor cells during metastasis can be influenced by many factors, including the physical environment and the genetic makeup of metastatic clones. In high-grade serous ovarian cancer, there are limited barriers in the intraperitoneal space, allowing for extensive spreading and mixing of tumor cells. A recent article published in Nature Genetics explores these different patterns of clonal evolution in metastatic ovarian cancer using a combination of bulk and single cell sequencing.

The authors characterized the mutation landscapes of different metastatic tumors and found both monophyletic and polyphyletic clones. While in most patients there was unidirectional seeding from the original ovarian tumor, two patients exhibited polyclonal spread and reseeding. Therefore, high-grade serous ovarian cancer cells can migrate through and establish metastasis within the intraperitoneal space via different evolutionary routes.

McPherson et al., Nature Genetics 2016


We spoke to lead author, Sohrab Shah, to get some background on this research.

What features of this particular cancer made you want to study its metastasis? Were you surprised by your findings?

High-grade serous ovarian cancers are often widespread through the peritoneal cavity at diagnosis. We wanted to ask what are the characteristics of cells that spread and what is the distribution of these cells throughout the abdominal lesions. The focus was to study the disease state prior to any treatment to characterize the diversity and take an inventory of the ‘substrate’ of clones upon which treatment selective pressures may be acting. Many patients experience relapse after initial response to treatment. Mapping which clones lead to relapse remains a key question in the field. This was borne out in one patient in our study where specific clones that led to relapses were already present at diagnosis but only represented a minority of branches in the clonal phylogeny.

It is important to note that the mode of spread in this disease differs from most solid cancers, where spread is achieved through the bloodstream or lymphatics.  Ovarian cancer represents a unique opportunity to study disease spread through a relatively physically unencumbered anatomic space.  One might expect that in such an environment the potential for clonal intermixing is high.   This might lead to many clones co-existing at many sites.  But the majority of intraperitoneal samples were clonally pure, suggesting unidirectional spreading from ovary sites with diverse clonal repertoires, and a lack of clonal intermixing.

You provide evidence that the microenvironment influences the metastatic success of tumors. What does this say about in vitro cancer models that don’t account for tissue context?

One of the intriguing findings suggested that specific clones were present in specific sites.  This may indicate that particular microenvironments are differently suited to particular clones. Another surprising finding was that every patient harbored at least one lesion that was very diverse in its clonal make-up (typically within primary ovary sites).  This leads to the natural question of whether properties of specific microenvironments in some way promote or ‘tolerate’ clonal diversity.  If this were the case, then both in vitro and in vivo model systems such as cell lines, organoids and mouse xenografts may not adequately represent the natural disease state we find in patients prior to treatment.

How did you choose your sampling strategy?

The study results are naturally biased by the sampling strategy. The study design was subject to what material could be obtained during the provision of care. In our setup, we consented patients for collection and study of all material removed at primary debulking surgery. Wherever possible tissue was cryopreserved, but inevitably many deposits were preserved in formalin. Our strategy led to acquisition of a median of 10 samples per patient. The nature of the samples and their locations are presented in Figure 4 and are also available in interactive web form at:

https://compbio.bccrc.ca/research/tumour-evolution/

Users can click on the links for each patient and explore the clonal maps.

You utilize both bulk and single cell sequencing as complementary approaches to elucidating tumor evolution. Can you comment on the trade-offs between cost and throughput and how you chose your sample sizes?

The field is entering an interesting time. There are several limitations to both bulk and single cell sequencing strategies to define the clonal constituents of a tumor sample. Most single cell techniques suffer from vast under-sampling of the clonal repertoire since they are limited in throughput and may only practically yield data from 100s of cells. Furthermore, single cell techniques are prone to two key experimental sources of noise: missing data and allele-dropout. We used targeted, multiplexed single cell sequencing to validate inferences made from the bulk sampling, including the co-occurrence of point mutations and structural variations in the same cells. Hypotheses were generated from multi-site bulk analysis and were then tested using orthogonal single cell approaches. Accordingly, the sample sizes in single cell were chosen to identify clones that were detected in bulk samples, in the range of 5% prevalence. Notably, the noise properties of targeted multiplexed single cell data required some careful statistical treatment, the results of which were published as a standalone contribution in Nature Methods simultaneously with this publication. As the field moves forward, it may become practical to sequence the whole genomes of 1000s of cells per sample. I look forward to the day when a single experimental design would be sufficient to dissect the important clones present in a cancer. This would enable studying evolutionary properties at scale, leveraging richly defined principles and statistical models from the field of population genetics.
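To give a feel for why a few hundred cells can be enough for a clone at around 5% prevalence, here is a minimal back-of-the-envelope sketch. It is not the authors' calculation: it assumes cells are sampled independently and ignores the missing data and allele-dropout mentioned above, so real experiments would need more cells. The function name `cells_needed` and the 95% detection target are illustrative assumptions.

```python
import math


def cells_needed(prevalence, detection_prob=0.95):
    """Smallest number of cells n such that at least one cell from a clone
    at the given prevalence is sampled with probability >= detection_prob.

    Assumes independent sampling and perfect genotyping (no dropout), so
    1 - (1 - prevalence) ** n >= detection_prob.
    """
    if not 0 < prevalence < 1:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0 < detection_prob < 1:
        raise ValueError("detection_prob must be strictly between 0 and 1")
    return math.ceil(math.log(1 - detection_prob) / math.log(1 - prevalence))


if __name__ == "__main__":
    # A clone at 5% prevalence, 95% chance of capturing at least one cell.
    print(cells_needed(0.05, 0.95))
```

Under these idealized assumptions, sampling on the order of tens of cells already captures a 5% clone most of the time; the extra cells in a practical design buy robustness against technical noise.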

You find that there are differences in the potential for migration and metastasis across the tumors from your patients. What clinical implications might this have?

Our study is underpowered to provide a clear answer on this. Our results hint anecdotally that cases with strong patterns of unidirectional spread fared poorly in their treatment trajectories. Whether cancers harboring clones with strong potential to invade new micro-environments and dominate their local landscapes also have greater potential to evade chemotherapy remains an important question to consider. As we take this study forward in model systems derived from spatially distinct sites, reproducible treatment selection experiments can be carried out to robustly address this question.

 

May issue cover: What’s going on here?

This month’s cover image is inspired by the Article on p. 528 of this issue, by Jeff Wall, Nicola Illing, Nadav Ahituv and colleagues. The paper reports the genome of the bat Miniopterus natalensis and transcriptional dynamics in the developing bat wing. This species, one of a group known as vesper bats, is also known as the Natal long-fingered bat and is found in parts of Africa.

The image chosen for the cover is a frontal view of a bat embryo at a late stage of development (stage CS21) taken by study co-author Mandy Mason. This developmental stage is known as “Translucent Wing”, as you can clearly see the skeletal structures in the wing and the membrane between the outstretched digits. The embryo in this image was stained with Alizarin red (maroon-red-pink) for bone and Alcian blue (blue-cyan) for cartilage. The image was actually taken as part of an earlier study to understand the progression of limb development in this species and to compare it with that of the mouse.

The current study presents not only the genome sequence of the Natal long-fingered bat, but also RNA-seq and ChIP-seq (for H3K27ac and H3K27me3) profiling of the developing limbs. The authors identified more than 7,000 genes that were differentially expressed between the forelimbs—the eventual wings—and the hindlimbs. Through comparative genomics analyses, they found nearly 3,000 regions showing evidence of accelerated evolution along the bat lineage that overlapped with H3K27ac peaks, suggesting that these are candidate enhancer regions for wing development. “This study offers a comprehensive resource for future work in comparative limb development,” co-author Mandy Mason told us. “Aside from the results that we have presented in this paper, these open datasets can be queried to help answer additional questions that may be asked by both our and other research groups.”

 

April issue cover: What’s going on here?

Tlalcacahuatl gold by Erin Dewalt


This month’s cover image is a visual tribute to the peanut and its importance to both the ancient civilizations of the Americas and modern agriculture. The genome sequences of the two progenitor species to the cultivated peanut were published in this month’s issue by David Bertioli and colleagues. The genome sequences are the first step to characterizing the genome of cultivated peanut, which was formed by the hybridization of these two species thousands of years ago. The genome sequences give us valuable clues about the evolution of these species. The authors also identified candidate genes for pest resistance, which could lead to advances in peanut cultivation in the future.

The image was inspired by a gold and silver necklace with beads in the shape of peanuts that was found in the tomb of the Great Lord of Sipan of the ancient Peruvian Moche culture. The necklace (c. 300) is now at the Museo Arqueológico Nacional Brüning in Peru. You can see an image of the necklace here and with more context here. The peanuts in the cover image have the same wavy shape as the beads in the necklace. The speckled texture and symmetric division of gold and silverish-blue in the cover image are also inspired by this ancient artifact.

Erin Dewalt, senior graphic designer for Nature Publishing Group, developed the image concept. She shows the peanuts underground, almost dangling from the plant above like beads. Peanut seeds develop underground after the flowers are fertilized. The ovary develops into a “peg” (gynophore) that drives back down into the soil, where it develops into the fruit that we cultivate as peanuts.


Peanut pegs growing into the soil. The tip of the peg, once buried, swells and develops into a peanut fruit. {credit}H. Zell via Wikimedia Commons{/credit}

The title of the image, Tlalcacahuatl gold, is a reference to the ancient Aztec name for peanut, tlalcacahuatl. But it is also a reference to the wealth represented by the peanut, both for ancient cultures and for modern agriculture. Because peanut plants fix nitrogen, thanks to the symbiotic bacteria in their root nodules, they return nutrients to the soil and improve cultivation of other crops (a fact famously advertised to farmers in the U.S. by George Washington Carver).

Tangential reading: The peanut necklace of the Great Lord of Sipan was almost lost to history forever. As this LA Times article from 1988 reported, grave robbers nearly made off with the treasures of the Lord of Sipan, including the necklace.  

 

December issue cover: What’s going on here?


{credit}Sahve Greef & Aurora Lupus{/credit}

This month’s cover image is related to the pineapple genome paper, but is also a celebration of all things genome. The cover art is from a collage produced by young artists Sahve Greef and Aurora Lupus. The image shows a pineapple outline with genome tracks or chromosomes contained within the scales of the outer fruit, all set on a background reminiscent of outer space.

We asked Sahve to give us some insight into the process that led to this design:

I was working on the cover and had a difficult time creating my original concept, which would have been genomes shaped like pineapples, and then it became a pineapple silhouette with genomes shaped like pineapples inside: pineapple inception! It was becoming too complicated, so I was thinking it over. Aurora suggested creating the pineapple’s scales from genome tracks, and we began working together. Originally, I was composing the collage on an orange bristol board, but felt that it made the pineapple appear flat and disappear into the background too much. I wanted to create a dynamic image, one that exploded off the cover and made people wonder, “Hey, what’s going on with that crazy pineapple that just punched me in the eyeballs???” I’m amazed by how incredibly small genomes are in relation to just about everything, and it makes me really think about how small we are in the universe.

To see more of Sahve’s art, visit her Facebook and Tumblr sites. Aurora’s art can be found at her Tumblr site.

 


Original artwork (via Aurora Lupus on Instagram){credit}Sahve Greef & Aurora Lupus{/credit}

On the history of pigs


{credit}Agricultural Research Service via Wikipedia{/credit}

Understanding the genomic changes that occurred during the domestication of animals and plants by humans is important on many levels. Such insights can provide information about human history and our interactions with other species, as is the case with genetic studies of dog and cat domestication. These studies can also help us to improve crop plants (such as tomato) and livestock (such as cattle) for human consumption or other use. Finally, genetic studies on domestication can help to identify disease-causing mutations that have been selected for as a by-product of selection for beneficial traits (for example, in cats and dogs).

Though humans have a huge influence on important traits in domesticated species, those species are still responding to natural selection during the domestication process, which in turn may affect traits important for agricultural purposes. Identifying genomic regions influenced by positive natural selection in domesticated animals  can lead to important insights into the biology of specific breeds.

In this respect, the pig is an excellent model to study. Humans domesticated pigs approximately 10,000 years ago in the Near East and China, but a relatively open method of keeping pigs allowed for continued interbreeding with wild boars for some time. In a study published this week in Nature Genetics, Lusheng Huang, Jun Ren and colleagues from Jiangxi Agricultural University sequenced the genomes of 69 diverse domestic and wild pigs in China to better understand their evolutionary history.


Pig sampling in China{credit}Lusheng Huang{/credit}

The study included pigs from 11 diverse breeds (and 3 populations of wild boar) within China in order to compare the adaptations in breeds from cold vs. hot areas. They identified over 700 genomic regions that showed evidence of selective sweeps. Many of the genes in these regions were involved in processes important for regulation of temperature during cold or heat stress, such as hair development, energy metabolism and blood circulation.

However, one of the most striking results was the identification of a large (~14 Mb) sweep region on the X chromosome. More than 94% of the single nucleotide polymorphisms (SNPs) in the 69-pig sample that had extreme allele frequency differences between North and South populations were located within the X-linked sweep region. All Northern Chinese samples showed a strong signature of selection in this region. Upon further analysis, the authors were able to determine that the most likely scenario, given their data, was that this region was introgressed from a now-extinct species of Sus. This region of the X chromosome undergoes very little recombination. This fact, combined with the strong signal of positive selection in the region, meant the introgressed sequence remained mostly preserved for more than 8 million years.

We asked one of the study’s senior authors, Lusheng Huang, to tell us a little more about the work:

How did you collect the DNA samples from the pigs for your study? Were any of the samples difficult to get?

We collected DNA samples from 4,100 pigs, unrelated within three generations, representing all 68 indigenous breeds distributed across 24 provinces of China. It took us four and a half years to complete sample collection. Some native pigs living in high-altitude regions (Yunnan, Guizhou, Sichuan and Tibet) were very hard to get. Afterwards, we constructed a DNA bank for indigenous pigs from across China. As a pilot study, we first genotyped 520 unrelated pigs (no common ancestor within 3 generations) from 32 Chinese breeds for 60K SNPs on the Illumina porcine beadchip. Then, we selected 69 representative pigs from the 520 according to their genetic relationships in the neighbor-joining tree constructed with the 60K SNP data. The 69 pigs selected for whole-genome sequencing are highly representative of populations at the geographical extremes of China.


{credit}Lusheng Huang{/credit}

Most of the sampled pigs were originally raised in government-sponsored conservation farms. We selected animals to cover a majority of the consanguinity of each breed according to their pedigree information. However, samples of several breeds were collected from isolated villages or farms in rural areas. For example, it was a big challenge for us to collect samples of Tibetan pigs from different geographic populations in the vast region of the Tibet Plateau. To find purebred Tibetan pigs that were not influenced by human-mediated hybridization with exotic breeds, we had to travel to remote pastoral areas at high altitudes and make an in-depth field investigation with the kind help of local residents. To cover the consanguinity of each Tibetan population as broadly as possible, we preferentially collected samples from Tibetan boars, which are usually aggressive like wild boars and were really difficult to get (see the picture above).

What do the positively selected regions tell us about the history of pig domestication?

These regions clearly illustrate that pigs have experienced natural selection for local fitness before (ancient event) or after (recent event) domestication. The selection footprints in the pig genomes can be visualized by whole-genome sequencing, and are characterized by reduced heterozygosity, an excess of low-frequency variants, and extended and differentiated haplotypes. The selective sweep regions harbor functional genes that play a role in adaptation to local environments. DCF17 and VPS13A are two such examples highlighted in this study.
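As an illustration of one of these footprints, reduced heterozygosity, the following is a minimal sketch that averages expected heterozygosity, 2p(1 − p), over non-overlapping genomic windows. It is not the pipeline used in the study: the function name `windowed_heterozygosity`, the input layout and the window size are all assumptions, and the mean is taken over variant sites only rather than per base pair.

```python
def windowed_heterozygosity(positions, alt_freqs, window_size):
    """Mean expected heterozygosity 2p(1-p) per non-overlapping window.

    positions   -- SNP coordinates (integers) on one chromosome
    alt_freqs   -- alternate-allele frequency p at each SNP
    window_size -- window length in base pairs

    Returns a dict mapping each window's start coordinate to the mean
    heterozygosity of the SNPs falling in that window.
    """
    sums = {}
    counts = {}
    for pos, p in zip(positions, alt_freqs):
        window = pos // window_size
        sums[window] = sums.get(window, 0.0) + 2 * p * (1 - p)
        counts[window] = counts.get(window, 0) + 1
    return {w * window_size: sums[w] / counts[w] for w in sorted(sums)}


if __name__ == "__main__":
    # Toy data: two variable SNPs in the first window, one fixed site in the second.
    print(windowed_heterozygosity([10, 20, 110], [0.5, 0.5, 0.0], 100))
```

Windows whose values fall well below the genome-wide background would be candidates for a sweep, although a real scan would combine this with the other signals listed above.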

What do you think was the most unexpected result in this study? Did you believe it at first?

The extremely divergent haplotype in the X-linked sweep region between Southern and Northern Chinese pigs, an indication of a possible ancient interspecies introgression event, was the most unexpected result in this study. It was a big surprise. Frankly speaking, we did not believe it at first.


The pattern of haplotype sharing in diverse populations. The haplotypes were reconstructed for each individual using all of the variants on the X chromosome. Alleles that are identical to or different from the ones in the Wuzhishan reference genome are indicated by red and blue, respectively. Adapted from Fig. 4a in Huashui Ai et al. 2014{credit}Nature Genetics{/credit}

Why is the finding of a large introgression region on the X chromosome important?

Although evidence of adaptive evolution driven by introgression from archaic species has been recently identified in some species including humans, the X-linked introgression region shows that adaptive introgression is not limited to closely related species; in some cases, introgression from very divergent species can provide the basis for the evolution of radically new traits in a species. This radical example of so-called ‘reticulate evolution’ in mammals shakes the foundation of most modern evolutionary biology and provides a new view of adaptive evolution that emphasizes saltationist (sudden) processes driven by introgression. Moreover, as discussed in the paper, our ability to detect this potentially quite old introgression event is facilitated by the fact that the introgressed fragment falls in a region of reduced recombination. This has allowed the introgressed haplotype to be maintained for a prolonged period. Our results may suggest that introgression generally plays a much more dominant role in adaptive evolution than previously thought, but has been difficult to detect because introgressed fragments in other systems degenerate quickly due to recombination.

Do you think similar ancient introgressions have occurred in other domesticated species? If so, how would you test this?

We cannot rule out the possibility. To test this hypothesis, we would suggest using a research strategy similar to that used in this study. First, we would need to get the genome sequences of multiple species divergent from a domesticated species. Then, we can perform a genome-wide scan of the domestic species for possible regions introgressed from another divergent species. Several approaches, including the ABBA and F4 statistics, haplotype sharing and phylogenetic analysis, can be explored to identify such ancient introgressions.
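For readers unfamiliar with the ABBA statistic mentioned here, the sketch below computes Patterson's D (the ABBA-BABA statistic) in its simplest form, from one sequence each for two sister populations (P1, P2), a candidate donor (P3) and an outgroup. This is an illustrative simplification, not the analysis from the paper: the function name `patterson_d` and the input layout are assumptions, sites are assumed to be biallelic, and real studies typically use allele frequencies plus a block jackknife to assess significance.

```python
def patterson_d(sites):
    """Patterson's D from per-site alleles (p1, p2, p3, outgroup).

    The outgroup allele is treated as ancestral ("A"); any other allele
    is derived ("B"). Only two site patterns are informative:
      ABBA: P2 and P3 share the derived allele, P1 is ancestral
      BABA: P1 and P3 share the derived allele, P2 is ancestral
    D = (ABBA - BABA) / (ABBA + BABA). With no gene flow the two counts
    are expected to be equal, so D should be close to zero.
    """
    abba = 0
    baba = 0
    for p1, p2, p3, outgroup in sites:
        if p3 == outgroup:
            continue  # P3 carries the ancestral allele: uninformative
        if p1 == outgroup and p2 == p3:
            abba += 1
        elif p2 == outgroup and p1 == p3:
            baba += 1
    if abba + baba == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / (abba + baba)


if __name__ == "__main__":
    toy_sites = (
        [("A", "G", "G", "A")] * 3   # ABBA
        + [("G", "A", "G", "A")]     # BABA
        + [("A", "A", "G", "A")]     # derived only in P3: ignored
    )
    print(patterson_d(toy_sites))
```

A D value significantly above zero would suggest excess allele sharing between P2 and P3, which is one of the signals that can point to introgression.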

Erhualian

{credit}Lusheng Huang{/credit}

Bonus question: What is your favorite breed of domestic pig?

Erhualian, the most prolific pig breed in the world.