Cuddly Koala Genomics

Rebecca Johnson

The genome assembly of the koala is reported in a paper published online in Nature Genetics. This high-quality genome represents the most complete genome sequence for a marsupial to date. The data give insight into the highly specialized koala diet, consisting of eucalyptus leaves, and provide information that may be useful for combating infectious disease.

Koalas are a vulnerable species, and part of the aim of the project was to use the genomic data to inform conservation efforts. We spoke with lead author Rebecca Johnson to get some background on this work:

How did the koala genome project come to be?

The genome project started as a small group of Australian researchers (from the Australian Museum, University of the Sunshine Coast and University of Sydney) who were enthusiastic about koala conservation and using genomics to manage populations and diseases. We partnered up with colleagues at the Ramaciotti Centre at the University of New South Wales (UNSW) who were enthusiastic to try out their new sequencing equipment on a ‘de novo mammal sized genome’. This hadn’t been done before in Australia.

We decided to take a bit of a risk and announce to the world in 2013 that we were establishing the Koala Genome Consortium and sequencing the genome. This was a very effective way of getting our project on the scientific horizon but then the pressure was on us to deliver! Fortunately for me (and the koala) one of my biggest career risks (announcing the genome well ahead of time) has resulted in a brilliant collaboration of scientists producing a high quality genome with many exciting outcomes and applications.

 

What do you think were the most interesting or surprising findings that came out of the genome data?

So many interesting things have come out of this work, so it is difficult for me to pinpoint one in particular. However, as a conservation geneticist I’m particularly fond of the conservation genomics work, particularly the historical population reconstruction which infers what koala populations would have looked like through evolutionary time. It was a little surprising to discover that koalas underwent such a dramatic decrease in population size 30-40kya, which was around the time many of the megafauna were experiencing extinction in Australia. Another surprise was that the three koalas used for this analysis are from two quite geographically separate locations (~600 km apart) but both suggest a dramatic reduction in population size indicative of widespread pressures across the continent.

Having this ‘deep-time’ perspective on koala populations, combined with the contemporary population work we did as part of this study, we have a long-term understanding of koalas in the landscape (i.e. the importance of long-term regional gene flow). Conservation management efforts can now be based on this holistic knowledge rather than a single genetic snapshot taken in time.

 

What are the biggest threats to koalas now?

The koala is now classified as ‘vulnerable’ due to habitat loss and widespread disease. Threats to koalas are multifaceted, with the biggest primarily due to loss and fragmentation of habitat, urbanization, climate change and disease. Current estimates put the number of koalas in Australia at only 329,000 animals (range 144,000-605,000), and a continuing decline is predicted unless measures are put in place to arrest this decline.

 

How do you envision that this genomic information can aid conservation efforts?

The benefit of the genome to conservation efforts is widespread. The population diversity information presented in our work provides the impetus for a conservation management strategy to maintain gene flow regionally while incorporating the genetic legacy of biogeographic barriers. We have also identified the huge contrast in genome-wide levels of diversity across the northern and southern populations of koalas which will be factored into future decision making. The importance of genetic diversity indices for koala conservation has been included in the recently released NSW koala strategy so we will be focusing on highlighting the genetically healthy koala populations and ensuring they maintain regional gene flow. If more intensive measures such as translocations are required (for example from the genetically diverse populations to the genetically depauperate populations), we now have the tools and data to inform those decisions.

The immune gene repertoire we report as part of the genome is also being used directly in efforts to understand the response of koalas to diseases such as chlamydia and the koala retrovirus (KoRV). Several of our collaborators on this work are involved in very important work developing and trialing vaccines for both chlamydia and KoRV. The genome affords the ability to understand which immune genes are up- or down-regulated in response to disease or treatment and provides the platform for future therapies to be tailored to the genome level.

 

What is it like working with koalas? Do you have any good stories that you would like to share?

It never gets tiring working with koalas and it was not difficult at all to bring collaborators on board to work on this project!

Koalas are notoriously chilled-out animals (spending most of their time sleeping or eating), although my friends and colleagues who wrangle them in the field do report how unpleasant it is to be on the receiving end of their extremely sharp claws and nippy diprotodont teeth!

As part of sequencing the genome, our efforts to extract suitable quality DNA from koala blood were unsuccessful (possibly because they have a high lipid content in their blood). The only way we could get suitable quality DNA was to wait for an animal to be euthanized so we could access tissues suitable for genome and transcriptome work. Our two females were euthanized because they had advanced untreatable chlamydia. It is an extremely sobering experience to be involved in these necropsies because you can see the ravages of the disease on the body. While these moments are very tough, they also inspire you to work harder to ensure we are producing the best possible science to conserve this amazing species.

 

For more video information, please see:

 

https://www.youtube.com/watch?v=tcMCni28nNo&t=4s

Sea lamprey genomics

Jeramiah Smith

The sea lamprey (Petromyzon marinus) is an important model in evolutionary biology. It was discovered in 2009 (https://www.pnas.org/content/106/27/11212.long) that the genome of the sea lamprey undergoes extensive programmed genome rearrangement during development, where ~0.5 Gb (around 20%) of DNA is eliminated from the genome. The somatic tissues contain smaller genomes and only the germ cells retain the full complement of genetic material. The genome of the sea lamprey had been sequenced previously from the blood and liver, so only the somatic genome has been thoroughly characterized (https://www.nature.com/articles/ng.2568).

Smith et al., Nature Genetics, 2018

In a paper published this week in Nature Genetics, Jeramiah Smith and colleagues report the germline genome sequence of the sea lamprey. Using a combination of shotgun and long-read sequencing integrated with scaffolding data and a meiotic map, the authors assembled a high-quality genome with near-chromosome level of contiguity. This allowed them to identify hundreds of genes that were systematically eliminated from the genome during development. Comparative analysis showed that mouse homologues of these genes are often marked by repressive complexes, indicating parallel strategies for programmed development.

We spoke with lead author Jeramiah Smith from the University of Kentucky to get some background on this research:

  • What inspired you to sequence the germline of the sea lamprey?

I have worked with lamprey for years. I originally got involved with lamprey because it holds a special place in the vertebrate tree of life that sheds light on the common ancestor of all vertebrates. That was the motivation for the first lamprey genome project, which sequenced DNA from blood and liver cells. Once we started working with lamprey we found out that the genome was much more complex than we ever anticipated. This included the fact that the genome changes its sequence content in a reproducible manner over the course of its normal development: something we call programmed genome rearrangement. The amount of DNA that is eliminated from sea lamprey is more than is present in some entire fish genomes, roughly half a billion bases. For me, this finding was the major inspiration behind sequencing the germline genome.

 

  • What do you think were the most surprising or interesting findings to come out of the sequencing?

There were quite a few, but the strong overlap between programmed genome rearrangement and Polycomb-mediated silencing was near the top. The other was the rather strong evidence suggesting that some chromosomes, including chromosomes carrying the HOX genes, appear to have duplicated rather recently and seemingly independently from the rest of the genome. It’s a really strange genome.

 

  • Can you comment on programmed genetic elimination as a developmental strategy versus Polycomb-mediated silencing? 

Polycomb-mediated silencing arose deep in our evolutionary history, and is even present in unicellular organisms. We know that lamprey possesses homologs of all human Polycomb genes, but also uses programmed elimination. The difference between programmed elimination and other mechanisms of gene silencing is that programmed elimination is essentially irreversible, given that the DNA is physically removed. This means that the genes can never be expressed after an embryonic cell lineage has undergone elimination. Other silencing mechanisms are generally reversible, meaning that gene expression can be reactivated. In some cases reactivation is important, for example in the context of development and regeneration. But in other cases activation of genes in the wrong tissue can cause diseases, such as cancer. Lamprey seems to know which genes should never be reactivated outside of the germline.

 

  • What is the most challenging part about working with sea lamprey?

The genome! Aside from undergoing complex changes during development, it also contains a large amount of repetitive DNA and a lot of sequence polymorphism. These features present substantial challenges for assembly and downstream analyses, but we’ve found that they can also be useful tools. We’ve used the abundance of sequence polymorphisms as a tool for mapping genes in lamprey and we now think that some classes of repeats are going to be critical for our future work aimed at figuring out how eliminated DNA is identified and packaged in the early embryos. Lampreys also only breed once a year and take from 5 to maybe 20 years to mature. This makes some experiments impossible, but lamprey researchers are very creative and the community has figured out how to get a lot done in this system.

  • What organisms would you like to see sequenced in the future to help resolve the evolutionary relationships of vertebrates?

There are so many! Hagfish are going to be critical. They are another deep lineage that provides important perspective on vertebrate evolution and also happen to undergo programmed DNA elimination. There are two other deep lamprey lineages that I also think will be important. Those species live in the southern hemisphere and diverged from sea lamprey around 300 million years ago, as opposed to the roughly 600 million year divergence between lampreys and other vertebrates. A lot of evolution can happen over 600 million years and these species should help bridge that gap. Salamanders and other amphibians are also going to fill important gaps and teach us a lot about the way vertebrate genomes evolve and function. It seems certain that new sequencing technologies are going to give us better genomes for other important species that have already been sequenced (e.g. amphioxus, sharks and shark relatives, and even sea lamprey). Finally, I think the zebra finch germline genome will be really interesting. They seem to have recently evolved something similar to lamprey’s programmed eliminations, and have a chromosome that’s unique to their germline. I’d really like to know what’s on that chromosome.

Mutation rates of Mycobacterium tuberculosis: From the archives (2013)

Mycobacterium tuberculosis- credit: NIH-NIAID (CC-BY)

Continuing with our month-long celebration of Nature Genetics’ 25th anniversary, I have chosen to highlight a study by Sarah Fortune and colleagues, published in June 2013, that estimated mutation rate differences between lineages of Mycobacterium tuberculosis.

Multidrug resistance in M. tuberculosis is a global problem, and understanding the origins and dynamics of the emergence of resistance is an important scientific and public health endeavor.

Building on their previous work that used whole genome sequencing to estimate mutation rates of M. tuberculosis during latent infection, the authors then went on to study the rate at which different strains acquire drug resistance mutations. Using classical fluctuation tests and measuring rifampicin resistance in both clinical and laboratory isolates, they determined the mutation rates for strains from lineage 2 and lineage 4, observing an order of magnitude difference between them, with lineage 2 having the higher rate. These lineage 2 strains also acquired resistance to other antibiotics (ethambutol, isoniazid) at a higher rate than lineage 4 strains.
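To make the idea of a fluctuation test concrete, here is a minimal sketch of the classical p0 (null-class) estimator. It is an illustration only: the function name, the colony counts and the culture size are hypothetical, and the estimator used in the study itself may well have been a different one. The p0 method assumes that the number of mutation events per culture is Poisson distributed, so the fraction of cultures with no resistant colonies is p0 = exp(-m), where m is the expected number of mutations per culture.

```python
import math


def p0_mutation_rate(colony_counts, final_cells_per_culture):
    """Estimate a mutation rate from a fluctuation test with the p0 method.

    colony_counts: resistant colonies counted for each parallel culture
                   (hypothetical data; one number per culture).
    final_cells_per_culture: final number of cells in each culture (assumed
                             equal across cultures for simplicity).
    Returns the estimated number of mutations per cell.
    """
    n_cultures = len(colony_counts)
    if n_cultures == 0:
        raise ValueError("need at least one culture")
    n_zero = sum(1 for count in colony_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        # p0 = 0 or p0 = 1 gives no usable estimate with this method.
        raise ValueError("p0 method needs some, but not all, cultures with zero colonies")
    p0 = n_zero / n_cultures
    mutations_per_culture = -math.log(p0)  # m = -ln(p0)
    return mutations_per_culture / final_cells_per_culture


if __name__ == "__main__":
    # Made-up counts for two strains, purely to show the calculation.
    strain_a = [0, 0, 1, 0, 3, 0, 0, 12, 0, 0]
    strain_b = [0, 2, 5, 1, 0, 40, 3, 7, 1, 9]
    cells = 1e8  # assumed final culture size
    rate_a = p0_mutation_rate(strain_a, cells)
    rate_b = p0_mutation_rate(strain_b, cells)
    print(f"strain A: {rate_a:.2e} mutations per cell")
    print(f"strain B: {rate_b:.2e} mutations per cell")
    print(f"ratio B/A: {rate_b / rate_a:.1f}")
```

Comparing the two estimates as a ratio mirrors how a rate difference between lineages would be expressed, although real analyses typically also report confidence intervals.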

The authors then sought to relate the in vitro data to the in vivo infection environment. They analyzed whole-genome sequences from a lineage 4 outbreak and determined the base substitution rate; the in vivo data were in concordance with the in vitro per-day mutation rate.

Finally, the authors took these data and developed a simulation model of the evolution of drug resistance during infection in a human host. They simulated the emergence of multidrug resistance and showed that, in the model, individuals infected with lineage 2 strains had a substantially higher risk of acquiring multidrug resistance mutations.

Using a combination of in vitro, clinical and simulated data, Ford et al. contributed to our understanding of the emergence of multidrug resistance, highlighting the differences between strains and underscoring the importance of timely and sufficient treatment.

A CRISPR screen for HIV targets

A new study published online this week in Nature Genetics reports the discovery of novel host targets of HIV infection identified from a high-throughput CRISPR/Cas9-based screen. This screen was performed in CD4+ T cells and was designed to find candidate genes required for successful HIV infection, but whose inactivation did not affect cell viability. In this way, potential drug targets for anti-HIV therapy could be discovered.

Park et al., Nature Genetics 2016

The authors found two known (CCR5 and CD4) and three novel (ALCAM, SLC35B2 and TPST2) cellular factors that, upon abrogation, prevented HIV infection but did not have any negative effects on the cell itself. These new genes are involved in sulfation and cell aggregation pathways and represent candidate targets for interventional HIV therapy.

We spoke with first author Ryan Park to get some background on this research:

 Previous screens for host factors affecting HIV pathogenesis found a high number of hits, with low reproducibility across screens.  With your CRISPR/Cas9 approach, were you expecting similar results? Did the low number of hits in your screen surprise you?

We designed our screen stringently, as the existing literature has not been clear on what genes would potentially serve as good targets for host-directed anti-HIV therapies. Our goal was thus to identify these host factors with high confidence while maintaining an unbiased approach. The very low number of hits was certainly surprising, though, as you note, the limited overlap among the previous screens raised the suspicion of a high false positive rate and/or low reproducibility.

You find three novel genes that are dispensable for cell viability but that are needed for successful HIV infection.  Do you think that there could be natural polymorphisms in these genes in human populations that might mitigate susceptibility to HIV entry and transmission?

In the Exome Aggregation Consortium (ExAC) dataset recently published in Nature, there are individuals with truncations and/or homozygous missense mutations in each of the three genes, as well as ITGAL (the loss of which we find is protective against HIV infection in primary CD4+ T cells). More work remains to be done to determine whether these individuals are relatively less susceptible to HIV infection.

Due to the high mutation rate of HIV and the emergence of resistance to drug therapies, potential targeting of host factors can be a useful strategy.  Do you envision these findings being utilized to develop novel anti-HIV therapies?

Host-targeted HIV therapies are of great interest for multiple reasons. Firstly, as you note, the emergence of drug-resistant HIV strains remains a major issue, particularly in settings where adherence to a daily antiretroviral regimen is challenging. Drug-resistant strains are less likely to emerge in the face of incomplete adherence to host-targeted therapies. Secondly, the identification of host factors may also serve as a basis for gene therapies (in which gene editing is used to produce a population of HIV-resistant target cells) that could result in a permanent HIV cure. As noted above, more work remains to be done to determine whether inactivation of these genes protects against HIV infection at the organismal level without causing detrimental effects.

How might this screen be adapted to find host factors important at other stages of the HIV life cycle and do you have future plans to explore such work?

Our screen captured all but the latest stages of the HIV life cycle (particularly virion assembly, budding, and maturation); this is because HIV Tat, which drives the GFP reporter in our cell line model, is expressed prior to these steps. Development of an alternative reporter system that is activated by virion budding or maturation would allow identification of host factors involved only at these late stages. Because completion of the HIV life cycle is not required for host cell killing by HIV, cells lacking these late-acting host factors may still not be captured in a screen; more importantly, these late-acting host factors may therefore not be attractive therapeutic targets.

Can this screening method be employed to find host factors important for infection by other viruses?  Do you speculate that there would be viruses for which a large number of non-essential host factors would be identified as important for infection?

The key elements of our approach, which include identification of a physiologically relevant cell line and the use of a high-complexity genome-wide sgRNA library, can be readily generalized to identify host factors that are critical to the propagation of any viral pathogen yet dispensable for cell viability. Our findings suggest that the number of non-essential host factors that are critical for HIV infection is quite limited, and that many candidate host factors identified by other screens or targeted studies may not be required for HIV infection or may compromise cell viability. Whether this is the case for other viruses is hard to know, but we have demonstrated that our approach can be quite powerful and specific in identifying the range of potential host targets with high confidence.

 

Ubiquitin, keratin and skin fragility

Lin et al. Nature Genetics, 2016

Protein degradation is a highly coordinated process with multiple levels of regulation, including both targeted and autodegradation. This sophisticated cascade of protein turnover must be precisely balanced to maintain proper physiological function. A recent article published in Nature Genetics reports the discovery of a gene with protein-truncating mutations that lead to the skin condition epidermolysis bullosa, which is characterized by a tendency to blister, itching and other abnormalities. The authors found five patients, all with start codon mutations in the KLHL24 gene, which encodes Kelch-like protein 24, a substrate receptor of the cullin 3 (CUL3)–RBX1–KLHL24 ubiquitin ligase complex.

Lin et al., Nature Genetics, 2016

The mutant proteins from these patients were found to be stabilized, with increased levels in patient samples, leading the authors to hypothesize that KLHL24 may target a substrate that is important for the structural integrity of the skin.  Indeed, through mass spectrometry and biochemical analysis, they identify keratin 14 (KRT14) as a KLHL24 substrate, and find that KRT14 levels are decreased in patient samples. Keratin 14 is an intermediate filament component important for maintaining keratinocyte integrity and mutations in the gene are found in some epidermolysis bullosa patients. The authors further show that KLHL24 is autoubiquitinated and that the truncated mutant has reduced levels of autoubiquitination, stabilizing the protein. This increased KLHL24 stability leads to increased KRT14 degradation, resulting in the skin fragility phenotype observed in the patients.

Lin et al., Nature Genetics, 2016

Although dynamic regulation of keratins by the ubiquitin–proteasome system had been proposed, no targeting E3 ligases had been identified. This work established KLHL24 as a keratin-targeting E3 ligase.

 

We spoke with authors Dr. Xu Tan and Dr. Yong Yang to get some background on their research.

Can you briefly describe how you found the KLHL24 mutations in these different patients?

The first three epidermolysis bullosa (EB) patients were first screened for the 18 previously known causative genes but no mutations were found. Then we performed whole exome sequencing and pinned down only one common variant gene among all three patients, namely KLHL24. We then acquired samples from two additional patients without mutations in the 18 known causative genes and used Sanger sequencing to show that both of them also have the mutations in the same KLHL24 gene, confirming that this is a new causative gene of EB.

All the patients you studied had start codon mutations leading to truncations in the protein. This must have been intriguing. What were your initial thoughts about this finding?

We were shocked. The first thought was that these must be gain-of-function mutations, unlike all the other EB mutations, which are loss-of-function mutations that can occur all over the place.

 

You very nicely demonstrate a model whereby Keratin 14 is an ubiquitination substrate of KLHL24, and that the truncated mutant is stabilized, thus leading to greater Keratin 14 degradation and the skin fragility phenotype. Can you walk us through how you teased apart this model? What do you consider the key piece of evidence that supports this model?

We used an unbiased “pull down + mass spectrometry” method to look for the binding proteins to the substrate binding domain of KLHL24 and Keratin 14 was the only one we found that specifically binds KLHL24 but not a carefully designed mutant that is predicted structurally to lose the substrate binding capacity. We immediately verified the binding and also showed that knocking down/overexpressing KLHL24 can increase/decrease Keratin 14 levels. A key piece of evidence is that transfection of KLHL24 in cell lines can boost Keratin 14 ubiquitination. Afterwards, we obtained two important pieces of in vivo evidence to show the anti-correlation of KLHL24 level and Keratin 14 level (in human skin samples and a knock-in mouse model), nicely confirming that Keratin 14 is a ubiquitination substrate of KLHL24.

 

You make a knock-in mouse, which recapitulates the decreased Keratin 14 levels similar to what is seen in patients, but not the skin fragility phenotype. Can you comment on why this might be so?

Many differences exist between human and mouse skin. The most obvious is the presence of fur in mouse skin, which might afford better mechanical support of the epidermis than in human skin. In addition, there is actually a small but significant difference between the degrees of Keratin 14 decrease in patients and the mouse model (~70% decrease in patients vs. ~50% decrease in mice). Previous mouse models with a ~50% decrease of Keratin 14 (the Krt14+/- mouse model) also did not show skin fragility. We don’t yet know the reason for the differential decrease of levels in human and mouse skin but are working on finding out the answers.

 

Do your findings have any potential implications for novel therapies for epidermolysis bullosa?

Absolutely. As I mentioned, these are the first gain-of-function mutations found for EB, which should be easier to target therapeutically than loss-of-function mutations. Inhibiting KLHL24 in the patients that we identified with these types of mutations should be able to effectively treat the condition. We are now actively working on finding a specific KLHL24 inhibitor. In addition, because KLHL24 is a negative regulator of Keratin 14, other EB patients with partial loss-of-function mutations of Keratin 14 could also be helped by treatment with a KLHL24 inhibitor. In general, there are high hopes for drug development targeting the ubiquitin-proteasome pathway, but it is not obvious how to target the pathway specifically. Our studies provide a good example showing the importance of autoubiquitination of an E3 ligase, which might suggest previously overlooked strategies to target E3s.

 

Farm to Genomes: African Rice

Meyer et al., Nature Genetics, 2016

Rice is one of the most important crops on the planet, responsible for feeding billions of people. Given this global significance, studying rice in different geographies can be useful and aid in harnessing genetic diversity underlying particular traits and adaptations favorable to different environments. African rice (Oryza glaberrima Steud.) is mainly grown in sub-Saharan Africa and known for its stress tolerance. In a new article this week in Nature Genetics, Michael Purugganan and colleagues report the whole genome re-sequencing of 93 African rice landraces from various regions of Western coastal and sub-Saharan Africa. They create a genome-wide SNP map and through comparative genomic analysis study the domestication and population history of African rice. They use their map to perform GWAS for salt tolerance and find 11 significantly associated regions, highlighting the value of this unique genetic resource.

Meyer et al., Nature Genetics, 2016

By studying various regions with distinct environments, the authors were able to get clues about adaptation and geographic spread of the populations. They focused on coastal Senegal and inland Togo, which have higher and lower levels of soil salinity, respectively, and interviewed farmers in the region to understand the agricultural practices they employ in each region. The knowledge of the farmers helped to inform the genetic analysis and contributed to the model of African rice domestication and dispersal.

You can watch some of the interviews with the farmers here:

African rice farmers- interviews

Additionally, we spoke with authors Michael Purugganan and Rachel Meyer to get some background on this research.

Why do you think that rice is understudied in Africa compared to other places?

MP: I think it’s because it is not widely grown, unlike its Asian counterpart which has pretty much taken over the world.  But there definitely is more interest in African rice as breeders are trying to figure out how to increase food production in Africa, as well as to try to see what genes in African rice can be used to improve Asian rice.

RM: There is a lot of great research on improving Asian rice for African farmers that is being done by brilliant AfricaRice scientists, and they are working hard on the social science side too. But there are so many challenges that Africa disproportionately faces – particularly climate variation – that demand ramping up rice research. There is insufficient support for programs that integrate crop experiments and trials into the different farmlands. A better connection between scientists and small-scale farmers would really help farmers adopt new varieties too, because there is sometimes resistance to trying new ones.

How did you choose which samples to include in your analysis?

RM: Recognizing that a lot of NGO work encouraging farmers to grow Asian rice ramped up in the 80’s and 90’s, we took advantage of the germplasm largely donated in the 70’s to the West Africa Rice Development Association, which were duplicated and available through IRRI (International Rice Research Institute). We chose accessions with the most metadata available, preferring ones with georeferenced location and a cultivar name. It wasn’t until later that we realized water tables far inland were high in salinity, so we just tried to make sure we had a fair number of samples within 250km of the coast, or along rivers connecting to the ocean.

Were you surprised by any of your findings?

MP: There definitely were a few surprises in the data, but the big revelation for me was the long time for the population bottleneck that led to domestication.  We found from the genomic data that it may have taken more than 10,000 years of steady population decline before full-blown domesticated African rice shows up in the archaeological record.  This suggests the possibility that humans were already cultivating or managing its ancestor for thousands of years, and I think if this pattern holds for other domesticated crop species it will change our thinking on how domestication has taken place.

RM: I was surprised we got nice GWAS results with so few samples, and even more surprised that we saw several of those exhibiting signatures of geographic selection. We were lucky to find a broad distribution of traits in the landraces we chose to sequence, for we had made the DNA libraries ahead of the phenotyping experiments.

What was it like to meet and talk with the farmers?

RM: It was one of the highlights of my life to meet the farmers! I’m grateful to have gotten a glimpse of their heritage, their pride, and their struggles. We were all so impressed with the generosity of women, in particular, to help each other. We were also shocked by how many farms are run by the elderly; their children don’t see farming as profitable and many have left. For the three of us in the field, it made us think hard about how we can give back to the communities that gave us their time. I hope that crop science, publicity (like this blog) and policy changes can raise the profile of the small-scale farmer.

In each interview, the farmers also had a chance to interview us, and that part was especially interesting. Several asked really good questions about African and Asian rice domestication. You could see the cultural value of the basic science.

You chose to focus on salinity tolerance as a trait particularly relevant to farming in Africa.  In what ways do you see your results being used for crop improvement?

RM: One of the authors, from AfricaRice, Dr. Kofi Bimpong, had actually been working on salt tolerance separately as well, and has two graduate students collecting African rice landraces in Casamance. If from this paper we can consider that domestication possibly occurred in the Inner Niger Delta region and also in the West, then these collecting efforts are all the more important because they are from a center of origin, promising more genetic variation than people would have ever estimated. If you look through the available germplasm there is so little that has been collected or studied from Casamance. It’s tricky collecting there, for there is social unrest, and landmines. Hats off to the young graduate students, Mamadou Sock and Bathe Diop, doing that fieldwork; I’m sure there is a lot of discovery to be made with those collections, and more promising salt tolerant landraces to integrate into breeding programs.

In addition, our results suggest that many of the salt tolerance genes are shared between the two rice species, which makes them more valuable to explore in other crops. Shared adaptive mechanisms are especially fascinating to evolutionary biologists and are powerful assets in the breeder’s toolbox.

Cancer clones- mixing and spreading

McPherson et al., Nature Genetics 2016

The trajectory of tumor cells during metastasis can be influenced by many factors, including the physical environment and the genetic makeup of metastatic clones. In high-grade serous ovarian cancer, there are limited barriers in the intraperitoneal space, allowing for extensive spreading and mixing of tumor cells. A recent article published in Nature Genetics explores these different patterns of clonal evolution in metastatic ovarian cancer using a combination of bulk and single cell sequencing.

The authors characterized the mutation landscapes of different metastatic tumors and found both monophyletic and polyphyletic clones. While in most patients there was unidirectional seeding from the original ovarian tumor, two patients exhibited polyclonal spread and reseeding. Therefore, high-grade serous ovarian cancer cells can migrate through and establish metastases within the intraperitoneal space via different evolutionary routes.

We spoke to lead author, Sohrab Shah, to get some background on this research.

What features of this particular cancer made you want to study its metastasis? Were you surprised by your findings?

High-grade serous ovarian cancers are often widespread through the peritoneal cavity at diagnosis. We wanted to ask what are the characteristics of cells that spread and what is the distribution of these cells throughout the abdominal lesions. The focus was to study the disease state prior to any treatment to characterize the diversity and take an inventory of the ‘substrate’ of clones upon which treatment selective pressures may be acting. Many patients experience relapse after initial response to treatment. Mapping which clones lead to relapse remains a key question in the field. This was borne out in one patient in our study where specific clones that led to relapses were already present at diagnosis but only represented a minority of branches in the clonal phylogeny.

It is important to note that the mode of spread in this disease differs from most solid cancers, where spread is achieved through the bloodstream or lymphatics.  Ovarian cancer represents a unique opportunity to study disease spread through a relatively physically unencumbered anatomic space.  One might expect that in such an environment the potential for clonal intermixing is high.   This might lead to many clones co-existing at many sites.  But the majority of intraperitoneal samples were clonally pure, suggesting unidirectional spreading from ovary sites with diverse clonal repertoires, and a lack of clonal intermixing.

You provide evidence that the microenvironment influences the metastatic success of tumors. What does this say about in vitro cancer models that don’t account for tissue context?

One of the intriguing findings suggested that specific clones were present in specific sites.  This may indicate that particular microenvironments are differently suited to particular clones. Another surprising finding was that every patient harbored at least one lesion that was very diverse in its clonal make-up (typically within primary ovary sites).  This leads to the natural question of whether properties of specific microenvironments in some way promote or ‘tolerate’ clonal diversity.  If this were the case, then both in vitro and in vivo model systems such as cell lines, organoids and mouse xenografts may not adequately represent the natural disease state we find in patients prior to treatment.

How did you choose your sampling strategy?

The study results are naturally biased by the sampling strategy.  The study design was subject to what material could be obtained during the provision of care.  In our setup, we consented patients for collection and study of all material removed at primary debulking surgery.  Wherever possible tissue was cryopreserved, but inevitably many deposits were preserved in formalin.   Our strategy led to acquisition of a median n=10 samples per patient.  The nature of the samples and their locations are presented in Figure 4 and are also available in interactive web-form at:

https://compbio.bccrc.ca/research/tumour-evolution/

Users can click on the links for each patient and explore the clonal maps.

You utilize both bulk and single cell sequencing as complementary approaches to elucidating tumor evolution. Can you comment on the trade offs between cost and throughput and how you chose your sample sizes?

The field is entering an interesting time. There are several limitations to both bulk and single cell sequencing strategies to define the clonal constituents of a tumor sample. Most single cell techniques suffer from vast under-sampling of the clonal repertoire since they are limited in throughput and may only practically yield data from 100s of cells. Furthermore, single cell techniques are prone to two key experimental sources of noise: missing data and allele-dropout. We used targeted, multiplexed single cell sequencing as a form of validation of inferences made from the bulk sampling, including validating co-occurrence of point mutations and structural variations in the same cells. Hypotheses were generated from multi-site bulk analysis and were then tested using orthogonal single cell approaches. Accordingly, the sample sizes in single cell were chosen to identify clones that were detected in bulk samples, in the range of 5% prevalence. Notably, the noise properties of targeted multiplexed single cell data required some careful statistical treatment, the results of which were published as a standalone contribution in Nature Methods simultaneously with this publication. As the field moves forward, it may become practical to sequence the whole genomes of 1000s of cells per sample. I look forward to the day when a single experimental design would be sufficient to dissect the important clones present in a cancer. This would enable studying evolutionary properties at scale, leveraging richly defined principles and statistical models from the field of population genetics.
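
Editor's note: the link between clone prevalence and the number of cells sampled can be illustrated with simple binomial arithmetic. The sketch below is not the authors' method; it assumes cells are drawn independently and at random, and it ignores the missing data and allele dropout mentioned above. The function names and the 95% confidence default are hypothetical choices made for illustration.

```python
import math


def prob_clone_sampled(n_cells: int, prevalence: float) -> float:
    """Probability that at least one of n_cells randomly sampled cells
    belongs to a clone present at the given prevalence (0 < prevalence < 1).
    Assumes independent sampling and perfect detection (an idealization)."""
    return 1.0 - (1.0 - prevalence) ** n_cells


def cells_needed(prevalence: float, confidence: float = 0.95) -> int:
    """Smallest number of cells such that a clone at the given prevalence
    is sampled at least once with probability >= confidence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= c for n, then round up to a whole cell.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - prevalence))


if __name__ == "__main__":
    n = cells_needed(0.05, 0.95)
    print(f"cells needed for a 5% clone at 95% confidence: {n}")
    print(f"chance of sampling a 5% clone in 100 cells: "
          f"{prob_clone_sampled(100, 0.05):.3f}")
```

Under these idealized assumptions, a few dozen cells are enough to sample a 5% clone at least once, which is consistent with experiments yielding hundreds of cells; real designs would need more cells to absorb dropout and missing data.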

You find that there are differences in the potential for migration and metastasis across the tumors from your patients. What clinical implications might this have?

Our study is underpowered to provide a clear answer on this. Our results hint anecdotally that cases with strong patterns of unidirectional spread fared poorly in their treatment trajectories. Whether the presence of clones with strong potential to invade new micro-environments and dominate their local landscapes indicates a potential to evade chemotherapy remains an important question to consider. As we take this study forward in model systems derived from spatially distinct sites, reproducible treatment selection experiments can be carried out to robustly address this question.

 

The Colorful Carrot Genome

Iorizzo et al. Nature Genetics, 2016

A high-quality assembly of the carrot (Daucus carota) genome is reported this week in Nature Genetics. Carrot is an important crop due to its high content of vitamin A precursors, alpha- and beta-carotenes, as well as its popularity in global cuisines. The bright orange color of the modern carrot and its high carotenoid content are features that emerged through selection and breeding; the complete genome sequence will serve as a resource to aid breeders in crop improvement strategies.

Sequencing the carrot genome allowed for the identification of two novel whole-genome duplication events and 634 proposed pest and disease resistance genes. In addition, a novel candidate gene regulating carotenoid accumulation was found. Finally, the authors re-sequenced 35 carrot species and outgroups to determine genomic regions associated with domestication and to estimate genetic diversity. Further phylogenomic comparisons with other plants clarified evolutionary divergence between carrot and tomato, grape and kiwifruit.

We spoke with lead author Philipp Simon to get some background on the research.

How did you end up working on carrots?

The position I am in focuses on carrot genetics and breeding. It was advertised soon after I completed my Ph.D. in genetics. The ability to do genetic research on a crop with a strong positive impact on consumers appealed to me. I was fortunate enough to enter that position.

What do you consider your most surprising result coming out of sequencing the whole genome?

The discovery of a candidate gene for the Y locus, which conditions the accumulation of carotenoid pigments in carrot roots. In previous work we were able to map the trait and also genes for enzymes in the carotenoid biosynthetic pathway, but none of those genes involved in carotenoid biosynthesis mapped with the Y locus. With a well-characterized genome available, we discovered a candidate for that important gene. The Y locus is one of the two genes responsible for the domestication of wild white carrots (ancestral wild type) to orange.

What user group do you think will benefit the most from these data?

The immediate users of the whole genome sequence will be plant breeders, for the marker-assisted selection they have underway for carrot disease resistance and seed production traits. There are also several public sector labs doing more basic research on carrot pigments, biotic and abiotic stress response, reproduction, and evolution that will find it useful.

You propose an interesting model for carotenoid accumulation in the carrot. How might this knowledge be applied to the potential improvement of other crops?

 There are several possibilities. The knowledge of this mutation in carrot may provide insights for identifying similar mutations in sequenced genomes of other crops, or generating similar mutations with genome editing technologies, for example. This could have application with other root crops such as cassava, but similar mutations are also known to influence pigment accumulation in fruit crops, so there may be applications beyond root crops.

What are some of your future directions going forward now that the genome assembly is complete?

 Now we are using the carrot genome to understand genes for other carrot traits, including traits influencing accumulation of carotenoids, anthocyanins, carbohydrates and flavor terpenoids; pest and disease resistance; abiotic stress responses; plant reproduction and growth.

Bonus: do you have a favorite carrot recipe?

Regarding carrots in my diet, I usually eat raw carrots, but roasted or stir-fried carrots are also very tasty.

Genetic link between type 1 and type 2 diabetes

Dooley et al., Nature Genetics 2016

Type 1 and Type 2 diabetes (T1D and T2D) are complex diseases characterized by insulin signaling defects resulting from either autoimmune dysregulation or metabolic dysfunction, respectively. Both cause disruption of blood glucose regulation and can lead to significant systemic effects. Despite the physiological distinctions underlying disease development, there are commonalities between T1D and T2D; in T1D, pancreatic beta cells are targeted by the immune system, while in T2D there is gradual, progressive beta cell mass decline. There are some shared genetic risk factors associated with both conditions, but distinguishing between genetic versus secondary causes related to beta cell failure has been challenging.

A new study this week in Nature Genetics reports on a T1D model and the identification of genetic loci underlying beta cell fragility, independent of an immune component. T1D non-obese diabetic (NOD) mice expressing the insHEL transgene, which causes unfolded protein stress, developed diabetes, and the authors determined that this was not dependent on adaptive immunity. They characterize mutations in two genes, Glis3 and Xrcc4, which compound the stress effects, leading to apoptosis. Changes in these molecular pathways are likewise reflected in islet cells of diabetes patients. This mouse model, therefore, could be useful in studying possible targets to prevent beta cell loss.

Bacterial methylomes and antibiotic potentiation

Cohen et al., Nature Genetics, 2016

Antibiotics emerged as miracle drugs and “silver bullets” in the early 20th century, revolutionizing medicine and our ability to combat infectious disease while positively impacting health and lifespans on a large scale. This remarkable triumph held steady for many years, and consequently antibiotic research and development diminished as a priority due to the seeming defeat of bacterial infections. However, the selective pressure that came with antibiotic exposure led to the development of bacterial resistance to these compounds, motivating renewed interest in what is now an extremely important public health issue. Mechanisms of resistance are many and ever-evolving, and we know now that it is not a matter of IF bacteria will become resistant to a class of antibiotics, but when. The search for new and potentially exploitable bacterial vulnerabilities, then, becomes a constant enterprise in order for us to keep pace with the bacteria in the antibiotics/resistance arms race.

A new study this week in Nature Genetics describes how manipulating the bacterial DNA methylome affects susceptibility to multiple classes of antibiotics. The authors observed that deleting the dam gene, encoding a DNA methyltransferase, from E. coli causes increased susceptibility to sub-lethal doses of the β-lactam antibiotic ampicillin. Dam specifically methylates GATC sites, and deletion of any of the other three DNA methyltransferases found in E. coli had no effect on the level of antibiotic susceptibility. Using SMRT sequencing, the authors saw that genome-wide GATC methylation patterns did not change after exposure to ampicillin, so they sought alternative explanations for the observed phenotype.