Cuddly Koala Genomics

Rebecca Johnson

The genome assembly of the koala is reported in a paper published online in Nature Genetics. This high-quality genome represents the most complete genome sequence for a marsupial to date. The data give insight into the highly specialized koala diet, consisting of eucalyptus leaves, and provide information that may be useful for combating infectious disease.

Koalas are a vulnerable species, and part of the aim of the project was to use the genomic data to inform conservation efforts. We spoke with lead author Rebecca Johnson to get some background on this work:


How did the koala genome project come to be?

The genome project started as a small group of Australian researchers (from the Australian Museum, University of the Sunshine Coast and University of Sydney) who were enthusiastic about koala conservation and using genomics to manage populations and diseases. We partnered up with colleagues at the Ramaciotti Centre at the University of New South Wales (UNSW) who were enthusiastic to try out their new sequencing equipment on a ‘de novo mammal sized genome’. This hadn’t been done before in Australia.

We decided to take a bit of a risk and announce to the world in 2013 that we were establishing the Koala Genome Consortium and sequencing the genome. This was a very effective way of getting our project on the scientific horizon but then the pressure was on us to deliver! Fortunately for me (and the koala) one of my biggest career risks (announcing the genome well ahead of time) has resulted in a brilliant collaboration of scientists producing a high quality genome with many exciting outcomes and applications.

 

What do you think were the most interesting or surprising findings that came out of the genome data?

So many interesting things have come out of this work that it is difficult for me to pinpoint one in particular. However, as a conservation geneticist I’m particularly fond of the conservation genomics work, especially the historical population reconstruction, which infers what koala populations would have looked like through evolutionary time. It was a little surprising to discover that koalas underwent such a dramatic decrease in population size 30,000-40,000 years ago, which was around the time many of the megafauna were experiencing extinction in Australia. Another surprise was that the three koalas used for this analysis are from two quite geographically separate locations (~600 km apart), but both suggest a dramatic reduction in population size indicative of widespread pressures across the continent.

With this ‘deep-time’ perspective on koala populations, combined with the contemporary population work we did as part of this study, we now have a long-term understanding of koalas in the landscape (i.e. the importance of long-term regional gene flow). Conservation management efforts can now be based on this holistic knowledge rather than a single genetic snapshot taken in time.

 

What are the biggest threats to koalas now?

The koala is now classified as ‘vulnerable’ due to habitat loss and widespread disease. Threats to koalas are multifaceted, with the biggest primarily due to loss and fragmentation of habitat, urbanization, climate change and disease. Current estimates put the number of koalas in Australia at only 329,000 animals (range 144,000-605,000), and a continuing decline is predicted unless measures are put in place to arrest this decline.

 

How do you envision that this genomic information can aid conservation efforts?

The benefit of the genome to conservation efforts is widespread. The population diversity information presented in our work provides the impetus for a conservation management strategy to maintain gene flow regionally while incorporating the genetic legacy of biogeographic barriers. We have also identified the huge contrast in genome-wide levels of diversity across the northern and southern populations of koalas which will be factored into future decision making. The importance of genetic diversity indices for koala conservation has been included in the recently released NSW koala strategy so we will be focusing on highlighting the genetically healthy koala populations and ensuring they maintain regional gene flow. If more intensive measures such as translocations are required (for example from the genetically diverse populations to the genetically depauperate populations), we now have the tools and data to inform those decisions.

The immune gene repertoire we report as part of the genome is also being used directly in efforts to understand the response of koalas to diseases such as chlamydia and the koala retrovirus (KoRV). Several of our collaborators on this work are involved in very important work developing and trialing vaccines for both chlamydia and KoRV. The genome affords the ability to understand which immune genes are up- or down-regulated in response to disease or treatment and provides the platform for future therapies to be tailored to the genome level.

 

What is it like working with koalas? Do you have any good stories that you would like to share?

It never gets tiring working with koalas and it was not difficult at all to bring collaborators on board to work on this project!

Koalas are notoriously chilled-out animals (spending most of their time sleeping or eating), although my friends and colleagues who wrangle them in the field do report how unpleasant it is to be on the receiving end of their extremely sharp claws and nippy diprotodont teeth!

As part of sequencing the genome, our efforts to extract suitable quality DNA from koala blood were unsuccessful (possibly because they have a high lipid content in their blood). The only way we could get suitable quality DNA was to wait for an animal to be euthanized so we could access tissues suitable for genome and transcriptome work. Our two females were euthanized because they had advanced untreatable chlamydia. It is an extremely sobering experience to be involved in these necropsies because you can see the ravages of the disease on the body. While these moments are very tough, they also inspire you to work harder to ensure we are producing the best possible science to conserve this amazing species.

 

For more video information, please see:

 

https://www.youtube.com/watch?v=tcMCni28nNo&t=4s

Sea lamprey genomics


Jeramiah Smith

The sea lamprey (Petromyzon marinus) is an important model in evolutionary biology. It was discovered in 2009 (https://www.pnas.org/content/106/27/11212.long) that the genome of the sea lamprey undergoes extensive programmed genome rearrangement during development, where ~0.5 Gb (around 20%) of DNA is eliminated from the genome. The somatic tissues contain smaller genomes and only the germ cells retain the full complement of genetic material. The genome of the sea lamprey had been sequenced previously from the blood and liver, so only the somatic genome has been thoroughly characterized (https://www.nature.com/articles/ng.2568).

Smith et al., Nature Genetics, 2018


In a paper published this week in Nature Genetics, Jeramiah Smith and colleagues report the germline genome sequence of the sea lamprey. Using a combination of shotgun and long-read sequencing integrated with scaffolding data and a meiotic map, the authors assembled a high-quality genome with a near-chromosome level of contiguity. This allowed them to identify hundreds of genes that were systematically eliminated from the genome during development. Comparative analysis showed that mouse homologues of these genes are often marked by repressive complexes, indicating parallel strategies for programmed development.

We spoke with lead author Jeramiah Smith from the University of Kentucky to get some background on this research:

  • What inspired you to sequence the germline of the sea lamprey?

I have worked with lamprey for years. I originally got involved with lamprey because it holds a special place in the vertebrate tree of life that sheds light on the common ancestor of all vertebrates. That was the motivation for the first lamprey genome project, which sequenced DNA from blood and liver cells. Once we started working with lamprey we found out that the genome was much more complex than we ever anticipated. This included the fact that the genome changes its sequence content in a reproducible manner over the course of its normal development: something we call programmed genome rearrangement. The amount of DNA that is eliminated from sea lamprey is more than is present in some entire fish genomes, roughly half a billion bases. For me, this finding was the major inspiration behind sequencing the germline genome.

 

  • What do you think were the most surprising or interesting findings to come out of the sequencing?

There were quite a few, but the strong overlap between programmed genome rearrangement and Polycomb-mediated silencing was near the top. The other was the rather strong evidence suggesting that some chromosomes, including chromosomes carrying the HOX genes, appear to have duplicated rather recently and seemingly independently from the rest of the genome. It’s a really strange genome.

 

  • Can you comment on programmed genetic elimination as a developmental strategy versus Polycomb-mediated silencing? 

Polycomb-mediated silencing arose deep in our evolutionary history, and is even present in unicellular organisms. We know that lamprey possesses human homologs of all Polycomb genes, but also uses programmed elimination. The difference between programmed elimination and other mechanisms of gene silencing is that programmed elimination is essentially irreversible, given that the DNA is physically removed. This means that the genes can never be expressed after an embryonic cell lineage has undergone elimination. Other silencing mechanisms are generally reversible, meaning that gene expression can be reactivated. In some cases reactivation is important, for example in the context of development and regeneration. But in other cases activation of genes in the wrong tissue can cause diseases, such as cancer. Lamprey seems to know which genes should never be reactivated outside of the germline.

 

  • What is the most challenging part about working with sea lamprey?

The genome! Aside from undergoing complex changes during development, it also contains a large amount of repetitive DNA and a lot of sequence polymorphism. These features present substantial challenges for assembly and downstream analyses, but we’ve found that they can also be useful tools. We’ve used the abundance of sequence polymorphisms as a tool for mapping genes in lamprey, and we now think that some classes of repeats are going to be critical for our future work aimed at figuring out how eliminated DNA is identified and packaged in the early embryos. Lampreys also only breed once a year and take from 5 to maybe 20 years to mature. This makes some experiments impossible, but lamprey researchers are very creative and the community has figured out how to get a lot done in this system.

  • What organisms would you like to see sequenced in the future to help resolve the evolutionary relationships of vertebrates?

There are so many! Hagfish are going to be critical. They are another deep lineage that provides important perspective on vertebrate evolution and also happen to undergo programmed DNA elimination. There are also two other deep lamprey lineages that I think will be important. Those species live in the southern hemisphere and diverged from sea lamprey around 300 million years ago, as opposed to the roughly 600 million year divergence between lampreys and other vertebrates. A lot of evolution can happen over 600 million years and these species should help bridge that gap. Salamanders and other amphibians are also going to fill important gaps and teach us a lot about the way vertebrate genomes evolve and function. It also seems certain that new sequencing technologies are going to give us better genomes for other important species that have already been sequenced (e.g. amphioxus, sharks and shark relatives, and even sea lamprey). Finally, I think the zebra finch germline genome will be really interesting. They seem to have recently evolved something similar to lamprey’s programmed eliminations, and have a chromosome that’s unique to their germline. I’d really like to know what’s on that chromosome.

The Colorful Carrot Genome


Iorizzo et al. Nature Genetics, 2016

A high-quality assembly of the carrot (Daucus carota) genome is reported this week in Nature Genetics. Carrot is an important crop due to its high content of vitamin A precursors, alpha- and beta-carotenes, as well as its popularity in global cuisines. The bright orange color of the modern carrot and its high carotenoid content are features that emerged through selection and breeding; the complete genome sequence will serve as a resource to aid breeders in crop improvement strategies.

Iorizzo et al., 2016, Nature Genetics


Sequencing the carrot genome allowed for the identification of two novel whole-genome duplication events and 634 proposed pest and disease resistance genes. In addition, a novel candidate gene regulating carotenoid accumulation was found. Finally, the authors re-sequenced 35 carrot species and outgroups to determine genomic regions associated with domestication and to estimate genetic diversity. Further phylogenomic comparisons with other plants clarified evolutionary divergence between carrot and tomato, grape and kiwifruit.

Iorizzo et al., 2016, Nature Genetics


We spoke with lead author Philipp Simon to get some background on the research.

How did you end up working on carrots?

The position I am in focuses on carrot genetics and breeding. It was advertised soon after I completed my Ph.D. in genetics. The ability to do genetic research on a crop with a strong positive impact on consumers appealed to me. I was fortunate enough to enter that position.

What do you consider your most surprising result coming out of sequencing the whole genome?

The discovery of a candidate gene for the Y locus, which conditions the accumulation of carotenoid pigments in carrot roots. In previous work we were able to map the trait and also genes for enzymes in the carotenoid biosynthetic pathway, but none of those genes involved in carotenoid biosynthesis mapped with the Y locus. With a well-characterized genome available, we discovered a candidate for that important gene. The Y locus is one of the two genes responsible for the domestication of wild white carrots (ancestral wild type) to orange.

What user group do you think will benefit the most from these data?

The immediate users of the whole genome sequence will be plant breeders, for the marker-assisted selection they have underway for carrot disease resistance and seed production traits. There are also several public sector labs doing more basic research on carrot pigments, biotic and abiotic stress response, reproduction, and evolution that will find it useful.

You propose an interesting model for carotenoid accumulation in the carrot. How might this knowledge be applied to the potential improvement of other crops?

 There are several possibilities. The knowledge of this mutation in carrot may provide insights for identifying similar mutations in sequenced genomes of other crops, or generating similar mutations with genome editing technologies, for example. This could have application with other root crops such as cassava, but similar mutations are also known to influence pigment accumulation in fruit crops, so there may be applications beyond root crops.

What are some of your future directions going forward now that the genome assembly is complete?

 Now we are using the carrot genome to understand genes for other carrot traits, including traits influencing accumulation of carotenoids, anthocyanins, carbohydrates and flavor terpenoids; pest and disease resistance; abiotic stress responses; plant reproduction and growth.

Bonus: do you have a favorite carrot recipe?

Regarding carrots in my diet, I usually eat raw carrots, but roasted or stir-fried carrots are also very tasty.

Biting into the pineapple genome

“Pineapple and cross section” by fir0002 | flagstaffotos.com.au, Canon 20D + Sigma 150mm f/2.8. Own work. Licensed under GFDL 1.2 via Commons - https://commons.wikimedia.org/wiki/File:Pineapple_and_cross_section.jpg

The genome sequences of cultivated pineapple (Ananas comosus) and a related wild species (Ananas bracteatus) were published last week by Ming et al. in Nature Genetics. The genome has already led to insights into monocot evolution and CAM photosynthesis. In the future, studies that use the pineapple genome have the potential to lead to innovations in engineering drought-resistant crops.

Every species, plant, animal or microorganism, that is sequenced is a useful resource for the research community. But each time a new genome is sequenced, we ask “what is really new about this one” and “what are we learning about biology”?  Pineapple is of course a delicious and economically important crop, but what makes its genome special?

There are a number of important aspects of pineapple biology that make it an important genome to sequence. First, pineapple uses a metabolic strategy known as crassulacean acid metabolism (CAM). CAM allows the plant to conserve water, making it more resistant to drought. Only one other CAM plant has had its genome sequenced, the orchid Phalaenopsis equestris.


Credit: Zhong-Jian Liu, National Orchid Conservation Center of China

Another reason to study the pineapple’s genome is to understand how self-incompatibility has evolved in monocotyledon plants. Wild pineapple species are self-compatible, but cultivated pineapples are not. As a result, cultivated pineapple is highly heterozygous. This aspect of pineapple biology also makes sequencing its genome technically challenging. Fortunately, the authors of the study devised a way around this potential problem to generate an extremely high-quality genome assembly (see the image, courtesy of Zhong-Jian Liu, who was not affiliated with the study).

One of the most interesting aspects of the pineapple genome was only discovered after the genome was assembled. As the study’s authors found, pineapple has conserved the order of genes on its chromosomes more so than any other monocot studied to date. This high degree of synteny with the hypothetical ancestral monocot makes pineapple an ideal outgroup for comparative evolutionary studies involving other monocot species, such as grasses.

We spoke to the lead author of the study, Ray Ming, to learn a little more about how the study was conducted.

The genomes of many plants have been sequenced, or are in the process of being sequenced. Why did you decide to focus on pineapple?

I started my career at the Hawaii Agriculture Research Center and have been working on genomics of Hawaiian crops, including papaya, pineapple, sugarcane, and coffee.  We sequenced the papaya genome first.  It is a logical choice to sequence the smallest genome of the remaining three next. In addition, pineapple is the most economically important CAM plant crop, the second most important tropical fruit, is self-incompatible, and prone to somatic mutations.

How was the idea arrived at to use hybrids (the F153 x CB5 F1 cross) to overcome issues of high heterozygosity in the assembly process? Was this the initial plan, or were there other ideas as well?

We anticipated the difficulty of assembling the heterozygous pineapple genome. Before we started the genome project, I discussed this issue with co-author John Bowers during the International Plant and Animal Genome Conference in San Diego, and John was the one who came up with the idea to sequence an F1 individual at deep coverage to have a single molecule from each parent for phasing to improve the assembly of the reference genome F153. Co-author Michael Schatz implemented this strategy, and also designed sophisticated approaches to improve the assembly of this heterozygous genome as detailed in the Methods section. Mike’s team did an outstanding job producing a high-quality assembly of this highly heterozygous genome. Mike is a pioneer and a leading scientist in assembling complicated and complex plant genomes.

We also tried to sequence the genome from a single sperm cell to generate haploid genome sequences, but it wasn’t successful. The long reads from Moleculo and PacBio improved the genome assembly, and the ultra-high density map from re-sequencing F1 individual genomes substantially improved the quality of the genome assembly and corrected 199 chimeric scaffolds.

Did you expect to see such high levels of conservation of synteny with ancestral monocots in the pineapple?

No. It was a surprise, but it makes sense since pineapple is self-incompatible and vegetatively propagated, hence having fewer generations of sexual reproduction in its evolutionary history.

How do you envision others using the pineapple genome sequence in their research?

The pineapple genome will be used for CAM photosynthesis research as a model system, and it will be used as a reference genome or even the reference genome for comparative genomics in monocots.

Bonus question: What is your favorite fruit?

Pineapple for its extraordinary flavor and aroma, and papaya for its number 1 nutritional value among fruits, and for its flavor.