Sea lamprey genomics

Jeramiah Smith

The sea lamprey (Petromyzon marinus) is an important model in evolutionary biology. It was discovered in 2009 (https://www.pnas.org/content/106/27/11212.long) that the genome of the sea lamprey undergoes extensive programmed genome rearrangement during development, where ~0.5 Gb (around 20%) of DNA is eliminated from the genome. The somatic tissues contain smaller genomes and only the germ cells retain the full complement of genetic material. The genome of the sea lamprey had been sequenced previously from the blood and liver, so only the somatic genome has been thoroughly characterized (https://www.nature.com/articles/ng.2568).

Smith et al., Nature Genetics, 2018

In a paper published this week in Nature Genetics, Jeramiah Smith and colleagues report the germline genome sequence of the sea lamprey. Using a combination of shotgun and long-read sequencing integrated with scaffolding data and a meiotic map, the authors assembled a high-quality genome with near-chromosome-level contiguity. This allowed them to identify hundreds of genes that were systematically eliminated from the genome during development. Comparative analysis showed that mouse homologues of these genes are often marked by repressive complexes, indicating parallel strategies for programmed development.

We spoke with lead author Jeramiah Smith from the University of Kentucky to get some background on this research:

  • What inspired you to sequence the germline of the sea lamprey?

I have worked with lamprey for years. I originally got involved with lamprey because it holds a special place in the vertebrate tree of life that sheds light on the common ancestor of all vertebrates. That was the motivation for the first lamprey genome project, which sequenced DNA from blood and liver cells. Once we started working with lamprey we found out that the genome was much more complex than we ever anticipated. This included the fact that the genome changes its sequence content in a reproducible manner over the course of its normal development: something we call programmed genome rearrangement. The amount of DNA that is eliminated from sea lamprey is more than is present in some entire fish genomes, roughly half a billion bases. For me, this finding was the major inspiration behind sequencing the germline genome.

 

  • What do you think were the most surprising or interesting findings to come out of the sequencing?

There were quite a few, but the strong overlap between programmed genome rearrangement and Polycomb-mediated silencing was near the top. The other was the rather strong evidence suggesting that some chromosomes, including chromosomes carrying the HOX genes, appear to have duplicated rather recently and seemingly independently from the rest of the genome. It’s a really strange genome.

 

  • Can you comment on programmed genetic elimination as a developmental strategy versus Polycomb-mediated silencing? 

Polycomb-mediated silencing arose deep in our evolutionary history, and is even present in unicellular organisms. We know that lamprey possesses human homologs of all Polycomb genes, but also uses programmed elimination. The difference between programmed elimination and other mechanisms of gene silencing is that programmed elimination is essentially irreversible, given that the DNA is physically removed. This means that the genes can never be expressed after an embryonic cell lineage has undergone elimination. Other silencing mechanisms are generally reversible, meaning that gene expression can be reactivated. In some cases reactivation is important, for example in the context of development and regeneration. But in other cases activation of genes in the wrong tissue can cause diseases, such as cancer. Lamprey seems to know which genes should never be reactivated outside of the germline.

 

  • What is the most challenging part about working with sea lamprey?

The genome! Aside from undergoing complex changes during development, it also contains a large amount of repetitive DNA and a lot of sequence polymorphism. These features present substantial challenges for assembly and downstream analyses, but we’ve found that they can also be useful tools. We’ve used the abundance of sequence polymorphisms as a tool for mapping genes in lamprey, and we now think that some classes of repeats are going to be critical for our future work aimed at figuring out how eliminated DNA is identified and packaged in the early embryos. Lampreys also only breed once a year and take from 5 to maybe 20 years to mature. This makes some experiments impossible, but lamprey researchers are very creative and the community has figured out how to get a lot done in this system.

  • What organisms would you like to see sequenced in the future to help resolve the evolutionary relationships of vertebrates?

There are so many! Hagfish are going to be critical. They are another deep lineage that provides important perspective on vertebrate evolution and also happen to undergo programmed DNA elimination. There are also two other deep lamprey lineages that I think will be important. Those species live in the southern hemisphere and diverged from sea lamprey around 300 million years ago, as opposed to the roughly 600 million year divergence between lampreys and other vertebrates. A lot of evolution can happen over 600 million years and these species should help bridge that gap. Salamanders and other amphibians are also going to fill important gaps and teach us a lot about the way vertebrate genomes evolve and function. It also seems certain that new sequencing technologies are going to give us better genomes for other important species that have already been sequenced (e.g. amphioxus, sharks and shark relatives, and even sea lamprey). Finally, I think the zebra finch germline genome will also be really interesting. They seem to have recently evolved something similar to lamprey’s programmed eliminations, and have a chromosome that’s unique to their germline. I’d really like to know what’s on that chromosome.

Woolly mammoth hemoglobin brought to life: From the archives (2010)

{credit}Cave painting: Mammouth gravé de la grotte des Combarelles (Dordogne, France){/credit}

As part of the ongoing celebration of the last 25 years of Nature Genetics, the editors are each choosing a few papers from our archives that we want to highlight. My first pick is a paper from Kevin Campbell, Alan Cooper and colleagues on their structure-function analysis of woolly mammoth hemoglobin, published in May 2010.

I’ve picked this one to highlight because, well, who doesn’t love woolly mammoths?

The authors compared the gene sequences of the adult-expressed α- and β-like globin genes from extant elephant species (African and Asian elephants) and from a 43,000-year-old Siberian mammoth specimen first reported in Science. They found that the mammoth β-like genes (designated HBB/HBD by the authors) had three amino acid-altering substitutions compared to the extant species.

To test the effects of these protein-coding differences, the authors then “resurrected” the mammoth hemoglobin protein by expressing the mammoth sequence in E. coli and testing its O2 affinity at different ambient temperatures. They found that the O2 affinity of the recreated mammoth hemoglobin is less affected by temperature than that of modern-day elephants. The detailed structure-function analysis reported by Campbell et al. offered us a rare glimpse into the evolutionary process that shaped an extinct organism.

October issue cover: What’s going on here?

{credit}Convergent cabbages by Keyong Chang{/credit}

For all of October, we at Nature Genetics have been admiring the lovely cabbages on our cover. The image, created by photographer Keyong Chang, was contributed by the authors of the study on page 1218 of the issue.

But what is the story behind these pretty green cabbages?

Xiaowu Wang, corresponding author of the study, gave us a behind-the-scenes look at the process that led to the picture on our cover.

The image conveys the main idea of the study, namely that Brassica oleracea (cabbage, left) and Brassica rapa (Chinese cabbage, right) have taken similar evolutionary paths to arrive at their similar, but distinct, appearances. During domestication, farmers selected for cabbages of both species to have the large, leafy heads for which they are known. As shown in the study, the farmers were unknowingly selecting for orthologous genes in these two species.

Farm to Genomes: African Rice

Meyer et al., Nature Genetics, 2016

Rice is one of the most important crops on the planet, responsible for feeding billions of people. Given this global significance, studying rice in different geographies can be useful and aid in harnessing genetic diversity underlying particular traits and adaptations favorable to different environments. African rice (Oryza glaberrima Steud.) is mainly grown in sub-Saharan Africa and known for its stress tolerance. In a new article this week in Nature Genetics, Michael Purugganan and colleagues report the whole genome re-sequencing of 93 African rice landraces from various regions of Western coastal and sub-Saharan Africa. They create a genome-wide SNP map and through comparative genomic analysis study the domestication and population history of African rice. They use their map to perform GWAS for salt tolerance and find 11 significantly associated regions, highlighting the value of this unique genetic resource.

Meyer et al., Nature Genetics, 2016

By studying various regions with distinct environments, the authors were able to get clues about adaptation and geographic spread of the populations. They focused on coastal Senegal and inland Togo, which have higher and lower levels of soil salinity, respectively, and interviewed farmers in the region to understand the agricultural practices they employ in each region. The knowledge of the farmers helped to inform the genetic analysis and contributed to the model of African rice domestication and dispersal.

You can watch some of the interviews with the farmers here:

African rice farmers- interviews

Additionally, we spoke with authors Michael Purugganan and Rachel Meyer to get some background on this research.

Why do you think that rice is understudied in Africa compared to other places?

MP: I think it’s because it is not widely grown, unlike its Asian counterpart which has pretty much taken over the world.  But there definitely is more interest in African rice as breeders are trying to figure out how to increase food production in Africa, as well as to try to see what genes in African rice can be used to improve Asian rice.

RM: There is a lot of great research on improving Asian rice for African farmers that is being done by brilliant AfricaRice scientists, and they are working hard on the social science side too. But there are so many challenges that Africa disproportionately faces – particularly climate variation – that demand ramping up rice research. There is insufficient support for programs that integrate crop experiments and trials into the different farmlands. A better connection between scientists and small-scale farmers would really help farmers adopt new varieties too, because there is sometimes resistance to trying new ones.

How did you choose which samples to include in your analysis?

RM: Recognizing that a lot of NGO work encouraging farmers to grow Asian rice ramped up in the 80’s and 90’s, we took advantage of the germplasm largely donated in the 70’s to the West Africa Rice Development Association, which were duplicated and available through IRRI (International Rice Research Institute). We chose accessions with the most metadata available, preferring ones with georeferenced location and a cultivar name. It wasn’t until later that we realized water tables far inland were high in salinity, so we just tried to make sure we had a fair number of samples within 250km of the coast, or along rivers connecting to the ocean.

Were you surprised by any of your findings?

MP: There definitely were a few surprises in the data, but the big revelation for me was the long time for the population bottleneck that led to domestication.  We found from the genomic data that it may have taken more than 10,000 years of steady population decline before full-blown domesticated African rice shows up in the archaeological record.  This suggests the possibility that humans were already cultivating or managing its ancestor for thousands of years, and I think if this pattern holds for other domesticated crop species it will change our thinking on how domestication has taken place.

RM: I was surprised we got nice GWAS results with so few samples, and even more surprised that we saw several of those exhibiting signatures of geographic selection. We were lucky to find a broad distribution of traits in the landraces we chose to sequence, for we had made the DNA libraries ahead of the phenotyping experiments.

What was it like to meet and talk with the farmers?

RM: It was one of the highlights of my life to meet the farmers! I’m grateful to have gotten a glimpse of their heritage, their pride, and their struggles. We were all so impressed with the generosity of women, in particular, to help each other. We were also shocked by how many farms are run by the elderly; their children don’t see farming as profitable and many have left. For the three of us in the field, it made us think hard about how we can give back to the communities that gave us their time. I hope that crop science, publicity (like this blog) and policy changes can raise the profile of the small-scale farmer.

In each interview, the farmers also had a chance to interview us, and that part was especially interesting. Several asked really good questions about African and Asian rice domestication. You could see the cultural value of the basic science.

You chose to focus on salinity tolerance as a trait particularly relevant to farming in Africa.  In what ways do you see your results being used for crop improvement?

RM: One of the authors, from AfricaRice, Dr. Kofi Bimpong, had actually been working on salt tolerance separately as well, and has two graduate students collecting African rice landraces in Casamance. If from this paper we can consider that domestication possibly occurred in the Inner Niger Delta region and also in the West, then these collecting efforts are all the more important because they are from a center of origin, promising more genetic variation than people would have ever estimated. If you look through the available germplasm there is so little that has been collected or studied from Casamance. It’s tricky collecting there, for there is social unrest, and landmines. Hats off to the young graduate students, Mamadou Sock and Bathe Diop, doing that fieldwork; I’m sure there is a lot of discovery to be made with those collections, and more promising salt tolerant landraces to integrate into breeding programs.

In addition, our results suggesting many of the salt tolerance genes are shared in both rice species make them more valuable to explore in other crops.  Shared adaptive mechanisms are especially fascinating to evolutionary biologists and are powerful assets of the breeder’s toolbox.

May issue cover: What’s going on here?

This month’s cover image is inspired by the Article on p. 528 of this issue, by Jeff Wall, Nicola Illing, Nadav Ahituv and colleagues. The paper reports the genome of the bat Miniopterus natalensis and transcriptional dynamics in the developing bat wing. This species, one of a group known as vesper bats, is also known as the Natal long-fingered bat and is found in parts of Africa.

The image chosen for the cover is a frontal view of a bat embryo at a late stage of development (stage CS21) taken by study co-author Mandy Mason. This developmental stage is known as “Translucent Wing”, as you can clearly see the skeletal structures in the wing and the membrane between the outstretched digits. The embryo in this image was stained with Alizarin red (maroon-red-pink) for bone and Alcian blue (blue-cyan) for cartilage. The image was actually taken as part of an earlier study to understand the progression of limb development in this species and to compare it with that of the mouse.

The current study presents not only the genome sequence of the Natal long-fingered bat, but also RNA-seq and ChIP-seq (for H3K27ac and H3K27me3) profiling of the developing limbs. The authors identified more than 7,000 genes that were differentially expressed between the forelimbs—the eventual wings—and the hindlimbs. Through comparative genomics analyses, they found nearly 3,000 regions showing evidence of accelerated evolution along the bat lineage that overlapped with H3K27ac peaks, suggesting that these are candidate enhancer regions for wing development. “This study offers a comprehensive resource for future work in comparative limb development,” co-author Mandy Mason told us. “Aside from the results that we have presented in this paper, these open datasets can be queried to help answer additional questions that may be asked by both our and other research groups.”

 

The Colorful Carrot Genome

Iorizzo et al. Nature Genetics, 2016

A high-quality assembly of the carrot (Daucus carota) genome is reported this week in Nature Genetics. Carrot is an important crop due to its high content of vitamin A precursors, alpha- and beta-carotenes, as well as its popularity in global cuisines. The bright orange color of the modern carrot and its high carotenoid content are features that emerged through selection and breeding, and the complete genome sequence will serve as a resource to aid breeders in crop improvement strategies.

Iorizzo et al., 2016, Nature Genetics

Sequencing the carrot genome allowed for the identification of two novel whole-genome duplication events and 634 proposed pest and disease resistance genes. In addition, a novel candidate gene regulating carotenoid accumulation was found. Finally, the authors re-sequenced 35 carrot species and outgroups to determine genomic regions associated with domestication and to estimate genetic diversity. Further phylogenomic comparisons with other plants clarified evolutionary divergence between carrot and tomato, grape and kiwifruit.

Iorizzo et al., 2016, Nature Genetics

We spoke with lead author Philipp Simon to get some background on the research.

How did you end up working on carrots?

The position I am in focuses on carrot genetics and breeding. It became advertised soon after I completed my Ph.D. in genetics. The ability to do genetic research on a crop with a strong positive impact on consumers appealed to me. I was fortunate enough to enter that position.

What do you consider your most surprising result coming out of sequencing the whole genome?

The discovery of a candidate gene for the Y locus, which conditions the accumulation of carotenoid pigments in carrot roots. In previous work we were able to map the trait and also genes for enzymes in the carotenoid biosynthetic pathway, but none of those genes involved in carotenoid biosynthesis mapped with the Y locus. With a well-characterized genome available, we discovered a candidate for that important gene. The Y locus is one of the two genes responsible for the domestication of wild white carrots (ancestral wild type) to orange.

What user group do you think will benefit the most from these data?

The immediate users of the whole genome sequence will be plant breeders, for the marker-assisted selection they have underway for carrot disease resistance and seed production traits. There are also several public sector labs doing more basic research on carrot pigments, biotic and abiotic stress response, reproduction, and evolution that will find it useful.

You propose an interesting model for carotenoid accumulation in the carrot. How might this knowledge be applied to the potential improvement of other crops?

 There are several possibilities. The knowledge of this mutation in carrot may provide insights for identifying similar mutations in sequenced genomes of other crops, or generating similar mutations with genome editing technologies, for example. This could have application with other root crops such as cassava, but similar mutations are also known to influence pigment accumulation in fruit crops, so there may be applications beyond root crops.

What are some of your future directions going forward now that the genome assembly is complete?

 Now we are using the carrot genome to understand genes for other carrot traits, including traits influencing accumulation of carotenoids, anthocyanins, carbohydrates and flavor terpenoids; pest and disease resistance; abiotic stress responses; plant reproduction and growth.

Bonus- do you have a favorite carrot recipe?

Regarding carrots in my diet, I usually eat raw carrots, but roasted or stir-fried carrots are also very tasty.

Ancient regulatory logic

Yao et al. found that certain brain enhancers were functionally conserved between mice (left) and acorn worm (right), despite very limited sequence conservation. {credit}Douglas Epstein{/credit}

A study published this week in Nature Genetics shows that enhancers can be conserved across very long evolutionary distances, even without extensive sequence conservation.

What makes a parasite?

Genetic clues to what makes parasitic worms different from free-living worms are reported in a paper published online this week in Nature Genetics. Groups led by Mark Viney, Matthew Berriman and Taisei Kikuchi carried out the sequencing and assembly of genomes from six nematode species from the clade that includes the human parasitic roundworm Strongyloides stercoralis. We asked one of the authors, Professor Mark Viney of the University of Bristol, to tell us a little bit about the study.

Although the genomes of several parasitic worm species have been published to date, Strongyloides represents a unique opportunity to learn some of the general rules of being a parasitic worm. According to Mark Viney, “what makes Strongyloides so special is that this clade contains parasites, facultative parasites and free-living species that are all close relatives. This gives us real power to our analysis.  Our work will be used by the international research community who work on these globally important parasites of people and other animals.”

S. stercoralis infects approximately 30-100 million people worldwide and causes a wide range of symptoms. Closely related species in the clade Strongyloides include both free-living and parasitic species that infect a wide range of hosts. In parasitic species, generations alternate between parasitic and free-living, resulting in genetically identical females with starkly different lifestyles.

The authors first compared the genomes of free-living and parasitic species to identify genes specific to the parasites. They found that acquisition of 1,075 gene families was associated with the evolution of parasitism and parasitism was associated with greater expansion of genes and gene families overall.

When asked what the most unexpected aspect of the study was, Professor Viney said, “I think the really surprising thing that we found was just how largely expanded some gene families were in the parasitic species. This is quite unprecedented in the nematodes.” The authors also found that most parasitism-related genes were located in genomic clusters. “The important thing about these clusters is that nothing like this has ever been seen before in parasitic worms and it certainly speaks to the possible importance of these in their evolution of parasitism,” said Professor Viney.

 

The life cycle of the 6 sequenced species and the gene gains and losses in each lineage. {credit}Hunt et al. Nat. Genet. 2016{/credit}

Two gene families were especially expanded in parasitic genomes—those encoding SCP/TAPS and astacin-domain proteins—and based on RNA-sequencing studies, these were also much more highly expressed in parasitic females than free-living females of the same species. This suggests that these gene families in particular are important for the ability of the worm to infect its host. In support of this hypothesis, the authors found that proteins from these two families are secreted by the worms, and would therefore be able to interact with host tissues to aid in invasion and migration.

Asked about the next steps that need to be taken for these findings, Mark Viney said, “For these SCP/TAPS coding genes what we really need to do is to find out what these genes are doing—this is completely unknown at the moment. For the astacins we can probably guess what they do—being involved in digesting host tissue so that the parasites can feed. They might be potential drug targets.”

The study brought together groups from the UK, Japan, Taiwan, Germany, USA, Mexico and Australia and is one of many examples of successful collaboration in science. “The field of parasitology is a very friendly and interactive community,” said Professor Viney, “so this collaboration was very easy to bring together, and worked extremely well—and will do in the future as well.”

 

To learn more about this study, check out this blog post from one of the co-first authors, Adam Reid, at the Wellcome Trust Sanger Institute. More coverage can also be found at the University of Bristol website.

 

Reference:

Hunt, V.L., Tsai, I.J., Coghlan, A., Reid, A.J., et al. The genomic basis of parasitism in the Strongyloides clade of nematodes. Nat. Genet. (doi: 10.1038/ng.3495, 1 February 2016)

The paper is available for free online: https://www.nature.com/ng/journal/vaop/ncurrent/full/ng.3495.html

 

 

Pollinators and Petunias

Sheehan et al., Nature Genetics, 2015

Pollinators are attracted to flowers based on certain characteristics, including color, scent and morphology. Evolutionary changes in these traits correlate with changes in pollinator-plant relationships, and pollinator syndromes, or suites of floral characteristics that influence pollinator identity, can differ greatly between even closely related species.  Thus, characterizing the molecular basis that underlies shifts in pollinator syndromes can lead to the discovery of speciation genes, as well as to a greater understanding of evolutionary trajectories and timelines that define the species.

A new study this week in Nature Genetics reports on a gene that controls levels of ultraviolet (UV) light absorbance in different species of Petunia, affecting whether the flowers are pollinated by bees, hawkmoths or hummingbirds. Through a series of elegant experiments involving QTL analysis, genetic crosses and a transposon mutagenesis screen, the authors were able not only to find a single gene, but also to describe the particular mutations responsible for the increased UV absorbance seen in one species and the decreased absorbance seen in another.

Sheehan et al., Nature Genetics 2015

The MYB-FL gene that they isolated is a transcription factor that regulates FLS (flavonol synthase) and thus directly controls the production of flavonol, a compound that absorbs UV light. Flowers with high UV absorbance have a concomitant decrease in visible light absorbance, and this is reflected by pollinator preference. Species with low UV absorbing flowers have pink or red coloring and are pollinated by bees or hummingbirds, while species with high UV absorbing flowers have white coloring and are pollinated by (the nocturnal) hawkmoth. The authors found that the high UV absorbing species has a promoter mutation in the MYB-FL gene that increases its expression, while in the low UV absorbing species that is pollinated by hummingbirds, there is a frameshift mutation in the MYB-FL locus that compromises the function of the protein.

Through this analysis, the authors were able to formulate a model for the evolutionary relationships between three Petunia species. Colorful flowers that have low UV absorbance and that are bee-pollinated represent the ancestral state, as exemplified by P. inflata. The increased UV absorbance of the white flowered, hawkmoth-pollinated P. axillaris evolved via a gain-of-function cis-regulatory mutation in MYB-FL that increases its expression and thus, flavonol production. Finally, a subsequent inactivating frameshift mutation seen in P. exerta restored low UV absorbance and is associated with colorful flowers that are pollinated by hummingbirds.

Sheehan et al., Nature Genetics 2015

 

We spoke with lead investigator Cris Kuhlemeier to get some background on this research.

Why do you work with Petunia? Is it a particularly good subject for studying pollination syndrome shifts?

Our goal is to find the plant genes responsible for the adaptation to different pollinators. For that, we need a system with good molecular genetics and well-defined pollination syndromes. The garden petunia has a long history as a genetic model system; today it is probably best known for the discovery of RNAi. Wild Petunia species are adapted to pollination by bees, hawkmoths and hummingbirds. These species are easy to cross and propagate in the lab and give fertile offspring, and most of the genetic tools can easily be transferred from the garden petunia to the wild species.

You identified different classes of mutations in the MYB-FL gene that help to clarify evolutionary relationships between different Petunia species. What advantage does this approach have over sequencing and phylogenetic analysis?

In recent radiations such as in Petunia, classical phylogenies often have limited resolution and individual gene trees are often in conflict. We try to understand the process of adaptation and speciation by studying the gene modifications that cause reproductive isolation. By superimposing these functionally relevant polymorphisms onto the classical phylogeny, discrepancies between individual gene trees become informative.

It is interesting that you observe a trade-off between levels of anthocyanins and flavonols in these flowers. Were you expecting to see this and were you surprised that a single locus affected both levels?

Anthocyanins and flavonols share the same precursors, so finding metabolic competition was not unexpected. We started this project on the assumption that the genetics of pollination syndromes would be relatively simple, or at least simple enough to be able to clone the relevant genes. That a single gene can change two traits simultaneously was better than we had hoped for.

You hypothesize that R2R3-MYB transcription factors provide the toolbox for shifts in floral pollination syndromes. Do you think that your results are generalizable to other plants and/or complex traits?

R2R3-MYBs do indeed appear to be overrepresented, in the same way that HOX factors are overrepresented in segmentation or MADS box factors in floral organ identity. But the sample size is still small, and it is always dangerous to extrapolate, especially in ecology and evolution.

Finally, this work represents a nice combination of laboratory and field studies. Do you enjoy collecting flowers in the wild?

Well, it did rain a lot during my visit last month. But yes, it has been a new and enjoyable experience for me to go to the field with my great Brazilian colleagues. In Brazil with its great biodiversity, I also sense the excitement that, thanks to the recent progress in sequencing technology, we are no longer limited to model systems but can study interesting biological processes in almost any plant species.

On the history of pigs

Meishan pig

{credit}Agricultural Research Service via Wikipedia{/credit}

Understanding the genomic changes that occurred during the domestication of animals and plants by humans is important on many levels. Such insights can provide information about human history and our interactions with other species, as is the case with genetic studies of dog and cat domestication. These studies can also help us to improve crop plants (such as tomato) and livestock (such as cattle) for human consumption or other use. Finally, genetic studies on domestication can help to identify disease-causing mutations that have been selected for as a by-product of selection for beneficial traits (for example, in cats and dogs).

Though humans have a huge influence on important traits in domesticated species, those species are still responding to natural selection during the domestication process, which in turn may affect traits important for agricultural purposes. Identifying genomic regions influenced by positive natural selection in domesticated animals can lead to important insights into the biology of specific breeds.

In this respect, the pig is an excellent model to study. Humans domesticated pigs approximately 10,000 years ago in the Near East and China, but a relatively open method of keeping pigs allowed for continued interbreeding with wild boars for some time. In a study published this week in Nature Genetics, Lusheng Huang, Jun Ren and colleagues from Jiangxi Agricultural University sequenced the genomes of 69 diverse domestic and wild pigs in China to better understand their evolutionary history.

Pig sampling in China{credit}Lusheng Huang{/credit}

The study included pigs from 11 diverse breeds (and 3 populations of wild boar) within China in order to compare the adaptations in breeds from cold vs. hot areas. The authors identified over 700 genomic regions that showed evidence of selective sweeps. Many of the genes in these regions were involved in processes important for regulation of temperature during cold or heat stress, such as hair development, energy metabolism and blood circulation.

However, one of the most striking results was the identification of a large (~14 Mb) sweep region on the X chromosome. More than 94% of the single nucleotide polymorphisms (SNPs) in the 69-pig sample that had extreme allele frequency differences between Northern and Southern populations were located within the X-linked sweep region. All Northern Chinese samples showed a strong signature of selection in this region. Upon further analysis, the authors determined that the most likely scenario, given their data, was that this region was introgressed from a now-extinct species of Sus. This region of the X chromosome undergoes very little recombination. This fact, combined with the strong signal of positive selection in the region, meant the introgressed sequence remained mostly preserved for more than 8 million years.

We asked one of the study’s senior authors, Lusheng Huang, to tell us a little more about the work:

How did you collect the DNA samples from the pigs for your study? Were any of the samples difficult to get?

We collected DNA samples from 4,100 pigs that were consanguineously unrelated within three generations, representing all 68 indigenous breeds that are distributed in 24 provinces of China. It took us four and a half years to complete sample collection. Some native pigs living in high-altitude regions (Yunnan, Guizhou, Sichuan and Tibet) were very hard to get. Afterwards, we constructed a DNA bank for indigenous pigs from the whole of China. As a pilot study, we first genotyped 520 unrelated pigs (no common ancestor within 3 generations) from 32 Chinese breeds for 60K SNPs on the Illumina porcine beadchip. Then, we selected 69 representative pigs from the 520 pigs according to their genetic relationships in the neighbor-joining tree constructed with the 60K SNP data. The 69 pigs selected for whole-genome sequencing are highly representative of populations at the geographical extremes of China.

pig sampling

{credit}Lusheng Huang{/credit}

Most of the sampled pigs were originally raised on government-sponsored conservation farms. We selected animals to cover the majority of the bloodlines of each breed according to their pedigree information. However, samples of several breeds were collected from isolated villages or farms in rural areas. For example, it was a big challenge for us to collect samples of Tibetan pigs from different geographic populations in the vast region of the Tibetan Plateau. To find purebred Tibetan pigs that had not been influenced by human-mediated hybridization with exotic breeds, we had to travel to remote pastoral areas at high altitudes and carry out an in-depth field investigation with the kind help of local residents. To cover the bloodlines of each Tibetan population as broadly as possible, we preferentially collected samples from Tibetan boars, which are usually aggressive like wild boars and were really difficult to get (see picture above).

What do the positively selected regions tell us about the history of pig domestication?

These regions clearly illustrate that pigs have experienced natural selection for local fitness before (ancient event) or after (recent event) domestication. The selection footprints in the pig genomes can be visualized by whole-genome sequencing; they are characterized by reduced heterozygosity, an excess of low-frequency variants, and extended and differentiated haplotypes. The selective sweep regions harbor functional genes that play a role in adaptation to local environments. DCF17 and VPS13A are two such examples highlighted in this study.
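
To make the "reduced heterozygosity" footprint concrete, here is a minimal Python sketch of a windowed heterozygosity scan. It is an illustration only and not the authors' pipeline: the function names, the window size and the toy allele frequencies are all hypothetical, and expected heterozygosity is simply taken as 2p(1 - p) for a biallelic SNP with allele frequency p.

```python
from typing import Dict, Sequence


def expected_heterozygosity(freq: float) -> float:
    """Expected heterozygosity 2p(1 - p) at a biallelic site."""
    return 2.0 * freq * (1.0 - freq)


def windowed_heterozygosity(
    positions: Sequence[int],
    freqs: Sequence[float],
    window_size: int,
) -> Dict[int, float]:
    """Mean expected heterozygosity per non-overlapping window.

    positions: SNP coordinates (integers) on one chromosome.
    freqs: allele frequency at each SNP in one population.
    Returns a dict mapping window start coordinate -> mean heterozygosity.
    Windows that contain no SNPs are simply absent from the result.
    """
    sums: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for pos, freq in zip(positions, freqs):
        window = pos // window_size
        sums[window] = sums.get(window, 0.0) + expected_heterozygosity(freq)
        counts[window] = counts.get(window, 0) + 1
    return {w * window_size: sums[w] / counts[w] for w in sorted(sums)}


# Hypothetical toy data: the second window is fixed for one allele at
# every SNP, so its mean heterozygosity drops to zero.
toy_positions = [10, 20, 110, 120]
toy_freqs = [0.5, 0.5, 0.0, 1.0]
toy_scan = windowed_heterozygosity(toy_positions, toy_freqs, window_size=100)
```

Windows with unusually low values relative to the rest of the genome would be candidates for closer inspection; a real scan would also need to account for SNP density, sample size and demographic history.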

What do you think was the most unexpected result in this study? Did you believe it at first?

The extremely divergent haplotype in the X-linked sweep region between Southern and Northern Chinese pigs, an indication of a possible ancient interspecies introgression event, was the most unexpected result in this study. It was a big surprise. Frankly speaking, we did not believe it at first.

The pattern of haplotype sharing in diverse populations. The haplotypes were reconstructed for each individual using all of the variants on the X chromosome. Alleles that are identical to or different from the ones in the Wuzhishan reference genome are indicated by red and blue, respectively. Adapted from Fig. 4a in Huashui Ai et al. 2014{credit}Nature Genetics{/credit}

Why is the finding of a large introgression region on the X chromosome important?

Although evidence of adaptive evolution driven by introgression from archaic species has recently been identified in some species, including humans, the X-linked introgression region shows that adaptive introgression is not limited to closely related species. In some cases, introgression from very divergent species can provide the basis for the evolution of radically new traits in a species. This radical example of so-called ‘reticulate evolution’ in mammals shakes the foundation of most modern evolutionary biology and provides a new view of adaptive evolution that emphasizes saltationist (sudden) processes driven by introgression. Moreover, as discussed in the paper, our ability to detect this potentially quite old introgression event is facilitated by the fact that the introgressed fragment falls in a region of reduced recombination. This has allowed the introgressed haplotype to be maintained for a prolonged period. Our results may suggest that introgression generally plays a much more dominant role in adaptive evolution than previously thought, but has been difficult to detect because introgressed fragments in other systems degenerate quickly due to recombination.

Do you think similar ancient introgressions have occurred in other domesticated species? If so, how would you test this?

We cannot rule out the possibility. To test this hypothesis, we would suggest using a research strategy similar to that used in this study. First, we would need to get the genome sequences of multiple species divergent from a domesticated species. Then, we can perform a genome-wide scan of the domestic species for possible regions introgressed from another divergent species. Several approaches, such as the ABBA-BABA and F4 statistics, haplotype sharing and phylogenetic analysis, can be explored to identify such ancient introgressions.
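
As a rough illustration of the ABBA statistic mentioned here, the following Python sketch computes Patterson's D from derived-allele frequencies. It is not the authors' code. It assumes four populations related as (((P1, P2), P3), outgroup), and the function name and toy frequencies are hypothetical. A D near zero is what is expected without introgression, while a clearly positive D suggests excess allele sharing between P2 and P3.

```python
from typing import Sequence


def patterson_d(
    p1: Sequence[float],
    p2: Sequence[float],
    p3: Sequence[float],
    p_out: Sequence[float],
) -> float:
    """Patterson's D for the tree (((P1, P2), P3), outgroup).

    Each argument holds the derived-allele frequency per site in one
    population. ABBA sites are those where P2 and P3 share the derived
    allele; BABA sites are those where P1 and P3 share it.
    """
    abba = 0.0
    baba = 0.0
    for f1, f2, f3, fo in zip(p1, p2, p3, p_out):
        abba += (1.0 - f1) * f2 * f3 * (1.0 - fo)
        baba += f1 * (1.0 - f2) * f3 * (1.0 - fo)
    total = abba + baba
    if total == 0.0:
        raise ValueError("no informative ABBA or BABA sites")
    return (abba - baba) / total


# Hypothetical toy frequencies at four sites, for illustration only.
toy_p1 = [0.0, 0.0, 1.0, 0.0]
toy_p2 = [1.0, 1.0, 0.0, 1.0]
toy_p3 = [1.0, 1.0, 1.0, 1.0]
toy_out = [0.0, 0.0, 0.0, 0.0]
toy_d = patterson_d(toy_p1, toy_p2, toy_p3, toy_out)  # 3 ABBA sites, 1 BABA site
```

In practice the significance of D is usually assessed with a block jackknife across the genome, which this sketch omits.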

Erhualian

{credit}Lusheng Huang{/credit}

Bonus question: What is your favorite breed of domestic pig?

Erhualian, the most prolific pig breed in the world.