Manyuan Long
University of Chicago, Illinois, USA
An evolutionary geneticist is surprised by genes of unknown origin.
I once thought that, like us, every gene must have a mother. But recent work has identified some genes that seem to have no genetic ancestry. These 'motherless' genes pose a new challenge to understanding the molecular mechanisms and evolutionary forces that shape our DNA. This isn't the first time we've had to revise our ideas about gene evolution.
About 40 years ago, geneticist Susumu Ohno proposed that new genes originate when an existing gene duplicates, then one of the copies evolves a new function. Working with Chuck Langley in the early 1990s, I had the luck to discover a gene in flies that added another strand to Ohno's story. The gene, named Jingwei, is a chimaera that formed through the combination of two existing genes.
Since then, researchers have identified many other 'new' genes assembled from unrelated genes and mobile DNA elements. Often the sequences' origins can be identified. When they can't, researchers have simply assumed that subsequent evolution has masked the relationship of the gene to its ancestral sequences.
But this is unlikely to be the case for hydra, a gene found recently in Drosophila melanogaster and closely related species (S.-T. Chen et al. PLoS Genet. 3, e107; 2007). No homologous sequences are found in a species that diverged from those carrying hydra only 13 million years ago — too recently for mutations to have obscured any related sequences. This implies that hydra arose de novo.
Another group has found a further 16 de novo genes in flies, which they propose evolved from non-coding DNA (D. J. Begun et al. Genetics 176, 1131–1137; 2007 and M. T. Levine et al. Proc. Natl Acad. Sci. USA 103, 9935–9939; 2006). These genes beg further study: what initiated their formation?
Editor's note: This entry previously misspelled the name of the author's institution. Nature regrets the error.

Comments
"This implies that hydra arose de novo"
Is the more parsimonious explanation not that the gene was lost in the other lineage?
Posted by: Adam Reid | October 4, 2007 10:44 AM
If anybody has an idea about where HIV's tat gene comes from, I would love to know...
Posted by: METTLING | October 4, 2007 11:17 AM
Considering the plasticity of the fly genome, it's not such a surprise to me. But I guess this is a rare case, where not only coding regions but also the regulatory sequences controlling expression need to be 'invented' or 'recruited' at the right places.
Posted by: Ke Jiang | October 4, 2007 02:31 PM
These 'motherless' or 'orphan' genes seem to be a widespread phenomenon.
Wilson et al. 2005 Microbiology DOI 10.1099/mic.0.28146-0 examined 122 bacterial species and found a linear increase in the number of orphan genes discovered with the number of genomes sequenced.
On page 49 of 'The Origins of Genome Architecture' Michael Lynch states that one third of predicted genes in the human genome are orphans.
He comments "this poorly understood genomic feature is not unique to humans (or primates) as it has been repeatedly found in the sequenced genomes of other metazoans".
Posted by: Richard Buggs | October 11, 2007 09:49 PM
If we find one orphan gene in each genome we sequence, that would be a linear increase. And I think bacterial genes may have a slightly higher chance of de novo formation, given the relatively simple regulatory mechanisms.
'Predicted genes' doesn't mean they are real genes. Most of them are identified or predicted computationally. Their expression, products and functions all need to be verified. Until then, whether they are 'orphan genes' is in question.
Posted by: Ke Jiang | October 15, 2007 05:24 PM
Wilson et al (2005) found 6696 putative orphans in 122 bacterial species. In a recent study on 17 genomes of Ascomycota fungi, Wapinski et al (2007) Nature 449:54-61 found 19006 'singleton' genes that had no recognizable orthologs.
Clearly these need to be verified and many will turn out not to be real genes. However, Wapinski et al. note that in S. cerevisiae, 36 of these 'singleton' genes have essential functions. In their supplementary materials they make a point very similar to Manyuan Long's (above) about these 36 genes: "With few exceptions, most do not have distant homologues within our orthogroup catalog, other fungi or metazoa, so it is difficult to postulate the origin of these genes."
Orphan genes seem to be a real and intriguing problem.
Posted by: Richard Buggs | October 18, 2007 04:42 PM
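The totals quoted in this exchange are easier to compare as per-genome averages. The short Python sketch below does that arithmetic with the figures given above (6696 putative orphans across 122 bacterial species, 19006 singletons across 17 fungal genomes). The function names are illustrative only, and the constant-rate accumulation is an assumption used to show what a linear increase would look like, not a result from either paper.

```python
def orphans_per_genome(total_orphans: int, genomes: int) -> float:
    """Average number of unmatched genes per sequenced genome."""
    if genomes <= 0:
        raise ValueError("genomes must be positive")
    return total_orphans / genomes


def cumulative_orphans(per_genome: float, genomes: int) -> list[float]:
    """Running total, ASSUMING every new genome adds the same number of orphans.

    Under that assumption the total grows linearly with genomes sequenced.
    """
    return [per_genome * n for n in range(1, genomes + 1)]


# Figures as quoted in the comments above.
bacteria = orphans_per_genome(6696, 122)   # Wilson et al. (2005)
fungi = orphans_per_genome(19006, 17)      # Wapinski et al. (2007)

print(f"bacteria: {bacteria:.1f} putative orphans per species")
print(f"fungi: {fungi:.1f} singletons per genome")
```

These are raw averages of computational predictions, so they say nothing about how many of the candidates are real, expressed genes.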
Actually, viruses are powerful enough to carry lots of different genes, so maybe the "motherless gene" comes from a virus? I think examining the differences between the environment in which the new gene appeared and the one in which the old genes lived may be a way to test this guess. If there really are one or more kinds of virus that have a gene homologous to the "motherless gene", the guess is right.
Posted by: larryfay | November 3, 2007 03:25 AM
Where does the mother gene's mother come from?
The 'grandma gene'?
Posted by: Hill | November 4, 2007 03:28 AM
One way to generate new gene material is through exonization of previously non-coding sequences. This has happened in a series of prokaryotes, including intracellular parasites and symbionts of insects, such as Rickettsia and Wolbachia. Long palindromic repeats (~150 nt) were found inserted into protein coding regions, leaving the gene fully functional.
Moreover, it has been shown recently that horizontal gene transfer is quite frequent between Wolbachia and a number of Drosophila species. So there are certainly mechanisms that can explain where new gene material in Drosophila may come from.
However, I would personally agree with Adam Reid's comment: the most parsimonious explanation in this case sounds like lineage-specific loss to me.
Posted by: Karsten Suhre | November 15, 2007 09:09 PM
larryfay, there is an interesting paper assessing the evidence that "motherless" ("ORFan") genes in microbes come from viruses: BMC Evolutionary Biology 2006, 6:63 doi:10.1186/1471-2148-6-63. The authors found that "only 2.8% of all microbial ORFans have detectable homologs in viruses, while the percentage of non-ORFans with detectable homologs in viruses is 7.9%, a significantly higher figure. This suggests that the current evidence for the origin of ORFans from lateral transfer from viruses is at best weak." They also point out that 27% of viral ORFs are ORFans.
Posted by: Richard Buggs | November 17, 2007 04:58 PM