Occasionally, errors in DNA replication or the replication of mobile genetic elements will generate a cell or even a multicellular organism with two copies of the same gene. Following these gene duplications, the second copy will either be lost or maintained depending on selective pressures. In a recent study published in Molecular Biology and Evolution, MCB graduate student Lisa Kursel and her mentor Dr. Harmit Malik (Basic Sciences Division) identified several independent duplications of a gene central to cell division.
Kursel and Malik were interested in the evolutionary history of the gene that specifies the centromere, the specialized region of each chromosome that attaches to the fibers that pull chromosomes into new cells during cell division (mitosis). Despite this essential function, the DNA sequences of centromeres and the proteins that bind them are surprisingly diverse among species, an observation that has been called the “centromere paradox”. This continual evolution of centromeric genes is thought to be driven by the unique challenges of centromere function in the male and female germline, the cells that give rise to reproductive cells, called gametes.
Gene duplication creates a situation in which one copy can retain a gene's essential functions while the duplicate copy evolves specialized functions, such as inheritance in gametes. This is called subfunctionalization. Kursel and Malik took advantage of the recently sequenced genomes of different Drosophila species to analyze the evolution of the centromere-determining gene CENP-A and to search for evidence of subfunctionalization, either within the gene itself or in duplicate copies that may have arisen.
CENP-A, called Cid in Drosophila, is a histone H3 variant that wraps centromeric DNA in nearly all species of animals, plants, and yeast studied. Kursel used the part of the gene that is most similar between different organisms, the histone-fold domain, to search the newly sequenced genomes of different fruit fly species for Cid homologs. In agreement with previous work, she found only one Cid gene in the well-studied Drosophila melanogaster. Across the genus Drosophila, however, she found at least four independent Cid duplications.
In D. eugracilis, the original Cid1 gene was significantly mutated, or "pseudogenized", rendering it nonfunctional, and a duplicate, Cid2, appeared and took over the function of the original Cid1 gene. In all of the other species analyzed, Cid1 is intact. In the montium subgroup, two additional copies of the gene, Cid3 and Cid4, appeared in the genome 15 million years ago. Kursel found another duplicate, or "paralog", called Cid5 in every member of the Drosophila subgenus. Because every member carries it, Cid5 must have appeared at least 40 million years ago. The retention of these duplicate genes over such a long period suggests that they perform a function that benefits the flies that possess them.
Kursel used codon-based DNA alignment of the histone-fold domain of all Drosophila Cid genes, together with maximum likelihood and neighbor-joining analyses, to build phylogenetic trees. From these trees, she reconstructed a detailed history of the origin of each Cid gene. Interestingly, the pattern of inheritance of the Cid1 and Cid3 genes in a particular set of species suggests either that multiple independent duplication events generated Cid3 from Cid1 or that the genes have recombined, or swapped sequences, repeatedly within this group over time. Recombination could facilitate the retention of ancestral Cid1 functions or the spread of new adaptive functions.
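For readers curious about what a distance-based tree method works from, the following is a minimal sketch, not the study's actual pipeline. It computes the fraction of differing sites (the "p-distance") between every pair of aligned sequences, which is the kind of distance matrix that methods such as neighbor-joining take as input. The function names and the toy sequences are invented for illustration and are not real Cid data; the software used in the study is not stated here.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differing / compared


def distance_matrix(alignment):
    """Symmetric pairwise p-distance table, keyed by (name, name) tuples."""
    names = sorted(alignment)
    dist = {(name, name): 0.0 for name in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[(a, b)] = d
        dist[(b, a)] = d
    return dist


# Hypothetical toy alignment (made-up sequences, NOT real Cid data),
# kept in frame so that gaps fall on whole codons.
toy_alignment = {
    "geneA_sp1": "ATGAAACGTCTG",
    "geneA_sp2": "ATGAAACGTCTA",
    "geneB_sp1": "ATGCAACGACTG",
    "geneB_sp2": "ATGCAACGA---",
}

if __name__ == "__main__":
    matrix = distance_matrix(toy_alignment)
    for (a, b), d in sorted(matrix.items()):
        if a < b:
            print(f"{a}\t{b}\t{d:.3f}")
```

In this toy example, the two "geneB" sequences are identical at every site they share, so they would be grouped together first by most distance-based methods. Real analyses typically use substitution models that correct for multiple changes at the same site rather than raw p-distances.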
The researchers wanted to determine whether the newly identified Cid genes showed evidence of retaining partial function or whether they had developed completely new functions in flies. One readout of the original Cid1 function is its specific marking of the centromeres of chromosomes. Therefore, Kursel determined where the new Cid proteins were located in fly cells. She found that Cid3 and Cid4 localized to centromeres in D. auraria cells. Similarly, Cid5 localized to centromeres in D. virilis cells. These data suggest that the Cid duplicates retain at least partial Cid1 function.
The authors suspected that the new Cid genes might have evolved to perform germline-specific functions. Therefore, Kursel tracked where each gene was expressed in flies by performing dissections followed by PCR. She found that Cid3 in the montium subgroup and Cid5 in the Drosophila subgenus were expressed specifically in the testes of male flies and not elsewhere in the body, including the ovaries of females. This expression pattern suggests that Cid3 and Cid5 may have evolved to respond specifically to the unique challenges of inheritance through sexual reproduction.
"We propose that the centromeric histone may perform multiple distinct functions, including mitosis, meiosis and centromere inheritance. These different roles might have different functional requirements. Therefore, it could be advantageous to have two or three copies of Cid such that each encodes a separate function" said Kursel. “The existence of Cid duplications in genetically tractable organisms provides an opportunity to study the multiple functions of a gene that is essential when present in a single copy. We are especially excited to follow up on the function of the male-germline specific Cid duplicates."
Kursel LE, Malik HS. 2017. "Recurrent gene duplication leads to diverse repertoires of centromeric histones in Drosophila species." Molecular Biology and Evolution. doi:10.1093/molbev/msx091
This research was funded by the National Institutes of Health and Howard Hughes Medical Institute. HSM is an HHMI Investigator.