Hi All,
I read another paper and think I understand it reasonably well, which made it an interesting paper.
And so I have the urge to share.
The origin of new genes is an important topic in evolutionary biology. There are several different mechanisms behind the formation of new genes, which I will describe soon. In the following set of posts I will be discussing the paper “De Novo Origin of Human Protein Coding Genes”, which deals with a mechanism of new gene formation that is only just being appreciated.
In essence, this process causes new genes to form from the vast stretches of DNA lying between the protein coding genes. Naturally, protein coding genes are those which code for proteins, and the stretches of DNA between them are often called “junk” because they seem to have little to no function. However, scientists are beginning to find evidence that these may be regions from which new genes can arise.
To find examples of these de novo genes, researchers have to trawl massive databases looking for strings of DNA in close relatives of humans (chimpanzees and orangutans) that are similar to strings of DNA in humans. The strings of interest are those that do not code for any already identified protein in these organisms, and that were not translatable in the chimps and orangutans but were translatable in humans. This latter point means that in humans, mutations had occurred in the DNA to cause translation ‘start here’ codons to evolve. Once DNA is translated into a polypeptide, then at the very least it is open to selection and is potentially functional. At best, that polypeptide is a fully functional protein and the underlying DNA is an actual gene.
(The reader needs to remember that the genome of an organism consists of massively long strings of DNA, made up of a sugar-phosphate backbone to which bases, labelled A, C, T or G, are attached. Sets of three bases define an amino acid, strings of which go to make up polypeptides or proteins. To get from DNA to protein, the cellular machinery first transcribes DNA into messenger RNA, which is very much like DNA. The messenger RNA is then translated into strings of amino acids, which are called polypeptides. Polypeptides which are functional are what we call proteins. To refresh your memories, and possibly confuse you even more, there is this which adds a bit of technical detail to my very sloppy description.)
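For readers who like to see such things spelled out, here is a minimal Python sketch of the DNA to messenger RNA to polypeptide steps just described. It is my own illustration and not anything from the paper: the function names `transcribe` and `translate` and the short example sequence are made up, and the real cellular machinery is far more involved than a lookup table.

```python
# Build the standard genetic code as a lookup table from RNA codon
# (three bases) to one-letter amino acid code. "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def transcribe(coding_dna):
    """Turn the coding strand of DNA into messenger RNA (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read codons from the first AUG ('start here') until a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Made-up toy sequence, purely for illustration.
dna = "ATGGCCTGGTAA"
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

The toy sequence reads as four codons: a start (which also codes for methionine), two more amino acids, and a stop. A string with no AUG in it gives back an empty polypeptide, which is the sense of “non translatable” used above.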
Because humans, chimps and orangutans share common ancestry, these similar strings of DNA were passed on via the process of common descent. The common ancestor had the string and it was passed on to the lineages that became chimps, orangutans and humans, subsequently diverging in each lineage thanks to the accumulation of point mutations. But they did not diverge so much that all traces of common ancestry were gone. That they were not translated in chimps and orangutans, but were translated in humans, means that in the human lineage these strings had mutated to the point of being potentially functional.
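To make the idea concrete, here is a small Python sketch comparing two invented sequences, one standing in for the human string and one for the chimp string. They differ by a single point mutation, and only the “human” one has a start codon followed by an in-frame stop. The sequences and the function names `point_differences` and `orf_codon_count` are my own assumptions for illustration; this is not the procedure the researchers used, which involved real genome databases and alignments.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def point_differences(seq_a, seq_b):
    """Positions at which two equal-length DNA strings differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this toy comparison assumes equal-length strings")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


def orf_codon_count(dna):
    """Number of codons from the first ATG up to an in-frame stop codon.

    Returns 0 if there is no ATG, or no stop codon in the same frame.
    """
    start = dna.find("ATG")
    if start == -1:
        return 0
    count = 0
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return 0  # ran off the end without meeting a stop codon


# Invented toy sequences, not real human or chimp DNA.
human = "ATGGCCTGGAAATAA"
chimp = "ACGGCCTGGAAATAA"

print(point_differences(human, chimp))  # where the two strings differ
print(orf_codon_count(human))           # codons readable in the "human" string
print(orf_codon_count(chimp))           # codons readable in the "chimp" string
```

The two strings are still obviously related, which is the trace of common ancestry, yet only one of them can be read from a ‘start here’ codon through to a stop.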
Finding these comparable strings of DNA proved to be a challenge, and the number of them suggested to the researchers that de novo origination of new genes from the non coding regions of a genome may be more frequent than had previously been thought.
Because I think I understand the above paper reasonably well, in the next set of posts I will describe what the paper reports, firming up my own understanding and passing on to the reader something of the idea regarding a means by which new genes are thought to arise in organisms.
This has been my introduction. In the next post I discuss the paper’s introduction.
To be continued ....