Lack of biological significance in the 'linguistic features' of noncoding DNA--a quantitative analysis.

AUTOR(ES)
RESUMO

Recently, the application of two statistical methods (related to Zipf's distribution and Shannon's redundancy), called 'linguistic' tests, to the primary structure of DNA sequences of living organisms has excited considerable interest. Of particular importance is the claim that noncoding DNA sequences in eukaryotes display specific 'linguistic' features, being reminiscent of natural languages. Furthermore, this implies that noncoding regions of DNA may carry some new, thus far unknown, biological information which is revealed by these tests. In this paper these claims are tested quantitatively. With the aid of computer simulations of natural DNA sequences, and by applying the same 'linguistic' tests to both natural and artificial sequences, we investigate in detail the reasons of the appearance of the claimed 'linguistic' features and the associated differences between coding and noncoding DNAs. The presented results show quantitatively that the 'linguistic' tests failed to reveal any new biological information in (noncoding or coding) DNA.

Documentos Relacionados