NotI clones in the analysis of the human genome

SOURCE

Oxford University Press

ABSTRACT

NotI linking clones contain sequences flanking NotI recognition sites and were previously shown to be tightly associated with CpG islands and genes. To directly assess the value of NotI clones in genome research, high density grids with 50 000 NotI linking clones originating from six representative NotI linking libraries were constructed. Altogether, these libraries contained nearly 100 times the total number of NotI sites in the human genome. A total of 3437 sequences flanking NotI sites were generated. Analysis of 3265 unique sequences demonstrated that 51% of the clones displayed significant protein similarity to SWISSPROT and TREMBL database proteins based on MSPcrunch filtering with stringent parameters. Of the 3265 sequences, 1868 (57.2%) were new sequences, not present in the EMBL and EST databases (similarity ≤ 90%). Among these new sequences, 795 (24.3% of the 3265) showed similarity to known proteins and 712 (21.8%) displayed an identity of >75% at the nucleotide level to sequences from EMBL or EST databases. The remaining 361 (11.1%) sequences were completely new, i.e. <75% identical. The work also showed a tight, specific association of NotI sites with the first exon and suggests that the so-called 3′ ESTs can actually be generated from 5′-ends of genes that contain NotI sites in their first exon.
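
The sequence breakdown reported in the abstract is plain arithmetic and can be checked with a short sketch. The Python below uses only the counts quoted in the abstract; the variable names and the helper `pct` are illustrative assumptions, not part of the original work. It checks that the three subcategories add up to the 1868 new sequences and that every quoted percentage is taken relative to the 3265 unique sequences, not relative to the 1868 new ones.

```python
# Counts quoted in the abstract (no other data is assumed).
unique_total = 3265          # unique NotI flanking sequences analysed
reported_new = 1868          # sequences reported as new (similarity <= 90%)

# Subcategories of the new sequences, as listed in the abstract.
# The dictionary keys are illustrative labels, not terms from the paper.
new_categories = {
    "protein_similarity": 795,     # similar to known proteins
    "nucleotide_identity_gt75": 712,  # >75% identical to EMBL/EST entries
    "completely_new": 361,         # <75% identical
}


def pct(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# The subcategories should account for every new sequence.
new_total = sum(new_categories.values())

# Percentages relative to all unique sequences.
percentages = {name: pct(n, unique_total) for name, n in new_categories.items()}

if __name__ == "__main__":
    print("sum of subcategories:", new_total)
    print("new sequences (% of unique):", pct(reported_new, unique_total))
    for name, value in percentages.items():
        print(f"{name}: {value}%")
```

Run as a script, this prints the subcategory sum alongside each share, which makes it easy to see that the percentages use the full set of unique sequences as the denominator.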
