Prediction and statistics of pseudoknots in RNA structures using exactly clustered stochastic simulations

AUTHOR(S)
SOURCE

National Academy of Sciences

ABSTRACT

Ab initio RNA secondary structure predictions have long dismissed helices interior to loops, so-called pseudoknots, despite their structural importance. Here we report that many pseudoknots can be predicted through long-time-scale RNA-folding simulations, which follow the stochastic closing and opening of individual RNA helices. The numerical efficacy of these stochastic simulations relies on an 𝒪(n²) clustering algorithm that computes time averages over a continuously updated set of n reference structures. Applying this exact stochastic clustering approach, we typically obtain a 5- to 100-fold simulation speed-up for RNA sequences up to 400 bases, while the effective acceleration can be as high as 10⁵-fold for short, multistable molecules (≤150 bases). We performed extensive folding statistics on random and natural RNA sequences and found that pseudoknots are distributed unevenly among RNA structures and account for up to 30% of base pairs in G+C-rich RNA sequences (online RNA-folding kinetics server including pseudoknots: http://kinefold.u-strasbg.fr).
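
As a rough illustration of what following "the stochastic closing and opening of individual RNA helices" can look like, the sketch below runs a generic Gillespie-style continuous-time simulation and records a time average. It is not the clustering algorithm of the paper: the function name `simulate_helices`, the rate constants, and the assumption that helices open and close independently (with no topological or pseudoknot constraints) are all hypothetical simplifications chosen for brevity.

```python
import math
import random


def simulate_helices(k_close, k_open, t_max, seed=0):
    """Gillespie-style simulation of helices that open and close independently.

    k_close[i] and k_open[i] are hypothetical rate constants for helix i.
    Returns the fraction of simulated time each helix spent closed.
    """
    rng = random.Random(seed)
    n = len(k_close)
    closed = [False] * n          # all helices start open (unfolded chain)
    time_closed = [0.0] * n
    t = 0.0
    while True:
        # Each helix has exactly one available transition in its current state.
        rates = [k_open[i] if closed[i] else k_close[i] for i in range(n)]
        total = sum(rates)
        # Waiting time to the next event is exponential with rate `total`.
        wait = rng.expovariate(total) if total > 0 else math.inf
        dt = min(wait, t_max - t)
        for i in range(n):
            if closed[i]:
                time_closed[i] += dt
        if wait >= t_max - t:
            break
        t += wait
        # Pick which helix flips, with probability proportional to its rate.
        r = rng.random() * total
        acc = 0.0
        chosen = n - 1
        for i, rate in enumerate(rates):
            acc += rate
            if r < acc:
                chosen = i
                break
        closed[chosen] = not closed[chosen]
    return [tc / t_max for tc in time_closed]


if __name__ == "__main__":
    # Two toy helices: one balanced, one that closes three times faster than it opens.
    print(simulate_helices([1.0, 3.0], [1.0, 1.0], t_max=2000.0))
```

For a two-state helix the long-run closed fraction should approach k_close / (k_close + k_open), which gives a simple sanity check on the time averages. A realistic folding simulation would additionally derive the rates from helix free energies and forbid sterically incompatible helices.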
