Co-reporter:Asaminew H. Aytenfisu;Aleksandar Spasic;Alan Grossfield;Harry A. Stern
Journal of Chemical Theory and Computation February 14, 2017 Volume 13(Issue 2) pp:900-915
Publication Date(Web):January 3, 2017
DOI:10.1021/acs.jctc.6b00870
The backbone dihedral parameters of the Amber RNA force field were improved by fitting them, using multiple linear regression, to potential energies determined by quantum chemistry calculations. Five backbone and four glycosidic dihedral parameters were fit simultaneously to reproduce the potential energies determined by a high-level density functional theory calculation (B97D3 functional with the aug-cc-pVTZ basis set). Umbrella sampling was used to determine conformational free energies along the dihedral angles; with the new parameters, these agree better with the populations of conformations observed in the Protein Data Bank than they do with the conventional parameters. Molecular dynamics simulations performed on a set of hairpin loops, duplexes, and tetramers with the new parameter set show improved modeling of the structures of the tetramers CCCC, CAAU, and GACC, and of an RNA internal loop of noncanonical pairs, as compared to the conventional parameters. For the tetramers, the new parameters largely avoid the incorrect intercalated structures that dominate the conformational samples from the conventional parameters. For the internal loop, the major conformation solved by NMR is stable with the new parameters, but not with the conventional parameters. The new force field performs similarly to the conventional parameters for the UUCG and GCAA hairpin loops and the [U(UA)6A]2 duplex.
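The regression step described in this abstract can be illustrated with a minimal sketch. It assumes a single dihedral angle, a cosine/sine Fourier series with periodicities 1 to 3 (a form that is linear in its coefficients), and synthetic target energies. None of these details come from the paper, whose fit covered nine dihedrals simultaneously against quantum chemistry energies; the function names and coefficient values are hypothetical.

```python
import numpy as np

def design_matrix(phi, max_n=3):
    """Columns: constant offset, then cos(n*phi) and sin(n*phi) for n = 1..max_n."""
    cols = [np.ones_like(phi)]
    for n in range(1, max_n + 1):
        cols.append(np.cos(n * phi))
        cols.append(np.sin(n * phi))
    return np.column_stack(cols)

def fit_dihedral_terms(phi, target_energy, max_n=3):
    """Ordinary least-squares fit of the Fourier coefficients to target energies."""
    A = design_matrix(phi, max_n)
    coeffs, *_ = np.linalg.lstsq(A, target_energy, rcond=None)
    return coeffs

# Synthetic example (hypothetical coefficients in kcal/mol, not data from the paper).
rng = np.random.default_rng(0)
phi = np.linspace(-np.pi, np.pi, 72, endpoint=False)   # 5-degree scan of one dihedral
true_coeffs = np.array([0.5, 1.2, -0.3, 0.0, 0.8, 0.4, 0.0])
energies = design_matrix(phi) @ true_coeffs + rng.normal(0.0, 0.01, phi.size)
fitted = fit_dihedral_terms(phi, energies)
```

Because the energy model is linear in the coefficients, a single least-squares solve recovers them; fitting several dihedrals at once would only add columns to the design matrix.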
Co-reporter:Zhen Tan, Gaurav Sharma, David H. Mathews
Biophysical Journal 2017 Volume 113(Issue 2) pp:
Publication Date(Web):25 July 2017
DOI:10.1016/j.bpj.2017.06.039
Secondary structure prediction is an important problem in RNA bioinformatics because knowledge of structure is critical to understanding the functions of RNA sequences. Significant improvements in prediction accuracy have recently been demonstrated through the incorporation of experimentally obtained structural information, for instance using selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) mapping. However, such mapping data are currently available only for a limited number of RNA sequences. In this article, we present a method for extending the benefit of experimental mapping data in secondary structure prediction to homologous sequences. Specifically, we propose a method for integrating experimental mapping data into a comparative sequence analysis algorithm for secondary structure prediction of multiple homologs, whereby the mapping data benefit not only the prediction for the specific sequence that was mapped but also the predictions for other homologs. The proposed method is realized by modifying the TurboFold II algorithm for prediction of RNA secondary structures to utilize base-pairing probabilities guided by SHAPE experimental data when such data are available. The SHAPE-mapping-guided base-pairing probabilities are obtained using the RSample method. Results demonstrate that the SHAPE mapping data for a sequence improve structure prediction accuracy of other homologous sequences beyond the accuracy obtained by sequence comparison alone (TurboFold II). The updated version of TurboFold II is freely available as part of the RNAstructure software package.
Co-reporter:Asaminew H. Aytenfisu, Aleksandar Spasic, Matthew G. Seetin, John Serafini, and David H. Mathews
Journal of Chemical Theory and Computation 2014 Volume 10(Issue 3) pp:1292-1301
Publication Date(Web):January 22, 2014
DOI:10.1021/ct400861g
Molecular mechanics with all-atom models was used to understand the conformational preference of tandem guanine-adenine (GA) noncanonical pairs in RNA. These tandem GA pairs play important roles in determining stability, flexibility, and structural dynamics of RNA tertiary structures. Previous solution structures showed that these tandem GA pairs adopt either imino (cis Watson–Crick/Watson–Crick A-G) or sheared (trans Hoogsteen/sugar edge A-G) conformations depending on the sequence and orientation of the adjacent closing base pairs. The solution structures (GCGGACGC)2 [Biochemistry, 1996, 35, 9677–9689] and (GCGGAUGC)2 [Biochemistry, 2007, 46, 1511–1522] demonstrate imino and sheared conformations for the two central GA pairs, respectively. These systems were studied using molecular dynamics and free energy change calculations for conformational changes, using umbrella sampling. For the structures to maintain their native conformations during molecular dynamics simulations, a modification to the standard Amber ff10 force field was required, which allowed the amino group of guanine to leave the plane of the base [J. Chem. Theory Comput., 2009, 5, 2088–2100] and form out-of-plane hydrogen bonds with a cross-strand cytosine or uracil. The requirement for this modification suggests the importance of out-of-plane hydrogen bonds in stabilizing the native structures. Free energy change calculations for each sequence demonstrated the correct conformational preference when the force field modification was used, but the extent of the preference is underestimated.
Co-reporter:Xiaoju Zhang, Ross C. Walker, Eric M. Phizicky, and David H. Mathews
Journal of Chemical Theory and Computation 2014 Volume 10(Issue 8) pp:3473-3483
Publication Date(Web):May 28, 2014
DOI:10.1021/ct500107y
Modified nucleotides are prevalent in tRNA. Experimental studies reveal that these covalent modifications play an important role in tuning tRNA function. In this study, molecular dynamics (MD) simulations were used to investigate how modifications alter tRNA dynamics. The X-ray crystal structures of tRNA(Asp), tRNA(Phe), and tRNA(iMet), both with and without modifications, were used as initial structures for 333 ns explicit solvent MD simulations with AMBER. For each tRNA molecule, three independent trajectory calculations were performed, giving an aggregate of 6 μs of total MD across six molecules. The global root-mean-square deviations (RMSD) of atomic positions show that modifications only introduce significant rigidity to the global structure of tRNA(Phe). Interestingly, RMSDs of the anticodon stem-loop (ASL) suggest that modified tRNA has a more rigid structure compared to the unmodified tRNA in this domain. The anticodon RMSDs of the modified tRNAs, however, are higher than those of corresponding unmodified tRNAs. These findings suggest that the rigidity of the anticodon stem-loop is finely tuned by modifications, where rigidity in the anticodon arm is essential for tRNA translocation in the ribosome, and flexibility of the anticodon is important for codon recognition. Sugar pucker and water residence time of pseudouridines in modified tRNAs and corresponding uridines in unmodified tRNAs were assessed, and the results reinforce that pseudouridine favors the 3′-endo conformation and has a higher tendency to interact with water. Principal component analysis (PCA) was used to examine correlated motions in tRNA. Additionally, covariance overlaps of PCAs were compared for trajectories of the same molecule and between trajectories of modified and unmodified tRNAs. The comparison suggests that modifications alter the correlated motions. 
For the anticodon bases, the extent of stacking was compared between modified and unmodified molecules, and only unmodified tRNA(Asp) has a significantly higher percentage of stacking time. Overall, the simulations reveal that the effect of covalent modification on tRNA dynamics is not simple, with modifications increasing flexibility in some regions of the structure and increasing rigidity in other regions.
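The RMSD of atomic positions used in this abstract can be sketched as follows. This is a generic implementation that superimposes one coordinate set on another with the Kabsch algorithm before measuring the deviation; the abstract does not say how the authors aligned their trajectories, so the alignment choice, the function name, and the random test coordinates are assumptions.

```python
import numpy as np

def rmsd_after_superposition(mobile, reference):
    """RMSD between two (N, 3) coordinate arrays after an optimal rigid-body fit."""
    # Remove translation by centering both sets on their centroids.
    P = mobile - mobile.mean(axis=0)
    Q = reference - reference.mean(axis=0)
    # Kabsch: optimal rotation from the SVD of the covariance matrix.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against an improper rotation
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = P @ R.T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))

# Hypothetical coordinates: a rotated and translated copy should give RMSD near zero.
rng = np.random.default_rng(1)
ref = rng.normal(size=(20, 3))
theta = np.deg2rad(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])
moved = ref @ rot_z.T + np.array([5.0, -2.0, 1.0])
rmsd_same = rmsd_after_superposition(moved, ref)

distorted = moved.copy()
distorted[0] += np.array([1.0, 0.0, 0.0])    # displace a single atom by 1 unit
rmsd_distorted = rmsd_after_superposition(distorted, ref)
```

Restricting the coordinate arrays to a subset of atoms, such as the anticodon stem-loop, gives a per-domain RMSD of the kind the abstract compares.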
Co-reporter:Harry A Stern;David H Mathews
Algorithms for Molecular Biology 2013 Volume 8(Issue 1) pp:
Publication Date(Web):2013 December
DOI:10.1186/1748-7188-8-29
RNA performs many diverse functions in the cell in addition to its role as a messenger of genetic information. These functions depend on its ability to fold to a unique three-dimensional structure determined by the sequence. The conformation of RNA is in part determined by its secondary structure, or the particular set of contacts between pairs of complementary bases. Prediction of the secondary structure of RNA from its sequence is therefore of great interest, but can be computationally expensive. In this work we accelerate computations of base-pair probabilities using parallel graphics processing units (GPUs). Calculation of the probabilities of base pairs in RNA secondary structures using nearest-neighbor standard free energy change parameters has been implemented using CUDA to run on hardware with multiprocessor GPUs. A modified set of recursions was introduced, which reduces memory usage by about 25%. GPUs are fastest in single precision, and some hardware is restricted to single precision. This may introduce significant roundoff error. However, deviations in base-pair probabilities calculated using single precision were found to be negligible compared to those resulting from shifting the nearest-neighbor parameters by a random amount of magnitude similar to their experimental uncertainties. For large sequences running on our particular hardware, the GPU implementation reduces execution time by a factor of close to 60 compared with an optimized serial implementation, and by a factor of 116 compared with the original code. Using GPUs can greatly accelerate computation of RNA secondary structure partition functions, allowing calculation of base-pair probabilities for large sequences in a reasonable amount of time, with a negligible compromise in accuracy due to working in single precision. The source code is integrated into the RNAstructure software package and available for download at http://rna.urmc.rochester.edu.
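The single-precision question raised in this abstract can be explored with a toy calculation. The sketch below is not the nearest-neighbor model or the CUDA code from the paper: it assumes a simplified energy model in which every allowed base pair contributes the same free energy, runs on the CPU, and computes only the partition function (the quantity from which base-pair probabilities are derived). The function name, the pair energy, and the test sequence are hypothetical.

```python
import numpy as np

CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def partition_function(seq, dtype=np.float64, pair_energy=-1.0, rt=0.616, min_loop=3):
    """Partition function over nested structures in a toy per-pair energy model.

    Q[i, j] covers the half-open subsequence seq[i:j]; empty and single-base
    regions have weight 1. Energies are in kcal/mol; rt is RT near 37 C.
    """
    n = len(seq)
    w = dtype(np.exp(-pair_energy / rt))       # Boltzmann weight of one base pair
    Q = np.ones((n + 1, n + 1), dtype=dtype)
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            total = Q[i, j - 1]                # last base (index j-1) unpaired
            # Last base paired with k, leaving at least min_loop bases between them.
            for k in range(i, j - 1 - min_loop):
                if (seq[k], seq[j - 1]) in CAN_PAIR:
                    total = total + Q[i, k] * w * Q[k + 1, j - 1]
            Q[i, j] = total
    return Q[0, n]

seq = "GGGAAACCCUUAGGCAUCGAAAGAUGCC"           # hypothetical test sequence
z64 = partition_function(seq, np.float64)
z32 = partition_function(seq, np.float32)
relative_deviation = abs(float(z32) - float(z64)) / float(z64)
```

For long sequences the unscaled partition function would overflow single precision, so a production code needs some form of scaling; this sketch only compares the two precisions on a short sequence.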
Co-reporter:Christine E. Hajdin;Stanislav Bellaousov;Wayne Huggins;Christopher W. Leonard;Kevin M. Weeks
PNAS 2013 Volume 110(Issue 14) pp:5498-5503
Publication Date(Web):2013-04-02
DOI:10.1073/pnas.1219988110
A pseudoknot forms in an RNA when nucleotides in a loop pair with a region outside the helices that close the loop. Pseudoknots occur relatively rarely in RNA but are highly overrepresented in functionally critical motifs in large catalytic RNAs, in riboswitches, and in regulatory elements of viruses. Pseudoknots are usually excluded from RNA structure prediction algorithms. When included, these pairings are difficult to model accurately, especially in large RNAs, because allowing this structure dramatically increases the number of possible incorrect folds and because it is difficult to search the fold space for an optimal structure. We have developed a concise secondary structure modeling approach that combines SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) experimental chemical probing information and a simple, but robust, energy model for the entropic cost of single pseudoknot formation. Structures are predicted with iterative refinement, using a dynamic programming algorithm. This melded experimental and thermodynamic energy function predicted the secondary structures and the pseudoknots for a set of 21 challenging RNAs of known structure ranging in size from 34 to 530 nt. On average, 93% of known base pairs were predicted, and all pseudoknots in well-folded RNAs were identified.
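The definition of a pseudoknot given at the start of this abstract is equivalent to saying that two base pairs cross: for pairs (i, j) and (k, l), the indices satisfy i < k < j < l. The following minimal sketch checks a list of base pairs for such crossings. It illustrates the definition only and is unrelated to the prediction algorithm in the paper; the function names and example structures are hypothetical.

```python
def crossing_pairs(pairs):
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted((min(a, b), max(a, b)) for a, b in pairs)
    crossings = []
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings

def has_pseudoknot(pairs):
    """True if any two base pairs in the structure cross."""
    return bool(crossing_pairs(pairs))

# Hypothetical structures given as 1-based (5' index, 3' index) pairs.
hairpin = [(1, 10), (2, 9), (3, 8)]
h_type = [(1, 12), (2, 11), (6, 18), (7, 17)]
```

Here `has_pseudoknot(hairpin)` is false because all pairs are nested, while `has_pseudoknot(h_type)` is true because nucleotides 6 and 7 in the loop of the first helix pair with positions outside it.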
Co-reporter:Aleksandar Spasic, John Serafini, and David H. Mathews
Journal of Chemical Theory and Computation 2012 Volume 8(Issue 7) pp:2497-2505
Publication Date(Web):June 5, 2012
DOI:10.1021/ct300240k
The ability of the Amber ff99 force field to predict relative free energies of RNA helix formation was investigated. The test systems were three hexaloop RNA hairpins with identical loops and varying stems. The potential of mean force of stretching the hairpins from the native state to an extended conformation was calculated with umbrella sampling. Because the hairpins have identical loop sequences, the differences in free energy changes arise only from the stem composition. The Amber ff99 force field was able to correctly predict the order of stabilities of the hairpins, although the magnitude of the free energy change is larger than that determined by optical melting experiments. The two measurements cannot be compared directly because the unfolded state in the optical melting experiments is a random coil, while the end state in the umbrella sampling simulations was an elongated chain. The calculations can be compared to reference data by using a thermodynamic cycle. By applying the thermodynamic cycle to the transitions between the hairpins using simulations and nearest-neighbor data, agreement was found to be within the sampling error of the simulations, thus demonstrating that the ff99 force field is able to accurately predict relative free energies of RNA helix formation.
Co-reporter:Keith P. Van Nostrand, Scott D. Kennedy, Douglas H. Turner, and David H. Mathews
Journal of Chemical Theory and Computation 2011 Volume 7(Issue 11) pp:3779-3792
Publication Date(Web):October 4, 2011
DOI:10.1021/ct200223q
Conformational changes are important in RNA for binding and catalysis, and understanding these changes is important for understanding how RNA functions. Computational techniques using all-atom molecular models can be used to characterize conformational changes in RNA. These techniques were applied to an RNA conformational change involving a single base pair within a nine-base-pair RNA duplex. The adenine–adenine (AA) noncanonical pair in the sequence 5′GGUGAAGGCU3′ paired with 3′PCCGAAGCCG5′, where P is purine, undergoes conformational exchange between two conformations on the time scale of tens of microseconds, as demonstrated in a previous NMR solution structure [Chen, G.; et al. Biochemistry 2006, 45, 6889–903]. The more populated, major, conformation was estimated to be 0.5 to 1.3 kcal/mol more stable at 30 °C than the less populated, minor, conformation. Both conformations are trans-Hoogsteen/sugar edge pairs, where the interacting edges on the adenines change with the conformational change. Targeted molecular dynamics (TMD) and nudged elastic band (NEB) were used to model the pathway between the major and minor conformations using the AMBER software package. The adenines were predicted to change conformation via intermediates in which they are stacked as opposed to hydrogen-bonded. The predicted pathways can be described by an improper dihedral angle reaction coordinate. Umbrella sampling along the reaction coordinate was performed to model the free energy profile for the conformational change using a total of 1800 ns of sampling. Although the barrier height between the major and minor conformations was reasonable, the free energy difference between the major and minor conformations was the opposite of that expected on the basis of the NMR experiments. Variations in the force field applied did not improve the misrepresentation of the free energies of the major and minor conformations.
As an alternative, the molecular mechanics Poisson–Boltzmann surface area (MM-PBSA) approximation was applied to predict free energy differences between the two conformations using a total of 800 ns of sampling. MM-PBSA also incorrectly predicted the major conformation to be higher in free energy than the minor conformation.
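The stability range quoted in this abstract (0.5 to 1.3 kcal/mol at 30 °C) can be translated into populations with the two-state Boltzmann relation, p_major / p_minor = exp(ΔG / RT). The sketch below assumes only two states are populated; the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol K)

def major_population(delta_g, temp_c):
    """Fraction in the more stable state of a two-state system.

    delta_g is G_minor - G_major in kcal/mol (non-negative); temp_c is in Celsius.
    """
    rt = R_KCAL * (temp_c + 273.15)
    ratio = math.exp(delta_g / rt)      # p_major / p_minor
    return ratio / (1.0 + ratio)

low = major_population(0.5, 30.0)       # lower bound of the NMR estimate
high = major_population(1.3, 30.0)      # upper bound of the NMR estimate
```

Under this assumption, the experimental estimate corresponds to a major-conformation population of roughly 70% to 90%, which shows how small a free energy difference the simulations would need to reproduce.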
Co-reporter:Tian W. Li;Kevin M. Weeks;Katherine E. Deigan
PNAS 2009 Volume 106(Issue 1) pp:97-102
Publication Date(Web):2009-01-06
DOI:10.1073/pnas.0806929106
Almost all RNAs can fold to form extensive base-paired secondary structures. Many of these structures then modulate numerous fundamental elements of gene expression. Deducing these structure–function relationships requires that it be possible to predict RNA secondary structures accurately. However, RNA secondary structure prediction for large RNAs, such that a single predicted structure for a single sequence reliably represents the correct structure, has remained an unsolved problem. Here, we demonstrate that quantitative, nucleotide-resolution information from a SHAPE experiment can be interpreted as a pseudo-free energy change term and used to determine RNA secondary structure with high accuracy. Free energy minimization, by using SHAPE pseudo-free energies, in conjunction with nearest neighbor parameters, predicts the secondary structure of deproteinized Escherichia coli 16S rRNA (>1,300 nt) and a set of smaller RNAs (75–155 nt) with accuracies of up to 96–100%, which are comparable to the best accuracies achievable by comparative sequence analysis.
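The idea of turning a SHAPE reactivity into a pseudo-free energy change can be sketched with a log-linear function of the reactivity. The abstract does not give the functional form or its parameters, so the form below, the function name, and the slope and intercept values in the example are all assumptions made for illustration.

```python
import math

def shape_pseudo_energy(reactivity, slope, intercept):
    """Pseudo-free energy change (kcal/mol) for one nucleotide in a base pair.

    Assumed form: slope * ln(reactivity + 1) + intercept.
    Nucleotides without data (None) contribute nothing.
    """
    if reactivity is None:
        return 0.0
    return slope * math.log(reactivity + 1.0) + intercept

# Example values chosen for illustration only (kcal/mol).
SLOPE, INTERCEPT = 2.6, -0.8
unreactive = shape_pseudo_energy(0.0, SLOPE, INTERCEPT)   # bonus for pairing
reactive = shape_pseudo_energy(1.5, SLOPE, INTERCEPT)     # penalty for pairing
```

With a positive slope and negative intercept, unreactive nucleotides receive a favorable term when paired and highly reactive nucleotides receive an unfavorable one, which is how the experimental data can bias free energy minimization.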