Oxford University Press is a department of the University of Oxford. It furthers the University's objective of excellence in research, scholarship, and education by publishing worldwide.
Volume 7.
RDP5: a computer program for analyzing recombination in, and removing signals of recombination from, nucleotide sequence datasets

Arvind Varsani, Philippe Roumagnac, Gerrit Botha
Corresponding author e-mail: darrenpatrickmartin gmail.

Bioinformatics 16.
Possible emergence of new geminiviruses by frequent recombination. Virology.
A modified bootscan algorithm for automated identification of recombinant sequences and recombination breakpoints.
Analyzing the mosaic structure of genes. J Mol Evol 34.
The Chimaera method: Posada, D. Evaluation of methods for detecting recombination from DNA sequences: Computer simulations. Proc Natl Acad Sci 98.
The SiScan method: Gibbs, M. Sister-Scanning: a Monte Carlo procedure for assessing signals in recombinant sequences.
The 3Seq method: Lam, H. Improved algorithmic complexity for the 3SEQ recombination detection algorithm. Mol Biol Evol 35.
Phylogenetic evidence for recombination in dengue virus. Mol Biol Evol 16.

Key elements of the RDP4 program interface are illustrated in Fig.
The main elements of the RDP4 program interface.

RDP4 detects recombination using a range of fast and powerful heuristic recombination detection methods that sequentially test every combination of three sequences in an input alignment for evidence that one of the three sequences is a recombinant and the other two are its parents. Having detected all of the recombination signals that are evident within an input alignment, RDP4 then infers the minimum number of recombination events needed to account for these signals.
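The triplet-scanning idea can be illustrated with a minimal sketch. This is not one of RDP4's detection methods: it replaces their statistical tests with a crude windowed-identity comparison, and the function names, window size, and step size are all assumptions made for illustration. For every combination of three sequences, each sequence in turn is treated as the candidate recombinant, and the triplet is flagged if the candidate is closer to one putative parent in some windows and closer to the other parent in other windows.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of ungapped aligned positions at which a and b match."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def closest_parent_by_window(child, p1, p2, window, step):
    """Label each window 1 or 2 for the closer parent, or 0 for a tie."""
    labels = []
    for start in range(0, len(child) - window + 1, step):
        s = slice(start, start + window)
        i1 = identity(child[s], p1[s])
        i2 = identity(child[s], p2[s])
        labels.append(1 if i1 > i2 else 2 if i2 > i1 else 0)
    return labels


def scan_triplets(alignment, window=10, step=5):
    """Return (candidate, parent1, parent2) tuples with a mosaic pattern.

    `alignment` maps sequence names to equal-length aligned strings.
    A real method would attach a p-value to each signal; this sketch
    only reports whether the closest parent changes along the sequence.
    """
    hits = []
    for trio in combinations(sorted(alignment), 3):
        for child in trio:
            p1, p2 = [name for name in trio if name != child]
            labels = closest_parent_by_window(
                alignment[child], alignment[p1], alignment[p2], window, step
            )
            if 1 in labels and 2 in labels:
                hits.append((child, p1, p2))
    return hits
```

Because the number of triplets grows cubically with the number of sequences, even a toy scan like this becomes slow on large alignments, which is one reason fast heuristic methods matter here.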
It does so by sequentially disassembling identified recombinant sequences into their component parts. This fully exploratory approach means that, without any prior information, RDP4 can be used to characterise complex patterns of recombination, such as those arising when recombination events occur between parental sequences that are themselves recombinant.
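One way to picture the disassembly step is as splitting a recombinant sequence into two gap-padded sequences that stay aligned with the rest of the dataset. The sketch below is an interpretation of that idea rather than RDP4's implementation: the function name, the `_fragment` suffix, and the half-open `[start, end)` breakpoint coordinates are assumptions.

```python
def split_recombinant(alignment, name, start, end, gap="-"):
    """Split sequence `name` at breakpoints [start, end) into two parts.

    Returns a new alignment dict in which `name` keeps only the sequence
    outside the recombinant region, and a new entry `name + "_fragment"`
    (a hypothetical naming choice) holds only the recombinant region.
    Both parts are padded with gaps so all sequences keep equal length.
    """
    seq = alignment[name]
    if not 0 <= start < end <= len(seq):
        raise ValueError("breakpoints must satisfy 0 <= start < end <= length")
    major = seq[:start] + gap * (end - start) + seq[end:]
    minor = gap * start + seq[start:end] + gap * (len(seq) - end)
    result = dict(alignment)  # leave the caller's alignment untouched
    result[name] = major
    result[name + "_fragment"] = minor
    return result
```

The expanded alignment could then be rescanned, so that a fragment inherited from a parent that was itself recombinant can be examined as a sequence in its own right.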
It is important to note, however, that there are also drawbacks to this approach. Primary among these is that when analysing datasets that contain large numbers of recombinant sequences, it can become very difficult for RDP4 to accurately identify the recombinants.
Similarly, when numerous ancient recombination events have occurred such that multiple sequences in a dataset carry evidence of the same ancestral recombination events, RDP4 will often incorrectly attribute recombination signals arising from multiple different recombination events to a single ancestral event. To partially rectify such deficits, RDP4 includes an array of tools that can be used to manually check, and correct if necessary, any perceived inference errors that the program has made.
These tools are all accessible via a point-and-click graphical user interface and enable a user to directly test alternative hypotheses relating to the misidentification of recombination breakpoints, parental sequences, and groups of sequences sharing evidence of the same ancestral recombination events.
Among others, these cross-checking tools include the following:
- Multiple different phylogenetic tree construction methods that can be used to contrast phylogenetic signals in different parts of an alignment (such as on opposite sides of a recombination breakpoint).
- Shimodaira-Hasegawa and approximately unbiased phylogenetic tree comparison tests (Shimodaira and Hasegawa; Shimodaira).
- Matrix-based visualisations of the statistical plausibility of alternative breakpoint locations.
- Statistical and phylogenetic tests that indicate the degree to which recombination signals that are detectable in two different sequences resemble one another.

Such alignments will be stripped of all readily detectable evidence of individual recombination events and can then be used with other computer programs such as BEAST (Bouckaert et al.). RDP4 can also be used to directly construct minimum evolution (with FastTree2; Price, Dehal, and Arkin) and maximum-likelihood (with RAxML8; Stamatakis) phylogenetic trees that account for the recombination events that it has detected.
Specifically, it will construct trees using edited versions of the input alignment where fragments of sequence derived through recombination have either been removed altogether or have been re-added to the alignment as new sequences.

In cases where the underlying mechanistic or selective causes of detectable recombination patterns are of interest, RDP4 provides a range of useful tools including:
- Tests for the presence of recombination hot- and cold spots (McVean et al.).
- Tests of purifying selection acting against recombination-induced misfolding of either proteins (Voigt et al.).
- Tests of association between recombination breakpoint locations and user-specified genome features such as gene boundaries, the junctions between protein domains, or nucleotides that are base-paired within secondary structures (Lefeuvre et al.).
- Tests for, and matrix-based visualisations of, the types of imbalanced coinheritance of nucleotide pairs that are expected to occur within recombinant genomes evolving under selection acting against the disruption of favourable epistatic interactions (Fig.).
- Phylogenetic incompatibility visualisations of the overall phylogenetic impacts of recombination within datasets (Fig.).

Examples of tools that are available in RDP4 for visualising overall patterns of recombination. The dataset examined here is the foot-and-mouth disease virus (FMDV) full genome dataset analysed in Heath et al. The bottom half of the matrix indicates site-pairs that are significantly more (in blue) or less (in red) frequently co-inherited during recombination than would be expected under random recombination.
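The tests of association between breakpoint locations and genome features listed above can be illustrated with a simple permutation test. This is a sketch under stated assumptions, not the published procedure: it assumes a null model in which breakpoints fall uniformly at random along the genome, whereas a careful test would also account for variation in how detectable breakpoints are at different sites. The function names, the `tolerance` parameter, and the default number of permutations are hypothetical.

```python
import random


def count_near(breakpoints, features, tolerance):
    """Count breakpoints lying within `tolerance` sites of any feature."""
    return sum(
        any(abs(b - f) <= tolerance for f in features) for b in breakpoints
    )


def breakpoint_association_test(
    breakpoints, features, genome_length, tolerance=10, n_perm=1000, seed=0
):
    """Permutation test for clustering of breakpoints near genome features.

    Assumed null model: each breakpoint is placed uniformly at random on
    the genome. Returns the observed count of breakpoints near features
    and a one-sided p-value for seeing at least that many by chance.
    """
    rng = random.Random(seed)
    observed = count_near(breakpoints, features, tolerance)
    at_least = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(genome_length) for _ in breakpoints]
        if count_near(shuffled, features, tolerance) >= observed:
            at_least += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least + 1) / (n_perm + 1)
    return observed, p_value
```

For example, `breakpoint_association_test([100, 205, 300, 398], [100, 200, 300, 400], 10000)` asks whether four breakpoints all lying within 10 sites of a gene boundary are more clustered than random placement would produce.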
With default program settings, RDP4 can analyse kb-long sequences in 10 minutes on a standard desktop computer. It is distributed along with programs for generating (SDT; Muhire, Varsani, and Martin) and aligning (IMPALE) datasets, and with an extensive manual that contains detailed descriptions of the various methods implemented in RDP4 and a step-by-step guide describing how best to use them.