For more information contact us at bina.rd@roche.com

Publication [Open access]

If you use MetaSV in your work, please cite the following:
Marghoob Mohiyuddin, John C. Mu, Jian Li, Narges Bani Asadi, Mark B. Gerstein, Alexej Abyzov, Wing H. Wong, and Hugo Y.K. Lam
MetaSV: an accurate and integrative structural-variant caller for next generation sequencing
Bioinformatics, first published online April 10, 2015. doi:10.1093/bioinformatics/btv204

Download MetaSV

Latest version: https://github.com/bioinform/metasv/archive/0.5.2.tar.gz

For other versions, see the releases page: https://github.com/bioinform/metasv/releases

System Requirements

The following Python packages must be installed:

In addition, paths to the following tools must be provided as MetaSV arguments:

Installing MetaSV

MetaSV is a Python package and can be installed using pip. To install the current version (0.5.2), run:

pip install https://github.com/bioinform/metasv/archive/0.5.2.tar.gz

To install a different version, replace the version number in the URL, which has the general form https://github.com/bioinform/metasv/archive/version.tar.gz
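The URL pattern above can be wrapped in a small helper when installs are scripted. The following is a minimal Python sketch and not part of MetaSV: the names `metasv_archive_url` and `install_metasv` are hypothetical, and the sketch assumes pip is available for the current interpreter.

```python
import subprocess
import sys

# URL pattern taken from the install instructions above.
ARCHIVE_URL = "https://github.com/bioinform/metasv/archive/{version}.tar.gz"


def metasv_archive_url(version="0.5.2"):
    """Return the source archive URL for a MetaSV release (hypothetical helper)."""
    return ARCHIVE_URL.format(version=version)


def install_metasv(version="0.5.2"):
    """Install a release with pip for the current interpreter (hypothetical helper)."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", metasv_archive_url(version)]
    )


if __name__ == "__main__":
    # Only prints the URL; call install_metasv() explicitly to install.
    print(metasv_archive_url())
```

Calling pip through `sys.executable -m pip` keeps the install tied to the interpreter that runs the script, which avoids installing into a different Python environment by accident.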

Running MetaSV

Type run_metasv.py -h for help.

Testing MetaSV

cd test

./test_run.sh

Examples

Example 1:

Complete run of MetaSV using all 4 SV detectors, soft-clip-based analysis to enhance insertion detection, and local assembly to improve breakpoint resolution.

run_metasv.py --reference reference.fasta --boost_ins --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --sample HG005 --bam alignments.bam --spades SPAdes/spades.py --age AGE/age_align --num_threads 15 --workdir work --outdir out --min_ins_support 2 --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150

Example 2:

Merge the output of the 4 SV detectors only, without further soft-clip-based analysis or local assembly.

run_metasv.py --reference reference.fasta --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D pindel_LI pindel_SI pindel_TD pindel_INV --outdir out --sample NA12878 --disable_assembly --filter_gaps --keep_standard_contigs

Example 3:

Analyze soft-clipped reads to enhance insertion detection, together with assembly, without using the other SV detectors.

run_metasv.py --reference reference.fasta --boost_ins --sample HG005 --bam alignments.bam --spades SPAdes/spades.py --age AGE/age_align --num_threads 15 --workdir work --outdir out --min_ins_support 2 --max_ins_intervals 500000 --isize_mean 500 --isize_sd 150

Example 4:

Restrict the analysis to detection of deletion SVs.

run_metasv.py --reference reference.fasta --breakdancer_native breakdancer.out --breakseq_native breakseq.gff --cnvnator_native cnvnator.call --pindel_native pindel_D --sample HG005 --bam alignments.bam --spades SPAdes/spades.py --age AGE/age_align --num_threads 15 --workdir work --outdir out --isize_mean 500 --isize_sd 150 --svs_to_assemble DEL --svs_to_report DEL

Important options

Option | Definition | Use
--sample STRING | Sample name (default: None) | --
--reference STRING | Reference file | --
--gaps STRING | Gap bed file (default: None) | --
--boost_ins | Use soft-clips for improving insertion detection (default: False) | Enable for soft-clip analysis
--disable_assembly | Disable assembly (default: False) | --
--bam STRING | BAM file (default: None) | Include for assembly and genotyping
--svs_to_assemble {DEL,INS} [{DEL,INS} ...] | SVs to assemble (default: set(['DEL', 'INS'])) | Include for assembly
--spades STRING | Path to SPAdes executable (default: None) | Include for assembly
--age STRING | Path to AGE executable (default: None) | Include for assembly
--num_threads INT | Number of threads to use (default: 1) | --
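When MetaSV is launched from a script, it can help to check the option combination before starting a long run. The following is a minimal Python sketch under stated assumptions: `build_metasv_command` is a hypothetical helper, it uses only flags that appear in this document, and the rule that assembly needs `--bam`, `--spades`, and `--age` is inferred from the "Use" column above rather than from MetaSV's source.

```python
def build_metasv_command(reference, sample, outdir, bam=None, spades=None,
                         age=None, boost_ins=False, disable_assembly=False,
                         num_threads=1):
    """Build an argument list for run_metasv.py (hypothetical helper)."""
    # Assumption drawn from the "Use" column: assembly needs --bam, --spades, --age.
    if not disable_assembly:
        required = (("--bam", bam), ("--spades", spades), ("--age", age))
        missing = [flag for flag, value in required if value is None]
        if missing:
            raise ValueError(
                "assembly is enabled but these options are missing: "
                + ", ".join(missing)
            )

    cmd = ["run_metasv.py", "--reference", reference, "--sample", sample,
           "--outdir", outdir, "--num_threads", str(num_threads)]
    if bam is not None:
        cmd += ["--bam", bam]
    if boost_ins:
        cmd.append("--boost_ins")
    if disable_assembly:
        cmd.append("--disable_assembly")
    else:
        cmd += ["--spades", spades, "--age", age]
    return cmd


if __name__ == "__main__":
    # Merge-only style run without assembly; file names are placeholders.
    cmd = build_metasv_command("reference.fasta", "NA12878", "out",
                               disable_assembly=True)
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.check_call` directly; detector inputs such as `--pindel_native` would be appended to the list in the same way.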

Advanced Options for balancing the sensitivity/specificity trade-off

Several groups of options affect the balance between sensitivity and specificity.

NOTE: In the following tables:

INC: Increasing/Enabling will increase sensitivity (and thus decrease specificity)

DEC: Decreasing/Disabling will increase sensitivity (and thus decrease specificity)
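The INC/DEC convention can be read as a lookup from option to the direction that raises sensitivity. The sketch below is illustrative only: the `IMPACT` dictionary copies a few rows from the option tables in this document, the function name is hypothetical, and treating "-" as "no direction given" is an assumption.

```python
# Impact codes copied from a few rows of the option tables in this document.
IMPACT = {
    "--filter_gaps": "DEC",
    "--minsvlen": "DEC",
    "--maxsvlen": "INC",
    "--min_ins_support": "DEC",
    "--max_ins_intervals": "INC",
    "--isize_mean": "-",
}


def change_for_sensitivity(option):
    """Return how to change an option to raise sensitivity (hypothetical helper)."""
    code = IMPACT[option]
    if code == "INC":
        return "increase or enable"
    if code == "DEC":
        return "decrease or disable"
    # "-": the tables give no direction for this option (assumption).
    return None


if __name__ == "__main__":
    for name in sorted(IMPACT):
        print(name, "->", change_for_sensitivity(name))
```

Example 1 above follows this reading: it lowers `--min_ins_support` to 2 and raises `--max_ins_intervals` to 500000, both of which move toward higher sensitivity.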

Reference options:

Option | Definition | Impact
--filter_gaps | Filter out gaps (default: False) | DEC
--keep_standard_contigs | Keep only the major contigs + MT (default: False) | DEC

Input BAM options:

Option | Definition | Impact
--isize_mean NUM | Insert size mean (default: 350.0) | -
--isize_sd NUM | Insert size standard deviation (default: 50.0) | -
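If the insert-size mean and standard deviation of a library are not already known, they can be estimated from a sample of observed insert sizes. The following is a minimal sketch, assuming the insert sizes (for example, template lengths of properly paired reads) have already been extracted into a list. The function name is hypothetical, and this is not a description of anything MetaSV computes internally.

```python
import statistics


def insert_size_stats(insert_sizes):
    """Estimate (mean, sd) from observed insert sizes (hypothetical helper)."""
    # Template lengths in BAM files are signed, so use absolute values
    # and skip zeros, which indicate that no length is available.
    sizes = [abs(s) for s in insert_sizes if s != 0]
    if len(sizes) < 2:
        raise ValueError("need at least two non-zero insert sizes")
    return float(statistics.mean(sizes)), float(statistics.stdev(sizes))


if __name__ == "__main__":
    mean, sd = insert_size_stats([300, -350, 400, 0])
    print("--isize_mean", mean, "--isize_sd", sd)
```

In practice, outliers such as very long template lengths from discordant pairs would usually be trimmed before computing these values; the sketch leaves that step out.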

Tool output merging options:

Option | Definition | Impact
--wiggle INT | Wiggle for interval overlap (default: 100) | -
--inswiggle INT | Wiggle for insertions, overrides wiggle (default: 100) | -
--minsvlen INT | Minimum length acceptable to be an SV (default: 50) | DEC
--maxsvlen INT | Maximum length SV to report (default: 1000000) | INC
--overlap_ratio NUM | Reciprocal overlap ratio (default: 0.5) | -
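For readers unfamiliar with the term, reciprocal overlap is commonly defined as the smaller of the two fractions of each interval that the shared region covers. The sketch below shows that general definition with a 0.5 threshold matching the default of `--overlap_ratio`. It is not MetaSV's implementation: the function names are hypothetical, intervals are assumed to be half-open (start, end) pairs, and how MetaSV combines this with `--wiggle` is not covered.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two fractions of each interval covered by the overlap."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def intervals_match(a, b, overlap_ratio=0.5):
    """True if two (start, end) intervals meet the reciprocal overlap threshold."""
    return reciprocal_overlap(a[0], a[1], b[0], b[1]) >= overlap_ratio


if __name__ == "__main__":
    print(intervals_match((100, 200), (150, 250)))
    print(intervals_match((100, 200), (180, 400)))
```

Taking the minimum of the two fractions is what makes the measure reciprocal: a short call lying entirely inside a much longer call covers only a small fraction of the longer one, so the pair does not pass the threshold.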

Insertion detection options:

Option | Definition | Impact
--min_avg_base_qual NUM | Minimum average base quality (default: 20) | DEC
--min_mapq NUM | Minimum MAPQ (default: 5) | DEC
--min_soft_clip INT | Minimum soft-clip (default: 20) | DEC
--max_nm INT | Maximum number of edits (default: 10) | INC
--min_matches INT | Minimum number of matches (default: 50) | DEC
--min_ins_support INT | Minimum read support for calling insertions using soft-clips (default: 5) | DEC
--min_ins_support_frac NUM | Minimum fraction of reads supporting insertion using soft-clips (default: 0) | DEC
--max_ins_intervals INT | Maximum number of insertion intervals to generate (default: 10000) | INC

Assembly options:

Option | Definition | Impact
--extraction_max_read_pairs INT | Maximum number of pairs to extract for assembly (default: 10000) | INC
--spades_max_interval_size INT | Maximum SV length for assembly (default: 50000) | INC

Genotyping options:

Option | Definition | Impact
--gt_window INT | Window for genotyping (default: 100) | -
--gt_normal_frac NUM | Minimum fraction of reads supporting reference for genotyping (default: 0.05) | -

References/Tools