SURVIVOR is a toolset for simulating and evaluating SVs, and for merging and comparing SVs within and across samples.
GitHub: https://github.com/fritzsedlazeck/SURVIVOR
Wiki: https://github.com/fritzsedlazeck/SURVIVOR/wiki
1. Installation
git clone https://github.com/fritzsedlazeck/SURVIVOR.git
cd SURVIVOR/Debug
make
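2. Quick walkthrough
To improve the accuracy of SV calls, each sample can be run through several SV callers, such as manta and delly, and SURVIVOR can then merge the VCFs produced by each caller.
## List the VCFs from the different callers in one text file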
ls *.vcf >sample_files
## merge
SURVIVOR merge sample_files 1000 2 1 1 0 30 sample_merged.vcf
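Parameters
- 1000: the maximum distance allowed between SVs to be merged is 1000 bp;
- 2: only output SVs supported by at least 2 callers;
- 1: only merge SVs of the same type;
- 1: only merge SVs on the same strand;
- 0: do not estimate the merge distance from the SV size;
- 30: only consider SVs at least 30 bp long.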
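Because the merge arguments are purely positional, it is easy to swap two of them by accident. The sketch below gives each position a name and assembles the same command line as above. The function name `build_merge_command` and its keyword names are illustrative only and are not part of SURVIVOR.

```python
import shlex


def build_merge_command(file_list, out_vcf, max_dist=1000, min_callers=2,
                        same_type=True, same_strand=True,
                        estimate_dist=False, min_size=30):
    """Return the argument list for `SURVIVOR merge` in positional order.

    The keyword names are hypothetical labels for the positional arguments
    described in the parameter list above.
    """
    return [
        "SURVIVOR", "merge",
        file_list,                # text file listing one VCF path per line
        str(max_dist),            # max distance between merged SVs (bp)
        str(min_callers),         # minimum number of supporting callers
        str(int(same_type)),      # 1 = require the same SV type
        str(int(same_strand)),    # 1 = require the same strand
        str(int(estimate_dist)),  # 1 = estimate distance from SV size
        str(min_size),            # minimum SV length (bp)
        out_vcf,
    ]


cmd = build_merge_command("sample_files", "sample_merged.vcf")
print(shlex.join(cmd))
# prints: SURVIVOR merge sample_files 1000 2 1 1 0 30 sample_merged.vcf
```

To actually run the merge, the list can be passed to `subprocess.run(cmd, check=True)`, assuming the SURVIVOR binary is on `PATH`.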
Other subcommands
-- Simulation/ Evaluation
simSV Simulates SVs and SNPs on a reference genome.
scanreads Obtain error profiles from mapped reads for simulation.
simreads Simulates long reads (PacBio or ONT).
eval Evaluates a VCF file after SV calling over simulated data.
-- Comparison/filtering
merge Compare or merge VCF files to generate a consensus or multi sample vcf files.
filter Filter a vcf file based on size and/or regions to ignore
stats Report multiple stats over a VCF file
compMUMMer Annotates a VCF file with the breakpoints found with MUMMer (Show-diff).
-- Conversion
bincov Bins coverage vector to a bed file to filter SVs in low MQ regions
vcftobed Converts a VCF file to a bed file
bedtovcf Converts a bed file to a VCF file
smaptovcf Converts the smap file to a VCF file (beta version)
bedpetovcf Converts a bedpe file to a VCF file (beta version)
hapcuttovcf Converts the Hapcut2 final file to a VCF file using the original SNP file provided to Hapcut2
convertAssemblytics Converts Assemblytics to a VCF file
20240909
After merging, some SVs in the output VCF have coordinates that no longer match the reference. This can be checked with bcftools:
bcftools norm --check-ref w --fasta-ref ref.fa SURVIRO.vcf
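Conceptually, this check compares each record's REF allele with the reference bases starting at POS. The sketch below shows the same comparison on toy data. It is only an illustration of the idea, not how bcftools is implemented. The function name `ref_mismatches`, the in-memory `reference` dict, and the example records are all made up.

```python
def ref_mismatches(vcf_lines, reference):
    """Yield (chrom, pos, id, ref, found) for every record whose REF allele
    differs from the reference bases starting at POS (1-based)."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and VCF header lines
        chrom, pos, sv_id, ref = line.rstrip("\n").split("\t")[:4]
        start = int(pos) - 1  # VCF POS is 1-based, Python slices are 0-based
        found = reference[chrom][start:start + len(ref)]
        if found.upper() != ref.upper():
            yield chrom, int(pos), sv_id, ref, found


# Toy reference and records (hypothetical values, for illustration only).
reference = {"chr1": "ACGTACGTAC"}
records = [
    "chr1\t3\tINS.1\tG\tGTTT\t.\tPASS\tSVTYPE=INS",  # REF matches base 3
    "chr1\t5\tINS.2\tC\tCAAA\t.\tPASS\tSVTYPE=INS",  # base 5 is A, not C
]

for hit in ref_mismatches(records, reference):
    print(hit)
# prints: ('chr1', 5, 'INS.2', 'C', 'A')
```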
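For example, the records before merging were:
(screenshots of the records before merging)
while the record after merging was:
(screenshot of the record after merging)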
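- The merged record carries the ID INS.2412 (whose coordinate is 98676691), but its position is 98676480 (the coordinate of INS.2411), so the SV no longer matches the reference sequence.
- In other cases the position matches the corresponding REF and ALT sequences, but the SV ID does not.
- In some cases the position, the SV ID, and the sequences all fail to match, although this is rare.
In the end, these records were corrected with the script fix_SURVIOR_error.py.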
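The script itself is not reproduced here. As a rough illustration of the first case only (the ID is right but the coordinate is wrong), one possible approach is to look up each merged record's ID in the original per-caller VCF and copy POS, REF and ALT back from it. This is a sketch under that assumption, not the author's script. The function names, the chromosome name and the allele sequences are hypothetical, and it would do the wrong thing in the second case, where the ID is the mismatched field.

```python
def index_by_id(vcf_lines):
    """Map SV ID -> (POS, REF, ALT) from an original per-caller VCF."""
    index = {}
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        index[fields[2]] = (fields[1], fields[3], fields[4])
    return index


def restore_by_id(merged_lines, original_index):
    """Yield merged VCF lines with POS/REF/ALT copied from the original
    record that has the same ID. Header lines and records with unknown IDs
    pass through unchanged. INFO fields such as END are NOT updated."""
    for line in merged_lines:
        line = line.rstrip("\n")
        if line.startswith("#") or not line.strip():
            yield line
            continue
        fields = line.split("\t")
        if fields[2] in original_index:
            fields[1], fields[3], fields[4] = original_index[fields[2]]
        yield "\t".join(fields)


# Toy records: positions and IDs follow the example above; the chromosome
# name and the allele sequences are made up.
original = [
    "chr1\t98676480\tINS.2411\tA\tACCC\t.\tPASS\tSVTYPE=INS",
    "chr1\t98676691\tINS.2412\tT\tTGGG\t.\tPASS\tSVTYPE=INS",
]
merged = ["chr1\t98676480\tINS.2412\tA\tACCC\t.\tPASS\tSVTYPE=INS"]

fixed = list(restore_by_id(merged, index_by_id(original)))
print(fixed[0])
```

Since POS may change, the output would need to be re-sorted and then re-checked with the bcftools command above.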