GC content calculation
Reference: //www.greatytc.com/p/de97067136a9 (using the TBtools utility)
Alternatively, use a Perl script (statistics over 50 kb windows), from https://blog.csdn.net/hugolee123/article/details/38441927
#! /usr/bin/perl -w
use strict;
die "#usage:perl $0 <BananaB.chr_V2.1.final.fa>" unless @ARGV==1;
my $fa=shift;
my $bin=50000; # window size: 50 kb
open IN,'<',$fa or die "Cannot open $fa: $!\n"; # "or" binds loosely, so die fires when open fails
$/=">";<IN>;$/="\n";
print "#Chr\tStart\tEnd\tGC_num\tRatio\n";
while(<IN>){
my ($chr)=/^(\S+)/; # sequence name: first word of the header line
$/=">";
chomp(my $seq=<IN>);
$/="\n";
$seq=~s/\n+//g;
my $len=length$seq;
for (my $i=0;$i<$len/$bin;$i++){
my $loc=$i*$bin;
my $sub_fa=uc(substr($seq,$loc,$bin));
my $GC=$sub_fa=~tr/GC//;
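my $ratio=sprintf "%.4f",$GC/length($sub_fa); # divide by the actual window length (the last window may be shorter than $bin)
my $start=$i*$bin+1;
my $end=($i+1)*$bin;
$end=$len if $end>$len; # clip the last window to the end of the sequence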
my $out=join "\t",$chr,$start,$end,$GC,$ratio;
print $out,"\n";
#print "$out\t$sub_fa\n" unless $GC;
}
}
close IN;
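As a cross-check on the script above, the same per-window arithmetic can be sketched in shell with awk. This is only a minimal sketch and not the original author's method: the function name `gc_windows`, the toy FASTA file, and the 4 bp window are made up for illustration, and the ratio is taken over the actual window length so that a short final window is not underestimated.

```shell
# Hypothetical helper (not from the original post): per-window GC counts.
# usage: gc_windows <fasta> <window_size>
gc_windows() {
    awk -v bin="$2" '
        # Print one line per window of the sequence collected so far.
        function emit(   i, win, wlen, gc) {
            for (i = 0; i < length(seq); i += bin) {
                win  = toupper(substr(seq, i + 1, bin))
                wlen = length(win)              # may be shorter than bin for the last window
                gc   = gsub(/[GC]/, "", win)    # gsub returns the number of replacements made
                printf "%s\t%d\t%d\t%d\t%.4f\n", name, i + 1, i + wlen, gc, gc / wlen
            }
        }
        /^>/ {                                  # header line: flush the previous sequence
            if (name != "") emit()
            name = substr($1, 2); seq = ""
            next
        }
        { seq = seq $0 }                        # sequence line: append to the current sequence
        END { if (name != "") emit() }
    ' "$1"
}

# Toy input (made-up data): one 10 bp sequence, 4 bp windows.
gc_tmp=$(mktemp -d)
printf '>Chr1\nGGCCAT\nATgc\n' > "$gc_tmp/toy.fa"
gc_windows "$gc_tmp/toy.fa" 4
```

The columns are chromosome, start, end, GC count, and GC ratio, in the same order as the header printed by the Perl script. For a real genome the window size would be 50000 rather than 4.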
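Counting genes/TEs within fixed-size windows
References:
//www.greatytc.com/p/7efe0d1139a6 (using bedtools makewindows to split chromosomes into windows)
//www.greatytc.com/p/141de8cfaebf (a detailed guide to bedtools intersect)
1. Split the chromosomes into windows of a given size (the -w option)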
bedtools makewindows -g Chr.length -w 50000 > 50k.windows
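The -g file given to bedtools makewindows is a tab-separated list of chromosome names and lengths. If only a FASTA file is at hand, one way to derive such a file is sketched below. The function name `fasta_lengths` and the toy `genome.fa` are hypothetical and only serve to show the expected format.

```shell
# Hypothetical helper: print "name<TAB>length" for every sequence in a FASTA file.
fasta_lengths() {
    awk '
        /^>/ {                                  # header line starts a new sequence
            if (name != "") print name "\t" len
            name = substr($1, 2)                # first word after ">"
            len = 0
            next
        }
        { len += length($0) }                   # sequence lines add to the running length
        END { if (name != "") print name "\t" len }
    ' "$1"
}

# Toy input (made-up data), just to show the output format.
len_tmp=$(mktemp -d)
printf '>Chr1 demo\nACGTACGT\nACGT\n>Chr2\nGGCC\n' > "$len_tmp/genome.fa"
fasta_lengths "$len_tmp/genome.fa" > "$len_tmp/Chr.length"
cat "$len_tmp/Chr.length"
```

If a samtools faidx index (.fai) already exists for the genome, its first two columns hold the same name and length information.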
2. Count the genes (or TEs) in each window
bedtools intersect -a 50k.windows -b TE.track -c > TE.newtrack
Note: the -c option reports, for each interval in the -a file, the number of features in the -b file that overlap it.
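To make the counting rule concrete, the sketch below emulates it with plain awk on toy data. It is an illustration of the overlap logic under the usual BED convention (0-based, half-open intervals), not a replacement for bedtools: the function name `count_overlaps` and all coordinates are made up, and the nested loop is far too slow for real genomes.

```shell
# Hypothetical helper: for each window, count the features overlapping it by >= 1 bp.
# usage: count_overlaps <windows.bed> <features.bed>
count_overlaps() {
    awk 'BEGIN { FS = OFS = "\t" }
         FILENAME == ARGV[1] {                  # first file read: the features
             n++; chr[n] = $1; s[n] = $2 + 0; e[n] = $3 + 0
             next
         }
         {                                      # second file read: the windows
             c = 0
             for (i = 1; i <= n; i++)
                 if (chr[i] == $1 && s[i] < $3 + 0 && e[i] > $2 + 0) c++
             print $1, $2, $3, c                # window columns plus the overlap count
         }' "$2" "$1"
}

# Toy data (made up): three windows and four features.
ov_tmp=$(mktemp -d)
printf 'Chr1\t0\t10\nChr1\t10\t20\nChr2\t0\t10\n' > "$ov_tmp/windows.bed"
printf 'Chr1\t2\t5\nChr1\t8\t12\nChr1\t15\t18\nChr2\t10\t14\n' > "$ov_tmp/features.bed"
count_overlaps "$ov_tmp/windows.bed" "$ov_tmp/features.bed"
```

In this toy run the feature at Chr1 8-12 spans the boundary between the first two windows and is counted in both, so the per-window counts can add up to more than the total number of features.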