1 Obtaining the toolkit:
SRA Toolkit home page:
fastq-dump: https://ncbi.github.io/sra-tools/fastq-dump.html
Source repository:
sra-tools github: https://github.com/ncbi/sra-tools
Precompiled binaries:
non-sudo sra-tools download:
https://github.com/ncbi/sra-tools/wiki/01.-Downloading-SRA-Toolkit
2 Download, extract, configure:
wget -c https://ftp-trace.ncbi.nlm.nih.gov/sra/sdk/2.10.9/sratoolkit.2.10.9-ubuntu64.tar.gz
tar -zxvf sratoolkit.2.10.9-ubuntu64.tar.gz
cd sratoolkit.2.10.9-ubuntu64/bin
fastq-dump --help
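This returns an error, most likely because the toolkit has not been configured yet. Run the configuration once: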
./vdb-config --interactive
Press Tab to reach exit, then Enter. After exiting:
./fastq-dump --help
/route/./fastq-dump --help
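To avoid typing the full path each time, the bin directory can be put on PATH. This is only a minimal sketch: the helper name add_sra_to_path is hypothetical, and the install location under $HOME/softwares is an assumption, so replace it with the directory where the archive was actually extracted.

```shell
# Hypothetical helper: prepend a directory to PATH unless it is already there.
add_sra_to_path() {
    dir="$1"
    case ":$PATH:" in
        *":$dir:"*) ;;                          # already on PATH, nothing to do
        *) PATH="$dir:$PATH"; export PATH ;;
    esac
}

# Assumed install location; replace with your own extraction directory.
add_sra_to_path "$HOME/softwares/sratoolkit.2.10.9-ubuntu64/bin"
```

Adding the same call to ~/.bashrc would make the setting persist across sessions.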
3 Downloading SRA data with prefetch
SRR data
# zhengzhou
# /public/home/zzumgg03/huty/softwares/sratoolkit.2.10.9-ubuntu64/bin/./prefetch
softwares/sratoolkit.2.10.9-ubuntu64/bin/./prefetch SRR1778450
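prefetch appears to skip duplicates automatically: data that has already been downloaded, or that another process is currently downloading, is not downloaded again. At least, that is what the log below suggests.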
2021-02-07T08:04:56 prefetch.2.10.9: 1) Downloading 'SRR413758'...
2021-02-07T08:04:56 prefetch.2.10.9 warn: lock exists while copying file - Lock file /public/home/zzumgg03/huty/projects/diabetes/rawdata/SRR_list_
2021-02-07T08:04:56 prefetch.2.10.9: 1) failed to download 'SRR413758': RC(rcExe,rcFile,rcCopying,rcLock,rcExists)
2021-02-07T08:11:39 prefetch.2.10.9: 1) Downloading 'SRR413759'...
2021-02-07T08:11:39 prefetch.2.10.9: Downloading via HTTPS...
2021-02-08T16:09:59 prefetch.2.10.9: HTTPS download succeed
2021-02-08T16:10:10 prefetch.2.10.9: 'SRR413759' is valid
2021-02-08T16:10:10 prefetch.2.10.9: 1) 'SRR413759' was downloaded successfully
2021-02-08T16:10:10 prefetch.2.10.9: 'SRR413760' is a local non-kart file
2021-02-08T16:10:10 prefetch.2.10.9: 'SRR413761' is a local non-kart file
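When downloading a whole list of accessions, the same skip-if-present behaviour can be made explicit in a small loop. The sketch below is an assumption-laden illustration, not how prefetch works internally: the helper names sra_present and fetch_missing, the PREFETCH variable, and the list file name SRR_list.txt are all hypothetical. It assumes prefetch writes either ACC.sra or ACC/ACC.sra under the output directory.

```shell
# Hypothetical helper: succeed if accession $1 already has a non-empty .sra
# file under directory $2, in either layout (ACC.sra or ACC/ACC.sra).
sra_present() {
    [ -s "$2/$1.sra" ] || [ -s "$2/$1/$1.sra" ]
}

# Hypothetical helper: read accessions (one per line) from file $1 and fetch
# only the missing ones into directory $2.
# PREFETCH may be set to the full path of the prefetch binary.
fetch_missing() {
    while IFS= read -r acc; do
        [ -n "$acc" ] || continue               # skip blank lines
        if sra_present "$acc" "$2"; then
            echo "skip $acc"
        else
            echo "fetch $acc"
            # </dev/null keeps the command from consuming the accession list
            "${PREFETCH:-prefetch}" --output-directory "$2" "$acc" < /dev/null
        fi
    done < "$1"
}

# Usage (SRR_list.txt is an assumed file name):
# fetch_missing SRR_list.txt ./rawdata
```

The check only tests that a non-empty file exists, so a partially downloaded file would also be skipped; the validity message in the prefetch log is the better indicator of a complete download.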
ERR data
/public/home/zzumgg03/huty/softwares/sratoolkit.2.10.9-ubuntu64/bin/./prefetch ERR1190532
2022-12-21T03:16:27 prefetch.2.10.9: 1) Downloading 'ERR1190532'...
2022-12-21T03:16:27 prefetch.2.10.9: Downloading via HTTPS...
2022-12-21T03:24:36 prefetch.2.10.9: HTTPS download succeed
2022-12-21T03:25:05 prefetch.2.10.9: 'ERR1190532' is valid
2022-12-21T03:25:05 prefetch.2.10.9: 1) 'ERR1190532' was downloaded successfully
4 Converting formats with fastq-dump
sra2fastq
sratoolkit.2.10.9-ubuntu64/bin/./fastq-dump SRR413773.sra \
--split-files \
--outdir ./
sra2fasta
sratoolkit.2.10.9-ubuntu64/bin/./fastq-dump SRR413773.sra \
--fasta default \
--split-files \
--outdir ./
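--split-files: writes each read of a spot to its own file (_1.fastq, _2.fastq, ...). Reads whose mate is missing are not discarded, so the two files can end up with different read counts.
--split-3: writes paired reads to _1.fastq and _2.fastq, and puts reads whose mate is missing into a third, separate file.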
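One way to check how a split turned out is to count the reads in each mate file and compare. The sketch below assumes uncompressed FASTQ with the usual 4 lines per record; the helper names count_fastq_reads and pairs_in_sync are hypothetical.

```shell
# Hypothetical helper: print the number of reads in an uncompressed FASTQ
# file, assuming 4 lines per record.
count_fastq_reads() {
    lines=$(wc -l < "$1")
    echo $((lines / 4))
}

# Hypothetical helper: succeed only if both mate files hold the same
# number of reads.
pairs_in_sync() {
    [ "$(count_fastq_reads "$1")" -eq "$(count_fastq_reads "$2")" ]
}

# Usage, with the file names fastq-dump would write for SRR413773:
# pairs_in_sync SRR413773_1.fastq SRR413773_2.fastq && echo "in sync"
```

Equal counts do not prove that the read names match line by line, but unequal counts are a quick sign that downstream paired-end tools will complain.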
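A normal fastq-dump conversion produces no intermediate files. If intermediate files appear, the .sra file is probably corrupted; download it again and rerun the conversion.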
5 Converting formats with fasterq-dump
sra2fastq
sratoolkit.2.10.9-ubuntu64/bin/./fastq-dump --split-3 SRR341593.sra
# real 2m17.582s
sratoolkit.2.10.9-ubuntu64/bin/./fasterq-dump \
--split-3 SRR341593.sra --threads 20 --outdir ./
# real 2m13.907s
In this test fastq-dump and fasterq-dump ran at about the same speed; the .sra file was 1.3 GB.
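To repeat this comparison on other files, a small wrapper can report wall-clock time for any command. This is only a rough sketch with one-second resolution; the helper name time_cmd is hypothetical, and the shell's own time keyword does the same job with finer resolution.

```shell
# Hypothetical helper: run a command, print its wall-clock time in whole
# seconds on stderr, and pass the command's exit status through.
time_cmd() {
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start))s ($1)" >&2
    return "$status"
}

# Usage, mirroring the two commands timed above:
# time_cmd fastq-dump --split-3 SRR341593.sra
# time_cmd fasterq-dump --split-3 SRR341593.sra --threads 20 --outdir ./
```

A single run on one 1.3 GB file is a small sample; disk speed and file size may change the picture, so it is worth timing a few files before settling on one tool.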