Building a tree with MAFFT, trimAl, and FastTree

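Overview

barrnap predicts the 16S rRNA genes in a bacterial genome; take the longest 16S from its result file (the second line) and keep only 16S sequences longer than 1400 bp. mafft aligns the sequences, trimal trims the alignment, and fasttree builds the tree. ggtree handles the visualization; figtree also works for a quick look at the tree file. The steps below start from a prepared 16S sequence file.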
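The length filter mentioned above can be sketched in plain shell. This is a minimal sketch, not the author's own script: the function name `filter_fasta_by_length` is hypothetical, and it assumes an ordinary FASTA file whose sequences may be wrapped over several lines.

```shell
# Keep FASTA records whose sequence is longer than a minimum length.
# Hypothetical helper; usage: filter_fasta_by_length MIN_LEN < in.fa > out.fa
filter_fasta_by_length() {
    awk -v min="$1" '
        # Print the buffered record if its sequence is longer than min.
        function emit_record() {
            if (header != "" && length(seq) > min) {
                print header
                print seq
            }
        }
        /^>/ { emit_record(); header = $0; seq = ""; next }  # a new record starts
        { gsub(/[ \t\r]/, ""); seq = seq $0 }                 # join wrapped lines
        END { emit_record() }
    '
}
```

For the cutoff used here the call would be `filter_fasta_by_length 1400 < all_16S.fa > all_16S_1400.fa`, where both file names are placeholders.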

Papers

Title: MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability
Journal: Molecular Biology and Evolution
Year: 2013
Citations: 21366 (Google Scholar, 2021.11.24)

Title: trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses
Journal: Bioinformatics
Year: 2009
Citations: 4662 (Google Scholar, 2021.11.24)

Title: FastTree 2 – Approximately Maximum-Likelihood Trees for Large Alignments
Journal: PLoS ONE
Year: 2010
Citations: 7840 (Google Scholar, 2021.11.24)

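Introduction

FastTree builds phylogenetic trees by approximately maximum likelihood. Its main strength is speed: according to its website it can handle alignments with up to a million sequences. However, fasttree does not run a conventional bootstrap itself (by default it reports SH-like local support values instead), and it supports only a limited set of substitution models.

Official site: http://www.microbesonline.org/fasttree/

Choosing a substitution model:
FastTree builds trees from both nucleotide and protein alignments. For nucleotides, the available models are JC (Jukes-Cantor) and GTR (generalized time-reversible), with JC as the default. For proteins, the available models are JTT (Jones-Taylor-Thornton 1992), LG (Le and Gascuel 2008), and WAG (Whelan & Goldman 2001), with JTT as the default. FastTree requires the input multiple sequence alignment to be in FASTA or Phylip format.

Source: https://www.omicsclass.com/article/1343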
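As a rough illustration of the model choices listed above, the helper below maps a model name to FastTree command-line flags. The function name `fasttree_model_flags` is hypothetical; the flags themselves (`-nt`, `-gtr`, `-wag`, `-lg`) are FastTree options, and JTT needs no flag because it is the protein default.

```shell
# Map a substitution model name to the matching FastTree flags.
# Hypothetical helper; prints the flags on stdout.
fasttree_model_flags() {
    case "$1" in
        JC)  echo "-nt" ;;        # nucleotide alignment, default model
        GTR) echo "-nt -gtr" ;;   # nucleotide alignment, GTR model
        JTT) echo "" ;;           # protein alignment, default model
        WAG) echo "-wag" ;;       # protein alignment, WAG model
        LG)  echo "-lg" ;;        # protein alignment, LG model
        *)   echo "unknown model: $1" >&2; return 1 ;;
    esac
}
```

A call would then look like `fasttree $(fasttree_model_flags GTR) aln.fa > aln.tree`, with `aln.fa` standing in for your own alignment.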

Installing the software

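conda create -n xgene
conda activate xgene
# mafft, trimal, and fasttree are distributed through the bioconda channel
conda install -c conda-forge -c bioconda mafft trimal fasttree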
Downloading and Extracting Packages
mafft-7.487          | 3.0 MB    | ##################################### | 100%
trimal-1.4.1         | 189 KB    | ##################################### | 100%
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

1 Prepare the input file

2 Align the sequences with mafft

Windows version: //www.greatytc.com/p/d61cf5861e65

mafft --auto bgi_illumina_16S.fa > bgi_illumina_16S_align.fa

Log output:

generating a scoring matrix for nucleotide (dist=200)

Making a distance matrix
Constructing a UPGMA tree (efffree=0) ...
Progressive alignment 1/2...
Making a distance matrix from msa..
Constructing a UPGMA tree (efffree=1)
Progressive alignment 2/2...
disttbfast (nuc) Version 7.487

generating a scoring matrix for nucleotide (dist=200)
dndpre (nuc) Version 7.487

generating a scoring matrix for nucleotide (dist=200)
dvtditr (nuc) Version 7.487

Result:

3 Trim the alignment with trimal

Official site: http://trimal.cgenomics.org/
Download: http://trimal.cgenomics.org/downloads
Manual: http://trimal.cgenomics.org/getting_started_with_trimal_v1.2

trimal \
-in bgi_illumina_16S_align.fa \
-out bgi_illumina_16S_align_filter.fa \
-automated1
# Use a heuristic selection of the automatic method based on similarity statistics. (see User Guide). (Optimized for Maximum Likelihood phylogenetic tree reconstruction).
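To see how many columns trimming removed, it helps to compare the alignment before and after. The sketch below is one way to do that in plain shell; the function name `alignment_summary` is hypothetical, and it assumes aligned FASTA input in which gap characters count as columns.

```shell
# Print "<number of sequences> <alignment length>" for an aligned FASTA file.
# Hypothetical helper; fails if the sequences do not all have the same length.
alignment_summary() {
    awk '
        # Compare the record that just ended with the first record.
        function close_record() {
            if (n == 0) return
            if (n == 1) width = len
            else if (len != width) bad = 1
        }
        /^>/ { close_record(); n++; len = 0; next }   # a new record starts
        { gsub(/[ \t\r]/, ""); len += length($0) }    # add up wrapped lines
        END {
            close_record()
            if (bad) { print "sequences differ in length" > "/dev/stderr"; exit 1 }
            print n, width + 0
        }
    ' "$1"
}
```

Running `alignment_summary bgi_illumina_16S_align.fa` and `alignment_summary bgi_illumina_16S_align_filter.fa` should report the same number of sequences, with a shorter alignment length for the trimmed file.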

4 Build the tree with fasttree

Tutorial (not the official site): https://bioinformaticsworkbook.org/phylogenetics/FastTree.html#gsc.tab=0
Workflow from that tutorial: sequences -> MAFFT -> FastTree -> FigTree -> pdf

fasttree \
-nt bgi_illumina_16S_align_filter.fa \
> bgi_illumina_16S_align_filter.tree

Log output:

FastTree Version 2.1.10 Double precision (No SSE3)
Nucleotide distances: Jukes-Cantor Joins: balanced Support: SH-like 1000
Search: Normal +NNI +SPR (2 rounds range 10) +ML-NNI opt-each=1
TopHits: 1.00*sqrtN close=default refresh=0.80
ML Model: Jukes-Cantor, CAT approximation with 20 rate categories
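The log line "Support: SH-like 1000" means the tree file carries SH-like local support values as internal node labels, each between 0 and 1. As a quick check before plotting, the sketch below pulls those values out of the Newick text. Both function names are hypothetical, and the pattern assumes support values appear directly after a closing parenthesis, as in FastTree output.

```shell
# Print one support value per line from a Newick tree read on stdin.
# Hypothetical helper; matches the number that follows each ")".
newick_support_values() {
    grep -oE '\)[0-9]+(\.[0-9]+)?' | tr -d ')'
}

# Count internal nodes whose support is below a threshold.
# Hypothetical helper; usage: count_low_support THRESHOLD < tree
count_low_support() {
    newick_support_values | awk -v t="$1" '$1 + 0 < t + 0 { n++ } END { print n + 0 }'
}
```

For example, `count_low_support 0.7 < bgi_illumina_16S_align_filter.tree` reports how many splits fall below 0.7; the threshold is only an illustration, not a recommendation from the original post.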

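5 Visualize with ggtree

library("ggplot2")
library("ggtree")

# Read the Newick tree written by fasttree
data <- read.tree("bgi_illumina_16S_align_filter.tree")
# The tree as a data frame, handy for inspecting labels and branch lengths
tree <- fortify(data)

# `pal` is not defined in this post; it is assumed to be a tip-annotation data frame.
# %<+% joins it to the tree by its first column, which must hold the tip labels.
# It also needs the columns `mark` (text to show) and `Platform` ("BGI" or "Illumina").
gra3 <-
ggtree(data, layout="fan", branch.length="none", size=0.8) %<+% pal +
geom_tiplab(aes(label = mark, col = Platform),
            size=3) +
scale_color_manual(
    values = c("BGI" = "orangered3",
               "Illumina" = "deepskyblue3"))

ggsave("tree3.jpg", plot = gra3)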

Building a tree with iqtree

mafft --auto Tree.fas > Tree.fas.mafft
iqtree -s Tree.fas.mafft -m MFP -bb 1000 -bnni -redo -o NC_010433

