Mate Pair and Paired-End Sequencing – Illumina
Today I am posting about a recurring question of mine: the differences between the Mate Pair and Paired-End sequencing technologies used in next-generation sequencing.
Unfortunately, because of technology limitations, you cannot sequence long fragments in full, only their ends (from 35 to 100 bp, depending on the chemistry and the sequencer). These two methodologies were designed to work around this serious limitation.
First of all, my source was Illumina's own website: here is the link for Mate Pair and the link for Paired-End Seq. As stated on the website and in many articles, these technologies are very useful when you deal with de novo sequencing (assembling an entire genome) or with repetitive parts of a genome. The reason is that once one read is well aligned and you know the distance between the two reads, you can use the first one as an anchor to determine the position of the second. I think Paired-End is the easier of the two, so I will start with it.
I recommend watching this video if you know nothing about Illumina technology.
Paired-End
Paired-End sequencing
It is a modification of shotgun sequencing (where your reads have no pairs).
Once you have the DNA fragmented into 200-500 bp pieces, you add adapters (A1 and A2) to both ends of the sequence of interest.
In the fourth step you generate clusters (spots on the flowcell made up of copies of the same sequence, produced by amplification).
Finally, after cluster generation you go to the sequencing step (fifth and sixth steps), where, using modified dNTPs and primers for known sequences (SP1 and SP2), you read the bases as light signals.
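To make the idea of reading only the two ends more concrete, here is a minimal Python sketch. It is a toy model, not Illumina's software: the function names and the 25 bp fragment are made up for illustration, and it assumes error-free reads. R1 is taken from one end of the fragment, and R2 from the other end on the opposite strand, which is why it comes out as a reverse complement.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def paired_end_reads(fragment, read_len):
    """Read both ends of one fragment (hypothetical helper).

    R1 is the first read_len bases of the top strand. R2 is read from the
    bottom strand, so it is the reverse complement of the last read_len
    bases of the fragment.
    """
    if read_len > len(fragment):
        raise ValueError("read_len is longer than the fragment")
    r1 = fragment[:read_len]
    r2 = reverse_complement(fragment[-read_len:])
    return r1, r2


fragment = "ACGTTGCAAGGCTTACCGGATTACA"  # toy 25 bp fragment
r1, r2 = paired_end_reads(fragment, read_len=8)
print(r1)  # ACGTTGCA
print(r2)  # TGTAATCC
```

The middle of the fragment is never read; only its length links the two reads.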
Because the library was prepared with fragments of known size, you know the approximate distance between the two reads, and you can (and should) give this information to your aligner or assembler, depending on your application. It is very helpful information that makes these programs make fewer mistakes.
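As a rough illustration of how a program might use the known distance, the sketch below computes where the second read is expected to start once the first read is placed. This is not how any particular aligner does it. The function names, the mean/standard-deviation model of the insert size, and the three-standard-deviation cutoff are all assumptions of mine, and "insert size" here means the full fragment length from outer end to outer end.

```python
def expected_mate_window(anchor_start, read_len, insert_mean, insert_sd, n_sd=3):
    """Return (lowest, highest) expected start coordinate of the mate.

    Assumes the anchor read maps to the forward strand at anchor_start and
    that the mate covers the last read_len bases of the fragment.
    """
    shortest = max(read_len, insert_mean - n_sd * insert_sd)
    longest = insert_mean + n_sd * insert_sd
    return (anchor_start + shortest - read_len,
            anchor_start + longest - read_len)


def is_consistent(anchor_start, mate_start, read_len, insert_mean, insert_sd):
    """True if the mate starts inside the expected window."""
    low, high = expected_mate_window(anchor_start, read_len, insert_mean, insert_sd)
    return low <= mate_start <= high


print(expected_mate_window(1000, 100, 300, 30))  # (1110, 1290)
print(is_consistent(1000, 1200, 100, 300, 30))   # True
print(is_consistent(1000, 5000, 100, 300, 30))   # False
```

A candidate position far outside the window, such as a second copy of a repeat, can then be ranked below one that fits the library.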
Remember that the orientation of a pair of reads (R1/R2) must appear in the aligner output as (→←), respectively.
Mate Pair
Mate Pair sequencing
In mate pair, the DNA is fragmented into bigger pieces (2-5 kb).
Biotin is added to each 5′ end (step 3).
Fragments with biotin correctly added circularize and, after a wash, the non-circularized fragments are discarded (step 4).
In steps 5 and 6, the circularized fragments are cut, leaving the biotin in the middle, and size-selected (400-600 bp).
Then sequencing proceeds as usual: adapters with the primer sequences are added (step 7), the fragments are spotted on the flowcell and clustered (step 8), and then sequenced (steps 9 and 10).
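The circularization trick is easier to see on a toy example. The Python sketch below is a simplification under my own assumptions: a random 5 kb "genome", a 3 kb fragment, exact-match lookup instead of a real aligner, and hypothetical function names. It joins the two ends of the large fragment, keeps the 500 bp piece that spans the junction (the piece that carries the biotin), and then reads both ends of that piece as in a normal paired-end run.

```python
import random


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def mate_pair_reads(genome, frag_start, frag_len, half_piece, read_len):
    """Simulate one mate pair from a large fragment (toy model)."""
    fragment = genome[frag_start:frag_start + frag_len]
    # Circularization joins the fragment's end to its start. After the
    # second shearing, keep the piece that spans that junction.
    junction_piece = fragment[-half_piece:] + fragment[:half_piece]
    # Sequence both ends of the junction piece, paired-end style.
    r1 = junction_piece[:read_len]
    r2 = reverse_complement(junction_piece[-read_len:])
    return r1, r2


def locate(genome, read):
    """Toy exact-match 'aligner': return (start, strand)."""
    pos = genome.find(read)
    if pos != -1:
        return pos, "+"
    return genome.find(reverse_complement(read)), "-"


rng = random.Random(0)
genome = "".join(rng.choice("ACGT") for _ in range(5000))
r1, r2 = mate_pair_reads(genome, frag_start=1000, frag_len=3000,
                         half_piece=250, read_len=30)
hit1 = locate(genome, r1)
hit2 = locate(genome, r2)
print(hit1, hit2)
```

In this model the read near the start of the original fragment lands on the reverse strand and the read near its end on the forward strand, about 2.5 kb apart, even though the sequenced piece was only 500 bp long.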
Because the library was prepared with fragments of known size, you know the approximate distance between the two reads, and you can (and should) give this information to your aligner or assembler, depending on your application. It is very helpful information that makes these programs make fewer mistakes.
Remember that the orientation of a pair of reads (R1/R2) must appear in the aligner output as (←→), respectively.
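A quick way to check which of the two layouts a mapped pair has is to look at the leftmost read first. The helper below is a hypothetical sketch, not part of any aligner: it takes the leftmost mapped coordinate and the strand of each read and returns "FR" for (→←), the paired-end layout, or "RF" for (←→), the mate-pair layout. "FF" and "RR" mean both reads are on the same strand.

```python
def pair_orientation(r1_start, r1_forward, r2_start, r2_forward):
    """Classify a mapped read pair as 'FR', 'RF', 'FF' or 'RR'.

    Each read is described by its leftmost mapped coordinate and a flag
    saying whether it maps to the forward strand.
    """
    # Order the two reads by their leftmost coordinate on the reference.
    left, right = sorted([(r1_start, r1_forward), (r2_start, r2_forward)])
    return ("F" if left[1] else "R") + ("F" if right[1] else "R")


# Paired-end style: left read forward, right read reverse.
print(pair_orientation(100, True, 350, False))   # FR
# Mate-pair style: left read reverse, right read forward.
print(pair_orientation(3100, True, 100, False))  # RF
```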
Differences
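In the paired-end method, sequencing-primer binding sites are added to the adapters at both ends when the DNA library is built. After the first round of sequencing is complete, the template strand of the first round is removed, and the Paired-End Module directs the complementary strand to be regenerated and amplified in its original position until there is enough template for the second round. The second round then sequences the complementary strand by synthesis.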
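Mate-pair library preparation aims to generate short DNA fragments that contain the sequences from both ends of a much longer stretch of the genome (2-10 kb). More specifically: genomic DNA is first randomly sheared to a chosen size (anywhere in the 2-10 kb range); after end repair, biotin labeling, and circularization, the circularized DNA molecules are sheared again into 400-600 bp fragments, and the biotin-labeled fragments are captured with streptavidin-coated magnetic beads. The captured fragments are end-modified, ligated to specific adapters to form the mate-pair library, and then loaded on the sequencer.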
Summary
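The main high-throughput sequencing modes are single-end sequencing and paired-end/mate-paired (PE/MP) sequencing. To sequence several samples at once, you can add a different adapter to each sample, mix them, and sequence them together.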
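After a mixed run, the reads have to be sorted back to their samples. The Python sketch below shows the idea under a simplifying assumption of mine: each read starts with a short sample-specific barcode that is matched exactly. Real runs often read the index separately and tolerate mismatches, and the barcodes and sample names here are invented.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by sample using an exact match on a leading barcode.

    reads: iterable of read sequences that start with the barcode.
    barcodes: dict mapping barcode -> sample name (all the same length).
    Reads with an unknown barcode go to 'undetermined'.
    """
    barcode_len = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        sample = barcodes.get(read[:barcode_len], "undetermined")
        # Store the read without its barcode.
        by_sample[sample].append(read[barcode_len:])
    return dict(by_sample)


barcodes = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTTTGACC", "TGCAGGATCC", "ACGTCCATGA", "GGGGAAAACC"]
result = demultiplex(reads, barcodes)
print(result["sample_A"])      # ['TTGACC', 'CCATGA']
print(result["sample_B"])      # ['GGATCC']
print(result["undetermined"])  # ['AAAACC']
```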
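Single-end sequencing randomly shears the genome and sequences each fragment. Library construction is simple, with few steps, and it is commonly used for small genomes, transcriptomes, and metagenomes. PE/MP sequencing, also called bidirectional sequencing, reads both ends of one long sequence. The two end sequences form a "pair", and the distance between them is called the insert length (ins_length). Paired-end and mate-paired differ in how the library is built. In Solexa, for example, the fragments go through bridge amplification without being circularized, and the result is called paired-end. In 454 sequencing, the sequence is first circularized and then sheared and sequenced, and the result is called mate-paired.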
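Whichever approach is used, PE/MP sequencing yields distance information in addition to the sequences themselves. The distance information can be used to judge whether the sequence between paired reads is correct after assembly, and it can also help the assembly itself. This kind of sequencing helps resolve the problem of repetitive sequences in a genome and is widely used. At present, for paired sequencing, the 454 platform builds the longest libraries (up to 20 kb) and Illumina the shortest (under 5 kb). Solexa, which uses bridge amplification, has paired-end sequencing capability built in, and so does SOLiD. 454 and Ion Torrent, by contrast, must circularize the sheared fragments and digest them with enzymes before they can do mate-paired sequencing, so library construction costs more than for single-end sequencing.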
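Here is a minimal Python sketch of how the distance information might be used to check an assembly. It is only an illustration under my own assumptions: pairs are already mapped to the same contig, the library is described by a mean and a standard deviation, and the function name and the three-standard-deviation cutoff are invented. Pairs whose span in the assembly is far from the library's insert length point to a region that may be collapsed or expanded.

```python
def flag_suspect_pairs(pairs, insert_mean, insert_sd, n_sd=3):
    """Return (pair_id, span) for pairs whose span looks wrong.

    pairs: list of (pair_id, left_start, right_end) on one contig.
    A pair is suspect if its span differs from insert_mean by more
    than n_sd standard deviations.
    """
    suspects = []
    for pair_id, left_start, right_end in pairs:
        span = right_end - left_start
        if abs(span - insert_mean) > n_sd * insert_sd:
            suspects.append((pair_id, span))
    return suspects


pairs = [("p1", 100, 3080), ("p2", 5000, 8150), ("p3", 9000, 9900)]
print(flag_suspect_pairs(pairs, insert_mean=3000, insert_sd=200))
# [('p3', 900)]
```

Here p3 spans only 900 bp in the assembly although the library was built from roughly 3 kb fragments, which is the kind of signal you would expect if a repeat had been collapsed.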
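In essence:
The DNA library for mate-pair sequencing is made by circularizing very long DNA, with a recognition sequence ligated at the junction of the circle, then shearing it, enriching for the DNA that contains the recognition sequence, and sequencing from both ends, so the insert length of the pair is very long. Paired-end, in contrast, adds adapters directly to both ends of the DNA and sequences from both ends, so the insert length is shorter. Mate pair can therefore span longer fragments.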