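Today we reach part five of our series on joint single-cell and spatial analysis. Some readers may be wondering why we share and study so many methods; wouldn't one be enough? If that is your question, it suggests you still need a broader view. With that said, let's start today's topic, another method for joint single-cell and spatial analysis: spatialDWLS.
The paper is "SpatialDWLS: accurate deconvolution of spatial transcriptomic data". It is currently a preprint, written by Chinese authors, and the method is again deconvolution. Study it side by side with SPOTlight, which we covered earlier. Here we focus on the key points.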
Background
I. Why can't we take a deconvolution method designed for bulk-seq data and apply it directly to spatial transcriptomics data?
(1) The number of cells within each spot is typically small. For example, each spot in the 10X Genomics Visium platform has a diameter of 55 μm, corresponding to a spatial resolution of 5-10 cells. The application of a bulk RNAseq deconvolution method to such a small sample size would result in noise from unrelated cell types. (In short: noise.)
(2) As spatial expression datasets usually contain thousands of spots, it would be time and memory consuming if deconvolution methods designed for bulk RNA-seq were applied on spatial expression datasets. (This second reason is secondary; the first is the main one.)
II. How spatialDWLS works
(1) It identifies cell types that are likely to be present at each location by using a recently developed cell-type enrichment analysis method. (Note that an enrichment method is used here; we look at it in the algorithm part.)
(2) The cell type composition at each location is inferred by extending the dampened weighted least squares (DWLS) method, which was originally developed for deconvolving bulk RNAseq data. (For now, just remember this simple two-step process.)
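III. Evaluating spatialDWLS
The Root Mean Square Error (RMSE) associated with oligodendrocytes is only 0.03, with the predicted values approximately centered around the ground truth.
The metric here is the Root Mean Square Error (RMSE); look it up if it is unfamiliar.
As you can see, the evaluation includes a comparison with SPOTlight, which we introduced earlier.
A side note: every paper claims its own method is the best, so we have to judge for ourselves.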
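For readers who have not met RMSE before, here is a minimal Python sketch. RMSE is the square root of the mean squared difference between predicted and true values, i.e. sqrt((1/n) Σ (predicted_i − truth_i)²). The function name `rmse` and the proportions below are made up for illustration and are not taken from the paper.

```python
import math


def rmse(predicted, truth):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(truth):
        raise ValueError("predicted and truth must have the same length")
    if not predicted:
        raise ValueError("inputs must not be empty")
    squared_errors = [(p - t) ** 2 for p, t in zip(predicted, truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical proportions of one cell type in four spots (invented numbers).
predicted = [0.10, 0.22, 0.05, 0.40]
truth = [0.12, 0.20, 0.05, 0.37]
print(round(rmse(predicted, truth), 4))
```

A small RMSE means the predicted proportions sit close to the ground truth on average; it does not tell you whether the errors are biased in one direction.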
IV. Application and validation: here is just one of the examples
During embryonic development, the spatial-temporal distribution of cell types changes dramatically. Therefore, it is of interest to test whether spatialDWLS could aid the discovery of such dynamic changes. Recently, Asp and colleagues studied the development of human heart in early embryos (4.5–5, 6.5, and 9 post-conception weeks) by using the Spatial Transcriptomics (ST) technology. Since the data does not have single-cell resolution, they were not able to identify cell-type distribution directly from the ST data. In order to apply spatialDWLS, we utilized the single-cell RNAseq derived gene signatures from this study as reference. All the cell types were mapped to expected locations.
In order to quantitatively compare the change of spatial-temporal organization of cell type composition during embryonic heart development, we first examined the overall abundance of different cell types.
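Some cell types increase and others decrease (judging from the joint analysis). In short, the results look good and everyone should give it a try (that is the authors' view).
Now let's look closely at the methods section of the paper.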
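To make "overall abundance" concrete, here is a minimal Python sketch, assuming the deconvolution output is one dictionary of cell-type proportions per spot. The function name `mean_abundance`, the stage labels, the cell-type names and all numbers are hypothetical; the paper may summarise abundance differently.

```python
from collections import defaultdict


def mean_abundance(spot_proportions):
    """Average each cell type's proportion over all spots.

    spot_proportions: list of {cell_type: proportion}, one dict per spot.
    A cell type missing from a spot counts as proportion 0 there.
    """
    totals = defaultdict(float)
    for spot in spot_proportions:
        for cell_type, proportion in spot.items():
            totals[cell_type] += proportion
    n_spots = len(spot_proportions)
    return {cell_type: total / n_spots for cell_type, total in totals.items()}


# Invented per-spot proportions for two hypothetical time points.
stages = {
    "early": [{"type_A": 0.5, "type_B": 0.5}, {"type_A": 1.0}],
    "late": [{"type_A": 0.25, "type_B": 0.75}, {"type_B": 1.0}],
}
for stage, spots in stages.items():
    print(stage, mean_abundance(spots))
```

Comparing these averages between stages is one simple way to see which cell types go up and which go down.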
Cell type selection of spatial expression data by enrichment analysis
We use an enrichment based weighted least squares approach for deconvolution of spatial expression datasets.
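(1) Enrichment analysis using the Parametric Analysis of Gene Set Enrichment (PAGE) method [22] is applied on the spatial expression dataset as previously reported. The enrichment method here is in the GSEA family: PAGE is a parametric alternative to GSEA that scores each gene set with a z-score. The marker genes can be identified via differential expression gene analysis of Giotto based on the single cell RNA-seq data provided by users (markers taken from the single-cell data, which feels a bit crude to me). Alternatively, users can also provide marker gene expression for each cell type for deconvolution (or you supply the markers yourself, which is even more of a stretch).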
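Let m be the number of marker genes for a cell type. For each gene, we calculate the fold change as the ratio of its expression value at each spot to its mean expression across all spots. The mean and standard deviation of the fold change values are defined as μ and δ, respectively. In addition, we calculate the mean fold change of the m marker genes, which is defined as Sm. The enrichment score (ES) is defined as follows:
ES = (Sm − μ) × √m / δ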
Then, we binarize the enrichment matrix with the cutoff value of ES = 2 to select cell types that are likely to be present at each point.
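The enrichment and binarization steps can be sketched in Python as below. This is a rough illustration, not the Giotto implementation. It assumes the standard PAGE z-score form ES = (Sm − μ) × √m / δ, that μ and δ are taken per spot over the fold changes of all genes, that δ is a population standard deviation, and that "cutoff of ES = 2" means ES strictly greater than 2. The function names `page_enrichment` and `binarize` and the toy genes are hypothetical.

```python
import math


def page_enrichment(expr, markers):
    """Return {cell_type: [enrichment score per spot]}.

    expr:    {gene: [expression value per spot]}
    markers: {cell_type: [marker gene names]}
    """
    genes = list(expr)
    n_spots = len(expr[genes[0]])
    # Fold change: expression at a spot divided by the gene's mean over all spots.
    fold = {}
    for gene in genes:
        mean_expr = sum(expr[gene]) / n_spots
        fold[gene] = [v / mean_expr if mean_expr > 0 else 0.0 for v in expr[gene]]

    scores = {cell_type: [] for cell_type in markers}
    for spot in range(n_spots):
        # mu and delta summarise the fold changes of all genes at this spot (assumption).
        column = [fold[gene][spot] for gene in genes]
        mu = sum(column) / len(column)
        delta = math.sqrt(sum((v - mu) ** 2 for v in column) / len(column))
        for cell_type, marker_genes in markers.items():
            used = [g for g in marker_genes if g in fold]
            m = len(used)
            if m == 0 or delta == 0:
                scores[cell_type].append(0.0)
                continue
            s_m = sum(fold[g][spot] for g in used) / m
            scores[cell_type].append((s_m - mu) * math.sqrt(m) / delta)
    return scores


def binarize(scores, cutoff=2.0):
    """Mark a cell type as present (1) at a spot when its score exceeds the cutoff."""
    return {ct: [1 if es > cutoff else 0 for es in values] for ct, values in scores.items()}


# Toy data: 10 hypothetical genes measured at 2 spots.
expr = {f"g{i}": [1.0, 1.0] for i in range(10)}
expr["g0"] = expr["g1"] = [9.0, 1.0]   # markers of hypothetical cell type A
expr["g2"] = expr["g3"] = [1.0, 9.0]   # markers of hypothetical cell type B
markers = {"A": ["g0", "g1"], "B": ["g2", "g3"]}
scores = page_enrichment(expr, markers)
present = binarize(scores)
print(present)
```

In this toy case cell type A is flagged only at the first spot and cell type B only at the second. Note how much the result depends on the marker list and on the cutoff, which is exactly why this step deserves a critical look.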
Frankly, this enrichment method looks rather shaky to me.
Estimating cell type composition by using a weighted least squares approach
In previous work, we developed dampened weighted least squares (DWLS) for deconvolution of bulk RNAseq data. (It is worth looking this method up.) This method is extended here to deconvolve spatial transcriptomic data using the signature gene identification step described above. In short, DWLS uses a weighted least squares approach to infer cell-type composition, where the weight is selected to minimize the overall relative error rate. In addition, a damping constant d is used to enhance numerical stability, whose value is determined by using a cross-validation procedure. Here, we use the same sets of weights and damping constant across spots within the same clusters to reduce technical variation. Finally, since the number of cells present at each spot is generally small, we perform another round of deconvolution after removing those cell types that are predicted to be present at a low frequency, by imposing an additional threshold (min frequency = 0.02 by default). (This part really comes down to the algorithm, which is worth digging into.)
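For those who want to dig into the algorithm, here is a heavily simplified Python sketch of the idea for a single spot. It is not the authors' code. Assumptions: the weights are 1 / (predicted expression)², scaled by their minimum and capped at 2^d; d is fixed instead of chosen by cross-validation; weights are computed per spot rather than shared within a cluster; and a plain projected gradient solver stands in for the quadratic programming step. All names (`weighted_nnls`, `dampened_weights`, `fit`, `dwls_sketch`) and the toy signature matrix are hypothetical.

```python
def weighted_nnls(S, y, w, iters=1000):
    """Minimise sum_i w[i] * (y[i] - S[i] . x)^2 subject to x >= 0 (projected gradient)."""
    n_genes, n_types = len(S), len(S[0])
    # Safe step size: the Hessian 2 * S^T W S has largest eigenvalue <= its trace.
    lipschitz = 2.0 * sum(w[i] * S[i][k] ** 2 for i in range(n_genes) for k in range(n_types))
    step = 1.0 / lipschitz
    x = [1.0 / n_types] * n_types
    for _ in range(iters):
        resid = [sum(S[i][k] * x[k] for k in range(n_types)) - y[i] for i in range(n_genes)]
        grad = [2.0 * sum(w[i] * resid[i] * S[i][k] for i in range(n_genes)) for k in range(n_types)]
        x = [max(0.0, x[k] - step * grad[k]) for k in range(n_types)]
    return x


def dampened_weights(S, x, d):
    """Relative-error weights 1 / prediction^2, scaled by their minimum and capped at 2**d."""
    pred = [max(sum(row[k] * x[k] for k in range(len(x))), 1e-8) for row in S]
    raw = [1.0 / p ** 2 for p in pred]
    smallest = min(raw)
    return [min(r / smallest, 2.0 ** d) for r in raw]


def fit(S, y, keep, d, rounds):
    """Deconvolve one spot using only the cell-type columns listed in keep."""
    sub = [[row[k] for k in keep] for row in S]
    x = weighted_nnls(sub, y, [1.0] * len(y))      # unweighted start
    for _ in range(rounds):                        # re-weight and re-solve
        x = weighted_nnls(sub, y, dampened_weights(sub, x, d))
    total = sum(x)
    return [v / total for v in x] if total > 0 else x


def dwls_sketch(S, y, candidates, d=4, rounds=5, min_frequency=0.02):
    """candidates: indices of cell types kept by the enrichment step."""
    freq = fit(S, y, candidates, d, rounds)
    survivors = [k for k, f in zip(candidates, freq) if f >= min_frequency]
    if survivors and len(survivors) < len(candidates):
        # Second round without the cell types predicted at low frequency.
        candidates = survivors
        freq = fit(S, y, candidates, d, rounds)
    return dict(zip(candidates, freq))


# Toy signature matrix: 4 hypothetical genes (rows) x 3 hypothetical cell types (columns).
S = [[10.0, 1.0, 1.0], [1.0, 10.0, 1.0], [1.0, 1.0, 10.0], [5.0, 5.0, 1.0]]
y = [7.3, 3.7, 1.0, 5.0]  # spot built as 0.7 * type 0 + 0.3 * type 1
print(dwls_sketch(S, y, [0, 1, 2]))
```

The point of the weights is that lowly expressed genes would otherwise be drowned out by highly expressed ones; the cap at 2^d keeps a handful of near-zero predictions from dominating the fit. In this toy spot the third cell type falls below the 0.02 threshold and is dropped before the second round.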
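To finish, here is a figure showing the results.
The method is available as spatialDWLS in Giotto (the package mentioned above). The code is simple; the only function you need to care about is runDWLSDeconv. The algorithm is where the real substance lies.
Life is good, and it is even better with you.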