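Series recap:
ArchR Official Tutorial Study Notes 1: Getting Started with ArchR
ArchR Official Tutorial Study Notes 2: Inferring Doublets with ArchR
ArchR Official Tutorial Study Notes 3: Creating an ArchRProject
ArchR Official Tutorial Study Notes 4: Dimensionality Reduction with ArchR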
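Most single-cell clustering methods compute nearest neighbor graphs in the reduced-dimension space and then identify "communities", or groups of cells, in that graph. These methods work very well and are standard practice for scRNA-seq. For this reason, ArchR performs clustering with existing state-of-the-art methods from scRNA-seq packages.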
(1) Clustering with Seurat's FindClusters() function
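We have had great success with Seurat's graph clustering implementation. In ArchR, clustering is performed with the addClusters() function, which accepts additional clustering parameters and passes them on to Seurat::FindClusters(). Clustering with Seurat::FindClusters() is deterministic, meaning that exactly the same input produces exactly the same output.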
> projHeme2 <- addClusters(
    input = projHeme2,
    reducedDims = "IterativeLSI",
    method = "Seurat",
    name = "Clusters",
    resolution = 0.8
  )
ArchR logging to : ArchRLogs\ArchR-addClusters-28e87e1c6324-Date-2020-11-20_Time-03-10-43.log
If there is an issue, please report to github with logFile!
2020-11-20 03:10:44 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.006 mins elapsed.
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
Number of nodes: 10251
Number of edges: 499370
Running Louvain algorithm...
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8573
Number of communities: 12
Elapsed time: 1 seconds
2020-11-20 03:11:16 : Testing Outlier Clusters, 0.549 mins elapsed.
2020-11-20 03:11:16 : Assigning Cluster Names to 12 Clusters, 0.549 mins elapsed.
2020-11-20 03:11:17 : Finished addClusters, 0.551 mins elapsed.
We can take a look at the clustering results:
> head(projHeme2$Clusters)
[1] "C4" "C7" "C9" "C9" "C9" "C4"
Count how many cells fall into each cluster:
> table(projHeme2$Clusters)
  C1  C10  C11  C12   C2   C3   C4   C5   C6   C7   C8   C9
1479  436  306  383 1102  845 1168 1403  806 1268  705  350
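Because the cluster sizes vary widely, it can help to view them as percentages of all cells. The following is a minimal base-R sketch that copies the counts printed above into a hypothetical vector named `cluster_sizes`; with a real project one would presumably start from `table(projHeme2$Clusters)` instead.

```r
# Cluster sizes copied from the table() output above
# (cluster_sizes is a hypothetical variable name used only for illustration).
cluster_sizes <- c(C1 = 1479, C10 = 436, C11 = 306, C12 = 383,
                   C2 = 1102, C3 = 845, C4 = 1168, C5 = 1403,
                   C6 = 806, C7 = 1268, C8 = 705, C9 = 350)

# Total number of cells across all clusters.
total_cells <- sum(cluster_sizes)

# Percentage of cells in each cluster, largest cluster first.
cluster_pct <- sort(round(100 * cluster_sizes / total_cells, 1), decreasing = TRUE)

print(total_cells)
print(cluster_pct)
```

The total should agree with the "Number of nodes" line in the addClusters() log, which is a quick sanity check that no cells were dropped.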
To better understand which samples reside in which clusters, we can use the confusionMatrix() function to build a cluster confusion matrix across samples:
> cM <- confusionMatrix(paste0(projHeme2$Clusters), paste0(projHeme2$Sample))
> cM
12 x 3 sparse Matrix of class "dgCMatrix"
    scATAC_BMMC_R1 scATAC_CD34_BMMC_R1 scATAC_PBMC_R1
C4             352                 813              3
C7            1222                   .             46
C9             350                   .              .
C10            258                   4            174
C1            1448                   4             27
C5             139                1264              .
C3             189                 646             10
C8             133                   1            571
C11            152                 145              9
C6             254                   .            552
C12             93                 290              .
C2              99                   1           1002
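Conceptually, this matrix is a cross-tabulation of two per-cell label vectors. A minimal sketch of the same idea in base R with `table()`, using toy labels invented for illustration (an analogy only, not a description of how confusionMatrix() is implemented):

```r
# Toy per-cell labels, invented for illustration.
clusters <- c("C1", "C1", "C2", "C2", "C2", "C3")
samples  <- c("BMMC", "PBMC", "BMMC", "BMMC", "PBMC", "PBMC")

# Rows are clusters, columns are samples, entries are cell counts.
ct <- table(cluster = clusters, sample = samples)
print(ct)
```

Unlike the sparse matrix above, `table()` returns a dense table, so empty combinations show up as 0 rather than as a dot.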
Then plot this confusion matrix as a heatmap:
> library(pheatmap)
> cM <- cM / Matrix::rowSums(cM)
> p <- pheatmap::pheatmap(
    mat = as.matrix(cM),
    color = paletteContinuous("whiteBlue"),
    border_color = "black"
  )
> p
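The division by `Matrix::rowSums(cM)` rescales each row so that it sums to 1. The heatmap therefore shows, for every cluster, the fraction of its cells that comes from each sample rather than raw counts. A small sketch of that normalization on a plain base-R matrix, using three rows copied from the confusion matrix output above (the names `counts` and `frac` are illustrative):

```r
# Three rows copied from the confusion matrix output; "." entries are zeros.
counts <- matrix(
  c(352, 813, 3,
    1222, 0, 46,
    350, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("C4", "C7", "C9"),
    c("scATAC_BMMC_R1", "scATAC_CD34_BMMC_R1", "scATAC_PBMC_R1")
  )
)

# Dividing a matrix by a vector of length nrow recycles the vector down the
# columns, so every entry in row i is divided by the total of row i.
frac <- counts / rowSums(counts)
print(round(frac, 3))
```

Row normalization keeps large clusters from dominating the color scale, at the cost of hiding how many cells each cluster actually contains.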
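Sometimes the relative positions of cells in the two-dimensional embedding do not agree exactly with the identified clusters. More specifically, cells from a single cluster may appear in several distinct regions of the embedding. In that case, adjust the clustering parameters or the embedding parameters until the two agree.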
(2) Clustering with scran
A second clustering method is available by changing the method parameter of addClusters():
> projHeme2 <- addClusters(
    input = projHeme2,
    reducedDims = "IterativeLSI",
    method = "scran",
    name = "ScranClusters",
    k = 15
  )
ArchR logging to : ArchRLogs\ArchR-addClusters-2e10d2f4585-Date-2020-11-20_Time-03-47-21.log
If there is an issue, please report to github with logFile!
2020-11-20 03:47:22 : Running Scran SNN Graph (Lun et al. F1000Res. 2016), 0.017 mins elapsed.
2020-11-20 03:47:30 : Identifying Clusters (Lun et al. F1000Res. 2016), 0.152 mins elapsed.
2020-11-20 03:50:33 : Testing Outlier Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Assigning Cluster Names to 9 Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Finished addClusters, 3.201 mins elapsed.