Cluster Analysis in Integrated Data
clp
11 June, 2020
Preparation
Load the R objects saved in the previous lesson, along with the required R packages.
library(data.table)
library(ggplot2)
library(Seurat)
load('out/01_immune_combined.rd') # restores the immune.combined object
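Identifying conserved cell type markers
To identify canonical cell type marker genes that are conserved across conditions, Seurat provides the FindConservedMarkers function. It performs differential gene expression testing for each dataset/group and combines the p-values using meta-analysis methods from the MetaDE package. For example, we can compute the genes that are conserved markers of the cluster annotated as 'NK', irrespective of stimulation condition.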
DefaultAssay(immune.combined) <- "RNA"
Idents(immune.combined) <- "seurat_annotations"
message("This run will take 5+ min ...")
nk.markers <- FindConservedMarkers(immune.combined, ident.1 = "NK", grouping.var = "stim", verbose = FALSE) #default slot: 'data'
head(nk.markers)
#>        CTRL_p_val CTRL_avg_logFC CTRL_pct.1 CTRL_pct.2 CTRL_p_val_adj
#> GNLY            0       4.186117      0.943      0.046              0
#> NKG7            0       3.164712      0.953      0.085              0
#> GZMB            0       2.915692      0.839      0.044              0
#> CLIC3           0       2.407695      0.601      0.024              0
#> FGFBP2          0       2.241968      0.500      0.021              0
#> CTSW            0       2.088278      0.537      0.030              0
#>           STIM_p_val STIM_avg_logFC STIM_pct.1 STIM_pct.2 STIM_p_val_adj
#> GNLY    0.000000e+00       4.066429      0.956      0.059   0.000000e+00
#> NKG7    0.000000e+00       2.904602      0.950      0.081   0.000000e+00
#> GZMB    0.000000e+00       3.128167      0.897      0.060   0.000000e+00
#> CLIC3   0.000000e+00       2.460388      0.623      0.031   0.000000e+00
#> FGFBP2 1.674159e-159       1.485116      0.259      0.016  2.352696e-155
#> CTSW    0.000000e+00       2.175186      0.592      0.035   0.000000e+00
#>             max_pval minimump_p_val
#> GNLY    0.000000e+00              0
#> NKG7    0.000000e+00              0
#> GZMB    0.000000e+00              0
#> CLIC3   0.000000e+00              0
#> FGFBP2 1.674159e-159              0
#> CTSW    0.000000e+00              0
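To make the last two columns of the table more concrete, here is a minimal base-R sketch of two ways to combine per-condition p-values for one gene. It assumes that `max_pval` is the largest p-value across conditions and that `minimump_p_val` follows the minimum-p (Tippett/Wilkinson) rule, 1 - (1 - min(p))^k for k independent tests. The helper names `combine_max` and `combine_minimump` are hypothetical, and this is an illustration of the idea rather than Seurat's own code.

```r
# Per-condition p-values for FGFBP2, copied from the table above.
p_vals <- c(CTRL = 0, STIM = 1.674159e-159)

# Hypothetical helper: the largest p-value across conditions.
# A gene only looks "conserved" if even its worst condition is significant.
combine_max <- function(p) max(p)

# Hypothetical helper: minimum-p combination, assuming independent tests.
# With k tests, the chance that the smallest p-value is <= min(p) under
# the null hypothesis is 1 - (1 - min(p))^k.
combine_minimump <- function(p) {
  k <- length(p)
  1 - (1 - min(p))^k
}

max_pval <- combine_max(p_vals)
minimump <- combine_minimump(p_vals)
print(c(max_pval = max_pval, minimump_p_val = minimump))
```

Sorting or filtering on the maximum p-value is the stricter choice, because a gene that is a strong marker in only one condition will still receive a small minimum-p value.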
In addition, we can explore the following marker genes for each cell type to check that these clusters correspond to specific cell types.
marker_genes <- c("CD3D", "SELL", "CREM", "CD8A", "GNLY", "CD79A", "FCGR3A", "CCL2", "PPBP")
FeaturePlot(immune.combined, features = marker_genes, min.cutoff = "q9")
[Figure: FeaturePlot output for the nine marker genes]
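The `min.cutoff = "q9"` argument asks for a lower colour cutoff at the 9th percentile of each feature, so that values below it are all drawn with the lowest colour. The base-R sketch below shows that kind of quantile clipping on made-up values. `clip_at_quantile` is a hypothetical helper, and exactly which cells Seurat includes when it computes the quantile is not reproduced here.

```r
# Made-up expression values for one gene across 11 cells.
expr <- c(0.1, 0.4, 0.8, 1.5, 2.0, 2.6, 3.1, 3.9, 4.4, 5.0, 5.2)

# Hypothetical helper: raise every value below the q-th quantile
# up to that quantile, leaving larger values untouched.
clip_at_quantile <- function(x, q) {
  cutoff <- unname(quantile(x, probs = q))
  pmax(x, cutoff)
}

cutoff_q9 <- unname(quantile(expr, probs = 0.09))
clipped   <- clip_at_quantile(expr, 0.09)

# Only the values below the cutoff change.
print(data.frame(original = expr, clipped = clipped))
```

Clipping the low end in this way keeps a few near-zero cells from dominating the colour scale; `max.cutoff` does the same at the high end.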
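The DotPlot function with the split.by parameter is useful for viewing conserved cell type markers across conditions: it shows both the expression level and the percentage of cells in a cluster that express a given gene. Here we plot 2-3 strong marker genes for each of the 13 clusters obtained earlier.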
markers.to.plot <- c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5", "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9", "HLA-DQA1", "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ")
DotPlot(immune.combined,
        features = rev(markers.to.plot),
        cols = c("blue", "red"),
        dot.scale = 8,
        split.by = "stim") + RotatedAxis()
[Figure: DotPlot of the marker genes, split by stimulation condition]
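The two quantities a dot plot encodes can be sketched in base R on toy data: the percentage of cells in each cluster/condition group that express a gene (dot size) and the group's average expression (dot colour). All values and column names below are made up for illustration, and Seurat applies further transformation and scaling to the averages that this sketch does not reproduce.

```r
# Toy data: expression of one gene in 8 cells (made-up numbers).
cells <- data.frame(
  cluster = c("NK", "NK", "NK", "NK", "B", "B", "B", "B"),
  stim    = c("CTRL", "CTRL", "STIM", "STIM", "CTRL", "CTRL", "STIM", "STIM"),
  expr    = c(2.0, 0.0, 3.0, 1.0, 0.0, 0.0, 0.5, 0.0)
)

# One group per cluster/condition pair, similar to splitting by condition.
cells$group <- paste(cells$cluster, cells$stim, sep = "_")

# Percentage of cells in each group with non-zero expression (dot size).
pct_exp <- tapply(cells$expr, cells$group, function(x) 100 * mean(x > 0))

# Mean expression in each group (dot colour).
avg_exp <- tapply(cells$expr, cells$group, mean)

summary_table <- data.frame(
  group   = names(pct_exp),
  pct_exp = as.vector(pct_exp),
  avg_exp = as.vector(avg_exp)
)
print(summary_table)
```

A conserved marker shows up as a large, strongly coloured dot in both the CTRL and STIM rows of the same cluster.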
Save the R object for use in the next session:
save(immune.combined, file = 'out/02_immune_cons.rd',compress = TRUE)
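As a small aside on how `save()` and `load()` pair up, the sketch below round-trips a stand-in object through a temporary file. `toy_object` and the temporary path are made up for illustration. The point is that `load()` restores objects under the names they were saved with and returns those names, rather than returning the object itself.

```r
# A stand-in for a large analysis object such as immune.combined.
toy_object <- list(note = "stand-in for immune.combined")

# Write it to a temporary file, then remove it from the workspace.
rd_file <- tempfile(fileext = ".rd")
save(toy_object, file = rd_file, compress = TRUE)
rm(toy_object)

# load() recreates the object under its original name
# and returns a character vector of the names it restored.
restored <- load(rd_file)
print(restored)
print(toy_object$note)
```

Because the name is fixed at save time, it helps to note next to each `load()` call which object it brings back.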
Key points to understand at this stage
- When are conserved genes useful?
- In a Seurat object:
  - What is an assay?
  - What is a slot?
  - Why are there multiple slots?