1. Load the data
library(Seurat)
library(SeuratData)
library(patchwork)
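# load() only restores files written with save(); if this file was written with
# saveRDS(), use immune.combined <- readRDS("../immune.combined.rds") instead
load(file = "../immune.combined.rds")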
> immune.combined
An object of class Seurat
14826 features across 13999 samples within 2 assays
Active assay: RNA (14053 features, 0 variable features)
1 other assay present: integrated
2 dimensional reductions calculated: pca, umap
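The load() call above only restores objects that were written with save(); a file written with saveRDS() has to be read with readRDS(). The sketch below shows the difference on a small stand-in list rather than a real Seurat object. The names toy, tmp.rds and tmp.rdata are made up for illustration.

```r
# Stand-in object (hypothetical); a Seurat object is saved and restored the same way
toy <- list(name = "immune.combined", n.cells = 13999)

# saveRDS() stores one object without its name; readRDS() returns it,
# so the caller decides which variable to assign it to
tmp.rds <- tempfile(fileext = ".rds")
saveRDS(toy, file = tmp.rds)
restored <- readRDS(tmp.rds)

# save() stores objects together with their names; load() recreates those
# names in the target environment and returns them as a character vector
tmp.rdata <- tempfile(fileext = ".RData")
save(toy, file = tmp.rdata)
env <- new.env()
loaded.names <- load(tmp.rdata, envir = env)

identical(restored, toy)
loaded.names
```

If load() fails on a .rds file with a "bad restore file magic number" style error, the file was most likely written with saveRDS().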
2. Visualization
p1 <- DimPlot(immune.combined, reduction = "umap", group.by = "stim")
p2 <- DimPlot(immune.combined, reduction = "umap", label = TRUE, repel = TRUE)
p3 <- DimPlot(immune.combined, reduction = "umap", split.by = "stim")
p1 + p2
p1
p2
p3
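3. Identify conserved cell type markers
To find canonical cell type marker genes that are conserved across conditions, the FindConservedMarkers() function performs differential gene expression testing for each dataset/group and combines the p-values using meta-analysis methods from the MetaDE R package. For example, we can calculate the genes that are conserved markers in cluster 6 (NK cells) irrespective of stimulation condition.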
# For performing differential expression after integration, we switch back to the original data
DefaultAssay(immune.combined) <- "RNA"
nk.markers <- FindConservedMarkers(immune.combined, ident.1 = 6, grouping.var = "stim", verbose = FALSE)
head(nk.markers)
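To make the idea of combining per-condition p-values concrete, here is a minimal base R sketch of Tippett's minimum-p method next to the more conservative maximum p-value. The p-values are made-up numbers, the helper combine_minimum_p() is hypothetical, and whether this is the exact method your Seurat version applies is an assumption, so check the meta.method argument and the column names of nk.markers.

```r
# Made-up per-condition p-values for three genes (illustration only)
p.ctrl <- c(GNLY = 1e-50, NKG7 = 1e-40, ACTB = 0.30)
p.stim <- c(GNLY = 1e-45, NKG7 = 0.20, ACTB = 0.50)
p.mat <- cbind(CTRL = p.ctrl, STIM = p.stim)

# Tippett's method: under the null, the minimum of k independent p-values
# has CDF 1 - (1 - p)^k. expm1/log1p keep very small p-values from
# rounding to zero.
combine_minimum_p <- function(p) {
  k <- length(p)
  -expm1(k * log1p(-min(p)))
}

min.p.combined <- apply(p.mat, 1, combine_minimum_p)

# The largest per-condition p-value is small only if the gene is
# significant in every condition
max.p <- apply(p.mat, 1, max)

data.frame(p.mat, min.p.combined, max.p)
```

In this toy table NKG7 looks significant by the minimum-p statistic although it is not significant in the STIM group, which is why the maximum p-value is the more natural filter when a marker has to hold in every condition.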
We can explore these marker genes for each cluster and use them to annotate our clusters as specific cell types.
FeaturePlot(immune.combined, features = c("CD3D", "SELL", "CREM", "CD8A", "GNLY", "CD79A", "FCGR3A", "CCL2", "PPBP"), min.cutoff = "q9")
immune.combined <- RenameIdents(immune.combined, "0" = "CD14 Mono", "1" = "CD4 Naive T", "2" = "CD4 Memory T", "3" = "CD16 Mono", "4" = "B", "5" = "CD8 T", "6" = "NK" , "7" = "T activated", "8" = "DC", "9" = "B Activated", "10" = "Mk", "11" = "pDC", "12" = "Eryth", "13" = "Mono/Mk Doublets", "14" = "HSPC")
DimPlot(immune.combined, label = TRUE)
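The DotPlot() function with the split.by parameter can be used to view conserved cell type markers across conditions, showing both the expression level and the percentage of cells in a cluster expressing any given gene. Here we plot 2-3 strong marker genes for each of the 14 clusters.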
Idents(immune.combined) <- factor(Idents(immune.combined), levels = c("HSPC", "Mono/Mk Doublets", "pDC", "Eryth","Mk", "DC", "CD14 Mono", "CD16 Mono", "B Activated", "B", "CD8 T", "NK", "T activated", "CD4 Naive T", "CD4 Memory T"))
markers.to.plot <- c("CD3D","CREM","HSPH1","SELL","GIMAP5","CACYBP","GNLY","NKG7","CCL5","CD8A","MS4A1","CD79A","MIR155HG","NME1","FCGR3A","VMO1","CCL2","S100A9","HLA-DQA1","GPR183","PPBP","GNG11","HBA2","HBB","TSPAN13","IL3RA","IGJ","PRSS57")
DotPlot(immune.combined, features = markers.to.plot, cols = c('blue', 'red'), dot.scale = 8, split.by = "stim") + RotatedAxis()
library(ggplot2)
plot <- DotPlot(immune.combined, features = markers.to.plot, cols = c('blue', 'red'), dot.scale = 6, split.by = "stim") + RotatedAxis()
ggsave(filename = "../xx.jpg", height = 7, width = 12, plot = plot, quality = 50)
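The two quantities a dot plot encodes, the percentage of expressing cells (dot size) and the average expression (dot colour), can be sketched in base R for a single gene. The expression values and labels below are toy data, and any extra scaling that DotPlot() applies to the averages is not reproduced here.

```r
# Toy expression of one gene in 8 cells (made-up values)
expr    <- c(0, 2, 3, 0, 0, 1, 4, 4)
cluster <- c("NK", "NK", "NK", "NK", "B", "B", "B", "B")
stim    <- c("CTRL", "CTRL", "STIM", "STIM", "CTRL", "CTRL", "STIM", "STIM")

# Splitting by condition amounts to grouping on cluster and condition together
group <- paste(cluster, stim, sep = "_")

# Dot size: percentage of cells in each group with non-zero expression
pct.exp <- tapply(expr > 0, group, mean) * 100

# Dot colour: mean expression in each group
avg.exp <- tapply(expr, group, mean)

data.frame(pct.exp, avg.exp)
```

A marker that is conserved across conditions shows similar dot size and colour in the CTRL and STIM rows of the same cluster.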
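4. Identify differentially expressed genes across conditions
One way to run a comparative analysis and look at the differences induced by stimulation is to plot the average expression of stimulated and control cells and look for genes that are visual outliers on a scatter plot. Here, we take the average expression of the stimulated and control naive T cell and CD14 monocyte populations and generate scatter plots, highlighting genes that respond strongly to interferon stimulation.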
library(ggplot2)
library(cowplot)
theme_set(theme_cowplot())
t.cells <- subset(immune.combined, idents = "CD4 Naive T")
Idents(t.cells) <- "stim"
avg.t.cells <- as.data.frame(log1p(AverageExpression(t.cells, verbose = FALSE)$RNA))
avg.t.cells$gene <- rownames(avg.t.cells)
cd14.mono <- subset(immune.combined, idents = "CD14 Mono")
Idents(cd14.mono) <- "stim"
avg.cd14.mono <- as.data.frame(log1p(AverageExpression(cd14.mono, verbose = FALSE)$RNA))
avg.cd14.mono$gene <- rownames(avg.cd14.mono)
genes.to.label <- c("ISG15", "LY6E", "IFI6", "ISG20", "MX1", "IFIT2", "IFIT1", "CXCL10", "CCL8")
p1 <- ggplot(avg.t.cells, aes(CTRL, STIM)) + geom_point() + ggtitle("CD4 Naive T Cells")
p1 <- LabelPoints(plot = p1, points = genes.to.label, repel = TRUE)
p2 <- ggplot(avg.cd14.mono, aes(CTRL, STIM)) + geom_point() + ggtitle("CD14 Monocytes")
p2 <- LabelPoints(plot = p2, points = genes.to.label, repel = TRUE)
p1 + p2
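As the plots show, many of the same genes are upregulated in both cell types and likely represent a conserved interferon response pathway.
Besides identifying cell types that are shared across conditions, we can also ask which genes change between conditions within cells of the same type.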
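Outliers on the scatter plot can also be picked out numerically. The sketch below averages a toy expression matrix per condition, applies log1p as in the code above, and ranks genes by their distance from the diagonal. The matrix values are made up, and treating them as non-log expression values is an assumption.

```r
# Toy expression matrix: 3 genes x 4 cells (made-up values)
expr <- rbind(
  ISG15 = c(0, 1, 9, 11),
  ACTB  = c(5, 5, 5, 5),
  CD3D  = c(2, 4, 3, 3)
)
colnames(expr) <- paste0("cell", 1:4)
stim <- c("CTRL", "CTRL", "STIM", "STIM")

# Average each gene within each condition, then log1p-transform
avg <- sapply(split(seq_along(stim), stim),
              function(idx) rowMeans(expr[, idx, drop = FALSE]))
avg <- as.data.frame(log1p(avg))

# Distance from the CTRL = STIM diagonal; large absolute values are the
# genes that stand out visually on the scatter plot
avg$diff <- avg$STIM - avg$CTRL
avg <- avg[order(abs(avg$diff), decreasing = TRUE), ]
top.gene <- rownames(avg)[1]
avg
```

Ranking by this difference is only a screening aid; it carries no measure of significance.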
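First, we create a column in meta.data that holds both the cell type and the stimulation information, and switch the current identity to that column. Then we use FindMarkers() to find the genes that differ between stimulated and control B cells. Note that many of the top genes shown here are the same as the core interferon response genes seen earlier.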
immune.combined$celltype.stim <- paste(Idents(immune.combined), immune.combined$stim, sep = "_")
immune.combined$celltype <- Idents(immune.combined)
Idents(immune.combined) <- "celltype.stim"
b.interferon.response <- FindMarkers(immune.combined, ident.1 = "B_STIM", ident.2 = "B_CTRL", verbose = FALSE)
head(b.interferon.response, n = 15)
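As a rough picture of what a two-group comparison such as B_STIM versus B_CTRL involves, the sketch below runs a Wilcoxon rank sum test per gene on toy data and applies a Bonferroni correction. The expression values and the helper test_gene() are made up, and the simple mean difference stands in for the log fold change that FindMarkers() reports.

```r
# Toy expression: 2 genes x 10 cells, 5 control and 5 stimulated (made-up values)
expr <- rbind(
  ISG15 = c(0.1, 0.3, 0.2, 0.0, 0.4, 3.1, 2.8, 3.5, 2.9, 3.3),
  ACTB  = c(2.0, 2.2, 1.9, 2.1, 2.3, 2.05, 1.95, 2.15, 1.85, 2.35)
)
group <- rep(c("B_CTRL", "B_STIM"), each = 5)

# Hypothetical helper: test one gene, stimulated versus control
test_gene <- function(x) {
  stim.vals <- x[group == "B_STIM"]
  ctrl.vals <- x[group == "B_CTRL"]
  c(p_val = wilcox.test(stim.vals, ctrl.vals)$p.value,
    mean_diff = mean(stim.vals) - mean(ctrl.vals))
}

res <- as.data.frame(t(apply(expr, 1, test_gene)))

# Bonferroni correction over the genes tested here (only 2 in this toy case)
res$p_val_adj <- p.adjust(res$p_val, method = "bonferroni")
res <- res[order(res$p_val), ]
res
```

With five cells per group the smallest achievable two-sided p-value is 2/252, so real comparisons need far more cells to reach the tiny p-values seen in the FindMarkers() output.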
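Another useful way to visualize these changes in gene expression is the split.by option of the FeaturePlot() or VlnPlot() function. This displays feature plots for a given list of genes, split by a grouping variable (here, the stimulation condition).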
FeaturePlot(immune.combined, features = c("CD3D", "GNLY", "IFI6"), split.by = "stim", max.cutoff = 3, cols = c("grey", "red"))
plots <- VlnPlot(immune.combined, features = c("LYZ", "ISG15", "CXCL10"), split.by = "stim", group.by = "celltype", pt.size = 0, combine = FALSE)
wrap_plots(plots = plots, ncol = 1)