10X单细胞(10X空间转录组)之配受体联合TF因子的通讯分析(cellcall)

hello,大家好,今天我们来分享一个更加重要的通讯方法,cellcall,规则很简单,就是我们在对单细胞进行通讯分析的时候,通讯到底有没有起到作用???之前的我分享的文章10X单细胞通讯分析之scMLnet(配受体与TF,差异基因(靶基因)网络通讯分析)10X单细胞(10X空间转录组)通讯分析之NicheNet,都提到过这个问题,今天,我们就要把配受体和TF因子联合起来分析,得到真正起到通讯作用的配受体,真正了解细胞之间的交流。文章在CellCall: integrating paired ligand–receptor and transcription factor activities for cell–cell communication,2021年8月2号发表于Nucleic Acids Research,影响因子16分。

Overview of CellCall

CellCall 是一个通过整合细胞内和细胞间信号来推断细胞间通讯网络和内部调节信号的工具。
特点
  • 1、 CellCall collects ligand-receptor-transcript factor (L-R-TF) axis datasets based on KEGG pathways(最大的亮点)。

  • 2、 According to prior knowledge of L-R-TF interactions, CellCall infers intercellular communication by combining the expression of ligands/receptors and downstream TF activities for certain L-R pairs. (就是要拿到真正起到作用的配受体对)。

  • 3、CellCall embeds a pathway activity analysis method to identify the crucial pathways involved in communications between certain cell types. (这一点也很重要,上升到通路的高度)。

  • 4、CellCall offers a rich suite of visualization options (Circos plot, Sankey plot, bubble plot, ridge plot, etc.) to intuitively present the analysis results.(可视化,最后的呈现方式)。

The overview figure of CellCall is shown as follows.

图片.png
ligand-receptor-transcript factor (L-R-TF)的分析模式,非常重要,比原来的方法进了一大步。

Main functions of CellCall

CellCall provides a variety of functions including intercellular communication analysis, pathway activity analysis and a rich suite of visualization tools to intuitively present the results of the analysis (including Heatmap, Circos plot, Bubble plot, Sankey plot, TF enrichment plot and Ridge plot).

首先看看Intercellular communication analysis

加载数据(说白了就是矩阵,seurat的分析结果rds完全匹配)
The format of the input file is as follow table:
    1. The row names: gene symbols.
    1. The column names: cell IDs. The colnames can't contain punctuation such as commas, periods, dashes, etc. Using underline to connect barcoder and cell type is recommended. Take the input format below as an example, the column name is made up of index and cell type. Users should set names.field=2 and names.delim="_" in the function CreateNichConObject(). After that, cell type information is obtained and stored in the S4 object for later analysis. Because method in this paper depends on the cell type information, obtaining celltype information correctly is important.
    1. Other place: the expression values (counts or TPM) for a gene in a cell.
图片.png
注意这里的行名,写上了细胞类型。
Create object
 mt <- CreateNichConObject(data=data, min.feature = 3,
                            names.field = 2,
                            names.delim = "_",
                            source = "TPM",
                            scale.factor = 10^6,
                            Org = "Homo sapiens",
                            project = "Microenvironment")
Arguments Detail
data A dataframe with row of gene and column of sample and the value must be numeric. Meanwhile what matters is that the colnames of dataframe should be in line with the paramter 'names.delim' and 'names.field', the former for pattern to splite every colnames, the latter for setting which index in splited colnames is cell type information. The function can get the 'CELLTYPE' information from the colnames 'BARCODE_CLUSTER_CELLTYPE' with names.delim="_" and names.field='3', and then stored in slot meta.data of CreateNichCon. Cell type annotation from every cell is essential for scoring cell communication. If the colnames of data don't coincide with the paramter 'names.delim' and 'names.field', CreateNichCon object may fail to create.
min.feature Include cells where enough features equalling min.feature are detected. It's a preprocess which is the same as Seurat and set min.feature=0, if you don't want to filter cell. This parameter depends on the sequencing technology of the input data.
names.delim Set the pattern to splite column name into vector. If the column name of the input matrix is BARCODE_CLUSTER_CELLTYPE, set names.delim="_" to get CELLTYPE of BARCODE_CLUSTER_CELLTYPE with names.field=3.
names.field Set the index of column name vector which is splited by parameter names.delim to get cell type information. If the column name of the input matrix is BARCODE_CLUSTER_CELLTYPE, set names.field=3 to get CELLTYPE of BARCODE_CLUSTER_CELLTYPE with names.delim="_".
source The type of expression dataframe, eg "UMI", "fullLength", "TPM", or "CPM". When the source of input data is "TPM" or "CPM", no transformation on the data. Otherwise, we transform the data to TPM with the parameter source="fullLength" and to CPM with source="UMI".
scale.factor Sets the scale factor for cell-level normalization, default "10^6", if the parameter is "UMI" or "fullLength". Otherwise this parameter doesn't work.
Org Set the species source of gene, eg "Homo sapiens", "Mus musculus". This parameter decides the paired ligand-receptor dataset and the transcript length which is needed in "TPM" transformation.
project Sets the project name for the CreateNichCon object.

Infer the cell-cell communication score

The communication score of an L-R interaction between cell types is evaluated by integrating the L2- norm of the L-R interaction and the activity score of the downstream TFs.
mt <- TransCommuProfile(object = mt,
                        pValueCor = 0.05,
                        CorValue = 0.1,
                        topTargetCor=1,
                        p.adjust = 0.05,
                        use.type="median",
                        probs = 0.9,
                        method="weighted",
                        IS_core = TRUE,
                        Org = 'Homo sapiens')

参数分析

Arguments Detail
object A Cellcall S4 object, the result of function CreateNichConObject().
pValueCor Set the threshold of spearson Correlation significance between target gene and TF, ( significance < pValueCor, default is 0.05 ).
CorValue Set the threshold of spearson Correlation Coefficient between target gene and TF, ( Coefficient > CorValue, default is 0.1 ).
topTargetCor Set the rank of candidate genes which has firlter by spearson Correlation, default is 1, that means 100% filtered candidate genes will be used.
p.adjust Set the threshold of regulons's GSEA pValue which adjusted by Benjamini & Hochberg, default is 0.05.
use.type With parameter "median", CellCall set the mean value of gene as zero, when the percentile of gene expression in one celltype below the parameter "probs". The other choice is "mean" and means that we not concern about the percentile of gene expression in one celltype but directly use the mean value.
probs Set the percentile of gene expression in one celltype to represent mean value, when use.type="median".
method Choose the proper method to score downstream activation of all regulons of given ligand-receptor relation. Candidate values are "weighted", "max", "mean", of which "weighted" is default.
Org Choose the dataset source of this project, eg "Homo sapiens", "Mus musculus".
IS_core Logical variable, whether use core reference LR data with high confidence or include extended datasets on the basis of core reference.

Pathway activity analysis

CellCall embeds a pathway activity analysis method to help explore the main pathways involved in communication between certain cells.
n <- mt@data$expr_l_r_log2_scale

pathway.hyper.list <- lapply(colnames(n), function(i){
    print(i)
    tmp <- getHyperPathway(data = n, object = mt, cella_cellb = i, Org="Homo sapiens")
    return(tmp)
})
Arguments Detail
data A dataframe of communication score where row name is ligand-receptor and column names is cellA-cellB, stored in the data$expr_l_r_log2_scale slot of S4 object.
object A Cellcall S4 object, the result of function CreateNichConObject() and TransCommuProfile().
cella_cellb If explore the pathway enriched by paired ligand-receptor dataset between sender cellA and receiver cellB, user can set cella_cellb="A-B".
Org Choose the dataset source of this project, eg "Homo sapiens", "Mus musculus".
IS_core Logical variable, whether use core reference LR data with high confidence or include extended datasets on the basis of core reference.
For pathway activity analysis, Bubble plot is adopted to present the analysis results. Function of getForBubble is used to merge the data and plotBubble is used to draw the bubble plot.
myPub.df <- getForBubble(pathway.hyper.list, cella_cellb=colnames(n))
p <- plotBubble(myPub.df)
图片.png

最后的可视化Visualization

Circle plot
  cell_color <- data.frame(color=c('#e31a1c','#1f78b4',
                                   '#e78ac3','#ff7f00'), stringsAsFactors = FALSE)
  rownames(cell_color) <- c("SSC", "SPGing", "SPGed", "ST")
Plotting circle with CellCall object:
ViewInterCircos(object = mt, font = 2, cellColor = cell_color, 
                lrColor = c("#F16B6F", "#84B1ED"),
                arr.type = "big.arrow",arr.length = 0.04,
                trackhight1 = 0.05, slot="expr_l_r_log2_scale",
                linkcolor.from.sender = TRUE,
                linkcolor = NULL, gap.degree = 2,
                order.vector=c('ST', "SSC", "SPGing", "SPGed"),
                trackhight2 = 0.032, track.margin2 = c(0.01,0.12), DIY = FALSE)
图片.png
Plotting circle with DIY dataframe of mt@data$expr_l_r_log2_scale:
  ViewInterCircos(object = mt@data$expr_l_r_log2_scale, font = 2, 
                  cellColor = cell_color,
                  lrColor = c("#F16B6F", "#84B1ED"),
                  arr.type = "big.arrow",arr.length = 0.04,
                  trackhight1 = 0.05, slot="expr_l_r_log2_scale",
                  linkcolor.from.sender = TRUE,
                  linkcolor = NULL, gap.degree = 2,
                  order.vector=c('ST', "SSC", "SPGing", "SPGed"),
                  trackhight2 = 0.032, track.margin2 = c(0.01,0.12), DIY = T)
图片.png
图片.png
Pheatmap plot
viewPheatmap(object = mt, slot="expr_l_r_log2_scale", show_rownames = T,
             show_colnames = T,treeheight_row=0, treeheight_col=10,
             cluster_rows = T,cluster_cols = F,fontsize = 12,angle_col = "45",  
             main="score")
图片.png
图片.png

Sankey plot

  mt <- LR2TF(object = mt, sender_cell="ST", recevier_cell="SSC",
              slot="expr_l_r_log2_scale", org="Homo sapiens")
  head(mt@reductions$sankey)
 if(!require(networkD3)){
        BiocManager::install("networkD3")
  }
  
  sank <- LRT.Dimplot(mt, fontSize = 8, nodeWidth = 30, height = NULL, width = 1200,         
                      sinksRight=FALSE, DIY.color = FALSE)
  networkD3::saveNetwork(sank, "~/ST-SSC_full.html")
图片.png

The first pillar is ligand,the second pillar is receptor and the last pillar is TF. And the color of left and right flow are consistent with ligand and TF respectively.

图片.png

The color depends on ligand and receptor (by function sankey_graph with isGrandSon = FALSE)

library(magrittr)
library(dplyr)
tmp <- mt@reductions$sankey
tmp1 <- dplyr::filter(tmp, weight1>0) ## filter triple relation with weight1 (LR score)
tmp.df <- trans2tripleScore(tmp1)  ## transform weight1 and weight2 to one value (weight)
head(tmp.df)

## set the color of node in sankey graph
mycol.vector = c('#5d62b5','#29c3be','#f2726f','#62b58f','#bc95df', '#67cdf2', '#ffc533', '#5d62b5', '#29c3be')  
elments.num <-  tmp.df %>% unlist %>% unique %>% length()
mycol.vector.list <- rep(mycol.vector, times=ceiling(elments.num/length(mycol.vector)))
sankey_graph(df = tmp.df, axes=1:3, mycol = mycol.vector.list[1:elments.num], nudge_x = NULL, font.size = 4, boder.col="white", isGrandSon = F)

The first pillar is ligand,the second pillar is receptor and the last pillar is TF. And the color of left and right flow are consistent with ligand and receptor respectively.

图片.png

The color depends on ligand (by function sankey_graph with isGrandSon = TRUE)

library(magrittr)
library(dplyr)
tmp <- mt@reductions$sankey
tmp1 <- dplyr::filter(tmp, weight1>0)  ## filter triple relation with weight1 (LR score)
tmp.df <- trans2tripleScore(tmp1)  ## transform weight1 and weight2 to one value (weight)

## set the color of node in sankey graph
mycol.vector = c('#9e0142','#d53e4f','#f46d43','#fdae61','#fee08b','#e6f598','#abdda4','#66c2a5','#3288bd','#5e4fa2')
elments.num <-  length(unique(tmp.df$Ligand))
mycol.vector.list <- rep(mycol.vector, times=ceiling(elments.num/length(mycol.vector)))

sankey_graph(df = tmp.df, axes=1:3, mycol = mycol.vector.list[1:elments.num], 
             isGrandSon = TRUE, nudge_x = nudge_x, font.size = 2, boder.col="white",            
             set_alpha = 0.8)

The first pillar is ligand,the second pillar is receptor and the last pillar is TF. And the color of left and right flow are consistent with ligand.

图片.png
图片.png

TF enrichment plot

TF enrichment plot is adopted to present the TF activities in receiver cells.

Obtain the gene sets (TGs of TF):

For some biologists, they pay more attention to the TGs of the TF. This tool provide options to present or export the details of the TG list, stored in the NichConObject@datagsea.listcell_type@geneSets.

mt@data$gsea.list$SSC@geneSets

The figure presents a part of result stored in the NichConObject@datagsea.listcell_type@geneSets.

图片.png

Show all TFs in the SSC:

  ssc.tf <- names(mt@data$gsea.list$SSC@geneSets)
  ssc.tf
图片.png

Draw the TF enrichment plot:

getGSEAplot(gsea.list=mt@data$gsea.list, geneSetID=c("CREBBP", "ESR1", "FOXO3"), 
            myCelltype="SSC", fc.list=mt@data$fc.list,  
            selectedGeneID = mt@data$gsea.list$SSC@geneSets$CREBBP[1:10],
            mycol = NULL)
图片.png
图片.png

Ridge plot

Ridge plot is adopted to present FC distribution of TGs of activated TFs.

## gsea object
egmt <- mt@data$gsea.list$SSC

## filter TF
egmt.df <- data.frame(egmt)
head(egmt.df[,1:6])
flag.index <- which(egmt.df$p.adjust < 0.05)

ridgeplot.DIY(x=egmt, fill="p.adjust", showCategory=flag.index, core_enrichment = T,
                orderBy = "NES", decreasing = FALSE)

图片.png
图片.png

生活很好,有你更好

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
禁止转载,如需转载请通过简信或评论联系作者。
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 212,686评论 6 492
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 90,668评论 3 385
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 158,160评论 0 348
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 56,736评论 1 284
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 65,847评论 6 386
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 50,043评论 1 291
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 39,129评论 3 410
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 37,872评论 0 268
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 44,318评论 1 303
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 36,645评论 2 327
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 38,777评论 1 341
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 34,470评论 4 333
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 40,126评论 3 317
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 30,861评论 0 21
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,095评论 1 267
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 46,589评论 2 362
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 43,687评论 2 351

推荐阅读更多精彩内容