通过空间行为（optimal transport）推断空间细胞间通讯信号方向(COMMOT)

作者，Evil Genius

周四了，变懒了，重申一遍，不要抄袭。

分享文章Screening cell-cell communication in spatial transcriptomics via collective optimal transport, 2023年1月发表于nature methods。

方法链接在COMMOT

网页示例在COMMOT — commot 0.0.3 documentation

非常棒的推断空间通讯的方法。

单细胞推断细胞通讯的缺点:

non-spatial studies often contain significant false positives given that CCC takes place only within limited spatial distances that are not measured in scRNA-seq datasets

目前空间的个性化分析软件

CellPhoneDB v3 restricts interactions to cell clusters in the same microenvironment defined based on spatial information;

stLearn relates the co-expression of ligand and receptor genes to the spatial diversity of cell types;

SVCA and MISTy use probabilistic and machine learning models, respectively, to identify the spatially constrained intercellular gene–gene interactions;

NCEM fits a function to relate cell type and spatial context to gene expression。

空间推断通讯的不足
current methods examine CCC locally and on cell pairs independently, and focus on information between cells or in the neighborhoods of individual cells. As a result,
collective or global information in CCC, such as competition between cells, is neglected.

COMMOT软件特点
a package that infers CCC by simultaneously considering numerous ligand–receptor pairs for either spatial transcriptomics data or spatially annotated scRNA-seq data equipped with spatial distances between cells estimated from paired spatial imaging data; summarizes and compares directions of spatial signaling; identifies downstream effects of CCC on gene expressions using ensemble of trees models; and provides visualization utilities for the various analyses.

软件总览----COMMOT

配体和受体通常在有限的空间范围内与多种复合物相互作用。考虑到这一点，作者提出了具有三个重要特征的collective optimal transport：首先，the use of non-probability mass distributions to control the marginals of the transport plan to maintain comparability between species(需要一点数学背景知识);其次，对CCC实施空间距离约束，以避免连接空间上相距较远的细胞;最后，将多种配体分布结合到多中受体分布以解释多种相互作用。

空间距离通讯

由于缺乏配体和受体蛋白的空间共定位测量，对空间数据的CCC推断方法的直接验证是困难的。在这里，建立了PDE模型来模拟空间中的CCC。COMMOT模拟了不同数量的配体和受体以及不同的竞争模式，从生成的合成数据中准确地重建了CCC连接。comot强化空间限制和不需要概率分布的特点在其他真实的空间转录组数据集上得到了进一步的说明（文章示例）。
对于每一对配体-受体和每一对细胞或spot，CCC推理量化了由一个spot贡献给另一个spot的配体-受体复合物的配体。然后，进行了几个下游分析:首先，空间信号方向的补充分析和CCC区域之间差异的识别;二是在空间聚类层面上对CCC进行归纳与分组;最后，对受CCC影响的下游基因进行鉴定。空间信号方向是通过将每个细胞的CCC矩阵补充到一个向量场来确定接收或发送信号的方向。对于下游分析，首先识别与接收到的信号有差异表达的基因，然后通过结合机器学习模型(使用接收到的信号和其他相关基因预测目标基因水平)来量化CCC对这些基因的影响。

文章给到的分析示例1（human epidermal development）

文章给到的分析示例2（高精度的空间数据分析，例如华大平台）

文章给到的分析示例3（10X空间转录组数据）

最后，我们来分享代码

安装

pip install commot

加载并载入数据，python版本，与scanpy无缝衔接

import commot as ct
import scanpy as sc
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

加载空间数据，大家有自己的数据就读取自身数据

adata = sc.datasets.visium_sge(sample_id='V1_Mouse_Brain_Sagittal_Posterior')
adata.var_names_make_unique()

基础处理

sc.pp.normalize_total(adata, inplace=True)
sc.pp.log1p(adata)

Specify ligand-receptor pairs

Here, we use an example with only three LR pairs.
A user-defined LR database can be specified in the same way or alternatively, built-in LR databases can be obtained with the function commot.pp.ligand_receptor_database().
For example df_ligrec=ct.pp.ligand_receptor_database(database='CellChat', species='mouse').

LR=np.array([['Tgfb1', 'Tgfbr1_Tgfbr2', 'Tgfb_pathway'],
    ['Tgfb2', 'Tgfbr1_Tgfbr2', 'Tgfb_pathway'],
    ['Tgfb3', 'Tgfbr1_Tgfbr2', 'Tgfb_pathway'],
    ['Fgf1', 'Fgfr1', 'FGF_pathway'],
    ['Fgf1', 'Fgfr2', 'FGF_pathway']],dtype=str)
df_ligrec = pd.DataFrame(data=LR)

构建通讯网络

Use optimal transport to construct CCC networks for the ligand-receptor pairs with a spatial distance constraint of 500um (coupling between cells with distance greater than 500 is prohibited). 这里的距离限制在500um以内。
For example, the spot-by-spot matrix for the pair Tgfb1 (ligand) and Tgfbr1_Tgfbr2 (receptor)is stored in adata.obsp['commot-user_database-Tgfb1-Tgfbr1_Tgfbr2']. The total sent or received signal for each pair is stored in adata.obsm['commot-user_database-sum-sender'] and adata.obsm['commot-user_database-sum-receiver'].

ct.tl.spatial_communication(adata,database_name='user_database', df_ligrec=df_ligrec, dis_thr=500, heteromeric=True, pathway_sum=True)

可视化

#Plot the amount of sent and received signal, for example, of the LR pair Fgf1-Fgfr1.
pts = adata.obsm['spatial']
s = adata.obsm['commot-user_database-sum-sender']['s-Fgf1-Fgfr1']
r = adata.obsm['commot-user_database-sum-receiver']['r-Fgf1-Fgfr1']
fig, ax = plt.subplots(1,2, figsize=(10,4))
ax[0].scatter(pts[:,0], pts[:,1], c=s, s=5, cmap='Blues')
ax[0].set_title('Sender')
ax[1].scatter(pts[:,0], pts[:,1], c=r, s=5, cmap='Reds')
ax[1].set_title('Receiver')

可视化信号方向

ct.tl.communication_direction(adata, database_name='user_database', lr_pair=('Fgf1','Fgfr1'), k=5)
ct.pl.plot_cell_communication(adata, database_name='user_database', lr_pair=('Fgf1','Fgfr1'), plot_method='grid', background_legend=True,
    scale=0.00003, ndsize=8, grid_density=0.4, summary='sender', background='image', clustering='leiden', cmap='Alphabet',
    normalize_v = True, normalize_v_quantile=0.995)

或者

ct.pl.plot_cell_communication(adata, database_name='user_database', lr_pair=('Fgf1','Fgfr1'), plot_method='grid', background_legend=True,
    scale=0.00003, ndsize=8, grid_density=0.4, summary='receiver', background='summary', clustering='leiden', cmap='Reds',
    normalize_v = True, normalize_v_quantile=0.995)

示例2，通路角度，前面处理一致

import os
import gc
import ot
import pickle
import anndata
import scanpy as sc
import pandas as pd
import numpy as np
from scipy import sparse
from scipy.stats import spearmanr, pearsonr
from scipy.spatial import distance_matrix
import matplotlib.pyplot as plt

import commot as ct
adata = sc.datasets.visium_sge(sample_id='V1_Mouse_Brain_Sagittal_Posterior')
adata.var_names_make_unique()
adata.raw = adata
sc.pp.normalize_total(adata, inplace=True)
sc.pp.log1p(adata)
adata_dis500 = adata.copy()
sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)
adata = adata[:, adata.var.highly_variable]
sc.tl.pca(adata, svd_solver='arpack')
sc.pp.neighbors(adata, n_neighbors=10, n_pcs=40)
sc.tl.umap(adata)
sc.tl.leiden(adata, resolution=0.4)
sc.pl.spatial(adata, color='leiden')

空间通讯推断

use the CellChatDB ligand-receptor database . Only the secreted signaling LR pairs will be used.

df_cellchat = ct.pp.ligand_receptor_database(species='mouse', signaling_type='Secreted Signaling', database='CellChat')
print(df_cellchat.shape)
# filter the LR pairs to keep only the pairs with both ligand and receptor expressed in at least 5% of the spots
df_cellchat_filtered = ct.pp.filter_lr_database(df_cellchat, adata_dis500, min_cell_pct=0.05)
print(df_cellchat_filtered.shape)
print(df_cellchat_filtered.head())
       0              1     2                   3
0  Tgfb1  Tgfbr1_Tgfbr2  TGFb  Secreted Signaling
1  Tgfb2  Tgfbr1_Tgfbr2  TGFb  Secreted Signaling
2  Tgfb3  Tgfbr1_Tgfbr2  TGFb  Secreted Signaling
3  Tgfb1  Acvr1b_Tgfbr2  TGFb  Secreted Signaling
4  Tgfb1  Acvr1c_Tgfbr2  TGFb  Secreted Signaling

Now perform spatial communication inference for these 250 ligand-receptor pairs with a spatial distance limit of 500. CellChat database considers heteromeric units. The signaling results are stored as spot-by-spot matrices in the obsp slots. For example, the score for spot i signaling to spot j through the LR pair can be retrieved from adata_dis500.obsp['commot-cellchat-Wnt4-Fzd4_Lrp6'][i,j].

ct.tl.spatial_communication(adata_dis500,
    database_name='cellchat', df_ligrec=df_cellchat_filtered, dis_thr=500, heteromeric=True, pathway_sum=True)

#adata_dis500.write("./adata.h5ad")

Determine the spatial direction of a signaling pathway, for example, the PSAP pathway. The interpolated signaling directions for where the signals are sent by the spots and where the signals received by the spots are from are stored in adata_dis500.obsm['commot_sender_vf-cellchat-PSAP'] and adata_dis500.obsm['commot_receiver_vf-cellchat-PSAP'], respectively.

ct.tl.communication_direction(adata_dis500, database_name='cellchat', pathway_name='PSAP', k=5)
ct.pl.plot_cell_communication(adata_dis500, database_name='cellchat', pathway_name='PSAP', plot_method='grid', background_legend=True,
    scale=0.00003, ndsize=8, grid_density=0.4, summary='sender', background='image', clustering='leiden', cmap='Alphabet',
    normalize_v = True, normalize_v_quantile=0.995)

Downstream analysis

识别信号差异基因,还是以PSAP通路为例

Now we further examine what genes are likely differentially expressed with respect to the inferred signaling activity. tradeSeq will be used to model the DE genes. This analysis is similar to finding temporal DE genes just with the pseudotime variable replaced by the amount of received signal. Count data is needed in adata_dis500.layers['counts']

adata_dis500 = sc.read_h5ad("./adata.h5ad")
adata = sc.datasets.visium_sge(sample_id='V1_Mouse_Brain_Sagittal_Posterior')
adata_dis500.layers['counts'] = adata.X

Look for genes that are differentially expressed with respect to PSAP signaling.

df_deg, df_yhat = ct.tl.communication_deg_detection(adata_dis500,
    database_name = 'cellchat', pathway_name='PSAP', summary = 'receiver')

import pickle
deg_result = {"df_deg": df_deg, "df_yhat": df_yhat}
with open('./deg_PSAP.pkl', 'wb') as handle:
    pickle.dump(deg_result, handle, protocol=pickle.HIGHEST_PROTOCOL)

Cluster the downstream genes and visualize the expression trends with respect to increased level of received signal through PSAP pathway

with open("./deg_PSAP.pkl", 'rb') as file:
    deg_result = pickle.load(file)
df_deg_clus, df_yhat_clus = ct.tl.communication_deg_clustering(df_deg, df_yhat, deg_clustering_res=0.4)
top_de_genes_PSAP = ct.pl.plot_communication_dependent_genes(df_deg_clus, df_yhat_clus, top_ngene_per_cluster=5,
    filename='./heatmap_deg_PSAP.pdf', font_scale=1.2, return_genes=True)

Plot some example signaling DE genes

X_sc = adata_dis500.obsm['spatial']
fig, ax = plt.subplots(1,3, figsize=(15,4))
colors = adata_dis500.obsm['commot-cellchat-sum-receiver']['r-PSAP'].values
idx = np.argsort(colors)
ax[0].scatter(X_sc[idx,0],X_sc[idx,1], c=colors[idx], cmap='coolwarm', s=10)
colors = adata_dis500[:,'Ctxn1'].X.toarray().flatten()
idx = np.argsort(colors)
ax[1].scatter(X_sc[idx,0],X_sc[idx,1], c=colors[idx], cmap='coolwarm', s=10)
colors = adata_dis500[:,'Gpr37'].X.toarray().flatten()
idx = np.argsort(colors)
ax[2].scatter(X_sc[idx,0],X_sc[idx,1], c=colors[idx], cmap='coolwarm', s=10)
ax[0].set_title('Amount of received signal')
ax[1].set_title('An example negative DE gene (Ctxn1)')
ax[2].set_title('An example positive DE gene (Gpr37)')

非常棒的空间信号流推断的方法，生活很好，有你更好，抄袭可耻。

最后编辑于：2023.02.23 16:01:37

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 218,386评论 6赞 506
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 93,142评论 3赞 394
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 164,704评论 0赞 353
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 58,702评论 1赞 294
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 67,716评论 6赞 392
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 51,573评论 1赞 305
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 40,314评论 3赞 418
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 39,230评论 0赞 276
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 45,680评论 1赞 314
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 37,873评论 3赞 336
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 39,991评论 1赞 348
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 35,706评论 5赞 346
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 41,329评论 3赞 330
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 31,910评论 0赞 22
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 33,038评论 1赞 270
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 48,158评论 3赞 370
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 44,941评论 2赞 355