LCBC单细胞教程_Dimensionality Reduction(3/7)

网上发现一个single-cell sequencing analysis教程，出自莱顿大学计算生物学中心(Leiden Computational Biology Center)，有代码、有数据、有注释，有讲解，真是入门单细胞的极佳材料。所谓“上士闻道，勤而习之”，那就赶紧学习吧！
源码链接: https://github.com/LeidenCBC/MGC-BioSB-SingleCellAnalysis2020

Dimensionality Reduction & Visualization

In this tutorial we will look at different ways to visualize single cell RNA-seq datasets using dimensionality reduction. We will apply the Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE) and Uniform Manifold Approximation and Projection (UMAP) algorithms. Further, we will look at different ways to plot the dimensionality reduced data and augment them with additional information, such as gene expression or meta-information.

Datasets

For this tutorial we will use a human pancreatic islet cell dataset part of a combination of datasets available here. For this tutorial the dataset is not of particular importance, as we will be looking at general dimensionality reduction methods, rather than specific biological findings. Therefore, we will also not look in detail on the loading and preprocessing part. The following Data Integration lab will work with the same datsets and cover this in more detail.

Data preprocessing

Load required packages. Here, we will use the Seurat functions for the dimensionality reduction. They are also available as pure R functions but Seurat nicely packages them for our data. We will use ggplot for creating the plots.

suppressWarnings(library(Seurat))
suppressWarnings(library(ggplot2))

Data loading and preprocessing

Loading and preprocessing is based on, and covered in more detail in the following Data Integration lab.

Load in expression matrix and metadata. The metadata file contains the technology (techcolumn) and cell type annotations (cell type column) for each cell in the four datasets. The data is assumed to be in the Data Integration folder. You can just change this to your preferred location.

pancreas.data <- readRDS(file = "../session4-integration/session-integration_files/pancreas_expression_matrix.rds")
metadata <- readRDS(file = "../session4-integration/session-integration_files/pancreas_metadata.rds")

Create a Seurat object. This will create a data object with the original matrix and the metadata combined.

table(metadata$celltype)

##             acinar activated_stellate              alpha               beta 
##                711                180               2281               1172 
##              delta             ductal        endothelial            epsilon 
##                405               1065                 61                 14 
##              gamma         macrophage               mast quiescent_stellate 
##                359                 24                 17                 20 
##            schwann 
##                 12

table(metadata$tech)

##     celseq    celseq2 fluidigmc1  smartseq2 
##       1004       2285        638       2394

pancreas <- CreateSeuratObject(pancreas.data, meta.data = metadata)

As there are four datasets, acquired with different technologies, we separate the individual datasets into new Seurat objects, according to the tech column of the metadata. We assign the fourth data object of the resulting split list to the smartseq2 object and free the other memory.

Note: from here on we only use the smartseq2 dataset to manage computation time. You can pick any of the above or the combined. smartseq2 is the largest of the separated, so if you want to speed up the workflow you can switch to the others. Check the pancreas.list for the individual sizes. Make sure you do not remove the objects you want to use.

# split the data by using the tech meta data
pancreas.list <- SplitObject(pancreas, split.by = "tech")

# you can use any subsets from the list. For the lab we continue with the smartseq2 data
# celseq <- pancreas.list[[1]]
# celseq2 <- pancreas.list[[2]]
# fluidigmc1 <- pancreas.list[[3]]
smartseq2 <- pancreas.list[[4]]

# Free up the memory
remove(pancreas.list)
remove(pancreas)
remove(pancreas.data)
gc()

##            used  (Mb) gc trigger   (Mb) limit (Mb)  max used   (Mb)
## Ncells  2342625 125.2    4662090  249.0         NA   3253781  173.8
## Vcells 48756554 372.0  205028951 1564.3      24576 213171546 1626.4

We perform standard preprocessing (log-normalization), identify variable features based on a variance stabilizing transformation vst, and scale the integrated data.

# Normalize and find variable features
smartseq2 <- NormalizeData(smartseq2, verbose = FALSE)
smartseq2 <- FindVariableFeatures(smartseq2, selection.method = "vst", nfeatures = 2000, verbose = FALSE)

# Run the standard workflow for visualization and clustering
smartseq2 <- ScaleData(smartseq2, verbose = FALSE)

Dimensionality Reduction

From here on, we will have a look at the different dimensionality reduction methods and their parameterizations.

PCA

We start with PCA. Seurat provides the RunPCA function, we use smartseq2 as input data. npcs refers to the number of prinicpal components to compute. We set it to 100. This will take a bit longer to compute but will allow use to explore the differences using different numbers of PCs below. By assigning the result to our smartseq2 data object it will be available in the object with the default name pca.

smartseq2 <- RunPCA(smartseq2, npcs = 100, verbose = FALSE)

We can now plot the first two components using the DimPlot function. The first argument here is the Seurat data object smartseq2. By providing the reduction = "pca" argument, DimPlot looks in the object for the PCA we created and assigned above.

DimPlot(smartseq2, reduction = "pca")

image.png

This plot shows some structure of the data. Every dot is a cell, but we do not know which cells belong to which dot, etc. We can add meta information by the group.by parameter. We use the celltype that we have read from the metadata object.

DimPlot(smartseq2, reduction = "pca", group.by = "celltype")

image.png

We can see that only a few of the labelled cell-types separate well but many are clumped together on the bottom of the plot.

Let’s have a look at the PCs to understand a bit better how PCA separates the data. Using the DimHeatmap function, we can plot the expression of the top genes for each PC for a number of cells. We will plot the first six components dims = 1:6 for 500 random cells cells = 500. Each component will produce one heatmap, the cells will be the columns in the heatmap and the top genes for each component the rows.

DimHeatmap(smartseq2, dims = 1:6, cells = 500, balanced = TRUE)

image.png

As discussed in the lecture, and indicated above, PCA is not optimal for visualization, but can be very helpful in reducing the complexity before applying non-linear dimensionality reduction methods. For that, let’s have a look how many PCs actually cover the main variation. A very simple, fast-to-compute way is simply looking at the standard deviation per PC. We use the ElbowPlot.

ElbowPlot(smartseq2, ndims = 100)

image.png

As we can see there is a very steep drop in standard deviation within the first 20 or so PCs indicating that we will likely be able to use roughly that number of PCs as input to follwing computations with little impact on the results.

t-SNE

Let’s try out t-SNE. Seurat by default uses the Barnes Hut (BH) SNE implementation.

Similar to the PCA Seurat provies a convenient function to run t-SNE called RunTSNE. We provide the smartseq2 data object as parameter. By default RunTSNE will look for and use the PCA we created above as input, we can also force it with reduction = "pca". Again we use DimPlot to plot the result, this time using reduction = "tsne" to indicate that we want to plot the t-SNE computation. We create two plots, the first without and the second with the cell-types used for grouping. Already without the coloring, we can see much more structure in the plot than in the PCA plot. With the color overlay we see that most cell-types are nicely separated in the plot.

smartseq2 <- RunTSNE(smartseq2, reduction = "pca")

DimPlot(smartseq2, reduction = "tsne")

image.png

DimPlot(smartseq2, reduction = "tsne", group.by = "celltype")

image.png

Above we did not specify the number of PCs to use as input. Let’s have a look what happens with different numbers of PCs as input. We simply run RunTSNE multiple times with dimsdefining a range of PCs. Note every run overwrites the tsne object nested in the smartseq2object. Therefore we plot the tsne directly after each run and store all plots in an object. We add + NoLegend() + ggtitle("n PCs") to remove the list of cell types for compactness and add a title.

# PC_1 to PC_5
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:5)
p1 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("5 PCs")

# PC_1 to PC_10
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:10)
p2 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("10 PCs")

# PC_1 to PC_30
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30)
p3 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("30 PCs")

# PC_1 to PC_100
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:100)
p4 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("100 PCs")

p1 + p2 + p3 + p4

image.png

Looking at these plots, it seems RunTSNE by default only uses 5 PCs (the plot is identical to the plot with default parameters above), but we get much clearer separation of clusters using 10 or even 30 PCs. This is not surprising considering the plot above of the standard deviation within the PCs above. Therefore we will use 30 PCs in the following, by explicitly setting dims = 1:30. Note that t-SNE is slower with more input dimensions (here the PCs), so it is good to find a middleground between capturing as much variation as possible with as few PCs as possible. However, using the default 5 does clearly not produce optimal results for this dataset. When using t-SNE with PCA preprocessing with your own data, always check how many PCs you need to cover the variance, as done above.

t-SNE has a few hyper-parameters that can be tuned for better visualization. There is an excellent tutorial. The main parameter is the perplexity, basically indicating how many neighbors to look at. We will run different perplexities to see the effect. As we will see, a perplexity of 30 is the default. This value often works well, again it might be advisable to test different values with other data. Note, higher perplexity values make t-SNE slower to compute.

# Perplexity 3
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, perplexity = 3)
p1 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("30PCs, Perplexity 3")

# Perplexity 10
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, perplexity = 10)
p2 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("30PCs, Perplexity 10")

# Perplexity 30
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, perplexity = 30)
p3 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("30PCs, Perplexity 30")

# Perplexity 200
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, perplexity = 200)
p4 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("30PCs, Perplexity 200")

p1 + p2 + p3 + p4

image.png

Another important parameter is the number of iterations. t-SNE gradually optimizes the low-dimensional space. The more iterations the more there is to optimize. We will run different numbers of iterations to see the effect. As we will see, 1000 iterations is the default. This value often works well, again it might be advisable to test different values with other data. Especially for larger datasets you will need more iterations. Note, more iterations make t-SNE slower to compute.

# 100 iterations
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, max_iter = 100)
p1 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("100 iterations")

# 500 iterations
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, max_iter = 500)
p2 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("500 iterations")

# 1000 iterations
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, max_iter = 1000)
p3 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("1000 iterations")

# 2000 iterations
smartseq2 <- RunTSNE(smartseq2, reduction = "pca", dims = 1:30, max_iter = 2000)
p4 <- DimPlot(smartseq2, reduction = "tsne", group.by = "celltype") + NoLegend() + ggtitle("2000 iterations")

p1 + p2 + p3 + p4

image.png

As we see after 100 iterations the main structure becomes apparent, but there is very little detail. 500 to 2000 iterations all look very similar, with 500 still a bit more loose than 1000, indicating that the optimization converges somewhere between 500 and 1000 iterations. In this case running 2000 would definitely not necessary.

UMAP

We have seen t-SNE and it’s main parameters. Let’s have a look at UMAP. Its main function call is very similar to t-SNE and PCA and called RunUMAP. Again, by default it looks for the PCA in the smartseq2 data object, but we have to provide it with the number of PCs (or dims) to use. Here, we use 30. As expected, the plot looks rather similar to the t-SNE plot, with more compact clusters.

smartseq2 <- RunUMAP(smartseq2, dims = 1:30, verbose = FALSE)

DimPlot(smartseq2, reduction = "umap", group.by = "celltype")

image.png

Just like t-SNE UMAP has a bunch of parameters. In fact, Seurat exposes quite a few more than for t-SNE. We will look at the most important in the following. While there is no site that systematically and interactively explores them as for t-SNE there is a comparison with the same datasets available.

Again, we start with a different number of PCs. Similar to t-SNE, 5 is clearly not enough, 30 provides decent separation and detail. Just like for t-SNE, test this parameter to match your own data in real-world experiments.

# PC_1 to PC_5
smartseq2 <- RunUMAP(smartseq2, dims = 1:5, verbose = FALSE)
p1 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("5 PCs")

# PC_1 to PC_10
smartseq2 <- RunUMAP(smartseq2, dims = 1:10, verbose = FALSE)
p2 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("10 PCs")

# PC_1 to PC_30
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, verbose = FALSE)
p3 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("30 PCs")

# PC_1 to PC_100
smartseq2 <- RunUMAP(smartseq2, dims = 1:100, verbose = FALSE)
p4 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("100 PCs")

p1 + p2 + p3 + p4

image.png

The n.neighbors parameter sets the number of neighbors to consider for UMAP. This parameter is similar to the perplexity in t-SNE. We try a similar range of values for comparison. The results are quite similar to t-SNE. With low values, clearly the structures are too spread out, but quickly the embeddings become quite stable. The default vaule for RunUMAP is 30. Similar to t-SNE this value is quite general. Again, it’s always a good idea to run some tests with new data to find a good value.

# 3 Neighbors
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.neighbors = 3, verbose = FALSE)
p1 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("3 Neighbors")

# 10 Neighbors
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.neighbors = 10, verbose = FALSE)
p2 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("10 Neighbors")

# 30 Neighbors
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.neighbors = 30, verbose = FALSE)
p3 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("30 Neighbors")

# 200 Neighbors
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.neighbors = 200, verbose = FALSE)
p4 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("200 Neighbors")

p1 + p2 + p3 + p4

image.png

Another parameter is min.dist. There is no directly comparable parameter in t-SNE. Generally, min.dist defines the compactness of the final embedding. The default value is 0.3.

# Min Distance 0.01
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, min.dist = 0.01, verbose = FALSE)
p1 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Min Dist 0.01")

# Min Distance 0.1
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, min.dist = 0.1, verbose = FALSE)
p2 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Min Dist 0.1")

# Min Distance 0.3
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, min.dist = 0.3, verbose = FALSE)
p3 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Min Dist 0.3")

# Min Distance 1.0
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, min.dist = 1.0, verbose = FALSE)
p4 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Min Dist 1.0")

p1 + p2 + p3 + p4

image.png

n.epochs is comparable to the number of iterations in t-SNE. Typically, UMAP needs fewer of these to converge, but also changes more when it is run longer. The default is 500. Again, this should be adjusted to your data.

# 10 epochs
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.epochs = 10, verbose = FALSE)
p1 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("10 Epochs")

# 100 epochs
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.epochs = 100, verbose = FALSE)
p2 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("100 Epochs")

# 500 epochs
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.epochs = 500, verbose = FALSE)
p3 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("500 Epochs")

# 1000 epochs
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, n.epochs = 1000, verbose = FALSE)
p4 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("1000 Epochs")

p1 + p2 + p3 + p4

image.png

Finally RunUMAP allows to set the distance metric for the high-dimensional space. While in principle this is also possible with t-SNE, RunTSNE, and most other implementations, do not provide this option. The four posibilities, Euclidean, cosine, manhattan, and hammingdistance are shown below. Cosine distance is the default (RunTSNE uses Euclidean distances).

There is not necessarily a clear winner. Hamming distances perform worse but they are usually used for different data, such as text as they ignore the numerical difference for a given comparison. Going with the default cosine is definitely not a bad choice in most applications.

# Euclidean distance
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, metric = "euclidean", verbose = FALSE)
p1 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Euclidean")

# Cosine distance
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, metric = "cosine", verbose = FALSE)
p2 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Cosine")

# Manhattan distance
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, metric = "manhattan", verbose = FALSE)
p3 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Manhattan")

# Hamming distance
smartseq2 <- RunUMAP(smartseq2, dims = 1:30, metric = "hamming", verbose = FALSE)
p4 <- DimPlot(smartseq2, reduction = "umap", group.by = "celltype") + NoLegend() + ggtitle("Hamming")

p1 + p2 + p3 + p4

image.png

Visualization

Finally, let’s have a brief look at some visualization options. We have already use color for grouping. With label = TRUE we can add a text-label to each group and with repel = TRUEwe can make sure those labels don’t clump together. Finally, pt.size = 0.5 changes the size of the dots used in the plot.

#Re-run a t-SNE so we do not rely on changes above
smartseq2 <- RunTSNE(smartseq2, dims = 1:30)

DimPlot(smartseq2, reduction = "tsne", group.by = "celltype", label = TRUE, repel = TRUE, pt.size = 0.5) + NoLegend()

image.png

Another property we might want to look at in our dimensionality reduction plot is the expression of individual genes. We can overlay gene expression as color using the FeaturePlot. Here, we first find the top two features correlated to the first and second PCs and combine them into a single vector which will be the parameter for the FeaturePlot.

Finally, we call FeaturePlot with the smartseq2 data object, features = topFeaturesPCuses the extracted feature vector to create one plot for each feature in the list and lastly, reduction = "pca" will create PCA plots.

Not surprisingly, the top two features of the first PC form a smooth gradient on the PC_1 axis and the top two features of the second PC a smooth gradient on the PC_2 axis.

# find top genes for PCs 1 and 2
topFeaturesPC1 <- TopFeatures(object = smartseq2[["pca"]], nfeatures = 2, dim = 1)
topFeaturesPC2 <- TopFeatures(object = smartseq2[["pca"]], nfeatures = 2, dim = 2)

# combine the genes into a single vector
topFeaturesPC <- c(topFeaturesPC1, topFeaturesPC2)
print(topFeaturesPC)

## [1] "LGALS3" "IFITM3" "COL1A2" "SPARC"

# feature plot with the defined genes
FeaturePlot(smartseq2, features = topFeaturesPC, reduction = "pca")

image.png

Let’s have a look at the same features on a t-SNE plot. The behavior here is quite different, with the high expression being very localized to specific clusters in the maps. Again, not surprising as t-SNE uses these

FeaturePlot(smartseq2, features = topFeaturesPC, reduction = "tsne")

image.png

It is clear that the top PCs are fundamental to form the clusters, so let’s have a look at more PCs and pick the top gene per PC for a few more PCs.

for(i in 1:6) {
  topFeaturesPC[[i]] <- TopFeatures(object = smartseq2[["pca"]], nfeatures = 1, dim = i)
}
print(topFeaturesPC)

## [1] "IFITM3" "SPARC"  "CTRB1"  "PLVAP"  "HADH"   "NUSAP1"

FeaturePlot(smartseq2, features = topFeaturesPC, reduction = "tsne")

image.png

Finally, let’s create a more interactive plot. First we create a regular FeaturePlot, here with just one gene features = "TM4SF4". Instead of plotting it directly we save the plot in the interactivePlot variable.

With HoverLocator we can then embed this plot into an interactive version. The information = FetchData(smartseq2, vars = c("celltype", topFeaturesPC)) creates a set of properties that will be shown on hover over each point. In this case, we show the cell-type from the meta information.

The result is a plot that allows us to inspect single cells in detail.

interactivePlot <- FeaturePlot(smartseq2, reduction = "tsne", features = "TM4SF4")
HoverLocator(plot = interactivePlot, information = FetchData(smartseq2, vars = c("celltype", topFeaturesPC)))

image.png

最后编辑于：2021.03.20 02:37:44

人面猴
序言：七十年代末，一起剥皮案震惊了整个滨河市，随后出现的几起案子，更是在滨河造成了极大的恐慌，老刑警刘岩，带你破解...
沈念sama阅读 211,376评论 6赞 491
死咒
序言：滨河连续发生了三起死亡事件，死亡现场离奇诡异，居然都是意外死亡，警方通过查阅死者的电脑和手机，发现死者居然都...
沈念sama阅读 90,126评论 2赞 385
救了他两次的神仙让他今天三更去死
文/潘晓璐我一进店门，熙熙楼的掌柜王于贵愁眉苦脸地迎上来，“玉大人，你说我怎么就摊上这事。” “怎么了？”我有些...
开封第一讲书人阅读 156,966评论 0赞 347
道士缉凶录：失踪的卖姜人
文/不坏的土叔我叫张陵，是天一观的道长。经常有香客问我，道长，这世上最难降的妖魔是什么？我笑而不...
开封第一讲书人阅读 56,432评论 1赞 283
港岛之恋（遗憾婚礼）
正文为了忘掉前任，我火速办了婚礼，结果婚礼上，老公的妹妹穿的比我还像新娘。我一直安慰自己，他们只是感情好，可当我...
茶点故事阅读 65,519评论 6赞 385
恶毒庶女顶嫁案：这布局不是一般人想出来的
文/花漫我一把揭开白布。她就那样静静地躺着，像睡着了一般。火红的嫁衣衬着肌肤如雪。梳的纹丝不乱的头发上，一...
开封第一讲书人阅读 49,792评论 1赞 290
城市分裂传说
那天，我揣着相机与录音，去河边找鬼。笑死，一个胖子当着我的面吹牛，可吹牛的内容都是我干的。我是一名探鬼主播，决...
沈念sama阅读 38,933评论 3赞 406
双鸳鸯连环套：你想象不到人心有多黑
文/苍兰香墨我猛地睁开眼，长吁一口气：“原来是场噩梦啊……” “哼！你这毒妇竟也来了？” 一声冷哼从身侧响起，我...
开封第一讲书人阅读 37,701评论 0赞 266
万荣杀人案实录
序言：老挝万荣一对情侣失踪，失踪者是张志新（化名）和其女友刘颖，没想到半个月后，有当地人在树林里发现了一具尸体，经...
沈念sama阅读 44,143评论 1赞 303
护林员之死
正文独居荒郊野岭守林人离奇死亡，尸身上长有42处带血的脓包…… 初始之章·张勋以下内容为张勋视角年9月15日...
茶点故事阅读 36,488评论 2赞 327
白月光启示录
正文我和宋清朗相恋三年，在试婚纱的时候发现自己被绿了。大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
茶点故事阅读 38,626评论 1赞 340
活死人
序言：一个原本活蹦乱跳的男人离奇死亡，死状恐怖，灵堂内的尸体忽然破棺而出，到底是诈尸还是另有隐情，我是刑警宁泽，带...
沈念sama阅读 34,292评论 4赞 329
日本核电站爆炸内幕
正文年R本政府宣布，位于F岛的核电站，受9级特大地震影响，放射性物质发生泄漏。R本人自食恶果不足惜，却给世界环境...
茶点故事阅读 39,896评论 3赞 313
男人毒药：我在死后第九天来索命
文/蒙蒙一、第九天我趴在偏房一处隐蔽的房顶上张望。院中可真热闹，春花似锦、人声如沸。这庄子的主人今日做“春日...
开封第一讲书人阅读 30,742评论 0赞 21
一桩弑父案，背后竟有这般阴谋
文/苍兰香墨我抬头看了看天上的太阳。三九已至，却和暖如春，着一层夹袄步出监牢的瞬间，已是汗流浃背。一阵脚步声响...
开封第一讲书人阅读 31,977评论 1赞 265
情欲美人皮
我被黑心中介骗来泰国打工，没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留，地道东北人。一个月前我还...
沈念sama阅读 46,324评论 2赞 360
代替公主和亲
正文我出身青楼，却偏偏与公主长得像，于是被迫代替她去往敌国和亲。传闻我的和亲对象是个残疾皇子，可洞房花烛夜当晚...
茶点故事阅读 43,494评论 2赞 348

LCBC单细胞教程_Dimensionality Reduction(3/7)

Dimensionality Reduction & Visualization

Datasets

Data preprocessing

Data loading and preprocessing

Dimensionality Reduction

PCA

t-SNE

UMAP

Visualization

推荐阅读更多精彩内容