kaggle 2018 data science bowl 细胞核分割学习笔记

一、 获奖者解决方案

1. 第一名解决方案(Unet 0.631)

主要的贡献

  • targets: 预测touching borders,将问题作为instance分割
  • loss function:组合交叉熵跟soft dice loss,避免pixel imbalance问题
  • binary_crossentropy有类平衡问题,每个像素作为单独的一个来考虑。This makes predictions a bit fuzzy.
  • soft dice (and Jaccard) 对图像中的所有像素点计算,因此预测结果有较好的shapes并且not fuzzy。 不足是它们太confident了,probability is close to 0 or 1 even for wrong pixels.
  • 两者的组合是为了克服它们的不足。最简单的组合是:loss = binary_crossentropy + (1 - dice); 也可以根据自己的数据加权: loss = w1 * binary_crossentropy + w2 * (1 - dice)
  • very deep encoder-decoder architectures that also achieve state-of-the-art results in other binary segmentation problems (SpaceNet, Inria and others)
  • 后处理技巧:分水岭,morphological features and second level model with Gradient Boosted Trees
  • 基于任务的数据增广(Clahe, Sharpen, Emboss,高斯噪声,color2gray,模糊,中值模糊,高斯模糊,调整对比度和亮度,随机scale,roation and flips)。比较有效的两个增广方法
  • Channel shuffle - I guess this one was very important due to the nature of the data
  • Nucleus copying on images. That created a lot of overlapping nuclei. It seemed to help networks to learn better borders for overlapping nuclei.
    Whenever we reduced augmentations we got better validation results and worse score on public leaderboard.

Unet or Mask-RCNN? 我们也参加了之前的一些分割比赛,选择清晰了——Unet!

实验历程

  • 首先尝试最简单的方法并增加watershed line for binary masks(i.e. modified GT masks to always have gaps between nuclei),这个方法在public LB得到0.5分。

不是用的plain Unets,而是预训练的非常深的编码器(Resnet50不够)。That makes a huge difference when you don't have enough data, which is clearly the case。

使用预训练模型(只用来初始化,然后端到端训练)非常重要,如果直接训练一个plain Unet,结果会差很多。

Encoders were initialized with pretrained weights from ImageNet. Then we trained models end to end. From my experience with a frozen encoder it is usually not possible to achieve good segmentation results even on datasets that are more or less similar to ImageNet.

分水岭后处理:取两个阈值,high for seeds, low for masks (something like 0.6 and 0.3 for a binary mask).

  • 然后增加第二个带contours的通道,contours的宽度依赖于核的大小,这些masks + 简单的分水岭后处理得到0.525分,不是很大的突破,但是a hint to the right directino.

In my initial experiments, i actually tried Unet but i gave it up because i have problem in the post processing of unet outputs. Watershed is very sensitive to input marker and energy function (distance transform). That is why I switch to mask-rcnn.
@topcoders solution use unet to learn the "border of nuclei", which is made from watershed line of the ground truth. In this way they can control the accuracy of the watershed post processing.

create ground truth contours from the individual binary masks, then combine them in a single image channel. You can then use these as ground truth along with binary masks to teach U-Net. To output a second channel just increase the number of kernels in the final U-Net layer from 1 to 2, and make sure your inputs are consistent with this format.

contours是所有细胞核的轮廓吗?它的宽度跟细胞核大小怎么个相关法?具体怎么得到。A better approach is to use labels, dilation, watershed with watershed_line=True etc. Watershed line will be a border between the nuclei.

  • 观察结果发现,网络很容易的预测出non ambiguous地方的contours,因此我们决定仅仅预测细胞之间的边界。这很容易的让我们得到0.55分,只用了一个网络的结果+分水岭后处理。
  • 如果我们在一个通道有一个full mask,另一个通道有a border,有时候用于分水岭方法的种子点不足够好。一个更好的方法是change nuclei masks and make pixels empty on the borders. This also allows to use softmax as the target activation instead of sigmoid. 这可以较好的分离细胞核,但是actually decreases MaP because of high thresholds for IoU. 为解决这个问题,我们train additional networks trained on full masks and combined the results in the postprocessing step.
    最后的方法:
  • 2 channels full masks i.e. (mask, border)
  • 2 channels masks for networks with sigmoid activation i.e. (mask - border, border) or 3 channels masks for networks with softmax activation i.e. (mask - border, border , 1 - mask - border)

网络架构

使用Unet类型的编解码结构,编码部分在ImageNet上预训练。令人惊讶的是,类似于VGG16这样的简单编码器在这个数据集上不work,这让我们决定go deeper,As a result - top performing encoders in this competition were : DPN-92, Resnet-152, InceptionResnetV2, Resnet101.

2nd Level model / Postprocessing

在预测出来的nucleus candidates上面训练LightGBM模型。。。看不懂啥意思
lgb just trained as regressor to predict IoU. So, we have nucleus candidate after Neural Networks prediction. We extracting some features and predicting IoU.

在公开LB上,这个带自适应阈值和FP rejection的后处理提升了~15。

集成

后处理前平均所有预测出来的mask。

训练参数

  • RGB channels
  • Random Crops: 256x256
  • Random scale: [0.55, 1.45]
  • Batch Size: 16
  • optimizer: Adam
  • learning rate: initial 1e-4 with decay (we had different LR policies, but mostly small LR no more than 1e-4)
  • preprocessing: same as on ImageNet depending on network

Test Time Augmentations (TTA)

Standard Flips/Rotations(0,90,180,270).

讨论区集锦

  • 是否用dropout

普通的dropout通常对卷积层是有害的,但从经验来讲,SpatialDropout2D可以用于分割任务中并得到好一些的结果,I used SpatialDropout2D just before the classification layer。

  • 如何让编解码过程中维数匹配

I simply changed encoders and replaced valid padding with same padding. 常用的解码器有两种:1) U-Net like with standard upsampling-conv approach 2) a custom FPN like decoder 。两者的性能相当

i. 如何选择种子点

ii. 如何决定分水岭边界

iii. 如何决定地形高度

Deep Watershed Transform(DWT)方法帮助我们解决了上述一些问题。主要的思想是让CNN学习两件事情:unit vectors pointing to (against) boundaryenergy (mountain height) levels。实践中如果仅仅用分水岭,结果很可能会过分割(因为有噪声干扰,所以建立的landscape会有很多局部极小),这就是DWT的思想,让CNN学习mountain landscape。原始作者用2个VGG-like CNN来学习

  • The watershed energy itself (energy, elevation - whatever you may call it);
  • The unit vectors points to (away from) the borders - to help CNN learn the boundaries);

实践中不一定要用一个分离的CNN,仅仅用Unet并且 feed several masks to it(Unet):

  • The merged nuclei masks (gt_mask);
  • Several thinned / eroded merged nuclei masks (3 levels of eroson - 1 pixel, 3 pixels, 5 or 7 pixels);
  • Centers of the nuclei (did not help in my case);
  • Unit vectors (helped a little bit, locally);
  • Borders (helped a bit, locally);

对我来说, 最好的后处理方法 (energy_baseline function) 很简单:

  • Sum the predicted mask and 3 levels of eroded masks;
  • Apply a threshold of 0.4 to produce watershed seeds;
  • Flood the original thresholded masks from these markers;
  • Use distance transform as "landscape height" measure;

个人看来,我想把解决细胞核分割问题的方法分成4类:
i. Unet-like 架构+基于分水岭变换的后处理。
ii. Recurrent架构。I found only this more or less relevant paper (even followed by code release);
iii. 基于proposal的模型,比如Mask-RCNN。

2. 第三名解决方案(Mask-RCNN,0.614)

3. 第四名解决方案(Unet,0.610)

我们队伍评测了Unet和Mask-RCNN,结果是Unet效果显著的好于mask-Rcnn。
Deep Watershed Transform paper启发,对每个pixel,我们预测了the x,y components of vector pointing from the instance border和mask,watershed levels and nuclei centers using the second connected UNet.

For unet encoders we used Conv2d - BN - Relu - Conv2d - Relu Decoders: Upsample/concatenate - Conv2d - Relu

4. 第五名解决方案(Mask-RCNN,0.609)

二、优秀教程

  1. Keras U-Net starter tutorial by Kjetil Amdal-Saevik with 739 upvotes

Get the data

首先导入所有图像和对应的mask,下采样训练集图像和测试集图像来保持things light and manageable. 但是我们应该对测试集的原始图像大小记录下来,从而将predicted masks上采样到原始大小以及创建正确的run-length encoding。

There are definitely better ways to handle this, but it works fine for now! 更好的处理方法有哪些呢??

搭建并训练神经网络

搭建Unet,如果那小数据实验的话建议使用checkpointing and early stopping.

进行预测

在训练集、验证集(合理性检查)以及测试集上做预测。注意:如果使用了early stopping 和checkpointing,记得加载保存的性能最好的模型。

编码结果并提交

最终的结果是0.277 LB

  1. Teaching notebook for total imaging newbies tutorial by Stephen Bailey with 354 upvotes
    新手入门教程,教如何分析图像,并不是要得高分。

A. 处理颜色

数据集中的图像可能是RGB、RGBA以及灰度图像。为了简化处理,通过rgb2gray函数都转到灰度图像。

B. 去除背景

最简单的假设是图像中只用前景和背景两类,它们灰度服从bimodal distribution。如果能发现最好的分离阈值,我们可以mask out背景数据,仅仅处理剩下的前景目标。可以尝试采用大津阈值方法,因为它将图像建模为一个bimodal分布并且找到最优的分离值。

thresh_val = threshold_otsu(im_gray)
mask = np.where(im_gray > thresh_val, 1, 0)

C. Deriving individual masks for each object

我们需要对每个细胞核得到separate mask,遍历mask中所有的目标,通过ndimage.label给每个目标一个label。

from scipy import ndimage
labels, nlabels = ndimage.label(mask)
label_array = []
for label_num in range(1, nlabels+1):
     label_mask = np.where(labels== label_num, 1, 0)
     label_arrays.append(label_mask)

print('共计 {} 个分离的components 被检测到'.format(nlabels))

这样简单的处理有两个很直接的问题

  • 有一些像素点是孤立的(右上角)
  • 细胞重叠(中间靠右)
    [图片上传失败...(image-42cef7-1530348310622)]

利用ndimage.find_objects, 可以迭代的遍历mask,zooming in单个的细胞核并施加额外的后处理。

for label_ind, label_coords in enumerate(ndimage.find_objects(labels)):
    cell = im_gray[label_coords]
    
    # Check if the label size is too small
    if np.product(cell.shape) < 10: 
        print('Label {} is too small! Setting to 0.'.format(label_ind))
        mask = np.where(labels==label_ind+1, 0, mask)

对于adjacent 细胞(不同于重叠细胞),One thing we can do here is to see whether we can shrink the mask to "open up" the differences between the cells. This is called mask erosion. We can then re-dilate it to to recover the original proportions.

# Get the object indices, and perform a binary opening procedure
two_cell_indices = ndimage.find_objects(labels)[1]
cell_mask = mask[two_cell_indices]
cell_mask_opened = ndimage.binary_opening(cell_mask, iterations=8)

本次比赛的目标是细胞核检测,因此需要提交单个细胞核的结果。

  1. Nuclei overview to submission tutorial by Kevin Mader with 240 upvotes

A. the preprocessing steps to load the data

B. a quick visualization of the color-space

C. training a simple CNN

搭建一个简单的CNN网络,用扩张卷积

from keras.models import Sequential
from keras.layers import BatchNormalization, Conv2D, UpSampling2D, Lambda
simple_cnn = Sequential()
simple_cnn.add(BatchNormalization(input_shape = (None, None, IMG_CHANNELS), 
                                  name = 'NormalizeInput'))
simple_cnn.add(Conv2D(8, kernel_size = (3,3), padding = 'same'))
simple_cnn.add(Conv2D(8, kernel_size = (3,3), padding = 'same'))
# use dilations to get a slightly larger field of view
simple_cnn.add(Conv2D(16, kernel_size = (3,3), dilation_rate = 2, padding = 'same'))
simple_cnn.add(Conv2D(16, kernel_size = (3,3), dilation_rate = 2, padding = 'same'))
simple_cnn.add(Conv2D(32, kernel_size = (3,3), dilation_rate = 3, padding = 'same'))

# the final processing
simple_cnn.add(Conv2D(16, kernel_size = (1,1), padding = 'same'))
simple_cnn.add(Conv2D(1, kernel_size = (1,1), padding = 'same', activation = 'sigmoid'))
simple_cnn.summary()

Loss function:Dice

from keras import backend as K
smooth = 1.
def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)
def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)
simple_cnn.compile(optimizer = 'adam', 
                   loss = dice_coef_loss, 
                   metrics = [dice_coef, 'acc', 'mse'])

训练网络

def simple_gen():
    while True:
        for _, c_row in train_img_df.iterrows():
            yield np.expand_dims(c_row['images'],0), np.expand_dims(np.expand_dims(c_row['masks'],-1),0)

simple_cnn.fit_generator(simple_gen(), 
                         steps_per_epoch=train_img_df.shape[0],
                        epochs = 3)

D. applying the model to the test data

forked from Nuclei Overview to Submission by Kevin Mader (+0/–0)

优质讨论集锦

  1. UNet Vs Mask-rcnn, which is better?
    Unet!

  2. mask-rcnn in pytorch from scratch

  3. 是否需要padding?

  • 原始Unet作者使用VALID padding,不是SAME;
  • 比赛第一名Selim Seferbekov:认为VALID要比SAME padding弱很多。实际上,Unet的贡献不是no padding和cropping,而是skip connection。所以我认为VALID仅仅是作者的preference,不是thorough study的结果。使用zero padding有如下好处:保留空间分辨率,更容易的让编解码架构使用,可以被用于很深的网络以及提升了borders分割的性能。cs231n 助教Karpathy讲义中也提到,如果每次卷积都执行VALID padding,volume的大小会减小,进而导致边界处的信息被“washed away”的很快。
  • Karphathy在twitter上也说到:"Zero padding in ConvNets is highly suspicious/wrong. Input distribution stats are off on each border differently yet params are all shared"。Reflection or periodic or symetric padding has the benefit to not introduce new freqencies in the image signal. (Unet作者就是用的reflection 预处理)

其它

原始Unet作者使用VALID padding,不是SAME; 使用加权的cross entropy。对细胞间small gaps增加更多的权重。我们在预处理阶段对细胞做了腐蚀,确保每个细胞是分离开的,然后对最终的结果做膨胀。使用scale-augmentation使得网络对各种大小的细胞都有预测能力。没有任何集成,最终得分是0.504.
kaggle discussion

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 217,734评论 6 505
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 92,931评论 3 394
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 164,133评论 0 354
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 58,532评论 1 293
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 67,585评论 6 392
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 51,462评论 1 302
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 40,262评论 3 418
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 39,153评论 0 276
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 45,587评论 1 314
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 37,792评论 3 336
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 39,919评论 1 348
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 35,635评论 5 345
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 41,237评论 3 329
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 31,855评论 0 22
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 32,983评论 1 269
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 48,048评论 3 370
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 44,864评论 2 354

推荐阅读更多精彩内容

  • 最近学习和考虑孩子教育的时间特别的多,因为倍感对孩子教育的无力和老师一些举动的压力不得不想方设法通过自己能够知...
    王者归去昔阅读 1,174评论 0 0
  • 作为管理者应多研究人性,增强员工的幸福感,让她们感觉满足,这样会大大增加她们的忠诚度从创造更高的业绩。
    城市格调刘姣阅读 162评论 0 1
  • 风轻轻浅舞 鼻息处 温情的芬芳 暖暖熏满 花悄悄绽放 舒展心灵 昨夜的惆怅 片刻消散 且拥住一丝欣喜 推开繁杂岁月...
    梅蕊新说阅读 250评论 5 29
  • 我可以坚强 但是不敢勇敢 隐藏起所有的锋芒 是为了不再受伤 那张定格的美好 我至今仍有收藏 在不断流逝的岁月 我拼...
    瑾年盛夏阅读 273评论 1 3