Task: predict the probability that a patient will have lung cancer one year later, based on a CT lung scan.
Approach 1: (too naive)
1. Binary classification: for a given slice of a patient's scan, predict whether it contains a lung nodule.
2. Regress the bounding box (bbox) of the lung nodule for each slice predicted in step 1 to contain one.
3.1. For each slice with a nodule bbox, predict whether the nodule is benign or malignant, then combine all slices for a patient-level prediction,
or 3.2: stack all slices of a patient that contain a lung nodule into a 3D matrix and run a 3D CNN on it to predict benign vs. malignant (see the second paper below).
Possible way to predict: no cancer if the patient has no lung nodule or only benign nodules; cancer otherwise.
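A minimal sketch of the patient-level rule above. The function name and the "benign" / "malignant" string encoding are hypothetical, chosen only for illustration.

```python
def patient_has_cancer(nodule_labels):
    """Apply the rule: cancer only if at least one nodule is malignant.

    nodule_labels: list with one entry per detected nodule, each either
    "benign" or "malignant" (assumed encoding). An empty list means no
    nodule was detected for this patient.
    """
    return any(label == "malignant" for label in nodule_labels)


print(patient_has_cancer([]))                       # no nodule
print(patient_has_cancer(["benign", "benign"]))     # only benign nodules
print(patient_has_cancer(["benign", "malignant"]))  # at least one malignant
```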
Datasets:
1. http://ptak.felk.cvut.cz/Medical/Motol/LungTIME/ with bbox per slice
paper http://cmp.felk.cvut.cz/ftp/articles/dolejsi/Dolejsi-SPIE2009.pdf
2. dataset https://wiki.cancerimagingarchive.net/display/Public/SPIE-AAPM+Lung+CT+Challenge#00e86e1ad7f340728e6cec3b2b6edfa8
with only a point label per patient, so some processing is needed to extract the 3D region containing each nodule and to sample hard negatives for training
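One possible way to do that processing, sketched with NumPy. It assumes the point label has already been converted to voxel coordinates in (z, y, x) order. The helper names, cube size and distance threshold are hypothetical. The negatives here are drawn at random away from the nodule; truly hard negatives would more likely come from mining a trained detector's false positives.

```python
import numpy as np


def crop_cube(volume, center, size):
    """Crop a size x size x size cube around `center` (z, y, x).

    The volume is zero-padded first so cubes near the border keep
    the requested shape.
    """
    pad = size
    padded = np.pad(volume, pad, mode="constant", constant_values=0)
    z, y, x = (int(c) - size // 2 + pad for c in center)
    return padded[z:z + size, y:y + size, x:x + size]


def sample_negative_centers(shape, nodule_center, n, min_dist, rng):
    """Draw n random voxel centers at least min_dist away from the nodule.

    Note: loops forever if min_dist is too large for the volume.
    """
    nodule_center = np.asarray(nodule_center)
    centers = []
    while len(centers) < n:
        c = np.array([rng.integers(0, s) for s in shape])
        if np.linalg.norm(c - nodule_center) >= min_dist:
            centers.append(tuple(int(v) for v in c))
    return centers


rng = np.random.default_rng(0)
volume = rng.normal(size=(40, 64, 64))   # stand-in for a CT volume
nodule = (20, 30, 30)                    # assumed voxel-space point label
positive = crop_cube(volume, nodule, size=16)
negatives = [crop_cube(volume, c, size=16)
             for c in sample_negative_centers(volume.shape, nodule, 4, 20, rng)]
```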
paper https://e-reports-ext.llnl.gov/pdf/806183.pdf
Approach 2: (better)
1. Detect the lung nodules in each slice (hard, and needs labeled training data).
2. Train a classifier on features of the nodule volume and other parameters from the DICOM file.
Approach 3:
CNN + RNN + heavy data augmentation, trained end-to-end.
For every patient, select N slices (for example N = 200). If the patient has n < N slices, pad cyclically; if n > N, select N consecutive slices.
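The slice selection can be sketched as follows. The function name is hypothetical, and choosing the start of the consecutive window at random is an assumption (the notes only say "consecutive").

```python
import random


def select_slice_indices(n, N=200, rng=None):
    """Return N slice indices for a scan that has n slices.

    n < N : pad cyclically (0, 1, ..., n-1, 0, 1, ...).
    n >= N: take N consecutive slices from a random start (assumption).
    """
    if n <= 0:
        raise ValueError("scan must contain at least one slice")
    rng = rng or random
    if n < N:
        return [i % n for i in range(N)]
    start = rng.randint(0, n - N)   # randint is inclusive on both ends
    return list(range(start, start + N))


print(select_slice_indices(3, N=7))   # cyclic padding of a 3-slice scan
```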
Each of the 200 slices is passed through a CNN whose final fc layer outputs a vector of size fc_size = 100. These 200 vectors are fed sequentially into an RNN unrolled over 200 time steps (one per slice). The RNN output goes to an fc layer with 1 neuron, which gives the probability that the patient will have cancer one year later.
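To make the shapes concrete, here is a NumPy sketch of the RNN part only. The per-slice CNN is not implemented; random vectors stand in for its 100-dimensional outputs. The plain tanh recurrence, the hidden size H = 32 and the sigmoid on the final neuron are all assumptions, and in practice this would be built and trained in a deep learning framework.

```python
import numpy as np


def rnn_patient_probability(features, W_xh, W_hh, b_h, w_out, b_out):
    """Forward pass of a plain tanh RNN over per-slice feature vectors.

    features: (T, F) array, one F-dim CNN feature vector per slice.
    Returns a single probability for the patient.
    """
    h = np.zeros(W_hh.shape[0])
    for x_t in features:                      # one time step per slice
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)
    logit = w_out @ h + b_out                 # final fc layer, 1 neuron
    return 1.0 / (1.0 + np.exp(-logit))       # sigmoid -> probability


rng = np.random.default_rng(0)
T, F, H = 200, 100, 32                        # H is an assumed hidden size
features = rng.normal(size=(T, F))            # stand-in for CNN outputs
p = rnn_patient_probability(
    features,
    W_xh=rng.normal(scale=0.1, size=(H, F)),
    W_hh=rng.normal(scale=0.1, size=(H, H)),
    b_h=np.zeros(H),
    w_out=rng.normal(scale=0.1, size=H),
    b_out=0.0,
)
print(float(p))
```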
Because the training set is small (its size is the number of patients), heavy data augmentation is needed at several levels:
1) when generating the sequence of 200 slices
2) for the slices from 1): rotation, translation, flip, scaling, and so on.