Generative Modeling for Small-Data Object Detection (ICCV2019)

1. main info

  • ICCV 2019
  • task: small-data object detection
  • main idea: use a generative model

motivation: 1) generative models, e.g., GANs, are very successful; 2) how can they be useful for downstream tasks?

One such downstream task is object detection (OD), especially small-data OD, where labeled data is limited, e.g., medical images.

This paper uses generative models to improve performance on small-data object detection.

Two problems stand in the way:

  1. previous works on object insertion with generative models often need segmentation masks, which are often not available;
  2. GANs are designed to generate realistic images, but this objective may not align with the downstream task.

Thus, a new model, DetectorGAN, is proposed. DetectorGAN combines a detector and a GAN in a unified model.

In general, there are two branches after the generator: 1) a discriminator that produces the adversarial loss; 2) a detector that produces the detection loss. Both losses are used to train the model.

Typically, one difficulty is that the generator does not receive gradients from the detection loss, which defeats the goal of generating better images for OD. Thus, this paper bridges the gap between the generator and the detection loss.

main contribution:

  1. first to integrate a detector into a GAN;
  2. a novel unrolling method to bridge the generator and the detection loss;
  3. good results.

2. Related works

  1. image-to-image translation
  2. object insertion with GANs
  3. Integration of GANs and Classifiers
  4. Data Augmentation for Object Detection

3. DetectorGAN

main components: a generator, (multiple) discriminators, and a detector.

  • detector: gives feedback to the generator on whether the generated images are good for detection.

  • discriminators: improve the realism and interpretability of the generated images.

  • X: clean images without objects;

  • Y: labeled images with objects;

3.1 modules

1) Generators

  • G_X: takes X and a mask as input, outputs a synthetic image with the input background and an object inserted at the masked area.
  • G_Y: takes Y and the object mask as input, outputs an image with the indicated object removed.

The masks, which indicate plausible insertion locations, are also important to the results; how they are chosen depends on the target dataset.
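A minimal PyTorch-style sketch of the two generator interfaces. The channel-concatenation of image and mask, the layer choices, and single-channel images (e.g., X-rays) are my assumptions; the paper's actual architecture is not described in these notes.

```python
import torch
import torch.nn as nn

class MaskConditionedGenerator(nn.Module):
    """Placeholder generator: concatenates the image with a binary location
    mask and outputs an image of the same size. The same interface is used
    for G_X (insert an object at the masked area) and G_Y (remove the
    indicated object)."""
    def __init__(self, in_channels=1):  # single-channel images assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, in_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

G_X = MaskConditionedGenerator()  # clean image X + insertion mask -> synthetic labeled image
G_Y = MaskConditionedGenerator()  # labeled image Y + object mask  -> image with the object removed
```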

2) Discriminators

  • DIS_{globalX}: distinguishes {real X, generated X} globally.
  • DIS_{globalY}: distinguishes {real Y, generated Y} globally.
  • DIS_{localX}: distinguishes {real X, generated X} locally, in the masked area.
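A sketch of the discriminator side, reusing the names and imports from the generator sketch above. Sharing one backbone class for all three discriminators, the crop size, and the crop-around-the-mask implementation for the local discriminator are assumptions on my part.

```python
class PatchDiscriminator(nn.Module):
    """Placeholder discriminator returning a real/fake score map."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, padding=1),
        )

    def forward(self, image):
        return self.net(image)

DIS_globalX = PatchDiscriminator()  # global discriminator for the X domain
DIS_globalY = PatchDiscriminator()  # global discriminator for the Y domain
DIS_localX  = PatchDiscriminator()  # local discriminator, applied to the masked area only

def crop_masked_region(image, mask, size=64):
    """Crop a fixed-size window centered on the mask (batch size 1 assumed),
    so the local discriminator only sees the insertion area."""
    ys, xs = torch.nonzero(mask[0, 0], as_tuple=True)
    cy, cx = int(ys.float().mean()), int(xs.float().mean())
    half = size // 2
    return image[..., max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
```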

3) Detector

  • detects objects in both the real labeled images Y and the synthetic images generated from X.

3.2 Train generator with detection losses

key: train generator G_X using the gradients from the detector.

  • L_{det}^{real} only involves the real images Y and the detector;
  • L_{det}^{syn} involves the clean images X, the detector, and G_X.
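In my own notation (the paper's exact equations are not reproduced in these notes), with b_Y the ground-truth boxes of Y and b_M the boxes derived from the insertion masks M:

L_{det}^{real} = L_{det}(DET(Y), b_Y)
L_{det}^{syn} = L_{det}(DET(G_X(X, M)), b_M)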

limitation: there is no link between the real images Y and the generator G_X, while the goal is to achieve good results on the real images Y.

Thus, the paper proposes unrolling a single forward-backward pass of the detector:

I think that means using the gradients from both L_{det}^{real} and L_{det}^{syn} to update the weights of the detector at the same time.

Specifically,

  1. run the detector DET on generated X and real Y, and obtain the gradients of the detector weights using Eq. 3;
  2. update DET with these gradients;
  3. use the updated DET to compute Eq. 1, which now depends on G_X.
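A rough PyTorch sketch of this unrolled step as I read it; the function name, the plain one-step SGD update, and the learning rate are my assumptions, not the paper's. The detector gradient is computed with create_graph=True, a differentiable one-step update of the detector weights is formed, and L_{det}^{real} is recomputed with the updated weights so that its gradient can flow back into G_X.

```python
import torch
from torch.func import functional_call

def unrolled_detection_loss(detector, detection_loss, G_X,
                            x, mask, boxes_syn, y, boxes_real, lr=0.01):
    """One-step unrolled detector update so that L_det^real depends on G_X."""
    x_syn = G_X(x, mask)                         # synthetic labeled image from clean X
    params = dict(detector.named_parameters())

    # 1) detection loss on generated X and real Y with the current detector weights
    loss = (detection_loss(functional_call(detector, params, (x_syn,)), boxes_syn)
            + detection_loss(functional_call(detector, params, (y,)), boxes_real))
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)

    # 2) differentiable one-step update of the detector weights
    updated = {name: p - lr * g for (name, p), g in zip(params.items(), grads)}

    # 3) L_det^real with the updated weights; because `updated` depends on x_syn,
    #    this loss now has a gradient path back to G_X
    return detection_loss(functional_call(detector, updated, (y,)), boxes_real)
```

The create_graph=True call is the whole point: without it, the updated detector weights would be constants with respect to G_X, and the recomputed L_{det}^{real} would again give the generator no gradient.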

3.3 overall losses and training

  1. detection: the detection losses, Eq. 1 (affected by Eq. 3 through the unrolled update) and Eq. 2;
  2. realism: the generated images should be close to real images globally and locally, via L_{GAN}(DIS_{globalX}), L_{GAN}(DIS_{globalY}), and L_{GAN}(DIS_{localX}).
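Putting the pieces together, a hypothetical generator objective reusing the names from the sketches above. The least-squares GAN loss, equal weighting of the terms, and which generated image each discriminator scores are my assumptions from Section 3.1, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gan_loss(score, real):
    """Least-squares GAN loss on a discriminator score map (an assumption)."""
    target = torch.ones_like(score) if real else torch.zeros_like(score)
    return F.mse_loss(score, target)

def generator_objective(x, mask, y, boxes_real, boxes_syn, detector, detection_loss):
    x_syn = G_X(x, mask)   # insert an object into the clean image
    x_rem = G_Y(y, mask)   # remove the object from the labeled image

    # realism terms: the generated images should fool the global/local discriminators
    adv = (gan_loss(DIS_globalX(x_rem), real=True)
           + gan_loss(DIS_globalY(x_syn), real=True)
           + gan_loss(DIS_localX(crop_masked_region(x_syn, mask)), real=True))

    # detection term through the unrolled detector update (Section 3.2)
    det = unrolled_detection_loss(detector, detection_loss, G_X,
                                  x, mask, boxes_syn, y, boxes_real)
    return adv + det
```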