Generative Modeling for Small-Data Object Detection (ICCV2019)

1. main info

  • ICCV 2019
  • task: small-data object detection
  • main idea: use a generative model

motivation: 1) generative models, e.g., GANs, are very successful; 2) how can they be useful for downstream tasks?

One such downstream task is object detection (OD), especially small-data OD, where labeled data is limited, e.g., medical images.

This paper uses generative models to improve performance on small-data object detection.

Two problems stand in the way:

  1. previous works on object insertion with generative models often need segmentation masks, which are often not available;
  2. GANs are designed to generate realistic images, but this objective may not align with the downstream task.

Thus, a new model, DetectorGAN, is proposed. DetectorGAN combines a detector and a GAN in a unified model.

In general, there are two branches after the generator: 1) a discriminator that produces the adversarial loss; 2) a detector that produces the detection loss. Both losses are used to train the model.

Typically, one difficulty is that the generator does not receive gradients from the detection loss, which defeats the goal of generating better images for OD. Thus, this paper bridges the gap between the generator and the detection loss.

main contribution:

  1. first to integrate a detector into a GAN;
  2. a novel unrolling method to bridge the generator and the detection loss;
  3. good results.

2. Related works

  1. image-to-image translation
  2. object insertion with GANs
  3. Integration of GANs and Classifiers
  4. Data Augmentation for Object Detection

3. DetectorGAN

main components: a generator, (multiple) discriminators, and a detector.

  • detector: gives feedback to the generator on whether the generated images are good for detection.

  • discriminators: improve the realism and interpretability of the generated images.

  • X: clean images without objects;

  • Y: labeled images with objects;

3.1 modules

1) Generators

  • G_X: takes X and a mask as input, outputs a synthetic image with the input background and an object inserted at the masked area.
  • G_Y: takes Y and the object mask as input, outputs an image with the indicated object removed.

The masks, which indicate plausible insertion locations, are also important to the results; how they are chosen depends on the target dataset.
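A minimal PyTorch-style sketch of the two generator interfaces. The channel-concatenation of image and mask, the layer choices, and single-channel images (e.g., X-rays) are my assumptions; the paper's actual architecture is not described in these notes.

```python
import torch
import torch.nn as nn

class MaskConditionedGenerator(nn.Module):
    """Placeholder generator: concatenates the image with a binary location
    mask and outputs an image of the same size. The same interface is used
    for G_X (insert an object at the masked area) and G_Y (remove the
    indicated object)."""
    def __init__(self, in_channels=1):  # single-channel images assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels + 1, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, in_channels, 3, padding=1), nn.Tanh(),
        )

    def forward(self, image, mask):
        return self.net(torch.cat([image, mask], dim=1))

G_X = MaskConditionedGenerator()  # clean image X + insertion mask -> synthetic labeled image
G_Y = MaskConditionedGenerator()  # labeled image Y + object mask  -> image with the object removed
```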

2) Discriminators

  • DIS_{globalX}: distinguishes {real X, generated X} globally.
  • DIS_{globalY}: distinguishes {real Y, generated Y} globally.
  • DIS_{localX}: distinguishes {real X, generated X} locally, in the masked area.
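A sketch of the discriminator side, reusing the names and imports from the generator sketch above. Sharing one backbone class for all three discriminators, the crop size, and the crop-around-the-mask implementation for the local discriminator are assumptions on my part.

```python
class PatchDiscriminator(nn.Module):
    """Placeholder discriminator returning a real/fake score map."""
    def __init__(self, in_channels=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, padding=1),
        )

    def forward(self, image):
        return self.net(image)

DIS_globalX = PatchDiscriminator()  # global discriminator for the X domain
DIS_globalY = PatchDiscriminator()  # global discriminator for the Y domain
DIS_localX  = PatchDiscriminator()  # local discriminator, applied to the masked area only

def crop_masked_region(image, mask, size=64):
    """Crop a fixed-size window centered on the mask (batch size 1 assumed),
    so the local discriminator only sees the insertion area."""
    ys, xs = torch.nonzero(mask[0, 0], as_tuple=True)
    cy, cx = int(ys.float().mean()), int(xs.float().mean())
    half = size // 2
    return image[..., max(cy - half, 0):cy + half, max(cx - half, 0):cx + half]
```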

3) Detector

  • detects objects in both the real labeled images Y and the synthetic images generated from X.

3.2 Train generator with detection losses

key: train generator G_X using the gradients from the detector.

  • L_{det}^{real} only involves the real images Y and the detector;
  • L_{det}^{syn} involves the clean images X, the detector, and G_X.
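In my own notation (the paper's exact equations are not reproduced in these notes), with b_Y the ground-truth boxes of Y and b_M the boxes derived from the insertion masks M:

L_{det}^{real} = L_{det}(DET(Y), b_Y)
L_{det}^{syn} = L_{det}(DET(G_X(X, M)), b_M)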

limitation: there is no link between the real images Y and the generator G_X, while the goal is to achieve good results on the real images Y.

Thus, the paper proposes unrolling a single forward-backward pass of the detector:

I think that means using the gradients from both L_{det}^{real} and L_{det}^{syn} to update the weights of the detector at the same time.

Specifically,

  1. run the detector DET on generated X and real Y, and obtain the gradients of the detector weights using Eq. 3;
  2. update DET with these gradients;
  3. use the updated DET to compute Eq. 1, which now depends on G_X.
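A rough PyTorch sketch of this unrolled step as I read it; the function name, the plain one-step SGD update, and the learning rate are my assumptions, not the paper's. The detector gradient is computed with create_graph=True, a differentiable one-step update of the detector weights is formed, and L_{det}^{real} is recomputed with the updated weights so that its gradient can flow back into G_X.

```python
import torch
from torch.func import functional_call

def unrolled_detection_loss(detector, detection_loss, G_X,
                            x, mask, boxes_syn, y, boxes_real, lr=0.01):
    """One-step unrolled detector update so that L_det^real depends on G_X."""
    x_syn = G_X(x, mask)                         # synthetic labeled image from clean X
    params = dict(detector.named_parameters())

    # 1) detection loss on generated X and real Y with the current detector weights
    loss = (detection_loss(functional_call(detector, params, (x_syn,)), boxes_syn)
            + detection_loss(functional_call(detector, params, (y,)), boxes_real))
    grads = torch.autograd.grad(loss, list(params.values()), create_graph=True)

    # 2) differentiable one-step update of the detector weights
    updated = {name: p - lr * g for (name, p), g in zip(params.items(), grads)}

    # 3) L_det^real with the updated weights; because `updated` depends on x_syn,
    #    this loss now has a gradient path back to G_X
    return detection_loss(functional_call(detector, updated, (y,)), boxes_real)
```

The create_graph=True call is the whole point: without it, the updated detector weights would be constants with respect to G_X, and the recomputed L_{det}^{real} would again give the generator no gradient.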

3.3 overall losses and training

  1. detection: the detection losses, Eq. 1 (affected by Eq. 3 through the unrolled update) and Eq. 2;
  2. realism: the generated images should be close to real images globally and locally, via L_{GAN}(DIS_{globalX}), L_{GAN}(DIS_{globalY}), and L_{GAN}(DIS_{localX}).
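Putting the pieces together, a hypothetical generator objective reusing the names from the sketches above. The least-squares GAN loss, equal weighting of the terms, and which generated image each discriminator scores are my assumptions from Section 3.1, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def gan_loss(score, real):
    """Least-squares GAN loss on a discriminator score map (an assumption)."""
    target = torch.ones_like(score) if real else torch.zeros_like(score)
    return F.mse_loss(score, target)

def generator_objective(x, mask, y, boxes_real, boxes_syn, detector, detection_loss):
    x_syn = G_X(x, mask)   # insert an object into the clean image
    x_rem = G_Y(y, mask)   # remove the object from the labeled image

    # realism terms: the generated images should fool the global/local discriminators
    adv = (gan_loss(DIS_globalX(x_rem), real=True)
           + gan_loss(DIS_globalY(x_syn), real=True)
           + gan_loss(DIS_localX(crop_masked_region(x_syn, mask)), real=True))

    # detection term through the unrolled detector update (Section 3.2)
    det = unrolled_detection_loss(detector, detection_loss, G_X,
                                  x, mask, boxes_syn, y, boxes_real)
    return adv + det
```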