Video Generation: Draft of a Short Survey

Year 2018

March

1. Probabilistic Video Generation using Holistic Attribute Control https://arxiv.org/pdf/1803.08085.pdf

   a. Videos express highly structured spatio-temporal patterns of visual data. A video can be thought of as being governed by two factors:

        (i) temporally invariant (e.g., person identity), or slowly varying (e.g., activity), attribute-induced appearance, encoding the persistent content of each frame

        (ii) inter-frame motion or scene dynamics (e.g., encoding the evolution of the person executing the action).

   b. VideoVAE

       Handles both video generation and future prediction.

       Generates a video (short clip) by:

           decoding samples sequentially drawn from a latent space distribution into full video frames.

              - VAE: encodes/decodes frames into/from the latent space

              - RNN: models the dynamics in the latent space

        Improves the consistency of the generated video through temporally-conditional sampling, and its quality by:

              - structuring the latent space with attribute controls

              - ensuring that attributes can be both inferred and conditioned on during learning/generation
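The rollout described above (sample latents sequentially, condition each sample on the past and on an attribute, decode each latent into a frame) can be sketched roughly as follows. This is only a toy illustration, not the VideoVAE implementation: the dimensions, the random stand-in weights, the fixed noise scale of 0.1, and the function names `make_matrix`, `matvec`, and `generate_clip` are all assumptions made for the example.

```python
import math
import random

LATENT_DIM = 4      # size of each latent sample z_t (assumed for the toy example)
FRAME_PIXELS = 6    # a toy "frame" is a flat list of pixel values (assumed)


def make_matrix(rows, cols, rng):
    """Random weight matrix standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def generate_clip(attribute, num_frames, seed=0):
    """Roll out a clip: recurrent dynamics in latent space, then per-frame decoding."""
    rng = random.Random(seed)
    w_hidden = make_matrix(LATENT_DIM, LATENT_DIM, rng)      # h_{t-1} -> h_t
    w_latent = make_matrix(LATENT_DIM, LATENT_DIM, rng)      # z_{t-1} -> h_t
    w_attr = make_matrix(LATENT_DIM, len(attribute), rng)    # attribute -> h_t
    w_decode = make_matrix(FRAME_PIXELS, LATENT_DIM, rng)    # z_t -> frame

    hidden = [0.0] * LATENT_DIM
    z = [0.0] * LATENT_DIM
    frames = []
    for _ in range(num_frames):
        # RNN step: the new state depends on the old state, the previous
        # latent sample, and the (holistic) attribute vector.
        pre = [
            a + b + c
            for a, b, c in zip(
                matvec(w_hidden, hidden),
                matvec(w_latent, z),
                matvec(w_attr, attribute),
            )
        ]
        hidden = [math.tanh(p) for p in pre]
        # Temporally-conditional sampling: z_t ~ N(hidden, 0.1^2).
        z = [rng.gauss(mean, 0.1) for mean in hidden]
        # Decoder: map the latent sample to pixel intensities in (0, 1).
        frames.append([1.0 / (1.0 + math.exp(-v)) for v in matvec(w_decode, z)])
    return frames
```

Calling `generate_clip([1.0, 0.0], num_frames=5)` returns five toy frames; changing the attribute vector changes the whole trajectory, which is the sense in which the attribute acts as a control.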


2. Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks


3. Every Smile is Unique: Landmark-Guided Diverse Smile Generation



Year 2017


- By the way

 I like this Stanford homework paper: http://cs231n.stanford.edu/reports/2017/pdfs/323.pdf

1. Dynamics Transfer GAN: Generating Video by Transferring Arbitrary Temporal Dynamics from a Source Video to a Single Target Image


- The spatial constructs come from the target image; the dynamics come from the source video sequence.

 To preserve the spatial construct of the target image:

             - the appearance of the source video sequence is suppressed

             - only the dynamics are obtained before being imposed onto the target image (using the proposed appearance-suppressed dynamics feature).

 The spatial and temporal consistencies are verified via two discriminator networks:

             - discriminator A validates the fidelity of the generated frames' appearance,

             - discriminator B validates the dynamic consistency of the generated video sequence.
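One way to picture how two discriminators can jointly supervise a generator is shown below. It is a minimal sketch under assumptions: the standard non-saturating GAN term `-log D(x)` stands in for whatever objective the paper actually uses, and the names `generator_term`, `generator_loss`, and the weight `dynamics_weight` are hypothetical.

```python
import math


def generator_term(score):
    """Non-saturating generator loss -log D(x) for a discriminator score in (0, 1]."""
    eps = 1e-12  # guards against log(0)
    return -math.log(max(score, eps))


def generator_loss(frame_scores, sequence_score, dynamics_weight=1.0):
    """Combine per-frame appearance feedback with sequence-level dynamics feedback.

    frame_scores:    discriminator A's outputs, one per generated frame (assumed shape)
    sequence_score:  discriminator B's single output for the whole clip (assumed shape)
    dynamics_weight: hypothetical balance factor between the two terms
    """
    appearance = sum(generator_term(s) for s in frame_scores) / len(frame_scores)
    dynamics = generator_term(sequence_score)
    return appearance + dynamics_weight * dynamics
```

With this form, a clip whose frames look real but whose motion is implausible (high `frame_scores`, low `sequence_score`) is still penalized, which is the reason for having a separate dynamics discriminator.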

Results:

             - successfully transferred arbitrary dynamics of the source video sequence onto a target image

             - maintained the spatial constructs (appearance) of the target image while generating spatially and temporally consistent video sequences.

Note: It is ### everything (Literature Review in its intro) because it is quite new.




2. Deep Video Generation, Prediction and Completion of Human Action Sequences https://arxiv.org/pdf/1711.08682.pdf


3. Video Generation from Text https://arxiv.org/pdf/1710.00421.pdf

- Hybrid VAE plus GAN

- Two parts:

- Static: uses a "gist" to sketch the text-conditioned background color and object layout (LSTM, RNN structure)

- Dynamic: a Text2Filter

- See Section 3.3, Text2Filter

- Note: Quite compact. Need time to digest.
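As a reading aid for the Text2Filter idea, the sketch below turns a text embedding into a small convolution kernel and applies it to a gist image. It is an assumption-laden toy: the single linear layer, the 3x3 kernel size, and the function names `text_to_filter` and `convolve_valid` are illustrative choices, not details taken from the paper.

```python
def text_to_filter(text_embedding, weights, kernel_size=3):
    """Map a text embedding to a kernel_size x kernel_size filter via one linear layer.

    weights has kernel_size * kernel_size rows, each as long as the embedding
    (a stand-in for learned parameters).
    """
    flat = [sum(w * e for w, e in zip(row, text_embedding)) for row in weights]
    return [flat[r * kernel_size:(r + 1) * kernel_size] for r in range(kernel_size)]


def convolve_valid(image, kernel):
    """Plain 2D cross-correlation without padding ("valid" mode)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

The point of the design is that the filter itself depends on the sentence, so the same gist is processed differently for different captions.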


4. Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks

   https://arxiv.org/pdf/1709.07592.pdf



5. MoCoGAN: Decomposing Motion and Content for Video Generation

   https://arxiv.org/pdf/1707.04993.pdf





6. To Create What You Tell: Generating Videos from Captions

    https://www.microsoft.com/en-us/research/wp-content/uploads/2017/11/BNI02-panA.pdf


- Temporal GANs conditioning on Captions, namely TGANs-C

     - The generator input is transformed into a frame sequence with 3D spatio-temporal convolutions.

     - GAM evaluation metric (Section 3.4, Experimental Setting)

- Model Architecture

            - 3.1.1 Generator

                     - Given a sentence 𝒮, a bi-LSTM contextually embeds the input word sequence, and an LSTM-based encoder then produces the sentence representation S. The generator takes the concatenation of S and a random noise variable z as input and synthesizes realistic videos from it.

             - 3.1.2 The discriminator network 𝐷 includes three discriminators:

                           a. video discriminator: distinguishes real videos from generated ones and optimizes video-caption matching

                           b. frame discriminator: distinguishes real frames from fake ones and aligns frames with the conditioning caption

                           c. motion discriminator: encourages adjacent frames in the generated videos to transition smoothly

              - 3.1.3 The whole model is trained with three losses: video-level matching-aware loss, frame-level matching-aware loss, and temporal coherence loss
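To make the three loss terms more concrete, here is a rough sketch. It assumes a matching-aware loss in the common three-pair form (real video with its caption, real video with a wrong caption, generated video with the caption) and uses the mean squared difference between adjacent frames as a simplified stand-in for temporal coherence. The function names, the equal weighting, and the lack of any split between generator and discriminator updates are all simplifications, not the paper's exact formulation.

```python
import math


def matching_aware_loss(real_match, real_mismatch, fake_match):
    """Discriminator-side loss over three (video, caption) pairs.

    real_match:    score for a real video with its own caption (should be near 1)
    real_mismatch: score for a real video with a wrong caption (should be near 0)
    fake_match:    score for a generated video with the caption (should be near 0)
    """
    eps = 1e-12  # guards against log(0)
    terms = [
        math.log(max(real_match, eps)),
        math.log(max(1.0 - real_mismatch, eps)),
        math.log(max(1.0 - fake_match, eps)),
    ]
    return -sum(terms) / 3.0


def temporal_coherence_loss(frames):
    """Mean squared difference between adjacent frames (flat pixel lists)."""
    if len(frames) < 2:
        return 0.0
    diffs = [
        sum((a - b) ** 2 for a, b in zip(prev, cur)) / len(cur)
        for prev, cur in zip(frames, frames[1:])
    ]
    return sum(diffs) / len(diffs)


def combined_loss(video_scores, frame_scores, frames):
    """Unweighted sum of the three terms (equal weights are an assumption)."""
    return (
        matching_aware_loss(*video_scores)
        + matching_aware_loss(*frame_scores)
        + temporal_coherence_loss(frames)
    )
```

The same `matching_aware_loss` shape is reused at the video level and the frame level; only the inputs being scored differ.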




Year 2016


1. Generating Videos with Scene Dynamics

     https://arxiv.org/abs/1609.02612

- a spatio-temporal convolutional architecture

- untangles the scene’s foreground from the background.

- experiments show the model internally learns useful features for recognizing actions with minimal supervision, suggesting that scene dynamics are a promising signal for representation learning.

- Slides: https://pdfs.semanticscholar.org/presentation/7188/6726f0a1b4075a7213499f8f25d7c9fb4143.pdf
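A common way to realize the foreground/background split is to blend a moving foreground stream with a static background image through a per-pixel mask, roughly `video = mask * foreground + (1 - mask) * background`. The sketch below shows that compositing step on nested Python lists; the function name `composite_video` and the list layout are assumptions for illustration, and the networks that would produce the three inputs are not modeled.

```python
def composite_video(foreground, mask, background):
    """Blend a moving foreground with a static background.

    foreground: list of T frames, each an H x W nested list
    mask:       same shape as foreground, values in [0, 1]
    background: a single H x W image reused for every time step
    """
    video = []
    for fg_frame, mask_frame in zip(foreground, mask):
        frame = [
            [m * f + (1.0 - m) * b for f, m, b in zip(fg_row, mask_row, bg_row)]
            for fg_row, mask_row, bg_row in zip(fg_frame, mask_frame, background)
        ]
        video.append(frame)
    return video
```

Where the mask is 0 the output is the unchanged background in every frame, so only the masked regions can move.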
