[1805.09393] Pouring Sequence Prediction using Recurrent Neural Network
https://arxiv.org/abs/1805.09393
Value Propagation Networks: a method for planning in more complex dynamic environments | 机器之心
https://www.jiqizhixin.com/articles/2018-06-21
DeepMind proposes Relational RNNs: the RMC memory module tackles relational reasoning | 机器之心
https://www.jiqizhixin.com/articles/070104
The fastest way to train neural networks today: the AdamW optimizer + super-convergence | 机器之心
https://www.jiqizhixin.com/articles/2018-07-03-14
Graph Learning | Graph propagation algorithms (Part 2) - 简书
//www.greatytc.com/p/e7fb897b1d09
Paper notes: Deep Convolutional Networks on Graph-Structured Data - CSDN博客
https://blog.csdn.net/BVL10101111/article/details/53437940
Better than the VAE: adding the Wasserstein distance to the Gaussian mixture model, a universal approximator | 机器之心
https://www.jiqizhixin.com/articles/2018-07-07-4
Convolutional neural networks can't handle graph-structured data? This article has the answer | 雷锋网
https://www.leiphone.com/news/201706/ppA1Hr0M0fLqm7OP.html
Academia | Neural networks meet Gaussian processes: DeepMind releases two papers opening a new direction for deep learning
https://mp.weixin.qq.com/s?__biz=MzA3MzI4MjgzMw==&mid=2650744847&idx=4&sn=6d04d771485c0970742e33b57dc452a9&chksm=871aec71b06d65671e386229eb75641539aef9e1525e45f2c0f70f6fe9f845d088af9c9cd9fa&scene=38#wechat_redirect
[1807.03402] IGLOO: Slicing the Features Space to Represent Long Sequences
https://arxiv.org/abs/1807.03402
量子位 | Shanghai Jiao Tong University builds SRNN, a mere 135x faster than a vanilla RNN
https://mp.weixin.qq.com/s/wfOzCxe3L2t11VguYLGC9Q
[1807.03379] Online Scoring with Delayed Information: A Convex Optimization Viewpoint
https://arxiv.org/abs/1807.03379
An introduction to Graph Convolutional Networks (GCNs) - AHU-WangXiao - 博客园
https://www.cnblogs.com/wangxiaocvpr/p/8059769.html
How to understand Graph Convolutional Networks (GCN)? - 知乎
https://www.zhihu.com/question/54504471
How powerful are Graph Convolutions? (review of Kipf & Welling, 2016)
https://www.inference.vc/how-powerful-are-graph-convolutions-review-of-kipf-welling-2016-2/
Reinforcement learning’s foundational flaw
https://thegradient.pub/why-rl-is-flawed/
tkipf/gcn: Implementation of Graph Convolutional Networks in TensorFlow
https://github.com/tkipf/gcn
Graph Convolutional Networks | Thomas Kipf | PhD Student @ University of Amsterdam
http://tkipf.github.io/graph-convolutional-networks/
[1807.03379] Online Scoring with Delayed Information: A Convex Optimization Viewpoint
https://arxiv.org/abs/1807.03379
We consider a system where agents enter in an online fashion and are evaluated based on their attributes or context vectors. There can be practical situations where this context is partially observed, and the unobserved part arrives only after some delay. We assume that an agent, once left, cannot re-enter the system. Therefore, the job of the system is to provide an estimated score for the agent based on her instantaneous score and possibly some inference of the instantaneous score over the delayed score. In this paper, we estimate the delayed context via an online convex game between the agent and the system. We argue that the error in the score estimate accumulated over $T$ iterations is small if the regret of the online convex game is small. Further, we leverage side information about the delayed context in the form of a correlation function with the known context. We consider the settings where the delay is fixed or arbitrarily chosen by an adversary. Furthermore, we extend the formulation to the setting where the contexts are drawn from some Banach space. Overall, we show that the average penalty for not knowing the delayed context while making a decision scales as $\mathcal{O}(1/\sqrt{T})$, which can be improved to $\mathcal{O}(\log T / T)$ in a special setting.
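For intuition, here is a tiny NumPy sketch (not the paper's algorithm) of the general recipe the abstract describes: score an agent immediately using an estimate of the still-delayed part of her context, then refine that estimate with an online gradient step on a convex loss once the delayed part arrives. The linear scorer `w`, the correlation matrix `A`, the squared loss and the 1/sqrt(t) step size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_obs, d_del, T = 5, 3, 1000
w = rng.normal(size=d_obs + d_del)          # hypothetical true scoring weights
A = 0.3 * rng.normal(size=(d_del, d_obs))   # assumed correlation: x_del ~ A @ x_obs + noise

A_hat = np.zeros((d_del, d_obs))            # system's running estimate of that correlation
penalty = 0.0

for t in range(1, T + 1):
    x_obs = rng.normal(size=d_obs)
    x_del = A @ x_obs + 0.1 * rng.normal(size=d_del)   # revealed only after a delay

    # score the agent now, plugging in the estimated delayed context
    x_del_hat = A_hat @ x_obs
    score_est = w[:d_obs] @ x_obs + w[d_obs:] @ x_del_hat
    score_true = w[:d_obs] @ x_obs + w[d_obs:] @ x_del
    penalty += abs(score_est - score_true)

    # once the delayed context arrives, take an online gradient step on the
    # convex loss ||A_hat @ x_obs - x_del||^2 with step size ~ 1/sqrt(t)
    grad = 2.0 * np.outer(A_hat @ x_obs - x_del, x_obs)
    A_hat -= (0.5 / np.sqrt(t)) * grad

print("average penalty:", penalty / T)      # shrinks roughly like O(1/sqrt(T))
```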
————————————————————
[1807.03402] IGLOO: Slicing the Features Space to Represent Long Sequences
https://arxiv.org/abs/1807.03402
We introduce a new neural network architecture, IGLOO, which aims at providing a representation for long sequences where RNNs fail to converge. The structure uses the relationships between random patches sliced out of the feature space of a backbone one-dimensional CNN to find a representation. This paper explains the implementation of the method, provides results on benchmarks commonly used for RNNs, and compares IGLOO to other recently published architectures. It is found that IGLOO can deal with sequences of up to 25,000 time steps. It is also effective for shorter sequences, and it achieves the highest score in the literature on the permuted MNIST task. Benchmarks also show that IGLOO runs at the speed of the CuDNN-optimised GRU and LSTM without being tied to any specific hardware.
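As a rough reading of the abstract (not the reference implementation), the core operation can be sketched in NumPy: slice random patches out of a 1-D CNN's feature map so that distant time steps meet inside one patch, combine each patch with learned weights, and pool over patches into a fixed-size representation. The patch count, patch size, ReLU and max-pooling below are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature map from a 1-D CNN backbone: T time steps x C channels.
T_steps, C = 1000, 32
feature_map = rng.normal(size=(T_steps, C))

# IGLOO-style gathering (a sketch): draw n_patches random patches, each made of
# patch_size time positions taken from anywhere in the sequence, so that distant
# steps end up inside the same patch.
n_patches, patch_size = 100, 4
patch_idx = rng.integers(0, T_steps, size=(n_patches, patch_size))
patches = feature_map[patch_idx]                  # (n_patches, patch_size, C)

# Combine each patch with (stand-in) learned weights, then pool over patches,
# giving a fixed-size representation regardless of sequence length.
W = 0.1 * rng.normal(size=(patch_size, C))        # stands in for trained parameters
b = np.zeros(C)
patch_features = np.maximum((patches * W).sum(axis=1) + b, 0.0)   # ReLU, (n_patches, C)
representation = patch_features.max(axis=0)       # (C,), fed to a classifier head

print(representation.shape)
```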
————————————————————
[1807.03523] DLOPT: Deep Learning Optimization Library
https://arxiv.org/abs/1807.03523
Deep learning hyper-parameter optimization is a tough task. Finding an appropriate network configuration is key to success; however, most of the time this work is done roughly. In this work we introduce a novel library to tackle this problem, the Deep Learning Optimization Library (DLOPT). We briefly describe its architecture and present a set of usage examples. This is an open-source project developed under the GNU GPL v3 license and freely available at this https URL
————————————————————
[1807.03710] Recurrent Auto-Encoder Model for Large-Scale Industrial Sensor Signal Analysis
https://arxiv.org/abs/1807.03710
A recurrent auto-encoder model summarises sequential data through an encoder structure into a fixed-length vector and then reconstructs the original sequence through the decoder structure. The summarised vector can be used to represent time-series features. In this paper, we propose relaxing the dimensionality of the decoder output so that it performs partial reconstruction. The fixed-length vector therefore represents features in the selected dimensions only. In addition, we propose using a rolling fixed-window approach to generate training samples from unbounded time-series data. The change of time-series features over time can be summarised as a smooth trajectory path. The fixed-length vectors are further analysed using additional visualisation and unsupervised clustering techniques. The proposed method can be applied in large-scale industrial processes for sensor signal analysis, where clusters of the vector representations can reflect the operating states of the industrial system.
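A minimal PyTorch sketch of the two ideas in the abstract, with hypothetical sizes and a toy random series: a rolling fixed-window slicer for unbounded series, and a GRU auto-encoder whose decoder reconstructs only a selected subset of input dimensions, so the fixed-length code summarises just those channels. This is an illustration of the partial-reconstruction idea, not the paper's exact model.

```python
import torch
import torch.nn as nn

def rolling_windows(series, window, step=1):
    """Slice an unbounded multivariate series (T, D) into samples (N, window, D)."""
    return series.unfold(0, window, step).permute(0, 2, 1)

class PartialRecurrentAE(nn.Module):
    """GRU auto-encoder whose decoder reconstructs only the dimensions in out_dims."""
    def __init__(self, in_dim, hidden, out_dims):
        super().__init__()
        self.out_dims = out_dims
        self.encoder = nn.GRU(in_dim, hidden, batch_first=True)
        self.decoder = nn.GRU(len(out_dims), hidden, batch_first=True)
        self.readout = nn.Linear(hidden, len(out_dims))

    def forward(self, x):                         # x: (B, W, in_dim)
        _, h = self.encoder(x)                    # h: (1, B, hidden) = fixed-length summary
        target = x[:, :, self.out_dims]           # reconstruct selected dimensions only
        dec_in = torch.zeros_like(target)         # teacher-free decoding, for brevity
        out, _ = self.decoder(dec_in, h)
        return self.readout(out), target, h.squeeze(0)

# Toy usage on synthetic sensor data: 20 channels, summarise and reconstruct 3 of them.
series = torch.randn(5000, 20)
batch = rolling_windows(series, window=100)[:32]
model = PartialRecurrentAE(in_dim=20, hidden=64, out_dims=[0, 3, 7])
recon, target, codes = model(batch)
loss = nn.functional.mse_loss(recon, target)
print(recon.shape, codes.shape, loss.item())      # codes are the fixed-length vectors to cluster
```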
————————————————————
[1807.03748] Representation Learning with Contrastive Predictive Coding
https://arxiv.org/abs/1807.03748
While supervised learning has enabled great progress in many applications, unsupervised learning has not seen such widespread adoption, and remains an important and challenging endeavor for artificial intelligence. In this work, we propose a universal unsupervised learning approach to extract useful representations from high-dimensional data, which we call Contrastive Predictive Coding. The key insight of our model is to learn such representations by predicting the future in latent space by using powerful autoregressive models. We use a probabilistic contrastive loss which induces the latent space to capture information that is maximally useful to predict future samples. It also makes the model tractable by using negative sampling. While most prior work has focused on evaluating representations for a particular modality, we demonstrate that our approach is able to learn useful representations achieving strong performance on four distinct domains: speech, images, text and reinforcement learning in 3D environments.
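A compact PyTorch sketch of the contrastive-predictive idea under simplifying assumptions (a single Conv1d encoder, a GRU as the autoregressive model, predictions from one time step only, and in-batch negatives): latents k steps ahead are the positives for a linear prediction made from the context vector, and the other sequences in the batch supply the negatives for a cross-entropy (InfoNCE-style) loss. Shapes and sizes are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCPC(nn.Module):
    """Minimal sketch of Contrastive Predictive Coding on 1-D signals."""
    def __init__(self, in_ch=1, z_dim=64, c_dim=128, n_future=3):
        super().__init__()
        self.encoder = nn.Conv1d(in_ch, z_dim, kernel_size=8, stride=4)   # x -> z_t
        self.ar = nn.GRU(z_dim, c_dim, batch_first=True)                  # z_<=t -> c_t
        self.predictors = nn.ModuleList([nn.Linear(c_dim, z_dim) for _ in range(n_future)])

    def info_nce(self, x):                          # x: (B, in_ch, T)
        z = self.encoder(x).transpose(1, 2)         # (B, L, z_dim) latent sequence
        c, _ = self.ar(z)                           # (B, L, c_dim) context vectors
        B, L, _ = z.shape
        t = L // 2                                  # predict from the midpoint, for brevity
        loss = 0.0
        for k, W in enumerate(self.predictors, start=1):
            pred = W(c[:, t])                       # (B, z_dim): prediction of z_{t+k}
            target = z[:, t + k]                    # (B, z_dim): the true future latent
            # score against every sample in the batch; other rows act as negatives
            logits = pred @ target.t()              # (B, B)
            labels = torch.arange(B, device=x.device)
            loss = loss + F.cross_entropy(logits, labels)
        return loss / len(self.predictors)

model = TinyCPC()
x = torch.randn(16, 1, 2048)                        # toy batch of raw 1-D signals
print(model.info_nce(x).item())
```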
[RNN text-generation example (Colab / tf.keras + eager)] "Text Generation using a RNN - end-to-end example of generating Shakespeare-like text using tf.keras + eager" (web link)
[A tour of reinforcement learning from the continuous-control viewpoint] "A Tour of Reinforcement Learning: The View from Continuous Control" by Benjamin Recht [UC Berkeley] (web link)