<p><span style="font-size:15px">GCNII (ICML 2020) 分享,</span><span style="font-size: 15px;">GCNII汇报ppt版可通过关注公众号【AI机器学习与知识图谱】后回复关键词:</span><span style="font-size: 15px;"><strong>GCNII</strong></span><span style="font-size: 15px;"> 来获得,供学习者使用!</span></p><p><span style="font-size:15px">
</span></p><p><span style="font-size:20px"><strong>Motivation</strong></span></p><p>
</p><p><span style="font-size:15px"><span>在计算机视觉中,模型</span><span>CNN</span><span>随着其层次加深可以学习到更深层次的特征信息,叠加</span><span>64</span><span>层或</span><span>128</span><span>层是十分正常的现象,且能较浅层取得更优的效果。</span></span></p><p>
</p><p><span style="font-size:15px"><span>图卷积神经网络</span><span>GCNs</span><span>是一种针对图结构数据的深度学习方法,但目前大多数的</span><span>GCN</span><span>模型都是浅层的,如</span><span>GCN</span><span>,</span><span>GAT</span><span>模型都是在</span><span>2</span><span>层时取得最优效果,随着加深模型效果就会大幅度下降,经研究</span><span>GCN</span><span>随着模型层次加深会出现</span><span>Over-Smoothing</span><span>问题,</span><span>Over-Smoothing</span><span>既相邻的节点随着网络变深就会越来越相似,最后学习到的</span><span>nodeembedding</span><span>便无法区分。</span></span></p><p><span style="font-size:15px"><span>
</span></span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-579017e19bdf586c.jpeg" img-data="{"format":"jpeg","size":32014,"height":396,"width":500}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p>
</p><p><span style="font-size:15px"><span>上图中,随着模型层次加深,在</span><span>Cora</span><span>数据上</span><span>TestAccuracy</span><span>逐渐向下降,</span></span><span style="font-size:15px">Quantitative Metric for Smoothness</span><span style="font-size:15px">给</span><span style="font-size:15px">Over-smoothness</span><span style="font-size:15px">提出一个定量的指标</span><span style="font-size:15px">SV</span><span style="font-size:15px">M_𝐺,如下公式所示:</span></p><p><span style="font-size:15px"><span/></span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-d170c3f286068bb5.jpeg" img-data="{"format":"jpeg","size":5388,"height":84,"width":365}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-afcdbddf7a19f384.jpeg" img-data="{"format":"jpeg","size":4927,"height":85,"width":380}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-06102e1bc2b220b1.jpeg" img-data="{"format":"jpeg","size":3796,"height":87,"width":258}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p><span style="font-size:15px">SVM_𝐺衡量了图中任意两个节点之间的欧氏距离之和,SVM_𝐺越小表示图学习时Over-Smoothing越严重当,当SVM_𝐺=0时,图中所有节点完全相同,也可以从图中看出随着层次的加深,SVM_𝐺的值越来越小。</span><span style="font-size:15px"/></p><p>
</p><p><span style="font-size:20px"><strong>Method</strong></span></p><span style="font-size:15px"/><p><span style="font-size:15px">
</span></p><p><span style="font-size:15px"><span>GCNII</span><span>全程:</span><span>GraphConvolutional Networks via Initial residual and Identity Mapping</span></span></p><p><span style="font-size:15px">
</span></p><p><span style="font-size:15px"><span>GCNII</span><span>为了解决</span><span>GCN</span><span>在深层时出现的</span><span>Over-Smoothing</span><span>问题,提出了</span><strong><span>Initial Residual</span></strong><span>和</span><strong><span>Identit Mapping</span></strong><span>两个简单技巧,成功解决了</span><span>GCN</span><span>深层时的</span><span>Over-Smoothing</span><span>问题。</span></span></p><p><span style="font-size:15px">
</span></p><p><span style="font-size:16px"><strong><span style="font-size:16px">1</span><span style="font-size:16px">、</span><span style="font-size:16px">Initial residual</span></strong></span></p><p><span style="font-size:15px"><span>残差一直是解决</span><span>Over-Smoothing</span><span>的最常用的技巧之一,传统</span><span>GCN</span><span>加</span><span>residualconnection</span><span>用公式表示为:</span></span></p><p><span style="font-size:15px"><span>
</span></span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-a49994f69f678034.jpeg" img-data="{"format":"jpeg","size":3737,"height":50,"width":346}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p><span style="font-size:15px">GCNII Initial Residual</span><span style="font-size:15px">不是从前一层获取信息,而是从初始层进行残差连接,并且设置了获取的权重。这里初始层</span><span style="font-size:15px">initial representation</span><span style="font-size:15px">不是原始输入</span><span style="font-size:15px">feature</span><span style="font-size:15px">,而是由输入</span><span style="font-size:15px">feature</span><span style="font-size:15px">经过线性变换后得到,如下公式所示:</span>
</p><p>
</p><p><span style="font-size:15px"><span/></span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-0911565ca9e7389e.jpeg" img-data="{"format":"jpeg","size":1997,"height":41,"width":187}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p/><p><span style="font-size:15px"><span/></span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-55dc5f0a46a63c7c.jpeg" img-data="{"format":"jpeg","size":3551,"height":51,"width":365}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
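As a minimal sketch of this step (the function name and signature are mine, not from the paper), one propagation step with an initial residual could look like:

```python
import numpy as np

def initial_residual_layer(H, H0, P, W, alpha):
    """One propagation step with an initial residual (sketch).

    Mixes the smoothed current representations P @ H with the initial
    representations H0, weighted by alpha, before the usual transform."""
    support = (1.0 - alpha) * (P @ H) + alpha * H0
    return np.maximum(support @ W, 0.0)       # linear map + ReLU
```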
</div><p/><p><span style="font-size:15px"><span>但</span><span>Initial Residual</span><span>不是</span><span>GCNII</span><span>首次提出,而是</span><span>ICLR 2019</span><span>模型</span><span>APPNP</span><span>中提出。</span></span></p><p>
</p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-d5b7db695d6ac326.jpeg" img-data="{"format":"jpeg","size":11605,"height":165,"width":524}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p><span style="font-size:15px">
</span></p><p><span style="font-size:16px"><strong><span>2</span><span>、</span><span>Identity Mapping</span></strong></span><span style="font-size:15px"><span>
</span><span>仅仅使用残差只能缓解</span><span>Over-Smoothing</span><span>问题,因此</span><span>GCNII</span><span>借鉴了</span><span>ResNet</span><span>的思想有了</span><span>Identity Mapping<span style="font-size:15px">,</span></span></span><span style="font-size:15px">Initial Residual</span><span style="font-size:15px">的想法是在当前层</span><span style="font-size:15px">representation</span><span style="font-size:15px">和初始层</span><span style="font-size:15px">representation</span><span style="font-size:15px">之间进行权重选择,而</span><span style="font-size:15px">Identity Mapping</span><span style="font-size:15px">是在参数</span><span style="font-size:15px">W</span><span style="font-size:15px">和单位矩阵</span><span style="font-size:15px">I</span><span style="font-size:15px">之间设置权重选择,如下公式所示:</span></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-dfcd3edaa0d7db11.jpeg" img-data="{"format":"jpeg","size":7022,"height":56,"width":633}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p><span style="font-size:15px"><span>从上面公式看出,前半部分是</span><span>Initialresidual</span><span>,后半部分是</span><span>IdentityMapping</span><span>,其中</span><span>α</span><span>和</span><span>β</span><span>是</span><span>超参,</span></span><span style="font-size:15px">GCNII</span><span style="font-size:15px">论文中也给出了为什么</span><span style="font-size:15px">IdentityMapping</span><span style="font-size:15px">可以起到缓解</span><span style="font-size:15px">DeepGNN</span><span style="font-size:15px">出现</span><span style="font-size:15px">Over-Smoothing</span><span style="font-size:15px">问题,总结来说:</span><span style="font-size:15px">IdentityMapping</span><span style="font-size:15px">可以起到加快模型的收敛速度,减少有效信息的损失。</span></p><p><span style="font-size:15px">
</span></p><p><span style="font-size:20px"><strong>Conclusion</strong></span></p><p><span style="font-size:15px">
</span></p><p><span style="font-size:16px"><strong><span style="font-size:16px">1</span><span style="font-size:16px">、实验数据</span></strong></span></p><p><span style="font-size:15px"/></p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-f6617369e8a3eaea.jpeg" img-data="{"format":"jpeg","size":37609,"height":216,"width":797}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p><span style="font-size:15px"><span>实验中</span><span>Cora, Citeseer, Pubmed</span><span>三个引文数据,是同质图数据,常用于</span><span>Transductive Learning</span><span>类任务,</span></span><span style="font-size:15px">三种数据都由以下八个文件组成,存储格式类似:</span></p><span>ind.dataset_str.x=> the feature vectors of the training instances <span>as</span> scipy.sparse.csr.csr_matrix object;</span><div><span>
</span></div><div><span>ind.dataset_str.tx=>the feature vectors of the test instances <span>as</span> scipy.sparse.csr.csr_matrix object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.allx=>the feature vectors of both labeled <span>and</span> unlabeled training instances (asuperset of ind.dataset_str.x) <span>as</span> scipy.sparse.csr.csr_matrix object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.y=>the one-hot labels of the labeled training instances <span>as</span> numpy.ndarray object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.ty=>the one-hot labels of the test instances <span>as</span> numpy.ndarray object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.ally=>the labels <span>for</span> instances in ind.dataset_str.allxas numpy.ndarray object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.graph=>a dictin the format {index: [index_of_neighbor_nodes]}<span>as</span> collections.defaultdict object;</span></div><div><span>
</span></div><div><span>ind.dataset_str.test.index=>the indices of test instances in graph, <span>for</span> the inductive setting <span>as</span> listobject. All objects above must be saved using python pickle module.</span></div><div>
Taking cora as an example:

- ind.dataset_str.x => feature vectors of the training instances, a scipy.sparse.csr.csr_matrix object, shape (140, 1433)
- ind.dataset_str.tx => feature vectors of the test instances, shape (1000, 1433)
- ind.dataset_str.allx => feature vectors of the labeled plus unlabeled training instances, a superset of ind.dataset_str.x, shape (1708, 1433)
- ind.dataset_str.y => labels of the training instances, one-hot encoded, a numpy.ndarray object, shape (140, 7)
- ind.dataset_str.ty => labels of the test instances, one-hot encoded, a numpy.ndarray object, shape (1000, 7)
- ind.dataset_str.ally => labels corresponding to ind.dataset_str.allx, one-hot encoded, shape (1708, 7)
- ind.dataset_str.graph => the graph data, a collections.defaultdict object, in the format {index: [index_of_neighbor_nodes]}
- ind.dataset_str.test.index => ids of the test instances, 1000 lines (one id per line)

All of the above (except the plain-text test.index) are stored with Python's pickle module, as sketched below.
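The following is a minimal loading sketch for these files, assuming the standard pickled layout described above (the latin1 encoding handles files that were pickled under Python 2):

```python
import pickle
import numpy as np

def load_planetoid_file(path: str):
    """Load one ind.<dataset>.<suffix> file (sketch for the format above).

    x/tx/allx unpickle to scipy.sparse.csr_matrix, y/ty/ally to
    numpy.ndarray, and graph to a collections.defaultdict; test.index
    is a plain-text file with one test-node id per line."""
    if path.endswith("test.index"):
        return np.loadtxt(path, dtype=int)
    with open(path, "rb") as f:
        return pickle.load(f, encoding="latin1")  # files pickled in Python 2

# Example (the path is hypothetical):
# x = load_planetoid_file("data/ind.cora.x")   # csr_matrix, shape (140, 1433)
```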
</span></p><p><span style="font-size:16px"><b>2</b><b>、实验结果</b></span></p><p><span style="font-size:15px">实验结果在Cora,citeseer,pubmed三个数据上都进行DeepGNN测试,测试结果可以看出随着网络层级的加深,模型不仅没有像传统GNN出现Over-Smoothing而效果下降,反而模型效果随着深度增加而不断提升,解决了传统DeepGNN存在的Over-Smoothing问题。</span><span style="font-size:15px"/></p><p>
</p><div class="image-package"><img src="https://upload-images.jianshu.io/upload_images/26011021-657f024c92b571c3.jpeg" img-data="{"format":"jpeg","size":75066,"height":633,"width":474}" class="uploaded-img" style="min-height:200px;min-width:200px;" width="auto" height="auto"/>
</div><p/><p>
</p><p><span style="font-size:15px">GCNII汇报ppt版可通过关注公众号后回复关键词:</span><span style="font-size:15px"><strong>GCNII</strong></span><span style="font-size:15px"> 来获得,供学习者使用!有用的话就点个赞呗!</span></p><p>
</p><p><span style="font-size:18px"><strong>往期精彩</strong></span></p><p>
</p><p><span style="font-size:14px"/></p><p>【知识图谱系列】多关系神经网络CompGCN</p><p>各大AI研究院共35场NLP算法岗面经奉上</p><p><span style="font-size:14px">Tensorflow常用函数使用说明及实例简记</span></p><p><span style="font-size:14px">自己动手实现一个神经网络多分类器</span>
</p><p><span style="font-size:14px"/></p><p><span style="font-size:14px">干货 | NLP中的十个预训练模型</span></p><p><span style="font-size:14px">FastText原理和文本分类实战,看这一篇就够了</span></p><p><span style="font-size:14px">GPT,GPT2,Bert,Transformer-XL,XLNet论文阅读速递</span></p><p><span style="font-size:14px">Word2vec, Fasttext, Glove, Elmo, Bert, Flair训练词向量教程+数据+源码</span></p></div>