作者:Zhihong Deng 最近看了 Michael Bronstein 教授写的一篇博客,分析得挺好的,简单分享一下。https://towardsdatascience.com/do-we-need-deep-graph-neural-networks-be62d3ec5c59深度学习,特别是 CV 领域的模型,往往有数十上百层,与此相比,在图“深度学习”中(大部分工作都 ≤5 层,谈不上深,所以加个引号吧),大部分模型架构都是浅层的,设计深度模型到底有没有用呢?现有的一些工作告诉我们,训练深度图神经网络是很难的,除了深度学习的传统问题(梯度消失和过拟合)之外,针对图本身的特性,还需要克服另外两个问题:
最后但或许也很重要的一点就是评估方法,一些常见的基准数据集和方法未必能准确评估图神经网络的效果,我们观察到深度图网络在一些数据集上性能随深度下降,或许仅仅是因为数据集太小,发生了过拟合。斯坦福新推出的 Open Graph Benchmark 可以解决部分问题,它提高了大规模的图数据,并给定了训练和测试数据的划分方式。 [1] More precisely, over-smoothing makes node feature vector collapse into a subspace, see K. Oono and T. Suzuki, Graph neural networks exponentially loose expressive power for node classification (2019). arXiv:1905.10947, which provides asymptotic analysis using dynamic systems formalist.[2] Q. Li, Z. Han, X.-M. Wu, Deeper insights into graph convolutional networks for semi-supervised learning (2019). Proc. AAAI. Draws the analogy between the GCN model and Laplacian smoothing and points to the over-smoothing phenomenon.[3] H. Nt and T. Maehara, Revisiting graph neural networks: All we have is low-pass filters (2019). arXiv:1905.09550. Uses spectral analysis on graphs to answer when GCNs perform well.[4] U. Alon and E. Yahav, On the bottleneck of graph neural networks and its practical implications (2020). arXiv:2006.05205. Identified the over-squashing phenomenon in graph neural networks, which is similar to one observed in sequential recurrent models.