論文の公開元へ

書き出し

Refer/BibIX

RIS

BibTeX

TSV

グラフ類似性測定のためのグラフ局所構造を考慮したGromov-Wasserstein距離

浅見莉絵子早稲田大学

2021.03.15

概要

機械学習で使用するデータの多くは，ツリー，グループ，クラスター，シーケンスなどの構造化データである．そして多くの構造化データはグラフ構造として表現することができる [1]．図 1.1 にグラフ構造の例を示す．グラフ構造はノードと，ノード間のつながりや経路を表すエッジから構成される．近年グラフ構造は，ソーシャルネットワークから医療工学 [2]，バイオインフォマティクス [3]，化学 [4]，および天文学まで，多くの分野で注目されている．例えばソーシャルネットワークでは，ユーザー同士の繋がりをグラフ構造で表すことができる．これは現実世界における学校の交友関係や企業の同僚との関係を表す．ソーシャルネットワークのグラフを分析することにより，ユーザーの分類，コミュニティの検出，友達の推薦などを行うことができる [5]．また化学における分子化合物では，各原子をノードとし，原子同士の結合をエッジとするグラフ構造表現によって記述できる．さらに，タンパク質の相互作用をグラフ構造で表現すると，ノードがタンパク質に対応し，エッジが相互作用するタンパク質間のリンクを表す．このように，グラフ構造表現は多くのアプリケーションで構造化データを記述する手段として有効であり，グラフ表現を用いることで数学的処理が可能となる．本論文では，グラフ構造を持ったデータをグラフデータと呼ぶ．

近年，機械学習を用いたデータ解析が注目されているが，ベクトル値を持つデータに特化した既存の機械学習手法では，グラフデータに対するパフォーマンスが大幅に低下する場合がある．機械学習をグラフデータに対して効果的に適用するためには，機械学習アルゴリズムに対してノードとエッジに関連付けられたグラフ構造を活用することが重要である．

グラフデータに対する機械学習，統計，コンピュータビジョンの基本的な問題を表す一例として，Graph Matching（GM）がある [6]．GM では，グラフデータのペア間でノードやエッジの不一致を最小限に抑えるような，最適なマッチングパラメータを探して調整する．これは，理論的にも実際的にも難しい問題であり，NP 完全問題 [7] としてよく知られている 2 次割り当て問題になる．Lawler QAP [8] と Koopmans Beckmann QAP [9] の二つはよく使われている定式化だが，計算が非常に複雑である．しかし GM 問題は，コンピュータビジョン [10]，ネットワーク分析 [11]，バイオインフォマティクス，生物学 [12]，データマイニングなどの多くのアプリケーションで頻繁に発生する．例として，マルウェアサンプルの検出が挙げられる [11]．グラフ構造表現でサンプルをコーディングすると，未知のサンプルに対する既知のマルウェアサンプルやクリーンサンプルとの比較を検出の難易度として，GM 問題と同様に再定式化できる．また，データベースの中の既知のタンパク質に対するグラフ表現されたタンパク質の分類では，実験で得られた新しいタンパク質を識別することができる．

上記のような GM 問題は，一般的に二つのカテゴリに分類される．一つ目は，1 対 1 マッピングとも呼ばれる Exact Graph Matching であり，ノードの数が同じ二つのグラフ間の厳密な対応が必要である．二つ目は Inexact Graph Matching で，多対多マッピングまたはベストグラフマッチングとも呼ばれる．Inexact Graph Matching では Exact Graph Matching が緩和され，複数のノードをマージすることで大きさの違うグラフをマッチングできるようになる．したがって，Inexact Graph Matching はノードの数が異なるグラフ間のマッチングに適用でき，より現実世界に近いデータに対応することができる．GM，グラフ分類，グラフクラスタリングなどのグラフベースの問題で扱うグラフ類似性を検討すると，ノード間のペアワイズ距離の類似性行列または共分散行列が，固有の構造情報を用いるために利用できる最も効果的な手法である．これらの手法は，着目するノードを絶対的な方法で表現するのではなく，残りのノードに対する相対的な方法で表現する．既存研究では，Random Walk [13, 14] および Shortest Path [15]，Subtrees [4, 14]，Cycles [16]，Graphlets [17]，Pyramid Matched Graphs [18] に基づくカーネルがあり，二つのグラフ構造化データ間の類似性を測定するグラフカーネルは，効果的なアプローチである [19]．さらに，Weisfeiler–Lehman 手法 [20] を使用してパフォーマンスを改善することができる．このアプローチの顕著な利点は，取得したカーネルを Support Vector Machine（SVM）などの既存の機械学習手法にプラグインできることである．他にも，グラフ畳み込みネットワークは，エンドツーエンドでグラフ上の機能の最適な組み合わせを学習する手法であり既存手法よりも優れている [1, 21–24]．また，最適輸送 [25, 26] に基づく手法では，グラフデータに確率測度を与える．例えば，最適輸送を用いた Gromov– Wasserstein (GW) 距離 [27–29] では，二つの異なる距離空間において，空間内のペアワイズ距離を用いて，二つの確率測度間の距離を計算する．しかし，いずれの手法でも完全な GM，グラフ分類，グラフクラスタリングを行うには至っていない．そこで本研究では GW に基づき，さらに構造の類似性を表す新たなパラメータを検討した．

論文の公開元へ

参考文献

[1] Z. Wu, S. Pan, F. Chen, G. Long, C. Zhang, and P. S. Yu. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2020.

[2] T. M. Lehmann, M. O. Gu¨ld, C. Thies, B. Fischer, K. Spitzer, D. Keysers, H. Ney, M. Kohnen, H. Schubert, and B. B. Wein. Content-based image retrieval in medical applications. Methods of Information in Medicine, Vol. 43, No. 4, pp. 354–361, 2004.

[3] D. J. Jacobs, L. A. Rader, A. J. Kuhn, and M. F. Thorpe. Protein flexibility predictions using graph theory. Proteins, Vol. 44, No. 2, pp. 150–165, 2001.

[4] P. Mah´e and J.-P. Vert. Graph kernels based on tree patterns for molecules. Machine learning, Vol. 75, No. 1, pp. 3–35, 2009.

[5] H. Cai and K. C. Zheng, V. W.and Chang. A comprehensive survey of graph embedding: Problems, techniques and applications. IEEE Transactions on Knowledge and Data Engineering, 2018.

[6] F. Emmert-Streib, M. Dehmer, and Y. Shi. Fifty years of graph matching, network alignment and network comparison. Information Sciences, Vol. 346–347, pp. 180–197, 2016.

[7] E. M. Loiola, N. M. M. de Abreu, P. O. Boaventura-Netto, H. Peter, and T. Querido. A survey for the quadratic assignment problem. European Journal of Operational Research, Vol. 176, No. 2, pp. 657–690, 2007.

[8] Eugene L. Lawler. The quadratic assignment problem. Management Science, Vol. 9, pp. 586–599, 1963.

[9] T. C. Koopmans and M. J. Beckmann. Assignment problems and the location of eco- nomic activities. Econometrica, Vol. 25, pp. 53–76, 1957.

[10] S. Belongie, J. Malik, and J. Puzicha. Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), Vol. 24, No. 4, 2002.

[11] S. Wang, Z. Chen, X. Yu, L. Ding, J. Ni, L.-A. Tang, J. Gui, Z. Li, H. Chen, and P. S. Yu. Heterogeneous graph matching networks for unknown malware detection. In Proceedings of the Twenty-Eight International Joint Conference on Artiﬁcial Intelligence (IJCAI), 2019.

[12] B. Sch¨olkopf, K. Tsuda, and J.-P. Vert. Kernel Methods in Computational Biology. The MIT Press, 2004.

[13] H. Kashima, K. Tsuda, and A. Inokuchi. Marginalized kernels between labeled graphs. In 20th Internatioal Conference in Machine Learning (ICML), 2003.

[14] T. G¨artner, P. Flach, and S. Wrobel. On graph ker- nels: Hardness results and efficient alternatives. In 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel, 2003.

[15] K. M. Borgwardt and H.-P. Kriegel. Shortest-path kernels on graphs. In 5th International Conference on Data Mining, Vol. 5, pp. 74–81, 2005.

[16] T. Horvath, T. Gartner, and S. Wrobel. Cyclic pattern kernels for predictive graph mining. In 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2004.

[17] N. Shervashidze, S.V.N. Vishwanathan, T. H. Petri, K. Mehlhorn, and K. M. Borgwardt. Efficient graphlet kernels for large graph comparison. In International Conference on Artiﬁcial Intelligence and Statistics (AISTATS), 2009.

[18] K. Grauman and T. Darrell. The pyramid match kernel: Efficient learning with sets of feature. The Journal of Machine Learning Research, Vol. 8, pp. 725–760, 2007.

[19] N. M. Kriege, F. D. Johansson, and C. Morris. A survey on graph kernels. Applied Network Science, Vol. 5, No. 6, 2020.

[20] N. Shervashidze, P. Schweitzer, E. J. v Leeuwen, K. Mehlhorn, and K. M. Borgwardt. Weisfeiler-lehman graph kernels. The Journal of Machine Learning Research, Vol. 12, pp. 2539–2561, 2011.

[21] M. M. Bronstein, J. Bruna, Y. LeCun, A. Szlam, and P. Vandergheynst. Geometric deep learning: going beyond euclidean data. IEEE Signal Processing Magazine, Vol. 34, No. 4, pp. 18–42, 2017.

[22] T. N. Kipf and M. Welling. Semi-supervised classification with graph convolutional networks. arXiv preprint: arXiv:1609.02907, 2017.

[23] M. Defferrard, X. Bresson, and P. Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Process- ing Systems (NIPS), 2016.

[24] M. Niepert, M. Ahmed, and K. Kutzkov. Learning convolutional neural networks for graphs. In Thirty-third International Conference on Machine Learning (ICML), 2016.

[25] Cedric. Villani. Optimal transport: Old and new. Springer, New York, 2008.

[26] G. Peyr´e and M. Cuturi. Computational optimal transport. Foundations and Trends in Machine Learning, Vol. 11, No. 5-6, pp. 355–607, 2019.

[27] Facundo M´emoli. On the use of gromov–Hausdorff distances for shape compasrison. In M. Botsch, R. Pajarola, B. Chen, and M. Zwicker, editors, Eurographics Symposium on Point-Based Graphics. The Eurographics Association, 2007.

[28] Facundo M´emoli. Gromov–Wasserstein distances and the metric approach to object matching. Found. Comput. Math., Vol. 11, pp. 417–487, 2011.

[29] G. Peyr´e, M. Cuturi, and J. Solomon. Gromov-Wasserstein averaging of kernel and dis- tance matrices. In Thirty-third International Conference on Machine Learning (ICML), 2016.

[30] T. Vayer, L. Chapel, R. Flamary, R. Tavenard, and N. Courty. Optimal transport for structured data with application on graphs. In Thirty-sixth International Conference on Machine Learning (ICML), 2019.

[31] H. Xu, D. Luo, H. Zha, and L. Carin. Gromov-wasserstein learning for graph matching and node embedding. In Thirty-sixth International Conference on Machine Learning (ICML), 2019.

[32] H. Xu, D. Luo, and L. Carin. Scalable Gromov-Wasserstein learning for graph parti- tioning and matching. In 33rd Conference on Neural Information Processing Systems (NeurIPS), 2019.

[33] C. Bunne, D. Alvarez-Melis, A. Krause, and S. Jegelka. Learning generative models across incomparable spaces. In Thirty-sixth International Conference on Machine Learn- ing (ICML), 2019.

[34] Y. Yan, W. Li, H. Wu, H. Min, M. Tan, and Q. Wu. Semi-supervised optimal transport for heterogeneous domain adaptation. In Proceedings of the Twenty-Seventh Interna- tional Joint Conference on Artiﬁcial Intelligence (IJCAI), 2018.

[35] Y. Lipman, R. M. Rustamov, and T. A. Funkhouser. Biharmonic distance. ACM Trans- actions on Graphics (TOG), Vol. 29, No. 3, p. 27, 2010.

[36] T. Kataoka and A. Inokuchi. Hadamard code graph kernels for classifying graphs. 5th International Conference on Pattern Recognition Applications and Methods, pp. 24–32, 2016.

[37] S. Hido and H. Kashima. A linear-time graph kernel. 9th International Conference on Data Mining, pp. 179–188, 2009.

[38] John C. Gower. A general coefficient of similarity and some of its properties. Biometrics, pp. 857–871, 1971.

[39] M. Sugiyama and K. M. Borgwardt. Halting in random walk kernels. In Advances in Neural Information Processing Systems, pp. 1639–1647, 2015.

[40] R. Sinkhorn and P. Knopp. Concerning nonnegative matrices and doubly stochastic matrices. Paciﬁc Journal of Mathematics, Vol. 21, No. 2, pp. 343–348, 1967.

[41] Marco. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Annual Conference on Neural Information Processing Systems (NIPS), 2013.

[42] J. D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyr´e. Iterative bregman projec- tions for regularized transportation problems. SIAM Journal on Scientiﬁc Computing,, Vol. 37, No. 2, pp. 1111–A1138, 2015.

[43] V. Seguy and M. Cuturi. Principal geodesic analysis for probability measures under the optimal transport metric. In Advances in Neural Information Processing Systems (NIPS), 2015.

[44] A. Rolet, M. Cuturi, and G. Peyr´e. Fast dictionary learning with a smoothed wasserstein loss. In Nineteenth International Conference on Artiﬁcial Intelligence and Statistics (AISTATS), 2016.

[45] N. Courty, R. Flamary, and D. Tuia. Domain adaptation with regularized optimal transport. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML), 2014.

[46] M. Cuturi and A. Doucet. Fast computation of wasserstein barycenters. In Thirty-ﬁrst International Conference on Machine Learning (ICML), 2014.

[47] J. Solomon, G. Peyr´e, V. Kim, and S. Sra. Entropic metric alignment for correspondence problems. ACM Transactions on Graphics (TOG), Vol. 35, No. 4, p. 72, 2016.

[48] D. Burago, Y. Burago, and S. Ivanov. A Course in Metric Geometry, Vol. 33 of AMS Graduate Sudies in Math. American Mathematical Society, 2001.

[49] H. Petric Maretic, M. El Gheche, G. Chierchia, and P. Frossard. GOT: An optimal trans- port framework for graph comparison. In Advances in Neural Information Processing Systems (NeurIPS), 2019.

[50] H. Petric Maretic, M. El Gheche, G. Chierchia, and P. Frossard. Wasserstein-based graph alignment. arXiv preprint: arXiv:2003.06048, 2020.

[51] K. Riesen and H. Bunke. Iam graph database repository for graph based pattern recog- nition and machine learning. SSPRSPR, Vol. 5342, pp. 287–297, 2008.

[52] A. K. Debnath, R. L. Lopez de Compadre, G. Debnath, A. J. Shusterman, and C. Han- sch. Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Vol. 2, pp. 786–797, 1991.

[53] N. Kriegel and P. Mutzel. Subgraph matching kernels for attributed graphs. International Conference on Machine Learning, 2012.

[54] J. Sutherland, L. A.O ’Brien, D. F. Weaver. Spline-fitting with a genetic algorithm: a method for developing classification structure-activity relationships. J. Chem. Inf. Comput. Sci., Vol. 43, pp. 1906–1915, 2003.

[55] P. Yanardag and S. V. N. Vishwanathan. Deep Graph Kernels. In 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1365–1374. ACM Press, 2015.

[56] L. Oettershagen, N. Kriege, C. Morris, and P. Mutzel. Temporal graph kernels for classifying dissemination processes. In SIAM International Conference on Data Mining (SDM), 2020.

[57] L. Isella, J. Stehl´e, A. Barrat, C. Cattuto, J.-F. Pinton, and W. Van den Broeck. What’s in a crowd? analysis of face-to-face behavioral networks. Journal of Theoretical Biology, Vol. 271, No. 1, pp. 166–180, 2011.

[58] N. Eagle and A. Pentland. Reality mining: Sensing complex social systems. Personal Ubiquitous Computing, Vol. 10, No. 4, pp. 255–268, 2006.

[59] S. Pan, J. Wu, X. Zhu, G. Long, and C. Zhang. Task sensitive feature exploration and learning for multi-task graph classification. IEEE Transactions on Cybernetics, Vol. 47, No. 3, pp. 744–758, 2017.

[60] M. Neumann, N. Patricia, R. Garnett, and K. Kersting. Efficient graph kernels by randomization. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML), 2012.

[61] Uirike. von. Luxburg. A tutorial on spectral clustering. Statistics and Computing, Vol. 17, pp. 395–416, 2007.

[62] G. Siglidis, G. Nikolentzos, S. Limnios, C. Giatsidis, K. Skianis, and M. Vazirgianis. GraKeL: A Graph Kernel Library in Python. Journal of Machine Learning Research, Vol. 21, pp. 1–5, 2020.

参考文献をもっと見る

分野

大学

学位論文種類・取得年

言語

グラフ類似性測定のためのグラフ局所構造を考慮したGromov-Wasserstein距離

概要

関連論文

関係データの連続的表現に関する研究

LISPプログラミングによるグラフ理論の諸問題の効率的なアルゴリズム

Clustered lassoを用いたスパース共クラスタリング

Discrete hypergroups derived from graphs

ベッチ数を指定した連結2部グラフの総数とその応用

参考文献

分野

大学

学位論文種類・取得年

言語

コピーが完了しました

URLをコピーしました

グラフ類似性測定のためのグラフ局所構造を考慮したGromov-Wasserstein距離

概要

関連論文

関係データの連続的表現に関する研究

LISPプログラミングによるグラフ理論の諸問題の効率的なアルゴリズム

Clustered lassoを用いたスパース共クラスタリング

Discrete hypergroups derived from graphs

ベッチ数を指定した連結2部グラフの総数とその応用

参考文献