Isaac Scientific Publishing

Frontiers in Signal Processing

Improved Dynamic Routing Algorithm for Information Aggregation

PP. 17 - 26, Pub. Date: January 31, 2021

DOI: 10.22606/fsp.2021.51003

Author(s)

  • Gongbin Chen
    Key Laboratory of Electronic and Information Engineering, Southwest Minzu University, State Ethnic Affairs Commission, Chengdu, China
  • Wei Xiang
    Key Laboratory of Electronic and Information Engineering, Southwest Minzu University, State Ethnic Affairs Commission, Chengdu, China
  • Yansong Deng*
    Key Laboratory of Electronic and Information Engineering, Southwest Minzu University, State Ethnic Affairs Commission, Chengdu, China

Abstract

Information aggregation is an essential component of text encoding, yet it has received comparatively little attention. Pooling-based aggregation (max or average pooling) is a bottom-up, passive method that discards much important information. Recently, the attention mechanism and the dynamic routing policy have each been used separately to aggregate information, but their aggregation capabilities can be improved further. In this paper, we propose a novel aggregation method that combines the attention mechanism with dynamic routing, strengthening information aggregation and improving the quality of text encoding. We further design a novel Leaky Natural Logarithm (LNL) squash function to alleviate the "saturation" problem of the squash function in the original dynamic routing, and we add Layer Normalization to the routing policy to speed up its convergence. A series of experiments on five text classification benchmarks shows that our method outperforms other aggregation methods.
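To make the "saturation" problem concrete, the sketch below implements the original squash function of Sabour et al. (2017), which the abstract's LNL variant is designed to improve; the LNL function itself is the paper's contribution and is not reproduced here. The scaling factor ||s||²/(1 + ||s||²) approaches 1 for long input vectors, so the output length becomes nearly insensitive to the input norm.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Original squash from dynamic routing between capsules:
    shrinks a capsule vector s to length in [0, 1) while
    preserving its direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)  # saturates toward 1 as ||s|| grows
    return scale * s / np.sqrt(sq_norm + eps)

# Saturation: once ||s|| is large, the output length is almost
# constant, so the function barely discriminates between inputs.
short_out = squash(np.array([0.1, 0.0]))   # output length ~ 0.01
long_out = squash(np.array([100.0, 0.0]))  # output length ~ 0.9999
```

This flat response at large norms is what motivates replacing the scaling term with a slower-saturating, log-based alternative such as the LNL squash proposed in the paper.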

Keywords

information aggregation, dynamic routing, attention, squash function, text classification
