
Study on SVM Classifiers for Imbalanced Data Classification Using Quasi-Linear Kernel

Liang Peifeng, Waseda University

2021.07.29

Abstract

The classification problem is a key part of machine learning. In the real world, most classification problems are nonlinear and complicated. When building a classification model, the dataset is generally assumed to be balanced, and the target is to minimize a loss function so as to achieve high accuracy and good generalization. However, in most real-world applications the datasets suffer from imbalance, which causes the learned classification model to have a decision boundary that leans toward the minority class, so that many minority samples are misclassified. This results in poor global classification performance, since the minority class, which contains far fewer samples than the other class, is often much more important and valuable. Therefore, simply minimizing a loss function to achieve high accuracy is unsuitable for classifying imbalanced datasets. Imbalanced data classification has been a very hot research issue in machine learning. In this thesis, high-performance support vector machine (SVM) based classifiers are developed for imbalanced datasets by taking advantage of the quasi-linear SVM.
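To make the boundary-lean phenomenon concrete, here is a minimal, self-contained sketch (not taken from the thesis) that trains a plain linear SVM and a cost-weighted one on a synthetic 1000-vs-50 dataset; the Gaussian data, the class sizes, and the use of scikit-learn's `class_weight="balanced"` are all illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)
# 1000 majority vs. 50 minority samples drawn from overlapping Gaussians.
X = np.vstack([rng.normal(0.0, 1.0, (1000, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.array([0] * 1000 + [1] * 50)

# A plain SVM minimizes the overall loss, so its boundary drifts toward the minority region.
plain = SVC(kernel="linear").fit(X, y)
# Cost-sensitive weighting is one standard remedy (cf. the weighted SVM baseline in Chapter 2).
weighted = SVC(kernel="linear", class_weight="balanced").fit(X, y)

print("plain    F-score:", f1_score(y, plain.predict(X), zero_division=0))
print("weighted F-score:", f1_score(y, weighted.predict(X), zero_division=0))
```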

The quasi-linear SVM is a model based on a divide-and-conquer strategy. It uses a multi-local linear model to solve nonlinear classification problems. Unlike SVMs with standard kernel functions, which may be considered black-box models, the quasi-linear SVM first learns information about the data structure as prior knowledge and then builds a multi-local linear model, with an interpolation function or a piecewise linear model, to represent the nonlinear classification boundary. From the viewpoint of modeling, the quasi-linear SVM is a two-step modeling method for nonlinear classification, consisting of model building and model optimization. In the model building step, the information of the data structure is learned and represented as a model in linear regression form. In the model optimization step, an SVM formulation is applied to optimize the parameters of the model. From the viewpoint of SVM, the quasi-linear SVM is an SVM with a quasi-linear kernel, which involves kernel composition and SVM optimization. In the kernel composition step, the learned information is used to compose a kernel function; the second step can then be considered a common kernel SVM for further model optimization. Although the quasi-linear SVM is effective and flexible in classifying nonlinear datasets, it assumes balanced datasets, and simply applying it to imbalanced data classification will not achieve desirable results. Therefore, the goal of this thesis is to develop high-performance SVM classifiers for imbalanced datasets in three different scenarios by taking advantage of the quasi-linear SVM: 1) a quasi-linear SVM classifier with local offset adjustment for solving the within-class imbalance problem; 2) a quasi-linear SVM classifier with oversampling in feature space to avoid overlapping problems; 3) a quasi-linear SVM classifier for one-class classification with high performance.
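As a rough illustration of the two-step idea, the sketch below composes a quasi-linear kernel as a gated sum of local linear kernels and hands it to a standard SVM as a precomputed kernel. The gate definition (Gaussian memberships over k-means centres) and all parameter values are illustrative assumptions, not the thesis's partitioning method.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def gate_signals(X, centers, beta=1.0):
    """Soft membership of each sample to each local linear partition."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    g = np.exp(-beta * d2)
    return g / g.sum(axis=1, keepdims=True)

def quasi_linear_kernel(X1, X2, centers, beta=1.0):
    """K(x, z) = sum_i g_i(x) g_i(z) * (x.z + 1): a gated sum of local linear kernels."""
    G1 = gate_signals(X1, centers, beta)
    G2 = gate_signals(X2, centers, beta)
    return (G1 @ G2.T) * (X1 @ X2.T + 1.0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (np.sin(2.0 * X[:, 0]) > X[:, 1]).astype(int)       # a nonlinear boundary

# Step 1 (model building): learn the data structure, here crudely via k-means.
centers = KMeans(n_clusters=5, n_init=10, random_state=0).fit(X).cluster_centers_
# Step 2 (model optimization): a standard SVM with the composed kernel.
K = quasi_linear_kernel(X, X, centers)
clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```

Note that the composed function is a valid kernel: the gate term is an inner product of gate vectors and the linear term is positive semidefinite, and a product of kernels is a kernel.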

The dissertation contains five chapters as follows:

Chapter 1 briefly describes the problems of imbalanced data classification and the SVM with a quasi-linear kernel. Different problems in imbalanced data classification are discussed, such as the within-class imbalance problem, the overlapping problem caused by the SMOTE method, and the one-class classification problem.

Chapter 2 develops a quasi-linear SVM classifier with local offset adjustment for solving the within-class imbalance problem by leveraging the multiple local linear models embedded in the quasi-linear SVM. Within-class imbalance problems often occur in imbalanced data classification; they worsen the imbalanced distribution and increase the learning complexity. However, most existing methods for imbalanced data focus only on rectifying the between-class imbalance problem, which is insufficient and inappropriate in many scenarios. The proposed method addresses this by developing a simple yet effective SVM classifier with local offset adjustment for imbalanced classification problems. First, a geometry-based partitioning method is modified for imbalanced datasets to divide the input space into multiple linearly separable partitions along the potential separation boundary. Then an F-measure SVM is applied to estimate local offsets optimized in each local linear partition. Finally, by composing a quasi-linear kernel based on the partitioning information, a quasi-linear SVM classifier with local offsets is constructed. On 14 real-world datasets, the proposed method outperforms traditional imbalanced classification methods that do not consider the within-class imbalance problem, such as the weighted SVM (WSVM) and the SVM with weighted harmonic mean (WHM) offset (win/tie/lose = 14/0/0 and 13/0/1 for the F-score and Gmean indexes). Meanwhile, the proposed method performs better than previous solutions (local clustering with sampling methods), with win/tie/lose = 12/0/2 and 10/0/4 on the overall evaluation metrics F-score and Gmean, respectively.
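The following minimal sketch conveys only the local-offset idea: hard gate signals from a k-means partition add a per-partition offset b_i to a base SVM score, and each offset is tuned by a crude grid search on the training F-score. The partitioning, the grid search, and the synthetic data are illustrative stand-ins for the geometry-based partitioning and the F-measure SVM described above.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC
from sklearn.metrics import f1_score

rng = np.random.default_rng(1)
# Minority class split into two sub-clusters of different sizes (30 vs. 10):
# a toy case of within-class imbalance on top of the 600-vs-40 between-class imbalance.
X = np.vstack([rng.normal(0.0, 1.0, (600, 2)),
               rng.normal(2.5, 0.6, (30, 2)),
               rng.normal(-2.5, 0.3, (10, 2))])
y = np.array([0] * 600 + [1] * 40)

clf = LinearSVC(dual=False).fit(X, y)                 # base classifier
km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X)
G = np.eye(4)[km.labels_]                             # hard gate signals, one column per partition

def predict(b_vec):
    """f(x) = base SVM score + sum_i g_i(x) * b_i, thresholded at zero."""
    return (clf.decision_function(X) + G @ b_vec > 0).astype(int)

offsets = np.zeros(4)
grid = np.linspace(-2.0, 2.0, 41)
for i in range(4):                                    # tune one local offset at a time
    offsets[i] = max(grid, key=lambda b: f1_score(
        y, predict(np.where(np.arange(4) == i, b, offsets)), zero_division=0))

print("F-score without offsets:", f1_score(y, predict(np.zeros(4)), zero_division=0))
print("F-score with offsets   :", f1_score(y, predict(offsets), zero_division=0))
```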

Chapter 3 develops a quasi-linear SVM classifier with oversampling in a multi-linear feature space (MLFS) to solve the overlapping problem caused by implementing SMOTE. SMOTE is a useful and effective oversampling method that balances datasets by creating synthetic minority samples using linear interpolation. However, when dealing with nonlinear imbalanced datasets, SMOTE may create wrong samples that fall inside the majority class, causing the decision boundary to spread further into the majority class. To solve this problem, an MLFS is built based on the quasi-linear kernel, which is composed from a pretrained neural network. Since data points are separated linearly in the MLFS, implementing SMOTE in the MLFS does not cause the overlapping problem. By using the quasi-linear kernel, the proposed oversampling method avoids computing Euclidean distances among the samples directly when mapping the samples to the MLFS and oversampling the minority class there, which makes it easy to apply to high-dimensional datasets. The proposed method uses unsupervised learning to pretrain the neural network, which makes it possible to ignore the imbalance problem at the pretraining stage. Finally, in order to oversample the minority class quickly and effectively, the proposed method implements SMOTE at the kernel level to avoid computing very high-dimensional feature vectors directly. On 18 real-world datasets, the proposed method is very competitive and outperforms the traditional SMOTE method on the overall evaluation metrics F-score and Gmean, with win/tie/lose = 18/0/0 and 12/0/6. Compared with other previous solutions, the proposed method also achieves better performance, with win/tie/lose = 16/0/2 and 14/0/4 for the F-score and Gmean indexes on average.
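To show what "SMOTE at the kernel level" means, the sketch below treats a synthetic point as a convex combination φ(z) = (1−δ)φ(x_i) + δφ(x_j) in feature space, so its kernel values against all training points follow by bilinearity and no explicit feature vector is ever formed. An RBF kernel and random minority pairs stand in for the quasi-linear kernel and SMOTE's nearest-neighbour selection; both are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
X_min = rng.normal(2.0, 0.5, size=(20, 5))              # minority samples
X_all = np.vstack([rng.normal(0.0, 1.0, (200, 5)), X_min])

K_min_all = rbf_kernel(X_min, X_all)                    # K(x_i, x) for every minority x_i

def synthetic_kernel_rows(n_new):
    """Kernel values between synthetic minority points and all training points.
    phi(z) = (1-d) phi(x_i) + d phi(x_j)  =>  K(z, x) = (1-d) K(x_i, x) + d K(x_j, x),
    so synthetic rows are built purely from existing kernel entries."""
    i = rng.integers(0, len(X_min), n_new)              # seed minority sample
    j = rng.integers(0, len(X_min), n_new)              # partner (random here; SMOTE uses k-NN)
    d = rng.random(n_new)[:, None]                      # interpolation ratio in [0, 1)
    return (1 - d) * K_min_all[i] + d * K_min_all[j]

K_syn = synthetic_kernel_rows(50)                       # 50 synthetic rows, no feature vectors
print(K_syn.shape)                                      # (50, 220)
```

Because only kernel entries are combined, the cost of generating a synthetic row is independent of the feature-space dimension, which is exactly why this formulation scales to high-dimensional data.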

Chapter 4 develops a high-performance quasi-linear SVM classifier for one-class classification by building a piecewise linear model in feature space. The proposed classifier builds a piecewise linear separation boundary in the feature space to separate the data from the origin, so as to capture a more compact region in the input space. A one-class classifier with a more compact region decreases the probability of outlier objects falling inside the domain of the classifier and thus achieves better performance. For this purpose, the input space is first divided into a group of partitions by using the partitioning mechanism of an s% winner-take-all autoencoder. A gated linear network is then designed to implement a group of linear classifiers, one for each partition, in which the gate signals are generated from the autoencoder. By applying a one-class SVM formulation to optimize the parameter set of the gated linear network, the one-class classifier is implemented in exactly the same way as a standard one-class SVM with a quasi-linear kernel, composed using a base kernel together with the gate signals. Numerical experiments on various real-world datasets demonstrate the effectiveness of the proposed method. On 14 real-world datasets, the classification results of the proposed one-class classifier are win/lose = 13/1 and 14/0 for the F-score and accuracy indexes compared with the traditional one-class SVM.
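A minimal sketch of this construction, under simplifying assumptions: gate signals (here from k-means memberships, standing in for the winner-take-all autoencoder's gates) compose a quasi-linear kernel with a linear base kernel, which is then passed to a standard one-class SVM as a precomputed kernel.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (150, 2)),          # normal data with two modes
               rng.normal(5.0, 1.0, (150, 2))])

centers = KMeans(n_clusters=4, n_init=10, random_state=0).fit(X).cluster_centers_

def gates(A, beta=1.0):
    """Soft gate signals: normalised Gaussian memberships to each partition centre."""
    d2 = ((A[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    g = np.exp(-beta * d2)
    return g / g.sum(axis=1, keepdims=True)

def qlk(A, B):
    """Quasi-linear kernel: gated sum of local linear kernels, K = (G_A G_B^T) * (A B^T + 1)."""
    return (gates(A) @ gates(B).T) * (A @ B.T + 1.0)

ocsvm = OneClassSVM(kernel="precomputed", nu=0.05).fit(qlk(X, X))
X_out = rng.normal(2.5, 0.3, size=(20, 2))              # outliers between the two modes
print("inliers accepted :", (ocsvm.predict(qlk(X, X)) == 1).mean())
print("outliers accepted:", (ocsvm.predict(qlk(X_out, X)) == 1).mean())
```

The gated kernel lets each partition carry its own local linear boundary, which is what allows the accepted region to stay compact around multi-modal normal data instead of enclosing the gap between the modes.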

Chapter 5 concludes the dissertation. In this thesis, by taking advantage of the quasi-linear SVM, three high-performance SVM classifiers are developed for imbalanced data classification problems: a quasi-linear SVM classifier with local offset adjustment for solving within-class imbalance, a quasi-linear SVM classifier with oversampling in feature space to avoid overlapping problems, and a quasi-linear SVM classifier for one-class classification. Numerical simulation results demonstrate the effectiveness of the proposed SVM classifiers for imbalanced data classification.
