Exact fuzzy k-Nearest neighbor classification for big datasets

Maillo, Jesus and Luengo, Julian and García, Salvador and Herrera, Francisco (2017) Exact fuzzy k-Nearest neighbor classification for big datasets. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2017), 9-12 July 2017, Naples, Italy.


Abstract

The k-Nearest Neighbors (kNN) classifier is one of the most effective methods in supervised learning problems. It classifies unseen cases by comparing their similarity to the training data. However, it gives every labeled sample the same importance when classifying. There are several approaches to enhance its precision, with the Fuzzy k Nearest Neighbors (FuzzykNN) classifier being among the most successful ones. FuzzykNN computes a fuzzy degree of membership of each instance to the classes of the problem. As a result, it generates smoother borders between classes. Although a kNN approach for handling big datasets exists, there is no fuzzy variant able to manage that volume of data. Moreover, calculating the class memberships adds extra computational cost, making the method even less scalable to large datasets because of its memory needs and high runtime. In this work, we present an exact and distributed approach to run the FuzzykNN classifier on big datasets based on Spark, which provides the same precision as the original algorithm. It consists of two separate stages. The first stage transforms the training set by adding the class membership degrees. The second stage classifies the test set with the kNN algorithm, using the class memberships computed previously. In our experiments, we study the scaling-up capabilities of the proposed approach with datasets of up to 11 million instances, showing promising results.
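
The two stages described in the abstract can be illustrated with a small single-machine sketch. This is not the authors' Spark implementation: it assumes the commonly used Keller-style membership formula (0.51 for the sample's own class plus 0.49 times the neighbor fraction) and inverse-distance weighting with fuzziness parameter m, neither of which is spelled out in the abstract. The function names `k_nearest`, `membership_stage`, and `classify_stage` are hypothetical.

```python
import math
from collections import Counter


def k_nearest(query, points, k, skip=None):
    """Return (distance, index) pairs for the k points closest to `query`.

    Brute-force Euclidean search; `skip` excludes one index (the query itself).
    """
    dists = [(math.dist(query, p), i) for i, p in enumerate(points) if i != skip]
    dists.sort()
    return dists[:k]


def membership_stage(X, y, classes, k):
    """Stage 1: turn each crisp training label into class membership degrees.

    Assumed formula (Keller-style): the sample's own class gets
    0.51 + 0.49 * (share of its k neighbors in that class); every other
    class gets 0.49 * (share of neighbors in that class).
    """
    memberships = []
    for i, x in enumerate(X):
        neighbors = k_nearest(x, X, k, skip=i)
        counts = Counter(y[j] for _, j in neighbors)
        row = {}
        for c in classes:
            share = counts[c] / k
            row[c] = 0.51 + 0.49 * share if c == y[i] else 0.49 * share
        memberships.append(row)
    return memberships


def classify_stage(x, X, memberships, classes, k, m=2.0):
    """Stage 2: classify `x` from its k nearest neighbors' membership degrees.

    Each neighbor is weighted by 1 / d^(2 / (m - 1)); m = 2.0 is an assumed default.
    """
    neighbors = k_nearest(x, X, k)
    # Guard against zero distance when the query coincides with a training point.
    weights = [(1.0 / max(d, 1e-12) ** (2.0 / (m - 1.0)), j) for d, j in neighbors]
    total = sum(w for w, _ in weights)
    scores = {
        c: sum(w * memberships[j][c] for w, j in weights) / total for c in classes
    }
    return max(scores, key=scores.get), scores


# Toy data (made up for illustration): two well-separated clusters.
X_train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y_train = [0, 0, 0, 1, 1, 1]
classes = [0, 1]

memberships = membership_stage(X_train, y_train, classes, k=2)
label, scores = classify_stage((0.2, 0.2), X_train, memberships, classes, k=2)
```

In the sketch, the membership stage runs once over the training set, so the classification stage only needs a nearest-neighbor search plus a weighted average. The brute-force search used here is the part a distributed implementation would have to parallelize.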

Item Type: Conference or Workshop Item (Paper)
Schools/Departments: University of Nottingham, UK > Faculty of Science > School of Computer Science
Depositing User: Eprints, Support
Date Deposited: 16 Aug 2017 10:06
Last Modified: 17 Aug 2017 04:50
URI: http://eprints.nottingham.ac.uk/id/eprint/44937
