Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores

Peng, Xiangjun, Wang, Qingfeng, Sun, Xu, Gong, Chunye and Wang, Yaohua (2019) Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores. In: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 5-7 Dec. 2019, Gold Coast, Australia.

Available under Licence Creative Commons Attribution.

Abstract

The User-Item (U-I) matrix has been the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption in runtime and storage, caused by data sparsity and the growing need to accommodate side information in CF design, one needs to go beyond the U-I matrix. In this paper, we present a case study of Succinct Representations in Collaborative Filtering as an alternative to the U-I matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure for CF. To this end, we implemented a User-based K-Nearest-Neighbor CF prototype on a Wavelet Tree, by first designing Accessible Compressed Documents (ACD) to compress U-I data in a Wavelet Tree, which is efficient in both storage and runtime. We then showed that ACD can be used to develop an efficient intersection algorithm that works without decompression, by taking advantage of ACD's characteristics. We evaluated our design on 1,000 cores of the Tianhe-II supercomputer, with one of the largest public data sets, ml-20m. The results showed that our prototype took 3.7 minutes on average to deliver results.
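
As a rough illustration of the idea described in the abstract, the following Python sketch stores the concatenated item lists of all users in a small pointer-based wavelet tree and finds the items two users have in common using only access and rank queries, without rebuilding the item lists. It is not the paper's ACD layout or its intersection algorithm: the class WaveletTree, the function common_items, the offsets array and the toy data are all assumptions made for this example, and prefix-count lists stand in for the compressed bit vectors a real succinct implementation would use.

```python
class WaveletTree:
    """Illustrative wavelet tree over a sequence of integers (not succinct)."""

    def __init__(self, seq, lo=None, hi=None):
        seq = list(seq)
        if lo is None:
            lo = min(seq) if seq else 0
        if hi is None:
            hi = max(seq) if seq else lo
        self.lo, self.hi, self.n = lo, hi, len(seq)
        self.left = self.right = None
        self.prefix = None  # stays None on leaves and empty nodes
        if lo == hi or not seq:
            return
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i symbols go to the left child
        self.prefix = [0]
        for s in seq:
            self.prefix.append(self.prefix[-1] + (1 if s <= mid else 0))
        self.left = WaveletTree([s for s in seq if s <= mid], lo, mid)
        self.right = WaveletTree([s for s in seq if s > mid], mid + 1, hi)

    def access(self, i):
        """Return the symbol stored at position i."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        if self.prefix is None:
            return self.lo
        if self.prefix[i + 1] - self.prefix[i] == 1:  # symbol went left
            return self.left.access(self.prefix[i])
        return self.right.access(i - self.prefix[i])

    def rank(self, symbol, i):
        """Count occurrences of symbol among the first i positions."""
        if symbol < self.lo or symbol > self.hi or i <= 0 or self.n == 0:
            return 0
        i = min(i, self.n)
        if self.prefix is None:  # leaf: every symbol here equals lo
            return i
        mid = (self.lo + self.hi) // 2
        if symbol <= mid:
            return self.left.rank(symbol, self.prefix[i])
        return self.right.rank(symbol, i - self.prefix[i])


# Toy data (hypothetical): the items each user has rated.
ratings = {0: [1, 3, 4], 1: [3, 4, 7], 2: [2]}

# Concatenate the item lists; offsets[u] marks where user u's items begin.
sequence, offsets = [], [0]
for user in sorted(ratings):
    sequence.extend(ratings[user])
    offsets.append(len(sequence))

wt = WaveletTree(sequence)


def common_items(a, b):
    """Items rated by both users, found with access and rank queries only."""
    start_b, end_b = offsets[b], offsets[b + 1]
    shared = []
    for pos in range(offsets[a], offsets[a + 1]):
        item = wt.access(pos)
        if wt.rank(item, end_b) - wt.rank(item, start_b) > 0:
            shared.append(item)
    return shared


print(common_items(0, 1))
```

In a user-based K-Nearest-Neighbor setting, the size of such an intersection could feed a similarity score between two users; which similarity measure the prototype uses is not stated in the abstract.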

Item Type: Conference or Workshop Item (Paper)
Keywords: Succinct Data Structures; Collaborative Filtering; Supercomputing
Schools/Departments: University of Nottingham Ningbo China > Faculty of Business > Nottingham University Business School China
University of Nottingham Ningbo China > Faculty of Science and Engineering > School of Computer Science
University of Nottingham Ningbo China > Faculty of Science and Engineering > Department of Mechanical, Materials and Manufacturing Engineering
Identification Number: https://doi.org/10.1109/PDCAT46702.2019.00083
Depositing User: Wu, Cocoa
Date Deposited: 08 May 2020 01:05
Last Modified: 08 May 2020 01:05
URI: https://eprints.nottingham.ac.uk/id/eprint/60535
