Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores
Peng, Xiangjun, Wang, Qingfeng, Sun, Xu, Gong, Chunye and Wang, Yaohua (2019) Succinct Representations in Collaborative Filtering: A Case Study using Wavelet Tree on 1,000 Cores. In: 2019 20th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT), 5-7 Dec. 2019, Gold Coast, Australia.
Official URL: http://dx.doi.org/10.1109/PDCAT46702.2019.00083
Abstract: The User-Item (U-I) matrix has been the dominant data infrastructure of Collaborative Filtering (CF). To reduce space consumption at runtime and in storage, which is driven by data sparsity and by the growing need to accommodate side information in CF design, one needs to go beyond the U-I matrix. In this paper, we present a case study of Succinct Representations in Collaborative Filtering as an alternative to the U-I matrix. Our key insight is to introduce Succinct Data Structures as a new infrastructure for CF. To this end, we implemented a User-based K-Nearest-Neighbor CF prototype on a Wavelet Tree. We first designed Accessible Compressed Documents (ACD) to compress U-I data in a Wavelet Tree, which is efficient in both storage and runtime. We then showed that ACD's characteristics can be exploited to develop an efficient intersection algorithm that works without decompression. We evaluated our design on 1,000 cores of the Tianhe-II supercomputer with ml-20m, one of the largest public data sets. The results showed that our prototype delivered results in 3.7 minutes on average.
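The abstract does not describe ACD's layout or its intersection algorithm, so the following is only an illustrative sketch of the general idea, not the authors' design. It assumes that each user's rated item IDs are concatenated into one integer sequence, that a hypothetical `offsets` list marks where each user's items begin, and that two users' item sets are intersected with the standard wavelet-tree range-intersection traversal, which never rebuilds the original lists. The class `WaveletTree`, the helper `common_items`, and the toy data are all invented for this example. Plain Python lists stand in for the compressed bit vectors a real succinct implementation would use.

```python
class WaveletTree:
    """Pointer-based wavelet tree over a non-empty sequence of integers."""

    def __init__(self, seq, lo=None, hi=None):
        seq = list(seq)
        self.lo = min(seq) if lo is None else lo
        self.hi = max(seq) if hi is None else hi
        self.left = self.right = None
        # ones[i] = number of elements among the first i that go to the right child.
        self.ones = [0]
        if self.lo == self.hi or not seq:
            return  # leaf (single symbol) or empty subtree
        mid = (self.lo + self.hi) // 2
        for v in seq:
            self.ones.append(self.ones[-1] + (v > mid))
        self.left = WaveletTree([v for v in seq if v <= mid], self.lo, mid)
        self.right = WaveletTree([v for v in seq if v > mid], mid + 1, self.hi)

    def access(self, i):
        """Return seq[i] by walking the bit vectors from the root to a leaf."""
        if self.lo == self.hi:
            return self.lo
        if self.ones[i + 1] - self.ones[i]:
            return self.right.access(self.ones[i])
        return self.left.access(i - self.ones[i])

    def intersect(self, a0, a1, b0, b1):
        """Sorted distinct values present in both seq[a0:a1] and seq[b0:b1]."""
        if a0 >= a1 or b0 >= b1:
            return []  # one range is empty below this node: prune
        if self.lo == self.hi:
            return [self.lo]
        o = self.ones
        # Map both ranges into each child using prefix counts of zeros and ones.
        left = self.left.intersect(a0 - o[a0], a1 - o[a1], b0 - o[b0], b1 - o[b1])
        right = self.right.intersect(o[a0], o[a1], o[b0], o[b1])
        return left + right


# Toy data (hypothetical): item IDs rated by three users.
user_items = [[1, 3, 5, 7], [2, 3, 7, 8], [1, 2]]
offsets = [0]
for items in user_items:
    offsets.append(offsets[-1] + len(items))
tree = WaveletTree([item for items in user_items for item in items])


def common_items(u, v):
    """Items rated by both user u and user v, computed on the tree."""
    return tree.intersect(offsets[u], offsets[u + 1], offsets[v], offsets[v + 1])
```

In a user-based K-Nearest-Neighbor setting, the size of `common_items(u, v)` could feed a similarity measure between users `u` and `v`; which measure the prototype uses is not stated in the abstract.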