25th International Conference on Database Systems for Advanced Applications

Sep. 24-27, 2020, Jeju, South Korea

Click the following URL

http://dasfaa2020.sigongji.com

to visit the DASFAA 2020 Online Event Site

Paper details

Title: HEGJoin: Heterogeneous CPU-GPU Epsilon Grids for Accelerated Distance Similarity Join

Authors: Benoit Gallet and Michael Gowanlock

Abstract: The distance similarity join operation joins two datasets (or tables), A and B, based on a search distance, ε, (A ⋉_ε B), and returns the pairs of points (p_a, p_b), where p_a ∈ A and p_b ∈ B such that the distance between p_a and p_b ≤ ε. In the case where A = B, then this operation is a similarity self-join (and therefore, A ⋈_ε A). In contrast to the majority of the literature that focuses on either the CPU or the GPU, we propose in this paper Heterogeneous CPU-GPU Epsilon Grids Join (HEGJoin), an efficient algorithm to process a distance similarity join using both the CPU and the GPU. We leverage two state-of-the-art algorithms: LBJoin for the GPU and Super-EGO for the CPU. We achieve good load balancing between architectures by assigning points with larger workloads to the GPU and those with lighter workloads to the CPU through the use of a shared work queue. We examine the performance of our heterogeneous algorithm against LBJoin, as well as Super-EGO by comparing performance to the upper bound throughput. We observe that HEGJoin consistently achieves close to this upper bound.
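
For readers new to the operation, the definition in the abstract can be illustrated with a brute-force reference join. This is only a sketch of the definition, not HEGJoin, LBJoin, or Super-EGO: it assumes Euclidean distance, compares every pair of points, and uses no grid index, GPU, or work queue. The function name and the sample points are hypothetical.

```python
import math
from itertools import product


def distance_similarity_join(A, B, eps):
    """Return every pair (p_a, p_b), p_a in A and p_b in B, with dist(p_a, p_b) <= eps.

    Brute-force reference: checks all |A| * |B| pairs (assumes Euclidean distance).
    """
    return [(pa, pb) for pa, pb in product(A, B) if math.dist(pa, pb) <= eps]


# Hypothetical sample data, chosen only to illustrate the definition.
A = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
B = [(0.5, 0.0), (4.0, 5.0)]

pairs = distance_similarity_join(A, B, eps=1.0)

# Self-join: the same dataset is passed on both sides (A = B).
self_pairs = distance_similarity_join(A, A, eps=1.0)
```

With this data, `pairs` holds the two pairs whose distance is at most 1.0, and `self_pairs` holds only each point paired with itself, since no two distinct points of A lie within 1.0 of each other.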

