Realizing the Benefits of Processing-in-DIMM for In-Memory Joins

2024-11-22

[Abstract]

Modern Dual In-line Memory Modules (DIMMs) can support Processing-In-Memory (PIM) by placing In-DIMM Processors (IDPs) near their memory banks. PIM can greatly accelerate in-memory joins, whose performance is frequently bounded by main-memory accesses, by offloading the join operations from host CPUs to the IDPs. However, because real PIM hardware became available only very recently, prior PIM-assisted join algorithms have relied on PIM hardware simulators, which assume many hardware characteristics that differ significantly from those of real PIM hardware. To realize the benefits of PIM on real systems, I will first present PID-Join, a fast in-memory join algorithm exploiting UPMEM DIMMs, currently the only publicly available PIM-enabled DIMMs. PID-Join optimizes all three join types (i.e., hash, sort-merge, nested-loop) for the IDPs, enables fast inter-IDP communication using host CPU cache streaming and vector instructions, and facilitates fast data transfer between the IDPs and the host memory. I will then present SPID-Join, a skew-resistant in-memory join algorithm leveraging PIM-enabled DIMMs. SPID-Join overcomes PID-Join’s performance and scalability limitations with skewed input records by replicating popular join keys across multiple IDPs. Doing so allows SPID-Join to process popular join keys with the much higher aggregate memory bandwidth and computational throughput offered by multiple IDPs. Because there is a trade-off between the join throughput and the degree of join key replication, SPID-Join employs a cost model to identify the optimal join key replication ratio for given join and system configurations.
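
The idea of replicating popular join keys can be illustrated with a small host-side sketch in plain Python. It is not the SPID-Join implementation and does not use the UPMEM SDK: partitions stand in for IDPs, and the names `skew_aware_partition`, `local_hash_join`, `replicas`, and `hot_threshold` are hypothetical. The replication degree is passed in as a fixed parameter here, whereas the abstract describes a cost model that chooses it.

```python
from collections import Counter, defaultdict

def skew_aware_partition(r, s, num_parts, replicas, hot_threshold):
    """Assign (key, value) tuples of R and S to partitions (stand-ins for IDPs).

    Keys that occur at least `hot_threshold` times in S are treated as hot:
    their R tuples are copied to `replicas` partitions and their S tuples are
    spread round-robin over those same partitions. All names are illustrative.
    """
    replicas = max(1, min(replicas, num_parts))  # avoid duplicate copies in one partition
    freq = Counter(key for key, _ in s)
    hot = {key for key, count in freq.items() if count >= hot_threshold}

    r_parts = [[] for _ in range(num_parts)]
    s_parts = [[] for _ in range(num_parts)]

    for key, val in r:
        home = hash(key) % num_parts
        copies = replicas if key in hot else 1
        for i in range(copies):
            r_parts[(home + i) % num_parts].append((key, val))

    next_replica = defaultdict(int)  # per-key round-robin counter
    for key, val in s:
        home = hash(key) % num_parts
        offset = 0
        if key in hot:
            offset = next_replica[key] % replicas
            next_replica[key] += 1
        s_parts[(home + offset) % num_parts].append((key, val))

    return r_parts, s_parts

def local_hash_join(r_part, s_part):
    """Plain hash join within one partition: build on R, probe with S."""
    table = defaultdict(list)
    for key, val in r_part:
        table[key].append(val)
    return [(key, rv, sv) for key, sv in s_part for rv in table.get(key, [])]

def partitioned_join(r, s, num_parts, replicas, hot_threshold):
    """Join every partition independently and concatenate the results."""
    r_parts, s_parts = skew_aware_partition(r, s, num_parts, replicas, hot_threshold)
    out = []
    for r_part, s_part in zip(r_parts, s_parts):
        out.extend(local_hash_join(r_part, s_part))
    return out
```

Because each S tuple is sent to exactly one partition and every replica holds a full copy of the matching R tuples, the join result is unchanged; only the per-partition load is evened out. The cost is the extra copies of R tuples, which is the trade-off the replication ratio has to balance.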

[Biography]

Youngsok Kim is currently an associate professor with the Department of Computer Science and Engineering at Yonsei University. His research interests span computer architecture and system software with an emphasis on architecture-conscious database management systems, performance modeling, and architectural support for deep learning. He received his BSc and PhD in Computer Science and Engineering from POSTECH. He was a post-doc researcher at Seoul National University before joining Yonsei University, and was an intern at Consumer Hardware, Google Inc. and S.LSI Business, Samsung Electronics Co., Ltd. during his PhD studies.
