Concurrent Near-Data Processing Architectures

3D die-stacking technology allows logic circuits, such as simple processors, to be placed physically close to memory and to communicate with it over high-bandwidth Through-Silicon Via (TSV) interconnects. Commercially available devices that exploit 3D die-stacking today, such as the Hybrid Memory Cube (HMC) and High-Bandwidth Memory (HBM), implement only simple memory-controller logic near memory. However, we expect that simple processors will soon be placed near memory as well, enabling near-data processing (NDP).

We have investigated how near-memory accelerators can be combined with novel data structures and algorithms to exploit the low-latency, high-bandwidth memory access of future NDP architectures while preserving the high concurrency of conventional systems. In particular, we have focused on software libraries and architectural support for general-purpose concurrent data structures on NDP architectures. These data structures are used in many applications, and adapting them to NDP architectures is a key step toward making these architectures useful. In conventional architectures, “pointer-chasing” data structures with poor cache locality and high-contention concurrent data structures are often bottlenecks; NDP architectures promise to alleviate or even eliminate these problems.

We found that realizing the potential benefits of NDP-based concurrent data structures also requires lightweight modifications to the NDP hardware, which we derived from observations of data structure access patterns and the underlying DRAM activity. Our combined software-hardware approach showed significant improvements in performance and energy consumption compared to state-of-the-art concurrent data structures.

Here are some of my publications on this topic:
  • Concurrent Data Structures with Near-Data Processing: an Architecture-Aware Approach. SPAA 2019.
  • Attacking Memory-Hard scrypt with Near-Data Processing. MEMSYS 2019.
  • Hybrid Skiplist: Combining the Best of Near-Data Processing and Lock-Free Algorithms. ACM Student Research Competition (SRC), held at MICRO 2019.