Overview of the main idea (3 sentences)

Seven years after *Morsel-Driven Parallelism: A NUMA-Aware Query Evaluation Framework for the Many-Core Age* (2014), the HyPer/Umbra folks are back to improve upon their morsel-based scheduling algorithms. The paper discusses a few ideas:

  1. [Not novel, from 1995] Stride scheduling is briefly re-introduced: a mechanism for prioritizing concurrent processes in which each process gets a ticket count, the process with the smallest accumulated “pass” value runs next, and its pass is then advanced by a stride inversely proportional to its tickets.
  2. Thread-local scheduling (also briefly mentioned in the Morsel paper, but more heavily argued for in this paper), which is basically the idea of not having a centralized dispatcher and simply “sharding” the scheduling logic with lock-free data structures.
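To make the stride-scheduling idea concrete, here is a minimal simulation sketch (my own toy code, not the paper's implementation), where each “quantum” is one scheduling decision and the constant `STRIDE1` is an arbitrary large number:

```python
import heapq

STRIDE1 = 10_000  # arbitrary large constant; stride = STRIDE1 / tickets

def stride_schedule(tickets, quanta):
    """Simulate stride scheduling: processes with more tickets get
    proportionally more quanta. Returns quanta received per process."""
    strides = {pid: STRIDE1 // t for pid, t in tickets.items()}
    # Min-heap of (pass, pid); pass starts at the process's stride.
    heap = [(strides[pid], pid) for pid in tickets]
    heapq.heapify(heap)
    counts = {pid: 0 for pid in tickets}
    for _ in range(quanta):
        pass_val, pid = heapq.heappop(heap)  # lowest pass runs next
        counts[pid] += 1
        heapq.heappush(heap, (pass_val + strides[pid], pid))
    return counts

# A process with 3 tickets gets ~3x the quanta of one with 1 ticket:
# stride_schedule({"A": 3, "B": 1}, 40) → {"A": 30, "B": 10}
```

The nice property (and why it keeps coming up in scheduling papers) is that throughput shares track ticket ratios deterministically, without randomness or a global clock.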

Now, for the main ideas:

  1. Dynamically-sized morsels (partitions), which were fixed-size in the 2014 paper. This basically protects against “unfair” morsels that are slower to process for some reason (e.g., data skew).
  2. Adaptive query priorities, which basically means slightly prioritizing short-running queries to reduce overall query latency.
    1. "short running queries can be prioritized without noticeably altering the latency of other, long running requests. Assume we are executing two types of queries. The short running ones make up 90% of the workload and take 10ms to execute. Meanwhile, the long running requests take 1s. Even if we treat all short requests preferentially, this amounts to less than 10% of the overall workload being prioritized. As a result, the long running requests are not slowed down significantly”
    2. This is done by slightly reducing the priority of queries which have used more CPU time so far.
  3. Self-tuning the priorities from #2 over time by playing around with different values and measuring overall query latency (🤯)
    1. (This is the kind of idea that is always fun to talk about but very hard to actually engineer)
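A toy sketch of the dynamic morsel-sizing idea (#1): pick the next morsel's size from the observed throughput of the previous one, aiming for a roughly constant time per morsel. The function, target, and clamp values are all illustrative assumptions of mine, not the paper's actual mechanism:

```python
def next_morsel_size(prev_size, prev_duration_ms, target_ms=1.0,
                     min_size=1_000, max_size=100_000):
    """Scale the next morsel so processing it takes roughly `target_ms`,
    based on the throughput observed on the previous morsel.
    All constants here are illustrative, not from the paper."""
    if prev_duration_ms <= 0:
        return max_size  # no signal yet; be optimistic
    tuples_per_ms = prev_size / prev_duration_ms
    return max(min_size, min(max_size, int(tuples_per_ms * target_ms)))

# A slow morsel (10k tuples in 10ms) shrinks the next one;
# a fast morsel (10k tuples in 0.05ms) grows it up to the cap.
```

The point is that a worker stuck on a skewed or cache-unfriendly region automatically takes smaller bites, so no single morsel holds up the query.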
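And a toy sketch of the adaptive-priority idea (#2): demote a query's effective priority as its accumulated CPU time grows, which approximates shortest-job-first without knowing run times in advance. The `decay` knob is exactly the kind of parameter the self-tuning in #3 would adjust; the function and its constants are my own illustration, not the paper's formula:

```python
def effective_priority(base_priority, cpu_seconds_used, decay=0.1):
    """Gently reduce priority for queries that have already consumed
    more CPU time, so short-running queries tend to finish first.
    `decay` is an illustrative knob, not a value from the paper."""
    return base_priority / (1.0 + decay * cpu_seconds_used)

# A fresh query keeps its full priority; a query that has burned
# 10s of CPU is ranked below an otherwise-identical fresh one.
```

Plugged into stride scheduling, this would translate into giving long-running queries progressively fewer tickets, which is why the two ideas compose naturally.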

Key findings / takeaways from the paper (2-3 sentences)

System used in evaluation and how it was modified/extended (1 sentence)

Workload Evaluated (1 sentence)

“sample from TPC-H queries at SF3 and SF30 (…) picking queries at SF3 is three times more likely than picking queries at SF30 (…) in order to obtain a more interesting workload”