Personal note: I always enjoy learning the historical context behind different concepts, and it also helps me memorize things.

I am also very glad I read this paper ahead of Lecture #07.

Overview of the main idea (3 sentences)

This paper compares the pull and push models of query plan processing. The authors refer to these as vectorization (pull) and data-centric code generation (push).

While not completely orthogonal, these concepts are not completely coupled either. I would argue that vectorization could also be part of data-centric code generation, and generated code certainly does take advantage of pipelining. However, the authors point out that the pull model lends itself really well to vectorization (especially for OLAP), and that the push model really benefits from LLVM-style code generation. So the comparison, and the intertwining of the two axes, do make sense.
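To make the contrast concrete for myself, here is a minimal Python sketch of one toy query, SELECT SUM(price) WHERE qty < limit, written both ways. This is not code from the paper or its repository: the column names, the function names, and the batch size are all made up for illustration, and Python only shows the control flow, not the performance characteristics.

```python
# Toy query: SELECT SUM(price) FROM t WHERE qty < limit
# All names here are hypothetical; this only illustrates control flow.

def scan_batches(qty, price, batch_size):
    """Pull-model scan: each next() call hands out one batch of column slices."""
    for start in range(0, len(qty), batch_size):
        yield qty[start:start + batch_size], price[start:start + batch_size]

def select_less_than(batches, limit):
    """Vectorized primitive: a simple loop over one column per batch.
    It emits a selection vector (positions of qualifying tuples)."""
    for qty, price in batches:
        sel = [i for i, q in enumerate(qty) if q < limit]
        yield sel, price

def sum_selected(batches):
    """Aggregation primitive: pulls batches and sums the selected positions."""
    total = 0
    for sel, price in batches:
        for i in sel:
            total += price[i]
    return total

def vectorized_query(qty, price, limit, batch_size=1024):
    # Operators are composed once; the top operator pulls batches from below.
    return sum_selected(select_less_than(scan_batches(qty, price, batch_size), limit))

def data_centric_query(qty, price, limit):
    """Roughly what a code generator might emit for the same plan:
    scan, filter, and aggregate fused into one loop, with the current
    tuple kept in local variables instead of materialized batches."""
    total = 0
    for q, p in zip(qty, price):
        if q < limit:
            total += p
    return total
```

Both functions compute the same result. The difference is where the work boundaries sit: the vectorized version pays for intermediate selection vectors but keeps each primitive trivially simple, while the fused loop avoids materialization but has to be generated per query.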

Key findings / takeaways from the paper (2-3 sentences)

The authors conclude that both strategies are viable and that their performance is similar on OLAP workloads. For OLTP systems, they conclude that code compilation is better, not just because of performance but also because it makes it easier for systems to adopt custom languages for user-defined functions.

Another important point is that SIMD is more readily applicable in vectorized systems. It should also be possible to integrate it into push-model systems with code generation, but that is probably very hard.

Personal note: I’m very interested in HTAP systems such as SingleStoreDB. For these products, a mixture of the two approaches is required.

System used in evaluation and how it was modified/extended (1 sentence)

https://github.com/TimoKersten/db-engine-paradigms

The evaluation uses two custom-built systems that the authors call “Tectorwise” (vectorized) and “Typer” (data-centric code generation).

Workload Evaluated (1 sentence)