Date: Feb 07 2024
Slides: https://15721.courses.cs.cmu.edu/spring2024/slides/05-execution2.pdf
Reading
Raw Lecture Notes
- OLAP systems today are roughly split 50/50 between push-based and pull-based execution.
- Push-based is superior for OLAP. Recap from the previous lecture:
- Start with leaf nodes and "push" data to their parents.
    - Can "fuse" operators together within a for-loop to minimize intermediate result staging.
- Pros/Cons:
- Allows for tighter control of caches/registers in pipelines.
- May not have exact control of intermediate result sizes.
- Difficult to implement some operators (Sort-Merge Join).
- Another big advantage of push-based is that it allows for fine-grained control of execution. Can more easily be paused/resumed as needed.
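A minimal sketch (illustrative names, not from any real engine) contrasting pull-based Volcano-style iterators with a push-based pipeline where scan, filter, and project are fused into one loop, so each tuple stays in registers/cache with no intermediate staging:

```python
def pull_based(rows):
    # Volcano-style: each operator pulls from its child one tuple at a time,
    # paying a generator/virtual-call hop per operator per tuple.
    def scan():
        yield from rows
    def filt(child):
        for r in child:
            if r % 2 == 0:
                yield r
    def project(child):
        for r in child:
            yield r * 10
    return list(project(filt(scan())))

def push_based_fused(rows):
    # Push-based with operator fusion: the leaf (scan) drives the loop and
    # "pushes" each tuple through filter + project logic fused into the same
    # for-loop body -- no intermediate result staging between operators.
    out = []
    for r in rows:              # scan (leaf node)
        if r % 2 == 0:          # filter, fused into the scan loop
            out.append(r * 10)  # project, fused as well
    return out

assert pull_based(range(6)) == push_based_fused(range(6)) == [0, 20, 40]
```

Both compute the same result; the fused loop is what a compiling/push-based engine would generate for this pipeline.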
- Parallel Execution
    - At least for this class, multi-process, multi-threaded, and multi-node will be treated as the same thing.
    - (I know that in practice this is not really true. It has never been closer to true than it is today, though, with network performance evolving so much.)
- Inter Query Parallelism AND Intra Query Parallelism
    - Most DBMSs use a first-come, first-served queue to prioritize queries, but some have a more sophisticated prioritization mechanism.
- Intra Query Parallelism
- Intra Operator (Horizontal) - most common
- We will split a single operator into different pieces of data.
- Results need to be put back together.
- A Gather operator combines the result from multiple workers into a single output stream. (SingleStoreDB has this) - N to 1
- A Distribute operator fans it out (basically the opposite) - 1 to N
            - A Repartition/Shuffle operator takes multiple inputs and splits them into multiple outputs (N to M)
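A toy sketch of the three exchange operators above (hypothetical helper names; a real engine would do this with streams across workers, not lists):

```python
def gather(worker_outputs):
    # N -> 1: combine the results from multiple workers into a single stream.
    out = []
    for chunk in worker_outputs:
        out.extend(chunk)
    return out

def distribute(stream, n):
    # 1 -> N: fan one stream out to n workers (round-robin here).
    parts = [[] for _ in range(n)]
    for i, row in enumerate(stream):
        parts[i % n].append(row)
    return parts

def repartition(worker_outputs, m, key=lambda r: r):
    # N -> M: re-split N input streams into M outputs by hash of a key,
    # so rows with equal keys land on the same downstream worker.
    parts = [[] for _ in range(m)]
    for chunk in worker_outputs:
        for row in chunk:
            parts[hash(key(row)) % m].append(row)
    return parts

streams = [[1, 2, 3], [4, 5, 6]]
assert sorted(gather(streams)) == [1, 2, 3, 4, 5, 6]
assert distribute(gather(streams), 2) == [[1, 3, 5], [2, 4, 6]]
```

The hash-based repartition is what makes parallel hash joins and aggregations possible: matching keys are guaranteed to meet on the same worker.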
- Inter Operator (Vertical)
- Run operators at the same time.
- Can we build parts of a DBMS query engine as a library?
- Velox, (Apache Arrow) DataFusion, Intel OAP, Polars
- Meta Velox (paper!)
- High-performance single-node query executor
- No SQL parser
- No metadata catalog
- No cost-based optimizer
- It takes a plan and executes it.
- What does it have?
- Push-based vectorized query processing
- Precompiled primitives + Codegen expressions
- Arrow Compatible
- Adaptive Query Optimization
- Only supports Sort-Merge + Hash Joins
    - It doesn’t own data. It exposes an API to define connectors that retrieve data, which have to be implemented by the clients of this library. It also exposes adapters to encode/decode different storage formats.
- (DWRF is the next version of ORC?)
- Velox Components
- Type System
- Expression Engine
- Internal Data Representation
- Function API
- Operator Engine
- Storage Connectors/Adapters
- Resource Manager
- The best OLAP systems only materialize tuples when necessary (late materialization). Only IDs are sent up and down during plan execution. For joins, we mostly don’t need the full tuple.
- What is the benefit of early materialization?
- (I don’t know the answer)
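A small sketch of late materialization (toy data, hypothetical names): the join phase operates only on IDs, and the full tuples are fetched from base storage once, at the very end.

```python
# Base table: order id -> full tuple (customer, amount).
orders  = {101: ("alice", 30.0), 102: ("bob", 12.5), 103: ("carol", 7.0)}
shipped = {101, 103}  # ids from another table

# Join phase: only IDs flow through the plan, no wide tuples.
matching_ids = [oid for oid in orders if oid in shipped]

# Materialize the full tuples only for the final result.
result = [(oid, *orders[oid]) for oid in matching_ids]
assert result == [(101, "alice", 30.0), (103, "carol", 7.0)]
```

The narrower the data flowing between operators, the more of it fits in cache, which is why late materialization tends to win for OLAP.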
- Apache Arrow is an open-source specification describing how to store data in memory in a columnar fashion.
- Arrow is better than Parquet/ORC at random access.
- Arrow string encoding
- used to be very naive
    - Velox/DuckDB improve it (some sort of dictionary encoding). German-style encoding.
- Substrait is like Arrow but for query plans. The idea is that all databases could share the same “schema” for query plans.
- Not that widely adopted yet. Still early.
- Expression Evaluation
    - Can we compile the expressions?
    - Can we vectorize them so a batch of tuples is evaluated at once?
    - With Velox, expression trees can be flattened. Besides this, it also enables a lot of micro-optimizations around expression execution (e.g., if they see UPPER('hello'), they will just memoize it as 'HELLO').
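A minimal sketch of that kind of constant folding (my own toy expression representation, not Velox's): any subtree whose inputs are all literals, like UPPER('hello'), gets evaluated once at plan time and replaced by its result.

```python
# Expressions are either literals, column refs ("col", name),
# or function calls: (fname, arg1, arg2, ...).
FUNCS = {"upper": str.upper, "concat": lambda a, b: a + b}

def fold(expr):
    if not isinstance(expr, tuple):
        return expr                       # literal: nothing to fold
    if expr[0] == "col":
        return expr                       # depends on input rows: keep as-is
    fname, *args = expr
    args = [fold(a) for a in args]        # fold children first (bottom-up)
    if all(not isinstance(a, tuple) for a in args):
        return FUNCS[fname](*args)        # all-constant subtree: fold it now
    return (fname, *args)                 # partially constant: keep the call

# UPPER('hello') folds to 'HELLO'; a call over a column folds only partially.
assert fold(("upper", "hello")) == "HELLO"
assert fold(("concat", ("col", "x"), ("upper", "hi"))) == ("concat", ("col", "x"), "HI")
```

The folded plan evaluates the constant part zero times per row instead of once per row.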
- Everything we’ve discussed so far is after the query plan has been generated. But query plan generation is super important - for good performance, we need good plans.
- Can the execution engine be allowed to modify a query’s plan and the expression trees?
- Some execution engines do this. They learn from the execution and rewrite parts of the plan for the current or future executions.
    - (See “Adaptive Query Processing in the Looking Glass”, CIDR 2007)
- With Velox, it adapts some expressions to accelerate them:
        - Predicate Reordering (WHERE SLOW_FUNC(a)=true AND FAST_FUNC(b)=true; maybe it makes sense to run FAST_FUNC first to reduce the data size)
- Column Prefetching
- Not Null Fast Paths
- Use faster ASCII functions if the strings don’t actually have any UTF-8 characters
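A toy illustration of the predicate-reordering idea (the costs/selectivities here are faked; a real adaptive engine would measure them at runtime): running the cheap, selective predicate first means the slow one only sees the survivors.

```python
def make_counting(pred):
    # Wrap a predicate so we can count how many rows it actually sees.
    calls = {"n": 0}
    def wrapped(x):
        calls["n"] += 1
        return pred(x)
    return wrapped, calls

fast, fast_calls = make_counting(lambda x: x % 10 == 0)  # cheap AND selective
slow, slow_calls = make_counting(lambda x: x > 5)        # pretend this is expensive

rows = list(range(100))

# Reordered plan: FAST_FUNC first; SLOW_FUNC runs only on rows that survive.
result = [x for x in rows if fast(x) and slow(x)]

assert result == [10, 20, 30, 40, 50, 60, 70, 80, 90]
assert fast_calls["n"] == 100  # fast predicate saw every row
assert slow_calls["n"] == 10   # slow predicate saw only the 10 survivors
```

With the opposite order, the expensive predicate would have run 100 times instead of 10 for the same result.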