Lecture #15: Optimizer Implementation 3

Date: Mar 25 2024

Reading

Plan Stitch: Harnessing the Best of Many Plans (B. Ding, et al., VLDB 2018)
Adaptive Query Processing in the Looking Glass (S. Babu, et al., CIDR 2015) (Optional)
Looking Ahead Makes Query Plans Robust (J. Zhu, et al., VLDB 2017) (Optional)
Neo: A Learned Query Optimizer (R. Marcus, et al., VLDB 2019) (Optional)

As the query executes, things could change (inserts, deletes, updates) that may affect the performance of a given plan. What if we don’t have to generate the entire plan right from the get go?
The most likely reason for a plan to not be good is getting the join order incorrectly.
- We think that our join will produce X number of tuples but it may be much different.
- And this assumption can have big effects down the road.
this technique is called “Adaptive Query Optimization”
- it will allow the DBMS to modify the query plan for a query to perform better based on what it learns along the way
3 approaches
- modify future invocations
  - plan correction
  - feedback loop
- replan current invocations
  - start-over from scratch or
  - keep intermediate results
- plan pivot points
DBMSs should track the history of query invocations (cost estimations, query plan, actual runtime metrics)
- if the DBMS generates a new plan for a query and it performs worse, it should revert to the old plan
- but simply reverting is not that smart. Instead, as per the “Plan Stitch” paper from VLDB 2018, maybe we can just revert a sub-plan and then use bottom-up dynamic programming to find a better overall plan.
- AND! we can share sub-plans between queries!
  - as long as they’re logically equivalent
- (Redshift does this at the codegen level. They cache the compiled versions of subplans/”plan fragment”)