Date: Mar 25 2024
Slides: https://15721.courses.cs.cmu.edu/spring2024/slides/15-optimizer3.pdf
Reading
Raw Lecture Notes
- As the query executes, things could change (inserts, deletes, updates) that may affect the performance of a given plan. What if we don’t have to generate the entire plan right from the get go?
- The most likely reason for a plan to not be good is getting the join order incorrectly.
- We think that our join will produce X number of tuples but it may be much different.
- And this assumption can have big effects down the road.
- this technique is called “Adaptive Query Optimization”
- it will allow the DBMS to modify the query plan for a query to perform better based on what it learns along the way
- 3 approaches
- modify future invocations
- plan correction
- feedback loop
- replan current invocations
- start-over from scratch or
- keep intermediate results
- plan pivot points
- DBMSs should track the history of query invocations (cost estimations, query plan, actual runtime metrics)
- if the DBMS generates a new plan for a query and it performs worse, it should revert to the old plan
- but simply reverting is not that smart. Instead, as per the “Plan Stitch” paper from VLDB 2018, maybe we can just revert a sub-plan and then use bottom-up dynamic programming to find a better overall plan.
- AND! we can share sub-plans between queries!
- as long as they’re logically equivalent
- (Redshift does this at the codegen level. They cache the compiled versions of subplans/”plan fragment”)