Date: Feb 26 2024

Slides: https://15721.courses.cs.cmu.edu/spring2024/slides/10-multiwayjoins.pdf

Reading

Raw Lecture Notes

In all honesty, this topic is not super relevant to me, so I didn’t pay too much attention to either the lecture or the class paper.

Here’s the rough idea. In some cases, especially when multiple tables are being joined, the binary joins we studied in Lecture #09: Hash Join Algorithms may not work well. This can happen when the intermediate results produced along the way are much larger than the final result of all the joins (e.g., imagine a 4-way join where the majority of rows produced by the earlier joins get discarded by the last one). To prevent this, there’s a set of algorithms called “worst-case optimal joins”. Andy talks about a couple of them.

These are quite complex and still early in their development. Pretty much no DBMS supports them today, largely because it’s hard to know when a query should use them.

However, the SQL/PGQ extension, which adds graph database-like capabilities to SQL and was approved as part of the SQL:2023 standard, will greatly increase the importance of this type of join. That is because graph queries are much more likely to hit the cases where binary join algorithms perform poorly.
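
To make the intermediate-result problem concrete for myself, here’s a rough sketch (mine, not from the lecture or the paper) of a triangle query over three edge tables R(a, b), S(b, c), T(a, c). The function names and the toy data are made up for illustration. The first function does two binary hash joins and counts the intermediate tuples it materializes; the second binds one attribute at a time and intersects candidate sets, which is the general flavor of worst-case optimal joins but not a faithful implementation of any specific algorithm.

```python
from collections import defaultdict


def binary_join_triangles(R, S, T):
    """Plan: (R join S on b), then filter against T on (a, c).

    Returns the set of triangles and the size of the intermediate result.
    """
    S_by_b = defaultdict(list)
    for b, c in S:
        S_by_b[b].append(c)
    # First binary join: materialize every (a, b, c) two-hop path.
    intermediate = [(a, b, c) for a, b in R for c in S_by_b.get(b, ())]
    T_set = set(T)
    # Second binary join: keep only the paths that T closes into a triangle.
    triangles = {(a, b, c) for a, b, c in intermediate if (a, c) in T_set}
    return triangles, len(intermediate)


def multiway_join_triangles(R, S, T):
    """Attribute-at-a-time sketch: bind a, then b, then c.

    Each step only keeps values that every relevant table agrees on,
    so no table of two-hop paths is ever materialized.
    """
    R_by_a = defaultdict(set)  # a -> {b}
    S_by_b = defaultdict(set)  # b -> {c}
    T_by_a = defaultdict(set)  # a -> {c}
    for a, b in R:
        R_by_a[a].add(b)
    for b, c in S:
        S_by_b[b].add(c)
    for a, c in T:
        T_by_a[a].add(c)

    triangles = set()
    # a must appear in both R and T.
    for a in R_by_a.keys() & T_by_a.keys():
        # b must follow a in R and also appear in S.
        for b in R_by_a[a]:
            if b not in S_by_b:
                continue
            # c must follow b in S and follow a in T.
            for c in S_by_b[b] & T_by_a[a]:
                triangles.add((a, b, c))
    return triangles


# Toy data (hypothetical): 100 nodes point at a hub, the hub points at 100 nodes,
# and T contains a single closing edge.
n = 100
R = [(i, 0) for i in range(1, n + 1)]
S = [(0, j) for j in range(1, n + 1)]
T = [(1, 1)]

found, intermediate_size = binary_join_triangles(R, S, T)
print(intermediate_size, len(found))               # 10000 1
print(len(multiway_join_triangles(R, S, T)))       # 1
```

In this toy case a good optimizer could dodge the blowup by starting from T, so it only shows the symptom. As I understand it, the inputs that actually motivate these algorithms are ones where every pairwise join is large no matter which order you pick.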


Profiling Tools