Overview of the main idea (3 sentences)

Andy and team study the high-level design of both Parquet and ORC. Then, they run a series of workload tests against both file systems. The lessons learnt section at the end is a great starting point for anybody designing a replacement to Parquet/ORC.

Key findings / takeaways from the paper (2-3 sentences)

System used in evaluation and how it was modified/extended (1 sentence)

Parquet and ORC were used, but not modified.

Workload Evaluated (1 sentence)

Various different datasets were evaluated: