Query Optimization Notes
Blog Chapters
**Chapter 1: Overview of Query Optimization **
Query languages like SQL are declarative -> user tells DBMSs what they want, not how to compute
-> DBMS attempts to find a correct execution plan with the best cost
1. SQL Query
|
v
AST
|
v
2. Binder <- System Catalog -> Cost Model
|
v
Logical plan
|
v
Optmizer (schema info from catalog, cost model info)
|
v
Physical plan
* Optimizer: find lowest cost physical plan from a large search space of promising plans
-> Ideally should find best plan
-> interesting: MongloDB: run both, and remember which is better ...
Not always 1-1 from logical plan to physical plan
Topics:
Search strategies
Enumerations / Transformations
Parallelization
Statistics / Summarization
Cardinality Estimation / Parameterization
Adaptivity / Feedback Mechanisms
Real-world implementation
Search strategies:
1. Heuristics Rules
-> long term problematic, what oracle previously did
2. Cost-based Search
Top down vs Bottom up
Bottom up: nothing in plan, but build gradually (System R)
Top down: start with outcome, work down the tree to find the optimal plan that gets you to the goal (Volcano)
Optimization granularity:
1. Single
search space small
2. Multiple
search space much larger
global optimal for all plans
more efficient if many similar queries
only in research so far
THIS IS VEYR HARD: proven to be NP-Complete
-> No optimizer provides real optimal plan
estimation techniques to guess real plan cost
heuristics to limit search space
Enjoy Reading This Article?
Here are some more articles you might like to read next: