Blog Chapters

  1. Chapter 1: Overview of Query Optimization

**Chapter 1: Overview of Query Optimization **


Query languages like SQL are declarative -> user tells DBMSs what they want, not how to compute
-> DBMS attempts to find a correct execution plan with the best cost


1. SQL Query
    |
    v
    AST
    |
    v
2. Binder <- System Catalog -> Cost Model
    |
    v 
    Logical plan
    |
    v
    Optmizer (schema info from catalog, cost model info)
    |
    v
    Physical plan

* Optimizer: find lowest cost physical plan from a large search space of promising plans
-> Ideally should find best plan
-> interesting: MongloDB: run both, and remember which is better ... 

Not always 1-1 from logical plan to physical plan

Topics:
Search strategies
Enumerations / Transformations
Parallelization
Statistics / Summarization
Cardinality Estimation / Parameterization
Adaptivity / Feedback Mechanisms
Real-world implementation


Search strategies: 
1. Heuristics Rules
-> long term problematic, what oracle previously did

2. Cost-based Search


Top down vs Bottom up
Bottom up: nothing in plan, but build gradually (System R)

Top down: start with outcome, work down the tree to find the optimal plan that gets you to the goal (Volcano)

Optimization granularity: 
1. Single
      search space small
2. Multiple
      search space much larger
      global optimal for all plans
      more efficient if many similar queries
      only in research so far


THIS IS VEYR HARD: proven to be NP-Complete
-> No optimizer provides real optimal plan
      estimation techniques to guess real plan cost
      heuristics to limit search space


Back to Blog Chapters