1. clickbat (reminds patent descriptions ;)); the article itself is about routine query optimization
2. there is no "materialized view" - the view created is a standard one; subquery materialization != materialized views. see wikipedia article (https://en.wikipedia.org/wiki/Materialized_view) to find what materialized views are
Is that from an orm? Looks like a query builder to me.
The main thing you left out is the type conversions with plain sql. So a fair comparison would include the e2e flow, imo.
That said, your point is very much valid because of the nested parentheses - I think the main issue is that SQL is an “infix”(?) dsl? It’s strange because Rust has perfectly good proc macros for this type of stuff.
If anything, this experience has made me appreciate Diesel even more!
Having strongly typed SQL has prevented these sort of bugs from happening at compile time. The only reason I ran into trouble here was that I went into `unsafe` land without proper precaution. That's on me, not the ORM.
Any story about making something 1200x faster is going to be about something being 1197x too slow to begin with. It's like celebrating a performance failure.
I think it was definitely unoptimized to begin with. The major slow down didn't happen until the model evolved though, as I mention in the Background section.
Not catching that change was certainly a performance failure!
The culprit for the slowdown is this section, reformatted to be readable:
LEFT OUTER JOIN (
boundary
INNER JOIN threshold ON (threshold.id = boundary.threshold_id)
INNER JOIN model ON (model.id = boundary.model_id)
LEFT OUTER JOIN alert ON (alert.boundary_id = boundary.id)
) ON (boundary.metric_id = metric.id)
From what I can gather, it does a full scan on "boundary" table so it can do the inner joins before doing the left outer join against "metric".
The author then introduces the "metric_boundary" view which does the left outer join between "metric" and "boundary". However the SQL using the "metric_boundary" view is not shown, and from the query planner results I can't see that the new SQL is equivalent?
That is, without the "metric_boundary" view, one immediate thought on how to optimize the original query is to avoid doing the "threshold" and "model" inner joins, instead flattening the query:
LEFT OUTER JOIN boundary ON (boundary.metric_id = metric.id)
LEFT OUTER JOIN threshold ON (threshold.id = boundary.threshold_id)
LEFT OUTER JOIN model ON (model.id = boundary.model_id)
LEFT OUTER JOIN alert ON (alert.boundary_id = boundary.id)
This should allow the database to just search on the primary keys, similar to the final result in the article. However, this is not equivalent. It will include values from "boundary" even if there are no matches for "threshold" and "model", unlike the original query. In the original, "boundary" rows will not be returned unless there's also linked rows in "threshold" and "model".
However, from the query planner results after introducing the "metric_boundary" view, from what I can see the author has in effect done just the same kind of flattening?
|--SEARCH boundary USING INDEX sqlite_autoindex_boundary_2 (metric_id=?) LEFT-JOIN
|--SEARCH threshold USING INTEGER PRIMARY KEY (rowid=?) LEFT-JOIN
|--SEARCH model USING INTEGER PRIMARY KEY (rowid=?) LEFT-JOIN
Am I missing something here? Did the author not really need to reject "boundary" table rows which didn't have any corresponding rows in "threshold" and "model" table, ie didn't really need the inner joins, and didn't realize this?
Not really. In this context they're a hint to the optimizer to evaluate the CTE result store=materialize it somewhere and use that snapshot when the CTE is queried subsequently.
The alternative means the CTE is copy-pasted essentially and them evaluated.
The closest one can get to materialized views in SQLite is probably with CREATE TABLE AS. Drop and recreate.
The VIEW itself is not a "materialized" view, but when the (normal) view is instantiated within a single query it is treated as MATERIALIZED because it is used multiple times in the query execution: https://www.sqlite.org/lang_with.html#materialization_hints
> Detect and prevent performance regressions before they make it to production with continuous benchmarking
Would ever write
> I got feedback from a user that their Bencher Perf Page was taking a while to load. So I decided to check it out, and oh, man were they being nice.
I get that this is an edge case, but it doesn't feel awesome that they had such a painful performance issue that went undetected!
reply