[Postgres-xl-developers] Postgres-XL 9.6 merge / resolving execute on any -> all plan changes
tomas.vondra at 2ndquadrant.com
Tue May 23 15:07:03 PDT 2017
Hi XL hackers,
You are probably aware that we've pushed Postgres-XL 9.6 code (that is,
Postgres-XL merging PostgreSQL 9.6 improvements) into the public repo a
few days ago.
The announcement and some additional details is available here:
The current code resolves all merge conflicts and most regression
failures, and is fairly stable, there's a bunch of remaining issues
(e.g. regression failures) that need fixing before XL 9.6 can be
Let me discuss one of the main causes of regression failures - I do
believe my analysis and solution is correct, but I'm not entirely sure
about that, and so would appreciate feedback from other developers.
The failures are caused by plans switching from
-> Remote Subquery Scan on all (datanode_1)
-> Remote Subquery Scan on any (datanode_1,datanode_2)
After a bit of investigation I've realized the plans change thanks to
this block in adjust_subplan_distribution:
if (subd && !equal(subd, pathd))
... if "on any" then pick one node and switch to "on all" ...
On XL 9.5 the condition evaluates as true and so the restriction
switches from "any" to "all" (with just a single node). On XL 9.6 the
condition is however evaluated as "false" because the two distributions
are equal, and so the plans keep the "any" restriction.
I'm not entirely sure why XL 9.6 ends up with
being true, while XL 9.5 evaluates this as false. I do believe it is
related to the upper-planner pathification (upstream commit 3fc6e2d7f),
which caused quite a few conflicts with the XL planner code.
An alternative possibility might be that we do set / propagate the
distributions for paths differently, but so far I haven't found any
issue in that.
I have however found out that simply removing the equal() part of the
condition (see the trivial patch attached) fixes pretty much all these
plan changes, and the regression.diff gets about 30% smaller.
While fixing the various merge conflicts, one of the principles I tried
to follow as much as possible was not introducing unnecessary behavior
or plan changes. This fix is consistent with this principle.
The question is whether that's the right / correct / best solution, and
I'd appreciate your opinions on this.
FWIW I think we'll soon get rid of adjust_subplan_distribution()
altogether. It used to be necessary because of the uppler-planner
working with plans and not paths, but that is no longer the case. So
ultimately most changes in createplan.c should disappear and move to the
path construction. But that's something we shouldn't do until the code
I also wonder why exactly are we picking the node while constructing the
plan, and not sooner (while constructing the path) or later (during
execution). I can see arguments for both options, but createplan.c seems
a bit weird. Can someone explain why it's done here?
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 531 bytes
Desc: not available
More information about the Postgres-xl-developers