[Postgres-xl-developers] What are "coordinator quals" for? Do we still need them?

Tomas Vondra tomas.vondra at 2ndquadrant.com
Tue Oct 17 12:46:08 PDT 2017


Hi,

While investigating the remaining failures in regression tests, I've ran
into failures in updatable_views regression test suite due to extra
entries in the targetlists.

Essentially, we produce something like this

  Update on public.t111
    -> Index Scan using t1_a_idx on public.t1
       Output: 100, t1.b, t1.c, t1.a, t1.a, t1.a, t1.a, t1.a, t1.a,
               t1.a, t1.a, t1.ctid
    -> ...

instead of just

  Update on public.t111
    ->  Index Scan using t1_a_idx on public.t1
        Output: 100, t1.b, t1.c, t1.ctid

That is, we add duplicate entries for t1.a for some reason. After a bit
of poking I found this block in rewriteTargetListUD() is the culprit:

https://github.com/2ndQuadrant/postgres-xl/blob/master/src/backend/rewrite/rewriteHandler.c#L1413

Apparently what happens is that we pull all Vars from quals, and add
them to the target list. When dealing with an inheritance tree, we get
one Var copy for each of the tables - hence the duplicate Vars.

And indeed, adding a check that we never add the same Var repeatedly
gets rid of the duplicate entries - t1.a only gets added once.

But I'm starting to question why we need to add these Vars at all? The
comment talks about "coordinator quals" but unfortunately it's not
really defined or explained anywhere. AFAICS those are quals meant to be
evaluated at Remote Subquery Scan node, but the question is why would
that be necessary when we push all quals as deep as posible.

For example, consider a query

    UPDATE t SET val2 = 1000 WHERE val = 7;

which is currently planned like this:

                          QUERY PLAN
  ----------------------------------------------------------
   Remote Fast Query Execution
     Output: 1000, t.val, t.xc_node_id, t.ctid
     Remote query: UPDATE t SET val2 = 1000 WHERE (val = 7)
       ->  Update on public.t
          ->  Seq Scan on public.t
              Output: val, 1000, ctid
              Filter: (tab1_hash.val = 7)

because rewriteTargetListUD() pulls Vars from all quals (which in this
case is just "val = 7"), and adds them to the target list. But I'm not
quite sure why that would be needed, particularly when there are no
quals attached to the "Remote Fast Query Execution" (which is where it
should be according to EXPLAIN [1])?

[1]
https://github.com/2ndQuadrant/postgres-xl/blob/master/src/backend/commands/explain.c#L1517

Furthermore, the comment actually says this:

  * PGXCTODO: This list could be reduced to keep only in target list the
  * vars using Coordinator Quals.

which seems to suggest that this is not really the correct behavior
anyway, as it pulls Vars from all quals, not just "coordinator quals."

I suspect we don't actually need or use the coordinator quals, so this
is just unnecessary code. I went ahead and tried to removing it - it
does break some of the regression tests, but only because the plans
change in the expected way (i.e. unnecessary Vars get removed from
target lists). There are no result changes, or anything like that.

This also (mostly) fixes the updatable_views regression tests - the XL
plans get pretty close to what upstream produces.

Attached is a patch removing this chunk of code, and accepting the
associated plan changes. Does anyone see an issue with it? Or do we
think we still need "coordinator quals"?

If not, I'll push this by the end of this week.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
-------------- next part --------------
A non-text attachment was scrubbed...
Name: remove-coordinator-quals.patch
Type: text/x-patch
Size: 35803 bytes
Desc: not available
URL: <http://lists.postgres-xl.org/pipermail/postgres-xl-developers-postgres-xl.org/attachments/20171017/ccf786c3/attachment-0001.bin>


More information about the Postgres-xl-developers mailing list