[Postgres-xl-general] Failed to close remote subplan
Aaron Jackson
ajackson at revionics.com
Tue Sep 23 11:34:18 PDT 2014
Since my first response, I have tried creating a couple of indexes on productid. Since then I dropped them, but I never thought to go back to the explain plan. My explain plan now looks identical to yours.
Limit (cost=19.30..21.30 rows=200 width=36)
-> HashAggregate (cost=19.30..21.30 rows=200 width=36)
-> Remote Subquery Scan on all (datanode1,datanode2,datanode3,datanode4,datanode5,datanode6) (cost=0.00..16.20 rows=620 width=36)
-> HashAggregate (cost=0.00..16.20 rows=200 width=36)
-> Seq Scan on sales s (cost=0.00..16.20 rows=620 width=36)
(5 rows)
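[Archive editor's note: the stacked HashAggregate nodes in the plan above reflect how Postgres-XL typically splits a GROUP BY across the cluster: each datanode computes partial aggregates, and the coordinator combines them. A minimal Python sketch of that combine step, with hypothetical helper names rather than actual XL internals:]

```python
from collections import defaultdict

def partial_sums(rows):
    """Datanode-side step: partial SUM(revenue) grouped by productid."""
    partial = defaultdict(float)
    for productid, revenue in rows:
        partial[productid] += revenue
    return partial

def merge_partials(partials):
    """Coordinator-side step: merge the per-node partial aggregates."""
    total = defaultdict(float)
    for partial in partials:
        for productid, subtotal in partial.items():
            total[productid] += subtotal
    return dict(total)

# Two datanodes each hold some sales rows; productid 1 spans both nodes,
# so only the merged result is correct.
node1 = partial_sums([(1, 10.0), (2, 5.0)])
node2 = partial_sums([(1, 7.5)])
print(merge_partials([node1, node2]))  # {1: 17.5, 2: 5.0}
```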
I've since dropped the table and recreated it and still have the same plan. So I am completely perplexed by what has happened.
Aaron
________________________________
From: Mason Sharp [msharp at translattice.com]
Sent: Tuesday, September 23, 2014 12:30 PM
To: Aaron Jackson
Cc: postgres-xl-general at lists.sourceforge.net
Subject: Re: [Postgres-xl-general] Failed to close remote subplan
On Tue, Sep 23, 2014 at 11:41 AM, Aaron Jackson <ajackson at revionics.com<mailto:ajackson at revionics.com>> wrote:
The table in question is structured something like this:
create table client.sales(
storeid int,
productid int,
revenue numeric
) with( OIDS = false ) distribute by hash(storeid)
So, I decided to try to aggregate revenue by store rather than product with a limit. This works just fine, so I believe it may be related to performing a limited aggregation on a non-distributed column.
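[Archive editor's note: the intuition here can be sketched in a few lines. Because the table is distributed by hash(storeid), rows sharing a productid are scattered across datanodes, so a GROUP BY on productid forces cross-node combination, while a GROUP BY on storeid does not. The sketch below uses a simple modulo hash as a stand-in; Postgres-XL's actual hash function differs.]

```python
NUM_NODES = 6

def node_for(storeid):
    """Route a row to a datanode by hashing the distribution column.
    Simple modulo here, purely illustrative of hash distribution."""
    return hash(storeid) % NUM_NODES

# Rows: (storeid, productid, revenue). One productid, three stores.
rows = [(1, 42, 9.99), (2, 42, 4.50), (3, 42, 1.25)]
nodes = {node_for(storeid) for storeid, _, _ in rows}

# Rows for productid 42 land on several nodes, so SUM(...) GROUP BY
# productid needs partial results combined across nodes; a GROUP BY on
# storeid can be finished entirely on each node.
print(sorted(nodes))
```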
I am having trouble reproducing this.
test1=# create schema client;
CREATE SCHEMA
test1=# create table client.sales(
storeid int,
productid int,
revenue numeric
) with( OIDS = false ) distribute by hash(storeid);
CREATE TABLE
test1=# insert into client.sales select ceiling(2000 * random()), ceiling(5000 * random()), generate_series(1,100000);
INSERT 0 100000
Then I could successfully run both of these queries:
SELECT s.productid, count(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 1000;
SELECT s.productid, sum(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 1000;
My explain plans look a little different though.
test1=# explain SELECT s.productid, count(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------
Limit (cost=2042.00..2052.00 rows=1000 width=10)
-> HashAggregate (cost=2042.00..2066.97 rows=2497 width=10)
-> Remote Subquery Scan on all (dn1,dn2) (cost=0.00..1542.00 rows=100000 width=10)
-> HashAggregate (cost=0.00..1542.00 rows=2497 width=10)
-> Seq Scan on sales s (cost=0.00..1542.00 rows=100000 width=10)
(5 rows)
I am just running two datanodes for this though, not 6 like you.
Does the above work OK for you?
Is there anything in the datanode log files? Like hitting some resource maximum?
Also, I don't think it would impact you, but have you increased shared_buffers and work_mem from the defaults in the datanodes?
Thanks,
Mason
Aaron
________________________________
From: Aaron Jackson [ajackson at revionics.com<mailto:ajackson at revionics.com>]
Sent: Tuesday, September 23, 2014 10:15 AM
To: postgres-xl-general at lists.sourceforge.net<mailto:postgres-xl-general at lists.sourceforge.net>
Subject: Re: [Postgres-xl-general] Failed to close remote subplan
I've made a smallish replica table client.sales with a productid and revenue. The idea is to summarize revenue by product...
EXPLAIN SELECT s.productid, sum(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 1000;
Limit (cost=13820090.77..13833634.87 rows=1000 width=10)
-> GroupAggregate (cost=13820090.77..14297425.44 rows=35243 width=10)
-> Remote Subquery Scan on all (datanode1,datanode2,datanode3,datanode4,datanode5,datanode6) (cost=13820090.77..13979084.85 rows=63597632 width=10)
-> GroupAggregate (cost=13820090.77..13979084.85 rows=35243 width=10)
-> Sort (cost=13820090.77..13979084.85 rows=63597632 width=10)
Sort Key: productid
-> Seq Scan on sales s (cost=0.00..1229548.32 rows=63597632 width=10)
I wanted to compare that to a similar query using count() instead of sum(), which works.
EXPLAIN SELECT s.productid, count(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 1000;
Limit (cost=13820090.77..13833634.87 rows=1000 width=10)
-> GroupAggregate (cost=13820090.77..14297425.44 rows=35243 width=10)
-> Remote Subquery Scan on all (datanode1,datanode2,datanode3,datanode4,datanode5,datanode6) (cost=13820090.77..13979084.85 rows=63597632 width=10)
-> GroupAggregate (cost=13820090.77..13979084.85 rows=35243 width=10)
-> Sort (cost=13820090.77..13979084.85 rows=63597632 width=10)
Sort Key: productid
-> Seq Scan on sales s (cost=0.00..1229548.32 rows=63597632 width=10)
And to verify that this fails on other aggregates, I tried an average as well.
SELECT s.productid, avg(s.revenue) AS revenue FROM client.sales s group by s.productid LIMIT 10;
ERROR: Failed to close remote subplan
Note that I also adjusted the limit down to see if that was the bottleneck. Then I eliminated the limit entirely.
SELECT s.productid, sum(s.revenue) AS revenue FROM client.sales s group by s.productid;
SELECT s.productid, avg(s.revenue) AS revenue FROM client.sales s group by s.productid;
Both queries without the limit clause succeeded.
Aaron
________________________________
From: Aaron Jackson [ajackson at revionics.com<mailto:ajackson at revionics.com>]
Sent: Monday, September 22, 2014 3:21 PM
To: postgres-xl-general at lists.sourceforge.net<mailto:postgres-xl-general at lists.sourceforge.net>
Subject: [Postgres-xl-general] Failed to close remote subplan
I promise to stop throwing out questions... soon!
I have a case where most queries work just fine, however, the following query fails every time, with the message "ERROR: Failed to close remote subplan".
SELECT s.productid, sum(s.value) AS revenue FROM ns.summary s group by s.productid LIMIT 1000
If I modify the query ever so slightly, it's fine...
SELECT s.productid, count(s.value) AS revenue FROM ns.summary s group by s.productid LIMIT 1000
The coordinator (DEBUG1) logs "timeout while waiting for response" - which corresponds to the post-timeout in pgxc_node_receive. Unfortunately, the datanodes don't seem to have much information. Does anyone have a specific suggestion about where to look to find out why the remote datanode did not respond?
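[Archive editor's note: the mechanism behind that log line can be illustrated simply. A coordinator waiting on a datanode connection polls the socket with a timeout and gives up if no data arrives. The Python sketch below is an assumed analogue of that behavior, not the actual pgxc_node_receive code:]

```python
import select
import socket

def receive_with_timeout(conn, timeout_secs):
    """Wait up to timeout_secs for data on conn; return bytes, or None
    on timeout - loosely analogous to a coordinator polling a datanode."""
    readable, _, _ = select.select([conn], [], [], timeout_secs)
    if not readable:
        return None  # timed out, as in "timeout while waiting for response"
    return conn.recv(4096)

# Demo with a local socket pair standing in for coordinator/datanode.
coord, datanode = socket.socketpair()
datanode.sendall(b"partial aggregate row")
print(receive_with_timeout(coord, 1.0))   # data arrives in time
print(receive_with_timeout(coord, 0.1))   # no more data -> None (timeout)
coord.close()
datanode.close()
```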
Aaron
_______________________________________________
Postgres-xl-general mailing list
Postgres-xl-general at lists.sourceforge.net<mailto:Postgres-xl-general at lists.sourceforge.net>
https://lists.sourceforge.net/lists/listinfo/postgres-xl-general
--
Mason Sharp
TransLattice - http://www.translattice.com<http://www.translattice.com/>
Clustered and Distributed Database Solutions