[postgres-xl-general] something wrong with my Postgres-XL

Tomas Vondra tomas.vondra at 2ndquadrant.com
Sat Oct 28 09:35:59 PDT 2017


Please be more careful when responding - use "reply all" or "reply list"
(or whatever your client offers) so that you respond to the mailing
list, not just to me. Otherwise it's difficult for others to keep track
of the discussion :-/

On 10/28/2017 02:36 PM, Alexander Voytsekhovskyy wrote:
> Yeah, you are right,
> 
> explain without analyze gives me correct query plan. 
> 
> But the main problem is still here - even yours simple example or my
> real data gives 2 times slower execution on Postgres-XL with 2 nodes (2
> physical servers same network) than 1 normal PostgreSQL with same
> hardware. Should i try to add more nodes to get it faster than
> normal PostgreSQL?
> 

As I mentioned before, the problem is that while we correctly plan the
query, we execute the remote part using cursors. And that unfortunately
disables parallelism in the executor, and no workers get launched.

We do have some experimental patches that address this, but haven't
managed to clean and test them properly before pgconf.eu. It's on the
TODO, though.

> And one more: Here i am creating new tables equal to real ones in new
> schema:
> 
> --------------------
> CREATE TABLE delivery_data (
>     delivery_id integer NOT NULL,
>     id_product integer NOT NULL,
>     client_product_id integer,
>     amount numeric(14,6) DEFAULT 0.000000 NOT NULL,
>     orig_amount numeric(14,6) DEFAULT NULL::numeric,
>     id_client_tt integer NOT NULL,
>     id_ta integer DEFAULT 0 NOT NULL,
>     money numeric(16,6) DEFAULT 0.000000 NOT NULL,
>     money_purchase numeric(16,6) DEFAULT NULL::numeric,
>     expire_date date,
>     manufacture_date date,
>     delivery_number_id integer,
>     invoice_number_id integer,
>     order_number_id integer,
>     delivery_date date NOT NULL,
>     delivery_group_id integer,
>     money_with_vat numeric(16,6) DEFAULT NULL::numeric,
>     money_purchase_with_vat numeric(16,6) DEFAULT NULL::numeric,
>     warehouse_id integer
> );
> 
> CREATE TABLE delivery_data_2017_q1 (
>     CONSTRAINT delivery_data_2017_q1_delivery_date_check CHECK
> (((delivery_date >= '2017-01-01'::date) AND (delivery_date <=
> '2017-03-31'::date)))
> )
> INHERITS (delivery_data) to node (datanode_1);;
> 
> 
> CREATE TABLE delivery_data_2017_q2 (
>     CONSTRAINT delivery_data_2017_q2_delivery_date_check CHECK
> (((delivery_date >= '2017-04-01'::date) AND (delivery_date <=
> '2017-06-30'::date)))
> )
> INHERITS (delivery_data) to node (datanode_1);;
> 

I wonder if mapping partitions explicitly to datanodes like this is a
good idea, as it somewhat conflicts with the idea that spreading the
data to datanodes is done by table distribution.

For example, it's common that most people access "recent data", which
may be a single partition. With this approach, all the queries will be
handled by just a single datanode, which does not really scale.

So I'd just create the partitions, without specifying the nodes.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



More information about the postgres-xl-general mailing list