[Postgres-xl-developers] Failed to get pooled connections after failover

Joakim Lundgren lundgren700 at gmail.com
Fri Aug 28 03:00:55 PDT 2015


I have some problems when I test the failover process in Postgres-XL.

I use the following topology:

node_name  node_type  node_port  node_host  nodeis_primary  nodeis_preferred
coord1     C          20015      db-4       FALSE           FALSE
coord2     C          20016      db-5       FALSE           FALSE
coord3     C          20017      db-6       FALSE           FALSE
datanode1  D          5434       db-4       TRUE            TRUE
datanode2  D          5435       db-5       FALSE           FALSE
datanode3  D          5436       db-6       FALSE           FALSE

Each coordinator and datanode also has a slave node, placed in a "circular"
manner: the coord1 master on db-4 has its slave on db-5, the coord2 master
on db-5 has its slave on db-6, and the coord3 master on db-6 has its slave
on db-4. The same goes for the datanodes.
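
As a rough illustration of the circular layout described above, the following Python sketch lists which roles disappear when one host goes down. The function names (`placement`, `lost_roles`) are hypothetical, and GTM components are not modelled because the post does not state where they run.

```python
# Hosts in ring order, as described in the post.
HOSTS = ["db-4", "db-5", "db-6"]


def placement(hosts):
    """Map each host to the roles it carries in the circular layout.

    Node N has its master on hosts[N-1] and its slave on the next host
    in the ring (wrapping around at the end).
    """
    roles = {host: [] for host in hosts}
    for i, host in enumerate(hosts):
        n = i + 1
        slave_host = hosts[(i + 1) % len(hosts)]
        for kind in ("coord", "datanode"):
            roles[host].append(f"{kind}{n} master")
            roles[slave_host].append(f"{kind}{n} slave")
    return roles


def lost_roles(hosts, down):
    """Return the sorted roles that stop running when `down` is shut off."""
    return sorted(placement(hosts)[down])


print(lost_roles(HOSTS, "db-6"))
```

With this layout, losing db-6 takes out the masters of coord3 and datanode3 together with the slaves of coord2 and datanode2.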

I create a table 'test_table': create table test_table (a varchar(10));

I insert a row: insert into test_table values ('3333333333');

I do a select: select xc_node_id,* from test_table;

 xc_node_id  |     a
-------------+-----------
 -1894792127 | 333333333

I do a select from pgxc_node:
select node_host from pgxc_node where node_id = -1894792127;

 node_host
-----------
 db-6

I do a simple failover test by shutting down host db-6. The following then
happens:

PGXC monitor all
Running: gtm master
Running: gtm slave
Running: gtm proxy gtm_pxy1
Running: gtm proxy gtm_pxy2
Not running: gtm proxy gtm_pxy3
Running: coordinator master coord1
Running: coordinator slave coord1
Running: coordinator master coord2
Not running: coordinator slave coord2
Not running: coordinator master coord3
Running: coordinator slave coord3
Running: datanode master datanode1
Running: datanode slave datanode1
Running: datanode master datanode2
Not running: datanode slave datanode2
Not running: datanode master datanode3
Running: datanode slave datanode3
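
For comparing monitor runs before and after a failover, a minimal Python sketch such as the one below can pull out the components reported as down. It assumes only the "Running:" / "Not running:" line format shown in the output above; the function name `not_running` and the shortened sample are illustrative.

```python
def not_running(monitor_output):
    """Return the components that the monitor output reports as down."""
    down = []
    for line in monitor_output.splitlines():
        line = line.strip()
        if line.startswith("Not running:"):
            # Keep everything after the first colon, e.g. "gtm proxy gtm_pxy3".
            down.append(line.split(":", 1)[1].strip())
    return down


# Shortened sample in the same format as the monitor output in this post.
sample = """\
Running: gtm master
Not running: gtm proxy gtm_pxy3
Running: coordinator master coord2
Not running: coordinator slave coord2
Not running: coordinator master coord3
"""

print(not_running(sample))
```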

I fail over coord3, with the following result:

PGXC failover coordinator coord3
Failover coordinators.
Failover the coordinator coord3.
Failover coordinator coord3 using gtm gtm_pxy1
Actual Command: ssh postgres@db-4 "( pg_ctl promote -Z coordinator -D
/home/postgres/pgxc/nodes/coord_slave ) > /tmp/db5_STDOUT_10575_0 2>&1" <
/dev/null > /dev/null 2>&1
Bring remote stdout: scp postgres@db-4:/tmp/db5_STDOUT_10575_0
/tmp/STDOUT_10575_1 > /dev/null 2>&1
Actual Command: ssh postgres@db-4 "( pg_ctl restart -Z coordinator -D
/home/postgres/pgxc/nodes/coord_slave -w -o -i; sleep 1 ) >
/tmp/db5_STDOUT_10575_2 2>&1" < /dev/null > /dev/null 2>&1
Bring remote stdout: scp postgres@db-4:/tmp/db5_STDOUT_10575_2
/tmp/STDOUT_10575_3 > /dev/null 2>&1
pgxc_ctl.conf                                      100%   19KB  18.6KB/s   00:00
Datanode coord3 is not running.  Skip reconfiguration for this datanode.
ERROR:  Failed to get pooled connections
 pgxc_pool_reload
------------------
 t
(1 row)

EXECUTE DIRECT
 pgxc_pool_reload
------------------
 t
(1 row)

EXECUTE DIRECT
 pgxc_pool_reload
------------------
 t
(1 row)

ERROR:  Failed to get pooled connections
 pgxc_pool_reload
------------------
 t
(1 row)

EXECUTE DIRECT
 pgxc_pool_reload
------------------
 t
(1 row)

EXECUTE DIRECT
 pgxc_pool_reload
------------------
 t
(1 row)

WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  can not connect to GTM: Connection refused
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
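
Because the output above is long and repetitive, a small tally by message can make it easier to see which failures occur and how often (for example, "No route to host" versus "Connection refused"). The Python sketch below is only an illustration; it assumes nothing beyond the "WARNING:" / "ERROR:" line prefixes visible in the log, and the function name `tally` is hypothetical.

```python
from collections import Counter


def tally(log_text):
    """Count WARNING and ERROR lines, grouped by (level, message)."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(("WARNING:", "ERROR:")):
            # Split on the first colon only; the message may contain colons.
            level, _, message = line.partition(":")
            counts[(level, message.strip())] += 1
    return counts


# A few lines copied from the failover output in this post.
sample = """\
WARNING:  can not connect to GTM: No route to host
WARNING:  can not connect to GTM: No route to host
WARNING:  Xid is invalid.
ERROR:  GTM error, could not obtain snapshot XID = 0
WARNING:  can not connect to GTM: Connection refused
"""

for (level, message), count in tally(sample).most_common():
    print(f"{count:3d}  {level}: {message}")
```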

The same thing happens when failing over the datanode datanode3.

Monitoring the processes:

PGXC monitor all
Running: gtm master
Running: gtm slave
Running: gtm proxy gtm_pxy1
Running: gtm proxy gtm_pxy2
Not running: gtm proxy gtm_pxy3
Running: coordinator master coord1
Running: coordinator slave coord1
Running: coordinator master coord2
Not running: coordinator slave coord2
Running: coordinator master coord3
Running: datanode master datanode1
Running: datanode slave datanode1
Running: datanode master datanode2
Not running: datanode slave datanode2
Running: datanode master datanode3

It seems that the masters coord3 and datanode3 have failed over to their
slaves, even though there were some errors.

When selecting from test_table, the following happens:

select * from test_table;
ERROR:  Failed to get pooled connections

Have you seen this behaviour before? It seems to me that GTM somehow does
not get resynchronized with the promoted slave processes. I have also tried
this with the latest snapshot from the Git repository, without success. Is
there something else that must be done when failing over? I have also tried
making checkpoints and creating barriers, without success.

/Regards Joakim Lundgren