[Postgres-xl-bugs] postgres-xl HA with pgpool and stream replication?

Kaijiang Chen chenkaijiang at gmail.com
Thu Jul 9 04:01:58 PDT 2015


More info: I can run the "\d" command on the slave, but I can't run
"select * from t;", where t was created on the master as:
create table t (f1 int, f2 int);

I found 2 log entries from the slave coordinator that look useful:

2015-07-09 18:50:32.035 CST,,,3909,,559e51f7.f45,1,,2015-07-09 18:50:31
CST,,0,LOG,00000,"database system is ready to accept read only
connections",,,,,,,,"sigusr1_handler, postmaster.c:4533",""
------------ It looks like the slave is ready to execute read-only queries.

2015-07-09 18:52:24.135
CST,"postgres","postgres",3921,"127.0.0.1:60857",559e51fd.f51,4,"SELECT",2015-07-09
18:50:37 CST,2/8,0,ERROR,XX000,"cannot assign TransactionIds during
recovery",,,,,,"select * from t;",,"GetNewTransactionId,
varsup.c:146","psql"
----------- This error appears only on the coordinator, not on any datanode.
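
For anyone reading entries like the two above: they are in PostgreSQL's
csvlog format, so the severity, SQLSTATE, and source location can be pulled
out mechanically. The following is only a rough sketch. It assumes the
23-column csvlog layout of the 9.2 series, and it assumes each entry sits on
a single line in the actual log file (the line breaks above come from mail
wrapping). The function name parse_csvlog_line is made up for illustration.

```python
import csv

# Column names of the csvlog format, assuming the 9.2-series layout (23 fields).
CSVLOG_COLUMNS = [
    "log_time", "user_name", "database_name", "process_id",
    "connection_from", "session_id", "session_line_num", "command_tag",
    "session_start_time", "virtual_transaction_id", "transaction_id",
    "error_severity", "sql_state_code", "message", "detail", "hint",
    "internal_query", "internal_query_pos", "context", "query",
    "query_pos", "location", "application_name",
]


def parse_csvlog_line(line):
    """Hypothetical helper: split one csvlog entry into a dict keyed by column name."""
    row = next(csv.reader([line]))
    return dict(zip(CSVLOG_COLUMNS, row))


# The second log entry from above, joined back into a single line.
entry = parse_csvlog_line(
    '2015-07-09 18:52:24.135 CST,"postgres","postgres",3921,'
    '"127.0.0.1:60857",559e51fd.f51,4,"SELECT",2015-07-09 18:50:37 CST,'
    '2/8,0,ERROR,XX000,"cannot assign TransactionIds during recovery",'
    ',,,,,"select * from t;",,"GetNewTransactionId, varsup.c:146","psql"'
)

print(entry["error_severity"], entry["sql_state_code"])  # ERROR XX000
print(entry["location"])  # GetNewTransactionId, varsup.c:146
```

Here the location field points at GetNewTransactionId, which suggests the
coordinator tried to assign a transaction ID for the SELECT even though the
node was still in recovery.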



On Thu, Jul 9, 2015 at 6:34 PM, Kaijiang Chen <chenkaijiang at gmail.com>
wrote:

> Hi, Lucky Haryadi, thank you very much for your effective response!
>
> I worked on it, and the stream replication from master XL to slave XL
> works now!
>
> But there is still a problem. When I run a read-only query on the slave
> XL, I get an error message:
> postgres=# select txid_current();
> ERROR:  cannot execute txid_current() during recovery
> postgres=# select * from t;
> ERROR:  cannot assign TransactionIds during recovery
>
> I need to run read-only queries on the slave for load balancing. There is a
> commit that fixes this problem:
>
> http://ehc.ac/p/postgres-xl/postgres-xl/ci/fe9985c168d85738e5d88ed9407b840449f31b75/#diff-5
>
> But that snapshot is too old, and it seems unwise to merge it into the
> XL version I downloaded from
> http://sourceforge.net/projects/postgres-xl/files/Releases/Version_9.2rc/postgres-xl-v9.2-src.tar.gz/download
>
> Is the bug fixed on the code branch that is based on 9.2rc? If so, I
> could use that branch or merge its fix into version 9.2rc.
>
>
> Thank you again for your help!
>
> Best Wishes
> Kaijiang
>
>
>
> On Wed, Jul 1, 2015 at 10:35 AM, Lucky Haryadi <lucky at equnix.co.id> wrote:
>
>> Hi Kaijiang,
>>
>> I think that should do it.
>> Just a reminder: you have to ensure that the slaves are promoted correctly
>> and that GTM is still available, since you run all components on one server.
>> With synchronous mode, you also want to ensure that synchronous
>> mode is turned off when the slaves are down.
>>
>> In my case, using my HA scripts, when I took down one of the data nodes
>> while the apps were inserting, it took about 3 minutes after failover
>> before the apps could insert normally again.
>> My assumption is that this 3-minute gap is caused by ARP resolution, since I
>> use a floating IP to register the node.
>> Or maybe, after promoting the slave, I should execute pgxc_pool_reload() again
>> to reload the configuration?
>>
>> Regards,
>>
>> Lucky Haryadi
>> Equnix Business Solutions, PT
>> (An Open Source and Open Mind Company)
>>
>> On Wed, Jul 1, 2015 at 8:59 AM, Kaijiang Chen <chenkaijiang at gmail.com>
>> wrote:
>>
>>> Hi, Lucky and Koichi, very impressive and detailed reply, thank you very
>>> much!
>>>
>>> So I think the way to build XL HA with pgpool is as follows
>>> (could you please review it?):
>>>
>>> (Suppose I have 2 servers.)
>>>
>>> 1) Deploy the master XL (including all components, such as GTM and the
>>> datanodes) on server A, and deploy the slave XL (all components) on server B.
>>>
>>> 2) Set up streaming replication: master coordinator ====> slave
>>> coordinator, master datanode 1 ====> slave datanode 1, master datanode 2
>>> ====> slave datanode 2, .... master datanode N ====> slave datanode N
>>>
>>> 3) Deploy and configure pgpool: the master XL's coordinator is database 0
>>> and the slave XL's coordinator is database 1 (because a coordinator can be
>>> treated as a single DB). This way pgpool can load-balance between the master XL
>>> and the slave XL, and if one of them fails, pgpool can switch to the other one.
>>>
>>> Is the solution correct?
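
The pgpool side of step 3 might look roughly like the settings generated
below. This is only a sketch: the host names server-a and server-b, port
5432, the equal weights, and the helper name render_pgpool_backends are all
assumptions for illustration, and the master_slave_mode and
master_slave_sub_mode settings assume a pgpool-II 3.x release. It also
assumes the slave coordinator can actually serve read-only queries.

```python
def render_pgpool_backends(backends):
    """Hypothetical helper: build a pgpool.conf fragment.

    backends is a list of (host, port, weight) tuples, one per coordinator;
    index 0 is the master XL coordinator, index 1 the slave XL coordinator.
    """
    lines = []
    for i, (host, port, weight) in enumerate(backends):
        lines.append(f"backend_hostname{i} = '{host}'")
        lines.append(f"backend_port{i} = {port}")
        lines.append(f"backend_weight{i} = {weight}")
    # Spread read-only queries across the backends.
    lines.append("load_balance_mode = on")
    # Tell pgpool the backends are kept in sync by streaming replication
    # (parameter names as used in pgpool-II 3.x; an assumption here).
    lines.append("master_slave_mode = on")
    lines.append("master_slave_sub_mode = 'stream'")
    return "\n".join(lines)


# Assumed host names and port; replace with the real coordinator addresses.
conf = render_pgpool_backends([("server-a", 5432, 1), ("server-b", 5432, 1)])
print(conf)
```

Only the two coordinators are listed as backends, since the plan treats
each XL cluster as a single database behind its coordinator.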
>>>
>>> Best Regards,
>>> Kaijiang
>>>
>>>
>>>
>>> On Tue, Jun 30, 2015 at 11:03 AM, Lucky Haryadi <lucky at equnix.co.id>
>>> wrote:
>>>
>>>> Hi, Suzuki-san,
>>>>
>>>> Yes, in the previous experiment I found that I had not configured
>>>> synchronous mode correctly.
>>>> I forgot to set the synchronous_standby_names parameter.
>>>> My HA scripts already handle crashes on the master or slave
>>>> side (by turning off synchronous mode).
>>>> So I am now rerunning the experiment with synchronous mode configured
>>>> properly, to see whether those cases happen again.
>>>>
>>>> Regards,
>>>>
>>>> Lucky Haryadi
>>>> Equnix Business Solutions, PT
>>>> (An Open Source and Open Mind Company)
>>>>
>>>>
>>>> On Tue, Jun 30, 2015 at 9:57 AM, Koichi Suzuki <koichi.dbms at gmail.com>
>>>> wrote:
>>>>
>>>>> You need to use synchronous replication to avoid losing updates.
>>>>> Synchronous replication is slow, and you can configure only one synchronous
>>>>> slave per node. When the synchronous slave crashes or stops, the master is
>>>>> blocked unless the master cuts out the synchronous slave.
>>>>>
>>>>> Pgxc_ctl does some of this work.
>>>>>
>>>>> Regards;
>>>>>
>>>>> ---
>>>>> Koichi Suzuki
>>>>>
>>>>> 2015-06-30 11:26 GMT+09:00 Lucky Haryadi <lucky at equnix.co.id>:
>>>>>
>>>>>> Hi, Kaijiang
>>>>>>
>>>>>> Currently I'm doing some experiments with XL using hot standby
>>>>>> replication.
>>>>>> The configuration is like this:
>>>>>>
>>>>>>               ---------------------------------------------------------------------
>>>>>>               | Master Coordinator ----- (Stream Reps) ------> Slave Coordinator  |
>>>>>>               ---------------------------------------------------------------------
>>>>>>                        |                                                  |
>>>>>>                        |                                                  |
>>>>>> -----------------------------------------------    -----------------------------------------------
>>>>>> | Master Data Node 1 -----> Slave Data Node 1 |    | Master Data Node 2 -----> Slave Data Node 2 |
>>>>>> -----------------------------------------------    -----------------------------------------------
>>>>>>
>>>>>> I also built my own HA scripts: simple bash scripts that run a
>>>>>> heartbeat between each Master-Slave pair and perform the failover, i.e.
>>>>>> take over the floating IP and promote the Slave to Master.
>>>>>>
>>>>>> The whole configuration ran well under normal conditions, with no
>>>>>> transactions and no changes to the configuration.
>>>>>> Then I inserted about 10 million rows (using generate_series)
>>>>>> into one table, and all was perfect. The data was distributed normally
>>>>>> across all Data Nodes, including the Slave ones.
>>>>>>
>>>>>> In the next experiment, I forced a shutdown of one of the data
>>>>>> nodes under normal conditions (no transactions running). Failover was
>>>>>> successful. Insertion after failover was also successful.
>>>>>>
>>>>>> In the following experiment, I forced a shutdown of one data node
>>>>>> while inserting a bunch of data (for example, 30 seconds into the insert).
>>>>>> The failover was successful and the slave data node was promoted to
>>>>>> master, but some weird things happened:
>>>>>>
>>>>>> 1. The table being inserted into is gone on the promoted slave.
>>>>>> 2. The row count in the table on the other data node (which is still in
>>>>>> the normal Master-Slave state) is zero.
>>>>>> 3. The total row count when I query via the Coordinator is zero.
>>>>>>
>>>>>> I'm still investigating those cases by repeating the same
>>>>>> experiment with more verbose logging.
>>>>>>
>>>>>> I will get back to you after I have some updates.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> Lucky Haryadi
>>>>>> Equnix Business Solutions, PT
>>>>>> (An Open Source and Open Mind Company)
>>>>>>
>>>>>> On Mon, Jun 29, 2015 at 9:38 PM, Kaijiang Chen <
>>>>>> chenkaijiang at gmail.com> wrote:
>>>>>>
>>>>>>> Hi, all, does Postgres-XL support streaming replication (treating the XL
>>>>>>> as a single DB and replicating the entire DB to another XL for HA)?
>>>>>>>
>>>>>>> Can I treat the XL as a single DB and build an HA environment with
>>>>>>> PGPool?
>>>>>>>
>>>>>>> It looks like XL doesn't support some HA features of
>>>>>>> PostgreSQL. Is there any guide to good practices for XL HA?
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Kaijiang
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Postgres-xl-bugs mailing list
>>>>>>> Postgres-xl-bugs at lists.sourceforge.net
>>>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xl-bugs
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards,
>>>>
>>>> Lucky Haryadi
>>>> Equnix Business Solutions, PT
>>>> (An Open Source and Open Mind Company)
>>>>
>>>> Pusat Niaga ITC Roxy Mas Blok C2/42.  Jl. KH Hasyim Ashari 125, Jakarta
>>>> Pusat
>>>> T: +62 21 7997 692 F: +62 21 6315 281 M: +62 856 932 78 456
>>>>
>>>>
>>>>
>>>
>>
>