[Postgres-xl-bugs] postgres-xl HA with pgpool and stream replication?

Lucky Haryadi lucky at equnix.co.id
Tue Jun 30 19:35:28 PDT 2015


Hi Kaijiang,

I think that should do it.
Just a reminder: since you run all components on 1 server, you have to
make sure that the Slaves are promoted correctly and that GTM is still
available after a failover.
In synchronous mode, you also need to make sure that synchronous
replication is turned off on the Master when its Slave is down.
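
A minimal sketch of that last point, assuming each Master has a single
synchronous Slave that connects with an application_name such as
"dn1_slave" (a hypothetical name). It only builds the postgresql.conf line
the Master should carry; the HA script would still have to write it and
reload the Master. An empty synchronous_standby_names stops commits from
waiting for a standby.

```python
def sync_standby_setting(standby_name: str, standby_alive: bool) -> str:
    """Return the postgresql.conf line for the Master of one pair.

    standby_name is a hypothetical application_name used by the Slave.
    """
    # While the Slave is up, commits wait for it. When it is down, an
    # empty list makes the Master stop waiting, so it does not block.
    value = standby_name if standby_alive else ""
    return f"synchronous_standby_names = '{value}'"


if __name__ == "__main__":
    print(sync_standby_setting("dn1_slave", standby_alive=True))
    print(sync_standby_setting("dn1_slave", standby_alive=False))
```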

In my case, using my HA scripts, I took down one of the data nodes while
the apps were inserting, and it took about 3 minutes after the failover
before the apps could insert normally again.
My assumption is that this 3-minute gap is caused by ARP resolution, since
I use a floating IP to register the node.
Or maybe, after promoting the Slave, I should execute pgxc_pool_reload()
again to reload the configuration?
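
A rough sketch of what could be run on each coordinator after a promotion,
under my own assumptions. The node name, host and port are hypothetical.
With a floating IP the registered address does not change, so only the
pool reload would apply; the ALTER NODE statement is only needed if the
promoted Slave is reached at a different address. I have not confirmed
that this removes the 3-minute gap.

```python
def post_promotion_sql(node_name, new_host=None, new_port=None):
    """Build the statements to run on a coordinator after a Slave is promoted.

    node_name, new_host and new_port are hypothetical example values.
    """
    statements = []
    # Re-register the node only when its address actually changed
    # (not the case when a floating IP moves to the promoted Slave).
    if new_host is not None and new_port is not None:
        statements.append(
            f"ALTER NODE {node_name} WITH (HOST = '{new_host}', PORT = {int(new_port)});"
        )
    # Ask the connection pooler to reload node information and drop its
    # pooled connections.
    statements.append("SELECT pgxc_pool_reload();")
    return statements


if __name__ == "__main__":
    for stmt in post_promotion_sql("datanode1"):
        print(stmt)
```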

Regards,

Lucky Haryadi
Equnix Business Solutions, PT
(An Open Source and Open Mind Company)

On Wed, Jul 1, 2015 at 8:59 AM, Kaijiang Chen <chenkaijiang at gmail.com>
wrote:

> Hi, Lucky and Koichi, very impressive and detailed reply, thank you very
> much!
>
> So, I think the solution to build an XL HA with pgpool should be below
> (Could you please review it?) :
>
> (suppose I have 2 servers:)
>
> 1) Deploy master XL (including all components like GTM, datanodes) in
> server A; and deploy slave XL (all components) in server B
>
> 2) Set up streaming replication: master coordinator ====> slave coordinator,
> master datanode 1 ====> slave datanode 1, master datanode 2 ====> slave
> datanode 2, .... master datanode N ====> slave datanode N
>
> 3) Deploy pgpool and configure it: master XL's coordinator is database 0
> and slave XL's coordinator is database 1 (because a coordinator can be
> treated as a single DB). That way pgpool can load-balance between master XL
> and slave XL, and if one of them fails, pgpool can switch to the other one.
>
> Is the solution correct?
>
> Best Regards,
> Kaijiang
>
>
>
> On Tue, Jun 30, 2015 at 11:03 AM, Lucky Haryadi <lucky at equnix.co.id>
> wrote:
>
>> Hi, Suzuki-san,
>>
>> Yes, in the previous experiment I found out that I hadn't configured
>> synchronous mode properly.
>> I forgot to set the synchronous_standby_names parameter.
>> My HA scripts already handle crashes on the master or slave side
>> (they cut off synchronous replication).
>> So I am now rerunning the experiment with synchronous mode configured
>> properly, to see whether the same cases happen again.
>>
>> Regards,
>>
>> Lucky Haryadi
>> Equnix Business Solutions, PT
>> (An Open Source and Open Mind Company)
>>
>>
>> On Tue, Jun 30, 2015 at 9:57 AM, Koichi Suzuki <koichi.dbms at gmail.com>
>> wrote:
>>
>>> You need to use synchronous replication not to lose any updates.
>>> Synchronous replication is slow and you can configure only one synchronous
>>> slave for a node.   When synchronous slave crashes or stops, the master is
>>> blocked unless the master cuts out the synchronous slave.
>>>
>>> Pgxc_ctl does some of this work.
>>>
>>> Regards;
>>>
>>> ---
>>> Koichi Suzuki
>>>
>>> 2015-06-30 11:26 GMT+09:00 Lucky Haryadi <lucky at equnix.co.id>:
>>>
>>>> Hi, Kaijiang
>>>>
>>>> Currently I'm doing some experiments with XL using hot standby
>>>> replication.
>>>> The configuration is like this:
>>>>
>>>>                --------------------------------------------------------------------
>>>>                | Master Coordinator ----- (Stream Reps) ------> Slave Coordinator |
>>>>                --------------------------------------------------------------------
>>>>                        |                                                  |
>>>>                        |                                                  |
>>>> -----------------------------------------------    -----------------------------------------------
>>>> | Master Data Node 1 -----> Slave Data Node 1 |    | Master Data Node 2 -----> Slave Data Node 2 |
>>>> -----------------------------------------------    -----------------------------------------------
>>>>
>>>> I built my own HA scripts, simple bash scripts that run a heartbeat
>>>> between each Master-Slave pair and perform the failover, i.e. take over
>>>> the floating IP and promote the Slave to Master.
>>>>
>>>> The whole configuration ran well under normal conditions, with no
>>>> transactions and no actions taken against it.
>>>> I then inserted some data, about 10 million rows (using
>>>> generate_series) into one table, and all was perfect. The data was
>>>> distributed normally across all Data Nodes, including the Slave ones.
>>>>
>>>> In the next experiment, I forced a shutdown of one of the data nodes
>>>> under normal conditions (no transaction running). Failover was
>>>> successful, and insertion after failover was also successful.
>>>>
>>>> In the experiment after that, I forced a shutdown of one data node
>>>> while a bunch of data was being inserted (for example, 30 seconds in).
>>>> The failover was successful and the slave data node was promoted to
>>>> master, but some weird things happened:
>>>>
>>>> 1. The table being inserted into is gone on the promoted slave.
>>>> 2. The row count of the table on the other data node (which is still in
>>>> normal Master-Slave condition) is zero.
>>>> 3. The total row count when I query via the Coordinator is zero.
>>>>
>>>> I'm still investigating those cases by repeating the same experiment
>>>> with more verbose logging.
>>>>
>>>> Will come to you again after I get some updates.
>>>>
>>>> Regards,
>>>>
>>>> Lucky Haryadi
>>>> Equnix Business Solutions, PT
>>>> (An Open Source and Open Mind Company)
>>>>
>>>> On Mon, Jun 29, 2015 at 9:38 PM, Kaijiang Chen <chenkaijiang at gmail.com>
>>>>  wrote:
>>>>
>>>>> Hi, all, does Postgres-XL support streaming replication (treat the XL
>>>>> as a single DB and replicate the entire DB to another XL for HA)?
>>>>>
>>>>> Can I treat the XL as a single DB and build an HA environment with
>>>>> PGPool?
>>>>>
>>>>> It looks like XL doesn't support some HA features of PostgreSQL.
>>>>> Is there any introduction to good practice for XL HA?
>>>>>
>>>>> Best Regards,
>>>>> Kaijiang
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> Postgres-xl-bugs mailing list
>>>>> Postgres-xl-bugs at lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/postgres-xl-bugs
>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>