[Postgres-xl-developers] postgres-xl crashes for large'ish datasets

pierre de fermat fermatslittletheorem at gmail.com
Fri May 13 05:46:54 PDT 2016


Hello
I have a few largish datasets (5 sets of 200+ million rows each) that I
need to analyse. The XL setup has 18 nodes (16 cores, 16 GB RAM, 500 GB
SAS on RAID each). It has 4 coordinators, 1 GTM, and 1 GTM proxy.
One of the machines runs a coordinator plus both the GTM and the GTM
proxy. Half the data nodes talk to the GTM directly and the other half
go through the proxy.

The data structure lends itself to good indexing. The data is distributed
by a key (sensor ID), which ensures that any aggregation/processing we do
is localized to a single data node. We have Postgres functions that
perform these aggregations.
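
For reference, the tables are distributed roughly like this (the table
and column names here are simplified for illustration, not our exact
schema):

    CREATE TABLE sensor_readings (
        sensor_id   bigint           NOT NULL,
        recorded_at timestamptz      NOT NULL,
        value       double precision
    )
    DISTRIBUTE BY HASH (sensor_id);

    -- index supporting per-sensor aggregation
    CREATE INDEX ON sensor_readings (sensor_id, recorded_at);
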
We run multi-threaded Java or Python code in which each thread is given a
set of sensor IDs to process. The workloads are spread across the
coordinators, which lets us engage all the data nodes in parallel.
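
The driver logic is roughly as follows (a simplified Python sketch; the
host names, the aggregate_sensor() function, and the batching are
illustrative, not our actual code):

    from concurrent.futures import ThreadPoolExecutor
    import psycopg2

    COORDINATORS = ["coord1", "coord2", "coord3", "coord4"]  # illustrative hosts

    # illustrative batching: 10 batches of 100 sensor IDs each
    batches = [range(i, i + 100) for i in range(0, 1000, 100)]

    def process_batch(coordinator, sensor_ids):
        # each thread opens its own connection to one coordinator and
        # calls a server-side aggregation function once per sensor ID
        conn = psycopg2.connect(host=coordinator, dbname="sensors")
        try:
            with conn.cursor() as cur:
                for sid in sensor_ids:
                    cur.execute("SELECT aggregate_sensor(%s)", (sid,))
            conn.commit()
        finally:
            conn.close()

    with ThreadPoolExecutor(max_workers=40) as pool:
        for i, batch in enumerate(batches):
            pool.submit(process_batch, COORDINATORS[i % len(COORDINATORS)], batch)
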
Such processing works under limited load, but the coordinator that shares
a machine with the GTMs consistently crashes under heavy load (40 or more
connections per coordinator, with max_connections set to 1000). The only
error message we get is:

"terminating connection because of crash of another server process","The
postmaster has commanded this server process to roll back the current
transaction and exit, because another server process exited abnormally and
possibly corrupted shared memory.","In a moment you should be able to
reconnect to the database and repeat your command."


We would like to debug this problem ourselves, or help you debug
Postgres-XL if our load has unearthed a bug. How do we generate detailed
diagnostics or logs that could help both of us?
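
So far we are planning to raise the log verbosity and enable core dumps
on the crashing coordinator, along these lines (the exact settings are
our guesses; corrections welcome):

    # postgresql.conf on the crashing coordinator
    log_min_messages = debug1
    log_error_verbosity = verbose
    log_line_prefix = '%m [%p] %u@%d '

    # shell: allow core dumps before starting the node
    ulimit -c unlimited
    pg_ctl start -c -D "$PGDATA"    # -c / --core-files enables core dumps

With a core file we could then attach gdb, get a backtrace, and send it
along.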

Regards
Pierre