[Postgres-xl-general] Best XL Configuration For Large Volume INSERT Statements

Silva, Richard rsilva at javelin.mg
Tue Sep 30 08:03:57 PDT 2014


Postgres-XL Community,

Our XL environment currently has 10 data-nodes, 1 coordinator, 1 gtm proxy, 1 gtm, 1 gtm standby all running on 8 core machines with 16GB of RAM and 1TB SSD drives. Up to this point, everything has been considerably faster than our single server production environment with the exception of INSERT statements.

Senario:
We conduct a simple join of two tables and write the results out to a new table.

CREATE TABLE myschema.mytable AS
SELECT m.id
      ,m.project_id
      ,m.account_id
      ,m.company_name
      ,p.start_date
      ,p.end_date
FROM (SELECT smb_project_id
            ,start_date
            ,end_date
      FROM myschema.projects) AS p
  INNER JOIN (SELECT id
                   ,project_id
                   ,account_id
                   ,company_name
             FROM myschema.customers) AS m
    ON m.project_id = p.project_id;

The resulting SELECT statement returns close to 100 million records which then get inserted into a table for further aggregation (among other things). The current run time averages about 45 minutes for both physical and temporary tables. I understand there are lots of different ways to accomplish inserts of this nature such as COPY; however, thats not really what I’m looking for.

Question:
What are the recommended configuration settings for XL when it comes to large volume inserts?

Here are the postgresql.conf settings I’ve changed on the datanodes, which have had little impact.

wal_level = minimal
wal_sync_method = fdatasync
synchronous_commit = off
default_statistics_target = 50
maintenance_work_mem = 208MB
constraint_exclusion = on
checkpoint_completion_target = 0.9
effective_cache_size = 2560MB
work_mem = 20MB
wal_buffers = -1
checkpoint_segments = 16
shared_buffers = 832MB
max_prepared_transactions = 100

Any suggestions or possible links to suggestions?

searching...
Richard Silva ETL Architect
Javelin • Making the brand promise personal, profitable and provable.






Access Manager: This email is intended only for the person or entity to which it is addressed and may contain information that is privileged, confidential or otherwise protected from disclosure. Dissemination, distribution or copying of this e-mail or the information herein by anyone other than the intended recipient, or an employee or agent responsible for delivering the message to the intended recipient, is prohibited. If you have received this e-mail in error, please immediately notify us by calling the Help Desk at (855) 237-8324.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.postgres-xl.org/private.cgi/postgres-xl-general-postgres-xl.org/attachments/20140930/3b5e8ef0/attachment.htm>


More information about the postgres-xl-general mailing list