[Postgres-xl-general] Excessive "Unexpected EOF" errors in GTM proxy log
Michael Misiewicz
mmisiewicz at gmail.com
Wed Feb 1 17:00:24 PST 2017
I'm not sure if this is related or not, but I'm also seeing very high
system CPU usage for the gtm_proxy process while doing a COPY operation. Is
this normal? I examined the process using perf, and found the following:
Samples: 2M of event 'cycles', Event count (approx.): 121821209092
  4.02%  [kernel]            [k] fget_light
  2.86%  [kernel]            [k] _spin_lock
  2.80%  [kernel]            [k] sock_poll
  2.18%  [kernel]            [k] _spin_lock_irqsave
  2.09%  gtm_proxy           [.] GTMProxy_ThreadMain
  1.64%  [kernel]            [k] tcp_poll
  1.34%  libpthread-2.12.so  [.] pthread_getspecific
  1.28%  [kernel]            [k] fput
  1.17%  [kernel]            [k] do_sys_poll
  1.13%  [kernel]            [k] __audit_syscall_exit
  1.05%  [kernel]            [k] tcp_recvmsg
  1.02%  [kernel]            [k] tcp_transmit_skb
  1.02%  gtm_proxy           [.] AllocSetAlloc
I'm not a C expert, but from some googling I gather that a spin lock is a
busy-wait loop. Could the time spent in _spin_lock be related to the
fget_light calls?
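
For a rough sense of how the listed samples split between kernel and user
space, here is a minimal Python sketch that sums the overhead column by the
[k] / [.] marker. The names PERF_REPORT and summarize are made up for
illustration, and the input is only the top symbols quoted above, so the
totals cover a fraction of all samples rather than the whole profile.

```python
import re
from collections import defaultdict

# Top symbols copied from the perf report above (not the full profile).
PERF_REPORT = """\
4.02% [kernel] [k] fget_light
2.86% [kernel] [k] _spin_lock
2.80% [kernel] [k] sock_poll
2.18% [kernel] [k] _spin_lock_irqsave
2.09% gtm_proxy [.] GTMProxy_ThreadMain
1.64% [kernel] [k] tcp_poll
1.34% libpthread-2.12.so [.] pthread_getspecific
1.28% [kernel] [k] fput
1.17% [kernel] [k] do_sys_poll
1.13% [kernel] [k] __audit_syscall_exit
1.05% [kernel] [k] tcp_recvmsg
1.02% [kernel] [k] tcp_transmit_skb
1.02% gtm_proxy [.] AllocSetAlloc
"""

# overhead%, shared object, [k] or [.] marker, symbol name
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+\[(.)\]\s+(\S+)\s*$")


def summarize(report):
    """Sum overhead percentages per marker: 'k' = kernel, '.' = user space."""
    totals = defaultdict(float)
    for line in report.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip headers and anything that is not a sample row
        pct, _dso, marker, _symbol = match.groups()
        totals[marker] += float(pct)
    return dict(totals)


totals = summarize(PERF_REPORT)
kernel_pct = totals.get("k", 0.0)
user_pct = totals.get(".", 0.0)
print(f"kernel: {kernel_pct:.2f}%  user: {user_pct:.2f}%")
```

On the rows shown this prints kernel: 19.15% and user: 4.45%, so among the
listed symbols most of the time is spent on the kernel side.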
On Tue, Jan 31, 2017 at 10:42 PM, Michael Misiewicz <mmisiewicz at gmail.com>
wrote:
> Hi all,
>
> I have a cluster with 18 datanodes, 3 coordinators, 1 GTM, and 1 GTM
> proxy. I am getting quite a bit of chatter in the GTM proxy log:
>
> 1:3925305088:2017-02-01 03:40:14.746 UTC -LOG: unexpected EOF on client connection
> LOCATION: ReadCommand, proxy_main.c:2049
> 1:3935794944:2017-02-01 03:40:14.817 UTC -LOG: unexpected EOF on client connection
> LOCATION: ReadCommand, proxy_main.c:2049
> 1:3925305088:2017-02-01 03:40:15.037 UTC -LOG: unexpected EOF on client connection
> LOCATION: ReadCommand, proxy_main.c:2049
> 1:3935794944:2017-02-01 03:40:15.092 UTC -LOG: unexpected EOF on client connection
> LOCATION: ReadCommand, proxy_main.c:2049
> 1:3925305088:2017-02-01 03:40:15.407 UTC -LOG: unexpected EOF on client connection
> LOCATION: ReadCommand, proxy_main.c:2049
>
> The volume is high enough that the log files are overflowing. Has anyone
> seen an error like this before? As near as I can tell everything is working
> correctly; there’s nothing out of the ordinary in how the cluster is
> behaving. My config is:
> #===========================
> # Added at initialization, 20170131_22:11:51
> nodename = 'gtm_pxy1'
> listen_addresses = '*'
> port = 20000
> gtm_host = '01.bm-datascience-dev.dev.nym2'
> gtm_port = 20001
> worker_threads = 100
> gtm_connect_retry_interval = 1
> # End of addition
>
> Michael
>
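
To put a number on the log chatter quoted above, a minimal Python sketch like
the following could count the "unexpected EOF" entries per second and per
source. It assumes each message sits on a single line in the actual log file
(the mail client wrapped them here) and that the second colon-separated field
is some kind of thread identifier; that is a guess from the excerpt, not
something the log format confirms. LOG_EXCERPT, ENTRY_RE and tally are
illustrative names.

```python
import re
from collections import Counter

# The five entries from the quoted message, unwrapped to one line each.
LOG_EXCERPT = """\
1:3925305088:2017-02-01 03:40:14.746 UTC -LOG: unexpected EOF on client connection
LOCATION: ReadCommand, proxy_main.c:2049
1:3935794944:2017-02-01 03:40:14.817 UTC -LOG: unexpected EOF on client connection
LOCATION: ReadCommand, proxy_main.c:2049
1:3925305088:2017-02-01 03:40:15.037 UTC -LOG: unexpected EOF on client connection
LOCATION: ReadCommand, proxy_main.c:2049
1:3935794944:2017-02-01 03:40:15.092 UTC -LOG: unexpected EOF on client connection
LOCATION: ReadCommand, proxy_main.c:2049
1:3925305088:2017-02-01 03:40:15.407 UTC -LOG: unexpected EOF on client connection
LOCATION: ReadCommand, proxy_main.c:2049
"""

# Field meanings are assumed: "ident" is presumed to be a thread identifier.
ENTRY_RE = re.compile(
    r"^\d+:(?P<ident>\d+):"
    r"(?P<second>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d+ UTC -LOG:\s+"
    r"unexpected EOF on client connection"
)


def tally(log_text):
    """Count matching entries per presumed thread id and per wall-clock second."""
    by_ident = Counter()
    by_second = Counter()
    for line in log_text.splitlines():
        match = ENTRY_RE.match(line)
        if match:
            by_ident[match.group("ident")] += 1
            by_second[match.group("second")] += 1
    return by_ident, by_second


by_ident, by_second = tally(LOG_EXCERPT)
for second, count in sorted(by_second.items()):
    print(second, count)
for ident, count in by_ident.most_common():
    print(ident, count)
```

Run against a full log file instead of the excerpt, the per-second counts
give the message rate, and the per-identifier counts show whether the
messages come from a few sources or are spread across all of them.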