[Postgres-xl-bugs] cache lookup failed for node ERROR when using XL

user susernameb at gmail.com
Fri Mar 11 06:53:01 PST 2016


For Problem #1: use the attached scripts to reproduce it.

Steps

1. cd to the XL installation dir

2. ./setup.sh

3. psql --dbname=postgres -p 5433 -c "create table test (val int);"

   Step 3 works fine.

4. ./stop.sh

5. ./start.sh

6. psql --dbname=postgres -p 5433 -c "create table test1 (val int);"

 

Step 6 gives error: “FATAL:  cache lookup failed for node 16384”

Error is seen on Ubuntu 14.04.1 LTS and Ubuntu 15.04
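
The steps above can be sketched as one script. This is only a rough sketch: it assumes the attached setup.sh, stop.sh and start.sh sit in the current (XL installation) directory and that the coordinator listens on port 5433 as in the steps. The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch of the reproduction steps. setup.sh, stop.sh and start.sh are the
# attached scripts; the function names below are hypothetical.

PORT="${PORT:-5433}"   # coordinator port used in the steps above

# Return 0 if the given psql output contains the reported failure.
is_cache_lookup_failure() {
    case "$1" in
        *"cache lookup failed for node"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Run one CREATE TABLE on the coordinator and print psql's combined output.
create_table() {
    psql --dbname=postgres -p "$PORT" -c "create table $1 (val int);" 2>&1
}

# Steps 2 to 6. Returns 1 if the error shows up, 0 if it does not,
# and 2 if one of the helper scripts fails.
reproduce() {
    ./setup.sh || return 2
    create_table test              # step 3: works in the reported setup
    ./stop.sh && ./start.sh || return 2
    out=$(create_table test1)      # step 6: fails in the reported setup
    echo "$out"
    if is_cache_lookup_failure "$out"; then
        echo "reproduced"
        return 1
    fi
    echo "not reproduced"
}
```

To use it, cd to the XL installation dir, source the file and call reproduce.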

 

Maybe it’s just something in our setup that’s causing the problem.

 

Problem #2 

Yes, this is a transient error. That’s why it has been difficult to nail down a test case.

 

Thanks!

 

From: Pavan Deolasee [mailto:pavan.deolasee at gmail.com] 
Sent: Thursday, March 10, 2016 11:55 PM
To: user
Cc: Koichi Suzuki; postgres-xc-bugs
Subject: Re: [Postgres-xl-bugs] cache lookup failed for node ERROR when using XL

 

 

 

On Thu, Mar 10, 2016 at 10:49 PM, user <susernameb at gmail.com> wrote:

Ok! There are two problems and we are mixing the threads here.

 

 

Thanks for clarifying. This definitely helps.

 

Problem 1: If the XL cluster is abruptly killed, on restart we get the error “FATAL:  cache lookup failed for node 16384”. Any query issued to the coordinator gives this error.

 

I tried this on my OSX setup and didn't see any issue. Will try on CentOS next. Can you please tell me how you restart the servers? Do you use pgxc_ctl to configure and manage the cluster?

 

Is it possible to create a self-contained test to replicate this problem?

 

Problem 2: Set up the cluster from scratch and keep autovacuum on. We then see the errors below every now and then in the coordinator log.

“could not find tuple for relation …”

“catalog is missing 3 attribute(s) for relid …”

 

More questions. Is it a transient error, or does it stay forever? If it's a permanent error, can you please try reindexing pg_class and pg_attribute and see if that helps?
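
A minimal sketch of the reindexing suggested above, assuming the coordinator is on port 5433 as in the reproduction steps and that psql connects as a superuser. Whether the same statements also need to be run on each datanode is not stated in this thread. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: reindex the two catalogs named above through the coordinator.

PORT="${PORT:-5433}"   # assumed coordinator port

# Print one REINDEX statement per catalog.
reindex_sql() {
    for catalog in pg_class pg_attribute; do
        printf 'REINDEX TABLE pg_catalog.%s;\n' "$catalog"
    done
}

# Feed the statements to psql on standard input.
run_reindex() {
    reindex_sql | psql --dbname=postgres -p "$PORT"
}
```

Calling reindex_sql alone prints the statements for review; run_reindex sends them to the server.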

 

 

Hope this avoids confusion.

 

If it helps, neither Problem 1 nor Problem 2 is seen on XC “REL_1_2_STABLE”.

 

 

Thanks for letting us know. This part of the code has changed significantly between XC and XL 9.5. Also, PostgreSQL itself has made significant changes to the way catalogs are scanned.

 

Thanks,

Pavan

 

-- 

 Pavan Deolasee                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-------------- next part --------------
A non-text attachment was scrubbed...
Name: stop.sh
Type: application/octet-stream
Size: 134 bytes
Desc: not available
URL: <http://lists.postgres-xl.org/private.cgi/postgres-xl-bugs-postgres-xl.org/attachments/20160311/c4a439a7/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: setup.sh
Type: application/octet-stream
Size: 2200 bytes
Desc: not available
URL: <http://lists.postgres-xl.org/private.cgi/postgres-xl-bugs-postgres-xl.org/attachments/20160311/c4a439a7/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: start.sh
Type: application/octet-stream
Size: 316 bytes
Desc: not available
URL: <http://lists.postgres-xl.org/private.cgi/postgres-xl-bugs-postgres-xl.org/attachments/20160311/c4a439a7/attachment-0002.obj>

