server resetting

PostgreSQL 7.4.7 (yes, I've been telling them we need to upgrade to the
latest 7.4.x release)
Red Hat Enterprise Linux ES release 3
We are having problems with the PostgreSQL server resetting and dropping
all user connections. A core file is generated, and I've included a
backtrace below. I'm about to dig into the source to see what I can find,
but if anyone can put their finger on the problem, I would appreciate
it. I do see that exec_stmt() is being called with a null stmt pointer
(stmt=0x0 in frame #0), which I suspect is the immediate cause of the
crash. Why a null is being passed is what I plan to look into.
Thanks for any info, here's the backtrace:
Using host libthread_db library "/lib/tls/libthread_db.so.1".
Core was generated by `postgres: bwoods exp [local] INSERT '.
Program terminated with signal 11, Segmentation fault.
#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
in pl_exec.c
#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
#1 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fa9e0)
at pl_exec.c:903
#2 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90fab78)
at pl_exec.c:1139
#3 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90fab78)
at pl_exec.c:947
#4 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fab90)
at pl_exec.c:903
#5 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90fad20)
at pl_exec.c:1139
#6 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90fad20)
at pl_exec.c:947
#7 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x9133e60)
at pl_exec.c:903
#8 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90d97b8)
at pl_exec.c:1139
#9 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90d97b8)
at pl_exec.c:947
#10 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x9118408)
at pl_exec.c:903
#11 0x0083ee15 in exec_stmt_block (estate=0xfeff8a90, block=0x90d97e8)
at pl_exec.c:859
#12 0x0083e77a in plpgsql_exec_trigger (func=0x9149ae0, trigdata=0xfeff8ca0)
at pl_exec.c:645
#13 0x0083b053 in plpgsql_call_handler (fcinfo=0xfeff8b50) at
pl_handler.c:121
#14 0x080f1c8e in ExecCallTriggerFunc (trigdata=0xfeff8ca0, finfo=0x935e260,
per_tuple_context=0x0) at trigger.c:1150
#15 0x080f2be7 in DeferredTriggerExecute (event=0x92af050, itemno=0,
rel=0x8,
trigdesc=0x935daf0, finfo=0xfeff8a90, per_tuple_context=0x0)
at trigger.c:1859
#16 0x080f2fee in deferredTriggerInvokeEvents (immediate_only=1 '\001')
at trigger.c:2000
#17 0x080f314f in DeferredTriggerEndQuery () at trigger.c:2135
#18 0x08178ae8 in finish_xact_command () at postgres.c:1749
#19 0x08177816 in exec_simple_query (
query_string=0x8fe2438 "INSERT INTO logs
(seq,level,event_code,event_date,event_time,city,province,user_id,est_dsp_date,est_dsp_time,country,edilate,carr_code,notes,trac_notes,order_num)
VALUES ('2','6','TAS','09/14/06','19:")
at postgres.c:905
#20 0x08179f09 in PostgresMain (argc=4, argv=0x8f94b48,
username=0x8f94ab8 "bwoods") at postgres.c:2871
#21 0x08153c90 in BackendFork (port=0x8fa6af0) at postmaster.c:2564
#22 0x08153683 in BackendStartup (port=0x8fa6af0) at postmaster.c:2207
#23 0x08151be8 in ServerLoop () at postmaster.c:1119
#24 0x081512ae in PostmasterMain (argc=5, argv=0x8f92688) at
postmaster.c:897
#25 0x08121163 in main (argc=5, argv=0xfeff9e44) at main.c:214
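
For anyone reading along: frame #0 shows stmt=0x0 reaching exec_stmt() from a statement-list walker. The sketch below is only an illustration of that failure shape, not the PL/pgSQL source. Stmt, StmtList, run_stmt() and run_stmts() are hypothetical, simplified stand-ins. It shows how a NULL entry in a statement list can be turned into a reportable error instead of a segfault.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; NOT the real PL/pgSQL structures. */
typedef struct Stmt {
    int cmd_type;              /* which kind of statement this is */
} Stmt;

typedef struct StmtList {
    size_t  count;             /* number of entries in stmts */
    Stmt  **stmts;             /* array of statement pointers */
} StmtList;

enum { RC_OK = 0, RC_CORRUPT = -1 };

/* Executes one statement. It reads stmt->cmd_type, so a NULL stmt
 * has to be rejected before the first dereference. */
static int run_stmt(const Stmt *stmt)
{
    if (stmt == NULL)
        return RC_CORRUPT;     /* report corruption instead of crashing */
    return stmt->cmd_type >= 0 ? RC_OK : RC_CORRUPT;
}

/* Walks the list in order and stops at the first failure. */
static int run_stmts(const StmtList *list)
{
    if (list == NULL)
        return RC_CORRUPT;
    for (size_t i = 0; i < list->count; i++) {
        int rc = run_stmt(list->stmts[i]);
        if (rc != RC_OK)
            return rc;
    }
    return RC_OK;
}
```

A guard like this only changes how the failure surfaces. If the list entry was overwritten by something else in the process, the real bug is wherever the overwrite happened.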
Tom Lane wrote:
> Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
>> Tom Lane wrote:
>>> Well, these make it clear that you've got some pretty big chunks of
>>> nonstandard code in the backend, so my first thought is that there's a
>>> memory-clobber bug somewhere in that.
>> I'm not sure what you mean by 'nonstandard code,' could you expand on
>> that?
> The traces include code from /usr/local/lib/libgrid.so and
> /usr/local/lib/libpcmsrv.so. I don't know what those are,
> but I'm quite sure they are not invoked by a standard Postgres build.
> I also find it suggestive that they appear to have been written in C++;
> we've seen problems before from trying to link C++ code into the
> backend, because it tends to bring along its own incompatible ideas
> about how to do error recovery and memory management.

The libpcmsrv is a library for looking up miles, vendor provided. I'll
have to check on the other one, it may be related to the same package.
Thanks for the heads up on C++ code.
It seems we may have located a memory problem in a library that is used
throughout our code, so we are looking into that now.
Thanks again.
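
As a footnote on what a "memory-clobber bug" can look like: the sketch below is a toy illustration, not a claim about what libgrid.so or libpcmsrv.so actually do. Record and clear_name() are hypothetical names. It assumes a platform where an all-zero-bytes pointer compares equal to NULL, which holds on common systems but is not guaranteed by the C standard. The point is that a write that runs past a buffer can silently null out an unrelated pointer, and the crash then shows up later, in code far from the real bug.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a small buffer with an unrelated pointer after it. */
typedef struct Record {
    char        name[8];   /* buffer the helper is meant to clear */
    const int  *stmt;      /* unrelated pointer stored behind the buffer */
} Record;

/* Hypothetical buggy helper: it trusts the caller's length and zeroes that
 * many bytes starting at the beginning of the record. The clamp keeps this
 * demo inside the object so its behavior stays well defined. */
static void clear_name(Record *rec, size_t len)
{
    if (len > sizeof *rec)
        len = sizeof *rec;
    memset(rec, 0, len);   /* len > sizeof rec->name also wipes rec->stmt */
}
```

Calling clear_name(&rec, sizeof rec.name) leaves rec.stmt intact, while clear_name(&rec, sizeof rec) zeroes it. Any later code that dereferences rec.stmt without a check would then fault, much like the stmt=0x0 frame in the backtrace.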