Databases

  • server resetting


    Postgresql 7.4.7 (yes, I've been telling them we need to upgrade to the
    latest 7.4)
    Red Hat Enterprise Linux ES release 3
    We are having problems with the postgresql server resetting and dropping
    all user connections. There is a core file generated and I've attached
    a backtrace. I'm about to dig into the source to see what I can find,
    but if anyone can put their finger on the problem, I would appreciate
    it. I do realize that there is a call to exec_stmt() which appears to
    have a null value being passed, which I suspect is the issue. Why a
    null is being passed is what I plan to look into.
    Thanks for any info, here's the backtrace:
    Using host libthread_db library "/lib/tls/libthread_db.so.1".
    Core was generated by `postgres: bwoods exp [local] INSERT '.
    Program terminated with signal 11, Segmentation fault.
    #0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
    in pl_exec.c
    #0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
    #1 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fa9e0)
    at pl_exec.c:903
    #2 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90fab78)
    at pl_exec.c:1139
    #3 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90fab78)
    at pl_exec.c:947
    #4 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fab90)
    at pl_exec.c:903
    #5 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90fad20)
    at pl_exec.c:1139
    #6 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90fad20)
    at pl_exec.c:947
    #7 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x9133e60)
    at pl_exec.c:903
    #8 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90d97b8)
    at pl_exec.c:1139
    #9 0x0083f0ca in exec_stmt (estate=0xfeff8a90, stmt=0x90d97b8)
    at pl_exec.c:947
    #10 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x9118408)
    at pl_exec.c:903
    #11 0x0083ee15 in exec_stmt_block (estate=0xfeff8a90, block=0x90d97e8)
    at pl_exec.c:859
    #12 0x0083e77a in plpgsql_exec_trigger (func=0x9149ae0, trigdata=0xfeff8ca0)
    at pl_exec.c:645
    #13 0x0083b053 in plpgsql_call_handler (fcinfo=0xfeff8b50) at
    pl_handler.c:121
    #14 0x080f1c8e in ExecCallTriggerFunc (trigdata=0xfeff8ca0, finfo=0x935e260,
    per_tuple_context=0x0) at trigger.c:1150
    #15 0x080f2be7 in DeferredTriggerExecute (event=0x92af050, itemno=0,
    rel=0x8,
    trigdesc=0x935daf0, finfo=0xfeff8a90, per_tuple_context=0x0)
    at trigger.c:1859
    #16 0x080f2fee in deferredTriggerInvokeEvents (immediate_only=1 '\001')
    at trigger.c:2000
    #17 0x080f314f in DeferredTriggerEndQuery () at trigger.c:2135
    #18 0x08178ae8 in finish_xact_command () at postgres.c:1749
    #19 0x08177816 in exec_simple_query (
    query_string=0x8fe2438 "INSERT INTO logs
    (seq,level,event_code,event_date,event_time,city,province,user_id,est_dsp_date,est_dsp_time,country,edilate,carr_code,notes,trac_notes,order_num)
    VALUES ('2','6','TAS','09/14/06','19:")
    at postgres.c:905
    #20 0x08179f09 in PostgresMain (argc=4, argv=0x8f94b48,
    username=0x8f94ab8 "bwoods") at postgres.c:2871
    #21 0x08153c90 in BackendFork (port=0x8fa6af0) at postmaster.c:2564
    #22 0x08153683 in BackendStartup (port=0x8fa6af0) at postmaster.c:2207
    #23 0x08151be8 in ServerLoop () at postmaster.c:1119
    #24 0x081512ae in PostmasterMain (argc=5, argv=0x8f92688) at
    postmaster.c:897
    #25 0x08121163 in main (argc=5, argv=0xfeff9e44) at main.c:214
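
The suspicion above (a null `stmt` reaching `exec_stmt()`) can be checked mechanically across many core files. What follows is only a minimal sketch, assuming the backtraces are saved as plain text in the gdb format shown above; the function name `null_args` and the regular expressions are illustrative and not part of gdb or PostgreSQL. Note that a `0x0` argument is not always a bug (for example `per_tuple_context=0x0` in the trigger frames), so the crash frame #0 is the one to look at first.

```python
import re

# Start of a gdb frame line, e.g. "#1 0x0083f005 in exec_stmts (" or "#0 exec_stmt (".
FRAME_START = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(", re.M)
# An argument whose value is exactly 0x0 (a null pointer), e.g. "stmt=0x0".
NULL_ARG = re.compile(r"(\w+)=0x0\b")


def null_args(backtrace):
    """Yield (frame_number, function, [argument names equal to 0x0])."""
    starts = list(FRAME_START.finditer(backtrace))
    for i, match in enumerate(starts):
        # A frame's text runs until the next frame starts (arguments may wrap lines).
        end = starts[i + 1].start() if i + 1 < len(starts) else len(backtrace)
        nulls = NULL_ARG.findall(backtrace[match.end():end])
        if nulls:
            yield int(match.group(1)), match.group(2), nulls


# Three frames copied from the backtrace in this post.
sample = """\
#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
#1 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fa9e0)
at pl_exec.c:903
#14 0x080f1c8e in ExecCallTriggerFunc (trigdata=0xfeff8ca0, finfo=0x935e260,
per_tuple_context=0x0) at trigger.c:1150
"""

for frame, func, names in null_args(sample):
    print(f"#{frame} {func}: {', '.join(names)}")
```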
  • No.1 | | 879 bytes | |

    Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
    >Program terminated with signal 11, Segmentation fault.
    >#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
    >in pl_exec.c
    >#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
    >#1 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fa9e0)
    >at pl_exec.c:903
    >#2 0x0083f4f2 in exec_stmt_if (estate=0xfeff8a90, stmt=0x90fab78)
    >at pl_exec.c:1139

    It seems you've got a corrupt "compiled statements" data structure for
    a plpgsql trigger function. This does not look like any of the
    known post-7.4.7 bug fixes. Can you show us the source code for that
    trigger?

    regards, tom lane

    (end of broadcast)
    TIP 9: In versions below 8.0, the planner will ignore your desire to
    choose an index scan if your joining column's datatypes do not
    match
  • No.2 | | 756 bytes | |

    Tom Lane wrote:

    >It seems you've got a corrupt "compiled statements" data structure for
    >a plpgsql trigger function. This does not look like any of the
    >known post-7.4.7 bug fixes. Can you show us the source code for that
    >trigger?

    Problem is, we seem to be having a problem with this reset issue and I
    don't see a correlation in the backtraces. Most of them are in fact
    related to inserts, but there are at least three different tables
    involved. There are also some where an INSERT is not involved. I've
    attached three more backtraces from different core files to provide
    further data and hopefully pinpoint this issue.

    We're assuming a common problem here, maybe that's our first mistake.
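
One way to test the "common problem" assumption is to look for functions that appear in every backtrace. The snippet below is a rough sketch under stated assumptions: each backtrace is available as gdb text, `common_functions` and `ALWAYS_PRESENT` are names made up for this example, and the second trace is hypothetical since the attached traces are not reproduced here. Frames that every backend passes through (taken from the first backtrace in this thread) are ignored because they carry no information.

```python
import re

# Function name at the start of a gdb frame line. Plain C identifiers only;
# demangled C++ names such as "Foo::bar" would need a wider pattern.
FRAME_FUNC = re.compile(r"^\s*#\d+\s+(?:0x[0-9a-fA-F]+\s+in\s+)?(\w+)\s*\(", re.M)

# Frames present in any backend's stack, so not useful as a correlation.
ALWAYS_PRESENT = frozenset({
    "main", "PostmasterMain", "ServerLoop",
    "BackendStartup", "BackendFork", "PostgresMain",
})


def functions_in(backtrace):
    """Return the set of function names seen in one backtrace."""
    return set(FRAME_FUNC.findall(backtrace))


def common_functions(backtraces, ignore=ALWAYS_PRESENT):
    """Return functions that occur in every backtrace, minus the ignored ones."""
    sets = [functions_in(bt) for bt in backtraces]
    if not sets:
        return set()
    return set.intersection(*sets) - set(ignore)


# trace_a is abbreviated from this thread; trace_b is a hypothetical second core.
trace_a = """\
#0 exec_stmt (estate=0xfeff8a90, stmt=0x0) at pl_exec.c:928
#1 0x0083f005 in exec_stmts (estate=0xfeff8a90, stmts=0x90fa9e0)
#20 0x08179f09 in PostgresMain (argc=4, argv=0x8f94b48)
"""
trace_b = """\
#0 0x00000000 in hypothetical_vendor_call ()
#1 0x08179f09 in PostgresMain (argc=4, argv=0x8f94b48)
"""

print(common_functions([trace_a, trace_b]))
```

An empty result, as here, matches the observation that the backtraces show no obvious correlation, which points toward memory corruption happening somewhere other than where each crash is detected.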
  • No.3 | | 955 bytes | |

    Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
    >Problem is, we seem to be having a problem with this reset issue and I
    >don't see a correlation in the backtraces. Most of them are in fact
    >related to inserts, but there are at least three different tables
    >involved. There are also some where an INSERT is not involved. I've
    >attached three more backtraces from different core files to provide
    >further data and hopefully pinpoint this issue.

    Well, these make it clear that you've got some pretty big chunks of
    nonstandard code in the backend, so my first thought is that there's a
    memory-clobber bug somewhere in that. It might be worth trying to run
    the code with a debugging malloc library (ElectricFence or some such)
    to try to locate the culprit.

    regards, tom lane

  • No.4 | | 1125 bytes | |

    Tom Lane wrote:
    >Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
    >>Problem is, we seem to be having a problem with this reset issue and I
    >>don't see a correlation in the backtraces. Most of them are in fact
    >>related to inserts, but there are at least three different tables
    >>involved. There are also some where an INSERT is not involved. I've
    >>attached three more backtraces from different core files to provide
    >>further data and hopefully pinpoint this issue.

    >Well, these make it clear that you've got some pretty big chunks of
    >nonstandard code in the backend, so my first thought is that there's a
    >memory-clobber bug somewhere in that. It might be worth trying to run
    >the code with a debugging malloc library (ElectricFence or some such)
    >to try to locate the culprit.

    I'm not sure what you mean by 'nonstandard code,' could you expand on
    that? All the trigger code is written in plpgsql. Are you suggesting
    we're stomping on our own memory within the trigger code we've written?
  • No.5 | | 1009 bytes | |

    Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
    >Tom Lane wrote:
    >>Well, these make it clear that you've got some pretty big chunks of
    >>nonstandard code in the backend, so my first thought is that there's a
    >>memory-clobber bug somewhere in that.

    >I'm not sure what you mean by 'nonstandard code,' could you expand on
    >that?

    The traces include code from /usr/local/lib/libgrid.so and
    /usr/local/lib/libpcmsrv.so. I don't know what those are,
    but I'm quite sure they are not invoked by a standard Postgres build.
    I also find it suggestive that they appear to have been written in C++;
    we've seen problems before from trying to link C++ code into the
    backend, because it tends to bring along its own incompatible ideas
    about how to do error recovery and memory management.

    regards, tom lane
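
Spotting such libraries in a pile of backtraces can be automated. This is a sketch only, assuming gdb's usual `from /path/to/lib.so` suffix on frames that live in shared objects; the function name `nonstandard_libraries`, the list of "standard" prefixes, and the frame function names in the sample are all hypothetical, while the two library paths are the ones named above.

```python
import re
from collections import Counter

# gdb appends "from <path>" to frames that come from a shared library.
FROM_LIB = re.compile(r"\bfrom\s+(/\S+)")


def nonstandard_libraries(backtrace, standard_prefixes=("/lib/", "/usr/lib/")):
    """Count frames per shared library that is not under a standard prefix.

    The default prefixes are an assumption; adjust them for the system at hand.
    """
    counts = Counter()
    for path in FROM_LIB.findall(backtrace):
        if not path.startswith(tuple(standard_prefixes)):
            counts[path] += 1
    return counts


# Hypothetical frames; only the library paths come from the thread.
sample = """\
#0 0x00a1b2c3 in hypothetical_grid_fn () from /usr/local/lib/libgrid.so
#1 0x00a1b2d4 in hypothetical_pcm_fn () from /usr/local/lib/libpcmsrv.so
#2 0x00b1c2d3 in memcpy () from /lib/tls/libc.so.6
#3 0x08179f09 in PostgresMain (argc=4, argv=0x8f94b48) at postgres.c:2871
"""

for path, frames in sorted(nonstandard_libraries(sample).items()):
    print(path, frames)
```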

  • No.6 | | 1308 bytes | |

    Tom Lane wrote:
    >Geoffrey <esoteric (AT) 3times25 (DOT) net> writes:
    >>Tom Lane wrote:
    >>>Well, these make it clear that you've got some pretty big chunks of
    >>>nonstandard code in the backend, so my first thought is that there's a
    >>>memory-clobber bug somewhere in that.

    >>I'm not sure what you mean by 'nonstandard code,' could you expand on
    >>that?

    >The traces include code from /usr/local/lib/libgrid.so and
    >/usr/local/lib/libpcmsrv.so. I don't know what those are,
    >but I'm quite sure they are not invoked by a standard Postgres build.
    >I also find it suggestive that they appear to have been written in C++;
    >we've seen problems before from trying to link C++ code into the
    >backend, because it tends to bring along its own incompatible ideas
    >about how to do error recovery and memory management.

    libpcmsrv is a vendor-provided library for looking up mileage. I'll
    have to check on the other one; it may be related to the same package.

    Thanks for the heads up on C++ code.

    It seems we may have located a memory problem in a library that is used
    throughout our code, so we are looking into that now.

    Thanks again.
