Development

  • New: codecvt locale facet is broken (reproducible crash)


    The attached source file (UTF-8 encoded) demonstrates that codecvt
    is broken for the simplest of transformations (UTF-8 to UCS-4).
    This is pretty basic, and the underlying gconv stuff works correctly,
    so the bug is either in libstdc++6 or somewhere inline in the headers.
    $ ./wide
    wide: /iconv/loop.c:425: utf8_internal_loop_single: Assertion `inptr -
    bytebuf > (state->__count & 7)' failed.
    Aborted
    While running:
    (gdb) bt
    #0 0x0fcc672c in () from /lib/tls/libc.so.6
    #1 0x0fe0425c in ? () from /lib/tls/libc.so.6
    #2 0x0ffa6ef8 in std::codecvt<wchar_t, char, __mbstate_t>::do_in ()
    from /usr/lib/libstdc++.so.6
    #3 0x100016b4 in std::__codecvt_abstract_base<wchar_t, char, __mbstate_t>::in
    (this=0x100290b8, __state=@0x7fa405a8, __from=0x10013014
    "ESC%GESC%@2
    37",
    __from_end=0x1001301d "", __from_next=@0x7fa405b0, __to=0x7fa405bc,
    __to_end=0x7fa406fc, __to_next=@0x7fa405b4)
    at
    /
    odecvt.h:204
    #4 0x10001244 in to_wide_string (str=@0x7fa40758, locale=@0x7fa40738)
    at wide.cc:22
    #5 0x10001544 in main () at wide.cc:59
    Program received signal SIGABRT, Aborted.
    0x0fcd67bc in raise () from /lib/tls/libc.so.6
    (gdb) bt
    #0 0x0fcd67bc in raise () from /lib/tls/libc.so.6
    #1 0x0fcd82c0 in abort () from /lib/tls/libc.so.6
    #2 0x0fcce768 in __assert_fail () from /lib/tls/libc.so.6
    #3 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #4 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #5 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #6 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #7 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #8 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #9 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #10 0x0fcc6c7c in () from /lib/tls/libc.so.6
    #11 0x0fcc6c7c in () from /lib/tls/libc.so.6
    Previous frame inner to this frame (corrupt stack?)
    It affects GCC 4.2 (20060613), 4.1, 4.0, 3.3
    on Debian GNU/Linux (unstable).
    The program works correctly with 3.4:
    $ g++-3.4 -o wide wide.cc
    $ ./wide
    1
    $
    Regards,
    Roger
  • No.1 | | 586 bytes | |

    Comment #2 from pcarlini at suse dot de 2006-06-16 13:30
    Humm, this is really puzzling because nothing non-trivial changed in that area
    going from 3.4 to 4.0 and of course we all run daily the testsuite which
    includes quite a few codecvt tests, which always pass smoothly. Could you
    please compare/contrast your issue to existing testcases in
    testsuite/22_locale/codecvt?

    Anyway, if I save the attached wide.cc from the browser and compile/run it,
    then I get "1 4 1 4" without end. Is that the expected result? Can you
    help us reproduce the problem? Thanks,
  • No.2 | | 311 bytes | |

    Comment #11 from pcarlini at suse dot de 2006-06-16 14:20
    (In reply to comment #9)
    Humm, wait, I'm working on x86-linux! Is that target specific? You can see the
    issue only on powerpc?

    Well, in any case all the codecvt regression tests are always fine on powerpc
    and powerpc64-linux too
  • No.3 | | 287 bytes | |

    Comment #10 from rleigh at debian dot org 2006-06-16 14:20
    Yes, this is all on the same Debian installation. 3.3, 3.4, 4.0, 4.1 and 4.2
    (snapshot) are available. All but 3.4 exhibit this problem.
    I will test on an i686 system in a moment to check if it's powerpc-only.
  • No.4 | | 224 bytes | |

    Comment #14 from pcarlini at suse dot de 2006-06-16 15:09
    Can you please tell us the glibc version? I'm asking because I can reproduce on
    an ia64 machine using glibc2.4, not on all the glibc2.3.6 systems I tried.
  • No.5 | | 182 bytes | |

    Comment #1 from rleigh at debian dot org 2006-06-16 13:09
    Created an attachment (id=11679)
    Testcase to show codecvt crash
    Compile with
    g++ -o wide wide.cc
  • No.6 | | 508 bytes | |

    Comment #4 from pcarlini at suse dot de 2006-06-16 13:49
    (In reply to comment #3)
    > The source is UTF-8 encoded, and it assumes you are going to run it in a UTF-8
    > locale. That might possibly be why you get odd output.
    >
    > The expected output should be as per the GCC 3.4 output in the original report:
    >
    > $ g++-3.4 -o wide wide.cc
    > $ ./wide
    > 1
    > $

    OK, thanks. Then I used the "en_US.UTF-8" locale and it worked fine, with both
    mainline and stock 4.1.1: no crashes, apparently the same output.
  • No.7 | | 963 bytes | |

    Comment #15 from rleigh at debian dot org 2006-06-16 16:16
    $ uname -a
    Linux hardknott 2.6.16.17 #7 Sun May 21 15:39:23 BST 2006 ppc GNU/Linux

    $ /lib/libc.so.6
    GNU C Library stable release version 2.3.6, by Roland McGrath et al.
    Copyright (C) 2005 Free Software Foundation, Inc.
    This is free software; see the source for copying conditions.
    There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
    PARTICULAR PURPOSE.
    Compiled by GNU CC version 4.0.4 20060507 (prerelease) (Debian 4.0.3-3).
    Compiled on a Linux 2.6.13 system on 2006-06-08.
    Available extensions:
    GNU libio by Per Bothner
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    linuxthreads-0.10 by Xavier Leroy
    BIND-8.2.3-T5B
    libthread_db work sponsored by Alpha Processor Inc
    NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
    software FPU emulation by Richard Henderson, Jakub Jelinek and others
  • No.8 | | 207 bytes | |

    Comment #16 from pcarlini at suse dot de 2006-06-16 16:56
    I can reproduce on an ia64-linux machine, so confirmed, but very puzzling on
    the libstdc++-v3 side; no idea how/when we are going to deal with it.
  • No.9 | | 1616 bytes | |

    Comment #17 from rleigh at debian dot org 2006-06-16 16:59
    Created an attachment (id=11682)
    Use mbsnrtowcs directly.

    This testcase is similar to the original, with the exception that it uses
    mbsnrtowcs in place of the codecvt locale facet. It also initialises the
    locale with setlocale() for LC_CTYPE.

    It shows some interesting results, in fact the exact opposite of the original
    testcase:

    GCC ver  powerpc  i386
    3.3      fail     fail
    3.4      OK       OK
    4.0      OK       fail
    4.1      OK       fail
    4.2      OK       fail

    With this test, the expected output is this:
    $ ./wide2
    1

    The output for the failed tests:

    GCC 3.3:
    powerpc (GCC 3.3 was bad at wide streams; the output is "lost"):
    $ ./wide2
    1

    i386:
    $ ./wide2
    wide2: /iconv/loop.c:425: utf8_internal_loop_single: Assertion `inptr -
    bytebuf > (state->__count & 7)' failed.
    Aborted

    GCC 4.0/i386:
    $ ./wide2
    wide2: /iconv/loop.c:425: utf8_internal_loop_single: Assertion `inptr -
    bytebuf > (state->__count & 7)' failed.
    Aborted

    GCC 4.1/i386:
    $ ./wide2
    wide2: /iconv/loop.c:425: utf8_internal_loop_single: Assertion `inptr -
    bytebuf > (state->__count & 7)' failed.
    Aborted

    GCC 4.2/i386:
    $ ./wide2
    wide2: /iconv/loop.c:425: utf8_internal_loop_single: Assertion `inptr -
    bytebuf > (state->__count & 7)' failed.
    Aborted

    Please do allow for the fact that one (or both) of these testcases might be
    buggy; I've never used these interfaces before. However the behaviour is
    still highly variable between the two platforms.

    Regards,
    Roger
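
    As a rough illustration of the approach described in this comment (it is
    not the attached testcase), the sketch below calls mbsnrtowcs directly.
    The helper name mb_to_wide and its buffer sizing are assumptions made for
    this example, and the conversion state is zeroed before use.

    ```cpp
    #include <cstring>   // std::memset, std::size_t
    #include <string>
    #include <vector>
    #include <wchar.h>   // mbsnrtowcs (POSIX, not ISO C++), mbstate_t

    // Hypothetical helper: converts len bytes of str to wide characters
    // using the current LC_CTYPE locale. Returns false on an invalid sequence.
    bool mb_to_wide(const char* str, std::size_t len, std::wstring& out)
    {
        // Start from the initial conversion state (all bits zero).
        mbstate_t state;
        std::memset(&state, 0, sizeof state);

        // Assumes at most one wide character per input byte.
        std::vector<wchar_t> buf(len + 1);
        const char* src = str;
        std::size_t n = mbsnrtowcs(&buf[0], &src, len, buf.size(), &state);
        if (n == static_cast<std::size_t>(-1))
            return false;

        out.assign(&buf[0], n);
        return true;
    }
    ```

    Calling setlocale(LC_CTYPE, "") beforehand, as the testcase does, makes the
    conversion follow the user's locale; without it the "C" locale applies.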
  • No.10 | | 252 bytes | |

    Comment #18 from pcarlini at suse dot de 2006-06-16 17:03
    OK, thanks. Before I go completely crazy, let's agree at least about a detail:
    let's not involve 3.3: in 3.3 codecvt is known to be broken and was completely
    rewritten for 3.4.
  • No.11 | | 302 bytes | |

    --

    pcarlini at suse dot de changed:

    What                  |Removed                           |Added
    AssignedTo            |unassigned at gcc dot gnu dot org |pcarlini at suse dot de
    Status                |WAITING                           |ASSIGNED
    Ever Confirmed        |0                                 |1
    Last reconfirmed date |2006-06-16 16:56:58               |2006-06-16 17:03:44

  • No.12 | | 200 bytes | |

    --
    pcarlini at suse dot de changed:

    What       |Removed                 |Added
    AssignedTo |pcarlini at suse dot de |unassigned at gcc dot gnu dot org
    Status     |ASSIGNED                |NEW
  • No.13 | | 1360 bytes | |

    Comment #19 from rleigh at debian dot org 2006-06-16 17:26
    Created an attachment (id=11683)
    C example using mbsnrtowcs

    This testcase is the same as the last, but uses C only.

    It looks like this:

    GCC ver  powerpc  i386
    3.3      OK       OK
    3.4      OK       OK
    4.0      OK       fail
    4.1      OK       fail
    4.2      OK       fail

    The expected output is:
    $ ./wide3

    1

    i386 (all failing versions):
    $ ./wide3

    Segmentation fault

    (gdb) run
    Starting program: /home/rleigh/wide3

    Program received signal SIGSEGV, Segmentation fault.
    0xa7e0e19d in (step=0x805ede0,
    data=0xafc2a8d0, inptrp=0xafc2aa80, inend=0x8048754 "", outbufstart=0x0,
    irreversible=0xafc2a8f8, do_flush=0, consume_incomplete=1)
    at /iconv/loop.c:371
    371 /iconv/loop.c: No such file or directory.
    in /iconv/loop.c
    (gdb) bt
    #0 0xa7e0e19d in (step=0x805ede0,
    data=0xafc2a8d0, inptrp=0xafc2aa80, inend=0x8048754 "", outbufstart=0x0,
    irreversible=0xafc2a8f8, do_flush=0, consume_incomplete=1)
    at /iconv/loop.c:371
    #1 0xa7e65bd9 in __mbsnrtowcs (dst=0xafc2a93c, src=0xafc2aa80, nmc=9,
    len=162, ps=0xafc2aa84) at mbsnrtowcs.c:106
    #2 0x08048503 in print_wide (str=0x804874b "�237") at wide3.c:16
    #3 0x080485f0 in main () at wide3.c:40

    Both the powerpc and i386 system are running the same version of glibc.
  • No.14 | | 260 bytes | |

    Comment #20 from rleigh at debian dot org 2006-06-16 17:28
    > Before I go completely crazy, let's agree at least about a detail:
    > let's not involve 3.3: in 3.3 codecvt is known to be broken and was
    > completely rewritten for 3.4.

    Agreed :)
  • No.15 | | 716 bytes | |

    Comment #21 from pcarlini at suse dot de 2006-06-16 18:10
    OK, I think I have something meaningful to say: this definitely seems to be a
    miscompilation. I would ask you to check on powerpc-linux what I'm seeing on
    ia64-linux: the problem goes away if I build both libstdc++ and eventually the
    testcase at " -g3". Therefore I would ask you to go inside the libstdc++-v3
    dir of your build tree, do a make clean ; make CXXFLAGS=" -g3", reinstall
    the library alone (no need to rebuild the compiler proper) and build the
    testcase itself at " -g3". On ia64-linux the problem goes away. If you can
    confirm, the difficult part begins ;) because we are supposed to prepare a
    reduced testcase for the compiler people.
  • No.16 | | 418 bytes | |

    Comment #22 from rleigh at debian dot org 2006-06-16 18:19
    Just to summarise the current tests:

             wide        wide2       wide3
    GCC ver  ppc   i386  ppc   i386  ppc   i386
    3.4      OK    OK    OK    OK    OK    fail
    4.0      fail  OK    OK    fail  OK    fail
    4.1      fail  OK    OK    fail  OK    fail
    4.2      fail  OK    OK    fail  OK    fail

    GCC 3.4 is the most reliable, but I don't understand the pattern of failures.

    I'll do a build in a moment as you suggest.
  • No.17 | | 190 bytes | |

    Comment #23 from rleigh at debian dot org 2006-06-17 14:29
    This will take a few more hours. I didn't have a built GCC tree to hand, so
    I'm still waiting on "make bootstrap".
  • No.18 | | 2730 bytes | |

    Comment #24 from rleigh at debian dot org 2006-06-18 00:27
    /gcc-20060613/configure ,c++
    /home/rleigh/gcc-test

    $ ./wide
    terminate called after throwing an instance of 'std::runtime_error'
    what(): name not valid
    Aborted

    #0 0x0fcf77c8 in kill () at /string/bits/string2.h:998
    #1 0x0fcf754c in GI_raise (sig=6) at

    #2 0x0fcf8e68 in GI_abort () at /sysdeps/generic/abort.c:88
    #3 0x0ffb273c in () at

    #4 0x0ffaf87c in __cxxabiv1::__terminate (handler=0) at

    #5 0x0ffaf8b8 in std::terminate () at

    #6 0x0ffafa20 in __cxa_throw (obj=<value optimized out>, tinfo=<value
    optimized out>, dest=<value optimized out>)
    at
    #7 0x0ff3a050 in std::__throw_runtime_error (__s=<value optimized out>) at

    #8 0x0ffadd64 in (__cloc=<value
    optimized out>, __s=<value optimized out>) at c++locale.cc:141
    #9 0x0ff40154 in _Impl (this=0x10013080, __s=0x6 <Address 0x6 out of bounds>,
    __refs=<value optimized out>)
    at
    #10 0x0ff41ac4 in locale (this=0x7fc83950, __s=<value optimized out>) at

    #11 0x100015e8 in main () at wide.cc:54

    $ ./wide2
    1

    $ ./wide3

    1

    Rebuilding libstdc++v3 with 'make CXXFLAGS=" -g3"':

    $ ./wide
    terminate called after throwing an instance of 'std::runtime_error'
    what(): name not valid
    Aborted

    (gdb) run
    Starting program: /home/rleigh/wbug/wide
    terminate called after throwing an instance of 'std::runtime_error'
    what(): name not valid

    Program received signal SIGABRT, Aborted.
    0x0fcc57c8 in kill () at /string/bits/string2.h:998
    998 /string/bits/string2.h: No such file or directory.
    in /string/bits/string2.h
    Current language: auto; currently c
    (gdb) bt
    #0 0x0fcc57c8 in kill () at /string/bits/string2.h:998
    #1 0x0fcc554c in GI_raise (sig=6) at

    #2 0x0fcc6e68 in GI_abort () at /sysdeps/generic/abort.c:88
    #3 0x0ffaf7d4 in () at

    #4 0x0ffaa238 in __cxxabiv1::__terminate (handler=0xffaf5ac
    <()>)
    at
    #5 0x0ffaa288 in std::terminate () at

    #6 0x0ffaa534 in __cxa_throw (obj=0x10013130, tinfo=0xffe2d58, dest=0xff1ea3c
    <~runtime_error>)
    at
    #7 0x0ff120e4 in std::__throw_runtime_error (__s=0xffb7e04
    " name not valid")
    at
    #8 0x0ffa7624 in (__cloc=@0x7fd11824,
    __s=0x1001306c "en_GB.UTF8") at c++locale.cc:141
    #9 0x0ff1bda4 in _Impl (this=0x10013080, __s=0x1001306c "en_GB.UTF8",
    __refs=1) at
    #10 0x0ff1de70 in locale (this=0x7fd11950, __s=0x10002364 "") at

    #11 0x10001748 in main () at wide.cc:54

    $ ./wide2
    1

    $ ./wide3

    1

    Regards,
    Roger
  • No.19 | | 943 bytes | |

    Comment #25 from pcarlini at suse dot de 2006-06-18 09:35
    (In reply to comment #24)
    > terminate called after throwing an instance of 'std::runtime_error'
    > what(): name not valid

    This is the standard throw which happens when a named locale cannot be used;
    it has nothing to do with the issue we are discussing and is expected
    behavior. The only possible explanation is that the GNU locale model has been
    disabled by the configure-time tests. Do you have a full set of locales
    installed, in particular de_DE? See also these notes for additional details:

    Anyway, at this point it's almost certain we are dealing with a miscompilation:
    the fact that nothing changed in the library code and the problem happens with
    the 4.x compilers (new technology, SSA, etc.) is also a strong indication
    of that (besides my 100% reproducible tests on ia64-linux and all the other
    checks).
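
    To illustrate the throw described above: constructing a std::locale from a
    name the system cannot provide raises std::runtime_error, which terminates
    the program if nothing catches it. The sketch below is one possible way to
    guard against that; the helper name locale_or_classic and the fallback to
    the "C" locale are assumptions for this example, not part of wide.cc.

    ```cpp
    #include <locale>
    #include <stdexcept>
    #include <string>

    // Hypothetical helper: returns the named locale if it can be constructed,
    // otherwise falls back to the classic "C" locale.
    std::locale locale_or_classic(const std::string& name)
    {
        try {
            return std::locale(name.c_str());
        } catch (const std::runtime_error&) {
            // Thrown when the named locale is not available.
            return std::locale::classic();
        }
    }
    ```

    With a guard like this, a name such as "en_GB.UTF8" would be used when that
    locale is installed and supported, and the program would otherwise continue
    in the "C" locale instead of aborting.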
  • No.20 | | 560 bytes | |

    Comment #26 from rleigh at debian dot org 2006-06-18 09:51
    Thiemo Seufer diagnosed this as a problem with the testcases: mbstate_t needs
    explicitly initialising to all-bits-zero with memset. After doing this

    std::memset(&state, 0, sizeof(mbstate_t));

    all the testcases work for me on powerpc and i386.

    Since this is not a bug, it can be closed. Sorry about that. Perhaps the
    libstdc++ doxygen documentation for codecvt could document that
    state_type/mbstate_t needs explicit initialisation before use.

    Regards,
    Roger
  • No.21 | | 876 bytes | |

    Comment #27 from pcarlini at suse dot de 2006-06-18 10:03
    (In reply to comment #26)
    > Thiemo Seufer diagnosed this as a problem with the testcases: mbstate_t needs
    > explicitly initialising to all-bits-zero with memset. After doing this
    >
    > std::memset(&state, 0, sizeof(mbstate_t));
    >
    > all the testcases work for me on powerpc and i386.

    Funny. Actually, we still have bugs, in the testsuite only, where we are never
    doing the initialization. I will fix that. Sorry about my part of the waste of
    time, I'm learning some of those details with you, the current codecvt has been
    contributed by other people.

    > Since this is not a bug, it can be closed. Sorry about that. Perhaps the
    > libstdc++ doxygen documentation for codecvt could document that
    > state_type/mbstate_t needs explicit initialisation before use.
    >
    > Regards,
    > Roger
  • No.22 | | 139 bytes | |

    Comment #28 from pcarlini at suse dot de 2006-06-18 10:13
    Correction, our testcases are already fine, zero_state does the job
    Anyway
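
The resolution reached in comments #26 to #28 can be sketched as follows. This is a minimal illustration, not the attached wide.cc: the function name to_wide_string is borrowed from the backtrace in the original report, but its body, the buffer sizing, and the error handling are assumptions. The point it shows is that the mbstate_t passed to codecvt::in is zeroed with memset before the first call.

```cpp
#include <cstring>    // std::memset
#include <cwchar>     // std::mbstate_t
#include <locale>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical reconstruction: converts a narrow multibyte string to a wide
// string using the codecvt facet of the given locale.
std::wstring to_wide_string(const std::string& str, const std::locale& loc)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
    const codecvt_type& cvt = std::use_facet<codecvt_type>(loc);

    // The fix from the thread: put the state into the initial conversion
    // state. An uninitialised mbstate_t holds indeterminate bytes.
    std::mbstate_t state;
    std::memset(&state, 0, sizeof(std::mbstate_t));

    // Assumes at most one wide character per input byte.
    std::vector<wchar_t> buf(str.size() + 1);
    const char* from = str.data();
    const char* from_end = from + str.size();
    const char* from_next = from;
    wchar_t* to = &buf[0];
    wchar_t* to_next = to;

    std::codecvt_base::result res =
        cvt.in(state, from, from_end, from_next,
               to, to + buf.size(), to_next);

    // Treat both invalid and incomplete input as a failure in this sketch.
    if (res == std::codecvt_base::error || res == std::codecvt_base::partial)
        throw std::runtime_error("conversion failed");

    return std::wstring(to, to_next);
}
```

Value-initialising the state (std::mbstate_t state = std::mbstate_t();) should have the same effect as the memset call; memset is used here only because it is the form quoted in the thread.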
