Development

  • builtin_bswap plus enhancements


    Hi Ian,
    Here's the original patch with changes from bswap[,l,ll] to
    bswap[32,64], plus enhancements from both Falk and I including docs and
    more tests. I know you already approved the first, but I figured I'd go
    ahead and incorporate some of Falk's work and get it reapproved before
    4.3. I'll be working on ppc support for it at some point as well.
    Falk: I don't have an alpha machine to test the alpha support so if
    you'd like to include that can you rework that part of your patch?
    Still OK with the 4.2->4.3 change for the version?
    Tested on x86-darwin, ppc-darwin, x86-linux, ppc64-darwin.
    -eric
    2006-08-03 Eric Christopher <echristo (AT) apple (DOT) com>
    Falk Hueffner <falk (AT) debian (DOT) org>
    * doc/extend.texi (__builtin_bswap32): Document.
    (__builtin_bswap64): Ditto.
    * doc/libgcc.texi (bswapsi2): Document.
    (bswapdi2): Ditto.
    * doc/rtl.texi (bswap): Document.
    * optabs.c (expand_unop): Don't widen a bswap.
    (init_optabs): Init bswap. Set libfuncs explicitly
    for bswapsi2 and bswapdi2.
    * optabs.h (OTI_bswap): New.
    (bswap_optab): Ditto.
    * genopinit.c (optabs): Handle bswap_optab.
    * tree.h (tree_index): Add TI_UINT32_TYPE and
    TI_UINT64_TYPE.
    (uint32_type_node): New.
    (uint64_type_node): Ditto.
    * tree.c (build_common_tree_nodes_2): Initialize
    uint32_type_node and uint64_type_node.
    * builtins.c (expand_builtin_bswap): New.
    (expand_builtin): Call.
    (fold_builtin_bswap): New.
    (fold_builtin_1): Call.
    * fold-const.c (tree_expr_nonnegative_p): Return true
    for bswap.
    * builtin-types.def (BT_UINT32): New.
    (BT_UINT64): Ditto.
    (BT_FN_UINT32_UINT32): Ditto.
    (BT_FN_UINT64_UINT64): Ditto.
    * builtins.def (BUILT_IN_BSWAP32): New.
    (BUILT_IN_BSWAP64): Ditto.
    * rtl.def (BSWAP): New.
    * genattrtab.c (check_attr_value): New.
    * libgcc2.c (__bswapSI2): New.
    (__bswapDI2): Ditto.
    * libgcc2.h (__bswapSI2): Declare.
    (__bswapDI2): Ditto.
    * mklibgcc.in (lib2funcs): Add _bswapsi2 and _bswapdi2.
    * simplify-rtx.c (simplify_const_unary_operation): Return
    0 for BSWAP.
    * libgcc-std.ver (__bswapsi2): Add.
    (__bswapdi2): Ditto.
    * reload1.c (eliminate_regs_1): Add bswap.
    (elimination_effects): Ditto.
    * config/i386/i386.h (x86_bswap): New.
    (TARGET_BSWAP): Use.
    * config/i386/i386.c (x86_bswap): Set.
    2006-08-03 Eric Christopher <echristo (AT) apple (DOT) com>
    Falk Hueffner <falk (AT) debian (DOT) org>
    * New.
    * gcc.dg/builtin-bswap-1.c: Ditto.
    * gcc.dg/builtin-bswap-2.c: Ditto.
    * gcc.dg/builtin-bswap-3.c: Ditto.
    * gcc.dg/builtin-bswap-4.c: Ditto.
    * gcc.dg/builtin-bswap-5.c: Ditto.
    Index: gcc/doc/extend.texi
    gcc/doc/extend.texi(revision 115913)
    gcc/doc/extend.texi(working copy)
    @@ -6058,6 +6058,16 @@ Similar to @code{__builtin_powi}, except
    are @code{long double}.
    @end deftypefn
    +@deftypefn {Built-in Function} int32_t __builtin_bswap32 (int32_t x)
    +Returns @var{x} with the order of the bytes reversed; for example,
    +@code{0xaabbccdd} becomes @code{0xddccbbaa}. Byte here always means
    +exactly 8 bits.
    +@end deftypefn
    +
    +@deftypefn {Built-in Function} int64_t __builtin_bswap64 (int64_t x)
    +Similar to @code{__builtin_bswap32}, except the argument and return types
    +are 64-bit.
    +@end deftypefn
    @node Target Builtins
    @section Built-in Functions Specific to Particular Target Machines
    Index: gcc/doc/libgcc.texi
    gcc/doc/libgcc.texi(revision 115913)
    gcc/doc/libgcc.texi(working copy)
    @@ -212,6 +212,11 @@ These functions return the value zero if
    These functions return the number of bits set in @var{a}.
    @end deftypefn
    +@deftypefn {Runtime Function} int32_t __bswapsi2 (int32_t @var{a})
    +@deftypefnx {Runtime Function} int64_t __bswapdi2 (int64_t @var{a})
    +These functions return the @var{a} byteswapped.
    +@end deftypefn
    +
    @node Soft float library routines
    @section Routines for floating point emulation
    @cindex soft float library
    @@ -728,4 +733,3 @@ document me!
    @deftypefn {Runtime Function} void __clear_cache (char *@var{beg}, char *@var{end})
    This function clears the instruction cache between @var{beg} and @var{end}.
    @end deftypefn
    -
    Index: gcc/doc/rtl.texi
    gcc/doc/rtl.texi(revision 115913)
    gcc/doc/rtl.texi(working copy)
    @@ -2092,6 +2092,11 @@ mode @var{m}. The mode of @var{x} will
    Represents the number of 1-bits modulo 2 in @var{x}, represented as an
    integer of mode @var{m}. The mode of @var{x} will usually be an integer
    mode.
    +
    +@findex bswap
    +@item (bswap:@var{m} @var{x})
    +Represents the value @var{x} with the order of bytes reversed, carried out
    +in mode @var{m}, which must be a fixed-point machine mode.
    @end table
    @node Comparisons
    Index: gcc/optabs.c
    gcc/optabs.c(revision 115913)
    gcc/optabs.c(working copy)
    @@ -2590,6 +2590,10 @@ expand_unop (enum machine_mode mode, opt
    goto try_libcall;
    }
    + /* We can't widen a bswap. */
    + if (unoptab == bswap_optab)
    + goto try_libcall;
    +
    if (CLASS_HAS_WIDER_MODES_P (class))
    for (wider_mode = GET_MODE_WIDER_MODE (mode);
    wider_mode != VOIDmode;
    @@ -5252,6 +5256,7 @@ init_optabs (void)
    absv_optab = init_optabv (ABS);
    addcc_optab = init_optab (UNKNOWN);
    one_cmpl_optab = init_optab (NOT);
    + bswap_optab = init_optab (BSWAP);
    ffs_optab = init_optab (FFS);
    clz_optab = init_optab (CLZ);
    ctz_optab = init_optab (CTZ);
    @@ -5455,6 +5460,11 @@ init_optabs (void)
    init_interclass_conv_libfuncs (trunc_optab, "trunc", MODE_FLOAT, MODE_DECIMAL_FLOAT);
    init_interclass_conv_libfuncs (trunc_optab, "trunc", MODE_DECIMAL_FLOAT, MODE_FLOAT);
    + /* Explicitly initialize the bswap libfuncs since we need them to be
    + valid for things other than word_mode. */
    + set_optab_libfunc (bswap_optab, SImode, "__bswapsi2");
    + set_optab_libfunc (bswap_optab, DImode, "__bswapdi2");
    +
    /* Use cabs for double complex abs, since systems generally have cabs.
    Don't define any libcall for float complex, so that cabs will be used. */
    if (complex_double_type_node)
    Index: gcc/optabs.h
    gcc/optabs.h(revision 115913)
    gcc/optabs.h(working copy)
    @@ -146,6 +146,8 @@ enum optab_index
    /* Abs value */
    OTI_abs,
    OTI_absv,
    + /* Byteswap */
    + OTI_bswap,
    /* Bitwise not */
    OTI_one_cmpl,
    /* Bit scanning and counting */
    @@ -315,6 +317,7 @@ extern GTY(()) optab optab_table[OTI_MAX
    #define abs_optab (optab_table[OTI_abs])
    #define absv_optab (optab_table[OTI_absv])
    #define one_cmpl_optab (optab_table[OTI_one_cmpl])
    +#define bswap_optab (optab_table[OTI_bswap])
    #define ffs_optab (optab_table[OTI_ffs])
    #define clz_optab (optab_table[OTI_clz])
    #define ctz_optab (optab_table[OTI_ctz])
    Index: gcc/genopinit.c
    gcc/genopinit.c(revision 115913)
    gcc/genopinit.c(working copy)
    @@ -148,6 +148,7 @@ static const char * const optabs[] =
    "atan_optab->handlers[$A].insn_code = CODE_FOR_$(atan$a2$)",
    "strlen_optab->handlers[$A].insn_code = CODE_FOR_$(strlen$a$)",
    "one_cmpl_optab->handlers[$A].insn_code = CODE_FOR_$(one_cmpl$a2$)",
    + "bswap_optab->handlers[$A].insn_code = CODE_FOR_$(bswap$a2$)",
    "ffs_optab->handlers[$A].insn_code = CODE_FOR_$(ffs$a2$)",
    "clz_optab->handlers[$A].insn_code = CODE_FOR_$(clz$a2$)",
    "ctz_optab->handlers[$A].insn_code = CODE_FOR_$(ctz$a2$)",
    Index: gcc/tree.c
    gcc/tree.c(revision 115913)
    gcc/tree.c(working copy)
    @@ -6535,6 +6535,10 @@ build_common_tree_nodes_2 (int short_dou
    long_double_ptr_type_node = build_pointer_type (long_double_type_node);
    integer_ptr_type_node = build_pointer_type (integer_type_node);
    + /* Fixed size integer types. */
    + uint32_type_node = build_nonstandard_integer_type (32, true);
    + uint64_type_node = build_nonstandard_integer_type (64, true);
    +
    /* Decimal float types. */
    dfloat32_type_node = make_node (REAL_TYPE);
    TYPE_PRECISION (dfloat32_type_node) = DECIMAL32_TYPE_SIZE;
    Index: gcc/tree.h
    gcc/tree.h(revision 115913)
    gcc/tree.h(working copy)
    @@ -3203,6 +3203,9 @@ enum tree_index
    TI_UINTDI_TYPE,
    TI_UINTTI_TYPE,
    + TI_UINT32_TYPE,
    + TI_UINT64_TYPE,
    +
    TI_INTEGER_ZERO,
    TI_INTEGER_ONE,
    TI_INTEGER_MINUS_ONE,
    @@ -3278,6 +3281,9 @@ extern GTY(()) tree global_trees[TI_MAX]
    #define unsigned_intDI_type_node global_trees[TI_UINTDI_TYPE]
    #define unsigned_intTI_type_node global_trees[TI_UINTTI_TYPE]
    +#define uint32_type_node global_trees[TI_UINT32_TYPE]
    +#define uint64_type_node global_trees[TI_UINT64_TYPE]
    +
    #define integer_zero_node global_trees[TI_INTEGER_ZERO]
    #define integer_one_node global_trees[TI_INTEGER_ONE]
    #define integer_minus_one_node global_trees[TI_INTEGER_MINUS_ONE]
    Index: gcc/builtins.c
    gcc/builtins.c(revision 115913)
    gcc/builtins.c(working copy)
    @@ -4564,6 +4564,30 @@ expand_builtin_alloca (tree arglist, rtx
    return result;
    }
    +/* Expand a call to a bswap builtin. The arguments are in ARGLIST. MODE
    + is the mode to expand with. */
    +
    +static rtx
    +expand_builtin_bswap (tree arglist, rtx target, rtx subtarget)
    +{
    + enum machine_mode mode;
    + tree arg;
    + rtx op0;
    +
    + if (!validate_arglist (arglist, INTEGER_TYPE, VOID_TYPE))
    + return 0;
    +
    + arg = TREE_VALUE (arglist);
    + mode = TYPE_MODE (TREE_TYPE (arg));
    + op0 = expand_expr (arg, subtarget, VOIDmode, 0);
    +
    + target = expand_unop (mode, bswap_optab, op0, target, 1);
    +
    + gcc_assert (target);
    +
    + return convert_to_mode (mode, target, 0);
    +}
    +
    /* Expand a call to a unary builtin. The arguments are in ARGLIST.
    Return 0 if a normal call should be emitted rather than expanding the
    function in-line. If convenient, the result should be placed in TARGET.
    @@ -5825,6 +5849,14 @@ expand_builtin (tree exp, rtx target, rt
    expand_stack_restore (TREE_VALUE (arglist));
    return const0_rtx;
    + case BUILT_IN_BSWAP32:
    + case BUILT_IN_BSWAP64:
    + target = expand_builtin_bswap (arglist, target, subtarget);
    +
    + if (target)
    +return target;
    + break;
    +
    CASE_INT_FN (BUILT_IN_FFS):
    case BUILT_IN_FFSIMAX:
    target = expand_builtin_unop (target_mode, arglist, target,
    @@ -7441,6 +7473,67 @@ fold_builtin_bitop (tree fndecl, tree ar
    return NULL_TREE;
    }
    +/* Fold function call to builtin_bswap and the long and long long
    + variants. Return NULL_TREE if no simplification can be made. */
    +static tree
    +fold_builtin_bswap (tree fndecl, tree arglist)
    +{
    + tree arg;
    +
    + if (! validate_arglist (arglist, INTEGER_TYPE, VOID_TYPE))
    + return 0;
    +
    + /* constant value. */
    + arg = TREE_VALUE (arglist);
    + if (TREE_CODE (arg) == INTEGER_CST && ! TREE_CONSTANT_OVERFLOW (arg))
    + {
    + HOST_WIDE_INT hi, width, r_hi = 0;
    + unsigned HOST_WIDE_INT lo, r_lo = 0;
    + tree type;
    +
    + type = TREE_TYPE (arg);
    + width = TYPE_PRECISION (type);
    + lo = TREE_INT_CST_LOW (arg);
    + hi = TREE_INT_CST_HIGH (arg);
    +
    + switch (DECL_FUNCTION_CODE (fndecl))
    +{
    + case BUILT_IN_BSWAP32:
    + case BUILT_IN_BSWAP64:
    + {
    + int s;
    +
    + for (s = 0; s < width; s += 8)
    +{
    + int d = width - s - 8;
    + unsigned HOST_WIDE_INT byte;
    +
    + if (s < HOST_BITS_PER_WIDE_INT)
    + byte = (lo >> s) & 0xff;
    + else
    + byte = (hi >> (s - HOST_BITS_PER_WIDE_INT)) & 0xff;
    +
    + if (d < HOST_BITS_PER_WIDE_INT)
    + r_lo |= byte << d;
    + else
    + r_hi |= byte << (d - HOST_BITS_PER_WIDE_INT);
    +}
    + }
    +
    + break;
    +
    +default:
    + gcc_unreachable ();
    +}
    +
    + if (width < HOST_BITS_PER_WIDE_INT)
    +return build_int_cst (TREE_TYPE (TREE_TYPE (fndecl)), r_lo);
    + else
    +return build_int_cst_wide (TREE_TYPE (TREE_TYPE (fndecl)), r_lo, r_hi);
    + }
    +
    + return NULL_TREE;
    +}
    /* Return true if EXPR is the real constant contained in VALUE. */
    static bool
    @@ -8788,6 +8881,10 @@ fold_builtin_1 (tree fndecl, tree arglis
    CASE_FLT_FN (BUILT_IN_LLRINT):
    return fold_fixed_mathfn (fndecl, arglist);
    + case BUILT_IN_BSWAP32:
    + case BUILT_IN_BSWAP64:
    + return fold_builtin_bswap (fndecl, arglist);
    +
    CASE_INT_FN (BUILT_IN_FFS):
    CASE_INT_FN (BUILT_IN_CLZ):
    CASE_INT_FN (BUILT_IN_CTZ):
    Index: gcc/fold-const.c
    gcc/fold-const.c(revision 115913)
    gcc/fold-const.c(working copy)
    @@ -12092,6 +12092,8 @@ tree_expr_nonnegative_p (tree t)
    CASE_INT_FN (BUILT_IN_FFS):
    CASE_INT_FN (BUILT_IN_PARITY):
    CASE_INT_FN (BUILT_IN_POPCOUNT):
    + case BUILT_IN_BSWAP32:
    + case BUILT_IN_BSWAP64:
    /* Always true. */
    return 1;
    @@ -13000,4 +13002,3 @@ fold_strip_sign_ops (tree exp)
    }
    return NULL_TREE;
    }
    -
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,12 @@
    +/* { dg-do compile } */
    +/* { dg-options "-march=i486" } */
    +/* { dg-final { scan-assembler "bswap" } } */
    +
    +int foo (int a)
    +{
    + int b;
    +
    + b = __builtin_bswap32 (a);
    +
    + return b;
    +}
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,14 @@
    +/* { dg-do compile } */
    +/* { dg-options "" } */
    +/* { dg-final { scan-assembler-not "__builtin_" } } */
    +
    +#include <stdint.h>
    +
    +uint32_t foo (uint32_t a)
    +{
    + int b;
    +
    + b = __builtin_bswap32 (a);
    +
    + return b;
    +}
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,19 @@
    +/* { dg-do run } */
    +/* { dg-options "" } */
    +#include <stdint.h>
    +
    +extern void abort (void);
    +
    +int main (void)
    +{
    + uint32_t a = 0x80000000;
    + uint32_t b;
    +
    + b = __builtin_bswap32 (a);
    + a = __builtin_bswap32 (b);
    +
    + if (b != 0x80 || a != 0x80000000)
    + abort ();
    +
    + return 0;
    +}
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,16 @@
    +/* { dg-do run } */
    +/* { dg-options "" } */
    +int
    +main (void)
    +{
    + /* Test constant folding. */
    + extern void link_error (void);
    +
    + if (__builtin_bswap32(0xaabbccdd) != 0xddccbbaa)
    + link_error ();
    +
    + if (__builtin_bswap64(0x1122334455667788ULL) != 0x8877665544332211ULL)
    + link_error ();
    +
    + return 0;
    +}
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,19 @@
    +/* { dg-do run } */
    +/* { dg-options "" } */
    +#include <stdint.h>
    +
    +extern void abort (void);
    +
    +int main (void)
    +{
    + uint32_t a = 4;
    + uint32_t b;
    +
    + b = __builtin_bswap32 (a);
    + a = __builtin_bswap32 (b);
    +
    + if (b == 4 || a != 4)
    + abort ();
    +
    + return 0;
    +}
    Index:
    (revision 0)
    (revision 0)
    @@ -0,0 +1,59 @@
    +/* { dg-do run } */
    +/* { dg-options "-Wall" } */
    +
    +#include <stdint.h>
    +
    +#define MAKE_FUN(suffix, type)\
    + type my_bswap##suffix(type x) {\
    + type result = 0;\
    + int shift;\
    + for (shift = 0; shift < 8 * sizeof (type); shift += 8)\
    + {\
    +result <<= 8;\
    +result |= (x >> shift) & 0xff;\
    + }\
    + return result;\
    + }\
    +
    +MAKE_FUN(32, uint32_t);
    +MAKE_FUN(64, uint64_t);
    +
    +extern void abort (void);
    +
    +#define NUMS32\
    + {\
    + 0x00000000UL,\
    + 0x11223344UL,\
    + 0xffffffffUL,\
    + }
    +
    +#define NUMS64\
    + {\
    + 0x0000000000000000ULL,\
    + 0x1122334455667788ULL,\
    + 0xffffffffffffffffULL,\
    + }
    +
    +uint32_t uint32_ts[] =
    + NUMS32;
    +
    +uint64_t uint64_ts[] =
    + NUMS64;
    +
    +#define N(table) (sizeof (table) / sizeof (table[0]))
    +
    +int
    +main (void)
    +{
    + int i;
    +
    + for (i = 0; i < N(uint32_ts); i++)
    + if (__builtin_bswap32 (uint32_ts[i]) != my_bswap32 (uint32_ts[i]))
    + abort ();
    +
    + for (i = 0; i < N(uint64_ts); i++)
    + if (__builtin_bswap64 (uint64_ts[i]) != my_bswap64 (uint64_ts[i]))
    + abort ();
    +
    + return 0;
    +}
    Index: gcc/builtin-types.def
    gcc/builtin-types.def(revision 115913)
    gcc/builtin-types.def(working copy)
    @@ -74,6 +74,8 @@ DEF_PRIMITIVE_TYPE (BT_LONGLONG, long_lo
    DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
    DEF_PRIMITIVE_TYPE (BT_INTMAX, intmax_type_node)
    DEF_PRIMITIVE_TYPE (BT_UINTMAX, uintmax_type_node)
    +DEF_PRIMITIVE_TYPE (BT_UINT32, uint32_type_node)
    +DEF_PRIMITIVE_TYPE (BT_UINT64, uint64_type_node)
    DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 0))
    DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
    DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
    @@ -203,6 +205,10 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFL
    DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
    DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
    DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
    +DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
    +DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
    +DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
    +DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
    DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
    @@ -434,4 +440,3 @@ DEF_FUNCTION_TYPE_VAR_5 (BT_FN_INT_STRIN
    DEF_POINTER_TYPE (BT_PTR_FN_VOID_VAR, BT_FN_VOID_VAR)
    DEF_FUNCTION_TYPE_3 (,
    BT_PTR, BT_PTR_FN_VOID_VAR, BT_PTR, BT_SIZE)
    -
    Index: gcc/builtins.def
    gcc/builtins.def(revision 115913)
    gcc/builtins.def(working copy)
    @@ -594,6 +594,8 @@ DEF_EXT_LIB_BUILTIN (BUILT_IN_ALLOCA,
    DEF_GCC_BUILTIN (BUILT_IN_APPLY, "apply", , ATTR_NULL)
    DEF_GCC_BUILTIN (BUILT_IN_APPLY_ARGS, "apply_args", BT_FN_PTR_VAR, ATTR_NULL)
    DEF_GCC_BUILTIN (BUILT_IN_ARGS_INFO, "args_info", BT_FN_INT_INT, ATTR_NULL)
    +DEF_GCC_BUILTIN (BUILT_IN_BSWAP32, "bswap32", BT_FN_UINT32_UINT32, ATTR_CONST_NOTHROW_LIST)
    +DEF_GCC_BUILTIN (BUILT_IN_BSWAP64, "bswap64", BT_FN_UINT64_UINT64, ATTR_CONST_NOTHROW_LIST)
    DEF_LIB_BUILTIN (BUILT_IN_CALLOC, "calloc", BT_FN_PTR_SIZE_SIZE, ATTR_MALLOC_NOTHROW_LIST)
    DEF_GCC_BUILTIN (BUILT_IN_CLASSIFY_TYPE, "classify_type", BT_FN_INT_VAR, ATTR_NULL)
    DEF_GCC_BUILTIN (BUILT_IN_CLZ, "clz", BT_FN_INT_UINT, ATTR_CONST_NOTHROW_LIST)
    Index: gcc/rtl.def
    gcc/rtl.def(revision 115913)
    gcc/rtl.def(working copy)
    @@ -567,6 +567,9 @@ DEF_RTL_EXPR(ABS, "abs", "e", RTX_UNARY)
    /* Square root */
    DEF_RTL_EXPR(SQRT, "sqrt", "e", RTX_UNARY)
    +/* Swap bytes. */
    +DEF_RTL_EXPR(BSWAP, "bswap", "e", RTX_UNARY)
    +
    /* Find first bit that is set.
    Value is 1 + number of trailing zeros in the arg.,
    or 0 if arg is 0. */
    Index: gcc/genattrtab.c
    gcc/genattrtab.c(revision 115913)
    gcc/genattrtab.c(working copy)
    @@ -959,6 +959,7 @@ check_attr_value (rtx exp, struct attr_d
    case CTZ:
    case POPCOUNT:
    case PARITY:
    + case BSWAP:
    XEXP (exp, 0) = check_attr_value (XEXP (exp, 0), attr);
    break;
    Index: gcc/libgcc2.c
    gcc/libgcc2.c(revision 115913)
    gcc/libgcc2.c(working copy)
    @@ -492,6 +492,30 @@ __ashrdi3 (DWtype u, word_type b)
    }
    #endif

    +#ifdef L_bswapsi2
    +UWtype
    +__bswapSI2 (UWtype u)
    +{
    + return ((((u) & 0xff000000) >> 24)
    + | (((u) & 0x00ff0000) >> 8)
    + | (((u) & 0x0000ff00) << 8)
    + | (((u) & 0x000000ff) << 24));
    +}
    +#endif
    +#ifdef L_bswapdi2
    +UDWtype
    +__bswapDI2 (UDWtype u)
    +{
    + return ((((u) & 0xff00000000000000ull) >> 56)
    + | (((u) & 0x00ff000000000000ull) >> 40)
    + | (((u) & 0x0000ff0000000000ull) >> 24)
    + | (((u) & 0x000000ff00000000ull) >> 8)
    + | (((u) & 0x00000000ff000000ull) << 8)
    + | (((u) & 0x0000000000ff0000ull) << 24)
    + | (((u) & 0x000000000000ff00ull) << 40)
    + | (((u) & 0x00000000000000ffull) << 56));
    +}
    +#endif
    #ifdef L_ffssi2
    #undef int
    int
    Index: gcc/libgcc2.h
    gcc/libgcc2.h(revision 115913)
    gcc/libgcc2.h(working copy)
    @@ -304,11 +304,13 @@ typedef int word_type __attribute__ ((mo
    #define __ctzSI2 __NW(ctz,2)
    #define __popcountSI2 __NW(popcount,2)
    #define __paritySI2 __NW(parity,2)
    +#define __bswapSI2 __NW(bswap,2)
    #define __ffsDI2 __NDW(ffs,2)
    #define __clzDI2 __NDW(clz,2)
    #define __ctzDI2 __NDW(ctz,2)
    #define __popcountDI2 __NDW(popcount,2)
    #define __parityDI2 __NDW(parity,2)
    +#define __bswapDI2 __NDW(bswap,2)
    extern DWtype __muldi3 (DWtype, DWtype);
    extern DWtype __divdi3 (DWtype, DWtype);
    @@ -345,11 +347,13 @@ extern Wtype __addvSI3 (Wtype, Wtype);
    extern Wtype __subvSI3 (Wtype, Wtype);
    extern Wtype __mulvSI3 (Wtype, Wtype);
    extern Wtype __negvSI2 (Wtype);
    +extern UWtype __bswapSI2 (UWtype);
    extern DWtype __absvDI2 (DWtype);
    extern DWtype __addvDI3 (DWtype, DWtype);
    extern DWtype __subvDI3 (DWtype, DWtype);
    extern DWtype __mulvDI3 (DWtype, DWtype);
    extern DWtype __negvDI2 (DWtype);
    +extern UDWtype __bswapDI2 (UDWtype);
    #ifdef
    extern SItype __absvsi2 (SItype);
    Index: gcc/mklibgcc.in
    gcc/mklibgcc.in(revision 115913)
    gcc/mklibgcc.in(working copy)
    @@ -91,7 +91,7 @@ lib2funcs='_muldi3 _negdi2 _lshrdi3 _ash
    _ffssi2 _ffsdi2 _clz _clzsi2 _clzdi2 _ctzsi2 _ctzdi2 _popcount_tab
    _popcountsi2 _popcountdi2 _paritysi2 _paritydi2 _powisf2 _powidf2
    _powixf2 _powitf2 _mulsc3 _muldc3 _mulxc3 _multc3 _divsc3 _divdc3
    -_divxc3 _divtc3'
    +_divxc3 _divtc3 _bswapsi2 _bswapdi2'
    if [ "$LIB2_SIDITI_CONV_FUNCS" ]; then
    for func in $swfloatfuncs; do
    Index: gcc/simplify-rtx.c
    gcc/simplify-rtx.c(revision 115913)
    gcc/simplify-rtx.c(working copy)
    @@ -1042,6 +1042,9 @@ simplify_const_unary_operation (enum rtx
    val &= 1;
    break;
    +case BSWAP:
    + return 0;
    +
    case TRUNCATE:
    val = arg0;
    break;
    @@ -4869,4 +4872,3 @@ simplify_rtx (rtx x)
    }
    return NULL;
    }
    -
    Index: gcc/config/i386/i386.h
    gcc/config/i386/i386.h(revision 115913)
    gcc/config/i386/i386.h(working copy)
    @@ -164,6 +164,7 @@ extern const int x86_use_bt;
    extern const int x86_cmpxchg, x86_cmpxchg8b, x86_cmpxchg16b, x86_xadd;
    extern const int x86_use_incdec;
    extern const int x86_pad_returns;
    +extern const int x86_bswap;
    extern int x86_prefetch_sse;
    #define TARGET_USE_LEAVE (x86_use_leave & TUNEMASK)
    @@ -236,6 +237,7 @@ extern int x86_prefetch_sse;
    #define TARGET_CMPXCHG8B (x86_cmpxchg8b & (1 << ix86_arch))
    #define TARGET_CMPXCHG16B (x86_cmpxchg16b & (1 << ix86_arch))
    #define TARGET_XADD (x86_xadd & (1 << ix86_arch))
    +#define TARGET_BSWAP (x86_bswap & (1 << ix86_arch))
    #ifndef TARGET_64BIT_DEFAULT
    #define TARGET_64BIT_DEFAULT 0
    Index: gcc/config/i386/i386.c
    gcc/config/i386/i386.c(revision 115913)
    gcc/config/i386/i386.c(working copy)
    @@ -829,6 +829,8 @@ const int x86_cmpxchg8b = ~(m_386 | m_48
    const int x86_cmpxchg16b = m_NOCONA;
    /* Exchange and add was added for 80486. */
    const int x86_xadd = ~m_386;
    +/* Byteswap was added for 80486. */
    +const int x86_bswap = ~m_386;
    const int x86_pad_returns = m_ATHLON_K8 | m_GENERIC;
    /* In case the average insn count for single function invocation is
    Index: gcc/libgcc-std.ver
    gcc/libgcc-std.ver(revision 115913)
    gcc/libgcc-std.ver(working copy)
    @@ -273,4 +273,6 @@ GCC_4.2.0 {
    __floatuntixf
    __floatuntitf
    _Unwind_GetIPInfo
    + __bswapsi2
    + __bswapdi2
    }
    Index: gcc/reload1.c
    gcc/reload1.c(revision 115913)
    gcc/reload1.c(working copy)
    @@ -2523,6 +2523,7 @@ eliminate_regs_1 (rtx x, enum machine_mo
    case CTZ:
    case POPCOUNT:
    case PARITY:
    + case BSWAP:
    new = eliminate_regs_1 (XEXP (x, 0), mem_mode, insn, false);
    if (new != XEXP (x, 0))
    return gen_rtx_fmt_e (code, GET_MDE (x), new);
    @@ -2743,6 +2744,7 @@ elimination_effects (rtx x, enum machine
    case CTZ:
    case POPCOUNT:
    case PARITY:
    + case BSWAP:
    elimination_effects (XEXP (x, 0), mem_mode);
    return;
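As a rough illustration of what the patch above exposes to user code, here is a minimal sketch that checks the two builtins against the example values given in the proposed documentation and in the constant-folding test. It assumes a compiler that already provides `__builtin_bswap32` and `__builtin_bswap64`; the helper names `ref_bswap32` and `check_builtins` are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Portable reference: peel bytes off the low end of x and push them
   onto the low end of result, so the byte order ends up reversed.  */
static uint32_t ref_bswap32(uint32_t x)
{
    uint32_t result = 0;
    unsigned shift;

    for (shift = 0; shift < 32; shift += 8) {
        result <<= 8;
        result |= (x >> shift) & 0xff;
    }
    return result;
}

/* Compare the builtins with the documented example values and with
   the portable reference loop.  */
static void check_builtins(void)
{
    assert(__builtin_bswap32(0xaabbccddu) == 0xddccbbaau);
    assert(__builtin_bswap64(0x1122334455667788ull) == 0x8877665544332211ull);
    assert(__builtin_bswap32(0x11223344u) == ref_bswap32(0x11223344u));
}
```

Calling `check_builtins()` from a test program should pass silently if the builtins behave as the documentation in the patch describes.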
  • No.1 | | 486 bytes | |

    On Aug 3, 2006, at 5:58 PM, Eric Christopher wrote:
    Here's the original patch with changes from bswap[,l,ll] to bswap
    [32,64], plus enhancements from both Falk and I including docs and
    more tests. I know you already approved the first, but I figured
    I'd go ahead and incorporate some of Falk's work and get it
    reapproved before 4.3. I'll be working on ppc support for it at
    some point as well.

    Why not add bswap16 as well?
    -Chris
  • No.2 | | 659 bytes | |

    Chris Lattner wrote:
    On Aug 3, 2006, at 5:58 PM, Eric Christopher wrote:
    >Here's the original patch with changes from bswap[,l,ll] to
    >bswap[32,64], plus enhancements from both Falk and I including docs
    >and more tests. I know you already approved the first, but I figured
    >I'd go ahead and incorporate some of Falk's work and get it reapproved
    >before 4.3. I'll be working on ppc support for it at some point as well.


    Why not add bswap16 as well?

    No reason. I'll probably do it, but this is as much patch as I wanted to
    keep around for now.
    -eric
  • No.3 | | 3265 bytes | |

    Eric Christopher <echristo (AT) apple (DOT) com> writes:

    Index: gcc/doc/extend.texi

    gcc/doc/extend.texi(revision 115913)
    gcc/doc/extend.texi(working copy)
    @@ -6058,6 +6058,16 @@ Similar to @code{__builtin_powi}, except
    are @code{long double}.
    @end deftypefn

    +@deftypefn {Built-in Function} int32_t __builtin_bswap32 (int32_t x)
    +Returns @var{x} with the order of the bytes reversed; for example,
    +@code{0xaabbccdd} becomes @code{0xddccbbaa}. Byte here always means
    +exactly 8 bits.
    +@end deftypefn
    +
    +@deftypefn {Built-in Function} int64_t __builtin_bswap64 (int64_t x)
    +Similar to @code{__builtin_bswap32}, except the argument and return types
    +are 64-bit.
    +@end deftypefn

    You are defining the functions to use unsigned types, so I think you
    should document them to use uint32_t and uint64_t.

    +@deftypefn {Runtime Function} int32_t __bswapsi2 (int32_t @var{a})
    +@deftypefnx {Runtime Function} int64_t __bswapdi2 (int64_t @var{a})
    +These functions return the @var{a} byteswapped.
    +@end deftypefn

    Here also.

    Index:

    (revision 0)
    (revision 0)
    @@ -0,0 +1,14 @@
    +/* { dg-do compile } */
    +/* { dg-options "" } */
    +/* { dg-final { scan-assembler-not "__builtin_" } } */
    +
    +#include <stdint.h>
    +
    +uint32_t foo (uint32_t a)
    +{
    + int b;
    +
    + b = __builtin_bswap32 (a);
    +
    + return b;
    +}

    Is this test really going to pass on a processor which does not
    support a byte swap instruction? I must be missing something.

    Index: gcc/builtin-types.def

    gcc/builtin-types.def(revision 115913)
    gcc/builtin-types.def(working copy)
    @@ -74,6 +74,8 @@ DEF_PRIMITIVE_TYPE (BT_LONGLONG, long_lo
    DEF_PRIMITIVE_TYPE (BT_ULONGLONG, long_long_unsigned_type_node)
    DEF_PRIMITIVE_TYPE (BT_INTMAX, intmax_type_node)
    DEF_PRIMITIVE_TYPE (BT_UINTMAX, uintmax_type_node)
    +DEF_PRIMITIVE_TYPE (BT_UINT32, uint32_type_node)
    +DEF_PRIMITIVE_TYPE (BT_UINT64, uint64_type_node)
    DEF_PRIMITIVE_TYPE (BT_WORD, (*lang_hooks.types.type_for_mode) (word_mode, 0))
    DEF_PRIMITIVE_TYPE (BT_FLOAT, float_type_node)
    DEF_PRIMITIVE_TYPE (BT_DOUBLE, double_type_node)
    @@ -203,6 +205,10 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFL
    DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
    DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
    DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
    +DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
    +DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
    +DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
    +DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)

    Why are you defining BT_FN_ULONG_ULONG and BT_FN_ULONGLONG_ULONGLONG
    here? If they are not needed, please remove them.

    Index: gcc/libgcc-std.ver

    gcc/libgcc-std.ver(revision 115913)
    gcc/libgcc-std.ver(working copy)
    @@ -273,4 +273,6 @@ GCC_4.2.0 {
    __floatuntixf
    __floatuntitf
    _Unwind_GetIPInfo
    + __bswapsi2
    + __bswapdi2
    }

    As you said, this should be GCC_4.3.0 assuming this is checked in
    after 4.2 branches.

    Patch is OK for stage 1 with those changes.

    Thanks.

    Ian
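To make the point about unsigned types concrete, the following is a small sketch of a shift-and-mask byte swap written against `uint32_t`, in the spirit of the `__bswapSI2` fallback in the patch. It is not the libgcc routine itself: the name `fallback_bswap32` is hypothetical, and the fixed-width type is an assumption made here so that the right shifts never act on a negative value.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-mask byte swap on an unsigned 32-bit value.  Because the
   operand is unsigned, the right shifts fill with zeros rather than
   copies of a sign bit.  */
static uint32_t fallback_bswap32(uint32_t u)
{
    return ((u & 0xff000000u) >> 24)
         | ((u & 0x00ff0000u) >> 8)
         | ((u & 0x0000ff00u) << 8)
         | ((u & 0x000000ffu) << 24);
}

/* Same check as one of the run tests in the patch: the top byte of
   0x80000000 must end up as the low byte 0x80, and swapping twice
   must give back the original value.  */
static void check_fallback(void)
{
    uint32_t b = fallback_bswap32(0x80000000u);

    assert(b == 0x80u);
    assert(fallback_bswap32(b) == 0x80000000u);
}
```

With a signed 32-bit operand, the behaviour of `>>` on a negative value is implementation-defined, which is one reason an unsigned signature is the safer thing to document.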
  • No.4 | | 644 bytes | |

    Eric Christopher <echristo (AT) apple (DOT) com> writes:

    Here's the original patch with changes from bswap[,l,ll] to
    bswap[32,64], plus enhancements from both Falk and I including docs
    and more tests. I know you already approved the first, but I figured
    I'd go ahead and incorporate some of Falk's work and get it reapproved
    before 4.3. I'll be working on ppc support for it at some point as
    well.

    Falk: I don't have an alpha machine to test the alpha support so if
    you'd like to include that can you rework that part of your patch?

    OK, will do as soon as you have committed.
  • No.5 | | 351 bytes | |

    Chris Lattner <clattner (AT) apple (DOT) com> writes:

    Why not add bswap16 as well?

    It should be unnecessary, since any attempt to express it should be
    picked up by the rot idiom recognizer, and the backends should then
    emit optimal code for constant-8 rots (and if that doesn't actually
    happen, we should rather fix that).
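The idiom referred to above can be sketched as follows: for a 16-bit value, swapping the two bytes is the same operation as rotating by 8 bits. Whether a given compiler and target actually turn this into a single rotate or byte-swap instruction is exactly the open question in this thread, so the sketch only demonstrates the equivalence; the name `swap16_rotate` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, written as a rotate by 8.
   The operand is promoted to int before shifting, so the cast back
   to uint16_t discards the bits shifted above bit 15.  */
static uint16_t swap16_rotate(uint16_t x)
{
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Rotating by 8 twice is a full 16-bit rotation, so a double swap
   must return the input.  */
static void check_swap16(void)
{
    assert(swap16_rotate(0x1234) == 0x3412);
    assert(swap16_rotate(swap16_rotate(0xaabb)) == 0xaabb);
}
```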
  • No.6 | | 526 bytes | |

    On Thursday 10 August 2006 20:43, Falk Hueffner wrote:
    Chris Lattner <clattner (AT) apple (DOT) com> writes:
    Why not add bswap16 as well?

    It should be unnecessary, since any attempt to express it should be
    picked up by the rot idiom recognizer, and the backends should then
    emit optimal code for constant-8 rots (and if that doesn't actually
    happen, we should rather fix that).

    Is this true for machines that don't have HImode registers or arithmetic
    operations?

    Paul
  • No.7 | | 1104 bytes | |

    On Aug 10, 2006, at 12:43 PM, Falk Hueffner wrote:
    Chris Lattner <clattner (AT) apple (DOT) com> writes:
    >Why not add bswap16 as well?
    >

    It should be unnecessary, since any attempt to express it should be
    picked up by the rot idiom recognizer, and the backends should then
    emit optimal code for constant-8 rots (and if that doesn't actually
    happen, we should rather fix that).

    Sure, makes sense. I was wondering more from the sake of consistency
    than from what GCC can and can not do.

    LLVM, for example, recognizes the common bswap idioms for 16/32/64
    bits and generates good code for them, but also exposes intrinsics
    for each. The intrinsics are important to clients who want to *know*
    they are going to get good code, without having to know that a
    particular version of the compiler will do the right thing, or worry
    about regressions in future versions.

    In any case, I have no particular interest in what GCC does, it just
    seemed odd to have builtins for 32/64-bit but not 16-bit.
    -Chris
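One way client code can get the guarantee described above while still building with older compilers is a thin wrapper that selects the builtin when it is available. This is only a sketch: it assumes the builtins first ship in GCC 4.3, as the thread anticipates, and the macro `HAVE_BSWAP_BUILTINS` and the function `client_bswap32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: __builtin_bswap32 is available from GCC 4.3 onwards.
   Compilers that report an older GNUC version take the fallback.  */
#if defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3))
#define HAVE_BSWAP_BUILTINS 1
#else
#define HAVE_BSWAP_BUILTINS 0
#endif

/* Byte-swap a 32-bit value, preferring the compiler builtin.  */
static uint32_t client_bswap32(uint32_t x)
{
#if HAVE_BSWAP_BUILTINS
    return __builtin_bswap32(x);
#else
    /* Portable fallback: assemble the result one byte at a time.  */
    uint32_t result = 0;
    unsigned shift;

    for (shift = 0; shift < 32; shift += 8)
        result = (result << 8) | ((x >> shift) & 0xff);
    return result;
#endif
}

/* Both paths must agree on the documented example value.  */
static void check_client(void)
{
    assert(client_bswap32(0xaabbccddu) == 0xddccbbaau);
}
```

Either branch gives the same result; the wrapper only changes whether the caller relies on idiom recognition or on the builtin.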
  • No.8 | | 1013 bytes | |

    Paul Brook <paul (AT) codesourcery (DOT) com> writes:

    On Thursday 10 August 2006 20:43, Falk Hueffner wrote:
    >Chris Lattner <clattner (AT) apple (DOT) com> writes:
    >Why not add bswap16 as well?
    >>

    >It should be unnecessary, since any attempt to express it should be
    >picked up by the rot idiom recognizer, and the backends should then
    >emit optimal code for constant-8 rots (and if that doesn't actually
    >happen, we should rather fix that).
    >

    Is this true for machines that don't have HImode registers or arithmetic
    operations?

    I suppose it should still be true if the machine doesn't have 16bit
    registers. However it wouldn't work if the target doesn't have any
    mode corresponding to 16 bit (although I suspect that targets
    deviating from the standard SI=32bit will become extinct at some
    point, so I don't know whether it's worth bothering).
  • No.9 | | 1579 bytes | |

    On Thursday 10 August 2006 21:07, Falk Hueffner wrote:
    Paul Brook <paul (AT) codesourcery (DOT) com> writes:
    On Thursday 10 August 2006 20:43, Falk Hueffner wrote:
    >Chris Lattner <clattner (AT) apple (DOT) com> writes:
    >Why not add bswap16 as well?
    >>

    >It should be unnecessary, since any attempt to express it should be
    >picked up by the rot idiom recognizer, and the backends should then
    >emit optimal code for constant-8 rots (and if that doesn't actually
    >happen, we should rather fix that).
    >

    Is this true for machines that don't have HImode registers or arithmetic
    operations?

    I suppose it should still be true if the machine doesn't have 16bit
    registers. However it wouldn't work if the target doesn't have any
    mode corresponding to 16 bit (although I suspect that targets
    deviating from the standard SI=32bit will become extinct at some
    point, so I don't know whether it's worth bothering).

    Where does the rothi idiom recognition happen? trees or RTL?
    In particular does the idiom recognition still work if the target only has
    lsrsi3, not lsrhi3?
    I was thinking of targets like ARM where arithmetic only usually happens on
    SImode values. Arm has an instruction that byteswaps the low half of a
    register and extends the result to full register width.
    Do I need to define a named rotlhi3 pattern that fails for anything other than
    a constant rotation of 8?

    Paul
  • No.10 | | 684 bytes | |

    Falk Hueffner <falk (AT) debian (DOT) org> writes:

    Chris Lattner <clattner (AT) apple (DOT) com> writes:

    Why not add bswap16 as well?

    It should be unnecessary, since any attempt to express it should be
    picked up by the rot idiom recognizer, and the backends should then
    emit optimal code for constant-8 rots (and if that doesn't actually
    happen, we should rather fix that).

    bswap16 is still necessary, for consistency, and to avoid bug reports.

    Eric indicated offline that he was going to work on that, but wanted
    to get the first set of patches approved. At least, that is how I
    interpreted what he said.

    Ian
  • No.11 | | 825 bytes | |

    Ian Lance Taylor wrote:
    Falk Hueffner <falk (AT) debian (DOT) org> writes:

    >Chris Lattner <clattner (AT) apple (DOT) com> writes:
    >>

    Why not add bswap16 as well?
    >It should be unnecessary, since any attempt to express it should be
    >picked up by the rot idiom recognizer, and the backends should then
    >emit optimal code for constant-8 rots (and if that doesn't actually
    >happen, we should rather fix that).


    bswap16 is still necessary, for consistency, and to avoid bug reports.

    Eric indicated offline that he was going to work on that, but wanted
    to get the first set of patches approved. At least, that is how I
    interpreted what he said.

    Yup.
    -eric
