Development

  • Vectorize int to float conversions


    This patch adds the necessary target hooks and vectorizer changes to
    support vectorization of statements that include
    casting from int to float. In addition, it implements the hook for altivec.
    Testcase attached.
    Bootstrapped and tested on the vectorizer testcases on powerpc-linux.
    Bootstrapped with vectorization enabled and tested on the vectorizer
    testcases on i386-linux.
    Full make check in progress.
    OK for mainline, once testing completes?
    : ADDPATCH vectorizer (ssa, loops):
    Tehila.
    ChangeLog:
    * target.h (struct vectorize): Add builtin_vectorized_function field.
    * tree-vectorizer.h (type_int_float_vec_info_type): New enum
    stmt_vec_info_type value.
    (): New declaration.
    * tree-vect-analyze.c (vect_analyze_operations): Add
    call.
    * target-def.h (): New.
    * tree-vect-transform.c (): New
    function.
    (vect_transform_stmt): Add case for type_int_float_vec_info_type.
    * tree-vect-generic.c (expand_vector_operations_1): Consider correct
    mode.
    * config/rs6000/rs6000.c (): New.
    (): Defined.
    (rs6000_expand_builtin): Add handling a case of ALTIVEC_BUILTIN_VCFUX
    or
    ALTIVEC_BUILTIN_VCFSX.
    (See attached file: int_to_float_convert.txt)(See attached file:
    vect-intfloat-conversion.c)
  • No.1 | | 1015 bytes | |

    On 1/23/07, Tehila Meyzels <TEHILA (AT) il (DOT) ibm.com> wrote:
    --
    This patch adds the necessary target hooks and vectorizer changes to
    support vectorization of statements that include
    casting from int to float. In addition, it implements the hook for altivec.
    Testcase attached.

    Bootstrapped and tested on the vectorizer testcases on powerpc-linux.
    Bootstrapped with vectorization enabled and tested on the vectorizer
    testcases on i386-linux.
    Full make check in progress.

    OK for mainline, once testing completes?

    I would like it more if you did a generic vectorize_conversion hook (to also
    support float->int conversion) and if you added the necessary changes to
    support mixed-vector operations to for example enable vectorization of
    lrint (which is a float->int conversion, too). This patch looks really too
    special.

    I suppose we can work on more generic support for type (and width) changing
    operations.

    Thanks,
    Richard.
  • No.2 | | 6619 bytes | |

    On 1/23/07, Tehila Meyzels <TEHILA (AT) il (DOT) ibm.com> wrote:

    "Richard Guenther" <richard.guenther (AT) gmail (DOT) com> wrote on 23/01/2007
    17:55:54:

    On 1/23/07, Tehila Meyzels <TEHILA (AT) il (DOT) ibm.com> wrote:
    --
    This patch adds the necessary target hooks and vectorizer changes to
    support vectorization of statements that include
    casting from int to float. In addition, it implements the hook for
    altivec.
    Testcase attached.

    Bootstrapped and tested on the vectorizer testcases on powerpc-linux.
    Bootstrapped with vectorization enabled and tested on the vectorizer
    testcases on i386-linux.
    Full make check in progress.

    OK for mainline, once testing completes?

    I would like it more if you did a generic vectorize_conversion hook (to also
    support float->int conversion) and if you added the necessary changes to
    support mixed-vector operations to for example enable vectorization of
    lrint (which is a float->int conversion, too). This patch looks really too
    special.

    I suppose we can work on more generic support for type (and width) changing
    operations.
    --
    Thanks for the quick response!
    This was the aim when I started working on this patch.
    I've sent this patch as a first step toward the general
    "vectorize_conversion" you've just described.

    WRT float to int conversion, actually, I think I've done that (the
    "FIX_TRUNC_EXPR" parts).
    Maybe I've missed it in tree-vect-generic.c, so I can fix that.

    WRT different widths, here the situation is a bit more complicated and we
    should consider the demotion/promotion
    support we already have in the vectorizer (since those cases combine these
    two functionalities - conversion+demotion/promotion).
    My thought was to do that soon after this patch.

    Here's a more thorough review.

    + /* Returns a code for builtin that realizes vectorized version of
    + int to float conversion, or NULL_TREE if not available. */
    + tree (* builtin_intfloat_conversion) (tree);
    +

    I don't like the name - maybe builtin_vectorized_conversion?

    + operation = GIMPLE_STMT_OPERAND (stmt, 1);
    + code = TREE_CODE (operation);
    + if (code != FIX_TRUNC_EXPR && code != FLOAT_EXPR)
    + return false;

    why place this restriction here? I think this should be done in the
    target hook.

    + /* Check types of lhs and rhs. */
    + op0 = TREE_OPERAND (operation, 0);
    + rhs_type = TREE_TYPE (op0);

    so you only allow unary operations - please add a helper function here
    that extracts the source type from the operation so we can also allow
    function calls here.

    + /* FORNOW: need to extend to support short<->float conversions as well. */
    + if (nunits_out != nunits_in)
    + return false;

    I have been thinking about this myself - there is high-level support
    for this missing,
    so this is ok for now. But the canonical way is to write ? or FIXME
    instead of FORNOW.

    + /* Check that types are both integral or non-integral. */
    + if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
    + || (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
    + return false;

    huh? isn't it the point to have rhs_type real and lhs_type int or the
    other way around?

    + /* Check */
    + ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
    + gcc_assert (ncopies >= 1);

    Check what? Why with an assert?

    + /* Supportable by target? */
    + if (!
    + || ! (vectype_in))
    + return false;

    I see you are passing the _type_ to the target hook. You should pass the
    whole operation there to allow vectorizing for example different rounding
    modes.

    + prev_stmt_info = NULL;
    + for (j = 0; j < ncopies; j++)
    + {
    + vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
    + params = build_tree_list (NULL_TREE, vec_oprnd0);
    +
    + builtin_decl =
    + (vectype_in);

    Wrong indentation, and no need to call this over and over again - cache it
    from the invocation above where you check for target support.

    + new_stmt = build_function_call_expr (builtin_decl, params);
    +
    + /* Arguments are ready. create the new vector stmt. */
    + new_stmt = build2 (GIMPLE_MODIFY_STMT, void_type_node, vec_dest,
    + new_stmt);
    + new_temp = make_ssa_name (vec_dest, new_stmt);
    + GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
    + vect_finish_stmt_generation (stmt, new_stmt, bsi);
    +
    + if (j == 0)
    + STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
    + else
    + STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
    + prev_stmt_info = vinfo_for_stmt (new_stmt);
    + }
    + return true;
    + }

    Do you have a testcase exercising ncopies != 1?

    vect_transform_stmt (tree stmt, block_st
    3873,3878
    4009,4019
    gcc_assert (done);
    break;

    + case type_int_float_vec_info_type:
    + done = (stmt, bsi, &vec_stmt);
    + gcc_assert (done);
    + break;
    +

    all the names involving *_int_float_* should probably just be , i.e.
    vectorizable_conversion, etc.

    expand_vector_operations_1 (block_stmt_i
    405,411
    && TREE_CODE_CLASS (code) != tcc_binary)
    return;

    ! if (code == NOP_EXPR || code == VIEW_CONVERT_EXPR)
    return;

    gcc_assert (code != CONVERT_EXPR);
    405,413
    && TREE_CODE_CLASS (code) != tcc_binary)
    return;

    ! if (code == NOP_EXPR
    ! || code == FLOAT_EXPR
    ! || code == VIEW_CONVERT_EXPR)
    return;

    There's a lot missing here, CONVERT_EXPR and FIX_TRUNC_EXPR come to
    my mind.

    Note that the patch somewhat overlaps with the support for vectorizing builtin
    functions, so you might want to call the builtin_vectorized_function target hook
    for conversions that involve a function call. This also suggests that maybe
    the conversion facility should use another generic builtin_vectorized_operation
    target hook which you would pass the tree_code and the vector type
    (as the builtin_vectorized_function hook gets the builtin function id and the
    result vector type). To ever support nunits_in != nunits_out natively it
    would be nice to also pass the desired operand vector type to the target hooks.

    So it would look like

    tree (* builtin_vectorized_operation) (unsigned tree_code, tree
    lhs_vec_type, tree rhs_vec_type);

    Can you rework the patch with this suggested interface change? We can add
    the support for type changing functions later if you like.
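The suggested interface can be pictured with a small stand-alone model. This is only a sketch: the `toy_*` enums, the struct, and the returned builtin names are hypothetical stand-ins for GCC's `tree` codes and vector types, not anything from the patch. It shows a hook that receives the operation code plus both vector types and returns nothing when the target cannot handle the combination.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for tree codes and vector types (hypothetical, not GCC's). */
enum toy_code { TOY_FLOAT_EXPR, TOY_FIX_TRUNC_EXPR, TOY_NEGATE_EXPR };
enum toy_elem { TOY_INT32, TOY_FLOAT32 };

struct toy_vectype
{
  enum toy_elem elem;   /* element type */
  unsigned nunits;      /* number of elements per vector */
};

/* Shape of the hook: operation code, result vector type, operand vector
   type.  Returns the name of a builtin, or NULL if unsupported.  */
typedef const char *(*toy_vectorized_operation_hook)
  (enum toy_code code, const struct toy_vectype *lhs,
   const struct toy_vectype *rhs);

/* An imaginary target that only supports same-width int <-> float.  */
const char *
toy_target_hook (enum toy_code code, const struct toy_vectype *lhs,
                 const struct toy_vectype *rhs)
{
  assert (lhs != NULL && rhs != NULL);
  if (lhs->nunits != rhs->nunits)
    return NULL;   /* width-changing conversions are not modelled */
  if (code == TOY_FLOAT_EXPR
      && rhs->elem == TOY_INT32 && lhs->elem == TOY_FLOAT32)
    return "toy_vec_int_to_float";
  if (code == TOY_FIX_TRUNC_EXPR
      && rhs->elem == TOY_FLOAT32 && lhs->elem == TOY_INT32)
    return "toy_vec_float_to_int";
  return NULL;     /* the target rejects everything else */
}
```

Because the hook sees the code and both types, the generic side can ask once, cache the answer, and give up cleanly on a NULL result.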

    Thanks,
    Richard.
  • No.3 | | 8037 bytes | |

    On 1/25/07, Tehila Meyzels <TEHILA (AT) il (DOT) ibm.com> wrote:

    Here's a more thorough review.

    + /* Returns a code for builtin that realizes vectorized version of
    + int to float conversion, or NULL_TREE if not available. */
    + tree (* builtin_intfloat_conversion) (tree);
    +

    I don't like the name - maybe builtin_vectorized_conversion?

    OK. No problem.
    --
    + operation = GIMPLE_STMT_OPERAND (stmt, 1);
    + code = TREE_CODE (operation);
    + if (code != FIX_TRUNC_EXPR && code != FLOAT_EXPR)
    + return false;

    why place this restriction here? I think this should be done in the
    target hook.

    Because I wanted to take it one step at a time, and make it explicit what
    this function is meant to support for now. Also - do we really want to
    invoke the builtin function on any opcode? I agree that we may want to
    extend the list of conversion opcodes we handle, but I still think it
    should be checked here before we call the target builtin - the idea is that
    the target hook is expected to handle a certain closed set of conversion
    tree-codes, right?

    Yes, sort of. But the target hook can also return NULL_TREE if it doesn't
    support the conversion. So, I'm fine to keep the check in the generic code
    if you add it to the rs6000 target hook as well.

    + /* Check types of lhs and rhs. */
    + op0 = TREE_OPERAND (operation, 0);
    + rhs_type = TREE_TYPE (op0);

    so you only allow unary operations - please add a helper function here
    that extracts the source type from the operation so we can also allow
    function calls here.

    + /* FORNOW: need to extend to support short<->float conversions
    as well. */
    + if (nunits_out != nunits_in)
    + return false;

    I have been thinking about this myself - there is high-level support
    for this missing,

    Not sure what high-level support you are referring to

    There didn't seem to be a way to do for example V2DFmode -> V2DImode
    conversion in two steps (there's no cvtpd2di but only cvtsd2di). The closest
    I could find would be enhancing the pattern recognition code to emit more
    than one instruction.

    so this is ok for now. But the canonical way is to write ? or FIXME
    instead of FORNOW.

    + /* Check that types are both integral or non-integral. */
    + if ((INTEGRAL_TYPE_P (rhs_type) && INTEGRAL_TYPE_P (lhs_type))
    + || (!INTEGRAL_TYPE_P (rhs_type) && !INTEGRAL_TYPE_P (lhs_type)))
    + return false;

    huh? isn't it the point to have rhs_type real and lhs_type int or the
    other way around?
    --
    Exactly. That's why we return 'false' if lhs and rhs are both integral or
    both floats.

    Ah, so I was confused by the comment which suggests you are checking for
    the opposite. Maybe change it to

    /* Bail out if the types are both integral or non-integral. */
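The confusion here comes down to a plain boolean identity, which a short stand-alone sketch can make explicit. The function names below are hypothetical, and the two flags stand in for `INTEGRAL_TYPE_P (rhs_type)` and `INTEGRAL_TYPE_P (lhs_type)`. The patch's bail-out condition is true exactly when the statement is not an int/float conversion in either direction.

```c
#include <assert.h>
#include <stdbool.h>

/* The patch's bail-out condition, written over two plain flags.  */
bool
patch_bails_out (bool rhs_integral, bool lhs_integral)
{
  return (rhs_integral && lhs_integral)
         || (!rhs_integral && !lhs_integral);
}

/* Hypothetical positive form: exactly one side is integral.  */
bool
is_int_float_conversion (bool rhs_integral, bool lhs_integral)
{
  return rhs_integral != lhs_integral;
}

/* Checks all four flag combinations: bailing out is the negation of
   "exactly one side is integral".  */
void
check_equivalence (void)
{
  for (int r = 0; r <= 1; r++)
    for (int l = 0; l <= 1; l++)
      assert (patch_bails_out (r, l) == !is_int_float_conversion (r, l));
}
```

Read this way, the check is right and only the comment was misleading, which is what the suggested rewording addresses.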

    + /* Check */
    + ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits_in;
    + gcc_assert (ncopies >= 1);

    Check what? Why with an assert?

    Wrong comment, will fix that.
    It is supposed to say "Sanity check: make sure that at least one copy of
    the vectorized stmt needs to be generated".
    Otherwise, it means that the VF is, say, 8, while the nunits of the stmt in
    question is, say, 16, which is not expected
    to happen. (The vectorizer is supposed to determine the VF according to the
    smallest type in the loop, i.e. largest nunits.)
    This is the reason for the 'assert'.

    + /* Supportable by target? */
    + if (!
    + || ! (vectype_in))
    + return false;

    I see you are passing the _type_ to the target hook. You should pass the
    whole operation there to allow vectorizing for example different rounding
    modes.

    Can you please send me some examples for what you have in mind?
    (Do you mean lrint family, that rounds vs. (int), that truncates, for
    example?)

    Below I changed my mind somewhat ;) I was indeed thinking of function
    calls, but this is probably better handled by calling to the
    builtin_vectorized_function
    target hook (which can be added later). Still I'd like you to pass
    the TREE_CODE
    of the operation to the hook so the target has a chance to reject an operation
    it cannot handle (we at least have FLOAT_EXPR and FIX_TRUNC_EXPR).

    + prev_stmt_info = NULL;
    + for (j = 0; j < ncopies; j++)
    + {
    + vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
    + params = build_tree_list (NULL_TREE, vec_oprnd0);
    +
    + builtin_decl =
    + (vectype_in);

    Wrong indentation, and no need to call this over and over again - cache
    it
    from the invocation above where you check for target support.

    Sure, I'll fix that.
    BTW - I think I've followed build_vectorized_function_call code, so I
    think we have the same problem there, no? ;-)

    From a quick look I cannot see it there, but you may be right.

    + new_stmt = build_function_call_expr (builtin_decl, params);
    +
    + /* Arguments are ready. create the new vector stmt. */
    + new_stmt = build2 (GIMPLE_MODIFY_STMT, void_type_node, vec_dest,
    + new_stmt);
    + new_temp = make_ssa_name (vec_dest, new_stmt);
    + GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
    + vect_finish_stmt_generation (stmt, new_stmt, bsi);
    +
    + if (j == 0)
    + STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
    + else
    + STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
    + prev_stmt_info = vinfo_for_stmt (new_stmt);
    + }
    + return true;
    + }

    Do you have a testcase exercising ncopies != 1?

    Sure, I'll add one.

    Thanks!

    vect_transform_stmt (tree stmt, block_st
    3873,3878
    4009,4019
    gcc_assert (done);
    break;

    + case type_int_float_vec_info_type:
    + done = (stmt, bsi, &vec_stmt);
    + gcc_assert (done);
    + break;
    +

    all the names involving *_int_float_* should probably just be , i.e.
    vectorizable_conversion, etc.

    OK.
    --
    expand_vector_operations_1 (block_stmt_i
    405,411
    && TREE_CODE_CLASS (code) != tcc_binary)
    return;

    ! if (code == NOP_EXPR || code == VIEW_CONVERT_EXPR)
    return;

    gcc_assert (code != CONVERT_EXPR);
    405,413
    && TREE_CODE_CLASS (code) != tcc_binary)
    return;

    ! if (code == NOP_EXPR
    ! || code == FLOAT_EXPR
    ! || code == VIEW_CONVERT_EXPR)
    return;

    There's a lot missing here, CONVERT_EXPR and FIX_TRUNC_EXPR come to
    my mind.

    I agree.
    FIX_TRUNC_EXPR - sure, as I wrote in my previous note (I missed it).
    CONVERT_EXPR - is this the tree code in lrint cases?

    No, CONVERT_EXPR is just a synonym for NOP_EXPR really.

    I haven't found anything more that seems relevant, beside these 2, in
    tree.def.

    Yes, for int <-> float conversion we only have FIX_TRUNC_EXPR and FLOAT_EXPR,
    all others (like lrint) are represented using calls to builtin functions.
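For readers less familiar with the two tree codes, here is a minimal scalar sketch of the loops this discussion is about. The function names are hypothetical. As far as I understand GCC's tree codes, a C cast from int to float is represented as FLOAT_EXPR, and a cast from float to int as FIX_TRUNC_EXPR, which truncates toward zero.

```c
#include <assert.h>
#include <stddef.h>

/* int -> float, the FLOAT_EXPR direction.  */
void
ints_to_floats (const int *in, float *out, size_t n)
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (float) in[i];
}

/* float -> int, the FIX_TRUNC_EXPR direction: the cast drops the
   fractional part, so 2.7f becomes 2 and -2.7f becomes -2.  */
void
floats_to_ints (const float *in, int *out, size_t n)
{
  assert (in != NULL && out != NULL);
  for (size_t i = 0; i < n; i++)
    out[i] = (int) in[i];
}
```

A call such as lrint rounds according to the current rounding mode instead of truncating, which is one reason it stays a builtin function call and is not folded into either tree code.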

    Note that the patch somewhat overlaps with the support for vectorizing
    builtin
    functions, so you might want to call the builtin_vectorized_function
    target hook
    for conversions that involve a function call. This also suggests that
    maybe
    the conversion facility should use another generic
    builtin_vectorized_operation
    target hook which you would pass the tree_code and the vector type
    (as the builtin_vectorized_function hook gets the builtin function id and
    the
    result vector type). To ever support nunits_in != nunits_out natively it
    would be nice to also pass the desired operand vector type to the
    target hooks.

    So it would look like

    tree (* builtin_vectorized_operation) (unsigned tree_code, tree
    lhs_vec_type, tree rhs_vec_type);

    Can you rework the patch with this suggested interface change? We can
    add
    the support for type changing functions later if you like.

    Sure.
    Thanks for the helpful comments.

    Tehila.
    --
    Thanks,
    Richard.
    --
  • No.4 | | 5226 bytes | |

    On 1/25/07, Dorit Nuzman <DORIT (AT) il (DOT) ibm.com> wrote:
    "Richard Guenther" <richard.guenther (AT) gmail (DOT) com> wrote on 24/01/2007
    17:49:08:

    Hi Richard,

    Here's a more thorough review.

    + /* Check types of lhs and rhs. */
    + op0 = TREE_OPERAND (operation, 0);
    + rhs_type = TREE_TYPE (op0);

    so you only allow unary operations - please add a helper function here
    that extracts the source type from the operation so we can also allow
    function calls here.
    --
    I think that vectorization of function calls should be handled by
    vectorizable_call (that you had recently added). I think vectorizable_call
    doesn't need to really care about the semantics of the called function -
    whether it does some math function or a conversion - from the vectorizer's
    point of view it's replacing one function call with another. right?

    so this is ok for now. But the canonical way is to write ? or FIXME
    instead of FORNOW.
    --
    The convention in the vectorizer files has been to use a FORNOW for
    restrictions on vectorization that can (and should) be relaxed, and
    FIXME/CHECKME for other things. It can easily be changed to FIXMEs (do we
    really want to do that though?)

    I guess no then.

    + new_stmt = build_function_call_expr (builtin_decl, params);
    +
    + /* Arguments are ready. create the new vector stmt. */
    + new_stmt = build2 (GIMPLE_MODIFY_STMT, void_type_node, vec_dest,
    + new_stmt);
    + new_temp = make_ssa_name (vec_dest, new_stmt);
    + GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
    + vect_finish_stmt_generation (stmt, new_stmt, bsi);
    +
    + if (j == 0)
    + STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
    + else
    + STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
    + prev_stmt_info = vinfo_for_stmt (new_stmt);
    + }
    + return true;
    + }

    Do you have a testcase exercising ncopies != 1?
    --
    good point, a testcase should be added. Here's an example:

    for (i=0; i<n; i++){
    float_arr[i] = (float) int_arr[i];
    char_arr[i] = 0;
    }

    The above loop will be vectorized using VF=16 (for vector size 16 bytes),
    which means that we have to "unroll" the int->float conversion operation by
    4 (i.e. create 4 copies, in order to generate 16 float results in each
    vectorized iteration).
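The arithmetic behind that example can be sketched in a few lines of stand-alone C. The helper names are hypothetical and this is not vectorizer code; it assumes, as described in this thread, that the VF follows the smallest element type in the loop and that ncopies is the VF divided by the number of elements per vector for the statement in question.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements of a given size that fit in one vector.  */
unsigned
nunits_for (unsigned vector_bytes, unsigned elem_bytes)
{
  assert (elem_bytes > 0 && vector_bytes % elem_bytes == 0);
  return vector_bytes / elem_bytes;
}

/* The VF follows the smallest element type, i.e. the largest nunits.  */
unsigned
vectorization_factor (unsigned vector_bytes, const unsigned *elem_bytes,
                      size_t n)
{
  unsigned vf = 1;
  for (size_t i = 0; i < n; i++)
    {
      unsigned nunits = nunits_for (vector_bytes, elem_bytes[i]);
      if (nunits > vf)
        vf = nunits;
    }
  return vf;
}

/* How many vector statements one scalar statement expands to.  */
unsigned
ncopies_for (unsigned vf, unsigned nunits)
{
  assert (nunits > 0 && vf >= nunits);   /* mirrors the sanity check */
  return vf / nunits;
}
```

With element sizes of 4, 4 and 1 bytes (float, int, char) and 16-byte vectors, `vectorization_factor` gives 16 and `ncopies_for (16, nunits_for (16, 4))` gives 4, matching the four copies mentioned above.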

    By the way - this (supporting the case that ncopies>1) is something that is
    also missing in vectorizable_call - I was going to address this a few weeks
    ago but never got around to doing that (I totally missed it when I read
    over your patch - sorry about that). I'm travelling next week, but could
    provide a patch to add the required support in the following week. (What
    I'd really want to do is to put together some template for all the
    "vectorizable_*" functions so that we won't forget stuff like that in the
    future).

    Note that the patch somewhat overlaps with the support for vectorizing
    builtin
    functions, so you might want to call the builtin_vectorized_function
    target hook
    for conversions that involve a function call.

    which goes back to the point I was trying to make above - vectorizing
    function calls (that happen to do a conversion) should be handled by
    vectorizable_call. vectorizable_conversion will handle only tree-code that
    do conversions. What do you say?

    Hmm, but then if there is no special support for lhs_type != rhs_type necessary
    then why call it vectorize_intfloat_conversion and not provide generic
    target_vectorize_unop and target_vectorize_binop target hooks (or only one)?

    If it is indeed special because it's a conversion then it should also handle
    the case of converting function calls (on a second look - it doesn't
    look special,
    so I'll work on supporting different input/output vector types for the
    function vectorizing path).

    To ever support nunits_in != nunits_out natively it
    would be nice to also pass the desired operand vector type to the
    target hooks.

    So it would look like

    tree (* builtin_vectorized_operation) (unsigned tree_code, tree
    lhs_vec_type, tree rhs_vec_type);
    --
    it will take more than that - if the size of the arguments is different
    than the size of the result then vectorization involves working with a
    different number of vector registers coming in to the conversion compared
    to the number of vector register coming out of the conversion (i.e. doing
    something like we do in vectorize_demotion/promotion, although maybe not
    using the same pack/unpack idioms) - e.g. - say you want to vectorize a
    short->float conversion, and say we're working with a vector size 16 bytes
    and the VF is 8 - your rhs argument is one vector of 8 shorts; your lhs
    needs to be 2 vectors of 4 floats each. This is part of why the intention
    was to take it one step at a time - solve the same-size-conversion first,
    and the more general case later.
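The register-count mismatch in the short->float case can be modelled with plain arrays. This is only an illustration under the numbers given above (16-byte vectors, VF of 8); the struct and function names are hypothetical, and real targets would use unpack or similar instructions instead of a scalar loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy models of 16-byte vectors: 8 shorts in, 4 floats per result.  */
struct toy_v8hi { short v[8]; };
struct toy_v4sf { float v[4]; };

/* One input vector of 8 shorts produces two output vectors of 4 floats:
   the low half feeds out[0] and the high half feeds out[1].  */
void
toy_convert_short_to_float (const struct toy_v8hi *in,
                            struct toy_v4sf out[2])
{
  assert (in != NULL && out != NULL);
  for (int half = 0; half < 2; half++)
    for (int i = 0; i < 4; i++)
      out[half].v[i] = (float) in->v[half * 4 + i];
}
```

A hook that only returns one builtin for one vector in and one vector out has no way to express this two-result shape, which is why the same-size case was tackled first.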

    Yes, I agree - but without it we're not going to vectorize
    many int <-> float conversions in practice.

    Richard.
  • No.5 | | 2363 bytes | |

    On 1/25/07, Dorit Nuzman <DORIT (AT) il (DOT) ibm.com> wrote:
    "Richard Guenther" <richard.guenther (AT) gmail (DOT) com> wrote on 25/01/2007
    18:02:59:
    On 1/25/07, Dorit Nuzman <DORIT (AT) il (DOT) ibm.com> wrote:

    which goes back to the point I was trying to make above - vectorizing
    function calls (that happen to do a conversion) should be handled by
    vectorizable_call. vectorizable_conversion will handle only tree-code
    that
    do conversions. What do you say?

    Hmm, but then if there is no special support for lhs_type !=
    rhs_type necessary
    then why call it vectorize_intfloat_conversion and not provide generic
    target_vectorize_unop and target_vectorize_binop target hooks (or only
    one)?
    --
    it's a good point actually - it may be that vectorizable_conversion has
    more in common with vectorizable_operation than it may have seemed at
    first. The general idea was that all vectorizable_* functions (at least
    those that deal with tree-codes) handle the same type coming in as the type
    coming out, except for vectorizable_demotion, vectorizable_promotion, and
    vectorizable_conversion, which are the only places where you need to worry
    about different types. But it may not be a bad idea to distribute things
    differently, like you suggest. Anyhow, let's put in this functionality first
    as is, and then see how best (if at all) to reorganize things.

    If it is indeed special because it's a conversion then it should also
    handle
    the case of converting function calls

    I was thinking there's more in common to handling different function calls
    than between handling (conversion) tree-codes and (conversion) function
    calls. But, the more I think about it the less I'm sure of it. Also, I
    think that once we add the conversion between different sizes, it will be
    clearer how we want to organize things. Anyhow, like I said before, I think
    there's a lot in common to a lot of the vectorizable_* functions, and maybe
    even room to consider a single template, so when I get to doing that it
    will also sort out this issue.

    Tehila, can you prepare an updated patch with the other minor stuff
    I suggested and the TREE_CODE passed to the hook? Also the new target
    hook needs documenting in tm.texi.

    Thanks,
    Richard.
  • No.6 | | 1482 bytes | |

    On 1/27/07, Tehila Meyzels <TEHILA (AT) il (DOT) ibm.com> wrote:

    "Richard Guenther" <richard.guenther (AT) gmail (DOT) com> wrote on 25/01/2007
    18:56:56:
    Tehila, can you prepare an updated patch with the other minor stuff
    I suggested and the TREE_CODE passed to the hook? Also the new target
    hook needs documenting in tm.texi.

    Sure. I'll do that.
    Thanks a lot to both of you for your help.
    Tehila.

    I noticed in playing with the vectorized functions that

    + for (j = 0; j < ncopies; j++)
    + {
    + vec_oprnd0 = vect_get_vec_def_for_operand (op0, stmt, NULL);
    + params = build_tree_list (NULL_TREE, vec_oprnd0);
    +
    + builtin_decl =
    + (vectype_in);
    + new_stmt = build_function_call_expr (builtin_decl, params);
    +
    + /* Arguments are ready. create the new vector stmt. */
    + new_stmt = build2 (GIMPLE_MODIFY_STMT, void_type_node, vec_dest,
    + new_stmt);
    + new_temp = make_ssa_name (vec_dest, new_stmt);
    + GIMPLE_STMT_OPERAND (new_stmt, 0) = new_temp;
    + vect_finish_stmt_generation (stmt, new_stmt, bsi);
    +
    + if (j == 0)
    + STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
    + else
    + STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
    + prev_stmt_info = vinfo_for_stmt (new_stmt);
    + }

    will not work correctly for ncopies > 1 as you need to use
    vect_get_vec_def_for_stmt_copy ()
    for the function argument for j > 0. See vectorizable_reduction ().
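The problem can be pictured with a toy model, which is not GCC code: the struct and function below are hypothetical stand-ins. It assumes each vectorized operand definition links to its next copy, so statement copy 0 takes the first definition and every later copy must step to the following one, where the loop quoted above would fetch the first definition every time.

```c
#include <assert.h>
#include <stddef.h>

/* Toy vector definition: remembers which scalar index its first lane
   holds and points at the next copy of the same operand.  */
struct toy_vec_def
{
  int first_index;
  const struct toy_vec_def *next_copy;
};

/* For each of the ncopies statement copies, record which operand copy
   feeds it: the first definition for j == 0, the next copy after the
   previously used one for j > 0.  */
void
toy_pick_operands (const struct toy_vec_def *first, int ncopies,
                   int *first_index_out)
{
  const struct toy_vec_def *def = NULL;
  for (int j = 0; j < ncopies; j++)
    {
      def = (j == 0) ? first : def->next_copy;
      assert (def != NULL);   /* there must be one definition per copy */
      first_index_out[j] = def->first_index;
    }
}
```

With four chained definitions starting at scalar indices 0, 4, 8 and 12, the four statement copies pick up 0, 4, 8 and 12 in turn. Using `first` on every iteration would convert elements 0..3 four times.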

    Richard.
