target/120234 - adjust vector construction costs The following allows vectorizing the gcc.target/i386/pr111023*.c testcases again with -m32 -msse2 by ensuring we see through a cast when looking for vector extract sources during costing of vector construction. This, together with the forwprop fix fixes the regression on those testcases. PR target/120234 * config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): For constructor elements always look through a conversion. Rewrite load and vector extraction matching to be more obvious. Allow arbitrary conversions from the vector extract to elide costing of a gpr<->xmm move.
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 52f8218..acedc73 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc
@@ -26427,26 +26427,29 @@ TREE_VISITED (op) = 1; gimple *def = SSA_NAME_DEF_STMT (op); tree tem; + /* Look through a conversion. */ if (is_gimple_assign (def) && CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (def)) && ((tem = gimple_assign_rhs1 (def)), true) - && TREE_CODE (tem) == SSA_NAME - /* A sign-change expands to nothing. */ - && tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (def)), - TREE_TYPE (tem))) + && TREE_CODE (tem) == SSA_NAME) def = SSA_NAME_DEF_STMT (tem); - /* When the component is loaded from memory we can directly - move it to a vector register, otherwise we have to go - via a GPR or via vpinsr which involves similar cost. - Likewise with a BIT_FIELD_REF extracting from a vector - register we can hope to avoid using a GPR. */ - if (!is_gimple_assign (def) - || ((!gimple_assign_load_p (def) - || (!TARGET_SSE4_1 - && GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) == 1)) - && (gimple_assign_rhs_code (def) != BIT_FIELD_REF - || !VECTOR_TYPE_P (TREE_TYPE - (TREE_OPERAND (gimple_assign_rhs1 (def), 0)))))) + /* When the component is loaded from memory without sign- + or zero-extension we can move it to a vector register and/or + insert it via vpinsr with a memory operand. */ + if (gimple_assign_load_p (def) + && tree_nop_conversion_p (TREE_TYPE (op), + TREE_TYPE (gimple_assign_lhs (def))) + && (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op))) > 1 + || TARGET_SSE4_1)) + ; + /* When the component is extracted from a vector it is already + in a vector register. */ + else if (is_gimple_assign (def) + && gimple_assign_rhs_code (def) == BIT_FIELD_REF + && VECTOR_TYPE_P (TREE_TYPE + (TREE_OPERAND (gimple_assign_rhs1 (def), 0)))) + ; + else { if (fp) {