Allow early sets of SSE hard registers from standard_sse_constant_p.

My previous patch, which was intended to reduce the differences seen by
the combination of -march=cascadelake and -m32, has additionally found
some more instances where this combination behaves differently to regular
x86_64-pc-linux-gnu.  The middle-end always, and backends usually, use
emit_move_insn to emit/expand move instructions, which gives the backend
control over placing things in constant pools, adding REG_EQUAL notes,
and so on.  Several of the AVX512 built-in expanders bypass this logic,
and instead generate moves directly using
emit_insn (gen_rtx_SET (dst, src)).

For example, i386-expand.c line 12004 contains:
      for (i = 0; i < 8; i++)
        emit_insn (gen_rtx_SET (xmm_regs[i], const0_rtx));

I suspect that in this case, where the source satisfies
standard_sse_constant_p, my change to require loading of likely-spilled
hard registers via a pseudo is overly strict, so this patch once again
allows these immediate constant values to be loaded directly prior to
reload.
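
As a rough illustration of the revised test, the following standalone
sketch models the condition in plain C.  It is not GCC code: the enum,
the struct and model_hardreg_mov_ok are hypothetical stand-ins for the
rtx predicates, and the final "return true" is an assumption about the
part of ix86_hardreg_mov_ok that this hunk does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kinds of source operand the real
   predicates distinguish; these are not GCC types.  */
enum src_kind
{
  SRC_REG,            /* REG_P (src) */
  SRC_MEM,            /* MEM_P (src) */
  SRC_INT_IMMEDIATE,  /* accepted by x86_64_immediate_operand */
  SRC_SSE_STD_CONST,  /* accepted by standard_sse_constant_p */
  SRC_OTHER           /* anything more complex */
};

struct move
{
  bool dst_is_hard_reg;           /* REG_P && HARD_REGISTER_P */
  bool dst_is_vector_mode;        /* VECTOR_MODE_P (GET_MODE (dst)) */
  bool dst_class_likely_spilled;  /* ix86_class_likely_spilled_p */
  enum src_kind src;
};

/* Model of the patched condition: a set of a likely-spilled hard
   register before reload is rejected unless the source is a register,
   a memory, or a constant that is "simple" for the destination mode.  */
static bool
model_hardreg_mov_ok (const struct move *m, bool reload_completed)
{
  /* Vector modes consult the SSE constant test; all other modes keep
     the integer immediate test.  */
  bool simple_const = m->dst_is_vector_mode
		      ? m->src == SRC_SSE_STD_CONST
		      : m->src == SRC_INT_IMMEDIATE;

  if (m->dst_is_hard_reg
      && m->src != SRC_REG && m->src != SRC_MEM
      && !simple_const
      && m->dst_class_likely_spilled
      && !reload_completed)
    return false;

  /* Assumption: nothing else in the real function rejects the move.  */
  return true;
}
```

In this model a vector-mode hard register set from a standard SSE
constant is accepted before reload, while a more complex source is
still rejected until reload has completed.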

2021-10-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	* config/i386/i386.c (ix86_hardreg_mov_ok): For vector modes,
	allow standard_sse_constant_p immediate constants.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index fb65609..9cc903e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19303,7 +19303,9 @@
   /* Avoid complex sets of likely_spilled hard registers before reload.  */
   if (REG_P (dst) && HARD_REGISTER_P (dst)
       && !REG_P (src) && !MEM_P (src)
-      && !x86_64_immediate_operand (src, GET_MODE (dst))
+      && !(VECTOR_MODE_P (GET_MODE (dst))
+	   ? standard_sse_constant_p (src, GET_MODE (dst))
+	   : x86_64_immediate_operand (src, GET_MODE (dst)))
       && ix86_class_likely_spilled_p (REGNO_REG_CLASS (REGNO (dst)))
       && !reload_completed)
     return false;