x86: fold FMA VEX and EVEX templates Following the folding of some generic AVX/AVX2 templates with their AVX512F counterpart ones, do this for FMA ones as well, requiring one further adjustment to cpu_flags_match().