x86/Intel: adjust representation of embedded rounding / SAE

MASM doesn't consider {sae} and alike a separate operand; it is attached
to the last register operand instead, just like spelled out by the SDM.
Make the disassembler follow this first, before also adjusting the
assembler (such that it'll be easy to see that the assembler change
doesn't alter generated code).
37 files changed