For the x86 div instruction, can someone please help me understand why the EAX register is not used when the dividend is a doubleword, and why it is instead split across the DX:AX registers?
Given here: https://c9x.me/x86/html/file_module_x86_id_72.html
Is it something to do with the sign?
Thanks!
It's historical. That's what the 16-bit 8086 did, because it had no 32-bit registers. And when the 32-bit 80386 was introduced, one of its major "features" was backward compatibility with the 8086: it could run 8086 code in its "real" or "virtual 8086" modes.
So the 80386 CPU had to support an instruction that did a 32-bit divide of a dividend split across two 16-bit registers, for use in real/v86 modes. And so it might as well use the same behavior in 32-bit protected mode, rather than use more transistors to implement an alternate form. Anyhow, it was generally part of Intel's philosophy to have as much similarity between the 386 and 8086 as possible, even if that led to some awkward choices that wouldn't have made sense for a generic 32-bit machine. (We're still paying for many of those choices, 35 years later.)
In 32-bit or 64-bit mode, rather than use this form, you may prefer to put your 32-bit dividend in eax, zero out edx, and do a 64->32 bit div (quadword/doubleword operand size). This also allows you a full 32 bits for the result, which then cannot overflow; whereas with a 32->16 bit div, you will get an exception if the result does not fit in 16 bits.
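To make the overflow difference concrete, here is a rough C model of the two forms. It is only a sketch of the documented semantics, not real assembly: the function names model_div16 and model_div32 are made up for illustration, and a false return stands in for the divide error (#DE) that the actual instruction would raise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of `div r/m16`: the dividend is the 32-bit value DX:AX,
   the quotient would land in AX and the remainder in DX.
   Returns false where the real instruction raises a divide error (#DE). */
bool model_div16(uint16_t dx, uint16_t ax, uint16_t divisor,
                 uint16_t *quot, uint16_t *rem)
{
    assert(quot && rem);
    if (divisor == 0)
        return false;                      /* divide by zero */
    uint32_t dividend = ((uint32_t)dx << 16) | ax;
    uint32_t q = dividend / divisor;
    if (q > UINT16_MAX)
        return false;                      /* quotient does not fit in AX */
    *quot = (uint16_t)q;
    *rem = (uint16_t)(dividend % divisor);
    return true;
}

/* Hypothetical model of `div r/m32`: the dividend is the 64-bit value EDX:EAX,
   the quotient would land in EAX and the remainder in EDX. */
bool model_div32(uint32_t edx, uint32_t eax, uint32_t divisor,
                 uint32_t *quot, uint32_t *rem)
{
    assert(quot && rem);
    if (divisor == 0)
        return false;                      /* divide by zero */
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    uint64_t q = dividend / divisor;
    if (q > UINT32_MAX)
        return false;                      /* quotient does not fit in EAX */
    *quot = (uint32_t)q;
    *rem = (uint32_t)(dividend % divisor);
    return true;
}
```

For example, dividing 0x10000 by 1 fails in the 16-bit model (model_div16(1, 0, 1, ...) returns false, since the quotient needs 17 bits), while model_div32(0, 0x10000, 1, ...) succeeds with quotient 0x10000. With edx zeroed, the only way the 32-bit model can fail is a zero divisor.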
I guess this is also one of the design choices that explain why Apple's ARM chips have such an edge over x86? (just an amateur guess)
@Alterecho Not really, no. The main reason is that they got their process down on the money and made a design with an extraordinary number of execution units and a wide frontend. Intel could do the same with more work.
@fuz Oh yes, I also forgot about the process. 14nm vs M1's 5nm. The on-silicon unified memory RAM! The fast swap with the fast SSD! I think AMD got quite close to the idea of a UMA architecture with its Smart Access Memory tech!