![Page 1: With a focus on floating point. For floating point (i.e., real numbers), MASM supports: real4 single precision; IEEE standard; analogous to float](https://reader036.vdocuments.us/reader036/viewer/2022062309/5697bfa01a28abf838c9543d/html5/thumbnails/1.jpg)
with a focus on floating point
For floating point (i.e., real numbers), MASM supports:
real4: single precision; IEEE standard; analogous to float
real8: double precision; IEEE standard; analogous to double
real10: double extended precision; not IEEE standard
NaN = Not a Number (see p. 4-14 of v1)
SSE2 supports 32 and 64 bit f.p. data.
x87 supports 32, 64, and 80 bit f.p. data.
Note: These are 24-bit binary numbers.
Here they are in base 10: 2.00000000000000 and 1.99999988079071
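The two decimal values can be reproduced from their bit patterns. The following is a minimal sketch in C, assuming the numbers are IEEE single precision values (a 24-bit significand: 23 stored fraction bits plus an implicit leading 1). The helper names `float_from_bits`, `two`, and `just_below_two` are made up for illustration and do not come from the slides.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret a 32-bit pattern as an IEEE single. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);   /* well-defined way to reinterpret bits in C */
    return f;
}

/* 0x40000000: exponent 1, fraction all zeros -> 1.000...0b x 2^1 = 2.0       */
float two(void)            { return float_from_bits(0x40000000u); }

/* 0x3FFFFFFF: exponent 0, fraction all ones  -> 1.111...1b x 2^0 = 2 - 2^-23 */
float just_below_two(void) { return float_from_bits(0x3FFFFFFFu); }
```

Printing `just_below_two()` with `printf("%.14f", ...)` is one way to compare it against the decimal value quoted above.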
SSE2 = Streaming SIMD Extensions 2
SIMD = Single Instruction Multiple Data instructions
SSE2 introduced in 2000 on Pentium 4 and Intel Xeon processors.
1996 Intel MMX
1998 AMD 3DNow!
1999 Intel SSE on P3
2000 Intel SSE2 on P4
2003 Intel SSE3 (since Prescott P4)
2006 Intel Supplemental SSE3 (since Woodcrest Xeons)
2006 Intel SSE4 (4.1 and 4.2)
2007 AMD SSE5 (proposed 2007, implemented 2011)
2008 Intel AVX (proposed 2008, implemented 2011 in Intel Sandy Bridge and AMD Bulldozer)
With AVX, the XMM registers go from 128 bit to 256 bit and are called YMM.
1. You must use MASM v6.15 or newer for SIMD support. (MASM v6.15 is available from the course software web page.)
2. You must enable MASM support for these instructions with the following:
.686                  ;instructions for Pentium Pro (or better)
.xmm                  ;allow simd instructions
.model flat, stdcall  ;no crazy segments!
Each one of the 8 128-bit registers (xmm0...xmm7) can hold:
16 packed byte (1 byte) integers
8 packed word (2 byte) integers
4 packed doubleword (4 byte) integers
2 packed quadword (8 byte) integers
1 double quadword (16 byte)
4 packed single precision (4 bytes each) floating point values
2 packed double precision (8 bytes each) floating point values
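One way to picture these formats is as different views of the same 16 bytes. The sketch below uses a plain C union for that purpose; the type name `xmm_view` and the function `lanes` are hypothetical and only illustrate the element counts listed above.

```c
#include <stdint.h>

/* Hypothetical type: one 128-bit XMM-sized value viewed in each packed format. */
typedef union {
    uint8_t  bytes[16];    /* 16 packed byte integers           */
    uint16_t words[8];     /*  8 packed word integers           */
    uint32_t dwords[4];    /*  4 packed doubleword integers     */
    uint64_t qwords[2];    /*  2 packed quadword integers       */
    float    singles[4];   /*  4 packed single precision values */
    double   doubles[2];   /*  2 packed double precision values */
} xmm_view;

/* Number of elements of the given size (in bytes) that fit in 128 bits. */
unsigned lanes(unsigned element_bytes)
{
    return (unsigned)sizeof(xmm_view) / element_bytes;
}
```

For example, `lanes(sizeof(double))` gives the number of double precision values per register.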
IA32 Registers:
8 32-bit GPRs: integer only
8 80-bit fp regs: floating point only
8 64-bit mmx regs: integer only; re-uses fp regs
8 128-bit xmm regs: integer and fp
8 128-bit xmm regs (integer and fp): these will be the focus of our discussion.
XMM register formats
The utilities.asm MASM code (on the course’s software web page) contains a function, dumpXmm64, that you can call to display (dump) the contents of the 8 xmm registers as pairs of 64-bit double precision fp values:
call dumpXmm64
1. Data movement
2. Arithmetic
3. Comparison
4. Conversion
movhpd - Move High Packed Double-Precision Floating-Point Value
movlpd - Move Low Packed Double-Precision Floating-Point Value
movsd - Move Scalar Double-Precision Floating-Point Value
movhpd - Move High Packed Double-Precision Floating-Point Value
for memory to XMM move:
DEST[127-64] ← SRC; DEST[63-0] unchanged
Ex. movhpd xmm0, m64
for XMM to memory move:
DEST ← SRC[127-64]
Ex. movhpd m64, xmm2
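The two movhpd forms can also be tried from C through the SSE2 intrinsics in `<emmintrin.h>`. This is a sketch rather than course code: `_mm_loadh_pd` and `_mm_storeh_pd` are the intrinsics that correspond to movhpd, although a compiler is free to pick a different but equivalent instruction. The wrapper names `load_high` and `store_high` are hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Memory to XMM: high half <- *src, low half of dest unchanged. */
__m128d load_high(__m128d dest, const double *src)
{
    return _mm_loadh_pd(dest, src);
}

/* XMM to memory: *dst <- high half of src. */
void store_high(double *dst, __m128d src)
{
    _mm_storeh_pd(dst, src);
}
```

Calling `load_high` on a register that holds two known values makes it easy to confirm that only the upper 64 bits change.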
movlpd - Move Low Packed Double-Precision Floating-Point Value
for memory to XMM move:
DEST[127-64] unchanged; DEST[63-0] ← SRC
Ex. movlpd xmm1, m64
for XMM to memory move:
DEST ← SRC[63-0]
Ex. movlpd m64, xmm2
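A matching sketch for movlpd, again through SSE2 intrinsics and under the same caveat that the compiler chooses the final instruction. `_mm_loadl_pd` and `_mm_storel_pd` are the corresponding intrinsics; `load_low` and `store_low` are hypothetical wrapper names.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Memory to XMM: low half <- *src, high half of dest unchanged. */
__m128d load_low(__m128d dest, const double *src)
{
    return _mm_loadl_pd(dest, src);
}

/* XMM to memory: *dst <- low half of src. */
void store_low(double *dst, __m128d src)
{
    _mm_storel_pd(dst, src);
}
```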
movsd - Move Scalar Double-Precision Floating-Point Value
1. when source and destination operands are both XMM registers:
DEST[127-64] remains unchanged; DEST[63-0] ← SRC[63-0]
Ex. movsd xmm1, xmm3
2. when source operand is XMM register and destination operand is memory location:
DEST ← SRC[63-0]
Ex. movsd m64, xmm2
3. when source operand is memory location and destination operand is XMM register:
DEST[127-64] ← 0000000000000000H; DEST[63-0] ← SRC
Ex. movsd xmm1, m64
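The three movsd cases differ mainly in what happens to the upper 64 bits, which the following C sketch makes observable. It assumes the SSE2 intrinsics `_mm_move_sd`, `_mm_store_sd`, and `_mm_load_sd` as stand-ins for the three forms; the wrapper names are hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Case 1, XMM to XMM: low half <- low half of src, high half of dest unchanged. */
__m128d move_scalar(__m128d dest, __m128d src)
{
    return _mm_move_sd(dest, src);
}

/* Case 2, XMM to memory: *dst <- low half of src. */
void store_scalar(double *dst, __m128d src)
{
    _mm_store_sd(dst, src);
}

/* Case 3, memory to XMM: low half <- *src, high half cleared to zero. */
__m128d load_scalar(const double *src)
{
    return _mm_load_sd(src);
}
```

Note that only the load from memory clears the upper half; the register-to-register move preserves it.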
1. Data movement
2. Arithmetic (scalar)
3. Comparison
4. Conversion
addsd - Add Scalar Double-Precision Floating-Point Values
subsd - Subtract Scalar Double-Precision Floating-Point Values
mulsd - Multiply Scalar Double-Precision Floating-Point Values
divsd - Divide Scalar Double-Precision Floating-Point Values
There is also sqrtsd, but there are no sin or cos SSE2 instructions! We have to use the x87 instructions for those!
addsd
DEST[63-0] ← DEST[63-0] + SRC[63-0]
DEST[127-64] remains unchanged
subsd
DEST[63-0] ← DEST[63-0] − SRC[63-0]
DEST[127-64] remains unchanged
mulsd
DEST[63-0] ← DEST[63-0] * SRC[63-0]
DEST[127-64] remains unchanged
divsd
DEST[63-0] ← DEST[63-0] / SRC[63-0]
DEST[127-64] remains unchanged
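The scalar arithmetic instructions all follow the same pattern: operate on the low 64 bits and pass the destination's high 64 bits through. A short C sketch, assuming the SSE2 intrinsics `_mm_add_sd`, `_mm_sub_sd`, `_mm_mul_sd`, and `_mm_div_sd` as counterparts of addsd, subsd, mulsd, and divsd; the wrapper names are hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Each wrapper computes on the low halves only; the result's high half
   is copied from dest, mirroring "DEST[127-64] remains unchanged". */
__m128d scalar_add(__m128d dest, __m128d src) { return _mm_add_sd(dest, src); }
__m128d scalar_sub(__m128d dest, __m128d src) { return _mm_sub_sd(dest, src); }
__m128d scalar_mul(__m128d dest, __m128d src) { return _mm_mul_sd(dest, src); }
__m128d scalar_div(__m128d dest, __m128d src) { return _mm_div_sd(dest, src); }
```

Giving the two high halves clearly different values (for example 100.0 and 200.0) makes it easy to see that the source's high half never reaches the result.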
1. Data movement
2. Arithmetic (packed)
3. Comparison
4. Conversion
addpd - Add Packed Double-Precision Floating-Point Values
subpd - Subtract Packed Double-Precision Floating-Point Values
mulpd - Multiply Packed Double-Precision Floating-Point Values
divpd - Divide Packed Double-Precision Floating-Point Values
addpd - Add Packed Double-Precision Floating-Point Values
DEST[63-0] ← DEST[63-0] + SRC[63-0]
DEST[127-64] ← DEST[127-64] + SRC[127-64]
subpd - Subtract Packed Double-Precision Floating-Point Values
DEST[63-0] ← DEST[63-0] − SRC[63-0]
DEST[127-64] ← DEST[127-64] − SRC[127-64]
mulpd - Multiply Packed Double-Precision Floating-Point Values
DEST[63-0] ← DEST[63-0] * SRC[63-0]
DEST[127-64] ← DEST[127-64] * SRC[127-64]
divpd - Divide Packed Double-Precision Floating-Point Values
DEST[63-0] ← DEST[63-0] / SRC[63-0]
DEST[127-64] ← DEST[127-64] / SRC[127-64]
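The packed forms apply the same operation to both 64-bit halves independently. A C sketch under the assumption that the SSE2 intrinsics `_mm_add_pd`, `_mm_sub_pd`, `_mm_mul_pd`, and `_mm_div_pd` stand in for addpd, subpd, mulpd, and divpd; the wrapper names are hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Each wrapper produces two results at once:
   low  half: dest.low  op src.low
   high half: dest.high op src.high */
__m128d packed_add(__m128d dest, __m128d src) { return _mm_add_pd(dest, src); }
__m128d packed_sub(__m128d dest, __m128d src) { return _mm_sub_pd(dest, src); }
__m128d packed_mul(__m128d dest, __m128d src) { return _mm_mul_pd(dest, src); }
__m128d packed_div(__m128d dest, __m128d src) { return _mm_div_pd(dest, src); }
```

With `_mm_set_pd(high, low)` the first argument is the high half, which is easy to get backwards when building test values.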
1. Data movement
2. Arithmetic
3. Comparison
4. Conversion
comisd - Compare Scalar Ordered Double-Precision Floating-Point Values and Set EFLAGS
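In assembly, comisd compares the low doubles of its two operands and reports the result in the ZF, PF, and CF flags, which a conditional jump then tests. From C the flags are not visible directly, so this sketch uses the SSE2 comparison intrinsics `_mm_comilt_sd`, `_mm_comieq_sd`, and `_mm_comigt_sd`, which return 1 or 0. The wrapper names are hypothetical, and NaN operands are deliberately left out of the example.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Each wrapper places a and b in the low halves of two XMM values
   and compares only those low halves. */
int less_than(double a, double b)
{
    return _mm_comilt_sd(_mm_set_sd(a), _mm_set_sd(b));
}

int equal_to(double a, double b)
{
    return _mm_comieq_sd(_mm_set_sd(a), _mm_set_sd(b));
}

int greater_than(double a, double b)
{
    return _mm_comigt_sd(_mm_set_sd(a), _mm_set_sd(b));
}
```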
1. Data movement
2. Arithmetic
3. Comparison
4. Conversion
cvtsd2si - Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer
cvtsi2sd - Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value
cvtsd2si - Convert Scalar Double-Precision Floating-Point Value to Doubleword Integer
DEST[31-0] ← Convert_Double_Precision_Floating_Point_To_Integer(SRC[63-0])
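The conversion rounds according to the rounding mode in the MXCSR register rather than truncating. The C sketch below uses the SSE2 intrinsic `_mm_cvtsd_si32` and assumes the rounding mode has been left at its default, round to nearest with ties going to the even integer. The wrapper name `to_int` is hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Convert the double x to a 32-bit integer using the current
   MXCSR rounding mode (assumed here to be the default,
   round to nearest, ties to even). */
int to_int(double x)
{
    return _mm_cvtsd_si32(_mm_set_sd(x));
}
```

Under that assumption, halfway cases such as 2.5 and 3.5 round to the nearest even integer, which differs from a C cast such as `(int)x`, which truncates toward zero.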
cvtsi2sd - Convert Doubleword Integer to Scalar Double-Precision Floating-Point Value
DEST[63-0] ← Convert_Integer_To_Double_Precision_Floating_Point(SRC[31-0])
DEST[127-64] remains unchanged
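The integer-to-double direction can be sketched the same way, assuming the SSE2 intrinsic `_mm_cvtsi32_sd` as the counterpart of cvtsi2sd. It takes the existing register value as its first argument precisely because the high half is carried over. The wrapper name `from_int` is hypothetical.

```c
#include <emmintrin.h>   /* SSE2 intrinsics */

/* Low half <- (double)n; high half of dest is passed through unchanged. */
__m128d from_int(__m128d dest, int n)
{
    return _mm_cvtsi32_sd(dest, n);
}
```

Every 32-bit integer is exactly representable as a double, so this conversion does not round.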