_mm_dp_pd

Microsoft Specific

Emits the Streaming SIMD Extensions 4 (SSE4.1) instruction dppd. This instruction computes the dot product of two pairs of double-precision floating-point values, under the control of an immediate mask.

__m128d _mm_dp_pd( 
   __m128d a,
   __m128d b,
   const int mask 
);

Parameters

  • [in] a
    A 128-bit parameter that contains two 64-bit floating point values.

  • [in] b
    A 128-bit parameter that contains two 64-bit floating point values.

  • [in] mask
    A compile-time constant. Bits 4-5 select which element pairs are multiplied, and bits 0-1 select which elements of the result receive the sum of the products.

Return value

A 128-bit value that contains two 64-bit floating-point results. Each result is either the dot product or +0.0, as selected by mask.

The result can be expressed with the following equations:

tmp0 := (mask4 == 1) ? (a0 * b0) : +0.0
tmp1 := (mask5 == 1) ? (a1 * b1) : +0.0
tmp2 := tmp0 + tmp1
r0 := (mask0 == 1) ? tmp2 : +0.0
r1 := (mask1 == 1) ? tmp2 : +0.0

Requirements

Intrinsic    Architecture
_mm_dp_pd    x86, x64

Header file <smmintrin.h>

Remarks

Bits 4-5 of the immediate mask select which pairs of source elements are multiplied: bit 4 controls a0 * b0 and bit 5 controls a1 * b1. Bits 0-1 select which result elements receive the sum of the products: bit 0 controls r0 and bit 1 controls r1. If a mask bit is 0, the corresponding product or result element is +0.0.
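As an illustration of the bit layout described above, the following sketch composes the immediate from four 0/1 flags. DP_PD_MASK and the three enumerator names are hypothetical helpers introduced here for illustration; they are not part of <smmintrin.h>.

```c
/* Hypothetical helper (not part of any header): builds the _mm_dp_pd
   immediate from four 0/1 flags.
   mul0 -> bit 4 (multiply a0 * b0)    mul1 -> bit 5 (multiply a1 * b1)
   dst0 -> bit 0 (write sum to r0)     dst1 -> bit 1 (write sum to r1) */
#define DP_PD_MASK(mul0, mul1, dst0, dst1)            \
    ((((mul1) & 1) << 5) | (((mul0) & 1) << 4) |      \
     (((dst1) & 1) << 1) | ((dst0) & 1))

/* A few illustrative masks; the names are assumptions of this sketch. */
enum {
    DP_PD_FULL_TO_LOW  = DP_PD_MASK(1, 1, 1, 0),  /* 0x31: full dot product in r0, r1 = +0.0 */
    DP_PD_FULL_TO_BOTH = DP_PD_MASK(1, 1, 1, 1),  /* 0x33: full dot product in r0 and r1     */
    DP_PD_LOW_ONLY     = DP_PD_MASK(1, 0, 1, 0)   /* 0x11: only a0 * b0, written to r0       */
};
```

Because the macro expands to an integer constant expression, a value such as DP_PD_FULL_TO_LOW should be usable wherever the intrinsic expects its constant mask argument.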

r0, a0, and b0 are the lowest 64 bits of return value r and parameters a and b, respectively. r1, a1, and b1 are the highest 64 bits of return value r and parameters a and b, respectively.

mask<i> denotes bit i of parameter mask, where bit 0 is the least significant bit. In the equations above it appears as mask0, mask1, mask4, and mask5.
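The equations in the Return value section can be modeled in plain scalar C, which may help when checking a mask by hand. This is a minimal sketch and not the intrinsic itself: pair_f64 and dp_pd_reference are hypothetical names, with lo and hi standing in for elements 0 and 1 of a __m128d.

```c
/* Hypothetical stand-in for __m128d: lo is element 0, hi is element 1. */
typedef struct { double lo, hi; } pair_f64;

/* Scalar model of the documented equations (illustrative only). */
static pair_f64 dp_pd_reference(pair_f64 a, pair_f64 b, int mask)
{
    /* Bits 4 and 5 select which products contribute to the sum. */
    double tmp0 = ((mask >> 4) & 1) ? a.lo * b.lo : 0.0;
    double tmp1 = ((mask >> 5) & 1) ? a.hi * b.hi : 0.0;
    double tmp2 = tmp0 + tmp1;

    /* Bits 0 and 1 select which result elements receive the sum. */
    pair_f64 r;
    r.lo = ((mask >> 0) & 1) ? tmp2 : 0.0;
    r.hi = ((mask >> 1) & 1) ? tmp2 : 0.0;
    return r;
}
```

With a = {1.5, 10.25}, b = {-1.5, 3.125}, and mask 0x31, this model yields lo = 29.78125 and hi = 0.0, matching the output of the example on this page.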

Before you use this intrinsic, make sure that the processor supports the dppd instruction, which is part of SSE4.1. Executing it on a processor without SSE4.1 support raises an invalid-opcode exception.
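One possible way to perform this check is to query CPUID leaf 1 and test ECX bit 19, which reports SSE4.1. The sketch below assumes the MSVC __cpuid intrinsic from <intrin.h>, with a fallback to __get_cpuid from <cpuid.h> on GCC or Clang. has_sse41 is a hypothetical name used only for illustration.

```c
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   /* __cpuid (MSVC) */
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>    /* __get_cpuid (GCC/Clang) */
#endif

/* Hypothetical helper: returns 1 if CPUID reports SSE4.1
   (leaf 1, ECX bit 19), otherwise 0. */
static int has_sse41(void)
{
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4];                 /* EAX, EBX, ECX, EDX */
    __cpuid(regs, 1);
    return (regs[2] >> 19) & 1;
#elif defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                /* leaf 1 not available */
    return (int)((ecx >> 19) & 1u);
#else
    return 0;                    /* not an x86 target */
#endif
}
```

A caller would typically test has_sse41() once at startup and fall back to a scalar code path when it returns 0.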

Example

#include <stdio.h>
#include <smmintrin.h>

int main ()
{
    __m128d a, b;
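    // Bits 4-5 set: multiply both element pairs.
    // Bit 0 set, bit 1 clear: write the sum to res[0] only; res[1] is +0.0.
    const int mask = 0x31;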

    a.m128d_f64[0] = 1.5;
    a.m128d_f64[1] = 10.25;
    b.m128d_f64[0] = -1.5;
    b.m128d_f64[1] = 3.125;

    __m128d res = _mm_dp_pd(a, b, mask);

    // %f is the conversion for double; the I64 size prefix applies only to integers.
    printf_s("Original a: %f\t%f\nOriginal b: %f\t%f\n",
                a.m128d_f64[0], a.m128d_f64[1], b.m128d_f64[0], b.m128d_f64[1]);
    printf_s("Result res: %f\t%f\n",
                res.m128d_f64[0], res.m128d_f64[1]);

    return 0;
}

Output

Original a: 1.500000    10.250000
Original b: -1.500000   3.125000
Result res: 29.781250   0.000000

See Also

Reference

Compiler Intrinsics