myIaaS augments

Hardware Specifications Comparison

Specification	Intel Xeon Phi 31S1P	Intel Xeon E5-2690 v2
Socket / Bus Type	PCIe 3.0 x16	FCLGA 2011
Cores	57	10
Threads	228 (4 per core)	20 (Hyper-Threading, 2 per core)
Base Clock Speed	1.1 GHz	3.0 GHz
Max Turbo Speed	N/A	3.6 GHz
Cache	28.5 MB (L2)	25 MB (L3)
TDP	270 W	130 W
Instruction Set
MMX	✔	✔
SSE	✔	✔
SSE2	✔	✔
SSE3	✔	✔
SSSE3	✔	✔
SSE4.1	✔	✔
SSE4.2	✔	✔
AVX	✔	✔
AVX-512	✔	✖
AVX2	✖	✔
FMA	✔	✔
Intel 64	✔	✔
BMI1/BMI2	✖	✔
CLMUL	✖	✔
RDRAND	✖	✔
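
Whether a particular machine exposes one of the extensions in the table can also be checked at run time. The following is a minimal sketch, assuming GCC or Clang on an x86 target, where the __builtin_cpu_supports builtin is available. IsaSupport, detect_isa_support, and print_isa_support are illustrative names, not part of any library.

```cpp
#include <iostream>

// Illustrative container for a few of the feature flags from the table.
struct IsaSupport {
    bool sse2;
    bool sse4_2;
    bool avx;
    bool avx2;
    bool fma;
    bool avx512f;
};

// Assumption: GCC or Clang on x86. __builtin_cpu_supports queries CPUID
// and requires a string literal as its argument.
IsaSupport detect_isa_support() {
    __builtin_cpu_init(); // safe to call more than once
    IsaSupport s;
    s.sse2    = __builtin_cpu_supports("sse2") != 0;
    s.sse4_2  = __builtin_cpu_supports("sse4.2") != 0;
    s.avx     = __builtin_cpu_supports("avx") != 0;
    s.avx2    = __builtin_cpu_supports("avx2") != 0;
    s.fma     = __builtin_cpu_supports("fma") != 0;
    s.avx512f = __builtin_cpu_supports("avx512f") != 0;
    return s;
}

// Print one line per feature, for example "AVX2: true".
void print_isa_support(const IsaSupport& s) {
    std::cout << std::boolalpha
              << "SSE2: " << s.sse2 << "\n"
              << "SSE4.2: " << s.sse4_2 << "\n"
              << "AVX: " << s.avx << "\n"
              << "AVX2: " << s.avx2 << "\n"
              << "FMA: " << s.fma << "\n"
              << "AVX-512F: " << s.avx512f << "\n";
}
```

Calling print_isa_support(detect_isa_support()) on each machine gives a quick way to compare what the compiler can safely target on it.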

ISA Examples in C++

The table above lists these ISAs as common to the Intel Xeon E5-2690 v2 and the Xeon Phi 31S1P. I will expand on them in more detail later on.

MMX

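MMX (MultiMedia eXtensions) provides packed-integer SIMD operations for multimedia workloads. It operates on 64-bit registers that alias the x87 floating-point registers, which is why _mm_empty() must be called before any later floating-point code.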


#include <mmintrin.h>

void mmx_example() {
    __m64 a = _mm_set_pi32(1, 2); // Set two integers
    __m64 b = _mm_set_pi32(3, 4);
    __m64 result = _mm_add_pi32(a, b); // Add the two
    _mm_empty(); // Clear MMX state
}

SSE

SSE (Streaming SIMD Extensions) adds 128-bit XMM registers and SIMD (single instruction, multiple data) operations on four packed single-precision floats.


#include <xmmintrin.h>

void sse_example() {
    __m128 a = _mm_set_ps(1.0f, 2.0f, 3.0f, 4.0f);
    __m128 b = _mm_set_ps(5.0f, 6.0f, 7.0f, 8.0f);
    __m128 result = _mm_add_ps(a, b); // Add the two
}
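
The same packed addition can be applied to whole arrays rather than to constants. The following is a minimal sketch, assuming unaligned input buffers and a scalar loop for the leftover elements; add_arrays_sse is an illustrative name, not a library function.

```cpp
#include <xmmintrin.h>
#include <cstddef>
#include <vector>

// Adds x and y element-wise, four floats per SSE instruction.
// If the lengths differ, only the common prefix is processed.
std::vector<float> add_arrays_sse(const std::vector<float>& x,
                                  const std::vector<float>& y) {
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    std::vector<float> out(n);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 vx = _mm_loadu_ps(&x[i]);            // unaligned load of 4 floats
        __m128 vy = _mm_loadu_ps(&y[i]);
        _mm_storeu_ps(&out[i], _mm_add_ps(vx, vy)); // write 4 sums back to memory
    }
    for (; i < n; ++i) {
        out[i] = x[i] + y[i];                       // scalar tail for the remainder
    }
    return out;
}
```

Storing the vector back to memory with _mm_storeu_ps is also the simplest way to inspect a result lane by lane.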

SSE2

SSE2 extends SSE with support for double-precision floating-point and integer data types.


#include <emmintrin.h>

void sse2_example() {
    __m128d a = _mm_set_pd(1.0, 2.0);
    __m128d b = _mm_set_pd(3.0, 4.0);
    __m128d result = _mm_add_pd(a, b); // Add the two
}

SSE3

SSE3 adds new instructions for complex arithmetic and horizontal operations.


#include <pmmintrin.h>

void sse3_example() {
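    __m128d a = _mm_set_pd(1.0, 2.0); // _mm_set_pd takes (high, low): high lane 1.0, low lane 2.0
    __m128d b = _mm_set_pd(3.0, 4.0);
    __m128d result = _mm_hadd_pd(a, b); // Horizontal add: low lane = 2.0 + 1.0, high lane = 4.0 + 3.0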
}

SSSE3

SSSE3 (Supplemental SSE3) adds packed-integer instructions such as byte shuffle, absolute value, and horizontal integer add.


#include <tmmintrin.h>

void ssse3_example() {
    // _mm_setr_epi8 takes 16 arguments, one per byte lane, lowest lane first
    __m128i a = _mm_setr_epi8(-1, 2, -3, 4, -5, 6, -7, 8, -9, 10, -11, 12, -13, 14, -15, 16);
    __m128i result = _mm_abs_epi8(a); // Absolute value of packed bytes (an SSSE3 instruction)
}

SSE4.1

SSE4.1 adds instructions for packed integer min/max, blending, dot products, rounding, and element insert/extract.


#include <smmintrin.h>

void sse4_1_example() {
    __m128i a = _mm_set_epi32(1, 2, 3, 4);
    __m128i b = _mm_set_epi32(5, 6, 7, 8);
    __m128i result = _mm_max_epi32(a, b); // Max of packed integers
}

SSE4.2

SSE4.2 adds string and text comparison instructions, CRC32, and a packed 64-bit integer compare.


#include <nmmintrin.h>

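void sse4_2_example() {
    const char s1[16] = "hello, world"; // 12 characters, zero-padded to 16 bytes
    const char s2[16] = "hello, wurld";
    __m128i a = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s1));
    __m128i b = _mm_loadu_si128(reinterpret_cast<const __m128i*>(s2));
    // Compare the 12 valid bytes position by position; for those positions,
    // bit i of the result mask is set where s1[i] == s2[i]
    __m128i result = _mm_cmpestrm(a, 12, b, 12, _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_EACH);
}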

AVX

AVX (Advanced Vector Extensions) extends the SIMD capabilities to 256 bits.


#include <immintrin.h>

void avx_example() {
    __m256 a = _mm256_set_ps(1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f);
    __m256 b = _mm256_set_ps(8.0f, 7.0f, 6.0f, 5.0f, 4.0f, 3.0f, 2.0f, 1.0f);
    __m256 result = _mm256_add_ps(a, b); // Add the two
}

FMA

FMA (Fused Multiply-Add) computes a * b + c in a single instruction with a single rounding step, which improves both performance and precision.


#include <immintrin.h>

void fma_example() {
    __m256 a = _mm256_set_ps(1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f);
    __m256 b = _mm256_set_ps(8.0f, 7.0f, 6.0f, 5.0f, 4.0f, 3.0f, 2.0f, 1.0f);
    __m256 c = _mm256_set1_ps(2.0f); // Set all elements to 2.0
    __m256 result = _mm256_fmadd_ps(a, b, c); // result = (a * b) + c
}
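
The precision benefit comes from rounding once instead of twice. The following is a minimal sketch of that effect using the scalar std::fma from the standard library rather than the vector intrinsic; mul_add_unfused and mul_add_fused are illustrative names, and the volatile intermediate is an assumption used here to keep the compiler from fusing the two steps itself.

```cpp
#include <cmath>

// Multiply, round to double, then add and round again (two roundings).
double mul_add_unfused(double a, double b, double c) {
    volatile double product = a * b; // forces the product to be rounded to double
    return product + c;
}

// std::fma forms a * b + c exactly and rounds only once.
double mul_add_fused(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-27, b = 1 - 2^-27, and c = -1, the exact product is 1 - 2^-54, which rounds to 1.0 as a double. The unfused version therefore returns 0, while the fused version returns -2^-54.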

Intel 64 / AMD64

Intel 64 and AMD64 are the two vendors' names for the same 64-bit extension of x86 (x86-64). It adds 64-bit general-purpose registers and addressing, and includes SSE and SSE2 as part of its baseline. Here's a simple example of using 64-bit integers.


#include <cstdint>

void amd64_example() {
    int64_t a = 9223372036854775806; // One below the maximum value for int64_t
    int64_t b = 1;
    int64_t result = a + b; // 9223372036854775807; adding 1 to the maximum itself would overflow, which is undefined behaviour
}
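
Signed 64-bit addition near the limits of int64_t can overflow, which is undefined behaviour in C++. The following is a minimal sketch of one way to guard against it, assuming a bool return value and an output parameter; checked_add is a hypothetical helper, not a standard function.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: writes a + b to out and returns true,
// or returns false (leaving out untouched) if the sum would not fit in int64_t.
bool checked_add(std::int64_t a, std::int64_t b, std::int64_t& out) {
    const std::int64_t max = std::numeric_limits<std::int64_t>::max();
    const std::int64_t min = std::numeric_limits<std::int64_t>::min();
    if (b > 0 && a > max - b) return false; // would overflow upwards
    if (b < 0 && a < min - b) return false; // would overflow downwards
    out = a + b;
    return true;
}
```

The bounds are tested before the addition is performed, so the overflowing sum is never actually computed.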

ISAs specific to each processor

Intel Xeon Phi 31S1P - AVX-512


#include <immintrin.h>
#include <cstdint>
#include <iostream>

void avx512_example() {
    __m512i a = _mm512_set1_epi32(2); // Set all 16 elements to 2
    __m512i b = _mm512_set1_epi32(3); // Set all 16 elements to 3
    __m512i result = _mm512_add_epi32(a, b); // Add the two
    int32_t res[16];
    _mm512_storeu_si512(res, result); // Copy the vector to memory instead of casting its address
    std::cout << "AVX-512 Result: ";
    for (int i = 0; i < 16; ++i) {
        std::cout << res[i] << " "; // Print results
    }
    std::cout << std::endl;
}

Intel Xeon E5-2690 v2

AVX2 Example


#include <immintrin.h>
#include <cstdint>
#include <iostream>

void avx2_example() {
    __m256i a = _mm256_set1_epi32(4); // Set all 8 elements to 4
    __m256i b = _mm256_set1_epi32(5); // Set all 8 elements to 5
    __m256i result = _mm256_add_epi32(a, b); // Add the two
    int32_t res[8];
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(res), result); // Copy the vector to memory
    std::cout << "AVX2 Result: ";
    for (int i = 0; i < 8; ++i) {
        std::cout << res[i] << " "; // Print results
    }
    std::cout << std::endl;
}

BMI Example


#include <immintrin.h>
#include <cstdint>
#include <iostream>

void bmi_example() {
    uint32_t a = 0b1100; // Example value (binary literals need C++14)
    uint32_t result = _blsi_u32(a); // Isolate the least significant set bit: 0b0100
    std::cout << "BMI Result: " << result << std::endl;
}
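
The BMI1 bit-manipulation instructions have short scalar equivalents, which can make their behaviour easier to see. The following is a minimal sketch for BLSI and BLSR in portable C++; blsi_reference and blsr_reference are illustrative names.

```cpp
#include <cstdint>

// Scalar equivalent of BLSI: keep only the lowest set bit of x.
std::uint32_t blsi_reference(std::uint32_t x) {
    return x & (~x + 1u); // ~x + 1u is the two's complement negation of x
}

// Scalar equivalent of BLSR: clear the lowest set bit of x.
std::uint32_t blsr_reference(std::uint32_t x) {
    return x & (x - 1u);
}
```

For the value 12 (binary 1100), blsi_reference returns 4 (binary 0100) and blsr_reference returns 8 (binary 1000).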

CLMUL Example


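#include <immintrin.h>
#include <cstdint>
#include <iostream>

void clmul_example() {
    uint64_t a = 0x1234567890abcdef;
    uint64_t b = 0xfedcba0987654321;
    // _mm_set_epi64x takes (high, low); selector 0x00 multiplies the two low 64-bit lanes
    __m128i product = _mm_clmulepi64_si128(_mm_set_epi64x(0, a), _mm_set_epi64x(0, b), 0x00);
    // The carry-less product is 128 bits wide, so print both halves
    uint64_t lo = static_cast<uint64_t>(_mm_cvtsi128_si64(product));
    uint64_t hi = static_cast<uint64_t>(_mm_extract_epi64(product, 1));
    std::cout << "CLMUL Result: hi=" << std::hex << hi << " lo=" << lo << std::endl;
}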
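
Carry-less multiplication is ordinary long multiplication with XOR in place of addition, which is the same as multiplying polynomials over GF(2). The following is a minimal scalar sketch of what the instruction computes for two 64-bit inputs; U128 and clmul_reference are illustrative names, and the loop is intended as a readable reference rather than a fast implementation.

```cpp
#include <cstdint>

// Illustrative 128-bit result split into two 64-bit halves.
struct U128 {
    std::uint64_t lo;
    std::uint64_t hi;
};

// For every set bit i of b, XOR (a << i) into the 128-bit accumulator.
U128 clmul_reference(std::uint64_t a, std::uint64_t b) {
    U128 r{0, 0};
    for (int i = 0; i < 64; ++i) {
        if ((b >> i) & 1u) {
            r.lo ^= a << i;          // bits that stay in the low half
            if (i != 0) {
                r.hi ^= a >> (64 - i); // bits shifted out into the high half
            }
        }
    }
    return r;
}
```

As a small check, clmul_reference(3, 3) gives 5: the polynomial (x + 1) squared is x^2 + 1 over GF(2), because the middle term 2x cancels.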

RDRAND Example


#include <immintrin.h>
#include <iostream>

void rdrand_example() {
    unsigned int random_value;
    if (_rdrand32_step(&random_value)) {
        std::cout << "RDRAND Result: " << random_value << std::endl;
    } else {
        std::cout << "RDRAND failed to generate a random number." << std::endl;
    }
}

Software Utilising Various ISAs

Software Name	Description	ISAs Utilised
FFmpeg	A multimedia framework for recording, converting, and streaming audio and video.	MMX, SSE, SSE2, AVX
GIMP	A powerful open-source image editor that supports various image formats.	SSE, SSE2
Blender	A 3D graphics and animation software that supports modelling, rendering, and animation.	SSE, AVX
OpenCV	A computer vision library that provides tools for image processing and machine learning.	SSE, AVX
Libav	A fork of FFmpeg that provides libraries and tools for handling multimedia data.	MMX, SSE, AVX
SciPy	A Python library used for scientific and technical computing, often leveraging optimised libraries.	SSE, AVX
TensorFlow	An open-source machine learning framework that can utilise various ISAs for performance.	AVX, FMA
GNU Octave	A high-level programming language primarily intended for numerical computations.	SSE, AVX
MPlayer	A media player that supports a wide range of audio and video formats.	MMX, SSE
Caffe	A deep learning framework that can utilise optimised libraries for performance.	AVX, FMA

Closed Source Software

Software Name	Description	ISAs Utilised
Adobe Photoshop	A leading image editing software used for photo editing, graphic design, and digital art.	SSE, SSE2, AVX
Microsoft Office	A suite of productivity applications including Word, Excel, and PowerPoint.	SSE, AVX
NVIDIA CUDA Toolkit	A parallel computing platform and application programming interface model created by NVIDIA.	SSE, AVX
Autodesk Maya	A 3D computer graphics application used for creating interactive 3D applications, including video games.	SSE, AVX
MATLAB	A high-level programming language and interactive environment for numerical computation, visualisation, and programming.	SSE, AVX
Unity	A cross-platform game engine used for developing video games and simulations for computers, consoles, and mobile devices.	SSE, AVX
Final Cut Pro	A professional video editing software developed by Apple for macOS.	SSE, AVX
CorelDRAW	A vector graphics editor used for graphic design, illustration, and layout.	SSE, AVX
CyberLink PowerDirector	A video editing software that provides tools for creating and editing videos.	SSE, AVX
VMware Workstation	A virtualisation software that allows users to run multiple operating systems on a single physical machine.	SSE, AVX
