Releases · ermig1979/Simd

03 Mar 12:58

ermig1979

v4.9.111

76bb6f2

Simd v4.9.111

Algorithms

New features

AVX2, AVX-512BW optimizations of ResizerByteBicubic class.
SSE4.1, AVX2, AVX-512BW, NEON optimizations of function Base64Decode.
NEON optimizations of function SynetSwish32f.
Swish activation function to NEON optimizations of SynetConvolution32f framework.
Swish activation function to NEON optimizations of SynetDeconvolution32f framework.
Swish activation function to NEON optimizations of SynetMergedConvolution32f framework.
Swish activation function to NEON optimizations of SynetConvolution8i framework.
Swish activation function to NEON optimizations of SynetMergedConvolution8i framework.
NEON optimizations of function Yuv444pToBgraV2.
SSE2, AVX2, AVX-512BW, NEON optimizations of function Yuv420pToBgraV2.

Improving

SSE4.1 optimizations of ResizerByteBicubic class.

Bug fixing

Compiler error in NEON optimizations of function AlphaUnpremultiply.
MSVS Compiler warnings in SSE4.1, AVX2, AVX-512BW optimizations of function TransformImage.


03 Mar 12:53

ermig1979

v4.9.110

76bb6f2

Simd v4.9.110

Algorithms

New features

Base implementation, SSE4.1 optimizations of ResizerByteBicubic class.
Base implementation of function BgraToYuv444pV2.
Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function Nv12SaveAsJpegToMemory.
Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function Yuv420pSaveAsJpegToMemory.
Base implementation of function BgraToYuv420pV2.

Bug fixing

Error in SSE4.1, AVX2, AVX-512BW optimizations of function BgraToRgba.
Error in SSE4.1, AVX2 optimizations of function BgraToBgr.
Error in SSE4.1, AVX2 optimizations of function BgraToRgb.
Error in Base implementation, SSE4.1, AVX2, AVX-512BW, NEON optimizations of function AlphaUnpremultiply.

Test framework

New features

Tests for verifying functionality of function BgraToYuv444pV2.
Tests for verifying functionality of function Nv12SaveAsJpegToMemory.
Tests for verifying functionality of function Yuv420pSaveAsJpegToMemory.
Tests for verifying functionality of function BgraToYuv420pV2.


03 Jan 07:51

ermig1979

v4.9.109

1375fc2

Simd v4.9.109

Algorithms

New features

New parameter in function Uyvy422ToBgr.
SSE4.1, AVX2 optimizations of function Uyvy422ToBgr.
Base implementation, SSE4.1, AVX2 optimizations of function Uyvy422ToYuv420p.
Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function Base64Encode.
Base implementation of function Base64Decode.
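
For readers unfamiliar with what a scalar ("Base") Base64 encoder does, here is a minimal sketch assuming the standard RFC 4648 alphabet with '=' padding. The name Base64EncodeSketch and its signature are hypothetical and are not taken from the library; the library's own implementation and API may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Scalar reference Base64 encoder (standard alphabet, '=' padding).
// Hypothetical helper for illustration; not the library's code.
std::string Base64EncodeSketch(const uint8_t* src, size_t size)
{
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string dst;
    dst.reserve((size + 2) / 3 * 4);
    size_t i = 0;
    // Each full group of 3 input bytes becomes 4 output characters.
    for (; i + 3 <= size; i += 3)
    {
        uint32_t v = (uint32_t(src[i]) << 16) | (uint32_t(src[i + 1]) << 8) | src[i + 2];
        dst += kAlphabet[(v >> 18) & 63];
        dst += kAlphabet[(v >> 12) & 63];
        dst += kAlphabet[(v >> 6) & 63];
        dst += kAlphabet[v & 63];
    }
    // A tail of 1 or 2 bytes is zero-extended and padded with '='.
    if (size - i == 1)
    {
        uint32_t v = uint32_t(src[i]) << 16;
        dst += kAlphabet[(v >> 18) & 63];
        dst += kAlphabet[(v >> 12) & 63];
        dst += "==";
    }
    else if (size - i == 2)
    {
        uint32_t v = (uint32_t(src[i]) << 16) | (uint32_t(src[i + 1]) << 8);
        dst += kAlphabet[(v >> 18) & 63];
        dst += kAlphabet[(v >> 12) & 63];
        dst += kAlphabet[(v >> 6) & 63];
        dst += '=';
    }
    return dst;
}
```

The 3-byte-to-4-character grouping is what makes the algorithm a natural fit for vectorization: every group is independent of the others.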

Improving

AVX2 optimizations of class ResizerNearest for Bgr24, Uv16.

Renaming

Function UyvyToBgr to Uyvy422ToBgr.

Test framework

New features

Tests for verifying functionality of function Uyvy422ToYuv420p.
Tests for verifying functionality of function Base64Encode.
Tests for verifying functionality of function Base64Decode.

Documentation

Changes

Update developers list.


01 Dec 11:28

ermig1979

v4.9.108

b66cf06

Simd v4.9.108

Algorithms

New features

SSE4.1, AVX2, AVX-512F, AVX-512BW optimizations of class ResizerNearest.
Add SimdResizeMethodNearestPytorch to SimdResizeMethodType enumeration.
Add parameter BackgroundStatUpdateTime to Motion Detector.
MotionDetector performance optimization (case of falling star).
16-bit UYVY image format in View.
Base implementation of function UyvyToBgr.
Base implementation, SSE2, AVX2, AVX-512F optimizations of function SynetSwish32f.
SimdConvolutionActivationSwish item of SimdConvolutionActivationType enumeration.
Swish activation function to Base implementation, SSE2, AVX2, AVX-512F optimizations of SynetConvolution32f framework.
Swish activation function to Base implementation, SSE2, AVX2, AVX-512F optimizations of SynetDeconvolution32f framework.
Swish activation function to Base implementation, SSE2, AVX2, AVX-512F optimizations of SynetMergedConvolution32f framework.
Swish activation function to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of SynetConvolution8i framework.
Swish activation function to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of SynetMergedConvolution8i framework.
SimdYuvType enumeration.
Base implementation, SSE2, AVX2, AVX-512BW optimizations of function Yuv444pToBgraV2.
Function Simd::Resize supports images with 16-bit channel size.
Base implementation of function Yuv420pToBgraV2.
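
As an illustration of the Swish activation mentioned above, here is a minimal scalar sketch assuming the common definition swish(x) = x / (1 + exp(-slope * x)). The name SwishSketch and the parameter order are hypothetical; the exact formula and signature used by SynetSwish32f should be checked against the library documentation.

```cpp
#include <cmath>
#include <cstddef>

// Scalar Swish under the assumed definition:
// swish(x) = x * sigmoid(slope * x) = x / (1 + exp(-slope * x)).
// Hypothetical helper for illustration; not the library's code.
void SwishSketch(const float* src, size_t size, float slope, float* dst)
{
    for (size_t i = 0; i < size; ++i)
        dst[i] = src[i] / (1.0f + std::exp(-slope * src[i]));
}
```

For large positive inputs the result approaches x, and for large negative inputs it approaches zero from below.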

Improving

Refactoring of SimdResizeMethodType enumeration.

Bug fixing

Stack corruption in function Simd::Avx2::JpegWriteBlockSubs.

Test framework

New features

Tests for verifying functionality of function UyvyToBgr.
Tests for verifying functionality of function SynetSwish32f.
Tests for verifying functionality of function Yuv444pToBgraV2.
Tests for verifying functionality of function Yuv420pToBgraV2.

Infrastructure

Bug fixing

Wrong compiler options in CMake.


01 Nov 08:33

ermig1979

v4.9.107

5e0cacf

Simd v4.9.107

Algorithms

New features

Internal class Holder to replace std::unique_ptr for old compilers without support of C++11 standard.
SimdBayerLayoutType enumeration.
Base implementation of class ResizerNearest.
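
To make the idea of nearest-neighbour resizing concrete, here is a minimal sketch for a single-channel 8-bit image. It assumes the source index is floor(dst_index * src_size / dst_size); libraries differ in this rounding convention, so this is not necessarily the mapping ResizerNearest uses. The name ResizeNearestSketch is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nearest-neighbour resize of a row-major, single-channel 8-bit image.
// Assumed index mapping: src = floor(dst * srcSize / dstSize).
// Hypothetical helper for illustration; not the library's code.
std::vector<uint8_t> ResizeNearestSketch(const std::vector<uint8_t>& src,
    size_t srcW, size_t srcH, size_t dstW, size_t dstH)
{
    std::vector<uint8_t> dst(dstW * dstH);
    for (size_t y = 0; y < dstH; ++y)
    {
        // Pick the source row once per destination row.
        size_t sy = std::min(y * srcH / dstH, srcH - 1);
        for (size_t x = 0; x < dstW; ++x)
        {
            size_t sx = std::min(x * srcW / dstW, srcW - 1);
            dst[y * dstW + x] = src[sy * srcW + sx];
        }
    }
    return dst;
}
```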

Bug fixing

Compiler error when defined macro SIMD_SSE2_DISABLE.
Compiler error when defined macro SIMD_NEON_DISABLE.

Infrastructure

New features

SIMD_ROOT CMake parameter.


01 Oct 09:51

ermig1979

v4.9.106

1b25f67

Simd v4.9.106

Algorithms

New features

Base implementation, SSE2, AVX, AVX-512F, NEON optimizations of function SynetHardSigmoid32f.
SimdConvolutionActivationHardSigmoid item of SimdConvolutionActivationType enumeration.
HardSigmoid activation function to Base implementation, SSE2, AVX, AVX2, AVX-512F, NEON optimizations of SynetConvolution32f framework.
HardSigmoid activation function to Base implementation, SSE2, AVX, AVX2, AVX-512F, NEON optimizations of SynetDeconvolution32f framework.
HardSigmoid activation function to Base implementation, SSE2, AVX, AVX2, AVX-512F, NEON optimizations of SynetMergedConvolution32f framework.
NEON optimizations of SynetMergedConvolution32fDc class.
NEON optimizations of SynetMergedConvolution32fCd class.
NEON optimizations of SynetInnerProduct32fGemm class.
NEON optimizations of SynetInnerProduct32fProd class.
HardSigmoid activation function to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI, NEON optimizations of SynetConvolution8i framework.
HardSigmoid activation function to Base implementation, SSE4.1, AVX2, AVX-512BW, AVX-512VNNI optimizations of SynetMergedConvolution8i framework.
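
As an illustration of the HardSigmoid activation listed above, here is a minimal scalar sketch assuming the usual piecewise-linear definition clamp(scale * x + shift, 0, 1). The name HardSigmoidSketch and the parameter order are hypothetical; the precise signature of SynetHardSigmoid32f should be taken from the library documentation.

```cpp
#include <algorithm>
#include <cstddef>

// Scalar HardSigmoid under the assumed definition:
// hard_sigmoid(x) = max(0, min(1, scale * x + shift)).
// Hypothetical helper for illustration; not the library's code.
void HardSigmoidSketch(const float* src, size_t size, float scale, float shift, float* dst)
{
    for (size_t i = 0; i < size; ++i)
        dst[i] = std::max(0.0f, std::min(1.0f, scale * src[i] + shift));
}
```

Because it needs only a multiply-add and two clamps, this function is much cheaper than the exponential-based sigmoid.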

Bug fixing

Compiler error in file SimdInit.h (Clang, Windows).

Removing

Remove including SimdConfig.h in SimdLib.h.

Test framework

New features

Tests for verifying functionality of function SynetHardSigmoid32f.
'-pi' test parameter (to print internal performance statistics of Simd Library to console).


13 Sep 12:37

ermig1979

v4.9.105

b93367c

Simd v4.9.105

Algorithms

New features

AVX2 optimizations of function TransformImage (case of Gray8, Uv16, Bgr24 for Rotate180, TransposeRotate90).
Method Frame::Clone with region parameter.
Method View::Clone with region parameter.
AVX2 optimizations of function TransformImage (case of Gray8, Uv16, Bgr24, Bgra32 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
AVX-512BW optimizations of function TransformImage (case of Gray8, Uv16, Bgra32 for Rotate180, TransposeRotate90).
AVX-512BW optimizations of function TransformImage (case of Bgra32 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
AVX-512BW optimizations of function TransformImage (case of Uv16 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
AVX-512BW optimizations of function TransformImage (case of Gray8 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
Base implementation, SSE2, AVX2, AVX-512BW, NEON optimizations of function AlphaBlendingUniform.
AVX-512BW optimizations of function TransformImage (case of Bgr24 for Rotate180, TransposeRotate90, Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
Resize function (with size parameter).
Move constructor of View structure.
Move operator of View structure.
Clear method of Frame structure.
Swap method of Frame structure.
Move constructor of Frame structure.
Move operator of Frame structure.

Tests

New features

Tests for verifying functionality of function AlphaBlendingUniform.


03 Aug 09:06

ermig1979

v4.9.104

4009dfa

Simd v4.9.104

Algorithms

New features

Rgba32 format in Frame structure.
Rgba32 format in Convert function (for frames).
SSE4.1 optimizations of function Float32ToFloat16.
SSE4.1 optimizations of function Float16ToFloat32.
AVX2 optimizations of function TransformImage (case of Bgra32 for Rotate180, TransposeRotate90).

Improving

SSE2, AVX, AVX2, AVX-512F and NEON optimizations of class SynetConvolution32fNhwcDirect (case of fixed kernels).
Reduced compilation time and binary size of class SynetConvolution32f.
Reduced compilation time and binary size of class SynetDeconvolution32f.
Reduced compilation time and binary size of class SynetMergedConvolution32f.
Reduced compilation time and binary size of class SynetConvolution8i.
Reduced compilation time and binary size of class SynetMergedConvolution8i.
SSE4.1 optimizations of function TransformImage (case of Bgr24, Bgra32 for Rotate90, Rotate270, TransposeRotate180).
SSE4.1 optimizations of function TransformImage (case of Uv16 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).
SSE4.1 optimizations of function TransformImage (case of Gray8 for Rotate90, Rotate270, TransposeRotate0, TransposeRotate180).

Bug fixing

Compiler error in file SimdAvx512bwResizer.cpp (GCC 5.4.0).
Compiler error in file SimdAvx512bwBgraToBgr.cpp (MSVS-2017).
Compiler error in file SimdInit.h (Clang, Windows).
Error in AVX2 and AVX-512BW optimizations of functions CosineDistancesMxNa16f and CosineDistancesMxNp16f (functions may return small negative values).
Error in function Base::DetectionLoadA (it throws an exception instead of returning NULL).
Error in SSE2, AVX, AVX2, AVX-512F and NEON optimizations of class SynetDeconvolution32fNhwcDirect2x2.
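
The "small negative values" issue noted above for the cosine distance functions is a typical floating-point effect: for nearly parallel vectors, 1 - cos can round to slightly below zero. Here is a minimal sketch of one common remedy, clamping the result at zero. It works on 32-bit floats rather than the 16-bit floats of the functions named above, and it is not claimed to be the fix the library applied. The name CosineDistanceSketch is hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>

// Cosine distance 1 - (a.b) / (|a| * |b|), clamped at zero so that
// rounding error cannot yield a negative distance.
// Assumes both vectors are non-zero. Hypothetical helper for illustration.
float CosineDistanceSketch(const float* a, const float* b, size_t size)
{
    float ab = 0.0f, aa = 0.0f, bb = 0.0f;
    for (size_t i = 0; i < size; ++i)
    {
        ab += a[i] * b[i];
        aa += a[i] * a[i];
        bb += b[i] * b[i];
    }
    float distance = 1.0f - ab / std::sqrt(aa * bb);
    return std::max(0.0f, distance);
}
```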

Replacing

Replace SSE3 optimizations with SSE4.1 for function Gemm32fNT.
Replace SSE3 optimizations with SSE4.1 for function SynetConvolution32fInit.
Replace SSE3 optimizations with SSE4.1 for function NeuralAddConvolution2x2Sum.
Replace SSE3 optimizations with SSE4.1 for function NeuralAddConvolution3x3Sum.
Replace SSE3 optimizations with SSE4.1 for function NeuralAddConvolution4x4Sum.
Replace SSE3 optimizations with SSE4.1 for function NeuralAddConvolution5x5Sum.
Replace SSE3 optimizations with SSE4.1 for function NeuralConvolutionForward.
Replace SSE4.2 optimizations with SSE4.1 for function Crc32c.
Replace SSSE3 optimizations with SSE4.1 for function AlphaBlending.
Replace SSSE3 optimizations with SSE4.1 for function AlphaFilling.
Replace SSSE3 optimizations with SSE4.1 for function AlphaPremultiply.
Replace SSSE3 optimizations with SSE4.1 for function BayerToBgr.
Replace SSSE3 optimizations with SSE4.1 for function BgraToBayer.
Replace SSSE3 optimizations with SSE4.1 for function BgraToBgr.
Replace SSSE3 optimizations with SSE4.1 for function BgraToRgb.
Replace SSSE3 optimizations with SSE4.1 for function BgraToRgba.
Replace SSSE3 optimizations with SSE4.1 for function BgraToYuv420p.
Replace SSSE3 optimizations with SSE4.1 for function BgraToYuv422p.
Replace SSSE3 optimizations with SSE4.1 for function BgraToYuva420p.
Replace SSSE3 optimizations with SSE4.1 for function BgrToBayer.
Replace SSSE3 optimizations with SSE4.1 for function BgrToBgra.
Replace SSSE3 optimizations with SSE4.1 for function RgbToBgra.
Replace SSSE3 optimizations with SSE4.1 for function BgrToGray.
Replace SSSE3 optimizations with SSE4.1 for function RgbToGray.
Replace SSSE3 optimizations with SSE4.1 for function BgrToRgb.
Replace SSSE3 optimizations with SSE4.1 for function TransformImage.
Replace SSSE3 optimizations with SSE4.1 for function BgrToYuv420p.
Replace SSSE3 optimizations with SSE4.1 for function BgrToYuv422p.
Replace SSSE3 optimizations with SSE4.1 for function BgrToYuv444p.
Replace SSSE3 optimizations with SSE4.1 for function DeinterleaveBgr.
Replace SSSE3 optimizations with SSE4.1 for function DeinterleaveBgra.
Replace SSSE3 optimizations with SSE4.1 for function GaussianBlur3x3.
Replace SSSE3 optimizations with SSE4.1 for function GrayToBgr.
Replace SSSE3 optimizations with SSE4.1 for function InterleaveBgr.
Replace SSSE3 optimizations with SSE4.1 for function InterleaveBgra.
Replace SSSE3 optimizations with SSE4.1 for function Yuv420pToBgr.
Replace SSSE3 optimizations with SSE4.1 for function Yuv422pToBgr.
Replace SSSE3 optimizations with SSE4.1 for function Yuv444pToBgr.
Replace SSSE3 optimizations with SSE4.1 for function Yuv420pToRgb.
Replace SSSE3 optimizations with SSE4.1 for function Yuv422pToRgb.
Replace SSSE3 optimizations with SSE4.1 for function Yuv444pToRgb.
Replace SSSE3 optimizations with SSE4.1 for function Laplace.
Replace SSSE3 optimizations with SSE4.1 for function LaplaceAbs.
Replace SSSE3 optimizations with SSE4.1 for function LaplaceAbsSum.
Replace SSSE3 optimizations with SSE4.1 for function MeanFilter3x3.
Replace SSSE3 optimizations with SSE4.1 for function ReduceColor2x2.
Replace SSSE3 optimizations with SSE4.1 for function ReduceGray2x2.
Replace SSSE3 optimizations with SSE4.1 for function ReduceGray4x4.
Replace SSSE3 optimizations with SSE4.1 for function Reorder16bit.
Replace SSSE3 optimizations with SSE4.1 for function Reorder32bit.
Replace SSSE3 optimizations with SSE4.1 for function Reorder64bit.
Replace SSSE3 optimizations with SSE4.1 for function ResizeBilinear.
Replace SSSE3 optimizations with SSE4.1 for function SobelDx.
Replace SSSE3 optimizations with SSE4.1 for function SobelDxAbs.
Replace SSSE3 optimizations with SSE4.1 for function SobelDxAbsSum.
Replace SSSE3 optimizations with SSE4.1 for function SobelDy.
Replace SSSE3 optimizations with SSE4.1 for function SobelDyAbs.
Replace SSSE3 optimizations with SSE4.1 for function SobelDyAbsSum.
Replace SSSE3 optimizations with SSE4.1 for function ContourMetrics.
Replace SSSE3 optimizations with SSE4.1 for function ContourMetricsMasked.
Replace SSSE3 optimizations with SSE4.1 for function SquaredDifferenceSum.
Replace SSSE3 optimizations with SSE4.1 for function SquaredDifferenceSumMasked.
Replace SSSE3 optimizations with SSE4.1 for function TextureBoostedSaturatedGradient.
Replace SSSE3 optimizations with SSE4.1 for class ResizerByteBilinear.

Tests

New features

Colorized annotation in console logging.

Improving

Performance report generation to text file.
Thread ID annotation in console logging.

Infrastructure

New features

SIMD_INT8_DEBUG CMake option.

Removing

Separate support of SSE3 extension (it has been moved into SSE4.1).
Separate support of SSE4.2 extension (it has been moved into SSE4.1).
Separate support of SSSE3 extension (it has been moved into SSE4.1).


01 Jul 14:28

ermig1979

v4.8.103

1a5ee02

Simd v4.8.103

Algorithms

New features

Base implementation, SSE4.1, AVX2, AVX-512BW and NEON optimizations of class ResizerShortBilinear.
Base implementation, AVX2, AVX-512BW and NEON optimizations of function VectorNormNa16f.
Base implementation, AVX2, AVX-512BW and NEON optimizations of function VectorNormNp16f.
Parameter of ROI mask in Motion::Model.
SSE2, AVX-512BW and NEON optimizations of function AbsDifference.
NEON optimizations of function AlphaUnpremultiply.
NEON optimizations of function AlphaPremultiply.
NEON optimizations of function ValueSquareSums.

Improving

Performance of SSE4.1, AVX, AVX2, AVX-512F optimizations of SynetInnerProduct32fGemm class.

Bug fixing

Linker warning in file SimdImageLoad.h (MSVS).

Replacing

Replace SSE optimizations with SSE2 for function SvmSumLinear.
Replace SSE optimizations with SSE2 for function Fill32f.
Replace SSE optimizations with SSE2 for function CosineDistance32f.
Replace SSE optimizations with SSE2 for function DifferenceSum32f.
Replace SSE optimizations with SSE2 for function SquaredDifferenceKahanSum32f.
Replace SSE optimizations with SSE2 for function HogDeinterleave.
Replace SSE optimizations with SSE2 for function HogFilterSeparable.
Replace SSE optimizations with SSE2 for class ResizerFloatBilinear.
Replace SSE optimizations with SSE2 for function NeuralAddVectorMultipliedByValue.
Replace SSE optimizations with SSE2 for function NeuralAddVector.
Replace SSE optimizations with SSE2 for function NeuralAdaptiveGradientUpdate.
Replace SSE optimizations with SSE2 for function NeuralDerivativeRelu.
Replace SSE optimizations with SSE2 for function NeuralDerivativeSigmoid.
Replace SSE optimizations with SSE2 for function NeuralDerivativeTanh.
Replace SSE optimizations with SSE2 for function NeuralRoughSigmoid.
Replace SSE optimizations with SSE2 for function NeuralRoughSigmoid2.
Replace SSE optimizations with SSE2 for function NeuralRoughTanh.
Replace SSE optimizations with SSE2 for function NeuralUpdateWeights.
Replace SSE optimizations with SSE2 for function NeuralPooling1x1Max3x3.
Replace SSE optimizations with SSE2 for function NeuralPooling2x2Max2x2.
Replace SSE optimizations with SSE2 for function NeuralPooling2x2Max3x3.
Replace SSE optimizations with SSE2 for function SynetPoolingForwardAverage.
Replace SSE optimizations with SSE2 for function SynetPoolingForwardMax32f.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution2x2Forward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution3x3Forward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution4x4Forward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution5x5Forward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution2x2Backward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution3x3Backward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution4x4Backward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution5x5Backward.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution2x2Sum.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution3x3Sum.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution4x4Sum.
Replace SSE optimizations with SSE2 for function NeuralAddConvolution5x5Sum.
Replace SSE optimizations with SSE2 for function Gemm32fNN.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward0.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward1.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward2.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward3.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward4.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward8.
Replace SSE optimizations with SSE2 for function SynetFusedLayerForward9.
Replace SSE optimizations with SSE2 for function SynetReorderImage.
Replace SSE optimizations with SSE2 for function SynetReorderFilter.
Replace SSE optimizations with SSE2 for function SynetAddBias.
Replace SSE optimizations with SSE2 for function SynetEltwiseLayerForward.
Replace SSE optimizations with SSE2 for function SynetInnerProductLayerForward.
Replace SSE optimizations with SSE2 for function SynetShuffleLayerForward.
Replace SSE optimizations with SSE2 for function SynetHswish32f.
Replace SSE optimizations with SSE2 for function SynetPreluLayerForward.
Replace SSE optimizations with SSE2 for function SynetRelu32f.
Replace SSE optimizations with SSE2 for function SynetRestrictRange32f.
Replace SSE optimizations with SSE2 for function SynetScaleLayerForward.
Replace SSE optimizations with SSE2 for function WinogradKernel1x3Block1x4SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel1x3Block1x4SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel1x3Block1x4SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel1x5Block1x4SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel1x5Block1x4SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel1x5Block1x4SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block2x2SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block2x2SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block2x2SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block4x4SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block4x4SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel2x2Block4x4SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block2x2SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block2x2SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block2x2SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block3x3SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block3x3SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block3x3SetOutput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block4x4SetFilter.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block4x4SetInput.
Replace SSE optimizations with SSE2 for function WinogradKernel3x3Block4x4SetOutput.

Tests

New features

Tests to verify functionality of function VectorNormNa16f.
Tests to verify functionality of function VectorNormNp16f.

Infrastructure

Removing

Support of SSE extension.


02 Jun 11:02

ermig1979

v4.7.102

63bfd6a

Simd v4.7.102

Algorithms

New features

Base implementation, SSE4.1, AVX2, AVX-512BW optimizations of function ValueSquareSums.
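
Judging by its name, ValueSquareSums accumulates both the sum of values and the sum of their squares. Here is a minimal single-pass sketch of that idea for a flat 8-bit buffer. The name ValueSquareSumsSketch and its signature are hypothetical; the library function's actual parameters (for example stride or channel handling) are not described in these notes.

```cpp
#include <cstddef>
#include <cstdint>

// Accumulates the sum of values and the sum of squared values in one pass.
// 64-bit accumulators avoid overflow for large buffers.
// Hypothetical helper for illustration; not the library's code.
void ValueSquareSumsSketch(const uint8_t* src, size_t size, uint64_t& valueSum, uint64_t& squareSum)
{
    valueSum = 0;
    squareSum = 0;
    for (size_t i = 0; i < size; ++i)
    {
        uint64_t v = src[i];
        valueSum += v;
        squareSum += v * v;
    }
}
```

Having both sums is enough to derive the mean and variance of the buffer without a second pass.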

Improving

Performance of AVX2, AVX-512F and NEON optimizations of SynetConvolution32fGemmNN class.
Performance of Neural::FullyConnectedLayer::Forward method.

Bug fixing

Error in class SynetMergedConvolution32fDc (large weights case).
Compiler error in file SimdAvx2SynetConversion.cpp (MSVS-2015, Win32).
Error in SSSE3 optimization of ImageTransform function.
Compiler error in file SimdImageSaveJpeg.h (Clang, Mac mini).
Compiler warnings (Clang).
Error in function ImagePngLoader::ReadTransparency (test tbbn0g04.png).
Error in Base implementation, SSE4.1 optimization of class ImagePngLoader (test basn0g16.png).
Error in SSE4.1 optimization of class ImagePngLoader (test s02i3p01.png).

Tests

New features

Tests to verify functionality of function ValueSquareSums.

Improving

Header of performance report table.

Bug fixing

Compiler error in file TestFile.h (Clang, Mac mini).

