New Release of the AMP Algorithms Library

If you are fond of high-performance algorithms, you will be pleased to find out that our friend Ade Miller has just issued a new iteration of the AMP Algorithms Library. As usual, Ade's work is top-notch, and it brings notable improvements across the board; in his own words:

Finally, there is a new release of the C++ AMP Algorithms Library! It has taken a while, largely due to other things, like CppCon taking up my time. This release contains the following:

  • New C++ AMP features:

    • AMP and STL algorithms no longer depend on the DirectX scan implementation.

    • New implementation of amp_algorithms::scan that does not have a direct dependency on the ID3DX11Scan and ID3DX11SegmentedScan interfaces.

    • The amp_stl_algorithms::copy_if and remove_if algorithms use the new scan implementation now, for improved performance.

    • Implementation of radix sort amp_algorithms::radix_sort.

    • New utility functions: log2, is_power_of_two, count_bits, padded_read, padded_write, pack_byte and unpack_byte.

    • New namespace added for DirectX-dependent features, amp_algorithms::direct3d. All DirectX code is now in a separate header file, amp_algorithms_direct3d.h.

  • New C++ AMP STL features:

    • inner_product

    • minmax

    • pair<T1, T2>

    • rotate_copy

  • New SAXPY example.

  • Reorganized unit tests, consistent names and test categories.
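
The notes above say that copy_if and remove_if now build on the new scan. To illustrate why a scan is useful there, here is a minimal CPU-only sketch in standard C++ of scan-based compaction. It is not C++ AMP code and not the library's implementation; the name copy_if_via_scan is hypothetical and exists only for this example. The idea is that an exclusive prefix sum over keep/drop flags gives every kept element its output position, so the flagging and scatter steps could each run with one thread per element on an accelerator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; NOT part of the AMP Algorithms
// Library. Sequential sketch of scan-based compaction (the idea behind a
// data-parallel copy_if).
template <typename T, typename Pred>
std::vector<T> copy_if_via_scan(const std::vector<T>& input, Pred pred)
{
    const std::size_t n = input.size();

    // Step 1: flag each element (1 = keep, 0 = drop).
    // On a GPU this step is independent per element.
    std::vector<std::size_t> flags(n);
    for (std::size_t i = 0; i < n; ++i)
        flags[i] = pred(input[i]) ? 1 : 0;

    // Step 2: exclusive scan (prefix sum) of the flags.
    // offsets[i] is the number of kept elements before position i,
    // which is exactly the output index of element i if it is kept.
    // A parallel library would replace this loop with a parallel scan.
    std::vector<std::size_t> offsets(n);
    std::size_t running = 0;
    for (std::size_t i = 0; i < n; ++i)
    {
        offsets[i] = running;
        running += flags[i];
    }
    const std::size_t kept = running;
    assert(kept <= n);

    // Step 3: scatter kept elements to their final positions.
    // Each write targets a distinct index, so this is also independent
    // per element.
    std::vector<T> output(kept);
    for (std::size_t i = 0; i < n; ++i)
        if (flags[i] != 0)
            output[offsets[i]] = input[i];

    return output;
}
```

For instance, calling copy_if_via_scan on the vector {1, 2, 3, 4, 5, 6} with a predicate that accepts even numbers returns {2, 4, 6}, preserving the input order. A remove_if can be sketched the same way by inverting the predicate.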

As usual, you can download the latest iteration from https://ampalgorithms.codeplex.com/ and enjoy the benefits of heterogeneous parallelism.

Comments

  • Anonymous
    November 24, 2014
    You really should create something comparable to Intel IPP and MKL to really drive this to the market.

  • Anonymous
    February 11, 2015
    I have a simple question: is there any plan to rebuild the C++ AMP runtime using DirectX 12?

  • Anonymous
    July 08, 2015
    Do you really have any plan to develop this project? I've optimized matrix multiplication, and my version gives the following results in managed code:

    1. The sequential implementation runs better than the parallel one in Random Traveler!
    2. My parallel code runs 2-5% faster than the C++ AMP WARP version. Does C++ AMP WARP really use SSE?

  • Anonymous
    July 23, 2015
    Concerning Random Traveler again... I have tested C++ compiler vectorization and the new C# compiler in Visual Studio 2015. The vectorized, tiled version of matrix multiplication (a C++ .dll, the best version without any transposition) on a 2900*2900 matrix loads only one core and runs at approximately 5 GFlops on my computer. A C# Parallel loop with a partitioner and unsafe pointers on Visual Studio 2015 runs at approximately 4 GFlops. The WARP version runs at 1.1 GFlops. The first test of naive C# ran at 0.04 GFlops, and your version of the parallel loop, where you decided to use pointers, ran at about 0.35 GFlops. A GPU is needed only with bigger matrices. It seems you are giving unfair information.

  • Anonymous
    July 24, 2015
    Hi, VS 2015 and Win 10 are out. Do you have any plans for DirectX 12 or is c++ amp dead? Mike

  • Anonymous
    August 12, 2015
    Is c++ amp dead?

  • Anonymous
    September 08, 2015
    Any news on C++ AMP with Windows 10 / WDDM 2.0?

  • Anonymous
    November 18, 2015
    Guys, what's the state of C++ AMP?

  • Anonymous
    January 02, 2016
    Unfortunately, after 2 years without any update or news, we can now safely say that this project is dead. RIP.