PPL and ConcRT: What’s new in Visual Studio 11 Beta

In this release, our goal was to provide comprehensive ways to express parallel patterns in C++. The Beta allows you to start taking advantage of the libraries and go live with products that use these features.

Visual Studio 2010 introduced the C++ parallelization libraries, along with concepts for leveraging concurrency by expressing sophisticated dataflow pipelines. In Visual Studio 11 Beta, these libraries have been extended to provide better performance, more control, and richer support for the parallel patterns developers need most. In addition, one of the most awaited features – asynchronous PPL tasks and continuations – is now available, complete with “Windows 8” integration. You may have seen a number of these features already in the sample packs and/or in the Developer Preview from last fall.

More parallel patterns

std:: patterns      concurrency:: patterns
sort                parallel_sort, parallel_buffered_sort, parallel_radix_sort
transform           parallel_transform
accumulate          parallel_reduce *
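
The table above maps the familiar STL algorithms to their parallel counterparts in the concurrency namespace. Here is a minimal sketch of how the three families look in code, assuming the <ppl.h> algorithms as they ship in this Beta:

#include <ppl.h>
#include <functional>
#include <iostream>
#include <numeric>
#include <vector>

int main()
{
    std::vector<int> v(1000);
    std::iota(v.begin(), v.end(), 0);                       // 0, 1, 2, ...

    // Parallel counterpart of std::sort (here with a predicate: descending order).
    concurrency::parallel_sort(v.begin(), v.end(), std::greater<int>());

    // Parallel counterpart of std::transform: square each element in place.
    concurrency::parallel_transform(v.begin(), v.end(), v.begin(),
        [](int n) { return n * n; });

    // Parallel counterpart of std::accumulate: sum of the squares.
    int sum = concurrency::parallel_reduce(v.begin(), v.end(), 0, std::plus<int>());
    std::cout << "sum = " << sum << std::endl;
    return 0;
}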

In addition, the parallel_for construct now takes an optional parameter – the partitioner – which controls how iterations are divided among worker tasks so you can get the best performance out of your parallel loop. There are four partitioners: auto (the default), fixed for iterations of the same size, simple for small loops, and affinity for better cache behavior when the same loop is executed repeatedly over the same data.
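
Here is a minimal sketch of passing a partitioner, assuming the partitioner types exposed in <ppl.h> (auto_partitioner as the default, static_partitioner for fixed-size ranges, simple_partitioner, and affinity_partitioner):

#include <ppl.h>
#include <vector>

void scale(std::vector<double>& data)
{
    // No partitioner argument: equivalent to passing concurrency::auto_partitioner().
    concurrency::parallel_for(size_t(0), data.size(), [&](size_t i) {
        data[i] *= 2.0;
    });

    // Fixed-size ranges: a good fit when every iteration does the same amount of work.
    concurrency::parallel_for(size_t(0), data.size(), [&](size_t i) {
        data[i] *= 2.0;
    }, concurrency::static_partitioner());

    // affinity_partitioner keeps state between calls, so it must outlive the loop and
    // be passed as an lvalue; it pays off when the same loop re-runs over the same data.
    concurrency::affinity_partitioner ap;
    for (int pass = 0; pass < 10; ++pass)
    {
        concurrency::parallel_for(size_t(0), data.size(), [&](size_t i) {
            data[i] *= 1.0001;
        }, ap);
    }
}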

More concurrency-safe containers and data structures

In addition to concurrent_vector and concurrent_queue from Visual Studio 2010, we’ve added the following concurrency-safe containers. We’ve also worked very closely with our partner and ally, Intel, to ensure that these containers share a specification and implementation with Intel’s Threading Building Blocks (TBB).

std:: containers    concurrency:: containers
priority_queue      concurrent_priority_queue
map                 concurrent_unordered_map
multimap            concurrent_unordered_multimap
set                 concurrent_unordered_set
multiset            concurrent_unordered_multiset
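
Here is a minimal sketch of two of these containers being filled from many tasks at once, assuming the <concurrent_unordered_map.h> and <concurrent_priority_queue.h> headers shipped in this Beta:

#include <ppl.h>
#include <concurrent_priority_queue.h>
#include <concurrent_unordered_map.h>
#include <utility>

int main()
{
    concurrency::concurrent_unordered_map<int, int> squares;
    concurrency::concurrent_priority_queue<int> work;

    // Many iterations insert concurrently without any external locking.
    concurrency::parallel_for(0, 1000, [&](int i) {
        squares.insert(std::make_pair(i, i * i));
        work.push(i);
    });

    // Unlike std::priority_queue, removal is combined with retrieval and reports
    // success, because a separate top() followed by pop() would not be atomic.
    int item;
    while (work.try_pop(item)) { /* process item */ }
    return 0;
}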

Richer task model: support for asynchronous operations with continuations

While the parallel algorithms are great for structured fork-join parallelism, there is a class of patterns that needs loosely structured, task-based programming. Beyond structure, continuation- and callback-based programming lets a developer express “when you’re done with this task, continue with that one, and in the meantime I’ll go away to do some other work”. Doing other work can mean processing the message loop or scheduling other tasks – all of which can run in parallel. More details can be found in this MSDN Magazine article.
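
Here is a minimal sketch of a task with a continuation, assuming the concurrency::task type in <ppltasks.h> from this Beta:

#include <ppltasks.h>
#include <iostream>

int main()
{
    // Start some work...
    concurrency::task<int> work([]() { return 6 * 7; });

    // ...and state what should happen once it completes. The continuation is
    // scheduled only after 'work' finishes; in the meantime this thread is free
    // to pump a message loop or schedule other tasks.
    auto printed = work.then([](int answer) {
        std::cout << "answer = " << answer << std::endl;
    });

    printed.wait();   // blocking wait is for this console demo only
    return 0;
}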

Composition with C++11 concurrency

The C++11 concurrency features are powered by ConcRT, so they are efficient and compose well with each other as well as with the PPL. Also note that synchronization primitives such as std::mutex, std::timed_mutex, and std::condition_variable are all ConcRT-aware – the runtime knows about the blocking call and can react appropriately.
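
As a small illustration, a std::mutex can guard shared state inside a PPL loop; this sketch assumes the <mutex> header as shipped in this Beta:

#include <ppl.h>
#include <mutex>
#include <vector>

int main()
{
    std::vector<int> v(10000, 1);
    std::mutex m;
    long long total = 0;

    // If the lock is contended, the runtime knows the task is blocked on a
    // ConcRT-aware primitive and can run other work on that core.
    concurrency::parallel_for_each(v.begin(), v.end(), [&](int n) {
        std::lock_guard<std::mutex> guard(m);
        total += n;
    });

    return total == 10000 ? 0 : 1;
}

(Taking a lock per element is not the fastest way to sum, of course; the point here is only that the two libraries compose.)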

“Windows 8” notes

WinRT Async: PPL asynchronous operations and continuations are fully composable with WinRT’s async operations. A developer can not only schedule a PPL continuation off of a WinRT async operation, but can also create a first-class async operation by using the create_async API.
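
A minimal sketch, assuming <ppltasks.h> and a C++/CX (/ZW) Metro style project; ComputeAsync is a hypothetical helper, not an existing API:

#include <ppltasks.h>
using namespace concurrency;
using namespace Windows::Foundation;

// create_async wraps a PPL work item in a first-class WinRT async operation
// that any WinRT client (C++, C#, or JavaScript) can consume.
IAsyncOperation<int>^ ComputeAsync(int n)
{
    return create_async([n]() { return n * n; });
}

void Consume()
{
    // A PPL continuation can be scheduled directly off a WinRT async operation.
    task<int> t(ComputeAsync(7));
    t.then([](int result) { /* use result when the operation completes */ });
}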

Windows Metro style application support: All of the PPL and the messaging blocks library are available to the developer. However, do note that the advanced ConcRT scheduler and resource management APIs are unavailable in the Windows Metro world.

ARM support: We’ve ported our code to make PPL work on ARM, carefully accounting for the differences in memory model between the ARM and x86 architectures. We know of no compatibility issues.

ConcRT: Scheduler and resource management enhancements

In a previous post, I mentioned the performance improvements gained by simply recompiling the application. Some of the improvements come from applying better scheduling techniques and enhancing task locality.

Also, the Resource Manager now respects the process affinity mask if it is set before the parallel libraries are first used. You can also set this programmatically at the Resource Manager level by calling the LimitTaskExecutionResources() API. This feature is particularly aimed at server-side developers who would like to partition and dedicate CPU resources to each application.
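
For example, setting the process affinity mask before the parallel libraries are first used constrains which cores the Resource Manager will hand out; a minimal sketch using the Win32 SetProcessAffinityMask API:

#include <windows.h>
#include <ppl.h>

int main()
{
    // Restrict this process to the first four logical processors *before* any
    // parallel library is used; the Resource Manager respects this mask when
    // it allocates cores.
    SetProcessAffinityMask(GetCurrentProcess(), 0x0F);

    concurrency::parallel_for(0, 100, [](int) { /* work */ });
    return 0;
}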

Thanks!

Thanks to the many developers who have reached out to provide feedback on the forums (https://social.msdn.microsoft.com/Forums/en-US/parallelcppnative/threads). Some have met us in person, and many others have chatted with us over the phone. The feedback has been very valuable and constructive.

Rahul V. Patil

Lead Program Manager

PS: If you’d like to share your story, please feel free to email me. [ r_ a_ h_ u_ l d o t p_ a_ t_ i_ l a t m_ i_ c_ r_ o_ s_ o_ f_ t_ d o t c_ o_ m (delete all underscores and spaces; replace dot with ‘.’ and at with ‘@’).]