Separating concerns

Last night, I realized that my last post on messages skipped over one of the essential characteristics of message-passing APIs: they separate the code that produces data from the code that acts on its availability, with one or more levels of indirection between the two. Consider using Future<T> to produce values that are acted on by some method f(T):

var f1 = Future<T>.Create(() => ...).ContinueWith(f);
var f2 = Future<T>.Create(() => ...).ContinueWith(f);
var f3 = Future<T>.Create(() => ...).ContinueWith(f);
var f4 = Future<T>.Create(() => ...).ContinueWith(f);

The producers know about the consumer f, so changing that relationship means changing it at every call site. Moreover, if there are other sources of data for f to act on, they would have to be known and managed separately. With some indirection, the two can be separated:

var buf = new Buffer<T>();

var f1 = Future<T>.Create(() => ...).ContinueWith(buf.Post);
var f2 = Future<T>.Create(() => ...).ContinueWith(buf.Post);
var f3 = Future<T>.Create(() => ...).ContinueWith(buf.Post);
var f4 = Future<T>.Create(() => ...).ContinueWith(buf.Post);

Activate(Receiver.Create(true, buf, f));

The buffering APIs here don't necessarily exist in this precise form, but the CCR offers the same functionality. In a trivial example like this the advantages aren't obvious, but we've separated the production of data from its consumption: in the message-passing pattern, they are separate concerns. This separation forms the foundation for the loose coupling that I think we need to borrow from the WWW model and adapt to the smaller scale of many-core programming.
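To make the separation concrete, here is a minimal sketch that runs on plain .NET. It assumes Task<T> as a stand-in for Future<T> and BlockingCollection<T> as a stand-in for the buffer; neither is the CCR API, and the names BufferSketch, Run, and f are made up for illustration. The producers only know about buf, and the consumer only knows about buf.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class BufferSketch
{
    // Runs four producers and one consumer; returns what the consumer saw.
    public static List<int> Run()
    {
        // Assumption: BlockingCollection<T> plays the role of Buffer<T>.
        var buf = new BlockingCollection<int>();

        // Hypothetical consumer f: it records each value it is handed.
        var seen = new List<int>();
        Action<int> f = seen.Add;

        // Producers compute a value and post it; they never mention f.
        Task[] producers = Enumerable.Range(1, 4)
            .Select(i => Task.Run(() => i * 10)
                             .ContinueWith(t => buf.Add(t.Result)))
            .ToArray();

        // The consumer drains the buffer; it never mentions the producers.
        Task consumer = Task.Run(() =>
        {
            foreach (int item in buf.GetConsumingEnumerable())
                f(item);
        });

        Task.WaitAll(producers);
        buf.CompleteAdding(); // no more data is coming, so the consumer loop ends
        consumer.Wait();
        return seen;
    }

    static void Main() => Console.WriteLine(string.Join(", ", Run()));
}
```

The order in which values arrive depends on scheduling, so the consumer should not assume one. Swapping f for a different consumer, or adding a fifth producer, touches only one side of the buffer.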

With buffering, we can do things like wait for the first two pieces of data to come in and act on those when they are both available:

Activate(Receiver.Create(true, buf, f));
Activate(Join.GatherMultiple(buf, 2, delegate(T[] arr) { /* Here we have two elements in arr */ } ));
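The same idea can be sketched on plain .NET. This assumes BlockingCollection<T> as the buffer again, and GatherMultiple here is a hypothetical helper written for this example, not the CCR method of the same name. It waits until two items have arrived and then hands them to the handler together.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

static class GatherSketch
{
    // Hypothetical helper: takes `count` items from the buffer, blocking
    // until each one is available, then passes them on as a single array.
    static Task GatherMultiple<T>(BlockingCollection<T> buf, int count, Action<T[]> handler) =>
        Task.Run(() =>
        {
            var batch = new T[count];
            for (int i = 0; i < count; i++)
                batch[i] = buf.Take();
            handler(batch);
        });

    // Posts four values and returns the first two that reached the buffer.
    public static int[] FirstTwo()
    {
        var buf = new BlockingCollection<int>();
        int[] firstTwo = Array.Empty<int>();

        Task join = GatherMultiple(buf, 2, arr => firstTwo = arr);

        // Four independent producers, as in the earlier snippets.
        Task[] producers = Enumerable.Range(1, 4)
            .Select(i => Task.Run(() => buf.Add(i)))
            .ToArray();

        Task.WaitAll(producers);
        join.Wait();
        return firstTwo;
    }

    static void Main() => Console.WriteLine(string.Join(", ", FirstTwo()));
}
```

Which two values arrive first depends on scheduling. The point is that the "wait for two" logic lives with the buffer rather than inside any producer or inside f.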

There are many other operations on buffers that are equally useful and they can all be combined with each other to form complex expressions of data dependencies.

Anyone could, of course, just define f() to provide all the indirection necessary, but the method for doing so would be specific to the application being built. Having a unified approach and APIs for doing this means that libraries and components can be built independently but made to work together, and that is why I find it so important that we think about this very carefully.
