非同期エージェントライブラリに関するベストプラクティス

[アーティクル]
06/16/2023

ここでは、非同期エージェントライブラリを効果的に使用する方法について説明します。エージェントライブラリは、アクターベースのプログラミングモデルと、粒度の粗いデータフローおよびパイプライン処理タスクのためのインプロセスメッセージパッシングを推進します。

エージェントライブラリの詳細については、「非同期エージェントライブラリ」を参照してください。

セクション

このドキュメントは、次のトピックに分かれています。

エージェントを使用して状態を分離する
調整メカニズムを使用してデータパイプライン内のメッセージ数を制限する
データパイプラインできめ細かな作業を実行しない
大きなメッセージペイロードを値渡ししない
所有権が未定義の場合にデータネットワークでshared_ptrを使用する

エージェントを使用して状態を分離する

エージェントライブラリは、非同期メッセージ引き渡し方法を使用して分離コンポーネントを接続できるようにすることで、共有状態に代わる手段を提供します。非同期エージェントは、他のコンポーネントから内部状態を分離する場合に最も効果的です。状態を分離することによって、通常、複数のコンポーネントが共有データに作用することがなくなります。状態分離により共有メモリの競合が軽減されるため、アプリケーションのスケーラビリティが高まります。また、コンポーネントが共有データへのアクセスを同期する必要がなくなるため、デッドロックや競合状態が発生しにくくなります。

通常、エージェントで状態を分離するには、エージェントクラスの private セクションまたは protected セクションにデータメンバーを保持し、メッセージバッファーを使用して状態の変化を通知します。次の例は、concurrency::agent から派生する basic_agent クラスを示しています。 basic_agent クラスは、2 つのメッセージバッファーを使用して外部コンポーネントと通信します。 2 つのメッセージバッファーにはそれぞれ受信メッセージと送信メッセージが保持されます。

// basic-agent.cpp
// compile with: /c /EHsc
#include <agents.h>

// An agent that uses message buffers to isolate state and communicate
// with other components.
class basic_agent : public concurrency::agent
{
public:
   basic_agent(concurrency::unbounded_buffer<int>& input)
      : _input(input)
   {
   }
   
   // Retrieves the message buffer that holds output messages.
   concurrency::unbounded_buffer<int>& output()
   {
      return _output;
   }

protected:
   void run()
   {
      while (true)
      {
         // Read from the input message buffer.
         int value = concurrency::receive(_input);

         // TODO: Do something with the value.
         int result = value;
         
         // Write the result to the output message buffer.
         concurrency::send(_output, result);
      }
      done();
   }

private:
   // Holds incoming messages.
   concurrency::unbounded_buffer<int>& _input;
   // Holds outgoing messages.
   concurrency::unbounded_buffer<int> _output;
};

エージェントを定義して使用する方法の完全な例については、「チュートリアル: エージェントベースのアプリケーションの作成」および「チュートリアル: データフローエージェントの作成」を参照してください。

[トップ]

調整メカニズムを使用してデータパイプライン内のメッセージ数を制限する

concurrency::unbounded_buffer など、メッセージバッファーの多くの型では、メッセージを無制限に保持できます。メッセージプロデューサーがメッセージをデータパイプラインに送信する速度が、コンシューマーがそれらのメッセージを処理する速度よりも速い場合、アプリケーションはメモリ不足の状態になることがあります。セマフォなどのスロットリング機構を使用すると、データパイプラインで同時にアクティブになるメッセージの数を制限できます。

次の基本的な例では、セマフォを使用してデータパイプラインのメッセージの数を制限する方法を示します。データパイプラインでは、concurrency::wait 関数を使用して、100 ミリ秒以上かかる操作をシミュレートします。コンシューマーがメッセージを処理するよりも速く送信側でメッセージが生成されるため、この例では semaphore クラスを定義し、アプリケーションがアクティブメッセージの数を制限できるようにしています。

// message-throttling.cpp
// compile with: /EHsc
#include <windows.h> // for GetTickCount()
#include <atomic>
#include <agents.h>
#include <concrt.h>
#include <concurrent_queue.h>
#include <sstream>
#include <iostream>

using namespace concurrency;
using namespace std;

// A semaphore type that uses cooperative blocking semantics.
class semaphore
{
public:
   explicit semaphore(long long capacity)
      : _semaphore_count(capacity)
   {
   }

   // Acquires access to the semaphore.
   void acquire()
   {
      // The capacity of the semaphore is exceeded when the semaphore count 
      // falls below zero. When this happens, add the current context to the 
      // back of the wait queue and block the current context.
      if (--_semaphore_count < 0)
      {
         _waiting_contexts.push(Context::CurrentContext());
         Context::Block();
      }
   }

   // Releases access to the semaphore.
   void release()
   {
      // If the semaphore count is negative, unblock the first waiting context.
      if (++_semaphore_count <= 0)
      {
         // A call to acquire might have decremented the counter, but has not
         // yet finished adding the context to the queue. 
         // Create a spin loop that waits for the context to become available.
         Context* waiting = NULL;
         while (!_waiting_contexts.try_pop(waiting))
         {
            (Context::Yield)(); // <windows.h> defines Yield as a macro. The parenthesis around Yield prevent the macro expansion so that Context::Yield() is called.  
         }

         // Unblock the context.
         waiting->Unblock();
      }
   }

private:
   // The semaphore count.
   atomic<long long> _semaphore_count;

   // A concurrency-safe queue of contexts that must wait to 
   // acquire the semaphore.
   concurrent_queue<Context*> _waiting_contexts;
};

// A synchronization primitive that is signaled when its 
// count reaches zero.
class countdown_event
{
public:
   countdown_event(long long count)
       : _current(count) 
    {
       // Set the event if the initial count is zero.
       if (_current == 0LL)
          _event.set();
    }

    // Decrements the event counter.
    void signal() {
       if(--_current == 0LL) {
          _event.set();
       }
    }

    // Increments the event counter.
    void add_count() {
       if(++_current == 1LL) {
          _event.reset();
       }
    }

    // Blocks the current context until the event is set.
    void wait() {
       _event.wait();
    }

private:
   // The current count.
   atomic<long long> _current;
   // The event that is set when the counter reaches zero.
   event _event;

   // Disable copy constructor.
   countdown_event(const countdown_event&);
   // Disable assignment.
   countdown_event const & operator=(countdown_event const&);
};

int wmain()
{
   // The number of messages to send to the consumer.
   const long long MessageCount = 5;

   // The number of messages that can be active at the same time.
   const long long ActiveMessages = 2;

   // Used to compute the elapsed time.
   DWORD start_time;

   // Computes the elapsed time, rounded-down to the nearest
   // 100 milliseconds.
   auto elapsed = [&start_time] {
      return (GetTickCount() - start_time)/100*100;
   };
  
   // Limits the number of active messages.
   semaphore s(ActiveMessages);

   // Enables the consumer message buffer to coordinate completion
   // with the main application.
   countdown_event e(MessageCount);

   // Create a data pipeline that has three stages.

   // The first stage of the pipeline prints a message.
   transformer<int, int> print_message([&elapsed](int n) -> int {
      wstringstream ss;
      ss << elapsed() << L": received " << n << endl;
      wcout << ss.str();

      // Send the input to the next pipeline stage.
      return n;
   });

   // The second stage of the pipeline simulates a 
   // time-consuming operation.
   transformer<int, int> long_operation([](int n) -> int {
      wait(100);

      // Send the input to the next pipeline stage.
      return n;
   });

   // The third stage of the pipeline releases the semaphore
   // and signals to the main appliation that the message has
   // been processed.
   call<int> release_and_signal([&](int unused) {
      // Enable the sender to send the next message.
      s.release();

      // Signal that the message has been processed.
      e.signal();
   });

   // Connect the pipeline.
   print_message.link_target(&long_operation);
   long_operation.link_target(&release_and_signal);

   // Send several messages to the pipeline.
   start_time = GetTickCount();
   for(auto i = 0; i < MessageCount; ++i)
   {
      // Acquire access to the semaphore.
      s.acquire();

      // Print the message to the console.
      wstringstream ss;
      ss << elapsed() << L": sending " << i << L"..." << endl;
      wcout << ss.str();

      // Send the message.
      send(print_message, i);
   }

   // Wait for the consumer to process all messages.
   e.wait();
}
/* Sample output:
    0: sending 0...
    0: received 0
    0: sending 1...
    0: received 1
    100: sending 2...
    100: received 2
    200: sending 3...
    200: received 3
    300: sending 4...
    300: received 4
*/

semaphore オブジェクトでは、パイプラインで同時に処理するメッセージ数を 2 までに制限しています。

この例では、プロデューサーからコンシューマーに送信されるメッセージは比較的少量です。そのため、メモリ不足の状態は発生しません。ただし、データパイプラインに含まれるメッセージの数が比較的多い場合、この機構は便利です。

この例で使用されているセマフォクラスを作成する方法の詳細については、「方法: Context クラスを使用して協調セマフォを実装する」を参照してください。

[トップ]

データパイプラインできめ細かな作業を実行しない

エージェントライブラリが最も役立つのは、データパイプラインで実行される処理の粒度が非常に粗い場合です。たとえば、1 つのアプリケーションコンポーネントがファイルまたはネットワーク接続からデータを読み取り、状況に応じてそのデータを別のコンポーネントに送信する場合があります。エージェントライブラリでメッセージの伝達に使用されるプロトコルにより、並列パターンライブラリ (PPL) で提供されるタスク parallel コンストラクトよりも、メッセージパッシング機構のオーバーヘッドが大きくなります。したがって、データパイプラインで実行される処理の時間が十分長く、このオーバーヘッドを相殺できることを確認してください。

データパイプラインはそのタスクの粒度が粗い場合に最も有効ですが、データパイプラインの各ステージでは PPL コンストラクト (タスクグループや並列アルゴリズムなど) を使用してより粒度の細かい処理を実行できます。各処理ステージで粒度の細かい並列処理を使用する粒度の粗いデータネットワークの例については、「チュートリアル: 画像処理ネットワークの作成」を参照してください。

[トップ]

大きなメッセージペイロードを値渡ししない

ランタイムでは、各メッセージのコピーを作成してそれをメッセージバッファー間での引き渡しに使用する場合があります。たとえば、concurrency::overwrite_buffer クラスは、受信した各メッセージのコピーを各ターゲットに提供します。また、concurrency::send や concurrency::receive などのメッセージパッシング関数を使用してメッセージバッファーとの間でメッセージの書き込み/読み取りを行う場合にも、メッセージデータのコピーが作成されます。この機構を使用すると、共有データに対する同時書き込みのリスクはなくなりますが、メッセージペイロードが比較的大きい場合、メモリのパフォーマンスが低下する可能性があります。

ペイロードの大きいメッセージを渡す場合は、ポインターまたは参照を使用してメモリのパフォーマンスを向上させることができます。次の例では、大きなメッセージの値渡しを行う場合と同じメッセージ型にポインターを渡す場合を比較します。この例では、producer オブジェクトに対して作用する 2 種類のエージェント consumer および message_data を定義します。また、プロデューサーが複数の message_data オブジェクトをコンシューマーに送信するのに必要な時間と、プロデューサーエージェントが複数のポインターを message_data オブジェクト、コンシューマーへと順番に送信するのに必要な時間を比較します。

// message-payloads.cpp
// compile with: /EHsc
#include <Windows.h>
#include <agents.h>
#include <iostream>

using namespace concurrency;
using namespace std;

// Calls the provided work function and returns the number of milliseconds 
// that it takes to call that function.
template <class Function>
__int64 time_call(Function&& f)
{
   __int64 begin = GetTickCount();
   f();
   return GetTickCount() - begin;
}

// A message structure that contains large payload data.
struct message_data
{
   int id;
   string source;
   unsigned char binary_data[32768];
};

// A basic agent that produces values.
template <typename T>
class producer : public agent
{
public:
   explicit producer(ITarget<T>& target, unsigned int message_count)
      : _target(target)
      , _message_count(message_count)
   {
   }
protected:
   void run();

private:
   // The target buffer to write to.
   ITarget<T>& _target;
   // The number of messages to send.
   unsigned int _message_count;
};

// Template specialization for message_data.
template <>
void producer<message_data>::run()
{
   // Send a number of messages to the target buffer.
   while (_message_count > 0)
   {
      message_data message;
      message.id = _message_count;
      message.source = "Application";

      send(_target, message);
      --_message_count;
   }
   
   // Set the agent to the finished state.
   done();
}

// Template specialization for message_data*.
template <>
void producer<message_data*>::run()
{
   // Send a number of messages to the target buffer.
   while (_message_count > 0)
   {
      message_data* message = new message_data;
      message->id = _message_count;
      message->source = "Application";

      send(_target, message);
      --_message_count;
   }
   
   // Set the agent to the finished state.
   done();
}

// A basic agent that consumes values.
template <typename T>
class consumer : public agent
{
public:
   explicit consumer(ISource<T>& source, unsigned int message_count)
      : _source(source)
      , _message_count(message_count)
   {
   }

protected:
   void run();

private:
   // The source buffer to read from.
   ISource<T>& _source;
   // The number of messages to receive.
   unsigned int _message_count;
};

// Template specialization for message_data.
template <>
void consumer<message_data>::run()
{
   // Receive a number of messages from the source buffer.
   while (_message_count > 0)
   {
      message_data message = receive(_source);
      --_message_count;

      // TODO: Do something with the message. 
      // ...
   }
       
   // Set the agent to the finished state.
   done();
}

template <>
void consumer<message_data*>::run()
{
   // Receive a number of messages from the source buffer.
   while (_message_count > 0)
   {
      message_data* message = receive(_source);
      --_message_count;

      // TODO: Do something with the message.
      // ...

      // Release the memory for the message.
      delete message;     
   }
       
   // Set the agent to the finished state.
   done();
}

int wmain()
{
   // The number of values for the producer agent to send.
   const unsigned int count = 10000;
      
   __int64 elapsed;

   // Run the producer and consumer agents.
   // This version uses message_data as the message payload type.

   wcout << L"Using message_data..." << endl;
   elapsed = time_call([count] {
      // A message buffer that is shared by the agents.
      unbounded_buffer<message_data> buffer;

      // Create and start the producer and consumer agents.
      producer<message_data> prod(buffer, count);
      consumer<message_data> cons(buffer, count);
      prod.start();
      cons.start();

      // Wait for the agents to finish.
      agent::wait(&prod);
      agent::wait(&cons);
   });
   wcout << L"took " << elapsed << L"ms." << endl;

   // Run the producer and consumer agents a second time.
   // This version uses message_data* as the message payload type.

   wcout << L"Using message_data*..." << endl;
   elapsed = time_call([count] {
      // A message buffer that is shared by the agents.
      unbounded_buffer<message_data*> buffer;

      // Create and start the producer and consumer agents.
      producer<message_data*> prod(buffer, count);
      consumer<message_data*> cons(buffer, count);
      prod.start();
      cons.start();

      // Wait for the agents to finish.
      agent::wait(&prod);
      agent::wait(&cons);
   });
   wcout << L"took " << elapsed << L"ms." << endl;
}

この例では、次のサンプル出力が生成されます。

Using message_data...
took 437ms.
Using message_data*...
took 47ms.

ポインターを使用したバージョンの方がパフォーマンスは高くなります。これは、ランタイムがプロデューサーからコンシューマーに渡す各 message_data オブジェクトの完全なコピーを作成する必要がなくなるためです。

[トップ]

所有権が未定義の場合にデータネットワークでshared_ptrを使用する

メッセージパッシングパイプラインまたはネットワークを通じてメッセージをポインターで送信する場合、通常、ネットワークの始端で各メッセージ用のメモリを割り当て、ネットワークの終端でそのメモリを解放します。この機構も通常は有効に働きますが、使用するのが困難な場合や、使用できない場合もあります。たとえば、データネットワークに複数の終了ノードが存在する場合を考えます。この場合、メッセージ用のメモリを解放する場所が明確ではありません。

この問題を解決するには、std::shared_ptr などの機構を使用して、ポインターを複数のコンポーネントで所有できるようにします。リソースを所有する最後の shared_ptr オブジェクトが破棄されると、そのリソースも解放されます。

次の例では、shared_ptr を使用して複数のメッセージバッファー間でポインター値を共有する方法を示します。この例では、concurrency::overwrite_buffer オブジェクトを 3 つの concurrency::call オブジェクトに接続します。 overwrite_buffer クラスは、メッセージを各ターゲットに提供します。データネットワークの終端にはデータの所有者が複数存在するため、この例では shared_ptr を使用して各 call オブジェクトでメッセージの所有権を共有できるようにしています。

// message-sharing.cpp
// compile with: /EHsc
#include <agents.h>
#include <iostream>
#include <sstream>

using namespace concurrency;
using namespace std;

// A type that holds a resource.
class resource
{
public:
   resource(int id) : _id(id)
   { 
      wcout << L"Creating resource " << _id << L"..." << endl;
   }
   ~resource()
   { 
      wcout << L"Destroying resource " << _id << L"..." << endl;
   }

   // Retrieves the identifier for the resource.
   int id() const { return _id; }

   // TODO: Add additional members here.
private:
   // An identifier for the resource.
   int _id;

   // TODO: Add additional members here.
};

int wmain()
{   
   // A message buffer that sends messages to each of its targets.
   overwrite_buffer<shared_ptr<resource>> input;
      
   // Create three call objects that each receive resource objects
   // from the input message buffer.

   call<shared_ptr<resource>> receiver1(
      [](shared_ptr<resource> res) {
         wstringstream ss;
         ss << L"receiver1: received resource " << res->id() << endl;
         wcout << ss.str();
      },
      [](shared_ptr<resource> res) { 
         return res != nullptr; 
      }
   );

   call<shared_ptr<resource>> receiver2(
      [](shared_ptr<resource> res) {
         wstringstream ss;
         ss << L"receiver2: received resource " << res->id() << endl;
         wcout << ss.str();
      },
      [](shared_ptr<resource> res) { 
         return res != nullptr; 
      }
   );

   event e;
   call<shared_ptr<resource>> receiver3(
      [&e](shared_ptr<resource> res) {
         e.set();
      },
      [](shared_ptr<resource> res) { 
         return res == nullptr; 
      }
   );

   // Connect the call objects to the input message buffer.
   input.link_target(&receiver1);
   input.link_target(&receiver2);
   input.link_target(&receiver3);

   // Send a few messages through the network.
   send(input, make_shared<resource>(42));
   send(input, make_shared<resource>(64));
   send(input, shared_ptr<resource>(nullptr));

   // Wait for the receiver that accepts the nullptr value to 
   // receive its message.
   e.wait();
}

この例では、次のサンプル出力が生成されます。

Creating resource 42...
receiver1: received resource 42
Creating resource 64...
receiver2: received resource 42
receiver1: received resource 64
Destroying resource 42...
receiver2: received resource 64
Destroying resource 64...

次の方法で共有

非同期エージェントライブラリに関するベストプラクティス

セクション

エージェントを使用して状態を分離する

調整メカニズムを使用してデータパイプライン内のメッセージ数を制限する

データパイプラインできめ細かな作業を実行しない

大きなメッセージペイロードを値渡ししない

所有権が未定義の場合にデータネットワークでshared_ptrを使用する

関連項目

フィードバック

フィードバック

その他のリソース

次の方法で共有

非同期エージェント ライブラリに関するベスト プラクティス

セクション

エージェントを使用して状態を分離する

調整メカニズムを使用してデータ パイプライン内のメッセージ数を制限する

データ パイプラインできめ細かな作業を実行しない

大きなメッセージ ペイロードを値渡ししない

所有権が未定義の場合にデータ ネットワークでshared_ptrを使用する

関連項目

フィードバック

フィードバック

その他のリソース

非同期エージェントライブラリに関するベストプラクティス

調整メカニズムを使用してデータパイプライン内のメッセージ数を制限する

データパイプラインできめ細かな作業を実行しない

大きなメッセージペイロードを値渡ししない

所有権が未定義の場合にデータネットワークでshared_ptrを使用する