The Baseline Version: A Very Poor Performing Application

The initial, poorly performing code sample used to calculate and transmit the updates appears after the following note and warning.

Note

For simplicity, there is no error handling in the following examples. A production application should always check return values.

Warning

The first few examples of the application are intentionally slow, in order to illustrate the performance improvements that code changes make possible. Do not use these code samples in your application; they are for illustration purposes only.

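#include <winsock2.h>   // must come before windows.h
#include <windows.h>

BOOL Map[ROWS][COLS];
int Zero = 0;           // SO_SNDBUF value: disables send buffering

BYTE Set( int row, int col, BOOL bAlive );

void LifeUpdate()
{
    ComputeNext( Map );
    for( int i = 0 ; i < ROWS ; ++i )     // serialized: one cell at a time
        for( int j = 0 ; j < COLS ; ++j )
            Set( i, j, Map[i][j] );    // chatty: one transaction per cell
}

// Sends a single cell update over a brand-new connection (intentionally poor).
BYTE Set( int row, int col, BOOL bAlive )
{
    SOCKET s = socket(...);
    BYTE byRet = 0;
    setsockopt( s, SOL_SOCKET, SO_SNDBUF, (char *)&Zero, sizeof(int) );
    bind( s, ... );
    connect( s, ... );
    send( s, (char *)&row, 1, 0 );       // first byte of an int: assumes little-endian
    send( s, (char *)&col, 1, 0 );
    send( s, (char *)&bAlive, 1, 0 );
    recv( s, (char *)&byRet, 1, 0 );     // one-byte reply from the server
    closesocket( s );
    return byRet;
}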

In this state, the application has the worst possible network performance. The problems with this version of the sample application include:

  • The application is chatty. Each transaction is too small; cells do not need to be updated one at a time.
  • The transactions are strictly serialized, even though the cells could be updated concurrently.
  • The send buffer is set to zero, so each send cannot complete until the data has been acknowledged. The application therefore incurs a 200-millisecond delayed ACK wait for each send, three times per cell.
  • The application is very connect-heavy, connecting once for each cell. Applications are limited in the number of connections per second to a given destination because of the TIME-WAIT state, but that is not an issue here, since each transaction takes over 600 milliseconds.
  • The application is fat; many transactions have no effect on the server state, because many cells do not change from update to update.
  • The application exhibits poor streaming; small sends consume a lot of CPU and RAM.
  • The application assumes little-endian representation for its sends. This is a natural assumption for the current Windows platform, but can be dangerous for long-lived code.
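
Several of the problems listed above (one transaction per cell, updates for cells that did not change, and the byte-order assumption) can be illustrated together. The following is a minimal sketch under stated assumptions and is not taken from the later revisions: the helper name encode_changes, the 4 x 4 grid size (the sample does not define ROWS or COLS), and the three-byte (row, col, alive) record are all hypothetical. Each byte is produced by value conversion rather than by reading the first byte of an int in memory, so the output does not depend on host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical grid size; the original sample leaves ROWS and COLS undefined.
 * Both must be at most 256 so that an index fits in one byte. */
#define ROWS 4
#define COLS 4

/*
 * Hypothetical helper: writes one (row, col, alive) byte triple into buf for
 * every cell whose state differs between prev and next.
 * Returns 0 and stores the byte count in *len on success, or returns -1 if
 * buf (capacity cap bytes) cannot hold all the changes.
 */
int encode_changes(int prev[ROWS][COLS], int next[ROWS][COLS],
                   uint8_t *buf, size_t cap, size_t *len)
{
    size_t n = 0;
    for (int r = 0; r < ROWS; ++r) {
        for (int c = 0; c < COLS; ++c) {
            int was = prev[r][c] != 0;
            int is  = next[r][c] != 0;
            if (was == is)
                continue;               /* unchanged cell: nothing to send */
            if (cap - n < 3)
                return -1;              /* caller's buffer is too small */
            /* Convert by value, so host byte order never matters. */
            buf[n++] = (uint8_t)r;
            buf[n++] = (uint8_t)c;
            buf[n++] = (uint8_t)is;
        }
    }
    *len = n;
    return 0;
}
```

The resulting buffer could then be handed to a single send call, rather than issuing three one-byte sends for every cell.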

Key Performance Metrics

The following performance metrics are expressed in Round Trip Time (RTT), Goodput, and Protocol Overhead. See the Network Terminology topic for an explanation of these terms.

  • Cell time, the network time for a single cell update, is 4*RTT + 600 milliseconds; the fixed 600 milliseconds comes from Nagle and delayed ACK interactions.
  • Goodput is less than 6 bytes per second.
  • Protocol Overhead is 99.6%.
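
The arithmetic behind these figures can be sketched as follows. The function names are hypothetical, and protocol overhead is taken here to mean the fraction of bytes on the wire that are not application data. The 4-byte payload per cell comes from the sample (three bytes sent plus one received); the 50-millisecond RTT and 1,000-byte wire total used in the note below are assumed values chosen for illustration, not measurements from this topic.

```c
/* Network time for one cell update, in milliseconds:
 * 4 round trips plus the fixed 600 ms Nagle/delayed-ACK penalty. */
double cell_time_ms(double rtt_ms)
{
    return 4.0 * rtt_ms + 600.0;
}

/* Application bytes delivered per second for one cell transaction. */
double goodput_bytes_per_sec(double payload_bytes, double rtt_ms)
{
    return payload_bytes / (cell_time_ms(rtt_ms) / 1000.0);
}

/* Fraction of all bytes on the wire that are not application data.
 * wire_bytes must be greater than zero. */
double protocol_overhead(double payload_bytes, double wire_bytes)
{
    return (wire_bytes - payload_bytes) / wire_bytes;
}
```

With an assumed RTT of 50 milliseconds, cell_time_ms(50.0) is 800 milliseconds, so 4 bytes of application data per cell works out to 5 bytes per second. Likewise, protocol_overhead(4.0, 1000.0) is 0.996, which shows how a few payload bytes spread over roughly a kilobyte of packet headers leads to an overhead figure of this size.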

Improving a Slow Application

Network Terminology

Revision 1: Cleaning up the Obvious

Revision 2: Redesigning for Fewer Connects

Revision 3: Compressed Block Send

Future Improvements