.NET Matters

Tales from the CryptoRandom

Stephen Toub and Shawn Farkas

Code download available at: NET Matters 2007_09.exe(151 KB)

Q I'm using the System.Random class to generate some random numbers in my application. A coworker reviewed my code and suggested that I use RNGCryptoServiceProvider instead. I'd like to follow her suggestion, but I'd also like not to have to modify all of my code that uses Random, and RNGCryptoServiceProvider looks nothing like Random in terms of the methods it exposes. Do you have any suggestions for making this easier?

Q I'm using the System.Random class to generate some random numbers in my application. A coworker reviewed my code and suggested that I use RNGCryptoServiceProvider instead. I'd like to follow her suggestion, but I'd also like not to have to modify all of my code that uses Random, and RNGCryptoServiceProvider looks nothing like Random in terms of the methods it exposes. Do you have any suggestions for making this easier?

A What you're looking for is something called an adapter. In design pattern terms, an adapter takes the functionality of one class and "adapts" it to fit the interface provided (or expected) by another. In this case, you want to adapt the functionality of RNGCryptoServiceProvider to the interface of Random. As luck would have it, Random is not sealed and its public methods are all virtual. This implies that the designers of the Microsoft® .NET Framework actually expected derived implementations to be created that provide their own random number generation algorithms while still maintaining the same interface. Thus, we can implement a type that derives from Random and overrides all of its public methods, providing new implementations that rely on RNGCryptoServiceProvider.

A What you're looking for is something called an adapter. In design pattern terms, an adapter takes the functionality of one class and "adapts" it to fit the interface provided (or expected) by another. In this case, you want to adapt the functionality of RNGCryptoServiceProvider to the interface of Random. As luck would have it, Random is not sealed and its public methods are all virtual. This implies that the designers of the Microsoft® .NET Framework actually expected derived implementations to be created that provide their own random number generation algorithms while still maintaining the same interface. Thus, we can implement a type that derives from Random and overrides all of its public methods, providing new implementations that rely on RNGCryptoServiceProvider.

RNGCryptoServiceProvider, which derives from the abstract RandomNumberGenerator class, exposes two public methods: GetBytes and GetNonZeroBytes. The GetBytes method accepts a byte array and fills it with a cryptographically strong sequence of random values. GetNonZeroBytes does the exact same thing, except that it guarantees that none of the bytes are zero.

For our purposes, the former is the most useful, as using the latter actually decreases the possible values that can be generated, since it restricts each byte into the range [1,255] instead of [0,255]. In fact, GetNonZeroBytes should almost never be used, since one test of a truly random number is that each bit should have an equal probability of being either 0 or 1. That is not true if you can never have a sequence of 15 or more consecutive zero bits, which is the case if no byte can be 0 (you can still get up to 14 sequential zero bits from the two bytes 0x8001, but to get to 15, at least one byte must be 0). If we're using GetNonZeroBytes and keep a counter of consecutive 0 bits, we know for certain that once we've seen 14 zero bits, the next bit must be 1. Since that bit had a 100 percent chance of being 1 and a 0 percent chance of being 0, it automatically makes the sequence more predictable and thus not truly random. One of the few cases for which you actually need GetNonZeroBytes is when generating PKCS #1 padding for RSA encryption.

In any event, given this cryptographically strong random number generator, our task is to create a class that looks like Random but that implements Random's methods using the GetBytes method of an underlying RNGCryptoServiceProvider. Our implementation of this is shown in Figure 1.

Figure 1 A More "Random" Random

public class CryptoRandom : Random
{
    private RNGCryptoServiceProvider _rng =
        new RNGCryptoServiceProvider();
    private byte[] _uint32Buffer = new byte[4];

    public CryptoRandom() { }
    public CryptoRandom(Int32 ignoredSeed) { }

    public override Int32 Next()
    {
        _rng.GetBytes(_uint32Buffer);
        return BitConverter.ToInt32(_uint32Buffer, 0) & 0x7FFFFFFF;
    }

    public override Int32 Next(Int32 maxValue)
    {
        if (maxValue < 0)
            throw new ArgumentOutOfRangeException("maxValue");
        return Next(0, maxValue);
    }

    public override Int32 Next(Int32 minValue, Int32 maxValue)
    {
        if (minValue > maxValue) 
            throw new ArgumentOutOfRangeException("minValue");
        if (minValue == maxValue) return minValue;
        Int64 diff = maxValue - minValue;
        while (true)
        {
            _rng.GetBytes(_uint32Buffer);
            UInt32 rand = BitConverter.ToUInt32(_uint32Buffer, 0);

            Int64 max = (1 + (Int64)UInt32.MaxValue);
            Int64 remainder = max % diff;
            if (rand < max - remainder)
            {
                return (Int32)(minValue + (rand % diff));
            }
        }
    }

    public override double NextDouble()
    {
        _rng.GetBytes(_uint32Buffer);
        UInt32 rand = BitConverter.ToUInt32(_uint32Buffer, 0);
        return rand / (1.0 + UInt32.MaxValue);
    }

    public override void NextBytes(byte[] buffer)
    {
        if (buffer == null) throw new ArgumentNullException("buffer");
        _rng.GetBytes(buffer);
    }
}

Our CryptoRandom class derives from Random, which provides two constructors: one that's parameterless and one that accepts an Int32 "seed" value. The former simply delegates to the latter with Environment.TickCount as the seed value, and the latter uses the seed value to calculate a starting value for the pseudo-random number sequence. Note the use of "sequence" here. Provided with the same seed value, Random always produces the same sequence of "random" values. This is great for debuggability (as it allows you to specify a debugging seed value that will guarantee you reproducible "random" results), but it's lousy for randomness.

There are only 2^32 possible Int32 values you can use as a seed, which means Random can only produce at most 2^32 unique sequences (and in reality, based on its current implementation, it's actually significantly fewer than that). More important, the lack of cryptographically strong sequences means that an attacker can analyze the history of numbers generated in a sequence to anticipate future "random" values.

RNGCryptoServiceProvider does not accept a starting seed value from the developer, though internally a seed is still used (for information on what contributes to the seed, see the documentation for the CryptGenRandom function at msdn2.microsoft.com/aa379942). In fact, while RNGCryptoServiceProvider provides several constructors, including one that accepts a string and one that accepts a byte array, the only one that actually does anything with the parameter is the one that accepts cryptographic service provider (CSP) parameters, and we'll ignore that one for our purposes. To maintain the exact same API as Random, our CryptoRandom class exposes both constructors that Random does, but the seed is ignored (nothing about the seed is passed along to the underlying RNGCryptoServiceProvider). By exposing the same API, a find and replace substitution in your existing code becomes very easy.

Now, the whole point of CryptoRandom is to adapt the RNGCryptoServiceProvider class to the interface exposed by Random, so CryptoRandom contains a private RNGCryptoServiceProvider member; the instance stored here is used to serve all requests for random values. In addition, CryptoRandom contains another private field, a byte array, that's used with the GetBytes method of RNGCryptoServiceProvider. Our implementation only ever requires one size byte array for all calls to GetBytes, and as the Random class is not thread-safe, CryptoRandom does not need to be either. As such, it makes sense to create this byte buffer once and reuse it on every call. (Note that, as currently implemented in the .NET Framework 2.0, the parameterless constructor of RNGCryptoServiceProvider creates thread-safe instances. As such, we could have created our private member to instead have been a private static member, and in doing so not have to create a new RNGCryptoServiceProvider instance for each instance of CryptoRandom. However, this thread-safety is not currently documented nor is it in any way baked into the class's contract or interface. Given that, we haven't relied on it for our implementation.)

With the structure taken care of, we can implement the public methods, of which there are five. The easiest to implement is NextBytes, which accepts a byte array and fills it with random values. This is the same signature and behavior as RNGCryptoServiceProvider's GetBytes method, which makes NextBytes painless to implement:

_rng.GetBytes(buffer);

The next method to implement is NextDouble, which returns a random real value in the range [0.0, 1.0). One standard solution for generating such a value is to generate a non-negative integer and then divide that value by one more than the maximum possible integral value. This translates directly into an implementation:

_rng.GetBytes(_uint32Buffer);
UInt32 rand = BitConverter.ToUInt32(_uint32Buffer, 0);
return rand / (1.0 + UInt32.MaxValue);

This isn't a perfect solution, however. In order to generate a random double, we have to define what we mean by a random double in the first place. Do we want it so that each bit pattern has equal probability of being generated, or do we want it such that if we were to generate many random doubles they would be evenly spread over the given interval? Given the nature of the real number implementation in the .NET Framework, those are not the same thing. Most likely we mean the second definition, which means that we don't want to be generating completely random bits (which would skew the overall range of numbers toward the lower end of the range). Instead, we do want to use the division method shown in the previous code snippet, which will not cause all bit patterns to appear with equal probability, but will cause us to generate numbers that are evenly spread out across the desired range. Somewhat paradoxically, this approach will cause us to start failing tests for randomness based on the first definition mentioned earlier, but it's very unlikely that consumers of the NextDouble method really want each valid bit pattern to appear with equal probability.

Now, the only randomness in the input to the NextDouble generation equation is the numerator. The denominator actually ends up costing us bits of randomness because of the division operation. What we really want is to do the division and then as we normalize the result, shift in random bits rather than zeros. However, without implementing our own floating point library, that's not doable, and writing our own floating point implementation is well beyond the scope of the column (and most likely your needs). Instead, we can attempt to optimize what we have, and that means getting the most random bits involved into the equation. Since the numerator is the only place that we can introduce randomness, and since a Double is 64 bits long, rather than generating a UInt32 and dividing by UInt32.MaxValue+1, we could try generating a random UInt64 and dividing by UInt64.Max + 1, thereby doubling the amount of random bits in the input.

This, unfortunately, leads to another problem. Double-precision numbers store an approximation of a real number; System.Double, which complies with the IEEE 754 standard for binary floating-point arithmetic, provides at most 17 decimal digits of precision. Unfortunately, UInt64.MaxValue (18446744073709551615) requires 20 decimal digits of precision. As a result, under .NET floating-point arithmetic, UInt64.MaxValue is equal to UInt64.MaxValue + 1, and, thus, substituting UInt64 for UInt32 in our NextDouble equation will change our range from [0.0, 1.0) to [0.0, 1.0], which violates the design of System.Random's implementation. As a result, we've chosen to keep the implementation shown in Figure 1, but please keep in mind there may be better (and probably more complicated) alternatives out there if any of this bothers you.

Next, we need to implement the parameterless overload of Next, which is used to retrieve a random non-negative Int32 value. We can do this by generating four random bytes, stripping off the high-order bit, and converting the remaining bits to an Int32 with the System.BitConverter class:

_rng.GetBytes(_uint32Buffer);
return BitConverter.ToInt32(_uint32Buffer, 0) & 0x7FFFFFFF;

Generating four random bytes will provide us with random values in the range [Int32.MinValue, Int32.MaxValue], but to match the semantics of Next, we really need a value in the range [0, Int32.MaxValue]. Using only the bottom 31 bits of the 32 random bits provides that mapping by turning any negative value into a non-negative equivalent. Since there are an equal number of negative numbers and non-negative numbers, and since stripping off that one bit provides a one-to-one mapping between them, every value has the same chance of being returned.

Things start to break down when we move on to the overloads of Next that generate a random Int32 within a given interval. The second overload of Next that accepts just a maximum value is simply a subcase of the third overload, which accepts both a minimum and maximum, as the second overload has an implicit minimum of 0. Thus the second overload can just delegate to the third. (The parameterless overload of Next is also a subcase of the third, with both an implicit minimum and an implicit maximum, but it's best handled on its own for reasons that will become apparent later.)

There are two obvious ways to generate a random Int32 within a given interval, and one not so obvious. The first involves multiplying by a random double:

return (int)(minValue + (NextDouble() * (maxValue – minValue)));

The second starts with picking up a random UInt32 and doing the operation:

result = minValue + (randomUInt32 % (maxValue – minValue))

Both of these approaches have problems. The first, where you multiply by a random double, is flawed due to the condition we've already seen where the double that's generated isn't truly cryptographically random. The other method (taking a random integer mod the range) works perfectly when the range is an even power of two (assuming that you have perfectly random input bits that you're selecting from). This is because the UInt32 values generated from GetBytes have 2^32 possible values; to evenly distribute these among all of the values in the target range, the size of that target range must evenly divide the input range, and with a power of two such as 2^32, that will only happen if the size of the target range is also a power of two. When the size of the target range is not a power of two, the "random" results will start to weigh more heavily toward the lower values.

For an example that demonstrates this skewing, let's say we want a number in the range [21, 27]. (For this example, we'll be selecting 8 bits rather than 32 bits just because it's easier for demonstration, but the principle generalizes out to the 32 bits we'd actually be working with.) Following the equation shown earlier, we'll now take a random number between 0 and 255 mod 6 + 21 to get our value. If the random number is in the range [0, 251] this works out great as each of the numbers [0,5] has come up exactly 42 times. However, once we move past 251, we'll get 0 (252), 1 (253), 2 (245), and 3 (255) as values from our mod operation. That means that 0, 1, 2, and 3 have a 43/256 chance of being selected by the mod operation while 4 and 5 only have a 42/256 chance. Therefore, we've got a ~16.80 percent chance of getting 21, 22, 23, or 24 back, but only a ~16.41 percent chance of getting 25 or 26 back. Over the long term, if we were an attacker trying to guess outputs, we would be more successful by always choosing a number at the low end of the range.

In light of this information, we've chosen to use the third option in our implementation in Figure 1. The problem we just explained regarding the mod approach happens because some values in the target range have the potential to be slightly favored over other values. We can fix this by detecting when such favoritism would happen and, in those cases, just trying again. As we saw in the previous example, the favoritism happens when the randomly selected value satisfies the following condition:

RandomValue >= RandomRange - (RandomRange % TargetRange)

In our previous example, the RandomRange is 256 and the TargetRange is 6; therefore, favoritism happens when the RandomValue is >= 252. For small target ranges, the chances of this happening are very small. For huge target ranges (such as a target range of size 2^31+1), the chances of this happening are no worse than 50 percent. Thus, we can generate a random value and check to see if it falls into this favoritism range. If it does, we start over with a new random value and try again. If it doesn't fall into the range, then we just return the result of formula shown for the second approach earlier. While theoretically this algorithm could loop forever, the chances of that happening are infinitely small. In most cases, this will at most take only a few iterations. And even in the worst case, where we have a 50 percent chance of retrieving a bad random value, it shouldn't take too many iterations before we find a good value. For example, after 10 iterations, the chances of not finding a good value are less than 0.1 percent, and after 20 iterations, that drops to less than 0.00001 percent. If for some reason the performance implications of this approach don't meet your needs, you can always revert to one of the previously mentioned approaches.

We have one last observation to make: you may now notice that we can implement the parameterless Next overload in terms of the other overloads with no loss of randomness. The target range required for the parameterless Next is 2^31, which is a power of two and thus does evenly divide the random range of 2^32. As such, no looping would be required. In fact, our bit manipulation in our Next implementation (&'ing with 0x7FFFFFFF) produces the same result as if we had modded the random value by the target range, 2^31. However, the simplicity of the implementation for the parameterless Next compelled us to leave it as is, even though we could have simply delegated and removed the existing code for the method.

With that, CryptoRandom is complete and can be used in scenarios where you're currently using Random. You don't even need to replace declarations of the Random class in your code; since CryptoRandom derives from Random, you only need to replace instantiations of Random with CryptoRandom and everything else will fall into place.

These statements need a disclaimer, however. If Random was meant to be extended, as we claimed toward the beginning of this column, you may be wondering why the designers of RNGCryptoServiceProvider didn't simply do what we've done here and derive from that class instead of creating their own RandomNumberGenerator base class. As we've seen, the interface for RNGCryptoServiceProvider is different because we cannot implement the entire System.Random interface to return true cryptographic random numbers. Instead, we have to start with a set of cryptographically random bits, and massage them to fit the interface, losing some randomness along the way. You as a developer can then take RNGCryptoServiceProvider and the cryptographically random bits it provides and massage them in a way that best fits your needs. If the CryptoRandom class we've provided for you does that in an acceptable way, wonderful. If not, you should now at least understand the issues involved and the tradeoffs that need to be made when you implement your own.

One caveat: CryptoRandom brings with it a performance cost. While Random's implementation does not produce cryptographically strong random numbers, it is extremely fast when compared to RNGCryptoServiceProvider. In our non-scientific tests, Random was over 100 times faster than CryptoRandom. Additionally, if you rely on the reproducibility of the sequences generated by Random, you won't be happy with the non-reproducible sequences generated by CryptoRandom. But after all, that's the point, isn't it?

Send your questions and comments to netqa@microsoft.com.

Stephen Toub is a Senior Program Manager on the Parallel Computing Platform team at Microsoft. He is also a Contributing Editor for MSDN Magazine.

Shawn Farkas is the Software Development Engineer on the CLR team at Microsoft responsible for all of the cryptography classes in the .NET Framework.