November 2016

Volume 31 Number 11

[Test Run]

Solving Sudoku Using Combinatorial Evolution

By James McCaffrey

James McCaffreyA combinatorial optimization problem is one where the goal is to arrange a set of discrete items into a particular order. A Sudoku puzzle is an example of a combinatorial optimization problem. In this article, I show you how to write a program to solve difficult Sudoku problems, using a technique I call combinatorial evolution.

The demo sets up a non-easy Sudoku problem (I’ll explain what I mean by non-easy shortly). In Sudoku, there are several constraints. Each row of the 9x9 solution grid must contain the numbers 1 through 9, with no duplicates. Each column must contain the numbers 1 through 9. Each 3x3 sub-grid must contain the numbers 1 through 9. And the solution grid cannot change any of the starting values in the problem grid.

The demo problem is shown in Figure 1. Some Sudoku problems can be solved using a brute force algorithm where you examine each empty cell in the grid and check to see which values can be legally placed there by checking the row, column and sub-grid constraints. If only one value is possible, then that value is placed in the empty cell. The process is continued until all empty cells have been assigned. These non-easy type of Sudoku problems are the ones usually found in newspapers and magazines.

A Non-Easy Sudoku Problem
Figure 1 A Non-Easy Sudoku Problem

The demo problem is non-easy because the brute-force algorithm doesn’t work. Suppose you scan the problem grid from left to right, and then top to bottom. The empty cell at (0, 0) can be one of (1, 3, 5, 9). The empty cell at (0, 1) can be one of (1, 3, 9). The next empty cell at (0, 4) can only be a 3, so you can place a 3 there and continue. However, using this approach, after placing nine values, you’d get stuck because all remaining cells could be two or more values.

To get an idea of what combinatorial evolution optimization is, take a look at the screenshot in Figure 2. Combinatorial evolution optimization uses ideas from several bio-inspired algorithms. The algorithm maintains a collection of virtual organisms. Each organism represents a possible solution to the problem. Combinatorial evolution is an iterative process. Each iteration is called an epoch. In each epoch, every organism attempts to find a better solution by examining a new possible solution.

Solving Sudoku Using Combinatorial Evolution
Figure 2 Solving Sudoku Using Combinatorial Evolution

After all organisms have had a chance to improve, two good organism-solutions are selected and used to give birth to a new organism, which replaces a poor solution. So the population of organisms evolves over time. If an optimal solution isn’t found after some maxEpochs time, the algorithm is restarted by killing all the organisms and creating a new population.

The demo sets up 200 virtual organisms and a time limit of 5,000 epochs. These values were found by using a little bit of trial and error. Combinatorial evolution doesn’t guarantee an optimal solution will be found, so a limit of 20 restarts was set to prevent a possible infinite loop.

The demo program found an optimal solution after three restarts, which took approximately 8 seconds running on my desktop machine. During the demo run, the program displayed a measure of error of the best organism. I’ll explain how error is defined and calculated when I present the relevant code. However, notice that the algorithm tends to get a very good solution (error = 2) very quickly, but then gets stuck. The restart process is one mechanism to combat this characteristic, and is a common technique in many optimization algorithms.

The Demo Program

To create the demo program, I launched Visual Studio, clicked on File | New | Project and selected the C# Console Application option. I named the project SudokuEvo. The demo program has no significant .NET dependencies so any version of Visual Studio will work. After the template code loaded, in the Solution Explorer window, I right-clicked on file Program.cs and renamed it to SudokuEvoProgram.cs and allowed Visual Studio to automatically rename class Program for me.

At the top of the editor window, I deleted all unnecessary using statements, leaving just references to the System and Collections.Generic namespaces. The overall structure of the program, with a few minor edits, is shown in Figure 3. The demo program is too long to present in its entirety in this article, but the complete demo source code is available in the accompanying download.

Figure 3 Demo Program Structure

using System;
using System.Collections.Generic;
namespace SudokuEvo
{
  class SudokuEvoProgram
  {
    static Random rnd = new Random(0);
    static void Main(string[] args)
    {
      Console.WriteLine("Begin solving Sudoku");
      Console.WriteLine("The problem is: ");
      int[][] problem = new int[9][];
      problem[0] = new int[] { 0, 0, 6, 2, 0, 0, 0, 8, 0 };
      problem[1] = new int[] { 0, 0, 8, 9, 7, 0, 0, 0, 0 };
      problem[2] = new int[] { 0, 0, 4, 8, 1, 0, 5, 0, 0 };
      problem[3] = new int[] { 0, 0, 0, 0, 6, 0, 0, 0, 2 };
      problem[4] = new int[] { 0, 7, 0, 0, 0, 0, 0, 3, 0 };
      problem[5] = new int[] { 6, 0, 0, 0, 5, 0, 0, 0, 0 };
      problem[6] = new int[] { 0, 0, 2, 0, 4, 7, 1, 0, 0 };
      problem[7] = new int[] { 0, 0, 3, 0, 2, 8, 4, 0, 0 };
      problem[8] = new int[] { 0, 5, 0, 0, 0, 1, 2, 0, 0 };
      DisplayMatrix(problem);
      int numOrganisms = 200;
      int maxEpochs = 5000;
      int maxRestarts = 20;
      int[][] soln = Solve(problem, numOrganisms,
        maxEpochs, maxRestarts);
      Console.WriteLine("Best solution found: ");
      DisplayMatrix(soln);
      int err = Error(soln);
      if (err == 0)
        Console.WriteLine("Success \n");
      else
        Console.WriteLine("Did not find optimal solution \n");
      Console.WriteLine("End Sudoku demo");
      Console.ReadLine();
    } // Main()
    public static int[][] Solve(int[][] problem,
      int numOrganisms, int maxEpochs, int maxRestarts) { . . }
    public static void DisplayMatrix(int[][] matrix) { . . }
    public static int[][] SolveEvo(int[][] problem,
      int numOrganisms, int maxEpochs) { . . }
    public static int[][] RandomMatrix(int[][] problem) { . . }
    public static int[] Corner(int block) { . . }
    public static int Block(int r, int c) { . . }
    public static int[][] NeighborMatrix(int[][] problem,
      int[][] matrix)
    public static int[][] MergeMatrices(int[][] m1,
      int[][] m2) { . . }
    public static int Error(int[][] matrix) { . . }
    public static int[][] DuplicateMatrix(int[][] matrix) { . . }
    public static int[][] CreateMatrix(int n) { . . }
  } // Program
  public class Organism
  {
    public int type;  // 0 = worker, 1 = explorer
    public int[][] matrix;
    public int error;
    public int age;
    public Organism(int type, int[][] m, int error, int age)
    {
      this.type = type;
      this.matrix = SudokuEvoProgram.DuplicateMatrix(m);
      this.error = error;
      this.age = age;
    }
  }
} // ns

The demo program isn’t as complicated as it might first appear, because many of the methods are short helper methods. The demo has two classes. The main class has all the code logic implemented as static methods. The Organism class defines a possible solution to the target Sudoku problem.

Each Organism object has a type, which can be 0 for a “worker” organism, or 1 for an “explorer” organism. The field named matrix is an integer array-of-arrays-style matrix that represents a possible solution. Each possible solution has an error, where an error value of 0 means that no constraints are violated and, therefore, the matrix field holds an optimal solution. Each Organism object has an age field to control whether the organism dies in each epoch.

The demo program sets up and displays the Sudoku problem using these statements:

int[][] problem = new int[9][];
problem[0] = new int[] { 0, 0, 6, 2, 0, 0, 0, 8, 0 };
...
problem[8] = new int[] { 0, 5, 0, 0, 0, 1, 2, 0, 0 };
DisplayMatrix(problem);

Notice that 0 values are used to indicate an empty cell. This manual, hardcoded approach is pretty tedious and in most realistic combinatorial optimization problems you’d read problem data from a text file.

The Sudoku problem is tackled using these statements:

int numOrganisms = 200;
int maxEpochs = 5000;
int maxRestarts = 20;
int[][] soln = Solve(problem, numOrganisms, maxEpochs, maxRestarts);
Console.WriteLine("Best solution found: ");
DisplayMatrix(soln);

Method Solve is mostly a wrapper around method SolveEvo, which does most of the work. This is a common design pattern in combinatorial optimization—one low-level solver method attempts to find an optimal solution, and that method is wrapped by a high-level solver method that performs restarts.

The combinatorial evolution algorithm isn’t guaranteed to find an optimal solution (that is, a solution that has no constraint errors), so the demo program checks to determine if the best solution found is optimal:

int err = Error(soln);
if (err == 0)
  Console.WriteLine("Success");
else
  Console.WriteLine("Did not find optimal solution");

Matrix Initialization and Error

In my opinion, the best way to understand combinatorial evolution optimization is to start by examining the helper methods. Once the helpers are understood, the solve-method is relatively easy to grasp. Let me begin by explaining method RandomMatrix, which initializes the matrix field of an Organism object to a random possible solution. Somewhat surprisingly, method RandomMatrix is the trickiest part of the entire algorithm.

Figure 4 shows the definition of method RandomMatrix.

Figure 4 Definition of the RandomMatrix Method

public static int[][] RandomMatrix(int[][] problem)
{
  int[][] result = DuplicateMatrix(problem);
  for (int block = 0; block < 9; ++block) {
    // Create a List with values 1-9
    // Shuffle the values in List
    // Walk through each cell in current block
    // If a cell is occupied, remove that value from List
    // Walk through each cell in current block
    // If cell is empty, add value from List
  }
  return result;
}

The algorithm is designed so that at any point in time, each of the nine 3x3 sub-grids is OK in the sense that each cell contains numbers 1 through 9, and there are no duplicate values. A condition that’s always true is sometimes called an invariant. This invariant influences all other methods of the algorithm. In theory, the row constraint or the column constraint could’ve been used as the invariant, but in practice using the sub-grid constraint is more effective.

Conceptually, walking through each 3x3 sub-grid in a 9x9 matrix isn’t deep, but implementation is mildly tricky. The approach I took was to define two helper methods, Block and Corner. Method Block accepts a row index r and a column index c, and returns a block number (0-8) that holds the cell at (r, c). Block numbers are assigned left-to-right, and then top-to-bottom. For example, if (r, c) = (3, 8) then method Block returns 5.

Method Corner accepts a block ID (0-8) and returns the indices of the upper-left-hand corner of the block. For example, if block = 8, method Corner returns (6, 6).

Once it’s known that each of the nine 3x3 sub-grids of a matrix are OK, it’s possible to define a relatively simple method that defines error:

public static int Error(int[][] matrix)
{
  int err = 0;
  for (int i = 0; i < 9; ++i) {  // Each row
    // Determine missing values in row
    // Add 1 to err for each missing value
  }
  for (int j = 0; j < 9; ++j) {  // each column
    // Determine missing values in column
    // Add 1 to err for each missing value
  }
  return err;
}

In words, the total error for a possible solution is the sum of the number of missing values in the rows, plus the number of missing values in the columns. Because of the algorithm invariant, all the 3x3 sub-grids have no missing values, so they don’t contribute to error. Note that counting the number of duplicate values in each row and column is equivalent to counting the number of missing values.

Generating a Neighbor Matrix

In combinatorial evolution, there are two types of Organism objects. Those that are type explorer search for possible solutions randomly, using method RandomMatrix. Organism objects that are type worker repeatedly try to find a better solution to the one stored in their matrix field, by examining a close, “neighbor” possible solution.

Because of the 3x3 sub-grid invariant, a neighbor solution must confine itself to being a permutation of a sub-grid. Put more concretely, to determine a neighbor matrix, the algorithm selects a block at random, then selects two cells in the block (where neither cell contains a fixed value from the problem definition), and exchanges the values in the two cells.

Method NeighborMatrix is defined like so:

public static int[][] NeighborMatrix(int[][] problem, int[][] matrix)
{
  int[][] result = DuplicateMatrix(matrix);
  int block = rnd.Next(0, 9);  // pick a random block
  // Determine which cells have values that can be swapped
  // Pick two of those cells: (r1,c1) and (r2,c2)
  int tmp = result[r1][c1];
  result[r1][c1] = result[r2][c2];
  result[r2][c2] = tmp;
  return result;
}

The demo program assumes the existence of a class-scope Random object named rnd. This design is common in many optimization algorithms. The idea of generating a neighbor solution is found in several combinatorial optimization algorithms, for example, simulated annealing optimization and simulated bee colony optimization.

Merging Two Matrices

The combinatorial evolution optimization algorithm implements a form of virtual evolution by selecting two good Organism objects (meaning they have a matrix field with small error), and then using those good objects to create a new child object. The new, presumably very good, child Organism replaces a poor Organism.

Method MergeMatrices accepts two 9x9 matrices from two Organism objects. The method scans through blocks 0 to 8. For each block, a random value between 0.0 and 1.0 is generated. If the random value is less than 0.50 (that is, about half the time), then the values in the two blocks are exchanged. The code is:

public static int[][] MergeMatrices(int[][] m1, int[][] m2)
{
  int[][] result = DuplicateMatrix(m1);
  for (int block = 0; block < 9; ++block) {
    double pr = rnd.NextDouble();
    if (pr < 0.50) {
      // Replace values in block of m1 with those in m2    
    }
  }
  return result;
}

This evolutionary mechanism is somewhat similar to the chromosome crossover mechanism used in genetic algorithms.

The SolveEvo Method

The primary algorithm is implemented in method SolveEvo. The method is best described with a combination of code and high-level pseudo-code, as shown in Figure 5.

Figure 5 Primary Algorithm Implemented in the SolveEvo Method

public static int[][] SolveEvo(int[][] problem,
  int numOrganisms, int maxEpochs)
{
  int numWorker = (int)(numOrganisms * 0.90);
  int numExplorer = numOrganisms - numWorker;
  Organism[] hive = new Organism[numOrganisms];
  // Initialize each Organism
  int epoch = 0;
  while (epoch < maxEpochs) {
    for (int i = 0; i < numOrganisms; ++i) {
      // Process each Organism
    }
    // Merge best worker with best explorer, increment epoch
  }
  return bestMatrix;
}

The method begins by determining the number of worker Organism objects as 90 percent of the total number used. This value was determined by trial and error. The Organism objects are stored in an array named hive.

The pseudo-code for processing a worker-type Organism is:

generate a neighbor matrix
if neighbor is better (or Organism makes mistake)
  update matrix with neighbor
  reset age to 0
else
  don't update matrix
  increment age
end-if

The algorithm occasionally (probability = 0.001) instructs an Organism object to accept a neighbor solution that’s worse than the object’s current solution. This is a common optimization strategy designed to help an algorithm get unstuck from a non-optimal solution.

At each epoch in the iteration loop, each explorer-type Organ­ism generates a random solution grid. Then after each of the Organism objects has been processed, a new Organism is created. In pseudo-code:

determine index of best worker Organism
determine index of best explorer Organism
create new Organism using best worker and best explorer
determine index of worst worker Organism
replace worst worker with new child Organism

As it turns out, this evolutionary mechanism is critical to the algorithm’s success. Without evolution, the algorithm sometimes fails, but with evolution, the algorithm has successfully solved every difficult Sudoku problem I’ve presented it with, including some heinously difficult problems I found on the Internet. However, combinatorial optimization hasn’t been subjected to research analysis, so the fact that combinatorial evolution can solve Sudoku puzzles is no guarantee that it can solve arbitrary combinatorial optimization problems.

Wrapping Up

The combinatorial evolution algorithm I’ve presented in this article isn’t really an algorithm. Combinatorial evolution is a meta-heuristic. By that I mean combinatorial evolution is just a set of general guidelines that can be used to design a concrete algorithm to solve a specific optimization problem.

It’s unlikely you’ll need to solve a Sudoku problem in your regular work environment, but combinatorial optimization can be used to solve real-life problems, too. The key ideas in combinatorial optimization are to define a data structure that describes the target problem; define what is meant by a random solution; define what a neighbor solution is; and define an error metric. With these pieces of the puzzle in place, many problems that can’t be solved by traditional algorithms can be solved quickly and efficiently using combinatorial evolution.


Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Internet Explorer and Bing. Dr. McCaffrey can be reached at jammc@microsoft.com.

Thanks to the following Microsoft technical experts who reviewed this article: Chris Lee and Kirk Olynyk


Discuss this article in the MSDN Magazine forum