Solve sudokus automatically — and naturally

Guy Lipman
17 min read · Jan 17, 2020

I sometimes do sudokus in the newspaper, and most of the time when I can’t solve them, it is because I’ve messed up. So I sometimes think about how to write some code to solve a sudoku methodically, to avoid these kinds of mistakes.

Writing code that solves a sudoku using brute force, or recursion, is relatively straightforward, and there are some examples out there (for example this one). However, I don’t want something that solves a sudoku using steps that I couldn’t ever replicate. I want something that does steps I could do, and shows me how it does them.

So, inspired by the link above, I decided to work on a solver that does what I do when I solve them. The basic steps I use are:

  1. Fill in each empty cell with all the numbers that the cell could be.
  2. If a cell has a single number in it, then that number can’t be in any other cells in the row, column or box.
  3. If a group of n cells in a row (or column or box) have just n digits between them, those n digits can’t be anywhere else in the row (or column or box)
  4. If a number is only in one row (column) of a box, it cannot be in that row (column) in any other boxes.

In fact, as I thought about it, rule 2 is just a special case of 3.
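
To see why rule 2 falls out of rule 3, here is a minimal sketch of rule 3 applied once to a single group, written independently of the solver code further down. The function name eliminate_subsets() and the list-of-sets representation are assumptions made for this illustration only.

```python
from itertools import combinations

def eliminate_subsets(group, n):
    """Apply rule 3 once to one group (a row, column or box).

    group is a list of sets of candidate digits. Returns a new list.
    """
    result = [set(cell) for cell in group]
    for positions in combinations(range(len(group)), n):
        digits = set().union(*(group[p] for p in positions))
        if len(digits) == n:
            # These n cells hold exactly n digits between them, so no
            # other cell in the group can hold any of those digits.
            for p in range(len(group)):
                if p not in positions:
                    result[p] -= digits
    return result

# n = 1 is rule 2: the solved cell {5} removes 5 from the other cells.
after_singles = eliminate_subsets([{5}, {1, 5}, {2, 5, 9}], 1)
# n = 2: the pair {3, 7}, {3, 7} removes 3 and 7 from the third cell.
after_pairs = eliminate_subsets([{3, 7}, {3, 7}, {3, 4, 7}], 2)
```

With n = 1 the "group of n cells" is just one solved cell, which is exactly rule 2.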

I haven’t ever seen a sudoku that I couldn’t solve with these steps. Sure, one could create a sudoku grid that couldn’t be solved with these steps, but I’d most likely find it unsatisfying (and impossible to do on paper). So, I tried to code these steps, and see how I went. (If I find any satisfying sudokus that can’t be solved using this method, then I might look at enhancing my code to handle them.)

My starting point is the raw table, eg:
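
Judging from the code that follows, the raw table is a global called sudoku holding nine rows of nine integers, with 0 for an empty cell. The grid below is a reconstruction rather than the original listing: it is read off the first grid printed in the output further down, on the assumption that every single-digit cell in that printout is a given.

```python
# Reconstructed from the first printed grid, ASSUMING every single-digit
# cell there was a given. 0 marks an empty cell.
sudoku = [
    [2, 0, 0, 1, 7, 0, 0, 3, 0],
    [0, 3, 0, 0, 0, 6, 0, 0, 0],
    [0, 0, 8, 3, 0, 0, 9, 0, 0],
    [9, 0, 0, 0, 0, 0, 0, 1, 0],
    [3, 0, 0, 0, 6, 0, 0, 0, 7],
    [0, 2, 0, 0, 0, 0, 0, 0, 9],
    [0, 0, 5, 0, 0, 4, 8, 0, 0],
    [0, 0, 0, 6, 0, 0, 0, 9, 0],
    [0, 8, 0, 0, 5, 1, 0, 0, 6],
]
```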

I then have a function preparesudoku() that fills all the 0’s with all the possible values:

def whatcanitbe(row, col):
    # An empty cell (0) can hold any digit not already used
    # in its row, column or 3x3 box.
    if sudoku[row][col]==0:
        rowvals = set(sudoku[row])
        colvals = {r[col] for r in sudoku}
        boxrow = (row//3)*3
        boxcol = (col//3)*3
        boxvals = [r[boxcol:boxcol+3]
                   for r in sudoku[boxrow:boxrow+3]]
        boxvals = set(boxvals[0] + boxvals[1] + boxvals[2])
        return list({1,2,3,4,5,6,7,8,9}
                    - rowvals - colvals - boxvals)
    else:
        # A given digit is its own only candidate.
        return [sudoku[row][col]]

def preparesudoku():
    # Replace every cell of the raw table with its list of candidates.
    global sudoku
    sudoku = [[whatcanitbe(row, col)
               for col in range(9) ]
              for row in range(9)]
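
As a quick check of the candidate logic, here is a minimal sketch that takes the grid as a parameter instead of reading the global. The name candidates() and the cut-down grid are assumptions for illustration; only the digits that affect one cell are filled in.

```python
def candidates(grid, row, col):
    """Digits that could go in grid[row][col], or the given digit itself."""
    if grid[row][col] != 0:
        return {grid[row][col]}
    rowvals = set(grid[row])
    colvals = {r[col] for r in grid}
    boxrow, boxcol = (row // 3) * 3, (col // 3) * 3
    boxvals = {grid[r][c]
               for r in range(boxrow, boxrow + 3)
               for c in range(boxcol, boxcol + 3)}
    return set(range(1, 10)) - rowvals - colvals - boxvals

# Hypothetical grid: only the digits that matter for cell (0, 1) are set.
grid = [[0] * 9 for _ in range(9)]
grid[0][0], grid[0][3], grid[0][4], grid[0][7] = 2, 1, 7, 3   # same row
grid[1][1], grid[5][1], grid[8][1] = 3, 2, 8                  # same column
grid[2][2] = 8                                                # same box
top = candidates(grid, 0, 1)   # {4, 5, 6, 9}
```

The row rules out 1, 2, 3 and 7, the column 2, 3 and 8, and the box 2, 3 and 8, which leaves 4, 5, 6 and 9.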

I also have a function that works out the maximum number of digits still present in any cell, and one to print the sudoku:

def maxdigits():
    # Largest number of candidates left in any cell (1 means solved).
    return max([max([len(col) for col in row]) for row in sudoku])

def printsudoku():
    # Pad every cell to the width of the longest candidate list.
    collength='{{:{}}}'.format(maxdigits())
    string = '\n'.join(
        [' '.join(
            [collength.format(''.join(
                ['{}'.format(s) for s in col]))
             for col in row])
         for row in sudoku])
    print(string + '\n')
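
The nested braces in printsudoku() are easy to misread, so here is a small sketch of what that format string does. The width of 6 is just an example value standing in for whatever maxdigits() returns.

```python
width = 6                              # stand-in for maxdigits()
collength = '{{:{}}}'.format(width)    # doubled braces come out as literal braces
padded = collength.format('946')       # strings are left-aligned and space-padded
```

Here collength ends up as the string '{:6}', so every cell is printed in a column six characters wide.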

Next I have a function getgroup() which gets a row or col or box from the sudoku. This function includes the location of each of the cells it has brought back:

def getgroup(rowcolbox, num):
    # Tag every cell with its position: (candidates, row, col).
    temp = [[(sudoku[i][j], i, j)
             for j in range(9) ]
            for i in range(9)]
    if rowcolbox=='row':
        return temp[num]
    elif rowcolbox=='col':
        return [row[num] for row in temp]
    elif rowcolbox == 'box':
        # Boxes are numbered 0-8, left to right, top to bottom.
        startrow = (num//3)*3
        startcol = (num%3)*3
        group = [row[startcol:startcol+3]
                 for row in temp[startrow:startrow+3]]
        return group[0] + group[1] + group[2]
    else:
        raise ValueError("rowcolbox must be 'row', 'col' or 'box'")
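
The box numbering in getgroup() comes from integer division and remainder by 3. As a sketch, a hypothetical helper box_cells() (not part of the solver) lists the positions each box number covers:

```python
def box_cells(num):
    """(row, col) pairs of box num, numbered 0-8 left to right, top to bottom."""
    startrow = (num // 3) * 3   # 0, 3 or 6
    startcol = (num % 3) * 3    # 0, 3 or 6
    return [(r, c)
            for r in range(startrow, startrow + 3)
            for c in range(startcol, startcol + 3)]

centre = box_cells(4)   # rows 3-5, columns 3-5
```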

Next I have a function for step 3 above, i.e. it goes through a group (a row, column or box) and, for each combination of ‘size’ cells, if there are only ‘size’ different numbers between them, it deletes those numbers from every other cell in the group.

import itertools

def cleanarray(rowcolbox, num, size, audit=1):
    global sudoku
    array = getgroup(rowcolbox, num)
    # Every way of choosing `size` cells from the group...
    combs = list(itertools.combinations(array, size))
    # ...and the set of digits each choice holds between its cells.
    combs2 = [set.union(*[set(c[0])
                          for c in comb])
              for comb in combs]
    combs3 = [len(comb) for comb in combs2]
    # Fewer digits than cells would mean the grid is contradictory.
    assert min(combs3)>=size
    combs3 = [(comb==size) for comb in combs3]
    for i in range(len(combs)):
        if combs3[i]==1:
            vals = list(combs2[i])
            inlocs = [(comb[1], comb[2]) for comb in combs[i]]
            for item in array:
                if (item[1],item[2]) not in inlocs:
                    for val in vals:
                        if val in item[0]:
                            if audit>=1:
                                print('{} cannot be in cell {},{} '
                                      'because it is in group {} in '
                                      'that {} '.format(val,
                                                        item[1]+1,
                                                        item[2]+1,
                                                        combs2[i],
                                                        rowcolbox))
                            sudoku[item[1]][item[2]].remove(val)
                            if audit>=2:
                                printsudoku()

I wrap this function in a calling function cleanarrays():

def cleanarrays(maxsize, audit=1):
    for size in range(1, maxsize+1):
        for iteration in range(10):
            if maxdigits()==1:
                break
            else:
                for num in range(9):
                    cleanarray('row', num, size, audit)
                    cleanarray('col', num, size, audit)
                    cleanarray('box', num, size, audit)

Finally, I have written some code for step 4. It is pretty ugly, sorry.

def cleanboxes(audit=1):
    global sudoku
    # Columns: if a digit's candidates within a box all sit in one column,
    # remove it from that column in the other two boxes of the stack.
    rowcolbox = 'col'
    temp = [
        [sudoku[0][col] + sudoku[1][col] + sudoku[2][col],
         sudoku[3][col] + sudoku[4][col] + sudoku[5][col],
         sudoku[6][col] + sudoku[7][col] + sudoku[8][col]]
        for col in range(9)]
    for num in range(1,10):  # digits 1 to 9
        tempfornum = [[int(num in col) for col in row] for row in temp]
        for i in range(3):
            t = tempfornum[3*i:3*i+3]
            for row in range(3):
                for col in range(3):
                    if (t[0][col] + t[1][col] + t[2][col] - t[row][col] == 0):
                        for col2 in range(3):
                            if col2 != col:
                                if t[row][col2] == 1:
                                    for i2 in range(3):
                                        if num in sudoku[col2*3 + i2][3*i + row]:
                                            if audit >= 1:
                                                print('{} cannot be in {},{} as it is needed in another block in this {}'.format(num, col2*3 + i2 + 1, 3*i + row + 1, rowcolbox ))
                                            sudoku[col2*3 + i2][3*i + row].remove(num)
                                            if audit >= 2:
                                                printsudoku()
    # Rows: the same idea with rows and columns swapped.
    rowcolbox = 'row'
    temp = [[row[0] + row[1] + row[2], row[3] + row[4] + row[5], row[6] + row[7] + row[8] ] for row in sudoku]
    for num in range(1,10):  # digits 1 to 9
        tempfornum = [[int(num in col) for col in row] for row in temp]
        for i in range(3):
            t = tempfornum[3*i:3*i+3]
            for row in range(3):
                for col in range(3):
                    if (t[0][col] + t[1][col] + t[2][col] - t[row][col] == 0):
                        for col2 in range(3):
                            if col2 != col:
                                if t[row][col2] == 1:
                                    for i2 in range(3):
                                        # 3*i + row is the row's index in the full grid.
                                        if num in sudoku[3*i + row][3*col2 + i2]:
                                            if audit>=1:
                                                # This message uses 0-based positions.
                                                print('remove {} from {},{} as it is needed in another block in this {}'.format(num, 3*i + row, 3*col2 + i2, rowcolbox ))
                                            sudoku[3*i + row][3*col2 + i2].remove(num)
                                            if audit>=2:
                                                printsudoku()
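
The idea behind step 4 is easier to see for a single box and a single digit. The sketch below uses a hypothetical helper confined_lines(), which is not part of the solver, and fills in only the top-right box, using the candidates shown for it in the output further down after the first cleanarrays() pass.

```python
def confined_lines(grid, boxrow, boxcol, digit):
    """Rows/columns that hold every candidate position of digit in one box.

    grid is a 9x9 list of candidate lists; boxrow and boxcol run from 0 to 2.
    Returns a list of ('row', r) / ('col', c) with 0-based grid indexes.
    """
    cells = [(r, c)
             for r in range(3 * boxrow, 3 * boxrow + 3)
             for c in range(3 * boxcol, 3 * boxcol + 3)
             if digit in grid[r][c]]
    rows = {r for r, _ in cells}
    cols = {c for _, c in cells}
    lines = []
    if len(rows) == 1:
        lines.append(('row', rows.pop()))
    if len(cols) == 1:
        lines.append(('col', cols.pop()))
    return lines

# Only the top-right box is filled in; every other cell is left empty.
grid = [[[] for _ in range(9)] for _ in range(9)]
grid[0][6:9] = [[4, 5, 6], [3], [8, 4, 5]]
grid[1][6:9] = [[1, 2, 4, 7], [2, 4, 7, 8], [2, 4, 8]]
grid[2][6:9] = [[9], [6, 7], [2, 4, 5]]

ones = confined_lines(grid, 0, 2, 1)    # 1 only appears in one cell of the box
sixes = confined_lines(grid, 0, 2, 6)   # 6 is spread over two rows and columns
```

Here 1 is confined to the second row of the box, so it can be removed from the rest of that row, which is the "remove 1 from 1,2" step in the output.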

So, when I want to run this on a sudoku, I run the following lines of code:

preparesudoku()
printsudoku()
for j in range(4):
    cleanarrays(maxsize=5)
    printsudoku()
    if maxdigits()==1:
        break
    cleanboxes()
    printsudoku()
    if maxdigits()==1:
        break

This outputs:

2 9456 946 1 7 895 456 3 845
1457 3 1497 24589 8924 6 12457 24578 12458
14567 14567 8 3 24 25 9 24567 1245
9 4567 467 24578 8234 23578 23456 1 23458
3 145 14 24589 6 8925 245 8245 7
145678 2 1467 8457 8134 8357 3456 8456 9
167 1796 5 927 923 4 8 27 123
147 147 12347 6 823 8237 123457 9 12345
47 8 23479 927 5 1 2347 247 6
1 cannot be in cell 7,1 because it is in group {1, 4, 7} in that box
7 cannot be in cell 7,1 because it is in group {1, 4, 7} in that box
1 cannot be in cell 7,2 because it is in group {1, 4, 7} in that box
7 cannot be in cell 7,2 because it is in group {1, 4, 7} in that box
1 cannot be in cell 8,3 because it is in group {1, 4, 7} in that box
4 cannot be in cell 8,3 because it is in group {1, 4, 7} in that box
7 cannot be in cell 8,3 because it is in group {1, 4, 7} in that box
4 cannot be in cell 9,3 because it is in group {1, 4, 7} in that box
7 cannot be in cell 9,3 because it is in group {1, 4, 7} in that box
6 cannot be in cell 3,1 because it is in group {9, 2, 6} in that col
6 cannot be in cell 6,1 because it is in group {9, 2, 6} in that col
9 cannot be in cell 7,4 because it is in group {9, 5, 6} in that row
9 cannot be in cell 7,5 because it is in group {9, 5, 6} in that row
6 cannot be in cell 7,2 because it is in group {4, 5, 6} in that row
9 cannot be in cell 9,3 because it is in group {9, 5, 6} in that box
9 cannot be in cell 1,2 because it is in group {9, 2, 3} in that col
2 cannot be in cell 7,5 because it is in group {2, 6, 7} in that row
2 cannot be in cell 7,9 because it is in group {2, 6, 7} in that row
3 cannot be in cell 7,9 because it is in group {2, 3, 7} in that row
3 cannot be in cell 8,5 because it is in group {3, 4, 6} in that box
3 cannot be in cell 8,6 because it is in group {3, 4, 6} in that box
1 cannot be in cell 2,9 because it is in group {9, 1, 7} in that col
1 cannot be in cell 3,9 because it is in group {9, 1, 7} in that col
1 cannot be in cell 8,9 because it is in group {9, 1, 7} in that col
1 cannot be in cell 8,7 because it is in group {8, 1, 9} in that box
4 cannot be in cell 3,1 because it is in group {2, 4, 5} in that row
5 cannot be in cell 3,1 because it is in group {2, 4, 5} in that row
4 cannot be in cell 3,2 because it is in group {2, 4, 5} in that row
5 cannot be in cell 3,2 because it is in group {2, 4, 5} in that row
2 cannot be in cell 3,8 because it is in group {2, 4, 5} in that row
4 cannot be in cell 3,8 because it is in group {2, 4, 5} in that row
5 cannot be in cell 3,8 because it is in group {2, 4, 5} in that row
3 cannot be in cell 4,5 because it is in group {3, 6, 7} in that col
3 cannot be in cell 6,5 because it is in group {3, 6, 7} in that col
2 cannot be in cell 9,4 because it is in group {2, 7, 8} in that box
7 cannot be in cell 9,4 because it is in group {2, 7, 8} in that box
1 cannot be in cell 2,1 because it is in group {1, 4, 7} in that col
4 cannot be in cell 2,1 because it is in group {1, 4, 7} in that col
7 cannot be in cell 2,1 because it is in group {1, 4, 7} in that col
1 cannot be in cell 6,1 because it is in group {1, 4, 7} in that col
4 cannot be in cell 6,1 because it is in group {1, 4, 7} in that col
7 cannot be in cell 6,1 because it is in group {1, 4, 7} in that col
5 cannot be in cell 1,2 because it is in group {2, 3, 5} in that box
5 cannot be in cell 2,4 because it is in group {3, 5, 6} in that row
5 cannot be in cell 2,7 because it is in group {3, 5, 6} in that row
5 cannot be in cell 2,8 because it is in group {3, 5, 6} in that row
5 cannot be in cell 2,9 because it is in group {3, 5, 6} in that row
9 cannot be in cell 2,4 because it is in group {1, 3, 9} in that col
9 cannot be in cell 5,4 because it is in group {1, 3, 9} in that col
2 cannot be in cell 2,5 because it is in group {2, 4, 8} in that col
4 cannot be in cell 2,5 because it is in group {2, 4, 8} in that col
8 cannot be in cell 2,5 because it is in group {2, 4, 8} in that col
4 cannot be in cell 6,5 because it is in group {2, 4, 8} in that col
8 cannot be in cell 6,5 because it is in group {2, 4, 8} in that col
1 cannot be in cell 6,3 because it is in group {1, 2, 9} in that row
5 cannot be in cell 6,1 because it is in group {9, 2, 5} in that col
9 cannot be in cell 2,3 because it is in group {9, 3, 5} in that row
9 cannot be in cell 1,6 because it is in group {1, 9, 7} in that box
8 cannot be in cell 6,4 because it is in group {8, 1, 2} in that row
8 cannot be in cell 6,6 because it is in group {8, 1, 2} in that row
8 cannot be in cell 6,8 because it is in group {8, 1, 2} in that row
4 cannot be in cell 1,3 because it is in group {4, 5, 6, 8} in that row
6 cannot be in cell 1,3 because it is in group {4, 5, 6, 8} in that row
2 cannot be in cell 5,6 because it is in group {1, 2, 4, 5, 8} in that row
5 cannot be in cell 5,6 because it is in group {1, 2, 4, 5, 8} in that row
8 cannot be in cell 5,6 because it is in group {1, 2, 4, 5, 8} in that row
2 46 9 1 7 85 456 3 845
5 3 147 248 9 6 1247 2478 248
17 167 8 3 24 25 9 67 245
9 4567 467 24578 824 23578 23456 1 23458
3 145 14 2458 6 9 245 8245 7
8 2 467 457 1 357 3456 456 9
6 9 5 27 3 4 8 27 1
147 147 23 6 82 827 23457 9 2345
47 8 23 9 5 1 2347 247 6
5 cannot be in 4,6 as it is needed in another block in this col
5 cannot be in 6,6 as it is needed in another block in this col
6 cannot be in 4,2 as it is needed in another block in this col
remove 1 from 1,2 as it is needed in another block in this row
2 46 9 1 7 85 456 3 845
5 3 47 248 9 6 1247 2478 248
17 167 8 3 24 25 9 67 245
9 457 467 24578 824 2378 23456 1 23458
3 145 14 2458 6 9 245 8245 7
8 2 467 457 1 37 3456 456 9
6 9 5 27 3 4 8 27 1
147 147 23 6 82 827 23457 9 2345
47 8 23 9 5 1 2347 247 6
4 cannot be in cell 5,3 because it is in group {4, 6, 7} in that col
1 cannot be in cell 5,2 because it is in group {9, 3, 1} in that box
2 cannot be in cell 2,7 because it is in group {2, 4, 7, 8} in that row
4 cannot be in cell 2,7 because it is in group {2, 4, 7, 8} in that row
7 cannot be in cell 2,7 because it is in group {2, 4, 7, 8} in that row
2 46 9 1 7 85 456 3 845
5 3 47 248 9 6 1 2478 248
17 167 8 3 24 25 9 67 245
9 457 467 24578 824 2378 23456 1 23458
3 45 1 2458 6 9 245 8245 7
8 2 467 457 1 37 3456 456 9
6 9 5 27 3 4 8 27 1
147 147 23 6 82 827 23457 9 2345
47 8 23 9 5 1 2347 247 6
7 cannot be in 7,8 as it is needed in another block in this col
7 cannot be in 9,8 as it is needed in another block in this col
2 46 9 1 7 85 456 3 845
5 3 47 248 9 6 1 2478 248
17 167 8 3 24 25 9 67 245
9 457 467 24578 824 2378 23456 1 23458
3 45 1 2458 6 9 245 8245 7
8 2 467 457 1 37 3456 456 9
6 9 5 27 3 4 8 2 1
147 147 23 6 82 827 23457 9 2345
47 8 23 9 5 1 2347 24 6
2 cannot be in cell 7,4 because it is in group {2} in that row
2 cannot be in cell 2,8 because it is in group {2} in that col
2 cannot be in cell 5,8 because it is in group {2} in that col
2 cannot be in cell 9,8 because it is in group {2} in that col
7 cannot be in cell 8,6 because it is in group {7} in that box
4 cannot be in cell 9,1 because it is in group {4} in that row
4 cannot be in cell 9,7 because it is in group {4} in that row
2 cannot be in cell 8,7 because it is in group {2} in that box
2 cannot be in cell 8,9 because it is in group {2} in that box
2 cannot be in cell 9,7 because it is in group {2} in that box
4 cannot be in cell 8,7 because it is in group {4} in that box
4 cannot be in cell 8,9 because it is in group {4} in that box
7 cannot be in cell 3,1 because it is in group {7} in that col
7 cannot be in cell 8,1 because it is in group {7} in that col
1 cannot be in cell 3,2 because it is in group {1} in that box
7 cannot be in cell 4,4 because it is in group {7} in that col
7 cannot be in cell 6,4 because it is in group {7} in that col
7 cannot be in cell 8,2 because it is in group {7} in that box
4 cannot be in cell 2,8 because it is in group {4} in that col
4 cannot be in cell 5,8 because it is in group {4} in that col
4 cannot be in cell 6,8 because it is in group {4} in that col
7 cannot be in cell 9,7 because it is in group {7} in that row
3 cannot be in cell 8,7 because it is in group {3} in that box
3 cannot be in cell 8,9 because it is in group {3} in that box
1 cannot be in cell 8,1 because it is in group {1} in that col
3 cannot be in cell 4,7 because it is in group {3} in that col
3 cannot be in cell 6,7 because it is in group {3} in that col
4 cannot be in cell 8,2 because it is in group {4} in that box
5 cannot be in cell 8,7 because it is in group {5} in that row
3 cannot be in cell 9,3 because it is in group {3} in that row
5 cannot be in cell 1,9 because it is in group {5} in that col
5 cannot be in cell 3,9 because it is in group {5} in that col
5 cannot be in cell 4,9 because it is in group {5} in that col
2 cannot be in cell 8,3 because it is in group {2} in that col
2 cannot be in cell 3,6 because it is in group {2, 4} in that row
8 cannot be in cell 4,6 because it is in group {8, 5} in that col
8 cannot be in cell 8,6 because it is in group {8, 5} in that col
5 cannot be in cell 1,6 because it is in group {5, 6} in that col
2 cannot be in cell 8,5 because it is in group {2, 4} in that row
8 cannot be in cell 1,9 because it is in group {8, 2} in that row
4 cannot be in cell 1,2 because it is in group {8, 4} in that row
4 cannot be in cell 1,7 because it is in group {8, 4} in that row
6 cannot be in cell 3,2 because it is in group {2, 6} in that box
7 cannot be in cell 2,3 because it is in group {6, 7} in that box
4 cannot be in cell 2,4 because it is in group {4, 5} in that row
4 cannot be in cell 2,9 because it is in group {4, 5} in that row
7 cannot be in cell 4,2 because it is in group {6, 7} in that col
8 cannot be in cell 2,4 because it is in group {8, 1} in that box
2 cannot be in cell 3,5 because it is in group {8, 2} in that box
7 cannot be in cell 3,8 because it is in group {1, 7} in that row
4 cannot be in cell 3,9 because it is in group {1, 4} in that row
4 cannot be in cell 4,3 because it is in group {9, 4} in that col
4 cannot be in cell 6,3 because it is in group {9, 4} in that col
6 cannot be in cell 1,7 because it is in group {3, 6} in that box
2 cannot be in cell 2,9 because it is in group {2, 3} in that box
8 cannot be in cell 2,8 because it is in group {8, 2} in that box
2 cannot be in cell 4,4 because it is in group {1, 2} in that col
2 cannot be in cell 5,4 because it is in group {1, 2} in that col
4 cannot be in cell 4,5 because it is in group {4, 7} in that col
8 cannot be in cell 4,5 because it is in group {8, 7} in that col
2 cannot be in cell 4,6 because it is in group {2, 6} in that box
5 cannot be in cell 4,7 because it is in group {1, 5} in that col
5 cannot be in cell 5,7 because it is in group {1, 5} in that col
5 cannot be in cell 6,7 because it is in group {1, 5} in that col
6 cannot be in cell 6,8 because it is in group {3, 6} in that col
5 cannot be in cell 5,8 because it is in group {5, 6} in that col
8 cannot be in cell 4,9 because it is in group {8, 4} in that col
4 cannot be in cell 4,9 because it is in group {8, 4} in that col
2 cannot be in cell 4,9 because it is in group {2, 4} in that col
2 cannot be in cell 4,7 because it is in group {9, 2} in that row
3 cannot be in cell 4,6 because it is in group {9, 3} in that row
7 cannot be in cell 4,3 because it is in group {3, 7} in that row
6 cannot be in cell 6,3 because it is in group {9, 6} in that box
8 cannot be in cell 5,4 because it is in group {8, 3} in that row
7 cannot be in cell 6,6 because it is in group {2, 7} in that box
4 cannot be in cell 4,4 because it is in group {4, 5} in that box
5 cannot be in cell 4,4 because it is in group {4, 5} in that box
5 cannot be in cell 6,4 because it is in group {8, 5} in that row
4 cannot be in cell 6,7 because it is in group {4, 5} in that row
4 cannot be in cell 5,7 because it is in group {4, 6} in that box
6 cannot be in cell 4,7 because it is in group {1, 6} in that box
4 cannot be in cell 4,2 because it is in group {9, 4} in that row
4 cannot be in cell 5,4 because it is in group {1, 4} in that col
5 cannot be in cell 5,2 because it is in group {9, 5} in that box
2 6 9 1 7 8 5 3 4
5 3 4 2 9 6 1 7 8
1 7 8 3 4 5 9 6 2
9 5 6 8 2 7 4 1 3
3 4 1 5 6 9 2 8 7
8 2 7 4 1 3 6 5 9
6 9 5 7 3 4 8 2 1
4 1 3 6 8 2 7 9 5
7 8 2 9 5 1 3 4 6
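
A finished grid is easy to check mechanically. The sketch below is not part of the solver; is_valid_solution() is a hypothetical helper that confirms every row, column and box of the final grid above contains the digits 1 to 9 exactly once.

```python
solution = [
    [2, 6, 9, 1, 7, 8, 5, 3, 4],
    [5, 3, 4, 2, 9, 6, 1, 7, 8],
    [1, 7, 8, 3, 4, 5, 9, 6, 2],
    [9, 5, 6, 8, 2, 7, 4, 1, 3],
    [3, 4, 1, 5, 6, 9, 2, 8, 7],
    [8, 2, 7, 4, 1, 3, 6, 5, 9],
    [6, 9, 5, 7, 3, 4, 8, 2, 1],
    [4, 1, 3, 6, 8, 2, 7, 9, 5],
    [7, 8, 2, 9, 5, 1, 3, 4, 6],
]

def is_valid_solution(grid):
    """True if every row, column and 3x3 box holds exactly the digits 1-9."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [{row[c] for row in grid} for c in range(9)]
    boxes = [{grid[r][c]
              for r in range(br, br + 3)
              for c in range(bc, bc + 3)}
             for br in (0, 3, 6) for bc in (0, 3, 6)]
    return all(group == digits for group in rows + cols + boxes)

ok = is_valid_solution(solution)   # True for the grid above
```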

I’ve tried this algorithm on a few sudokus on http://websudoku.com/, even evil level ones, and it has managed to solve them all. Obviously that doesn’t guarantee it will work on every sudoku. Do let me know if you find one that you think should be solvable with pure logic like this, that my algorithm isn’t able to solve.

And please don’t judge me on my Python code; I mainly just wanted something to show someone, and figured other people might be interested. I have also included the code in my public GitHub repository.
