So this entire post is based on a puzzle. I've got this one-a-day puzzle book, I suppose it's intended at the general brain-workout kind of mind set- something to get you rolling in the morning, keep you sharp a little bit. Naturally, I ignore the one-daily part and do them as batteries of several in a row, whenever I feel like doing puzzles!

What I like most about this specific book is that it has a nice mix of familiar, kind of standard, workout puzzles- Sudoku, word transformations and the like; and also a bunch of unique, novel puzzles as well. Like this one pictured here- the twelve marked cells each take a number 1-12, inclusive, with each number used exactly once.

The goal is to make the numbers on every diagonal sum to 26. It turns out it's relatively easy to solve by making a couple guesses and then filling in the remaining cells deductively. However, this was insufficiently interesting to me, and I naturally had to probe deeper. I enjoy working with deductive algorithms, and this seemed like a great problem to apply such a program to!

Let's say we start by labeling all the slots with letters, A-L as in the attached image. Now, the shape has 6 fold symmetry, so we can settle with one point at 'North' without losing generality. We could assign each slot a value combinatorially, but then we would have 12! total possibilities to check, which is too many.

Instead, we'll look at ways of using the structure of the problem to narrow this combinatorial check down a little.

We'll start by noting the sums that must be satisfied:

A+C+F+H = 26

A+D+G+K = 26

H+I+J+K = 26

B+C+D+E = 26

B+F+I+L = 26

L+J+G+E = 26

Now, this is a set of 6 equations in 12 unknowns, and cannot be solved deterministically. However, we can observe that B+C+D+E and H+I+J+K have no numbers in common. Therefore, we can assign 8 numbers directly and check if each of these chains equals 26. This can be done without evaluating any configuration of A, F, G, or L, the remaining 4 numbers. In fact, we can even break this down further- we can pick 4 numbers for (B,C,D,E) and then pass along if they do not sum to 26. If they do, we can then select 4 more numbers for (H,I,J,K) until we find a set summing to 26. By eliminating invalid sums as early as possible, we are effectively removing all probes associated with that configuration, so the fewer options we allocate before checking, the more of the 12! total possibilities we can throw away, hence breaking the puzzle down into the minimal number of assignments necessary.

For instance, by picking (B,C,D,E), we create an initial subspace for exploration which contains K*(12!/8!) entries, where K is the proportion of valid choices for those first 4 cells, assumed small. K is originally unknown, but by brute force examination, we determine it to be 6.67%. I'm going to interject here and simplify these representations- for notational clarity, let's define a function F(M,N) as M!/N!. Let's also represent line lists by contracting (A,B,C,D) to ABCD. Then we can write this subspace size as K*F(12,8). This is meaningful, because for all of the (1-K)*F(12,8) allocations rejected, we save 8! total configuration checks- which works out to a reduction by (1-K)*(12!). Considering how small K is, this is a sizable gain.

We can also notice that, due to symmetry, a total BCDE solution corresponds to a given HIJK solution by applying a vertical mirror transformation. Therefore, we can proceed by picking HIJK from the candidates identified by the BCDE search without losing generality. Because we know that HIJK will share no entries with BCDE, we can dramatically improve our selection within this subspace: the set of 4-long allocations is K*F(12,8) entries large. If we picked BCDE and HIJK from all pairs, we'd have (K*F(12,8))*(K*F(12,8)-1) choices. If we reject all pairs which share any elements, then we'll reduce this number.

Consider evaluating for a given BCDE. Then HIJK should not contain any of [B,C,D,E]. Let's call this valid ratio J. The exact counts will depend on the specific BCDE combination, but looking at the probabilistic evaluation, this means we'll have about 1 - (2/3)^4 invalid combinations (the probability of randomly picking HIJK from the 12 potential cases such that no elements are duplicated). Now, this won't be precise, since the nature of the effect of the sum constraint on the subspace means the 'choice' of HIJK is not uniformly distributed, but this is a ballpark estimate, and suggests that about 19.7% of such HIJK possibilities will be valid. Empirical examination informs us that on average, we actually have only 17.6% valid options in the subspace.

Now, once we've got our list of valid combinations- K*F(12,8) sized subspace, with J*K*F(12,8) HIJK possibilities per BCDE choice, for a J*K^2*F(12,8)^2 sized space. This is approximately 0.023% the size of the original 12! space. From here, we can look at minimal selections for the remaining cells: A, F, G and L. We have 4 remaining possible entries. Because of the rotational and mirror symmetry, we need only to assign one of these values to any of the remaining cells, and the remaining cells are deterministically set by computing the value necessary to fill the slot. For instance, trying A, we then can compute F from ACFH, then L from BFIL, and finally G from LJGE. Then, once computed from these specific lines, we can check their validity against the other lines, to see if a consistent answer exists. This gives us an additional 4 checks per (BCDE,HIJK) pair, for 4*J*K2*F(12,8)2 total possibilities- 0.093% of the original space. Pretty significant savings.

However, there is one more observation we can make to gain considerable improvements. We note that there is that there is an interior and an exterior set of rings, in addition to the 12-fold (mirror and rotational) symmetry. We can note that if we move a cell's entry (say D) from inside to outside (or vise versa), then we can reconstruct, with mirror flips and rotations of the other triangle, a new valid organization (this is not a proof, of course, but a loose explanation). This means we can take one value, say 12, and remove it from the permutative chain all together- we automatically assign it to A, for one battery, and then to C for the next.. This reduces all are combinative sub-problems in complexity drastically, giving us two separate batteries in which all factorials have their arguments reduced by 1. This results in a final search space of: 8*J*K2*F(11,7)2- a very satisfying 0.082% of the original search space. It takes my computer, in python, 169 milliseconds to find my first solution: [8, 1, 2, 11, 12, 9, 3, 7, 10, 5, 4, 6].

One last observation: Why 26? 12+1 = 13 = 11+2 = 10+3 = ... Since each line has two such pairs, the average sum of any 4 will be 26. If we then imagine altering any line such that the sum is different, a total equal to the difference between 26 and that new sum must be distributed among the other lines. Consequently, 26 should, statistically speaking, have the greatest number of potential solutions. This is of course, not a proof, but more like a sketch of an approach to a proof.

However, note that the very minimum sum is 10 (1+2+3+4), and the maximum is 42 (12+11+10+9), and each of these has exactly 1 valid combination. We need at least 6 (in theory, practically speaking, likely very many more) to fill the star. We can, using the same logic as described above, but with a check of sum to each trial value, evaluate total solutions for each number, 11 to 41, check all possibilities and find out which has the most, or even if there are solutions to other totals. For statistical completeness, we'll eliminate the inner/outer cut off, and total all solutions.

My hypothesis was that only a sum of 26 would be soluble, since the difference that becomes distributed ought not be able to be resolved, as it would create more variance. Again, not proof, just a hunch. Upon running this spanning experiment, we find that sum to 26 has 800 possibilities (up to symmetry), and no other possible sum has any, confirming the suspicion. Because This covers all sums possible within 1-12 range, this does constitute a firm answer. Take that, puzzle!

What I like most about this specific book is that it has a nice mix of familiar, kind of standard, workout puzzles- Sudoku, word transformations and the like; and also a bunch of unique, novel puzzles as well. Like this one pictured here- the twelve marked cells each take a number 1-12, inclusive, with each number used exactly once.

The goal is to make the numbers on every diagonal sum to 26. It turns out it's relatively easy to solve by making a couple guesses and then filling in the remaining cells deductively. However, this was insufficiently interesting to me, and I naturally had to probe deeper. I enjoy working with deductive algorithms, and this seemed like a great problem to apply such a program to!

Let's say we start by labeling all the slots with letters, A-L as in the attached image. Now, the shape has 6 fold symmetry, so we can settle with one point at 'North' without losing generality. We could assign each slot a value combinatorially, but then we would have 12! total possibilities to check, which is too many.

Instead, we'll look at ways of using the structure of the problem to narrow this combinatorial check down a little.

We'll start by noting the sums that must be satisfied:

A+C+F+H = 26

A+D+G+K = 26

H+I+J+K = 26

B+C+D+E = 26

B+F+I+L = 26

L+J+G+E = 26

Now, this is a set of 6 equations in 12 unknowns, and cannot be solved deterministically. However, we can observe that B+C+D+E and H+I+J+K have no numbers in common. Therefore, we can assign 8 numbers directly and check if each of these chains equals 26. This can be done without evaluating any configuration of A, F, G, or L, the remaining 4 numbers. In fact, we can even break this down further- we can pick 4 numbers for (B,C,D,E) and then pass along if they do not sum to 26. If they do, we can then select 4 more numbers for (H,I,J,K) until we find a set summing to 26. By eliminating invalid sums as early as possible, we are effectively removing all probes associated with that configuration, so the fewer options we allocate before checking, the more of the 12! total possibilities we can throw away, hence breaking the puzzle down into the minimal number of assignments necessary.

For instance, by picking (B,C,D,E), we create an initial subspace for exploration which contains K*(12!/8!) entries, where K is the proportion of valid choices for those first 4 cells, assumed small. K is originally unknown, but by brute force examination, we determine it to be 6.67%. I'm going to interject here and simplify these representations- for notational clarity, let's define a function F(M,N) as M!/N!. Let's also represent line lists by contracting (A,B,C,D) to ABCD. Then we can write this subspace size as K*F(12,8). This is meaningful, because for all of the (1-K)*F(12,8) allocations rejected, we save 8! total configuration checks- which works out to a reduction by (1-K)*(12!). Considering how small K is, this is a sizable gain.

We can also notice that, due to symmetry, a total BCDE solution corresponds to a given HIJK solution by applying a vertical mirror transformation. Therefore, we can proceed by picking HIJK from the candidates identified by the BCDE search without losing generality. Because we know that HIJK will share no entries with BCDE, we can dramatically improve our selection within this subspace: the set of 4-long allocations is K*F(12,8) entries large. If we picked BCDE and HIJK from all pairs, we'd have (K*F(12,8))*(K*F(12,8)-1) choices. If we reject all pairs which share any elements, then we'll reduce this number.

Consider evaluating for a given BCDE. Then HIJK should not contain any of [B,C,D,E]. Let's call this valid ratio J. The exact counts will depend on the specific BCDE combination, but looking at the probabilistic evaluation, this means we'll have about 1 - (2/3)^4 invalid combinations (the probability of randomly picking HIJK from the 12 potential cases such that no elements are duplicated). Now, this won't be precise, since the nature of the effect of the sum constraint on the subspace means the 'choice' of HIJK is not uniformly distributed, but this is a ballpark estimate, and suggests that about 19.7% of such HIJK possibilities will be valid. Empirical examination informs us that on average, we actually have only 17.6% valid options in the subspace.

Now, once we've got our list of valid combinations- K*F(12,8) sized subspace, with J*K*F(12,8) HIJK possibilities per BCDE choice, for a J*K^2*F(12,8)^2 sized space. This is approximately 0.023% the size of the original 12! space. From here, we can look at minimal selections for the remaining cells: A, F, G and L. We have 4 remaining possible entries. Because of the rotational and mirror symmetry, we need only to assign one of these values to any of the remaining cells, and the remaining cells are deterministically set by computing the value necessary to fill the slot. For instance, trying A, we then can compute F from ACFH, then L from BFIL, and finally G from LJGE. Then, once computed from these specific lines, we can check their validity against the other lines, to see if a consistent answer exists. This gives us an additional 4 checks per (BCDE,HIJK) pair, for 4*J*K2*F(12,8)2 total possibilities- 0.093% of the original space. Pretty significant savings.

However, there is one more observation we can make to gain considerable improvements. We note that there is that there is an interior and an exterior set of rings, in addition to the 12-fold (mirror and rotational) symmetry. We can note that if we move a cell's entry (say D) from inside to outside (or vise versa), then we can reconstruct, with mirror flips and rotations of the other triangle, a new valid organization (this is not a proof, of course, but a loose explanation). This means we can take one value, say 12, and remove it from the permutative chain all together- we automatically assign it to A, for one battery, and then to C for the next.. This reduces all are combinative sub-problems in complexity drastically, giving us two separate batteries in which all factorials have their arguments reduced by 1. This results in a final search space of: 8*J*K2*F(11,7)2- a very satisfying 0.082% of the original search space. It takes my computer, in python, 169 milliseconds to find my first solution: [8, 1, 2, 11, 12, 9, 3, 7, 10, 5, 4, 6].

One last observation: Why 26? 12+1 = 13 = 11+2 = 10+3 = ... Since each line has two such pairs, the average sum of any 4 will be 26. If we then imagine altering any line such that the sum is different, a total equal to the difference between 26 and that new sum must be distributed among the other lines. Consequently, 26 should, statistically speaking, have the greatest number of potential solutions. This is of course, not a proof, but more like a sketch of an approach to a proof.

However, note that the very minimum sum is 10 (1+2+3+4), and the maximum is 42 (12+11+10+9), and each of these has exactly 1 valid combination. We need at least 6 (in theory, practically speaking, likely very many more) to fill the star. We can, using the same logic as described above, but with a check of sum to each trial value, evaluate total solutions for each number, 11 to 41, check all possibilities and find out which has the most, or even if there are solutions to other totals. For statistical completeness, we'll eliminate the inner/outer cut off, and total all solutions.

My hypothesis was that only a sum of 26 would be soluble, since the difference that becomes distributed ought not be able to be resolved, as it would create more variance. Again, not proof, just a hunch. Upon running this spanning experiment, we find that sum to 26 has 800 possibilities (up to symmetry), and no other possible sum has any, confirming the suspicion. Because This covers all sums possible within 1-12 range, this does constitute a firm answer. Take that, puzzle!