最近我一直在iPhone上玩一款名为《Scramble》的游戏。有些人可能知道这个游戏叫拼字游戏。从本质上讲,当游戏开始时,你会得到一个字母矩阵:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

(来源:boggled.org)

不幸的是,我不太擅长算法或它们的效率等等。我的第一次尝试使用一个像这样的字典(约2.3MB),并进行线性搜索,试图匹配字典条目的组合。这需要花费很长时间来找到可能的单词,因为你每轮只有2分钟的时间,这是不够的。

我很有兴趣看看是否有任何Stackoverflowers可以提出更有效的解决方案。我主要是在寻找使用三大p的解决方案:Python、PHP和Perl,尽管任何使用Java或c++的东西也很酷,因为速度是至关重要的。

目前的解决方案:

Adam Rosenfield, Python, ~20岁 John Fouhy, Python, ~3秒 Kent Fredric, Perl, ~1s Darius Bacon, Python, ~1s rvarcher, VB。净,~ 1 s Paolo Bergantino, PHP(实时链接),~5s(本地~2s)


当前回答

下面是使用NLTK工具包中的预定义单词的解决方案 NLTK有NLTK。语料库包,我们有一个叫做单词的包,它包含超过20万个英语单词,你可以简单地把它们都用到你的程序中。

一旦创建你的矩阵转换成一个字符数组,并执行这段代码

import nltk
from nltk.corpus import words
from collections import Counter

def possibleWords(input, charSet):
    for word in input:
        dict = Counter(word)
        flag = 1
        for key in dict.keys():
            if key not in charSet:
                flag = 0
        if flag == 1 and len(word)>5: #its depends if you want only length more than 5 use this otherwise remove that one. 
            print(word)


nltk.download('words')
word_list = words.words()
# prints 236736
print(len(word_list))
charSet = ['h', 'e', 'l', 'o', 'n', 'v', 't']
possibleWords(word_list, charSet)

输出:

eleven
eleventh
elevon
entente
entone
ethene
ethenol
evolve
evolvent
hellhole
helvell
hooven
letten
looten
nettle
nonene
nonent
nonlevel
notelet
novelet
novelette
novene
teenet
teethe
teevee
telethon
tellee
tenent
tentlet
theelol
toetoe
tonlet
toothlet
tootle
tottle
vellon
velvet
velveteen
venene
vennel
venthole
voeten
volent
volvelle
volvent
voteen

我希望你能得到它。

其他回答

一个Node.JS JavaScript解决方案。在不到一秒钟的时间内计算所有100个独特的单词,其中包括阅读字典文件(MBA 2012)。

Output: ["FAM","TUX","TUB","FAE","ELI","ELM","ELB","TWA","TWA","SAW","AMI","SWA","SWA","AME","SEA","SEW","AES","AWL","AWE","SEA","AWA","MIX","MIL","AST","ASE","MAX","MAE","MAW","MEW","AWE","MES","AWL","LIE","LIM","AWA","AES","BUT","BLO","WAS","WAE","WEA","LEI","LEO","LOB","LOX","WEM","OIL","OLM","WEA","WAE","WAX","WAF","MILO","EAST","WAME","TWAS","TWAE","EMIL","WEAM","OIME","AXIL","WEST","TWAE","LIMB","WASE","WAST","BLEO","STUB","BOIL","BOLE","LIME","SAWT","LIMA","MESA","MEWL","AXLE","FAME","ASEM","MILE","AMIL","SEAX","SEAM","SEMI","SWAM","AMBO","AMLI","AXILE","AMBLE","SWAMI","AWEST","AWEST","LIMAX","LIMES","LIMBU","LIMBO","EMBOX","SEMBLE","EMBOLE","WAMBLE","FAMBLE"]

代码:

var fs = require('fs')

var Node = function(value, row, col) {
    this.value = value
    this.row = row
    this.col = col
}

var Path = function() {
    this.nodes = []
}

Path.prototype.push = function(node) {
    this.nodes.push(node)
    return this
}

Path.prototype.contains = function(node) {
    for (var i = 0, ii = this.nodes.length; i < ii; i++) {
        if (this.nodes[i] === node) {
            return true
        }
    }

    return false
}

Path.prototype.clone = function() {
    var path = new Path()
    path.nodes = this.nodes.slice(0)
    return path
}

Path.prototype.to_word = function() {
    var word = ''

    for (var i = 0, ii = this.nodes.length; i < ii; ++i) {
        word += this.nodes[i].value
    }

    return word
}

var Board = function(nodes, dict) {
    // Expects n x m array.
    this.nodes = nodes
    this.words = []
    this.row_count = nodes.length
    this.col_count = nodes[0].length
    this.dict = dict
}

Board.from_raw = function(board, dict) {
    var ROW_COUNT = board.length
      , COL_COUNT = board[0].length

    var nodes = []

    // Replace board with Nodes
    for (var i = 0, ii = ROW_COUNT; i < ii; ++i) {
        nodes.push([])
        for (var j = 0, jj = COL_COUNT; j < jj; ++j) {
            nodes[i].push(new Node(board[i][j], i, j))
        }
    }

    return new Board(nodes, dict)
}

Board.prototype.toString = function() {
    return JSON.stringify(this.nodes)
}

Board.prototype.update_potential_words = function(dict) {
    for (var i = 0, ii = this.row_count; i < ii; ++i) {
        for (var j = 0, jj = this.col_count; j < jj; ++j) {
            var node = this.nodes[i][j]
              , path = new Path()

            path.push(node)

            this.dfs_search(path)
        }
    }
}

Board.prototype.on_board = function(row, col) {
    return 0 <= row && row < this.row_count && 0 <= col && col < this.col_count
}

Board.prototype.get_unsearched_neighbours = function(path) {
    var last_node = path.nodes[path.nodes.length - 1]

    var offsets = [
        [-1, -1], [-1,  0], [-1, +1]
      , [ 0, -1],           [ 0, +1]
      , [+1, -1], [+1,  0], [+1, +1]
    ]

    var neighbours = []

    for (var i = 0, ii = offsets.length; i < ii; ++i) {
        var offset = offsets[i]
        if (this.on_board(last_node.row + offset[0], last_node.col + offset[1])) {

            var potential_node = this.nodes[last_node.row + offset[0]][last_node.col + offset[1]]
            if (!path.contains(potential_node)) {
                // Create a new path if on board and we haven't visited this node yet.
                neighbours.push(potential_node)
            }
        }
    }

    return neighbours
}

Board.prototype.dfs_search = function(path) {
    var path_word = path.to_word()

    if (this.dict.contains_exact(path_word) && path_word.length >= 3) {
        this.words.push(path_word)
    }

    var neighbours = this.get_unsearched_neighbours(path)

    for (var i = 0, ii = neighbours.length; i < ii; ++i) {
        var neighbour = neighbours[i]
        var new_path = path.clone()
        new_path.push(neighbour)

        if (this.dict.contains_prefix(new_path.to_word())) {
            this.dfs_search(new_path)
        }
    }
}

var Dict = function() {
    this.dict_array = []

    var dict_data = fs.readFileSync('./web2', 'utf8')
    var dict_array = dict_data.split('\n')

    for (var i = 0, ii = dict_array.length; i < ii; ++i) {
        dict_array[i] = dict_array[i].toUpperCase()
    }

    this.dict_array = dict_array.sort()
}

Dict.prototype.contains_prefix = function(prefix) {
    // Binary search
    return this.search_prefix(prefix, 0, this.dict_array.length)
}

Dict.prototype.contains_exact = function(exact) {
    // Binary search
    return this.search_exact(exact, 0, this.dict_array.length)
}

Dict.prototype.search_prefix = function(prefix, start, end) {
    if (start >= end) {
        // If no more place to search, return no matter what.
        return this.dict_array[start].indexOf(prefix) > -1
    }

    var middle = Math.floor((start + end)/2)

    if (this.dict_array[middle].indexOf(prefix) > -1) {
        // If we prefix exists, return true.
        return true
    } else {
        // Recurse
        if (prefix <= this.dict_array[middle]) {
            return this.search_prefix(prefix, start, middle - 1)
        } else {
            return this.search_prefix(prefix, middle + 1, end)
        }
    }
}

Dict.prototype.search_exact = function(exact, start, end) {
    if (start >= end) {
        // If no more place to search, return no matter what.
        return this.dict_array[start] === exact
    }

    var middle = Math.floor((start + end)/2)

    if (this.dict_array[middle] === exact) {
        // If we prefix exists, return true.
        return true
    } else {
        // Recurse
        if (exact <= this.dict_array[middle]) {
            return this.search_exact(exact, start, middle - 1)
        } else {
            return this.search_exact(exact, middle + 1, end)
        }
    }
}

var board = [
    ['F', 'X', 'I', 'E']
  , ['A', 'M', 'L', 'O']
  , ['E', 'W', 'B', 'X']
  , ['A', 'S', 'T', 'U']
]

var dict = new Dict()

var b = Board.from_raw(board, dict)
b.update_potential_words()
console.log(JSON.stringify(b.words.sort(function(a, b) {
    return a.length - b.length
})))

我认为你可能会花大部分时间去匹配那些不可能由你的字母网格构成的单词。所以,我要做的第一件事就是加快这一步,这应该能让你大致达到目的。

为此,我将把网格重新表示为一个可能的“移动”表,您可以根据您正在查看的字母转换对其进行索引。

首先从你的字母表中给每个字母分配一个数字(a =0, B=1, C=2,…等等)。

让我们举个例子:

h b c d
e e g h
l l k l
m o f p

现在,让我们使用现有字母的字母表(通常你可能每次都想使用相同的字母表):

 b | c | d | e | f | g | h | k | l | m |  o |  p
---+---+---+---+---+---+---+---+---+---+----+----
 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11

然后你创建一个2D布尔数组,告诉你是否有某个字母转换可用:

     |  0  1  2  3  4  5  6  7  8  9 10 11  <- from letter
     |  b  c  d  e  f  g  h  k  l  m  o  p
-----+--------------------------------------
 0 b |     T     T     T  T     
 1 c |  T     T  T     T  T
 2 d |     T           T  T
 3 e |  T  T     T     T  T  T  T
 4 f |                       T  T     T  T
 5 g |  T  T  T  T        T  T  T
 6 h |  T  T  T  T     T     T  T
 7 k |           T  T  T  T     T     T  T
 8 l |           T  T  T  T  T  T  T  T  T
 9 m |                          T     T
10 o |              T        T  T  T
11 p |              T        T  T
 ^
 to letter

现在浏览单词列表,将单词转换为过渡段:

hello (6, 3, 8, 8, 10):
6 -> 3, 3 -> 8, 8 -> 8, 8 -> 10

然后检查这些转换是否允许通过在你的表中查找它们:

[6][ 3] : T
[3][ 8] : T
[8][ 8] : T
[8][10] : T

如果它们都被允许,就有可能找到这个词。

例如,单词“头盔”可以在第4个转换(m到e:头盔)时被排除,因为表中的这个条目是假的。

单词hamster可以被排除,因为第一个(h到a)的转换是不允许的(在你的表中甚至不存在)。

现在,对于可能剩下的很少几个你没有删除的单词,试着按照你现在做的方法或在这里的其他答案中建议的方法在网格中找到它们。这是为了避免网格中相同字母之间的跳转导致的误报。例如,表格允许使用单词“help”,但网格不允许。

关于这个想法的一些进一步的性能改进技巧:

Instead of using a 2D array, use a 1D array and simply compute the index of the second letter yourself. So, instead of a 12x12 array like above, make a 1D array of length 144. If you then always use the same alphabet (i.e. a 26x26 = 676x1 array for the standard english alphabet), even if not all letters show up in your grid, you can pre-compute the indices into this 1D array that you need to test to match your dictionary words. For example, the indices for 'hello' in the example above would be hello (6, 3, 8, 8, 10): 42 (from 6 + 3x12), 99, 104, 128 -> "hello" will be stored as 42, 99, 104, 128 in the dictionary Extend the idea to a 3D table (expressed as a 1D array), i.e. all allowed 3-letter combinations. That way you can eliminate even more words immediately and you reduce the number of array lookups for each word by 1: For 'hello', you only need 3 array lookups: hel, ell, llo. It will be very quick to build this table, by the way, as there are only 400 possible 3-letter-moves in your grid. Pre-compute the indices of the moves in your grid that you need to include in your table. For the example above, you need to set the following entries to 'True': (0,0) (0,1) -> here: h, b : [6][0] (0,0) (1,0) -> here: h, e : [6][3] (0,0) (1,1) -> here: h, e : [6][3] (0,1) (0,0) -> here: b, h : [0][6] (0,1) (0,2) -> here: b, c : [0][1] . : Also represent your game grid in a 1-D array with 16 entries and have the table pre-computed in 3. contain the indices into this array.

我相信如果您使用这种方法,您可以让您的代码运行得非常快,如果您预先计算了字典并已经加载到内存中。

顺便说一句:如果你正在创造一款游戏,你可以在后台立即运行这些内容。在用户仍然盯着你的应用标题屏幕,并将手指放在按“Play”的位置时开始生成和解决第一款游戏。然后在用户玩前一款游戏时生成并解决下一款游戏。这应该会给您很多时间来运行代码。

(我喜欢这个问题,所以我可能会忍不住在未来几天的某个时候用Java实现我的提议,看看它实际上是如何执行的……一旦我这样做,我将在这里张贴代码。)

更新:

好的,我今天有一些时间在Java中实现了这个想法:

class DictionaryEntry {
  public int[] letters;
  public int[] triplets;
}

class BoggleSolver {

  // Constants
  final int ALPHABET_SIZE = 5;  // up to 2^5 = 32 letters
  final int BOARD_SIZE    = 4;  // 4x4 board
  final int[] moves = {-BOARD_SIZE-1, -BOARD_SIZE, -BOARD_SIZE+1, 
                                  -1,                         +1,
                       +BOARD_SIZE-1, +BOARD_SIZE, +BOARD_SIZE+1};


  // Technically constant (calculated here for flexibility, but should be fixed)
  DictionaryEntry[] dictionary; // Processed word list
  int maxWordLength = 0;
  int[] boardTripletIndices; // List of all 3-letter moves in board coordinates

  DictionaryEntry[] buildDictionary(String fileName) throws IOException {
    BufferedReader fileReader = new BufferedReader(new FileReader(fileName));
    String word = fileReader.readLine();
    ArrayList<DictionaryEntry> result = new ArrayList<DictionaryEntry>();
    while (word!=null) {
      if (word.length()>=3) {
        word = word.toUpperCase();
        if (word.length()>maxWordLength) maxWordLength = word.length();
        DictionaryEntry entry = new DictionaryEntry();
        entry.letters  = new int[word.length()  ];
        entry.triplets = new int[word.length()-2];
        int i=0;
        for (char letter: word.toCharArray()) {
          entry.letters[i] = (byte) letter - 65; // Convert ASCII to 0..25
          if (i>=2)
            entry.triplets[i-2] = (((entry.letters[i-2]  << ALPHABET_SIZE) +
                                     entry.letters[i-1]) << ALPHABET_SIZE) +
                                     entry.letters[i];
          i++;
        }
        result.add(entry);
      }
      word = fileReader.readLine();
    }
    return result.toArray(new DictionaryEntry[result.size()]);
  }

  boolean isWrap(int a, int b) { // Checks if move a->b wraps board edge (like 3->4)
    return Math.abs(a%BOARD_SIZE-b%BOARD_SIZE)>1;
  }

  int[] buildTripletIndices() {
    ArrayList<Integer> result = new ArrayList<Integer>();
    for (int a=0; a<BOARD_SIZE*BOARD_SIZE; a++)
      for (int bm: moves) {
        int b=a+bm;
        if ((b>=0) && (b<board.length) && !isWrap(a, b))
          for (int cm: moves) {
            int c=b+cm;
            if ((c>=0) && (c<board.length) && (c!=a) && !isWrap(b, c)) {
              result.add(a);
              result.add(b);
              result.add(c);
            }
          }
      }
    int[] result2 = new int[result.size()];
    int i=0;
    for (Integer r: result) result2[i++] = r;
    return result2;
  }


  // Variables that depend on the actual game layout
  int[] board = new int[BOARD_SIZE*BOARD_SIZE]; // Letters in board
  boolean[] possibleTriplets = new boolean[1 << (ALPHABET_SIZE*3)];

  DictionaryEntry[] candidateWords;
  int candidateCount;

  int[] usedBoardPositions;

  DictionaryEntry[] foundWords;
  int foundCount;

  void initializeBoard(String[] letters) {
    for (int row=0; row<BOARD_SIZE; row++)
      for (int col=0; col<BOARD_SIZE; col++)
        board[row*BOARD_SIZE + col] = (byte) letters[row].charAt(col) - 65;
  }

  void setPossibleTriplets() {
    Arrays.fill(possibleTriplets, false); // Reset list
    int i=0;
    while (i<boardTripletIndices.length) {
      int triplet = (((board[boardTripletIndices[i++]]  << ALPHABET_SIZE) +
                       board[boardTripletIndices[i++]]) << ALPHABET_SIZE) +
                       board[boardTripletIndices[i++]];
      possibleTriplets[triplet] = true; 
    }
  }

  void checkWordTriplets() {
    candidateCount = 0;
    for (DictionaryEntry entry: dictionary) {
      boolean ok = true;
      int len = entry.triplets.length;
      for (int t=0; (t<len) && ok; t++)
        ok = possibleTriplets[entry.triplets[t]];
      if (ok) candidateWords[candidateCount++] = entry;
    }
  }

  void checkWords() { // Can probably be optimized a lot
    foundCount = 0;
    for (int i=0; i<candidateCount; i++) {
      DictionaryEntry candidate = candidateWords[i];
      for (int j=0; j<board.length; j++)
        if (board[j]==candidate.letters[0]) { 
          usedBoardPositions[0] = j;
          if (checkNextLetters(candidate, 1, j)) {
            foundWords[foundCount++] = candidate;
            break;
          }
        }
    }
  }

  boolean checkNextLetters(DictionaryEntry candidate, int letter, int pos) {
    if (letter==candidate.letters.length) return true;
    int match = candidate.letters[letter];
    for (int move: moves) {
      int next=pos+move;
      if ((next>=0) && (next<board.length) && (board[next]==match) && !isWrap(pos, next)) {
        boolean ok = true;
        for (int i=0; (i<letter) && ok; i++)
          ok = usedBoardPositions[i]!=next;
        if (ok) {
          usedBoardPositions[letter] = next;
          if (checkNextLetters(candidate, letter+1, next)) return true;
        }
      }
    }   
    return false;
  }


  // Just some helper functions
  String formatTime(long start, long end, long repetitions) {
    long time = (end-start)/repetitions;
    return time/1000000 + "." + (time/100000) % 10 + "" + (time/10000) % 10 + "ms";
  }

  String getWord(DictionaryEntry entry) {
    char[] result = new char[entry.letters.length];
    int i=0;
    for (int letter: entry.letters)
      result[i++] = (char) (letter+97);
    return new String(result);
  }

  void run() throws IOException {
    long start = System.nanoTime();

    // The following can be pre-computed and should be replaced by constants
    dictionary = buildDictionary("C:/TWL06.txt");
    boardTripletIndices = buildTripletIndices();
    long precomputed = System.nanoTime();


    // The following only needs to run once at the beginning of the program
    candidateWords     = new DictionaryEntry[dictionary.length]; // WAAAY too generous
    foundWords         = new DictionaryEntry[dictionary.length]; // WAAAY too generous
    usedBoardPositions = new int[maxWordLength];
    long initialized = System.nanoTime(); 

    for (int n=1; n<=100; n++) {
      // The following needs to run again for every new board
      initializeBoard(new String[] {"DGHI",
                                    "KLPS",
                                    "YEUT",
                                    "EORN"});
      setPossibleTriplets();
      checkWordTriplets();
      checkWords();
    }
    long solved = System.nanoTime();


    // Print out result and statistics
    System.out.println("Precomputation finished in " + formatTime(start, precomputed, 1)+":");
    System.out.println("  Words in the dictionary: "+dictionary.length);
    System.out.println("  Longest word:            "+maxWordLength+" letters");
    System.out.println("  Number of triplet-moves: "+boardTripletIndices.length/3);
    System.out.println();

    System.out.println("Initialization finished in " + formatTime(precomputed, initialized, 1));
    System.out.println();

    System.out.println("Board solved in "+formatTime(initialized, solved, 100)+":");
    System.out.println("  Number of candidates: "+candidateCount);
    System.out.println("  Number of actual words: "+foundCount);
    System.out.println();

    System.out.println("Words found:");
    int w=0;
    System.out.print("  ");
    for (int i=0; i<foundCount; i++) {
      System.out.print(getWord(foundWords[i]));
      w++;
      if (w==10) {
        w=0;
        System.out.println(); System.out.print("  ");
      } else
        if (i<foundCount-1) System.out.print(", ");
    }
    System.out.println();
  }

  public static void main(String[] args) throws IOException {
    new BoggleSolver().run();
  }
}

以下是一些结果:

对于原始问题(DGHI…)中发布的图片的网格:

Precomputation finished in 239.59ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 408

Initialization finished in 0.22ms

Board solved in 3.70ms:
  Number of candidates: 230
  Number of actual words: 163 

Words found:
  eek, eel, eely, eld, elhi, elk, ern, erupt, erupts, euro
  eye, eyer, ghi, ghis, glee, gley, glue, gluer, gluey, glut
  gluts, hip, hiply, hips, his, hist, kelp, kelps, kep, kepi
  kepis, keps, kept, kern, key, kye, lee, lek, lept, leu
  ley, lunt, lunts, lure, lush, lust, lustre, lye, nus, nut
  nuts, ore, ort, orts, ouph, ouphs, our, oust, out, outre
  outs, oyer, pee, per, pert, phi, phis, pis, pish, plus
  plush, ply, plyer, psi, pst, pul, pule, puler, pun, punt
  punts, pur, pure, puree, purely, pus, push, put, puts, ree
  rely, rep, reply, reps, roe, roue, roup, roups, roust, rout
  routs, rue, rule, ruly, run, runt, runts, rupee, rush, rust
  rut, ruts, ship, shlep, sip, sipe, spue, spun, spur, spurn
  spurt, strep, stroy, stun, stupe, sue, suer, sulk, sulker, sulky
  sun, sup, supe, super, sure, surely, tree, trek, trey, troupe
  troy, true, truly, tule, tun, tup, tups, turn, tush, ups
  urn, uts, yeld, yelk, yelp, yelps, yep, yeps, yore, you
  your, yourn, yous

对于在原始问题中作为示例发布的信件(FXIE…)

Precomputation finished in 239.68ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 408

Initialization finished in 0.21ms

Board solved in 3.69ms:
  Number of candidates: 87
  Number of actual words: 76

Words found:
  amble, ambo, ami, amie, asea, awa, awe, awes, awl, axil
  axile, axle, boil, bole, box, but, buts, east, elm, emboli
  fame, fames, fax, lei, lie, lima, limb, limbo, limbs, lime
  limes, lob, lobs, lox, mae, maes, maw, maws, max, maxi
  mesa, mew, mewl, mews, mil, mile, milo, mix, oil, ole
  sae, saw, sea, seam, semi, sew, stub, swam, swami, tub
  tubs, tux, twa, twae, twaes, twas, uts, wae, waes, wamble
  wame, wames, was, wast, wax, west

对于以下5x5网格:

R P R I T
A H H L N
I E T E P
Z R Y S G
O G W E Y

它给出了这个:

Precomputation finished in 240.39ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 768

Initialization finished in 0.23ms

Board solved in 3.85ms:
  Number of candidates: 331
  Number of actual words: 240

Words found:
  aero, aery, ahi, air, airt, airth, airts, airy, ear, egest
  elhi, elint, erg, ergo, ester, eth, ether, eye, eyen, eyer
  eyes, eyre, eyrie, gel, gelt, gelts, gen, gent, gentil, gest
  geste, get, gets, gey, gor, gore, gory, grey, greyest, greys
  gyre, gyri, gyro, hae, haet, haets, hair, hairy, hap, harp
  heap, hear, heh, heir, help, helps, hen, hent, hep, her
  hero, hes, hest, het, hetero, heth, hets, hey, hie, hilt
  hilts, hin, hint, hire, hit, inlet, inlets, ire, leg, leges
  legs, lehr, lent, les, lest, let, lethe, lets, ley, leys
  lin, line, lines, liney, lint, lit, neg, negs, nest, nester
  net, nether, nets, nil, nit, ogre, ore, orgy, ort, orts
  pah, pair, par, peg, pegs, peh, pelt, pelter, peltry, pelts
  pen, pent, pes, pest, pester, pesty, pet, peter, pets, phi
  philter, philtre, phiz, pht, print, pst, rah, rai, rap, raphe
  raphes, reap, rear, rei, ret, rete, rets, rhaphe, rhaphes, rhea
  ria, rile, riles, riley, rin, rye, ryes, seg, sel, sen
  sent, senti, set, sew, spelt, spelter, spent, splent, spline, splint
  split, stent, step, stey, stria, striae, sty, stye, tea, tear
  teg, tegs, tel, ten, tent, thae, the, their, then, these
  thesp, they, thin, thine, thir, thirl, til, tile, tiles, tilt
  tilter, tilth, tilts, tin, tine, tines, tirl, trey, treys, trog
  try, tye, tyer, tyes, tyre, tyro, west, wester, wry, wryest
  wye, wyes, wyte, wytes, yea, yeah, year, yeh, yelp, yelps
  yen, yep, yeps, yes, yester, yet, yew, yews, zero, zori

为此,我使用了TWL06锦标赛拼字词列表,因为原始问题中的链接不再有效。这个文件是1.85MB,所以略短一些。buildDictionary函数抛出所有小于3个字母的单词。

以下是对其性能的一些观察:

It's about 10 times slower than the reported performance of Victor Nicollet's OCaml implementation. Whether this is caused by the different algorithm, the shorter dictionary he used, the fact that his code is compiled and mine runs in a Java virtual machine, or the performance of our computers (mine is an Intel Q6600 @ 2.4MHz running WinXP), I don't know. But it's much faster than the results for the other implementations quoted at the end of the original question. So, whether this algorithm is superior to the trie dictionary or not, I don't know at this point. The table method used in checkWordTriplets() yields a very good approximation to the actual answers. Only 1 in 3-5 words passed by it will fail the checkWords() test (See number of candidates vs. number of actual words above). Something you can't see above: The checkWordTriplets() function takes about 3.65ms and is therefore fully dominant in the search process. The checkWords() function takes up pretty much the remaining 0.05-0.20 ms. The execution time of the checkWordTriplets() function depends linearly on the dictionary size and is virtually independent of board size! The execution time of checkWords() depends on the board size and the number of words not ruled out by checkWordTriplets(). The checkWords() implementation above is the dumbest first version I came up with. It is basically not optimized at all. But compared to checkWordTriplets() it is irrelevant for the total performance of the application, so I didn't worry about it. But, if the board size gets bigger, this function will get slower and slower and will eventually start to matter. Then, it would need to be optimized as well. One nice thing about this code is its flexibility: You can easily change the board size: Update line 10 and the String array passed to initializeBoard(). It can support larger/different alphabets and can handle things like treating 'Qu' as one letter without any performance overhead. To do this, one would need to update line 9 and the couple of places where characters are converted to numbers (currently simply by subtracting 65 from the ASCII value)

好吧,但我觉得现在这篇文章已经足够长了。我当然可以回答你可能有的任何问题,但让我们把它转移到评论。

这是我想出的解决填字游戏的办法。我想这是最“python”的做事方式:

from itertools import combinations
from itertools import izip
from math import fabs

def isAllowedStep(current,step,length,doubleLength):
            # for step == length -1 not to be 0 => trivial solutions are not allowed
    return length > 1 and \
           current + step < doubleLength and current - step > 0 and \
           ( step == 1 or step == -1 or step <= length+1 or step >= length - 1)

def getPairwiseList(someList):
    iterableList = iter(someList)
    return izip(iterableList, iterableList)

def isCombinationAllowed(combination,length,doubleLength):

    for (first,second) in  getPairwiseList(combination):
        _, firstCoordinate = first
        _, secondCoordinate = second
        if not isAllowedStep(firstCoordinate, fabs(secondCoordinate-firstCoordinate),length,doubleLength):
            return False
    return True

def extractSolution(combinations):
    return ["".join([x[0] for x in combinationTuple]) for combinationTuple in combinations]


length = 4
text = tuple("".join("fxie amlo ewbx astu".split()))
textIndices = tuple(range(len(text)))
coordinates = zip(text,textIndices)

validCombinations = [combination for combination in combinations(coordinates,length) if isCombinationAllowed(combination,length,length*length)]
solution = extractSolution(validCombinations)

我善意地建议你不要将这部分用于所有可能的匹配,但它实际上提供了一种检查你生成的单词是否真的构成有效单词的可能性:

import mechanize
def checkWord(word):
    url = "https://en.oxforddictionaries.com/search?filter=dictionary&query="+word
    br = mechanize.Browser()
    br.set_handle_robots(False)
    response = br.open(url)
    text = response.read()
    return "no exact matches"  not in text.lower()

print [valid for valid in solution[:10] if checkWord(valid)]

该解决方案还提供了在给定的板中搜索的方向

一件事:

1. Uses trie to save all the word in the english to fasten the search
2. The uses DFS to search the words in Boggle

输出:

Found "pic" directions from (4,0)(p) go  → →
Found "pick" directions from (4,0)(p) go  → → ↑
Found "pickman" directions from (4,0)(p) go  → → ↑ ↑ ↖ ↑
Found "picket" directions from (4,0)(p) go  → → ↑ ↗ ↖
Found "picked" directions from (4,0)(p) go  → → ↑ ↗ ↘
Found "pickle" directions from (4,0)(p) go  → → ↑ ↘ →

代码:

from collections import defaultdict
from nltk.corpus import words
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

english_words = words.words()

# If you wan to remove stop words
# stop_words = set(stopwords.words('english'))
# english_words = [w for w in english_words if w not in stop_words]

boggle = [
    ['c', 'n', 't', 's', 's'],
    ['d', 'a', 't', 'i', 'n'],
    ['o', 'o', 'm', 'e', 'l'],
    ['s', 'i', 'k', 'n', 'd'],
    ['p', 'i', 'c', 'l', 'e']
]

# Instead of X and Y co-ordinates
# better to use Row and column
lenc = len(boggle[0])
lenr = len(boggle)

# Initialize trie datastructure
trie_node = {'valid': False, 'next': {}}

# lets get the delta to find all the nighbors
neighbors_delta = [
    (-1,-1, "↖"),
    (-1, 0, "↑"),
    (-1, 1, "↗"),
    (0, -1, "←"),
    (0,  1, "→"),
    (1, -1, "↙"),
    (1,  0, "↓"),
    (1,  1, "↘"),
]


def gen_trie(word, node):
    """udpates the trie datastructure using the given word"""
    if not word:
        return

    if word[0] not in node:
        node[word[0]] = {'valid': len(word) == 1, 'next': {}}

    # recursively build trie
    gen_trie(word[1:], node[word[0]])


def build_trie(words, trie):
    """Builds trie data structure from the list of words given"""
    for word in words:
        gen_trie(word, trie)
    return trie


def get_neighbors(r, c):
    """Returns the neighbors for a given co-ordinates"""
    n = []
    for neigh in neighbors_delta:
        new_r = r + neigh[0]
        new_c = c + neigh[1]

        if (new_r >= lenr) or (new_c >= lenc) or (new_r < 0) or (new_c < 0):
            continue
        n.append((new_r, new_c, neigh[2]))
    return n


def dfs(r, c, visited, trie, now_word, direction):
    """Scan the graph using DFS"""
    if (r, c) in visited:
        return

    letter = boggle[r][c]
    visited.append((r, c))

    if letter in trie:
        now_word += letter

        if trie[letter]['valid']:
            print('Found "{}" {}'.format(now_word, direction))

        neighbors = get_neighbors(r, c)
        for n in neighbors:
            dfs(n[0], n[1], visited[::], trie[letter], now_word, direction + " " + n[2])


def main(trie_node):
    """Initiate the search for words in boggle"""
    trie_node = build_trie(english_words, trie_node)

    # print the board
    print("Given board")
    for i in range(lenr):print (boggle[i])
    print ('\n')

    for r in range(lenr):
        for c in range(lenc):
            letter = boggle[r][c]
            dfs(r, c, [], trie_node, '', 'directions from ({},{})({}) go '.format(r, c, letter))


if __name__ == '__main__':
    main(trie_node)

我意识到这个问题的时间来了又去了,但由于我自己正在研究一个求解器,并在谷歌搜索时偶然发现了这个,我想我应该发布一个参考,因为它似乎与其他一些问题有点不同。

我选择在游戏棋盘上使用平面数组,并从棋盘上的每个字母进行递归搜索,从有效邻居遍历到有效邻居,如果索引中的有效前缀是当前字母列表,则扩展搜索。而遍历当前单词的概念是进入板的索引列表,而不是组成单词的字母。在检查索引时,将索引转换为字母并完成检查。

索引是一个蛮力字典,有点像trie,但允许对索引进行python查询。如果单词'cat'和'cater'在列表中,你会在字典中看到:

   d = { 'c': ['cat','cater'],
     'ca': ['cat','cater'],
     'cat': ['cat','cater'],
     'cate': ['cater'],
     'cater': ['cater'],
   }

因此,如果current_word是'ca',您就知道它是一个有效的前缀,因为'ca'在d中返回True(因此继续遍历板)。如果current_word是'cat',那么你知道它是一个有效的单词,因为它是一个有效的前缀,并且d['cat']中的'cat'也返回True。

如果感觉这允许一些可读的代码,似乎不是太慢。像其他人一样,这个系统的费用是读取/构建索引。解这个板子相当麻烦。

代码在http://gist.github.com/268079。它是故意垂直和幼稚的,有很多明确的有效性检查,因为我想理解问题,而不是用一堆魔法或晦涩难懂的东西把它弄得乱七八糟。