最近我一直在iPhone上玩一款名为《Scramble》的游戏。有些人可能知道这个游戏叫拼字游戏。从本质上讲,当游戏开始时,你会得到一个字母矩阵:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

(来源:boggled.org)

不幸的是,我不太擅长算法或它们的效率等等。我的第一次尝试使用一个像这样的字典(约2.3MB),并进行线性搜索,试图匹配字典条目的组合。这需要花费很长时间来找到可能的单词,因为你每轮只有2分钟的时间,这是不够的。

我很有兴趣看看是否有任何Stackoverflowers可以提出更有效的解决方案。我主要是在寻找使用三大p的解决方案:Python、PHP和Perl,尽管任何使用Java或c++的东西也很酷,因为速度是至关重要的。

目前的解决方案:

Adam Rosenfield, Python, ~20岁 John Fouhy, Python, ~3秒 Kent Fredric, Perl, ~1s Darius Bacon, Python, ~1s rvarcher, VB。净,~ 1 s Paolo Bergantino, PHP(实时链接),~5s(本地~2s)


当前回答

下面是我的java实现:https://github.com/zouzhile/interview/blob/master/src/com/interview/algorithms/tree/BoggleSolver.java

Trie构建耗时0小时0分1秒532毫秒 单词搜索花了0小时0分0秒92毫秒

eel eeler eely eer eke eker eld eleut elk ell 
elle epee epihippus ere erept err error erupt eurus eye 
eyer eyey hip hipe hiper hippish hipple hippus his hish 
hiss hist hler hsi ihi iphis isis issue issuer ist 
isurus kee keek keeker keel keeler keep keeper keld kele 
kelek kelep kelk kell kelly kelp kelper kep kepi kept 
ker kerel kern keup keuper key kyl kyle lee leek 
leeky leep leer lek leo leper leptus lepus ler leu 
ley lleu lue lull luller lulu lunn lunt lunule luo 
lupe lupis lupulus lupus lur lure lurer lush lushly lust 
lustrous lut lye nul null nun nupe nurture nurturer nut 
oer ore ort ouphish our oust out outpeep outpeer outpipe 
outpull outpush output outre outrun outrush outspell outspue outspurn outspurt 
outstrut outstunt outsulk outturn outusure oyer pee peek peel peele 
peeler peeoy peep peeper peepeye peer pele peleus pell peller 
pelu pep peplus pepper pepperer pepsis per pern pert pertussis 
peru perule perun peul phi pip pipe piper pipi pipistrel 
pipistrelle pipistrellus pipper pish piss pist plup plus plush ply 
plyer psi pst puerer pul pule puler pulk pull puller 
pulley pullus pulp pulper pulu puly pun punt pup puppis 
pur pure puree purely purer purr purre purree purrel purrer 
puru purupuru pus push puss pustule put putt puture ree 
reek reeker reeky reel reeler reeper rel rely reoutput rep 
repel repeller repipe reply repp reps reree rereel rerun reuel 
roe roer roey roue rouelle roun roup rouper roust rout 
roy rue ruelle ruer rule ruler rull ruller run runt 
rupee rupert rupture ruru rus rush russ rust rustre rut 
shi shih ship shipper shish shlu sip sipe siper sipper 
sis sish sisi siss sissu sist sistrurus speel speer spelk 
spell speller splurt spun spur spurn spurrer spurt sput ssi 
ssu stre stree streek streel streeler streep streke streperous strepsis 
strey stroup stroy stroyer strue strunt strut stu stue stull 
stuller stun stunt stupe stupeous stupp sturnus sturt stuss stut 
sue suer suerre suld sulk sulker sulky sull sully sulu 
sun sunn sunt sunup sup supe super superoutput supper supple 
supplely supply sur sure surely surrey sus susi susu susurr 
susurrous susurrus sutu suture suu tree treey trek trekker trey 
troupe trouper trout troy true truer trull truller truly trun 
trush truss trust tshi tst tsun tsutsutsi tue tule tulle 
tulu tun tunu tup tupek tupi tur turn turnup turr 
turus tush tussis tussur tut tuts tutu tutulus ule ull 
uller ulu ululu unreel unrule unruly unrun unrust untrue untruly 
untruss untrust unturn unurn upper upperer uppish uppishly uppull uppush 
upspurt upsun upsup uptree uptruss upturn ure urn uro uru 
urus urushi ush ust usun usure usurer utu yee yeel 
yeld yelk yell yeller yelp yelper yeo yep yer yere 
yern yoe yor yore you youl youp your yourn yoy 

注意: 我在这个线程的开头使用了字典和字符矩阵。代码在我的MacBookPro上运行,下面是关于这台机器的一些信息。

型号:MacBook Pro 型号标识符:MacBookPro8,1 处理器名称:Intel Core i5 处理器速度:2.3 GHz 处理器数量:1 总核数:2 L2缓存(每核):256kb L3 Cache: 3mb 内存:4gb 引导ROM版本:MBP81.0047.B0E SMC版本(系统):1.68f96

其他回答

令人惊讶的是,没有人尝试使用PHP版本。

这是John Fouhy的Python解决方案的PHP版本。

虽然我从其他人的答案中得到了一些建议,但这主要是抄袭约翰的。

$boggle = "fxie
           amlo
           ewbx
           astu";

$alphabet = str_split(str_replace(array("\n", " ", "\r"), "", strtolower($boggle)));
$rows = array_map('trim', explode("\n", $boggle));
$dictionary = file("C:/dict.txt");
$prefixes = array(''=>'');
$words = array();
$regex = '/[' . implode('', $alphabet) . ']{3,}$/S';
foreach($dictionary as $k=>$value) {
    $value = trim(strtolower($value));
    $length = strlen($value);
    if(preg_match($regex, $value)) {
        for($x = 0; $x < $length; $x++) {
            $letter = substr($value, 0, $x+1);
            if($letter == $value) {
                $words[$value] = 1;
            } else {
                $prefixes[$letter] = 1;
            }
        }
    }
}

$graph = array();
$chardict = array();
$positions = array();
$c = count($rows);
for($i = 0; $i < $c; $i++) {
    $l = strlen($rows[$i]);
    for($j = 0; $j < $l; $j++) {
        $chardict[$i.','.$j] = $rows[$i][$j];
        $children = array();
        $pos = array(-1,0,1);
        foreach($pos as $z) {
            $xCoord = $z + $i;
            if($xCoord < 0 || $xCoord >= count($rows)) {
                continue;
            }
            $len = strlen($rows[0]);
            foreach($pos as $w) {
                $yCoord = $j + $w;
                if(($yCoord < 0 || $yCoord >= $len) || ($z == 0 && $w == 0)) {
                    continue;
                }
                $children[] = array($xCoord, $yCoord);
            }
        }
        $graph['None'][] = array($i, $j);
        $graph[$i.','.$j] = $children;
    }
}

function to_word($chardict, $prefix) {
    $word = array();
    foreach($prefix as $v) {
        $word[] = $chardict[$v[0].','.$v[1]];
    }
    return implode("", $word);
}

function find_words($graph, $chardict, $position, $prefix, $prefixes, &$results, $words) {
    $word = to_word($chardict, $prefix);
    if(!isset($prefixes[$word])) return false;

    if(isset($words[$word])) {
        $results[] = $word;
    }

    foreach($graph[$position] as $child) {
        if(!in_array($child, $prefix)) {
            $newprefix = $prefix;
            $newprefix[] = $child;
            find_words($graph, $chardict, $child[0].','.$child[1], $newprefix, $prefixes, $results, $words);
        }
    }
}

$solution = array();
find_words($graph, $chardict, 'None', array(), $prefixes, $solution);
print_r($solution);

如果你想尝试的话,这里有一个实时链接。虽然在我的本地机器上需要大约2秒,但在我的web服务器上需要大约5秒。无论哪种情况,它都不是很快。尽管如此,它还是很可怕,所以我可以想象时间可以大大缩短。任何关于如何实现这一目标的建议都将不胜感激。PHP缺少元组,这使得坐标处理起来很奇怪,而且我无法理解到底发生了什么,这对我一点帮助都没有。

编辑:一些修复使它在本地少于1秒。

所以我想添加另一种PHP方法来解决这个问题,因为每个人都喜欢PHP。 我想做一点重构,比如对字典文件使用regexpression匹配,但现在我只是将整个字典文件加载到一个wordList中。

我使用了链表的思想。每个Node都有一个字符值、一个位置值和一个next指针。

location值是我发现两个节点是否连接的方法。

1     2     3     4
11    12    13    14
21    22    23    24
31    32    33    34

所以使用这个网格,如果第一个节点的位置等于第二个节点的位置+/- 1(同一行),+/- 9,10,11(上下一行),我就知道两个节点是连接的。

我使用递归进行主搜索。它从wordList中取出一个单词,找到所有可能的起点,然后递归地找到下一个可能的连接,记住它不能去到它已经使用的位置(这就是为什么我添加$notInLoc)。

无论如何,我知道它需要一些重构,并且希望听到关于如何使它更干净的想法,但是它根据我使用的字典文件产生了正确的结果。根据黑板上元音和组合的数量,大约需要3到6秒。我知道,一旦我对字典结果进行预匹配,这将显著减少。

<?php
    ini_set('xdebug.var_display_max_depth', 20);
    ini_set('xdebug.var_display_max_children', 1024);
    ini_set('xdebug.var_display_max_data', 1024);

    class Node {
        var $loc;

        function __construct($value) {
            $this->value = $value;
            $next = null;
        }
    }

    class Boggle {
        var $root;
        var $locList = array (1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 31, 32, 33, 34);
        var $wordList = [];
        var $foundWords = [];

        function __construct($board) {
            // Takes in a board string and creates all the nodes
            $node = new Node($board[0]);
            $node->loc = $this->locList[0];
            $this->root = $node;
            for ($i = 1; $i < strlen($board); $i++) {
                    $node->next = new Node($board[$i]);
                    $node->next->loc = $this->locList[$i];
                    $node = $node->next;
            }
            // Load in a dictionary file
            // Use regexp to elimate all the words that could never appear and load the 
            // rest of the words into wordList
            $handle = fopen("dict.txt", "r");
            if ($handle) {
                while (($line = fgets($handle)) !== false) {
                    // process the line read.
                    $line = trim($line);
                    if (strlen($line) > 2) {
                        $this->wordList[] = trim($line);
                    }
                }
                fclose($handle);
            } else {
                // error opening the file.
                echo "Problem with the file.";
            } 
        }

        function isConnected($node1, $node2) {
        // Determines if 2 nodes are connected on the boggle board

            return (($node1->loc == $node2->loc + 1) || ($node1->loc == $node2->loc - 1) ||
               ($node1->loc == $node2->loc - 9) || ($node1->loc == $node2->loc - 10) || ($node1->loc == $node2->loc - 11) ||
               ($node1->loc == $node2->loc + 9) || ($node1->loc == $node2->loc + 10) || ($node1->loc == $node2->loc + 11)) ? true : false;

        }

        function find($value, $notInLoc = []) {
            // Returns a node with the value that isn't in a location
            $current = $this->root;
            while($current) {
                if ($current->value == $value && !in_array($current->loc, $notInLoc)) {
                    return $current;
                }
                if (isset($current->next)) {
                    $current = $current->next;
                } else {
                    break;
                }
            }
            return false;
        }

        function findAll($value) {
            // Returns an array of nodes with a specific value
            $current = $this->root;
            $foundNodes = [];
            while ($current) {
                if ($current->value == $value) {
                    $foundNodes[] = $current;
                }
                if (isset($current->next)) {
                    $current = $current->next;
                } else {
                    break;
                }
            }
            return (empty($foundNodes)) ? false : $foundNodes;
        }

        function findAllConnectedTo($node, $value, $notInLoc = []) {
            // Returns an array of nodes that are connected to a specific node and 
            // contain a specific value and are not in a certain location
            $nodeList = $this->findAll($value);
            $newList = [];
            if ($nodeList) {
                foreach ($nodeList as $node2) {
                    if (!in_array($node2->loc, $notInLoc) && $this->isConnected($node, $node2)) {
                        $newList[] = $node2;
                    }
                }
            }
            return (empty($newList)) ? false : $newList;
        }



        function inner($word, $list, $i = 0, $notInLoc = []) {
            $i++;
            foreach($list as $node) {
                $notInLoc[] = $node->loc;
                if ($list2 = $this->findAllConnectedTo($node, $word[$i], $notInLoc)) {
                    if ($i == (strlen($word) - 1)) {
                        return true;
                    } else {
                        return $this->inner($word, $list2, $i, $notInLoc);
                    }
                }
            }
            return false;
        }

        function findWord($word) {
            if ($list = $this->findAll($word[0])) {
                return $this->inner($word, $list);
            }
            return false;
        }

        function findAllWords() {
            foreach($this->wordList as $word) {
                if ($this->findWord($word)) {
                    $this->foundWords[] = $word;
                }
            }
        }

        function displayBoard() {
            $current = $this->root;
            for ($i=0; $i < 4; $i++) {
                echo $current->value . " " . $current->next->value . " " . $current->next->next->value . " " . $current->next->next->next->value . "<br />";
                if ($i < 3) {
                    $current = $current->next->next->next->next;
                }
            }
        }

    }

    function randomBoardString() {
        return substr(str_shuffle(str_repeat("abcdefghijklmnopqrstuvwxyz", 16)), 0, 16);
    }

    $myBoggle = new Boggle(randomBoardString());
    $myBoggle->displayBoard();
    $x = microtime(true);
    $myBoggle->findAllWords();
    $y = microtime(true);
    echo ($y-$x);
    var_dump($myBoggle->foundWords);

    ?>

我不得不对一个完整的解决方案进行更多的思考,但作为一种方便的优化,我想知道是否值得根据字典中的所有单词预先计算一个图表和三字母组合(2字母和3字母组合)的频率表,并使用它来确定搜索的优先级。我会选择单词的首字母。因此,如果你的字典包含“印度”、“水”、“极端”和“非凡”这些词,那么你预先计算的表可能是:

'IN': 1
'WA': 1
'EX': 2

然后按照共性的顺序(首先是EX,然后是WA/ in)搜索这些图表

我的答案和这里的其他答案一样,但我把它贴出来是因为它看起来比其他Python解决方案快一些,因为设置字典更快。(我对比了John Fouhy的解决方案。)设置后,解决的时间在噪声中下降。

grid = "fxie amlo ewbx astu".split()
nrows, ncols = len(grid), len(grid[0])

# A dictionary word that could be a solution must use only the grid's
# letters and have length >= 3. (With a case-insensitive match.)
import re
alphabet = ''.join(set(''.join(grid)))
bogglable = re.compile('[' + alphabet + ']{3,}$', re.I).match

words = set(word.rstrip('\n') for word in open('words') if bogglable(word))
prefixes = set(word[:i] for word in words
               for i in range(2, len(word)+1))

def solve():
    for y, row in enumerate(grid):
        for x, letter in enumerate(row):
            for result in extending(letter, ((x, y),)):
                yield result

def extending(prefix, path):
    if prefix in words:
        yield (prefix, path)
    for (nx, ny) in neighbors(path[-1]):
        if (nx, ny) not in path:
            prefix1 = prefix + grid[ny][nx]
            if prefix1 in prefixes:
                for result in extending(prefix1, path + ((nx, ny),)):
                    yield result

def neighbors((x, y)):
    for nx in range(max(0, x-1), min(x+2, ncols)):
        for ny in range(max(0, y-1), min(y+2, nrows)):
            yield (nx, ny)

示例用法:

# Print a maximal-length word and its path:
print max(solve(), key=lambda (word, path): len(word))

编辑:过滤掉长度小于3个字母的单词。

编辑2:我很好奇为什么Kent Fredric的Perl解决方案更快;它使用正则表达式匹配,而不是一组字符。在Python中做同样的事情,速度大约会翻倍。

该解决方案还提供了在给定的板中搜索的方向

一件事:

1. Uses trie to save all the word in the english to fasten the search
2. The uses DFS to search the words in Boggle

输出:

Found "pic" directions from (4,0)(p) go  → →
Found "pick" directions from (4,0)(p) go  → → ↑
Found "pickman" directions from (4,0)(p) go  → → ↑ ↑ ↖ ↑
Found "picket" directions from (4,0)(p) go  → → ↑ ↗ ↖
Found "picked" directions from (4,0)(p) go  → → ↑ ↗ ↘
Found "pickle" directions from (4,0)(p) go  → → ↑ ↘ →

代码:

from collections import defaultdict
from nltk.corpus import words
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

english_words = words.words()

# If you wan to remove stop words
# stop_words = set(stopwords.words('english'))
# english_words = [w for w in english_words if w not in stop_words]

boggle = [
    ['c', 'n', 't', 's', 's'],
    ['d', 'a', 't', 'i', 'n'],
    ['o', 'o', 'm', 'e', 'l'],
    ['s', 'i', 'k', 'n', 'd'],
    ['p', 'i', 'c', 'l', 'e']
]

# Instead of X and Y co-ordinates
# better to use Row and column
lenc = len(boggle[0])
lenr = len(boggle)

# Initialize trie datastructure
trie_node = {'valid': False, 'next': {}}

# lets get the delta to find all the nighbors
neighbors_delta = [
    (-1,-1, "↖"),
    (-1, 0, "↑"),
    (-1, 1, "↗"),
    (0, -1, "←"),
    (0,  1, "→"),
    (1, -1, "↙"),
    (1,  0, "↓"),
    (1,  1, "↘"),
]


def gen_trie(word, node):
    """udpates the trie datastructure using the given word"""
    if not word:
        return

    if word[0] not in node:
        node[word[0]] = {'valid': len(word) == 1, 'next': {}}

    # recursively build trie
    gen_trie(word[1:], node[word[0]])


def build_trie(words, trie):
    """Builds trie data structure from the list of words given"""
    for word in words:
        gen_trie(word, trie)
    return trie


def get_neighbors(r, c):
    """Returns the neighbors for a given co-ordinates"""
    n = []
    for neigh in neighbors_delta:
        new_r = r + neigh[0]
        new_c = c + neigh[1]

        if (new_r >= lenr) or (new_c >= lenc) or (new_r < 0) or (new_c < 0):
            continue
        n.append((new_r, new_c, neigh[2]))
    return n


def dfs(r, c, visited, trie, now_word, direction):
    """Scan the graph using DFS"""
    if (r, c) in visited:
        return

    letter = boggle[r][c]
    visited.append((r, c))

    if letter in trie:
        now_word += letter

        if trie[letter]['valid']:
            print('Found "{}" {}'.format(now_word, direction))

        neighbors = get_neighbors(r, c)
        for n in neighbors:
            dfs(n[0], n[1], visited[::], trie[letter], now_word, direction + " " + n[2])


def main(trie_node):
    """Initiate the search for words in boggle"""
    trie_node = build_trie(english_words, trie_node)

    # print the board
    print("Given board")
    for i in range(lenr):print (boggle[i])
    print ('\n')

    for r in range(lenr):
        for c in range(lenc):
            letter = boggle[r][c]
            dfs(r, c, [], trie_node, '', 'directions from ({},{})({}) go '.format(r, c, letter))


if __name__ == '__main__':
    main(trie_node)