最近我一直在iPhone上玩一款名为《Scramble》的游戏。有些人可能知道这个游戏叫拼字游戏。从本质上讲,当游戏开始时,你会得到一个字母矩阵:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

(来源:boggled.org)

不幸的是,我不太擅长算法或它们的效率等等。我的第一次尝试使用一个像这样的字典(约2.3MB),并进行线性搜索,试图匹配字典条目的组合。这需要花费很长时间来找到可能的单词,因为你每轮只有2分钟的时间,这是不够的。

我很有兴趣看看是否有任何Stackoverflowers可以提出更有效的解决方案。我主要是在寻找使用三大p的解决方案:Python、PHP和Perl,尽管任何使用Java或c++的东西也很酷,因为速度是至关重要的。

目前的解决方案:

Adam Rosenfield, Python, ~20岁 John Fouhy, Python, ~3秒 Kent Fredric, Perl, ~1s Darius Bacon, Python, ~1s rvarcher, VB。净,~ 1 s Paolo Bergantino, PHP(实时链接),~5s(本地~2s)


当前回答

我花了3个月的时间致力于解决10个最佳点密集的5x5 Boggle板问题。

这个问题现在已经解决了,并在5个网页上进行了全面披露。有问题请联系我。

该棋盘分析算法使用显式堆栈,通过具有直接子信息的有向无环词图伪递归遍历棋盘方格,并使用时间戳跟踪机制。这很可能是世界上最先进的词汇数据结构。

该方案在四核上每秒评估大约10,000块非常好的电路板。(9500 +分)

父网页:

DeepSearch.c - http://www.pathcom.com/~vadco/deep.html

组件网页:

最佳记分牌- http://www.pathcom.com/~vadco/binary.html

高级词汇结构- http://www.pathcom.com/~vadco/adtdawg.html

板分析算法- http://www.pathcom.com/~vadco/guns.html

并行批处理- http://www.pathcom.com/~vadco/parallel.html

- 只有追求最好的人才会对这本全面的著作感兴趣。

其他回答

你可以把这个问题分成两部分:

某种搜索算法可以在网格中列举出可能的字符串。 一种测试字符串是否是有效单词的方法。

理想情况下,(2)还应该包括一种测试字符串是否是有效单词前缀的方法——这将允许您精简搜索并节省大量时间。

亚当·罗森菲尔德(Adam Rosenfield)的Trie是(2)的一个解决方案。它很优雅,可能是算法专家的首选,但有了现代语言和现代计算机,我们可能会更懒一点。此外,正如Kent所建议的,我们可以通过丢弃网格中没有字母的单词来减少字典的大小。这是一些蟒蛇:

def make_lookups(grid, fn='dict.txt'):
    # Make set of valid characters.
    chars = set()
    for word in grid:
        chars.update(word)

    words = set(x.strip() for x in open(fn) if set(x.strip()) <= chars)
    prefixes = set()
    for w in words:
        for i in range(len(w)+1):
            prefixes.add(w[:i])

    return words, prefixes

哇;常数时间前缀测试。加载你链接的字典需要几秒钟,但只有几秒钟:-)(注意words <= prefixes)

现在,对于第(1)部分,我倾向于用图表来思考。所以我将创建一个像这样的字典:

graph = { (x, y):set([(x0,y0), (x1,y1), (x2,y2)]), }

例如,graph[(x, y)]是你从位置(x, y)可以到达的坐标集。我还将添加一个虚拟节点None,它将连接到所有东西。

构建它有点笨拙,因为有8个可能的位置,你必须做边界检查。下面是一些相应笨拙的python代码:

def make_graph(grid):
    root = None
    graph = { root:set() }
    chardict = { root:'' }

    for i, row in enumerate(grid):
        for j, char in enumerate(row):
            chardict[(i, j)] = char
            node = (i, j)
            children = set()
            graph[node] = children
            graph[root].add(node)
            add_children(node, children, grid)

    return graph, chardict

def add_children(node, children, grid):
    x0, y0 = node
    for i in [-1,0,1]:
        x = x0 + i
        if not (0 <= x < len(grid)):
            continue
        for j in [-1,0,1]:
            y = y0 + j
            if not (0 <= y < len(grid[0])) or (i == j == 0):
                continue

            children.add((x,y))

这段代码还建立了一个字典映射(x,y)到相应的字符。这让我把一个位置列表转换成一个单词:

def to_word(chardict, pos_list):
    return ''.join(chardict[x] for x in pos_list)

最后,我们进行深度优先搜索。基本程序是:

搜索到达一个特定的节点。 检查到目前为止的路径是否可能是单词的一部分。如果不是,就不要进一步探索这个分支。 检查到目前为止的路径是否是一个单词。如果是,则添加到结果列表中。 探索迄今为止所有孩子未走的路。

Python:

def find_words(graph, chardict, position, prefix, results, words, prefixes):
    """ Arguments:
      graph :: mapping (x,y) to set of reachable positions
      chardict :: mapping (x,y) to character
      position :: current position (x,y) -- equals prefix[-1]
      prefix :: list of positions in current string
      results :: set of words found
      words :: set of valid words in the dictionary
      prefixes :: set of valid words or prefixes thereof
    """
    word = to_word(chardict, prefix)

    if word not in prefixes:
        return

    if word in words:
        results.add(word)

    for child in graph[position]:
        if child not in prefix:
            find_words(graph, chardict, child, prefix+[child], results, words, prefixes)

运行代码如下:

grid = ['fxie', 'amlo', 'ewbx', 'astu']
g, c = make_graph(grid)
w, p = make_lookups(grid)
res = set()
find_words(g, c, None, [], res, w, p)

检查保留区,看看答案。下面是为你的例子找到的单词列表,按大小排序:

 ['a', 'b', 'e', 'f', 'i', 'l', 'm', 'o', 's', 't',
 'u', 'w', 'x', 'ae', 'am', 'as', 'aw', 'ax', 'bo',
 'bu', 'ea', 'el', 'em', 'es', 'fa', 'ie', 'io', 'li',
 'lo', 'ma', 'me', 'mi', 'oe', 'ox', 'sa', 'se', 'st',
 'tu', 'ut', 'wa', 'we', 'xi', 'aes', 'ame', 'ami',
 'ase', 'ast', 'awa', 'awe', 'awl', 'blo', 'but', 'elb',
 'elm', 'fae', 'fam', 'lei', 'lie', 'lim', 'lob', 'lox',
 'mae', 'maw', 'mew', 'mil', 'mix', 'oil', 'olm', 'saw',
 'sea', 'sew', 'swa', 'tub', 'tux', 'twa', 'wae', 'was',
 'wax', 'wem', 'ambo', 'amil', 'amli', 'asem', 'axil',
 'axle', 'bleo', 'boil', 'bole', 'east', 'fame', 'limb',
 'lime', 'mesa', 'mewl', 'mile', 'milo', 'oime', 'sawt',
 'seam', 'seax', 'semi', 'stub', 'swam', 'twae', 'twas',
 'wame', 'wase', 'wast', 'weam', 'west', 'amble', 'awest',
 'axile', 'embox', 'limbo', 'limes', 'swami', 'embole',
 'famble', 'semble', 'wamble']

代码需要(字面上的)几秒钟来加载字典,但其余的在我的机器上是立即完成的。

我已经在OCaml中实现了一个解决方案。它将字典预编译为trie,并使用双字母序列频率来消除单词中永远不会出现的边,以进一步加快处理速度。

它在0.35ms内解决了示例板的问题(额外的6ms启动时间主要与将trie加载到内存有关)。

找到的解决方案:

["swami"; "emile"; "limbs"; "limbo"; "limes"; "amble"; "tubs"; "stub";
 "swam"; "semi"; "seam"; "awes"; "buts"; "bole"; "boil"; "west"; "east";
 "emil"; "lobs"; "limb"; "lime"; "lima"; "mesa"; "mews"; "mewl"; "maws";
 "milo"; "mile"; "awes"; "amie"; "axle"; "elma"; "fame"; "ubs"; "tux"; "tub";
 "twa"; "twa"; "stu"; "saw"; "sea"; "sew"; "sea"; "awe"; "awl"; "but"; "btu";
 "box"; "bmw"; "was"; "wax"; "oil"; "lox"; "lob"; "leo"; "lei"; "lie"; "mes";
 "mew"; "mae"; "maw"; "max"; "mil"; "mix"; "awe"; "awl"; "elm"; "eli"; "fax"]

我很快完美地解决了这个问题。我把它放进了一个安卓应用程序。在play store链接中查看视频,看看它是如何运作的。

单词作弊是一个应用程序,“破解”任何矩阵风格的文字游戏。这个应用程序 来帮我在文字混淆器上作弊。它可以用于单词搜索, 沙沙,单词,单词查找器,单词破解,拼字游戏,和更多!

在这里可以看到 https://play.google.com/store/apps/details?id=com.harris.wordcracker

在视频中查看应用程序的操作 https://www.youtube.com/watch?v=DL2974WmNAI

下面是我的java实现:https://github.com/zouzhile/interview/blob/master/src/com/interview/algorithms/tree/BoggleSolver.java

Trie构建耗时0小时0分1秒532毫秒 单词搜索花了0小时0分0秒92毫秒

eel eeler eely eer eke eker eld eleut elk ell 
elle epee epihippus ere erept err error erupt eurus eye 
eyer eyey hip hipe hiper hippish hipple hippus his hish 
hiss hist hler hsi ihi iphis isis issue issuer ist 
isurus kee keek keeker keel keeler keep keeper keld kele 
kelek kelep kelk kell kelly kelp kelper kep kepi kept 
ker kerel kern keup keuper key kyl kyle lee leek 
leeky leep leer lek leo leper leptus lepus ler leu 
ley lleu lue lull luller lulu lunn lunt lunule luo 
lupe lupis lupulus lupus lur lure lurer lush lushly lust 
lustrous lut lye nul null nun nupe nurture nurturer nut 
oer ore ort ouphish our oust out outpeep outpeer outpipe 
outpull outpush output outre outrun outrush outspell outspue outspurn outspurt 
outstrut outstunt outsulk outturn outusure oyer pee peek peel peele 
peeler peeoy peep peeper peepeye peer pele peleus pell peller 
pelu pep peplus pepper pepperer pepsis per pern pert pertussis 
peru perule perun peul phi pip pipe piper pipi pipistrel 
pipistrelle pipistrellus pipper pish piss pist plup plus plush ply 
plyer psi pst puerer pul pule puler pulk pull puller 
pulley pullus pulp pulper pulu puly pun punt pup puppis 
pur pure puree purely purer purr purre purree purrel purrer 
puru purupuru pus push puss pustule put putt puture ree 
reek reeker reeky reel reeler reeper rel rely reoutput rep 
repel repeller repipe reply repp reps reree rereel rerun reuel 
roe roer roey roue rouelle roun roup rouper roust rout 
roy rue ruelle ruer rule ruler rull ruller run runt 
rupee rupert rupture ruru rus rush russ rust rustre rut 
shi shih ship shipper shish shlu sip sipe siper sipper 
sis sish sisi siss sissu sist sistrurus speel speer spelk 
spell speller splurt spun spur spurn spurrer spurt sput ssi 
ssu stre stree streek streel streeler streep streke streperous strepsis 
strey stroup stroy stroyer strue strunt strut stu stue stull 
stuller stun stunt stupe stupeous stupp sturnus sturt stuss stut 
sue suer suerre suld sulk sulker sulky sull sully sulu 
sun sunn sunt sunup sup supe super superoutput supper supple 
supplely supply sur sure surely surrey sus susi susu susurr 
susurrous susurrus sutu suture suu tree treey trek trekker trey 
troupe trouper trout troy true truer trull truller truly trun 
trush truss trust tshi tst tsun tsutsutsi tue tule tulle 
tulu tun tunu tup tupek tupi tur turn turnup turr 
turus tush tussis tussur tut tuts tutu tutulus ule ull 
uller ulu ululu unreel unrule unruly unrun unrust untrue untruly 
untruss untrust unturn unurn upper upperer uppish uppishly uppull uppush 
upspurt upsun upsup uptree uptruss upturn ure urn uro uru 
urus urushi ush ust usun usure usurer utu yee yeel 
yeld yelk yell yeller yelp yelper yeo yep yer yere 
yern yoe yor yore you youl youp your yourn yoy 

注意: 我在这个线程的开头使用了字典和字符矩阵。代码在我的MacBookPro上运行,下面是关于这台机器的一些信息。

型号:MacBook Pro 型号标识符:MacBookPro8,1 处理器名称:Intel Core i5 处理器速度:2.3 GHz 处理器数量:1 总核数:2 L2缓存(每核):256kb L3 Cache: 3mb 内存:4gb 引导ROM版本:MBP81.0047.B0E SMC版本(系统):1.68f96

下面是使用NLTK工具包中的预定义单词的解决方案 NLTK有NLTK。语料库包,我们有一个叫做单词的包,它包含超过20万个英语单词,你可以简单地把它们都用到你的程序中。

一旦创建你的矩阵转换成一个字符数组,并执行这段代码

import nltk
from nltk.corpus import words
from collections import Counter

def possibleWords(input, charSet):
    for word in input:
        dict = Counter(word)
        flag = 1
        for key in dict.keys():
            if key not in charSet:
                flag = 0
        if flag == 1 and len(word)>5: #its depends if you want only length more than 5 use this otherwise remove that one. 
            print(word)


nltk.download('words')
word_list = words.words()
# prints 236736
print(len(word_list))
charSet = ['h', 'e', 'l', 'o', 'n', 'v', 't']
possibleWords(word_list, charSet)

输出:

eleven
eleventh
elevon
entente
entone
ethene
ethenol
evolve
evolvent
hellhole
helvell
hooven
letten
looten
nettle
nonene
nonent
nonlevel
notelet
novelet
novelette
novene
teenet
teethe
teevee
telethon
tellee
tenent
tentlet
theelol
toetoe
tonlet
toothlet
tootle
tottle
vellon
velvet
velveteen
venene
vennel
venthole
voeten
volent
volvelle
volvent
voteen

我希望你能得到它。