最近我一直在iPhone上玩一款名为《Scramble》的游戏。有些人可能知道这个游戏叫拼字游戏。从本质上讲,当游戏开始时,你会得到一个字母矩阵:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

(来源:boggled.org)

不幸的是,我不太擅长算法或它们的效率等等。我的第一次尝试使用一个像这样的字典(约2.3MB),并进行线性搜索,试图匹配字典条目的组合。这需要花费很长时间来找到可能的单词,因为你每轮只有2分钟的时间,这是不够的。

我很有兴趣看看是否有任何Stackoverflowers可以提出更有效的解决方案。我主要是在寻找使用三大p的解决方案:Python、PHP和Perl,尽管任何使用Java或c++的东西也很酷,因为速度是至关重要的。

目前的解决方案:

Adam Rosenfield, Python, ~20岁 John Fouhy, Python, ~3秒 Kent Fredric, Perl, ~1s Darius Bacon, Python, ~1s rvarcher, VB。净,~ 1 s Paolo Bergantino, PHP(实时链接),~5s(本地~2s)


当前回答

我不得不对一个完整的解决方案进行更多的思考,但作为一种方便的优化,我想知道是否值得根据字典中的所有单词预先计算一个图表和三字母组合(2字母和3字母组合)的频率表,并使用它来确定搜索的优先级。我会选择单词的首字母。因此,如果你的字典包含“印度”、“水”、“极端”和“非凡”这些词,那么你预先计算的表可能是:

'IN': 1
'WA': 1
'EX': 2

然后按照共性的顺序(首先是EX,然后是WA/ in)搜索这些图表

其他回答

当我看到问题陈述时,我想到了“Trie”。但看到其他一些海报使用了这种方法,我寻找另一种不同的方法。可惜的是,Trie方法表现更好。我在我的机器上运行了Kent的Perl解决方案,在调整它以使用我的字典文件后,它花了0.31秒运行。我自己的perl实现需要0.54秒才能运行。

这就是我的方法:

Create a transition hash to model the legal transitions. Iterate through all 16^3 possible three letter combinations. In the loop, exclude illegal transitions and repeat visits to the same square. Form all the legal 3-letter sequences and store them in a hash. Then loop through all words in the dictionary. Exclude words that are too long or short Slide a 3-letter window across each word and see if it is among the 3-letter combos from step 2. Exclude words that fail. This eliminates most non-matches. If still not eliminated, use a recursive algorithm to see if the word can be formed by making paths through the puzzle. (This part is slow, but called infrequently.) Print out the words I found. I tried 3-letter and 4-letter sequences, but 4-letter sequences slowed the program down.

在我的代码中,我使用/usr/share/dict/words作为我的字典。它是MAC OS X和许多Unix系统的标准配置。如果你愿意,你可以使用另一个文件。要破解不同的谜题,只需更改变量@puzzle。这将很容易适应更大的矩阵。你只需要改变%transitions哈希值和%legalTransitions哈希值。

这种解决方案的优点是代码短,数据结构简单。

下面是Perl代码(我知道它使用了太多的全局变量):

#!/usr/bin/perl
use Time::HiRes  qw{ time };

sub readFile($);
sub findAllPrefixes($);
sub isWordTraceable($);
sub findWordsInPuzzle(@);

my $startTime = time;

# Puzzle to solve

my @puzzle = ( 
    F, X, I, E,
    A, M, L, O,
    E, W, B, X,
    A, S, T, U
);

my $minimumWordLength = 3;
my $maximumPrefixLength = 3; # I tried four and it slowed down.

# Slurp the word list.
my $wordlistFile = "/usr/share/dict/words";

my @words = split(/\n/, uc(readFile($wordlistFile)));
print "Words loaded from word list: " . scalar @words . "\n";

print "Word file load time: " . (time - $startTime) . "\n";
my $postLoad = time;

# Define the legal transitions from one letter position to another. 
# Positions are numbered 0-15.
#     0  1  2  3
#     4  5  6  7
#     8  9 10 11
#    12 13 14 15
my %transitions = ( 
   -1 => [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15],
    0 => [1,4,5], 
    1 => [0,2,4,5,6],
    2 => [1,3,5,6,7],
    3 => [2,6,7],
    4 => [0,1,5,8,9],
    5 => [0,1,2,4,6,8,9,10],
    6 => [1,2,3,5,7,9,10,11],
    7 => [2,3,6,10,11],
    8 => [4,5,9,12,13],
    9 => [4,5,6,8,10,12,13,14],
    10 => [5,6,7,9,11,13,14,15],
    11 => [6,7,10,14,15],
    12 => [8,9,13],
    13 => [8,9,10,12,14],
    14 => [9,10,11,13,15],
    15 => [10,11,14]
);

# Convert the transition matrix into a hash for easy access.
my %legalTransitions = ();
foreach my $start (keys %transitions) {
    my $legalRef = $transitions{$start};
    foreach my $stop (@$legalRef) {
        my $index = ($start + 1) * (scalar @puzzle) + ($stop + 1);
        $legalTransitions{$index} = 1;
    }
}

my %prefixesInPuzzle = findAllPrefixes($maximumPrefixLength);

print "Find prefixes time: " . (time - $postLoad) . "\n";
my $postPrefix = time;

my @wordsFoundInPuzzle = findWordsInPuzzle(@words);

print "Find words in puzzle time: " . (time - $postPrefix) . "\n";

print "Unique prefixes found: " . (scalar keys %prefixesInPuzzle) . "\n";
print "Words found (" . (scalar @wordsFoundInPuzzle) . ") :\n    " . join("\n    ", @wordsFoundInPuzzle) . "\n";

print "Total Elapsed time: " . (time - $startTime) . "\n";

###########################################

sub readFile($) {
    my ($filename) = @_;
    my $contents;
    if (-e $filename) {
        # This is magic: it opens and reads a file into a scalar in one line of code. 
        # See http://www.perl.com/pub/a/2003/11/21/slurp.html
        $contents = do { local( @ARGV, $/ ) = $filename ; <> } ; 
    }
    else {
        $contents = '';
    }
    return $contents;
}

# Is it legal to move from the first position to the second? They must be adjacent.
sub isLegalTransition($$) {
    my ($pos1,$pos2) = @_;
    my $index = ($pos1 + 1) * (scalar @puzzle) + ($pos2 + 1);
    return $legalTransitions{$index};
}

# Find all prefixes where $minimumWordLength <= length <= $maxPrefixLength
#
#   $maxPrefixLength ... Maximum length of prefix we will store. Three gives best performance. 
sub findAllPrefixes($) {
    my ($maxPrefixLength) = @_;
    my %prefixes = ();
    my $puzzleSize = scalar @puzzle;

    # Every possible N-letter combination of the letters in the puzzle 
    # can be represented as an integer, though many of those combinations
    # involve illegal transitions, duplicated letters, etc.
    # Iterate through all those possibilities and eliminate the illegal ones.
    my $maxIndex = $puzzleSize ** $maxPrefixLength;

    for (my $i = 0; $i < $maxIndex; $i++) {
        my @path;
        my $remainder = $i;
        my $prevPosition = -1;
        my $prefix = '';
        my %usedPositions = ();
        for (my $prefixLength = 1; $prefixLength <= $maxPrefixLength; $prefixLength++) {
            my $position = $remainder % $puzzleSize;

            # Is this a valid step?
            #  a. Is the transition legal (to an adjacent square)?
            if (! isLegalTransition($prevPosition, $position)) {
                last;
            }

            #  b. Have we repeated a square?
            if ($usedPositions{$position}) {
                last;
            }
            else {
                $usedPositions{$position} = 1;
            }

            # Record this prefix if length >= $minimumWordLength.
            $prefix .= $puzzle[$position];
            if ($prefixLength >= $minimumWordLength) {
                $prefixes{$prefix} = 1;
            }

            push @path, $position;
            $remainder -= $position;
            $remainder /= $puzzleSize;
            $prevPosition = $position;
        } # end inner for
    } # end outer for
    return %prefixes;
}

# Loop through all words in dictionary, looking for ones that are in the puzzle.
sub findWordsInPuzzle(@) {
    my @allWords = @_;
    my @wordsFound = ();
    my $puzzleSize = scalar @puzzle;
WORD: foreach my $word (@allWords) {
        my $wordLength = length($word);
        if ($wordLength > $puzzleSize || $wordLength < $minimumWordLength) {
            # Reject word as too short or too long.
        }
        elsif ($wordLength <= $maximumPrefixLength ) {
            # Word should be in the prefix hash.
            if ($prefixesInPuzzle{$word}) {
                push @wordsFound, $word;
            }
        }
        else {
            # Scan through the word using a window of length $maximumPrefixLength, looking for any strings not in our prefix list.
            # If any are found that are not in the list, this word is not possible.
            # If no non-matches are found, we have more work to do.
            my $limit = $wordLength - $maximumPrefixLength + 1;
            for (my $startIndex = 0; $startIndex < $limit; $startIndex ++) {
                if (! $prefixesInPuzzle{substr($word, $startIndex, $maximumPrefixLength)}) {
                    next WORD;
                }
            }
            if (isWordTraceable($word)) {
                # Additional test necessary: see if we can form this word by following legal transitions
                push @wordsFound, $word;
            }
        }

    }
    return @wordsFound;
}

# Is it possible to trace out the word using only legal transitions?
sub isWordTraceable($) {
    my $word = shift;
    return traverse([split(//, $word)], [-1]); # Start at special square -1, which may transition to any square in the puzzle.
}

# Recursively look for a path through the puzzle that matches the word.
sub traverse($$) {
    my ($lettersRef, $pathRef) = @_;
    my $index = scalar @$pathRef - 1;
    my $position = $pathRef->[$index];
    my $letter = $lettersRef->[$index];
    my $branchesRef =  $transitions{$position};
BRANCH: foreach my $branch (@$branchesRef) {
            if ($puzzle[$branch] eq $letter) {
                # Have we used this position yet?
                foreach my $usedBranch (@$pathRef) {
                    if ($usedBranch == $branch) {
                        next BRANCH;
                    }
                }
                if (scalar @$lettersRef == $index + 1) {
                    return 1; # End of word and success.
                }
                push @$pathRef, $branch;
                if (traverse($lettersRef, $pathRef)) {
                    return 1; # Recursive success.
                }
                else {
                    pop @$pathRef;
                }
            }
        }
    return 0; # No path found. Failed.
}

我知道我已经非常晚了,但是我之前用PHP做了一个——只是为了好玩……

http://www.lostsockdesign.com.au/sandbox/boggle/index.php?letters=fxieamloewbxastu 在0.90108秒内找到75个单词(133分)

F……X . .我 .............. E ............... 一个 ...................................... 米 .............................. L ............................ O ............................... E .................... W ............................ B .......................... X 一个 .................. 年代 .................................................. T ................. U…

给出了一些程序实际在做什么的指示-每个字母是它开始查看模式的地方,而每个'。这显示了中国试图走的一条道路。越多越好。“它搜索得越远。

如果你想要密码,请告诉我…这是一个可怕的PHP和HTML的混合体,从来没有想过要看到阳光,所以我不敢在这里张贴:P

下面是使用NLTK工具包中的预定义单词的解决方案 NLTK有NLTK。语料库包,我们有一个叫做单词的包,它包含超过20万个英语单词,你可以简单地把它们都用到你的程序中。

一旦创建你的矩阵转换成一个字符数组,并执行这段代码

import nltk
from nltk.corpus import words
from collections import Counter

def possibleWords(input, charSet):
    for word in input:
        dict = Counter(word)
        flag = 1
        for key in dict.keys():
            if key not in charSet:
                flag = 0
        if flag == 1 and len(word)>5: #its depends if you want only length more than 5 use this otherwise remove that one. 
            print(word)


nltk.download('words')
word_list = words.words()
# prints 236736
print(len(word_list))
charSet = ['h', 'e', 'l', 'o', 'n', 'v', 't']
possibleWords(word_list, charSet)

输出:

eleven
eleventh
elevon
entente
entone
ethene
ethenol
evolve
evolvent
hellhole
helvell
hooven
letten
looten
nettle
nonene
nonent
nonlevel
notelet
novelet
novelette
novene
teenet
teethe
teevee
telethon
tellee
tenent
tentlet
theelol
toetoe
tonlet
toothlet
tootle
tottle
vellon
velvet
velveteen
venene
vennel
venthole
voeten
volent
volvelle
volvent
voteen

我希望你能得到它。

首先,阅读c#语言设计师如何解决一个相关问题: http://blogs.msdn.com/ericlippert/archive/2009/02/04/a-nasality-talisman-for-the-sultana-analyst.aspx。

像他一样,您可以从字典开始,并通过从字母排序的字母数组到可以根据这些字母拼写的单词列表创建字典来规范化单词。

接下来,开始从黑板上创建可能的单词并查找它们。我怀疑这将让你走得很远,但肯定有更多的技巧可以加快速度。

我建议根据单词做一个字母树。这棵树将由字母结构组成,像这样:

letter: char
isWord: boolean

然后构建树,每个深度添加一个新字母。换句话说,第一层是字母表;然后从这些树中,会有另外26个条目,以此类推,直到你把所有的单词都拼出来。坚持这个解析树,它将使所有可能的答案更快地查找。

使用这个解析过的树,您可以非常快速地找到解决方案。下面是伪代码:

BEGIN: 
    For each letter:
        if the struct representing it on the current depth has isWord == true, enter it as an answer.
        Cycle through all its neighbors; if there is a child of the current node corresponding to the letter, recursively call BEGIN on it.

这可以通过一些动态编程来加快。例如,在你的样本中,两个“A”都在一个“E”和一个“W”旁边,这(从它们击中它们的点来看)是相同的。我没有足够的时间来详细说明这个代码,但我想你们可以理解。

此外,我相信你会找到其他解决方案,如果你谷歌“Boggle solver”。