我想创建一个URL缩短服务,你可以写一个长URL到输入字段和服务缩短URL为“http://www.example.org/abcdef”。

可以用包含a-z, a-z和0-9的6个字符的字符串代替"abcdef"。这样就有560 ~ 570亿个可能的字符串。

我的方法:

我有一个有三列的数据库表:

Id,整数,自动递增 long, string,用户输入的长URL 短,字符串,缩短的URL(或只有六个字符)

然后将长URL插入到表中。然后,我将为“id”选择自动递增的值,并构建它的散列。这个散列应该作为“short”插入。但是我应该构建什么样的哈希呢?像MD5这样的哈希算法会创建太长的字符串。我认为我不用这些算法。一个自建的算法也可以。

我的想法:

对于“http://www.google.de/”,我得到了自动增量id 239472。然后我执行以下步骤:

short = '';
if divisible by 2, add "a"+the result to short
if divisible by 3, add "b"+the result to short
... until I have divisors for a-z and A-Z.

可以不断重复,直到这个数不能被整除为止。你认为这是一个好方法吗?你有更好的主意吗?

由于对这个主题的持续兴趣,我已经发布了一个高效的GitHub解决方案,包括JavaScript、PHP、Python和Java的实现。如果你喜欢,添加你的解决方案:)


当前回答

alphabet = map(chr, range(97,123)+range(65,91)) + map(str,range(0,10))

def lookup(k, a=alphabet):
    if type(k) == int:
        return a[k]
    elif type(k) == str:
        return a.index(k)


def encode(i, a=alphabet):
    '''Takes an integer and returns it in the given base with mappings for upper/lower case letters and numbers 0-9.'''
    try:
        i = int(i)
    except Exception:
        raise TypeError("Input must be an integer.")

    def incode(i=i, p=1, a=a):
        # Here to protect p.                                                                                                                                                                                                                
        if i <= 61:
            return lookup(i)

        else:
            pval = pow(62,p)
            nval = i/pval
            remainder = i % pval
            if nval <= 61:
                return lookup(nval) + incode(i % pval)
            else:
                return incode(i, p+1)

    return incode()



def decode(s, a=alphabet):
    '''Takes a base 62 string in our alphabet and returns it in base10.'''
    try:
        s = str(s)
    except Exception:
        raise TypeError("Input must be a string.")

    return sum([lookup(i) * pow(62,p) for p,i in enumerate(list(reversed(s)))])a

这是我的版本,给任何需要的人。

其他回答

要获得高质量的Node.js / JavaScript解决方案,请参阅id缩短器模块,该模块经过了全面测试,已经在生产环境中使用了几个月。

它提供了一个有效的id / URL缩短支持的可插拔存储默认为Redis,你甚至可以自定义你的短id字符集,是否缩短是幂等的。这是一个重要的区别,并不是所有的URL缩短器都考虑在内。

相对于这里的其他答案,这个模块实现了上面Marcel Jackwerth的优秀的公认答案。

解决方案的核心由下面的Redis Lua代码段提供:

local sequence = redis.call('incr', KEYS[1])

local chars = '0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz'
local remaining = sequence
local slug = ''

while (remaining > 0) do
  local d = (remaining % 60)
  local character = string.sub(chars, d + 1, d + 1)

  slug = character .. slug
  remaining = (remaining - d) / 60
end

redis.call('hset', KEYS[2], slug, ARGV[1])

return slug

c#版本:

public class UrlShortener 
{
    private static String ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static int    BASE     = 62;

    public static String encode(int num)
    {
        StringBuilder sb = new StringBuilder();

        while ( num > 0 )
        {
            sb.Append( ALPHABET[( num % BASE )] );
            num /= BASE;
        }

        StringBuilder builder = new StringBuilder();
        for (int i = sb.Length - 1; i >= 0; i--)
        {
            builder.Append(sb[i]);
        }
        return builder.ToString(); 
    }

    public static int decode(String str)
    {
        int num = 0;

        for ( int i = 0, len = str.Length; i < len; i++ )
        {
            num = num * BASE + ALPHABET.IndexOf( str[(i)] ); 
        }

        return num;
    }   
}

Scala中的实现:

class Encoder(alphabet: String) extends (Long => String) {

  val Base = alphabet.size

  override def apply(number: Long) = {
    def encode(current: Long): List[Int] = {
      if (current == 0) Nil
      else (current % Base).toInt :: encode(current / Base)
    }
    encode(number).reverse
      .map(current => alphabet.charAt(current)).mkString
  }
}

class Decoder(alphabet: String) extends (String => Long) {

  val Base = alphabet.size

  override def apply(string: String) = {
    def decode(current: Long, encodedPart: String): Long = {
      if (encodedPart.size == 0) current
      else decode(current * Base + alphabet.indexOf(encodedPart.head),encodedPart.tail)
    }
    decode(0,string)
  }
}

使用Scala测试的测试示例:

import org.scalatest.{FlatSpec, Matchers}

class DecoderAndEncoderTest extends FlatSpec with Matchers {

  val Alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

  "A number with base 10" should "be correctly encoded into base 62 string" in {
    val encoder = new Encoder(Alphabet)
    encoder(127) should be ("cd")
    encoder(543513414) should be ("KWGPy")
  }

  "A base 62 string" should "be correctly decoded into a number with base 10" in {
    val decoder = new Decoder(Alphabet)
    decoder("cd") should be (127)
    decoder("KWGPy") should be (543513414)
  }

}

你是故意省略O 0和i的吗?

我刚刚基于Ryan的解决方案创建了一个PHP类。

<?php

    $shorty = new App_Shorty();

    echo 'ID: ' . 1000;
    echo '<br/> Short link: ' . $shorty->encode(1000);
    echo '<br/> Decoded Short Link: ' . $shorty->decode($shorty->encode(1000));


    /**
     * A nice shorting class based on Ryan Charmley's suggestion see the link on Stack Overflow below.
     * @author Svetoslav Marinov (Slavi) | http://WebWeb.ca
     * @see http://stackoverflow.com/questions/742013/how-to-code-a-url-shortener/10386945#10386945
     */
    class App_Shorty {
        /**
         * Explicitly omitted: i, o, 1, 0 because they are confusing. Also use only lowercase ... as
         * dictating this over the phone might be tough.
         * @var string
         */
        private $dictionary = "abcdfghjklmnpqrstvwxyz23456789";
        private $dictionary_array = array();

        public function __construct() {
            $this->dictionary_array = str_split($this->dictionary);
        }

        /**
         * Gets ID and converts it into a string.
         * @param int $id
         */
        public function encode($id) {
            $str_id = '';
            $base = count($this->dictionary_array);

            while ($id > 0) {
                $rem = $id % $base;
                $id = ($id - $rem) / $base;
                $str_id .= $this->dictionary_array[$rem];
            }

            return $str_id;
        }

        /**
         * Converts /abc into an integer ID
         * @param string
         * @return int $id
         */
        public function decode($str_id) {
            $id = 0;
            $id_ar = str_split($str_id);
            $base = count($this->dictionary_array);

            for ($i = count($id_ar); $i > 0; $i--) {
                $id += array_search($id_ar[$i - 1], $this->dictionary_array) * pow($base, $i - 1);
            }
            return $id;
        }
    }
?>

您可以散列整个URL,但如果您只想缩短id,请按照marcel的建议进行。我写了这个Python实现:

https://gist.github.com/778542