我想有一个函数从Unicode字符串创建鼻涕虫,例如gen_slug('Andrés Cortez')应该返回andres-cortez。我该怎么做呢?


当前回答

这里有一个处理特殊字符的好解决方案。

fantastic text => fantastic text

function slugify( $string, $separator = '-' ) {
    $accents_regex = '~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i';
    $special_cases = array( '&' => 'and', "'" => '');
    $string = mb_strtolower( trim( $string ), 'UTF-8' );
    $string = str_replace( array_keys($special_cases), array_values( $special_cases), $string );
    $string = preg_replace( $accents_regex, '$1', htmlentities( $string, ENT_QUOTES, 'UTF-8' ) );
    $string = preg_replace("/[^a-z0-9]/u", "$separator", $string);
    $string = preg_replace("/[$separator]+/u", "$separator", $string);
    return $string;
}

作家:Natxet

其他回答

Vous pouvez cette function pour transformer ton title en url
function slug($string, $delimiter = '-') {
    $oldLocale = setlocale(LC_ALL, '0');
    setlocale(LC_ALL, 'en_US.UTF-8');
    $clean = iconv('UTF-8', 'ASCII//TRANSLIT', $string);
    $clean = preg_replace("/[^a-zA-Z0-9\/_|+ -]/", '', $clean);
    $clean = strtolower($clean);
    $clean = preg_replace("/[\/_|+ -]+/", $delimiter, $clean);
    $clean = trim($clean, $delimiter);
    setlocale(LC_ALL, $oldLocale);
    return $clean;
}

对我来说,这个变体是完美的,它也改变&和。下面是代码:

function dSlug($string) {
    return strtolower(trim(preg_replace('~[^0-9a-z]+~i', '-', html_entity_decode(preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml);~i', '$1',htmlentities(preg_replace('/[&]/', ' and ', $title), ENT_QUOTES, 'UTF-8')), ENT_QUOTES, 'UTF-8')), '-'));
}`

更新

由于这个答案引起了一些关注,我在这里添加了一些解释。

所提供的解决方案基本上将用-(连字符)替换除A-Z、A-Z、0-9和-(连字符)之外的所有内容。因此,它不能与其他unicode字符(URL段码/字符串的有效字符)正常工作。一种常见的情况是输入字符串包含非英语字符。

只有当您确信输入字符串不会包含unicode字符时才使用此解决方案,您可能希望这些字符成为output/slug的一部分。

如。“नारीशक्ति”将成为 "----------" ( 连字符)而不是“नारी——शक्ति”(有效的URL蛞蝓)。

回答

$slug = strtolower(trim(preg_replace('/[^A-Za-z0-9-]+/', '-', $string)));

与其用冗长的替换,不如试试下面这个:

public static function slugify($text, string $divider = '-')
{
  // replace non letter or digits by divider
  $text = preg_replace('~[^\pL\d]+~u', $divider, $text);

  // transliterate
  $text = iconv('utf-8', 'us-ascii//TRANSLIT', $text);

  // remove unwanted characters
  $text = preg_replace('~[^-\w]+~', '', $text);

  // trim
  $text = trim($text, $divider);

  // remove duplicate divider
  $text = preg_replace('~-+~', $divider, $text);

  // lowercase
  $text = strtolower($text);

  if (empty($text)) {
    return 'n-a';
  }

  return $text;
}

这是基于Symfony的Jobeet教程。

这可能也是一种方法。灵感来自这些链接专家交流和alinalexander

function slugifier($txt){

   /* Get rid of accented characters */
   $search = explode(",","ç,æ,œ,á,é,í,ó,ú,à,è,ì,ò,ù,ä,ë,ï,ö,ü,ÿ,â,ê,î,ô,û,å,e,i,ø,u");
   $replace = explode(",","c,ae,oe,a,e,i,o,u,a,e,i,o,u,a,e,i,o,u,y,a,e,i,o,u,a,e,i,o,u");
   $txt = str_replace($search, $replace, $txt);

   /* Lowercase all the characters */
   $txt = strtolower($txt);

   /* Avoid whitespace at the beginning and the ending */
   $txt = trim($txt);

   /* Replace all the characters that are not in a-z or 0-9 by a hyphen */
   $txt = preg_replace("/[^a-z0-9]/", "-", $txt);
   /* Remove hyphen anywhere it's more than one */
   $txt = preg_replace("/[\-]+/", '-', $txt);
   return $txt;   
}