
Transliterator::transliterate() - Transliterator class

Posted by 乐乐 on 2023-11-21, in Technical Articles

Transliterator::transliterate()

(PHP 5 >= 5.4.0, PHP 7, PECL intl >= 2.0.0)

Transliterate a string

Description

Object-oriented style
public Transliterator::transliterate(string $subject[, int $start[, int $end]]): string

Procedural style
transliterator_transliterate(mixed $transliterator, string $subject[, int $start[, int $end]])

Transforms a string or part thereof using an ICU transliterator.

Parameters

$transliterator

In the procedural version, either a Transliterator or a string from which a Transliterator can be built.

$subject

The string to be transformed.

$start

The start index (in UTF-16 code units) from which the string will start to be transformed, inclusive. Indexing starts at 0. The text before will be left as is.

$end

The end index (in UTF-16 code units) until which the string will be transformed, exclusive. Indexing starts at 0. The text after will be left as is.
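
A minimal sketch of how $start and $end restrict the transformation, assuming the intl extension is loaded. The ICU ID `Any-Upper` is chosen purely for illustration; with ASCII input, UTF-16 code unit offsets coincide with byte offsets.

```php
<?php
// Uppercase only code units 0..4 ("hello"); $end = 5 is exclusive.
$partial = transliterator_transliterate('Any-Upper', 'hello world', 0, 5);
echo $partial, "\n"; // HELLO world

// Omitting $end transforms from $start to the end of the string.
$tail = transliterator_transliterate('Any-Upper', 'hello world', 6);
echo $tail, "\n"; // hello WORLD
```

For text outside the Basic Multilingual Plane, each character counts as two code units, so offsets need to be computed accordingly.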

Return Values

The transformed string on success, or FALSE on failure.
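
A sketch of checking for the FALSE return value, assuming the intl extension is loaded. The ID `NoSuchSource-NoSuchTarget` is a made-up name that ICU is not expected to recognize; `intl_get_error_message()` is used here to read the last intl error.

```php
<?php
// A made-up ID that ICU should reject; "@" silences the warning PHP raises.
$result = @transliterator_transliterate('NoSuchSource-NoSuchTarget', 'abc');

if ($result === false) {
    // Report the failure instead of using the return value as a string.
    echo 'Transliteration failed: ', intl_get_error_message(), "\n";
} else {
    echo $result, "\n";
}
```

Comparing with `=== false` matters because an empty string is a legitimate result of some transforms.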

Examples


Converting escaped UTF-16 code units
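
The code for this example is not present on this page. The following sketch is a reconstruction consistent with the output shown here, assuming the intl and mbstring extensions are loaded; the variable names are illustrative.

```php
<?php
// Escaped UTF-16 code units in Java syntax; single quotes keep the backslashes literal.
$escaped = '\u304A\u65E9\u3046\u3054\u3056\u3044\u307E\u3059';
$decoded = transliterator_transliterate('Hex-Any/Java', $escaped);
echo $decoded, "\n";

// The reverse operation with a supplementary character (U+1D11E).
$supplChar = html_entity_decode('&#x1D11E;', ENT_QUOTES | ENT_HTML5, 'UTF-8');
echo mb_strlen($supplChar, 'UTF-8'), "\n";

// One character becomes two escaped UTF-16 code units (a surrogate pair).
$encoded = transliterator_transliterate('Any-Hex/Java', $supplChar);
echo $encoded, "\n";

// And back again.
$roundTrip = transliterator_transliterate('Hex-Any/Java', $encoded);
echo $roundTrip, "\n";
```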

The above example will output something similar to:

お早うございます
1
\uD834\uDD1E
𝄞

See Also

  • Transliterator::getErrorMessage() - Get last error message
  • Transliterator::__construct() - Private constructor to deny instantiation

I quite like hdogan's idea, but it misses at least one group of characters: ligatures.
They are used in Norwegian at least, and I have read that French uses them too. Some are purely typographic (e.g. the fi ligature, ﬁ).
Here's an example that should support all characters, at least according to the documentation:
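
A minimal sketch of such a call, assuming the intl extension is loaded and using the compound ID `Any-Latin; Latin-ASCII` that the later snippets in these notes also use. The input string is illustrative.

```php
<?php
// Step 1 (Any-Latin): convert every script to Latin letters.
// Step 2 (Latin-ASCII): replace Latin letters with ASCII equivalents.
$ascii = transliterator_transliterate(
    'Any-Latin; Latin-ASCII',
    'A æ Übérmensch på høyeste nivå!'
);
var_dump($ascii);
```
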

In this example, every character is first converted to a Latin character. Once that is done, all Latin characters are replaced by their ASCII equivalents.

Sorry for posting again, but I found a bug in my code:
If the input contains a character like the Cyrillic ь (the soft sign, which has no sound of its own), "Any-Latin" transliterates it to a prime character, and "Latin-ASCII" leaves prime characters untouched. I therefore added a rule that removes every character in the range \u0100 to \u7fff.
Here's my new code, including an example:
var_dump(transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0100-\u7fff] remove',
  "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. fi"));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est. fi"
Another approach I found quite helpful, if you do not want to remove characters at all, is to run the result through iconv() as well. This should leave only ASCII characters.
See: http://stackoverflow.com/a/3542748/517914
Also an example here:
var_dump(iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", transliterator_transliterate('Any-Latin; Latin-ASCII',
  "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. fi"));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est'. fi"
You can create slugs easily with: 
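
A sketch of a procedural slug helper, assuming the intl extension is loaded. The function name `slugify` and the transform chain `Any-Latin; Latin-ASCII; Lower()` are assumptions for illustration, not necessarily the code the note originally carried.

```php
<?php
// Hypothetical helper: turn arbitrary text into a lowercase ASCII slug.
function slugify(string $text): string
{
    // Convert to Latin, then to ASCII, then to lowercase.
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII; Lower()', $text);
    if ($ascii === false) {
        return ''; // transliteration failed; the caller decides what to do
    }
    // Collapse every run of non-alphanumeric characters into one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);
    return trim($slug, '-');
}

echo slugify('Übérmensch på høyeste nivå!'), "\n"; // ubermensch-pa-hoyeste-niva
```
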
OOP version : 
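
A sketch of the object-oriented form, assuming the intl extension is loaded. The transform chain and variable names are assumptions for illustration; `Transliterator::create()` builds a reusable object, which avoids re-parsing the ID on every call.

```php
<?php
// Build the transliterator once and reuse it for many strings.
$transliterator = Transliterator::create('Any-Latin; Latin-ASCII; Lower()');
if ($transliterator === null) {
    throw new RuntimeException(intl_get_error_message());
}

$ascii = $transliterator->transliterate('Übérmensch på høyeste nivå');

// Replace runs of non-alphanumeric characters with a hyphen and trim the ends.
$slug = trim(preg_replace('/[^a-z0-9]+/', '-', $ascii), '-');
echo $slug, "\n"; // ubermensch-pa-hoyeste-niva
```
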
ASCII//TRANSLIT//IGNORE makes some possibly undesirable conversions, and your users may need custom handling.
You might want to run a substitution up front for certain things, for example to replace currency symbols with 3-letter ISO codes. £ transliterates to "lb", which is wrong here because it is a currency symbol, not a weight symbol (#).
ASCII//TRANSLIT//IGNORE does a great job within the realm of possibility :-)
When it doesn't do what you want, you can set up a CSV file with one replacement per line and run a function like:
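  // Apply every "search,replacement" pair listed in a CSV map file.
  function stripByMap($inputString, $mapFile)
  {
    // One pair per line; blank lines are skipped.
    $csv = file($mapFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($csv === false) {
      return $inputString; // map file unreadable: leave the input untouched
    }
    foreach ($csv as $line) {
      $arrLine = explode(',', trim($line), 2);
      if (count($arrLine) === 2) { // ignore malformed lines without a comma
        $inputString = str_replace($arrLine[0], $arrLine[1], $inputString);
      }
    }
    return $inputString;
  }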
or you can write some regexes. Transliterating using ASCII//TRANSLIT//IGNORE works so well that your map probably won't be very long...
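
A sketch of the up-front substitution described in this note, using an in-memory map instead of a CSV file. The function name `replaceCurrencySymbols` and the two-entry map are assumptions for illustration.

```php
<?php
// Hypothetical helper: swap currency symbols for ISO 4217 codes
// before the text is handed to a transliterator or to iconv().
function replaceCurrencySymbols(string $text): string
{
    $map = [
        '£' => 'GBP',
        '€' => 'EUR',
    ];
    // strtr() applies all replacements in a single pass.
    return strtr($text, $map);
}

$prepared = replaceCurrencySymbols('Price: £10 or €12');
echo $prepared, "\n"; // Price: GBP10 or EUR12
```

The prepared string can then be passed through ASCII//TRANSLIT//IGNORE as usual, since the symbols that would have been mistransliterated are already gone.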
