Transliterator::transliterate() - Transliterator class
Transliterator::transliterate()
(PHP 5 >= 5.4.0, PHP 7, PECL intl >= 2.0.0)
Transliterate a string
Description
Object-oriented style
public Transliterator::transliterate(string $subject [, int $start [, int $end ]]): string
Procedural style
transliterator_transliterate(mixed $transliterator, string $subject [, int $start [, int $end ]]): string
Transforms a string or part thereof using an ICU transliterator.
Parameters
$transliterator
In the procedural version, either a Transliterator or a string from which a Transliterator can be built.
$subject
The string to be transformed.
$start
The start index (in UTF-16 code units) from which the string will start to be transformed, inclusive. Indexing starts at 0. The text before will be left as is.
$end
The end index (in UTF-16 code units) until which the string will be transformed, exclusive. Indexing starts at 0. The text after will be left as is.
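To illustrate how $start and $end bound the transformed region, here is a minimal sketch. It assumes the intl extension is loaded and uses the ICU ID Any-Upper, which is not part of the text above and was picked only because its effect is easy to see.

```php
<?php
// Sketch only: 'Any-Upper' is an illustrative choice of ICU transliterator ID.
$subject = 'hello world';

// Transform only code units 0..4; index 5 ($end) is exclusive.
echo transliterator_transliterate('Any-Upper', $subject, 0, 5), "\n"; // HELLO world

// Omitting $end transforms from $start to the end of the string.
echo transliterator_transliterate('Any-Upper', $subject, 6), "\n";    // hello WORLD

// Indices count UTF-16 code units, not bytes or code points: U+1D11E occupies
// two code units, so the letters that follow it start at index 2.
$clef = html_entity_decode('&#x1D11E;', ENT_QUOTES, 'UTF-8');
echo transliterator_transliterate('Any-Upper', $clef . 'abc', 2), "\n";
```

Text outside the [$start, $end) window is passed through untouched, which is useful when only part of a larger string should be converted.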
Return Values
The transformed string on success, or FALSE on failure.
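Because the return value can be FALSE, callers usually want to check it before using the result. The following is a sketch of one way to do that, assuming PHP 7 or later; transliterateOrFail() is a hypothetical helper name, not part of the intl extension.

```php
<?php
// Sketch only: transliterateOrFail() is a hypothetical helper, not part of intl.
function transliterateOrFail(string $id, string $subject): string
{
    // Transliterator::create() returns null when the ID cannot be resolved.
    $transliterator = Transliterator::create($id);
    if ($transliterator === null) {
        throw new RuntimeException('Cannot create transliterator: ' . intl_get_error_message());
    }

    // transliterate() returns false on failure; the object keeps the error details.
    $result = $transliterator->transliterate($subject);
    if ($result === false) {
        throw new RuntimeException('Transliteration failed: ' . $transliterator->getErrorMessage());
    }

    return $result;
}

echo transliterateOrFail('Any-Upper', 'abc'), "\n"; // ABC
```

Turning the FALSE return into an exception is a design choice of this sketch; returning a default string or logging the message would work just as well.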
Examples
Converting escaped UTF-16 code units
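A minimal sketch of such a conversion, assuming the ICU transliterator IDs Hex-Any/Java (escapes to characters) and Any-Hex/Java (characters to escapes) are available in the installed ICU build, and that the mbstring extension is loaded for mb_strlen():

```php
<?php
// Single quotes keep the \uXXXX escapes literal so that ICU, not PHP, decodes them.
$escaped = '\u304A\u65E9\u3046\u3054\u3056\u3044\u307E\u3059';
echo transliterator_transliterate('Hex-Any/Java', $escaped), "\n";

// The reverse operation with a supplementary character (U+1D11E).
$supplChar = html_entity_decode('&#x1D11E;', ENT_QUOTES, 'UTF-8');
echo mb_strlen($supplChar, 'UTF-8'), "\n";

// One code point is escaped as two UTF-16 code units (a surrogate pair).
$encSupplChar = transliterator_transliterate('Any-Hex/Java', $supplChar);
echo $encSupplChar, "\n";

// Converting the escapes back restores the original character.
echo transliterator_transliterate('Hex-Any/Java', $encSupplChar), "\n";
```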
The above example will output something similar to:
お早うございます
1
\uD834\uDD1E
𝄞
See Also
- Transliterator::getErrorMessage() - Get last error message
- Transliterator::__construct() - Private constructor to deny instantiation
I pretty much like hdogan's idea, but it misses at least one group of characters: ligatures. They are used at least in Norwegian, and I read something about French, too. Some are used purely for styling (e.g. ﬁ). Here's an example that should support all characters, at least according to the documentation. In this example every character is first converted to a Latin character. Once that is done, all Latin characters are replaced by their ASCII equivalents.
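The two steps described here can be sketched as follows. This assumes the compound ID 'Any-Latin; Latin-ASCII', which is the one used in the follow-up note below; toAscii() is a hypothetical helper name.

```php
<?php
// Sketch only: toAscii() is a hypothetical helper name.
// Returns the converted string, or false if transliteration fails.
function toAscii(string $text)
{
    // Step 1 (Any-Latin): convert every script to Latin.
    // Step 2 (Latin-ASCII): replace Latin characters with ASCII equivalents.
    return transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
}

var_dump(toAscii('A æ Übérmensch på høyeste nivå! ﬁ'));
```

The semicolon in the ID chains the two transliterators, so the output of the first becomes the input of the second.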
Sorry for posting again, but I found a bug in my code: a character like the Cyrillic ь (the soft sign, which has no sound) is transliterated by "Any-Latin" to a prime character, and "Latin-ASCII" doesn't touch prime characters. I therefore added a rule that removes all characters in the range \u0100 to \u7fff. Here's my new code, including an example:

    var_dump(transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0100-\u7fff] remove', "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. ﬁ"));
    // string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est. fi"

Another approach I found quite helpful, if you do not want to remove characters under any circumstances, is to use iconv() in addition. This should return only ASCII characters. See: http://stackoverflow.com/a/3542748/517914

An example of that as well:

    var_dump(iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", transliterator_transliterate('Any-Latin; Latin-ASCII', "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. ﬁ")));
    // string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est'. fi"
You can create slugs easily with:
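A minimal sketch of such a slug helper, assuming PHP 7 or later and the compound ID 'Any-Latin; Latin-ASCII' seen in the notes above. The name slugify() and the choice to turn every non-alphanumeric run into a hyphen are illustrative, not taken from the original note.

```php
<?php
// Sketch only: slugify() is a hypothetical helper, not an intl function.
function slugify(string $text): string
{
    // Convert to Latin script, then to plain ASCII where ICU knows a mapping.
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
    if ($ascii === false) {
        return ''; // transliteration failed: no usable slug
    }

    // Lowercase, collapse every run of non-alphanumeric bytes into one hyphen,
    // and trim hyphens from both ends.
    $slug = (string) preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    return trim($slug, '-');
}

echo slugify('Crème Brûlée, s\'il vous plaît!'), "\n"; // creme-brulee-s-il-vous-plait
```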
OOP version:
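A sketch of slug creation in object-oriented style, under the same assumptions (PHP 7 or later, the compound ID 'Any-Latin; Latin-ASCII'). makeSlug() is a hypothetical helper name.

```php
<?php
// Sketch only: makeSlug() is a hypothetical helper, not part of intl.
function makeSlug(Transliterator $transliterator, string $text): string
{
    $ascii = $transliterator->transliterate($text);
    if ($ascii === false) {
        return ''; // transliteration failed: no usable slug
    }

    // Lowercase, replace non-alphanumeric runs with a hyphen, trim the ends.
    $slug = (string) preg_replace('/[^a-z0-9]+/', '-', strtolower($ascii));
    return trim($slug, '-');
}

// Build the transliterator once and reuse it for every string.
$transliterator = Transliterator::create('Any-Latin; Latin-ASCII');
if ($transliterator === null) {
    throw new RuntimeException('Cannot create transliterator: ' . intl_get_error_message());
}

echo makeSlug($transliterator, 'Crème Brûlée!'), "\n"; // creme-brulee
```

Creating the object once avoids re-parsing the ID on each call, which matters when many strings are converted in a loop.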
There are some possibly undesirable conversions with ASCII//TRANSLIT//IGNORE, or your users may require some custom handling. You might want to run a substitution up front for certain things, such as when you want 3-letter ISO codes to replace currency symbols. £ transliterates to "lb", for example, which is incorrect since it's a currency symbol, not a weight symbol (#).

ASCII//TRANSLIT//IGNORE does a great job within the realm of possibility :-) When it doesn't do something you want it to, you can set up a CSV with one replacement per line and run a function like:

    function stripByMap($inputString, $mapFile) {
        // Each line of the map file has the form "search,replacement".
        $csv = file($mapFile);
        if ($csv === false) {
            return $inputString; // map file unreadable: leave the input untouched
        }
        foreach ($csv as $line) {
            $arrLine = explode(',', trim($line), 2);
            // Skip blank or malformed lines that have no comma.
            if (count($arrLine) === 2) {
                $inputString = str_replace($arrLine[0], $arrLine[1], $inputString);
            }
        }
        return $inputString;
    }

or you can write some regexes. Transliterating using ASCII//TRANSLIT//IGNORE works so well that your map probably won't be very long...
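The up-front substitution for currency symbols can also be sketched with an in-memory table instead of a CSV file. The function name and the symbol-to-code table below are illustrative assumptions; the result would then be passed on to iconv() or a transliterator as described above.

```php
<?php
// Sketch only: replaceCurrencySymbols() and its table are illustrative.
function replaceCurrencySymbols(string $text): string
{
    // Map each currency symbol to its 3-letter ISO 4217 code.
    $map = [
        '£' => 'GBP ',
        '€' => 'EUR ',
        '¥' => 'JPY ',
    ];

    // strtr() with an array replaces every key in a single pass.
    return strtr($text, $map);
}

echo replaceCurrencySymbols('Price: £5 or €6'), "\n"; // Price: GBP 5 or EUR 6
```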