PHP

  • substr and UTF-8


    [snip]
    Actually, this is false. I don't know what I was thinking. The high bit
    is set in every byte of a multi-byte UTF-8 sequence. If it is not set,
    the byte is an ASCII character.
    The bytes are actually laid out as follows [1]:
    U-00000000 - U-0000007F: 0xxxxxxx
    U-00000080 - U-000007FF: 110xxxxx 10xxxxxx
    U-00000800 - U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
    U-00010000 - U-001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    So there's no way to tell from a byte alone whether it is the last byte
    of a UTF-8 sequence, but you can tell whether it is the first by looking
    at bits 7 and 8 (the two highest bits, counting from 1). Specifically,
    if bit 8 is not on, the byte is ASCII and thus the "start" of a
    new character. Otherwise, if bit 7 is on, it's the start of a new
    multi-byte UTF-8 sequence. If bit 8 is on and bit 7 is off, it's a
    continuation byte.
    // $b is a byte value (0-255), e.g. from ord($s[$i]).
    function is_utf8_start($b) {
        return (($b & 0x80) == 0) || (($b & 0x40) != 0);
    }
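    To show how the start-byte test above could be used, here is a rough sketch that cuts a string at a byte limit without splitting a character. The helper name utf8_cut_bytes is made up for this example, and it assumes the input is already valid UTF-8. is_utf8_start is repeated only so the snippet runs on its own.

```php
<?php
// Repeated from the post so this sketch is self-contained.
if (!function_exists('is_utf8_start')) {
    function is_utf8_start($b) {
        return (($b & 0x80) == 0) || (($b & 0x40) != 0);
    }
}

// Hypothetical helper: return at most $maxBytes bytes of $s without
// cutting through a multi-byte character. Assumes $s is valid UTF-8.
function utf8_cut_bytes($s, $maxBytes) {
    if (strlen($s) <= $maxBytes) {
        return $s;
    }
    $end = max(0, $maxBytes);
    // If the byte at the cut point is a continuation byte, the cut would
    // land inside a character, so back up to that character's first byte.
    while ($end > 0 && !is_utf8_start(ord($s[$end]))) {
        $end--;
    }
    return substr($s, 0, $end);
}

$s = "na\xC3\xAFve";              // "naïve"; the ï takes two bytes (C3 AF)
$broken = substr($s, 0, 3);       // ends with a lone 0xC3 byte
$safe   = utf8_cut_bytes($s, 3);  // backs up to "na"
```

    Note that this limits the length in bytes, not in characters, so it is only a fit when the limit itself is a byte count (a column size, a packet size and so on).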
    [/snip]
    :) I think I will go with the mb_substr function, it works for me :)
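    For comparison, a small sketch of the difference between the two functions on the same input. It assumes the mbstring extension is loaded and that the string is UTF-8; the sample string is just an illustration.

```php
<?php
// Requires the mbstring extension.
$s = "na\xC3\xAFve caf\xC3\xA9";        // "naïve café" encoded as UTF-8

// substr counts bytes: the third byte is only the first half of "ï".
$bytes = substr($s, 0, 3);

// mb_substr counts characters when given the encoding explicitly.
$chars = mb_substr($s, 0, 3, 'UTF-8');  // "naï", which is 4 bytes long

$bytesValid = mb_check_encoding($bytes, 'UTF-8'); // truncated sequence
$charsValid = mb_check_encoding($chars, 'UTF-8'); // complete characters
```

    Passing 'UTF-8' explicitly avoids depending on the configured internal encoding, which may differ between servers.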




EMSDN.COM