PHP

  • Ongoing encoding issues

    7 answers

    Hi all, I posted a question a couple of days ago regarding a web app I have
    wherein users are able to indicate prices and concessions via a text field,
    and the resulting encoding issues I have experienced, the main one being
    seeing the pound sign rendered as an odd symbol when viewing the results in a
    browser with the encoding set to Latin-1.

    My question is: how do I overcome this? If I set my browser encoding to
    Latin-1 and enter the data, I get that odd symbol; if I set it to UTF-8, I get
    clean data. Is there a way to sniff out what encoding the browser is using
    and then clean the data?

    I am googling for help as well, but you guys have been so helpful in the past
    that I thought I'd try you too.
  • No.1 | | 1307 bytes | |

    Dave Goodchild wrote:
    Hi all, I posted a question a couple of days ago regarding a web app I have
    wherein users are able to indicate prices and concessions via a text field,
    and the resulting encoding issues I have experienced, the main one being
    seeing the pound sign rendered as an odd symbol when viewing the results in a
    browser with the encoding set to Latin-1.

    My question is: how do I overcome this? If I set my browser encoding to
    Latin-1 and enter the data, I get that odd symbol; if I set it to UTF-8, I get
    clean data. Is there a way to sniff out what encoding the browser is using
    and then clean the data?

    check out phpinfo(); there is stuff in there telling you about what client
    encoding was [probably] used.

    that said, you should probably opt to output everything as UTF-8: by default,
    all decent browsers will return data in the same encoding as the page was
    given to them in. this requires you to have php send the correct header (don't
    bother with all that META tag crap); doing the following will automatically
    cause the appropriate header to be sent:

    ini_set('default_charset', 'UTF-8');
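
    The same header can also be sent explicitly with header(). What follows is
    only a minimal sketch: the helper name content_type_value is hypothetical,
    and the call has to run before any output is sent.

```php
<?php
// Hypothetical helper: builds the value of the Content-Type response header.
function content_type_value(string $charset = 'UTF-8', string $mime = 'text/html'): string
{
    return $mime . '; charset=' . $charset;
}

// Send the header explicitly when running under a web server.
// Nothing is sent from the command line or once output has already started.
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: ' . content_type_value());
}
```

    With the defaults above the browser receives
    "Content-Type: text/html; charset=UTF-8".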

  • No.2 | | 1449 bytes | |

    Hi Dave.

    I don't think you are able to detect your user's character encoding
    with php only (at least not rock-solid). Just some days ago, there
    was a discussion about that issue (at least concerning Safari) on the
    Apple web dev mailing list.

    Have a look at:

    It could be possible to send some information about character encoding
    along with the user submitted post data to your php script as well.
    Depending on that encoding, do some string replace on your input data.
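
    One way to sketch that idea is to ship a known character in a hidden form
    field and look at the bytes that come back. Everything named here is an
    assumption for illustration: the field name "probe", both function names,
    and the premise that only UTF-8 and ISO-8859-1 need to be told apart.

```php
<?php
// Assumed form markup: <input type="hidden" name="probe" value="£">
// The function inspects the raw bytes the browser sent back for that field.
function guess_submission_encoding(string $probe): ?string
{
    if ($probe === "\xC2\xA3") {   // pound sign encoded as UTF-8
        return 'UTF-8';
    }
    if ($probe === "\xA3") {       // pound sign encoded as ISO-8859-1
        return 'ISO-8859-1';
    }
    return null;                   // unknown: do not guess
}

// Converts a submitted value to UTF-8 based on the probe, or returns null
// when the encoding could not be determined or the conversion failed.
function normalise_field(string $value, string $probe): ?string
{
    $encoding = guess_submission_encoding($probe);
    if ($encoding === null) {
        return null;
    }
    if ($encoding === 'UTF-8') {
        return $value;
    }
    $converted = iconv($encoding, 'UTF-8', $value);
    return $converted === false ? null : $converted;
}
```

    In a form handler this would be called as
    normalise_field($_POST['price'], $_POST['probe']), where "price" is again a
    made-up field name.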

    Have you provided a valid "charset" encoding in your html?

    Maybe you could give us a link to a test page?

    //frank

  • No.3 | | 677 bytes | |

    # frank.arensmeier (AT) nikehydraulics (DOT) se / 2007-01-26 14:29:52 +0100:
    I don't think you are able to detect your user's character encoding
    with php only (at least not rock-solid). Just some days ago, there
    was a discussion about that issue (at least concerning Safari) on the
    Apple web dev mailing list.

    Have a look at:

    That thread is about a different problem.

    It could be possible to send some information about character encoding
    along with the user submitted post data to your php script as well.

    Yeah, it's called the Content-Type entity header, and it's an important
    part of HTTP. Have you heard about HTTP?
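
    A rough sketch of reading that header from PHP follows. The helper name is
    hypothetical, and many browsers leave the charset parameter off form
    submissions, so the result can easily be null.

```php
<?php
// Hypothetical helper: extracts the charset parameter from a Content-Type
// value such as "application/x-www-form-urlencoded; charset=UTF-8".
function charset_from_content_type(?string $contentType): ?string
{
    if ($contentType === null) {
        return null;
    }
    if (preg_match('/;\s*charset\s*=\s*"?([A-Za-z0-9._:-]+)"?/i', $contentType, $m) === 1) {
        return strtoupper($m[1]);
    }
    return null; // header present but no charset parameter
}

// Under a web server PHP exposes the request header as $_SERVER['CONTENT_TYPE'];
// it is not set on the command line, so this yields null there.
$requestCharset = charset_from_content_type($_SERVER['CONTENT_TYPE'] ?? null);
```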
  • No.4 | | 1514 bytes | |

    # buddhamagnet (AT) gmail (DOT) com / 2007-01-26 09:33:13 +0000:
    Hi all, I posted a question a couple of days ago regarding a web app I have
    wherein users are able to indicate prices and concessions via a text field,
    and the resulting encoding issues I have experienced, the main one being
    seeing the pound sign rendered as an odd symbol when viewing the results in a
    browser with the encoding set to Latin-1.

    My question is: how do I overcome this? If I set my browser encoding to
    Latin-1 and enter the data, I get that odd symbol; if I set it to UTF-8, I get
    clean data. Is there a way to sniff out what encoding the browser is using
    and then clean the data?

    I am googling for help as well, but you guys have been so helpful in the past
    that I thought I'd try you too.

    Your PostgreSQL database uses some encoding, your PHP script runs under
    some locale (incl. character encoding), and the browser sent the text in
    some encoding. PostgreSQL assumes the input data is in the charset the
    database uses (unless you have client_encoding set in postgresql.conf, or
    PGCLIENTENCODING in the environment, or have set client_encoding
    using the SET command).

    It's important that you either correctly identify the encoding of the
    inserted data to PostgreSQL or convert it to the encoding it expects
    beforehand. You can use the iconv or recode functions in PHP; I'd probably
    also check whether there's an Apache input filter for character encoding
    conversions.
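
    The conversion step could look roughly like this with iconv. This is a
    sketch, not a detector: the function name to_utf8 is made up, and input
    that is not valid UTF-8 is simply assumed to be ISO-8859-1.

```php
<?php
// Returns $input as UTF-8. Input that is not valid UTF-8 is ASSUMED to be
// in $fallback (ISO-8859-1 by default); that is a guess, not a detection.
function to_utf8(string $input, string $fallback = 'ISO-8859-1'): string
{
    // The /u modifier makes PCRE validate the subject as UTF-8,
    // so an empty pattern is enough to test validity.
    if (preg_match('//u', $input) === 1) {
        return $input;
    }
    $converted = iconv($fallback, 'UTF-8', $input);
    if ($converted === false) {
        throw new InvalidArgumentException('Cannot convert input to UTF-8');
    }
    return $converted;
}
```

    Running every submitted field through such a function before the INSERT
    means the database only ever sees one encoding, whatever the browser sent.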
  • No.5 | | 1373 bytes | |

    # neuhauser (AT) sigpipe (DOT) cz / 2007-01-26 21:09:34 +0000:

    Your PostgreSQL database uses some encoding,

    Dave pointed out to me that he's using MySQL. That means the
    configuration mechanisms for the database will be different, but the
    principal issue remains the same.
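
    For MySQL accessed through PDO, one way to state the connection encoding
    is the charset parameter of the DSN. The sketch below assumes PDO is in
    use at all, and the helper name, host, database name and credentials are
    placeholders.

```php
<?php
// Hypothetical helper: builds a PDO DSN that names the connection charset,
// so MySQL knows which encoding the client is sending and expecting.
function mysql_dsn(string $host, string $dbname, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Usage (placeholder host, database and credentials):
// $pdo = new PDO(mysql_dsn('localhost', 'app'), 'user', 'secret');
```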

  • No.6 | | 1141 bytes | |

    Fri, January 26, 2007 3:33 am, Dave Goodchild wrote:
    Hi all, I posted a question a couple of days ago regarding a web app I have
    wherein users are able to indicate prices and concessions via a text field,
    and the resulting encoding issues I have experienced, the main one being
    seeing the pound sign rendered as an odd symbol when viewing the results in a
    browser with the encoding set to Latin-1.

    My question is: how do I overcome this? If I set my browser encoding to
    Latin-1 and enter the data, I get that odd symbol; if I set it to UTF-8, I get
    clean data. Is there a way to sniff out what encoding the browser is using
    and then clean the data?

    I am googling for help as well, but you guys have been so helpful in the past
    that I thought I'd try you too.

    Send the charset in your headers *AND* set it in a META tag.

    Firefox trusts the headers.
    IE trusts the META tag.

    If the user insists on viewing your UTF-8 document with Latin-1 after
    that, then they probably had to work pretty hard at it, and you should
    just leave them alone with the bed they have made.
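
    The META half of that advice might be generated like this, so that the tag
    and the header are built from the same charset string. The helper name is
    hypothetical.

```php
<?php
// Hypothetical helper: renders a META declaration for the given charset.
// Pass the same charset that is sent in the Content-Type response header.
function charset_meta_tag(string $charset = 'UTF-8'): string
{
    $value = htmlspecialchars('text/html; charset=' . $charset, ENT_QUOTES, 'UTF-8');
    return '<meta http-equiv="Content-Type" content="' . $value . '">';
}
```

    The returned string belongs as early as possible inside the page's head
    element.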
  • No.7 | | 1016 bytes | |

    Fri, January 26, 2007 7:24 am, Jochem Maas wrote:
    that said you should probably opt to output everything as UTF-8 - all
    decent browsers will return data in the same encoding as the page was
    given to them in by default - this requires you to have php send the
    correct header (don't bother with all that META tag crap), doing the
    following will automatically cause the appropriate header to be sent:

    *Do* bother with the META tag crap.

    MS IE will ignore the headers and attempt to "guess" the charset
    otherwise, based on some funky algorithm they've made up to compare
    the bytes in the HTML to what they "expect" for any given charset, and
    you'll get weird and confusing cases when the same "page" won't be
    UTF-8 just because the data within it suddenly tips over their
    count-point of what "should" be in a UTF-8 or Latin-1 document.

    Don't blame me -- I'm just reporting the behaviour. Talk to Bill.
