  • Folder names in THttpServer

    12 answers - 515 bytes

    Hello,
    I have received reports, and have seen for myself, that there is a problem
    with non-ASCII characters in the names of folders that contain files. Here
    is the report I got:
    The characters 80C@ turns out to be 80C?/A>. ATV
    Q@Qjqsi{ turns out to be ATV Q?/A>.
    R.@ turns out to be R.?/A>. And @m turns out to be ?/A>
    It doesn't seem to be tied to one specific character.
    Does anyone have any idea why this happens? Is there an RFC that explains this
    behavior and its cure?
    Best Regards,
    SubZero
  • No.1 | | 324 bytes | |

    The characters 80C@ turns out to be 80C?/A>. ATV

    I have no experience with Turkish, Russian or other character sets.
    What I can say is that I have no problem with the accented characters used in
    French.
    Maybe it is a Unicode or double-byte character set issue?

    BTW: FTP is defined as an 8-bit ASCII protocol.
  • No.2 | | 832 bytes | |

    Hello,

    Thank you for your reply. The standard Turkish and Russian character sets are
    not Unicode. They are 8-bit. In Turkish, the first 128 codes (0-127) are the
    same as in ANSI, but the characters at 128-255 are different.

    Best Regards,

    SZ
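
A minimal Python sketch of the point above: the same bytes mean different characters under different single-byte code pages. The thread never names the code pages involved, so cp1252 (Western European) and cp1254 (Turkish) are assumptions here, and Python's codecs only stand in for what Windows does.

```python
# Sketch: the same bytes read under two single-byte Windows code pages.
# cp1252 and cp1254 are assumptions; the thread does not name the code pages.

raw = bytes([0xE7, 0xF0, 0x41])  # two high bytes followed by ASCII "A"

as_western = raw.decode("cp1252")  # how a Western European system reads them
as_turkish = raw.decode("cp1254")  # how a Turkish system reads them

# ascii() shows the code points without depending on the console encoding.
print(ascii(as_western))
print(ascii(as_turkish))


def decode_byte(value, codepage):
    """Decode one byte, mapping unassigned byte values to U+FFFD."""
    return bytes([value]).decode(codepage, errors="replace")


# Byte values whose meaning differs between the two code pages.
differing = [b for b in range(256)
             if decode_byte(b, "cp1252") != decode_byte(b, "cp1254")]
print([hex(b) for b in differing])
```

Byte 0xE7 is a c with cedilla in both code pages, which fits the observation that French text works, while 0xF0 is one of the bytes that changes meaning.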

  • No.3 | | 439 bytes | |

    >Turkish or Russian standard characters are not unicode. They are 8-bit. In
    >Turkish the first 7-bits (0-127) are the same as ANSI but characters
    >(128-255) are different.

    So it is the same as French, and it works very well with French text. If it
    doesn't work with Turkish or Russian characters, then there is something I
    don't understand. You should single-step through the component code and try
    to find out what.
  • No.4 | | 1083 bytes | |

    I traced it to on command and it reads the input wrongly, character by
    character, for some characters such as "". Could you try with these characters?
    I know they look corrupted in your email client, but they are in fact valid
    Turkish characters, and when you put them in a folder name you will see that
    the ICS server really does corrupt them!

    Best Regards,

    SZ

  • No.5 | | 1290 bytes | |

    >I traced it to on command and it reads the input character by character
    >wrong on some characters such as "".

    Why is it wrong? In a previous message you told me Turkish used 8-bit
    characters. Now I understand it uses double-byte characters.

    >Could you try with these characters? I know they look corrupted in your
    >email client but indeed they are valid

    What you show me is actually TWO characters. I see a lower-case c with a
    cedilla (as in my French first name, "François") and a lower-case letter g. I
    have no idea how to try with these characters. If I try, they would be
    interpreted as "", which is perfectly correct in French and works very well.

    >Turkish characters and when you put them in a folder name, you will see
    >that ICS server will make them really corrupt!

    Please write a program (a very short BCB console-mode program is OK, though
    Delphi is better of course) that creates such a directory with a filename
    containing Turkish characters. Take a screen dump on your Turkish screen so
    that I can compare it with mine (make it a partial screen dump so that it is
    not too large, just enough to show what you are talking about). Put that
    screen dump on a server so that everybody can download it, along with your
    short program.
  • No.6 | | 1716 bytes | |

    No, Turkish does not use 16-bit characters. I sent you the two distinct
    problematic characters!

    I will try to build some test code.

    Best Regards,

    SZ

  • No.7 | | 3886 bytes | |

    I am missing the earlier mails of this thread (they were accidentally
    deleted), so I am picking up on this one (I hope I'm not missing the point
    entirely).

    Make sure you are not confusing Unicode with MBCS and SBCS.
    Unicode as Windows uses it (UTF-16) is built from 16-bit units.
    MBCS (Multi Byte Character Set), as opposed to SBCS (Single Byte Character
    Set), can use two (or more) bytes for certain characters, and only one byte
    for most other characters in that same character set.
    I'm not sure whether Turkish can have multi-byte characters?

    If you're clear about that, also know that the Borland VCL is limited in its
    ability to properly show non-Latin characters on Latin-set systems.
    This is because the VCL is completely MBCS inside. If it were Unicode we
    wouldn't have all these problems.
    Win2K and XP, for example, are Unicode based and convert MBCS to Unicode via
    a system-defined code page before displaying it anywhere.

    If, for instance, your system's code page is set to "UK English" (as an
    example) and you try to use foreign characters in an MBCS application (which
    is how Borland builds them), then you will have problems seeing the correct
    characters!
    To work around that you have to tell XP that you want to use a different
    code page (and then reboot your system).

    To do this (XP):

    My Computer / Control panel / Date, Time, Language and regional settings /
    Regional and Language / (third tab) Advanced
    Language for non-Unicode programs
    See what language is specified there. If it is not Turkish, try setting it to
    Turkish and see if this fixes things (*)

    (*) Without really knowing what the real issue is ;-))
    As I said, I hope I'm not beside the point entirely.

    Best Regards,
    Peter

    Peter Van Hove
    CD and DVD Data recovery
    Peter (AT) Smart-Projects (DOT) net

    www.Smart-Projects.net
    www.IsoBuster.com
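
The conversion through a code page described above can be sketched in Python. This is only an illustration under assumptions: the sample name "dağ", the choice of cp1254, cp1252 and cp932, and the use of Python codecs are not from the thread, and Windows' own conversion may pick a different substitute character than Python's "?".

```python
# Sketch: a Unicode name pushed through two different ANSI code pages.
# The name and the code pages are hypothetical examples.

name = "da\u011f"  # "dağ"; U+011F is the Turkish g with breve

# Under the Turkish code page every character has a single-byte slot.
turkish_bytes = name.encode("cp1254")

# Under the Western European code page U+011F has no slot, so the
# conversion has to substitute something; Python substitutes "?".
western_bytes = name.encode("cp1252", errors="replace")

print(turkish_bytes)
print(western_bytes)

# Storage width per character in SBCS, MBCS and UTF-16.
samples = {
    "SBCS cp1254, g with breve": "\u011f".encode("cp1254"),
    "MBCS cp932, ASCII letter": "A".encode("cp932"),
    "MBCS cp932, kanji": "\u65e5".encode("cp932"),
    "UTF-16, g with breve": "\u011f".encode("utf-16-le"),
}
for label, data in samples.items():
    print(label, len(data), data.hex())
```

The Turkish character fits in one byte under its own code page, so Turkish is single byte; the "?" only appears when the active code page has no slot for the character.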

  • No.8 | | 246 bytes | |

    >No Turkish is not 16 bit characters. I sent you the two distinct
    >problematic characters!

    Those characters, as received, are used in French and work perfectly well.
    So there is something different between Turkish and French characters.
  • No.9 | | 1364 bytes | |

    Hello,

    I just recalled that a long time ago I had a similar problem with a
    French customer. He gave me text in a file that I had to transmit to
    vehicles. At a certain moment he changed to new machines, and lots
    of parts of the text were corrupted in my logs, but not in the customer's
    logs (he did the back-office program, and I did the gateway/mobile part).

    After viewing the text files with a hex editor, it seemed that some
    characters at the beginning were 2 bytes, and the same (2 bytes) in some
    other parts of the file. However, most characters were 1 byte.

    I sent someone else to him, and I recall it was fixed by changing
    something in the regional settings. It was certainly not Unicode, because
    what I understand of Unicode is that each character is 2 bytes.

    I will try to trace back what the engineer I sent did. Not easy. He is
    the kind of person who writes nothing down and trusts his volatile memory
    :)

    Rgds, Wilfried
    http://www.mestdagh.biz

  • No.10 | | 2555 bytes | |

    Hello Peter,

    This confirms the reply I sent an hour or so ago. One program wrote to
    files (written in VB), and another program (written in Delphi) was reading
    corrupted text. The text was always corrupted in the first bytes and
    here and there as well. Most of it was OK. And the logs of the VB program
    showed nice text. It was all in French.

    I recall that some change in the regional settings fixed it, as you also
    mention.

    I googled a little on MBCS, and there seem to be differences between Win9x
    and NT systems. What I found matches what you say: if the regional settings
    are correct, then the text should be translated with the right code page.

    Rgds, Wilfried
    http://www.mestdagh.biz
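
The thread never identifies the encoding of those French files. One encoding that would match the description (mostly 1-byte characters, some 2-byte ones, damage in the first bytes) is UTF-8 with a byte order mark, read as a single-byte code page. The Python sketch below assumes exactly that; the sample text and the choice of cp1252 are hypothetical.

```python
import codecs

# Sketch: UTF-8 text with a byte order mark, misread as cp1252.
# That the files were UTF-8 is an assumption, not something the thread states.

text = "V\u00e9hicule arr\u00eat\u00e9"  # "Véhicule arrêté"

# Each accented letter takes 2 bytes in UTF-8, every other letter takes 1,
# and the byte order mark adds 3 bytes at the very start of the file.
utf8_bytes = codecs.BOM_UTF8 + text.encode("utf-8")
print(utf8_bytes.hex())

# A reader that assumes a single-byte code page turns each of those bytes
# into its own character, which damages the start and every accented letter.
misread = utf8_bytes.decode("cp1252")
print(ascii(misread))

# A reader that knows the encoding recovers the original text.
correct = utf8_bytes.decode("utf-8-sig")
print(ascii(correct))
```

If the files really were in a double-byte code page rather than UTF-8, the picture in a hex editor would be similar, which is why the regional settings change mentioned above could also explain the fix.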

  • No.11 | | 4381 bytes | |

    Hello Peter,

    Now I still see the folder name corrupted in DOS FTP, but I can get into the
    folder using the corrupted name as shown. However, FileZilla works fine, so
    there should be no problem!

    Thanks a LOT! :-))

    SZ

  • No.12 | | 4163 bytes | |

    Hi,

    Good to hear that things are working (more or less) now.
    Again, I'm not sure what the original question was, but if you created a
    folder with the old code page it may indeed still contain garbled characters
    (because the conversion never happened properly). If so, create the folder
    again, this time with the correct settings.

    FYI, for those interested:

    As for MBCS and the limitations it causes, it is frustrating that Borland
    has never addressed this properly.
    For an internally Unicode application such as mine, it is sad to see Borland
    convert everything to MBCS and hand that to Windows (because the VCL wraps
    the MBCS APIs), after which Windows needs to convert it to Unicode again.
    For those interested, I once put this to Team B in the newsgroups:
    #00a00b5180de3cb2

    If you want your Borland app to display things properly for all languages on
    all systems you need to resort to third-party Unicode components instead of
    the standard VCL components. What's stupid about this is that you need to
    completely replace all VCL objects, and I personally haven't found the "moral
    strength" to start doing that in my apps :-)
    E.g.:
    (This is NOT tested by me, but mentioned here just to complete the
    information on this topic)

    Hope this helps.

    Best Regards,
    Peter
