Gammon Forum

UTF-8 Plug-In Encoding Not Recognized ?!?


Posted by 1of10   Canada  (54 posts)  Bio
Date Sat 10 Apr 2004 08:20 AM (UTC)

Amended on Sat 10 Apr 2004 08:26 AM (UTC) by 1of10

Message
I'm not sure if this is the fault of my editor incorrectly saving the files, or if MUSHclient has a small bug when loading certain edited UTF-8 files...

My editor is UltraEdit-32 v10.10c. MUSHclient is v3.47.

My editor status line says the file format is U8-UNIX (or U8-DOS).

When I attempt to load one of these edited files, I get the following error:

Line 1: Expected '<', got "ÿ" (content not permitted here) (problem in this file)

The only workaround I have found is to convert the file with "UTF-8 to ASCII" and change the 'encoding=' attribute in the first line of the plug-in to "ASCII".

At first, while writing this post, I thought my editor might be handling UTF-8 improperly (maybe it still is?). I was using 10.10a, and have just upgraded to 10.10c. The changelog mentions improved UTF-8 handling. However, the problem still exists...

Addendum:
Oddly enough, when I examine both an ASCII-encoded, edited file and a UTF-8-encoded, unedited file (as originally distributed with MUSHclient), the same two characters are always present before the '<' of the <?xml ...?> tag: ÿþ

One of Ten
Secondary Adjunct of Unimatrix Z03

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #1 on Sat 10 Apr 2004 10:47 PM (UTC)
Message
The characters you refer to are hex FF FE, which are used as a "unicode-marker" (byte order mark) at the start of a text file (e.g. by Notepad).

MUSHclient does not at present detect that marker. I have added it as suggestion #515 for it to do so.
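
For readers who want to check a file for such a marker themselves, here is a minimal Python sketch. It is not MUSHclient's code; the name `sniff_bom` is hypothetical, and it only looks at the standard UTF-8, UTF-16 and UTF-32 byte order marks.

```python
import codecs

# Byte order marks, UTF-32 first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0
```

Calling `sniff_bom` on the first few bytes of the plug-in file would report "utf-16-le" for a file starting with FF FE. Those two bytes show up as ÿþ when viewed as Latin-1 text, which matches the characters quoted in the error message.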

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #2 on Fri 16 Apr 2004 04:13 AM (UTC)
Message
After checking the file (a Unicode file created with Notepad), it is really a 16-bit Unicode file, not a UTF-8 file, so strictly speaking MUSHclient *is* handling UTF-8 (but not 16-bit Unicode).

However, I have changed MUSHclient v3.48 to detect these "indicator bytes", convert the file from 16-bit Unicode to UTF-8, and then process it.
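
The general idea of detecting the indicator bytes and converting to UTF-8 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the actual MUSHclient implementation; the function name `to_utf8` is hypothetical, and input without a recognised marker is passed through unchanged.

```python
import codecs


def to_utf8(raw: bytes) -> bytes:
    """Strip a leading byte order mark and return the content as UTF-8."""
    if raw.startswith(codecs.BOM_UTF16_LE):      # FF FE
        text = raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    elif raw.startswith(codecs.BOM_UTF16_BE):    # FE FF
        text = raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    elif raw.startswith(codecs.BOM_UTF8):        # EF BB BF
        text = raw[len(codecs.BOM_UTF8):].decode("utf-8")
    else:
        # No marker found: assume the bytes are already in the expected encoding.
        return raw
    return text.encode("utf-8")
```

After conversion the data begins directly with the '<' of the XML declaration, which is what the parser error on line 1 was asking for.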

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #3 on Mon 26 Apr 2004 02:23 AM (UTC)
Message
Version 3.48 should recognise those files correctly now.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.