Gammon Software Solutions forum

4.79 crash on unicode-flagged WindowText/WindowTextWidth

Posted by Fiendish   USA  (848 posts)   Global Moderator
Date Fri 09 Dec 2011 08:57 PM (UTC)

Amended on Fri 09 Dec 2011 08:58 PM (UTC) by Fiendish

Message
Both of the following Lua lines crash MUSHclient 4.79. If I remove the "true" from the end there's no crash.
WindowTextWidth(win, font, "´", true)
WindowText(win, font, "´", 0, 0, 0, 0, 0x000000, true)

The symbol above is the acute accent.

http://aardwolfclientpackage.googlecode.com/

Posted by Nick Gammon   Australia  (18,769 posts)   Forum Administrator
Date Reply #1 on Sat 10 Dec 2011 03:28 AM (UTC)
Message
The string you mention is, of course, bad UTF8*. The crash is fixed in version 4.80.

* It is the single byte 0xB4, which is a UTF8 continuation byte and so is not valid on its own inside a UTF8 string. It should be converted to its UTF8 equivalent, which in this case is:


0xC2 0xB4


As a work-around, the server should not be sending such a string to UTF8-enabled clients, and if it is generated by a plugin, then it should also not be using such a string.
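The conversion described above (0xB4 becoming 0xC2 0xB4) can be sketched in plain Lua. This is only an illustration, and it assumes the incoming bytes are Latin-1 (ISO-8859-1), which the thread does not confirm; the function name latin1_to_utf8 is made up for this example and is not part of MUSHclient.

```lua
-- Hypothetical helper, not a MUSHclient function.
-- Assumes the input is Latin-1: every byte 0x80-0xFF is one character
-- whose code point equals the byte value.
function latin1_to_utf8(s)
  -- The extra parentheses drop gsub's second return value (the match count).
  return (s:gsub("[\128-\255]", function(c)
    local b = c:byte()
    -- Code points 0x80-0xFF need two bytes in UTF-8: 110000xx 10xxxxxx
    return string.char(0xC0 + math.floor(b / 64), 0x80 + b % 64)
  end))
end

-- The acute accent from the first post: the single byte 0xB4.
local acute = string.char(0xB4)
local converted = latin1_to_utf8(acute)
print(converted:byte(1, -1))
```

Bytes below 0x80 are left alone, so plain ASCII text passes through unchanged. Running this over a string that is already UTF8 would double-encode it, so it should only be applied to data known to be Latin-1.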

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Fiendish   USA  (848 posts)   Global Moderator
Date Reply #2 on Sat 10 Dec 2011 05:23 AM (UTC)

Amended on Sat 10 Dec 2011 05:30 AM (UTC) by Fiendish

Message
Quote:
The string you mention is, of course, bad UTF8
Of course, but the functions should be returning -3 for bad UTF8, as mentioned in the docs. This affects mapper.lua, by the way. I wonder whether all those UTF8 "true" arguments should be removed.

http://aardwolfclientpackage.googlecode.com/

Posted by Nick Gammon   Australia  (18,769 posts)   Forum Administrator
Date Reply #3 on Sat 10 Dec 2011 09:24 AM (UTC)
Message
There was a bug. It's been fixed.

It was actually caused indirectly because the author of the PCRE library changed the way that the underlying function worked from returning "good" or "bad" to a "how bad" code and an offset. Somehow I didn't implement that correctly when the new library came out.

The UTF8 "true" arguments are needed for the case where you actually supply UTF8 strings.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Fiendish   USA  (848 posts)   Global Moderator
Date Reply #4 on Sat 10 Dec 2011 03:42 PM (UTC)
Message
Quote:
The UTF8 "true" arguments are needed for the case where you actually supply UTF8 strings.

Huh. Then I don't understand when the -3 error return code described in the doc for WindowTextWidth would happen.

http://aardwolfclientpackage.googlecode.com/

Posted by Nick Gammon   Australia  (18,769 posts)   Forum Administrator
Date Reply #5 on Sat 10 Dec 2011 08:37 PM (UTC)
Message
In the fixed version, your problem line will return -3.

That is, you claim to be supplying UTF8, but you are not.
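A plugin that cannot be sure its strings are UTF8 could check for that -3 and retry without the Unicode flag. The following is a rough sketch only: the helper name text_width_with_fallback is invented here, and the measuring function is passed in as a parameter so the snippet can be tried outside MUSHclient. Inside a plugin you would pass WindowTextWidth itself. Retrying with the flag off is an assumption about what the plugin wants, not something the docs prescribe.

```lua
-- Hypothetical wrapper, not part of MUSHclient.
-- measure is expected to behave like WindowTextWidth(win, font, text, unicode):
-- it returns a width, or -3 when unicode is true and text is not valid UTF-8.
function text_width_with_fallback(win, font, text, measure)
  local width = measure(win, font, text, true)
  if width == -3 then
    -- The string was not valid UTF-8, so treat it as a plain byte string.
    width = measure(win, font, text, false)
  end
  return width
end

-- Stand-in for WindowTextWidth so the sketch runs on its own:
-- it rejects every string when the Unicode flag is set.
local function fake_measure(win, font, text, unicode)
  if unicode then return -3 end
  return #text * 8
end

print(text_width_with_fallback("win", "font", "abc", fake_measure))
```

In a plugin the call would look like text_width_with_fallback(win, font, room_name, WindowTextWidth), where room_name is whatever string is about to be measured.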

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Fiendish   USA  (848 posts)   Global Moderator
Date Reply #6 on Sat 10 Dec 2011 09:03 PM (UTC)

Amended on Sat 10 Dec 2011 09:50 PM (UTC) by Fiendish

Message
Quote:
In the fixed version, your problem line will return -3.
Right, good. This means, though, that the room name overlays in mapper.lua do not display correctly unless all those "true"s are removed, if the room name is, for example, "Academy´s Entrance" or "-=]·Sn00bie HQ·[=-". These are actual rooms in Aardwolf, with an acute accent and middle dots respectively, presumably for style reasons. Is Aardwolf sending the wrong thing over GMCP?

http://aardwolfclientpackage.googlecode.com/

Posted by Nick Gammon   Australia  (18,769 posts)   Forum Administrator
Date Reply #7 on Sat 10 Dec 2011 11:11 PM (UTC)
Message
Well we had a lengthy discussion about Unicode during the development of GMCP. Since I believe the data is sent in JSON format, consider:

http://www.json.org/fatfree.html

Quote:

The character encoding of JSON text is always Unicode. UTF-8 is the only encoding that makes sense on the wire ...


I don't recall what the end specification for GMCP was, but I do recall that a very lengthy and somewhat, er, robust debate occurred on the point of whether or not everything was UTF8.

I would suggest that, based on the above, and since you have Unicode set to "true" in the plugin, you should be assuming that the strings are UTF8. Therefore the server should be encoding any string data as UTF8. In particular, anything above 0x7F would need to be properly encoded, like the acute accent you showed.
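To see whether a given string from the server meets that requirement, a plugin could run a structural check along these lines. This is a simplified sketch in plain Lua with an invented function name, is_valid_utf8. It only checks lead bytes and continuation bytes, so it does not catch every invalid sequence (for example surrogates or some overlong three-byte forms), and it is not the check MUSHclient itself performs.

```lua
-- Hypothetical helper, not part of MUSHclient.
-- Returns true if every byte above 0x7F belongs to a well-formed
-- multi-byte sequence (lead byte followed by the right number of
-- continuation bytes in the range 0x80-0xBF).
function is_valid_utf8(s)
  local i, n = 1, #s
  while i <= n do
    local b = s:byte(i)
    local len
    if b < 0x80 then
      len = 1                      -- plain ASCII
    elseif b >= 0xC2 and b <= 0xDF then
      len = 2                      -- two-byte sequence
    elseif b >= 0xE0 and b <= 0xEF then
      len = 3                      -- three-byte sequence
    elseif b >= 0xF0 and b <= 0xF4 then
      len = 4                      -- four-byte sequence
    else
      return false                 -- stray continuation byte or invalid lead
    end
    for j = i + 1, i + len - 1 do
      local c = s:byte(j)
      if not c or c < 0x80 or c > 0xBF then
        return false               -- missing or malformed continuation byte
      end
    end
    i = i + len
  end
  return true
end

print(is_valid_utf8(string.char(0xB4)))        -- lone 0xB4, as in the first post
print(is_valid_utf8(string.char(0xC2, 0xB4)))  -- the properly encoded form
```

A lone 0xB4 fails because it falls in the continuation-byte range with no lead byte before it, while 0xC2 0xB4 passes as a complete two-byte sequence.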

I seem to recall that Zugg was very emphatic that the JSON would be Unicode, so one presumes and hopes that CMud would correctly handle the UTF8-encoded data, if the server was changed to send it. However it is interesting that, I guess, you haven't had any complaints from CMud users so far about this particular problem.

See this page for confirmation that the GMCP (ATCP2) data should be UTF8-encoded:

http://www.mudstandards.org/forum/viewtopic.php?f=7&t=107

(Keep reading all 9 pages, Unicode gets a mention from time to time).


- Nick Gammon

www.gammon.com.au, www.mushclient.com