Gammon Software Solutions forum

Chinese trigger not loaded

Posted by Flow (5 posts)
Date Wed 13 Jun 2012 04:10 AM (UTC)
Message
Hi everyone,

I am new to MUSHclient and have been investigating an issue with Chinese triggers that use regular expressions.

The MUD uses Big5, so I cannot check the UTF-8 box.
When a trigger contains certain Chinese characters (e.g. "架", "跋", "崙"), it is not loaded when I open the world.

It gives an error message saying "Failed: missing terminating ] for character class".

I ran those characters through an encoder, and all of them contain "%5B", which decodes to "[".

I think that's the cause of the problem.

Is there any workaround to make the regex treat the whole sentence as one string but check it byte by byte?

Please help.

Thanks
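
To illustrate the failure described in this post, here is a minimal sketch in Python rather than in MUSHclient itself. It assumes, going by the "%5B" observation above, that the Big5 bytes for "架" are 0xAC 0x5B; the helper name try_compile is made up for the example. Python's re module stands in for PCRE here, so the exact error wording differs, but the cause should be the same: the trail byte is read as the start of a character class.

```python
import re

# Assumption: Big5 bytes for "架" are 0xAC 0x5B (per the "%5B" finding).
# 0x5B is also the ASCII code for "[".
pattern = b"\xac\x5b"


def try_compile(raw: bytes) -> str:
    """Compile raw bytes as a regex and report whether it succeeded."""
    try:
        re.compile(raw)
        return "ok"
    except re.error as exc:
        return f"failed: {exc}"


print(pattern[-1:] == b"[")   # True: the trail byte is a literal "["
print(try_compile(pattern))   # reports a failure (unclosed character class)
print(try_compile(b"abc"))    # ok
```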

Posted by Nick Gammon, Australia (18,770 posts), Forum Administrator
Date Reply #1 on Wed 13 Jun 2012 06:15 AM (UTC)
Message
What is Big5?

Quote:

I cannot check the utf-8 box.


Why not?

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Flow (5 posts)
Date Reply #2 on Wed 13 Jun 2012 06:43 AM (UTC)
Message
Hi Nick,

Big5 is an encoding for Traditional Chinese.
Because the MUD uses Big5, the text is corrupted if I check the UTF-8 box.

The second byte of the Chinese character is interpreted as "[", which makes the trigger fail to load.

Thanks.


Posted by Nick Gammon, Australia (18,770 posts), Forum Administrator
Date Reply #3 on Wed 13 Jun 2012 07:47 AM (UTC)
Message
I see. Well, I suggest making a plugin that converts incoming packets from Big5 to UTF-8; then you can check the UTF-8 box and the trigger should work.


Posted by Nick Gammon, Australia (18,770 posts), Forum Administrator
Date Reply #4 on Wed 13 Jun 2012 07:48 AM (UTC)
Message
I don't know enough about Big5 to be much more specific, but check out this:

http://www.gammon.com.au/scripts/doc.php?general=plugin_callbacks

In particular:


OnPluginPacketReceived

You should be able to do a simple Lua global replace that converts Big5 to UTF-8 using a lookup table.
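
As a rough illustration of the packet-conversion idea, here is a sketch in Python rather than as a Lua plugin callback. The class name Big5ToUtf8 and its feed method are made up for this example. It leans on Python's built-in "big5" codec instead of a hand-written table, and it assumes the packets contain only text: raw packets may also carry non-text bytes, which this sketch does not try to handle. The one point it is meant to show is that a packet can end in the middle of a two-byte character, so the converter has to remember the pending lead byte.

```python
import codecs


class Big5ToUtf8:
    """Re-encode a stream of Big5 byte packets as UTF-8 (hypothetical helper)."""

    def __init__(self) -> None:
        # The incremental decoder holds back a lead byte that arrives at the
        # end of one packet until its trail byte arrives in the next packet.
        self._decoder = codecs.getincrementaldecoder("big5")(errors="replace")

    def feed(self, packet: bytes) -> bytes:
        """Convert one incoming packet; may return fewer characters than sent."""
        return self._decoder.decode(packet).encode("utf-8")


converter = Big5ToUtf8()
raw = "你好".encode("big5")
# Split the data in the middle of the first character, as a network might.
out = converter.feed(raw[:1]) + converter.feed(raw[1:])
print(out.decode("utf-8"))
```

The decoder's internal buffering is what makes the split packet come out intact; converting each packet independently would corrupt the character that straddles the boundary.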


Posted by Flow (5 posts)
Date Reply #5 on Wed 13 Jun 2012 07:52 AM (UTC)
Message
Nick,

Thank you very much.

I will try that out first.

By the way, is there any way to make PCRE work better with Chinese?

Posted by Nick Gammon, Australia (18,770 posts), Forum Administrator
Date Reply #6 on Wed 13 Jun 2012 07:53 AM (UTC)
Message
Sort of an example here:

http://www.gammon.com.au/forum/bbshowpost.php?bbsubject_id=8747

You basically want to match on "." (anything), then look up each character in a table and convert it to UTF-8.
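
A sketch of the match-and-look-up approach, again in Python rather than Lua. Since a Big5 character spans two bytes, this version matches a lead byte plus a trail byte instead of a single ".". The three-entry TABLE is hypothetical and is built from Python's codec only to keep the example short; a real table would need an entry for every character the MUD can send. The names TABLE, BIG5_CHAR and big5_to_utf8 are made up for the example.

```python
import re

# Hypothetical, deliberately tiny lookup table: Big5 bytes -> UTF-8 bytes.
TABLE = {ch.encode("big5"): ch.encode("utf-8") for ch in "架跋崙"}

# A Big5 character is a lead byte (0x81-0xFE) followed by a trail byte
# (0x40-0x7E or 0xA1-0xFE). Matching the pair as a unit keeps a trail byte
# such as 0x5B from being treated as a standalone ASCII "[".
BIG5_CHAR = re.compile(rb"[\x81-\xfe][\x40-\x7e\xa1-\xfe]")


def big5_to_utf8(data: bytes) -> bytes:
    """Replace each two-byte Big5 character using TABLE; ASCII passes through."""
    # Pairs missing from the table are replaced with "?" in this sketch.
    return BIG5_CHAR.sub(lambda m: TABLE.get(m.group(0), b"?"), data)


line = b"[" + "架".encode("big5") + b"]"
print(big5_to_utf8(line).decode("utf-8"))
```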


Posted by Nick Gammon, Australia (18,770 posts), Forum Administrator
Date Reply #7 on Wed 13 Jun 2012 07:55 AM (UTC)
Message
Flow said:

By the way, is there any way to make PCRE work better with Chinese?


Turn UTF-8 on; otherwise PCRE can't know that the bytes don't have their usual meanings.

Although for triggers it *might* just work to put a backslash before it.

For example, instead of matching on 架 match on \架


Posted by Flow (5 posts)
Date Reply #8 on Wed 13 Jun 2012 08:20 AM (UTC)
Message
Nick Gammon said:

For example, instead of matching on 架 match on \架


This does not work.
Chinese characters take 2 bytes in Big5, and the problem is that the last byte is a special character.
There is no way to insert a \ between those 2 bytes.

I am looking for a way to group the characters and then ignore all special characters inside the group.

There seems to be no such method.

Posted by Flow (5 posts)
Date Reply #9 on Wed 13 Jun 2012 09:45 AM (UTC)
Message
I finally found a way to solve this.
Quote the word with \Q...\E.
This treats the enclosed characters as literals and ignores all regex syntax.

Thanks all...
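
For readers who want to see the quoting idea in action, here is a small Python sketch. Python's re module has no \Q...\E, so re.escape is used as the nearest analogue: it backslash-escapes any special characters so the bytes are matched literally. As before, the bytes 0xAC 0x5B for "架" are an assumption based on the "%5B" observation earlier in the thread.

```python
import re

# Assumption: Big5 bytes for "架" are 0xAC 0x5B; the trail byte is "[".
word = b"\xac\x5b"

# re.escape puts a backslash in front of the "[" so it is matched literally,
# which is roughly what wrapping the word in \Q...\E does in PCRE.
quoted = re.escape(word)

line = b"prefix " + word + b" suffix"
match = re.search(quoted, line)
print(match is not None)  # True: the pattern compiles and matches
```

In a MUSHclient trigger the equivalent would be to wrap the Chinese text in \Q and \E, as described in the post above, rather than to call an escaping function.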


The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
