Retranslation of ASCII Char to German Umlaut for Screen reader
It is now over 60 days since the last post. This thread is closed.

Posted by: Luca Surion (5 posts)
Date: Fri 26 Oct 2012 08:27 PM (UTC)
Message:
I found lots of posts in the forum about translating German umlauts to standard ASCII before sending text to a MUD world, but nothing about properly translating incoming ASCII from a world back again.
While ASCII poses no problem for readers, it is rather disturbing when mudding with a screen reader. The pronunciation sometimes sounds so weird that it is almost impossible to understand. I tried various kinds of triggers but haven't succeeded yet. This is what I would need:
ae -> ä
oe -> ö
ue -> ü
And of course the same for capital letters. Does anybody have an idea what I could do, and how? As I am not familiar with Lua and scripting, a step-by-step hint for dummies would help me best.
Thanks in advance!

Posted by: Nick Gammon, Australia (23,133 posts), Forum Administrator
Date: Reply #1 on Fri 26 Oct 2012 08:48 PM (UTC)
Message:
This should do it. I don't know how to type the capitals in so I'll let you do that, it should be obvious how.
To save and install the Fix_Umlautes plugin do this:
1. Copy between the lines below (to the Clipboard)
2. Open a text editor (such as Notepad) and paste the plugin into it
3. Save to disk on your PC, preferably in your plugins directory, as Fix_Umlautes.xml
4. Go to the MUSHclient File menu -> Plugins
5. Click "Add"
6. Choose the file Fix_Umlautes.xml (which you just saved in step 3) as a plugin
7. Click "Close"
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>
<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   requires="4.00"
   version="1.0"
   >
</plugin>

<!-- Script -->

<script>

conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",
  -- add more here
  }

function OnPluginPacketReceived (s)
  return (string.gsub (s, "[aAoOuUs][es]", conversions))
end -- function OnPluginPacketReceived

</script>
</muclient>
It seemed to work when I tested it.
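The capital-letter entries left to the reader could look roughly like the sketch below. It is only an illustration: fix_umlauts is a hypothetical stand-in for the plugin's OnPluginPacketReceived callback so that the snippet runs outside MUSHclient, and the upper-case keys are an assumed extension of the table. Inside the plugin, the characters would have to be typed in the encoding the XML header declares (iso-8859-1).

```lua
-- Sketch only: the plugin's table extended with capitalised pairs.
-- The upper-case keys are an assumption, not part of the posted plugin.
conversions = {
  ae = "ä", oe = "ö", ue = "ü",
  Ae = "Ä", Oe = "Ö", Ue = "Ü",
  }

-- Hypothetical stand-in for OnPluginPacketReceived.
-- Pairs that match the pattern but are missing from the table
-- (for example "us" or "ss") are left unchanged by string.gsub.
function fix_umlauts (s)
  return (string.gsub (s, "[aAoOuUs][es]", conversions))
end

print (fix_umlauts ("Aerger ueber Oel"))  --> Ärger über Öl
```

The existing pattern already matches "Ae", "Oe" and "Ue", so only the table needs new entries. Fully upper-case pairs such as "AE" would need the second character class widened as well.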
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Luca Surion (5 posts)
Date: Reply #2 on Sat 27 Oct 2012 12:53 AM (UTC)
Message:
Hi Nick!
Thank you very much for your instant and very easy-to-follow help. I just tested it with the screen reader JAWS and it works great. Now every word is clearly understandable. What a difference! Blind German mudders will love you for writing this script.
Best regards,
Luca
(from the German MUD UNItopia)

Posted by: Luca Surion (5 posts)
Date: Reply #3 on Sun 28 Oct 2012 06:35 AM (UTC)
Message:
Hi Nick,
may I ask you again for your help? Translating back to German umlauts is more difficult than I was aware of. The rules in the plugin are fine but need to be supplemented with some rules for standard exceptions and an option to define a list of words which should be skipped.
Standard exceptions are instances of "ue" where the "u" is tied to the letter preceding it:
aue (au = diphthong)
eue (eu = diphthong)
que (q is always followed by u)
aero (words of Greek origin)
Some words, e.g. personal names like Raffael (the painter), should be skipped by the procedure. It would be great if the lists of standard and full-word exceptions could be updated easily for exceptions not thought of before. Would that be possible?
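To see why these exceptions matter, here is a small check of the plain two-letter rule from version 1 against a few such words. The function name naive_fix and the word list are made up for illustration; the table and pattern are the ones from the plugin.

```lua
-- The version 1 rule: convert every "ae", "oe" or "ue" that is found.
conversions = { ae = "ä", oe = "ö", ue = "ü" }

-- Hypothetical helper name; same gsub call as the version 1 plugin.
function naive_fix (s)
  return (string.gsub (s, "[aAoOuUs][es]", conversions))
end

-- Words where the simple rule goes wrong:
--   neuer    -> neür
--   Mauer    -> Maür
--   quelle   -> qülle
--   aerodrom -> ärodrom
for _, word in ipairs { "neuer", "Mauer", "quelle", "aerodrom" } do
  print (word, naive_fix (word))
end
```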
Best regards,
Luca

Posted by: Nick Gammon, Australia (23,133 posts), Forum Administrator
Date: Reply #4 on Sun 28 Oct 2012 09:41 AM (UTC)
Message:
Anything is possible if you can codify the rules.
It gets more complex. :)
Try this:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>
<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   date_modified="2012-10-28"
   requires="4.00"
   version="2.0"
   >
</plugin>

<!-- Script -->

<script>

-- basic 2-letter conversions
conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",
  -- add more here
  } -- end of conversions table

-- exceptions words (put in *lower* case)
exceptions = {
  raffael = true,
  aero = true,
  -- add more here
  } -- end of exceptions table

-- dipthongs (3-letter sequences that will be ignored)
dipthongs = {
  aue = true,
  eue = true,
  -- add more here
  } -- end of dipthongs table

-- fix umlautes within a word (this function gets 2 or 3 letter sequences)
function fixer (str)

  -- if just two letters, look it up
  if #str == 2 then
    return (string.gsub (str, "[aAoOuUs][es]", conversions))
  end -- if

  -- look for stuff like aue (dipthongs)
  if dipthongs [str] then
    return
  end -- if a dipthong

  -- ignore uXX etc.
  if str:match ("^u") then
    return
  end

  -- ignore quX
  if str:match ("^qu") then
    return
  end

  -- look up in table
  if conversions [str:sub (2)] then
    return str:sub (1, 1) .. conversions [str:sub (2)]
  end -- if

end -- fixer

-- here for whole words
function word_fixer (str)

  -- handle words like "aero"
  if exceptions [str:lower ()] then
    return
  end -- if

  return (string.gsub (str, ".?[aAoOuUs][es]", fixer))
end -- word_fixer

-- handle incoming packet
function OnPluginPacketReceived (s)
  return (string.gsub (s, "%f[%a]%a*%f[%A]", word_fixer))
end -- function OnPluginPacketReceived

</script>
</muclient>
The above first looks for whole words, and looks those words up in an exceptions table (that's where you put stuff like "aero").
Then, per word, it looks for the 2-letter sequences, except it carries down the preceding letter. This lets you get rid of leading "q" or "u.." sequences.
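The two patterns described above can be tried on their own. The sketch below uses a hypothetical helper, collect, that lists what each pattern picks out; it is not part of the plugin. It assumes a Lua version that supports the %f frontier pattern (5.1, where it is undocumented, or later).

```lua
-- Hypothetical helper: gather everything a pattern matches, joined by "|".
function collect (s, pattern)
  local found = {}
  for piece in string.gmatch (s, pattern) do
    found [#found + 1] = piece
  end
  return table.concat (found, "|")
end

-- Whole words, as handed to word_fixer:
print (collect ("viel Gruen und einer Quelle.", "%f[%a]%a*%f[%A]"))
--> viel|Gruen|und|einer|Quelle

-- The 2-letter pair plus the preceding letter, as handed to fixer:
print (collect ("tagsueber", ".?[aAoOuUs][es]"))  --> sue
print (collect ("neuer", ".?[aAoOuUs][es]"))      --> eue
print (collect ("aelteren", ".?[aAoOuUs][es]"))   --> ae
```

"sue" gets converted to "sü", "eue" is found in the dipthongs table and left alone, and the two-letter "ae" at the start of a word is looked up directly.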
It mightn't be perfect, but it will be closer. Bear in mind I don't speak German.
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Luca Surion (5 posts)
Date: Reply #5 on Sun 28 Oct 2012 09:15 PM (UTC)
Message:
Wow. It is much closer now. After adding these lines of code (in the function word_fixer) to capture instances where exceptions are substrings within words:
  -- ignore AeroX
  if (str:lower ()):match ("aero") then
    return
  end
  -- ignore RaffaelX
  if str:match ("^Raffael") then
    return
  end
it did really well. I just tried it with a test text containing (hopefully) all the kinds of cases to be considered.
This was the test text stored in the MUD:
Vor den Stadtmauern ist ein neuer Park mit viel Gruen und einer eigenen Quelle. Hier treffen sich vor allem die aelteren Leute oefter tagsueber zum Kloenen. Ganz in der Naehe liegt das Aerodrom von Raffaels Ballondienst.
This was the output which was exactly as it should be:
Vor den Stadtmauern ist ein neuer Park mit viel Grün und
einer eigenen Quelle. Hier treffen sich vor allem die
älteren Leute öfter tagsüber zum Klönen. Ganz in der Nähe
liegt das Aerodrom von Raffaels Ballondienst.
It works, but it would be more convenient for future updates if I could make use of the exceptions table defined above. I tried to do so in quite a number of ways but haven't succeeded yet.
I am sure there is a much easier and more elegant solution. May I ask once more for your help?
(I am only a native German but you are the native Lua! :)
Thank you very much in advance!
Best regards,
Luca

Posted by: Nick Gammon, Australia (23,133 posts), Forum Administrator
Date: Reply #6 on Mon 29 Oct 2012 03:13 AM (UTC)
Message:
OK, version 3:
<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>
<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   date_modified="2012-10-28"
   requires="4.00"
   version="3.0"
   >
</plugin>

<!-- Script -->

<script>

-- basic 2-letter conversions
conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",
  -- add more here
  } -- end of conversions table

-- Exceptions strings (words or part words)
exceptions = {
  "Raffael",
  "aero",
  -- add more here
  } -- end of exceptions table

-- dipthongs (3-letter sequences that will be ignored)
dipthongs = {
  aue = true,
  eue = true,
  -- add more here
  } -- end of dipthongs table

-- fix umlautes within a word (this function gets 2 or 3 letter sequences)
function fixer (str)

  -- if just two letters, look it up
  if #str == 2 then
    return (string.gsub (str, "[aAoOuUsS][eEsS]", conversions))
  end -- if

  -- look for stuff like aue (dipthongs)
  if dipthongs [str] then
    return
  end -- if a dipthong

  -- ignore uXX etc.
  if str:match ("^[Uu]") then
    return
  end

  -- ignore quX
  if str:match ("^[Qq][Uu]") then
    return
  end

  -- look up in table
  if conversions [str:sub (2)] then
    return str:sub (1, 1) .. conversions [str:sub (2)]
  end -- if

end -- fixer

-- here for whole words
function word_fixer (str)
  local lower_case_word = str:lower ()

  for k, v in ipairs (exceptions) do
    if string.match (lower_case_word, v) then
      return
    end -- end of exception found somewhere in the word
  end -- for each exception

  return (string.gsub (str, ".?[aAoOuUs][es]", fixer))
end -- word_fixer

-- handle incoming packet
function OnPluginPacketReceived (s)
  return (string.gsub (s, "%f[%a]%a*%f[%A]", word_fixer))
end -- function OnPluginPacketReceived

-- run when plugin installed (loaded at runtime)
function OnPluginInstall ()
  -- make sure all exceptions are lower-case
  for k, v in ipairs (exceptions) do
    exceptions [k] = v:lower ()
  end -- for each exception
end -- OnPluginInstall

</script>
</muclient>
I fixed the lower-case issue and the partial-match issue, and a couple of other things. Your test message now comes out as:
Vor den Stadtmauern ist ein neuer Park mit viel Grün und einer eigenen Quelle. Hier treffen sich vor allem die älteren Leute öfter tagsüber zum Klönen. Ganz in der Nähe liegt das Aerodrom von Raffaels Ballondienst.
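One possible refinement of the partial-match check, offered only as a sketch: string.match treats each exception as a Lua pattern, so an entry containing characters such as "-" or "." would not be matched literally. Passing true as the fourth argument of string.find asks for a plain substring search instead. The helper name is_exception is hypothetical and does not appear in the plugin.

```lua
-- Same style of list as the plugin's exceptions table.
exceptions = { "Raffael", "aero" }

-- Hypothetical helper: true if any exception occurs anywhere in the word.
function is_exception (word)
  local lower_case_word = word:lower ()
  for _, v in ipairs (exceptions) do
    -- init = 1, plain = true: literal substring search, no pattern magic
    if string.find (lower_case_word, v:lower (), 1, true) then
      return true
    end
  end
  return false
end

print (is_exception ("Aerodrom"))  --> true
print (is_exception ("Raffaels"))  --> true
print (is_exception ("Gruen"))     --> false
```

Lower-casing each entry inside the loop trades a little speed for not needing the OnPluginInstall step; either approach should behave the same for plain ASCII entries.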
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: Luca Surion (5 posts)
Date: Reply #7 on Tue 30 Oct 2012 08:36 PM (UTC)
Message:
Thank you very, very much. The plugin works great and it is really simple to update with more exceptions.
I just started testing and looking for exceptions to add. If you like, I will post the exceptions list here when it's more complete. Perhaps we will detect some regularities among them that allow more rules to be codified. We will see. :)
Best regards,
Luca

Posted by: Nick Gammon, Australia (23,133 posts), Forum Administrator
Date: Reply #8 on Wed 31 Oct 2012 12:12 AM (UTC)
Message:
Yes, thanks, do that please.
- Nick Gammon
www.gammon.com.au, www.mushclient.com

Posted by: ForgottenMUD, Belgium (22 posts)
Date: Reply #9 on Sun 09 Dec 2012 12:30 AM (UTC), amended on Sun 09 Dec 2012 12:36 AM (UTC) by ForgottenMUD
Message:
If you add the exceptions one by one manually, I think it could take years and you will miss many of them.
You can search for a German dictionary/word list; there are plenty on Google. (I couldn't find one for exceptions; people seem to only do the reverse conversion.)
This one here appears to have over a million entries: http://sourceforge.net/projects/germandict/files/
You can run a script that searches through the dictionary for exceptions, i.e. words that contain "ue" (excluding "que", etc.), to obtain a complete exception list.
If you are unable to do so, I will help.
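A rough sketch of such a dictionary scan is shown below. It assumes the word list spells genuine umlauts as real characters, so any word that still contains a literal "ae", "oe" or "ue" is a candidate exception. The function name find_exception_candidates is made up, and blanking out "aue", "eue" and "que" only approximates what the plugin already skips.

```lua
-- Sketch: return the words containing a literal "ae", "oe" or "ue"
-- that is not part of "aue", "eue" or "que".
function find_exception_candidates (words)
  local candidates = {}
  for _, word in ipairs (words) do
    local lower = word:lower ()
    -- blank out the sequences the plugin's rules already leave alone
    local rest = lower:gsub ("[ae]ue", "---"):gsub ("que", "---")
    if rest:find ("[aou]e") then
      candidates [#candidates + 1] = word
    end
  end
  return candidates
end

-- Small in-memory sample; a real run would fill the table from a file,
-- for example with io.lines and a file name of your choosing.
local sample = { "neuer", "Quelle", "aktuell", "Poet", "Haus" }
for _, word in ipairs (find_exception_candidates (sample)) do
  print (word)
end
--> aktuell
--> Poet
```

The resulting words could then be reviewed by hand and pasted into the exceptions table.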
ForgottenMUD - original Java MUD
http://sites.google.com/site/forgottenmud