
Gammon Forum


Retranslation of ASCII Char to German Umlaut for Screen reader



Posted by Luca Surion   (5 posts)  Bio
Date Fri 26 Oct 2012 08:27 PM (UTC)
Message
I found lots of posts in the forum about translating German umlauts to standard ASCII before sending text to a MUD world, but nothing about properly translating incoming ASCII from a world back again.

While ASCII poses no problem for sighted readers, it is rather disturbing when mudding with a screen reader. The pronunciation sometimes sounds so weird that it is almost impossible to understand. I have tried various kinds of triggers but haven't succeeded yet. This is what I need:

ae -> ä
oe -> ö
ue -> ü

And of course the same for capital letters. Does anybody have an idea what I could do, and how? As I am not familiar with Lua and scripting, a step-by-step hint for dummies would help me best.
Thanks in advance!

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #1 on Fri 26 Oct 2012 08:48 PM (UTC)
Message
This should do it. I don't know how to type the capitals in, so I'll let you do that; it should be obvious how.

To save and install the Fix_Umlautes plugin do this:
  1. Copy between the lines below (to the Clipboard)
  2. Open a text editor (such as Notepad) and paste the plugin into it
  3. Save to disk on your PC, preferably in your plugins directory, as Fix_Umlautes.xml
  4. Go to the MUSHclient File menu -> Plugins
  5. Click "Add"
  6. Choose the file Fix_Umlautes.xml (which you just saved in step 3) as a plugin
  7. Click "Close"



<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>

<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   requires="4.00"
   version="1.0"
   >

</plugin>

<!--  Script  -->

<script>

conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",

  -- add more here
  }
  
function OnPluginPacketReceived (s)
  return (string.gsub (s, "[aAoOuUs][es]", conversions))
end -- function OnPluginPacketReceived
</script>

</muclient>


It seemed to work when I tested it.
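
A minimal sketch of how the capitals could be added without having to type them. It assumes the text is displayed as ISO-8859-1 (which is what the plugin's XML header declares), so each umlaut is a single byte that can be written as a decimal escape. The helper name `convert` is only for illustration; in the plugin the same `string.gsub` call sits inside OnPluginPacketReceived.

```lua
-- Sketch only: the version 1 table extended with capitals.
-- Assumption: ISO-8859-1 output, so each umlaut is one byte.
conversions = {
  ae = "\228",  -- a-umlaut
  oe = "\246",  -- o-umlaut
  ue = "\252",  -- u-umlaut
  Ae = "\196",  -- capital A-umlaut
  Oe = "\214",  -- capital O-umlaut
  Ue = "\220",  -- capital U-umlaut
}

-- Hypothetical helper: every two-letter match is looked up in the
-- table; pairs that are not in the table are left unchanged.
function convert (s)
  return (string.gsub (s, "[aAoOuUs][es]", conversions))
end

print (convert ("Ueber") == "\220ber")  --> true
```

All-capitals pairs such as "AE" are not in the table and so pass through untouched.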

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Luca Surion   (5 posts)  Bio
Date Reply #2 on Sat 27 Oct 2012 12:53 AM (UTC)
Message
Hi Nick!

Thank you very much for your instant and very easy-to-follow help. I just tested it with the JAWS screen reader and it works great. Now every word is clearly understandable. What a difference! Blind German mudders will love you for writing this script.

Best regards,

Luca

(from the German Mud UNItopia)

Posted by Luca Surion   (5 posts)  Bio
Date Reply #3 on Sun 28 Oct 2012 06:35 AM (UTC)
Message
Hi Nick,

May I ask you again for your help? Retranslation to German umlauts is more difficult than I was aware of. The rules in the plugin are fine but need to be supplemented with some rules for standard exceptions and an option to define a list of words which should be skipped.

Standard exceptions are sequences where the vowel pair is not an umlaut, for example instances of "ue" where the "u" is tied to the letter preceding it:

aue (au = diphthong)
eue (eu = diphthong)
que (q is always followed by u)
aero (words of Greek origin)

Some words, e.g. personal names like Raffael (the painter), should be skipped by the procedure. It would be great if the lists of standard and full-word exceptions could be updated easily for exceptions not thought of before. Would that be possible?

Best regards,

Luca

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #4 on Sun 28 Oct 2012 09:41 AM (UTC)
Message
Anything is possible if you can codify the rules.

It gets more complex. :)

Try this:


<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>

<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   date_modified="2012-10-28"
   requires="4.00"
   version="2.0"
   >

</plugin>

<!--  Script  -->

<script>

-- basic 2-letter conversions
conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",

  -- add more here
  }  -- end of conversions table
  
-- exceptions words (put in *lower* case)
exceptions = {

  raffael = true,
  aero = true,
  
  -- add more here
  
  } -- end of exceptions table

-- dipthongs (3-letter sequences that will be ignored)
dipthongs = {

  aue = true,
  eue = true,

  -- add more here

  } -- end of dipthongs table
  
-- fix umlautes within a word (this function gets 2 or 3 letter sequences)
function fixer (str)
  
  -- if just two letters, look it up 
  if #str == 2 then
    return (string.gsub (str, "[aAoOuUs][es]", conversions))
  end -- if
  
  -- look for stuff like aue (dipthongs), whatever the letter case
  if dipthongs [str:lower ()] then
    return
  end -- if a dipthong
  
  -- ignore uXX etc.
  if str:match ("^[Uu]") then
    return
  end
  
  -- ignore quX (upper or lower case, so that "Quelle" is left alone)
  if str:match ("^[Qq][Uu]") then
    return
  end
  
  -- look up in table
  if conversions [str:sub (2)] then
    return str:sub (1, 1) .. conversions [str:sub (2)]
  end -- if
  
end -- fixer

-- here for whole words
function word_fixer (str)
  
  -- handle words like "aero"
  if exceptions [str:lower ()] then
    return
  end -- if
 
  return (string.gsub (str, ".?[aAoOuUs][es]", fixer))

end -- word_fixer

-- handle incoming packet
function OnPluginPacketReceived (s)
  return (string.gsub (s, "%f[%a]%a*%f[%A]", word_fixer))
end -- function OnPluginPacketReceived
</script>

</muclient>


The above first looks for whole words, and looks those words up in an exceptions table (that's where you put stuff like "aero").

Then, per word, it looks for the 2-letter sequences, except it carries down the preceding letter. This lets you get rid of leading "q" or "u.." sequences.

It mightn't be perfect, but it will be closer. Bear in mind I don't speak German.
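
To make the two stages easier to follow, here is a small sketch that only lists what each pattern from the plugin matches, without converting anything. The helper `matches` is hypothetical and not part of the plugin; the patterns are copied from the script above.

```lua
-- Hypothetical helper: collect everything a pattern matches in s.
function matches (s, pattern)
  local found = {}
  for m in string.gmatch (s, pattern) do
    found [#found + 1] = m
  end
  return table.concat (found, ",")
end

-- Stage 1: whole words (frontier pattern from OnPluginPacketReceived)
print (matches ("viel Gruen, eigene Quelle.", "%f[%a]%a*%f[%A]"))
--> viel,Gruen,eigene,Quelle

-- Stage 2: a two-letter candidate plus the letter in front of it, if any
print (matches ("tagsueber", ".?[aAoOuUs][es]"))  --> sue
print (matches ("neuer", ".?[aAoOuUs][es]"))      --> eue
print (matches ("Quelle", ".?[aAoOuUs][es]"))     --> Que
print (matches ("oefter", ".?[aAoOuUs][es]"))     --> oe
```

Each of those short strings is what the fixer function receives, which is why it can tell "eue" or "Que" (leave alone) apart from "sue" (convert the "ue").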

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Luca Surion   (5 posts)  Bio
Date Reply #5 on Sun 28 Oct 2012 09:15 PM (UTC)
Message
Wow. It is much closer now. I added these lines of code (in the function word_fixer) to capture instances where exceptions are substrings within words:

  -- ignore AeroX
  if (str:lower ()):match ("aero") then
    return
  end

  -- ignore RaffaelX
  if str:match ("^Raffael") then
    return
  end

With those additions it worked really well. I just tried it with a test text containing (hopefully) all kinds of cases to be considered:


This was the test text stored in the MUD:

Vor den Stadtmauern ist ein neuer Park mit viel Gruen und einer eigenen Quelle. Hier treffen sich vor allem die aelteren Leute oefter tagsueber zum Kloenen. Ganz in der Naehe liegt das Aerodrom von Raffaels Ballondienst.

This was the output which was exactly as it should be:

Vor den Stadtmauern ist ein neuer Park mit viel Grün und
einer eigenen Quelle. Hier treffen sich vor allem die
älteren Leute öfter tagsüber zum Klönen. Ganz in der Nähe
liegt das Aerodrom von Raffaels Ballondienst.


It works, but it would be more convenient for future updates if I could make use of the exceptions table defined above. I tried to do so in quite a number of ways but haven't succeeded yet.

But I am sure there will be a much easier and more elegant solution. May I ask once more for your help?
(I am only a native German but you are the native Lua! :)
Thank you very much in advance!

Best regards,

Luca

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #6 on Mon 29 Oct 2012 03:13 AM (UTC)
Message
OK, version 3:


<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE muclient>

<muclient>
<plugin
   name="Fix_Umlautes"
   author="Nick Gammon"
   id="786d90b60afd7d2ae26eaa5b"
   language="Lua"
   purpose="Converts German Umlautes back"
   date_written="2012-10-27"
   date_modified="2012-10-28"
   requires="4.00"
   version="3.0"
   >

</plugin>

<!--  Script  -->

<script>

-- basic 2-letter conversions
conversions = {
  ae = "ä",
  oe = "ö",
  ue = "ü",

  -- add more here
  }  -- end of conversions table
  
-- Exceptions strings (words or part words)
exceptions = {

  "Raffael",
  "aero",
  
  -- add more here
  
  } -- end of exceptions table

-- dipthongs (3-letter sequences that will be ignored)
dipthongs = {

  aue = true,
  eue = true,

  -- add more here

  } -- end of dipthongs table
  
-- fix umlautes within a word (this function gets 2 or 3 letter sequences)
function fixer (str)
  
  -- if just two letters, look it up 
  if #str == 2 then
    return (string.gsub (str, "[aAoOuUsS][eEsS]", conversions))
  end -- if
  
  -- look for stuff like aue (dipthongs), whatever the letter case,
  -- so that words such as "Euer" are not turned into "Eür"
  if dipthongs [str:lower ()] then
    return
  end -- if a dipthong
  
  -- ignore uXX etc.
  if str:match ("^[Uu]") then
    return
  end
  
  -- ignore quX
  if str:match ("^[Qq][Uu]") then
    return
  end
  
  -- look up in table
  if conversions [str:sub (2)] then
    return str:sub (1, 1) .. conversions [str:sub (2)]
  end -- if
  
end -- fixer

-- here for whole words
function word_fixer (str)
  
  local lower_case_word = str:lower ()
  for k, v in ipairs (exceptions) do
    if string.match (lower_case_word, v) then
      return
    end  -- end of exception found somewhere in the word
  end -- for each exception
   
  return (string.gsub (str, ".?[aAoOuUs][es]", fixer))

end -- word_fixer

-- handle incoming packet
function OnPluginPacketReceived (s)
  return (string.gsub (s, "%f[%a]%a*%f[%A]", word_fixer))
end -- function OnPluginPacketReceived

-- run when plugin installed (loaded at runtime)
function OnPluginInstall ()

  -- make sure all exceptions are lower-case
  for k, v in ipairs (exceptions) do
     exceptions [k] = v:lower ()
  end -- for each exception
  
end -- OnPluginInstall

</script>

</muclient>


I fixed the lower-case issue and the partial-match issue, and a couple of other things. Your test message now comes out as:


Vor den Stadtmauern ist ein neuer Park mit viel Grün und einer eigenen Quelle. Hier treffen sich vor allem die älteren Leute öfter tagsüber zum Klönen. Ganz in der Nähe liegt das Aerodrom von Raffaels Ballondienst.
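
The exception check can also be tried on its own. The sketch below is a variation rather than what the plugin does: it uses string.find with the "plain" flag instead of string.match, so an exception string is always treated as literal text and never as a Lua pattern. The function name `is_exception` is made up for this example.

```lua
-- Exception strings, already in lower case
-- (the plugin arranges that in OnPluginInstall).
exceptions = { "raffael", "aero" }

-- Hypothetical helper: true if any exception occurs anywhere in the word.
-- The fourth argument (true) requests a plain substring search.
function is_exception (word)
  local lower_case_word = word:lower ()
  for _, v in ipairs (exceptions) do
    if string.find (lower_case_word, v, 1, true) then
      return true
    end
  end
  return false
end

print (is_exception ("Aerodrom"))  --> true
print (is_exception ("Raffaels"))  --> true
print (is_exception ("aelteren"))  --> false
```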

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Luca Surion   (5 posts)  Bio
Date Reply #7 on Tue 30 Oct 2012 08:36 PM (UTC)
Message
Thank you very, very much. The plugin works great and it is really simple to update with more exceptions.

I have just started testing and looking for exceptions to add. If you like, I will post the exceptions list here when it is more complete. Perhaps we will detect some regularities among them that allow more rules to be codified. We will see. :)

Best regards,

Luca

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #8 on Wed 31 Oct 2012 12:12 AM (UTC)
Message
Yes, thanks, do that please.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by ForgottenMUD   Belgium  (22 posts)  Bio
Date Reply #9 on Sun 09 Dec 2012 12:30 AM (UTC)

Amended on Sun 09 Dec 2012 12:36 AM (UTC) by ForgottenMUD

Message
If you add the exceptions one by one manually, I think it could take years and you will miss many of them.

You can search for a German dictionary or word list; there are plenty on Google. (I couldn't find one for exceptions; people seem to do only the reverse conversion.)

This one here appears to have over a million entries: http://sourceforge.net/projects/germandict/files/

You can run a script that will search through the dictionary for exceptions i.e. search for words that contain "ue", excluding "que", etc. to obtain a complete exception list.

If you are unable to do so, I will help.
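
As a rough idea of what such a script could look like, here is a sketch in Lua. It rests on several assumptions: the word list has one word per line, real umlauts in it are spelt with umlaut characters (so any literal "ae", "oe" or "ue" is a genuine vowel pair), and the file name shown is hypothetical. The masked sequences (que, aue, eue) are the ones the plugin already handles by rule. The output would still need checking by a German speaker.

```lua
-- True if the word contains ae, oe or ue that the plugin's existing
-- rules (que, aue, eue) would not already leave alone.
function is_candidate (word)
  local w = word:lower ()
  w = w:gsub ("que", "#")      -- q is always followed by u
  w = w:gsub ("[ae]ue", "#")   -- au / eu diphthong followed by e
  return w:find ("[aou]e") ~= nil
end

-- Print every candidate found in a word list (one word per line).
function scan (filename)
  for line in io.lines (filename) do
    local word = line:match ("^%S+")  -- first token on the line, if any
    if word and is_candidate (word) then
      print (word)
    end
  end
end

-- scan ("german.dic")  -- hypothetical file name, not run here

print (is_candidate ("Duell"))   --> true
print (is_candidate ("Quelle"))  --> false
```

Words flagged this way (such as "Duell" or "Poet") could then be added to the exceptions table of the plugin.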

ForgottenMUD - original Java MUD
http://sites.google.com/site/forgottenmud

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.