How to localize MUSHclient messages into other languages

It is now over 60 days since the last post. This thread is closed.



Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #15 on Wed 13 Jun 2007 09:12 PM (UTC)
Message
I should point out that even if you use English, you can edit the resource file to change things such as making the world configuration dialog boxes larger, or laying them out more to your taste.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #16 on Wed 13 Jun 2007 09:58 PM (UTC)

Amended on Sat 16 Jun 2007 05:00 AM (UTC) by Nick Gammon

Message
Utility to detect changes in a new release

To help localizers, the utility below, written in Lua, can be used to compare a new distribution of MUSHclient with an already localized file.


-- stuff already localized
locale = "en" -- change to suit you

dofile (locale .. ".lua")

-- make copy
original = { 
    messages = messages, 
    formatted = formatted, 
    times = times, 
    headings = headings 
    }

messages, formatted, times, headings = nil 

-- from distribution
dofile ("Localize_template.lua")

-- make copy
distribution = { 
    messages = messages, 
    formatted = formatted, 
    times = times, 
    headings = headings 
    }

messages, formatted, times, headings = nil 

function compare_table (name)
  local count = 0
  local old = original [name]
  local new = distribution [name]

  print ("Processing table", name)
  print ""

  -- new message is in distribution, but not in already localized file
  for k, v in pairs (new) do
    if not old [k] then
      count = count + 1
      print (string.format ("  New message: %q", k))
    end -- if not there
  end -- for
  
  
  print ("Found ", count, " new messages")  
  print ""

  count = 0
  
  -- old message is in already localized file, but not in distribution
  for k, v in pairs (old) do
    if not new [k] then
      count = count + 1
      print (string.format ("  Deleted message: %q", k))
    end -- if not there
  end -- for
  
  print ("Found ", count, " deleted messages")  
  print ""
  
end -- compare_table

compare_table ("messages")
compare_table ("formatted")
compare_table ("times")
compare_table ("headings")



It is intended to be run in the "locale" directory. Here is some example output:


Processing table	messages

  New message: "You must specify a font file."
  New message: "You must specify some text to insert."
Found 	2	 new messages

Found 	0	 deleted messages

Processing table	formatted

  New message: "Chat sessions for %s"
  New message: "Saturation: %5.3f"
Found 	2	 new messages

Found 	0	 deleted messages

Processing table	times

  New message: "%A, %#d %B %Y, %#I:%M %p"
Found 	1	 new messages

Found 	0	 deleted messages

Processing table	headings

  New message: "When"
Found 	1	 new messages

  Deleted message: "Action"
Found 	1	 deleted messages



What it does is load up both files (with dofile) and compare the messages tables, to see if entries have been added or deleted.

If there is a new message in the distribution, then that will appear as "new message" as shown above. That alerts you to the need to find it (by using a 'find' in your editor), and copy and paste the message into your localized file - in the appropriate section - and then translate it.

Multi-line messages will not appear exactly the same as in the message file. In the message file line breaks will be indicated by \n, however this program will simply show them as \ followed by a line break. For example:


In en.lua:  "This is a 2-line\nmessage"

In program output: "This is a 2-line\
message"


The simple solution here (apart from amending this program) is to simply search for the contents of the first line, and check the rest matches manually.
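
As a rough sketch of what amending the program could look like, the keys could be printed with the line break turned back into \n. The helper name quote_key is made up for illustration and is not part of the utility above:

```lua
-- hypothetical helper (not part of the utility above):
-- quote a message key on a single line, the way it appears in en.lua
function quote_key (k)
  local s = string.format ("%q", k)
  -- %q writes a line break as backslash + newline; turn it back into \n
  s = string.gsub (s, "\\\n", "\\n")
  return s
end -- quote_key
```

Using print ("  New message: " .. quote_key (k)) in place of the string.format ("%q", k) call would then keep a multi-line key on one line.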

If a message appears to be deleted, but another very similar one has been added, then what has probably happened is that a spelling or punctuation mistake has been corrected, and that you can continue to use the old translation, once the "key" of the message has been adjusted.
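
Some of those corrected keys can be spotted automatically. The following is only a sketch: normalize and find_renamed are hypothetical names, and it only pairs up keys that differ in case, spacing or punctuation, so spelling corrections still have to be found by eye.

```lua
-- hypothetical helper: reduce a key to lower-case letters and digits only
local function normalize (s)
  return (string.gsub (string.lower (s), "[^%w]", ""))
end -- normalize

-- given lists of new and deleted keys, pair up the ones that differ
-- only in case, spacing or punctuation
function find_renamed (new_keys, deleted_keys)
  local by_norm = {}
  for _, k in ipairs (new_keys) do
    by_norm [normalize (k)] = k
  end -- for

  local found = {}
  for _, k in ipairs (deleted_keys) do
    local match = by_norm [normalize (k)]
    if match then
      table.insert (found, { old = k, new = match })
    end -- if probably the same message
  end -- for
  return found
end -- find_renamed
```

Each pair returned gives the old key (whose translation can be kept) and the new key it should be filed under.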

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #17 on Wed 13 Jun 2007 10:47 PM (UTC)
Message
There is quite a bit of discussion about gettext here:

http://www.gnu.org/software/gettext/manual/gettext.html

MUSHclient does not use gettext internally. Personally I found the utility a bit obscure in places; however, the comments in that document were very helpful in developing an alternative. It makes good background reading.

I did however use part of the gettext suite, namely xgettext, which scans source code for strings to be localized.

Based on the comments in the gettext manual, the MUSHclient source was changed slightly in spots, in particular places where strings were built up "piecemeal". For example:


CString s = "You need ";
CString n;              // scratch string for converting numbers to text
n.Format ("%i", a);
s += n;
s += " to ";
n.Format ("%i", b);
s += n;
s += " piece";
if (b != 1)
  s += "s";


This sort of code does not lend itself to being translated very well, because translating the individual fragments will probably not produce a readable sentence. Thus it would be revamped as:


CString s = TFormat ("You need %i to %i piece%s", 
                    a, b, b == 1 ? "" : "s");


The formatted version at least gives the translator a whole sentence to work with, which is much more reasonable.
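
On the Lua side, the metatable code later in this thread treats entries in the formatted table as functions that return the translated string. Assuming such a function receives the same arguments as the TFormat call (that is an assumption, as is the German wording used here), a translator's entry might look roughly like this:

```lua
formatted = formatted or {}

-- the translator sees the whole sentence and chooses the plural ending,
-- ignoring the English "s" passed as the third argument
formatted ["You need %i to %i piece%s"] = function (a, b, s)
  return string.format ("Du brauchst %i bis %i Teil%s",
                        a, b, b == 1 and "" or "e")
end -- function
```

Because the whole sentence is one unit, the word order and the plural rule are entirely in the translator's hands.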

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #18 on Sat 16 Jun 2007 01:22 AM (UTC)

Amended on Tue 19 Jun 2007 10:53 PM (UTC) by Nick Gammon

Message
How to detect which messages are used the most

Reading the gettext documentation gave me another idea. There are something like 750 messages to be localized. It would be nice to initially translate the common ones, and then work back to the less common ones.

We can do this fairly readily, because the translation file is in fact a Lua script, and we can make use of metatables to catch accesses to the translation items.

As an example, I appended the following to my en.lua file:



-- make copy of original tables
orig_messages = messages
orig_formatted = formatted
orig_times = times
orig_headings = headings

-- empty them out so __index is triggered
-- save original tables so we can look them up eventually

messages  = { _orig = orig_messages }
formatted = { _orig = orig_formatted }
times     = { _orig = orig_times }
headings  = { _orig = orig_headings }

counts = {} -- keep counts here

-- metatable for messages, titles, headings
mt_static = {
  -- called to access an entry
  __index=
    function (t, name)
      local s = rawget (t._orig, name)
      if s == nil or #s == 0 then
        counts [name] = (counts [name] or 0) + 1
      end -- not translated yet
      return s
    end;
  }
  
-- metatable for formatted messages
mt_formatted = {
  -- called to access an entry
  __index=
    function (t, name)
      local f = rawget (t._orig, name)
      -- no function? not translated then
      if f == nil then
        counts [name] = (counts [name] or 0) + 1
        return nil
      end
      assert (type (f) == "function")
    
      -- return a function, that will count if the original function
      -- returns an empty string
      return function (...)
         local s = f (...)  -- call original function
         if type (s) ~= "string" or #s == 0 then
           counts [name] = (counts [name] or 0) + 1
         end -- not translated
         return s  -- return translated value
         end -- function 
     end;
  }
  
-- apply the metatables
setmetatable (messages,   mt_static)
setmetatable (times,      mt_static)
setmetatable (headings,   mt_static)
setmetatable (formatted,  mt_formatted)

-- the user will call world.TranslateDebug to invoke this
function Debug ()
   
  -- for sorting
  local t = {}
  
  -- build into table which can be sorted
  for k, v in pairs (counts) do
    table.insert (t, k)
  end -- for

  -- clear out notepad, make heading
  utils.appendtonotepad ("translation", "Translation counts\n\n", true)

  -- sort into descending order
  table.sort (t, function (a, b)
                  return counts [a] > counts [b]
                 end)
                    
  -- display results
  for k, v in ipairs (t) do
    utils.appendtonotepad ("translation", string.format ("%4i: %q \n", counts [v], v))
  end -- for
  
end -- Debug




[EDIT] Amended 20th June 2007 - to fix problem where it didn't correctly look up the original message.

What this does is replace the translation tables by empty ones (after making a copy of them). Then by adding a metatable to the new, empty, tables, we will get a trigger to the __index function each time we attempt to locate a message to be translated. The __index function simply counts the number of times this item has been accessed, and then returns the saved, original, value.

Finally we make a Debug function that can be called by the (new) script function TranslateDebug. Now if we type into the command window:


/TranslateDebug ()


We see something like this:


Translation counts

  12: "Ready" 
   2: "Alias: %s" 
   2: "Clipboard converted for use with the Forum, %i change%s made" 
   2: "&Flip To World	Ctrl+Alt+Space" 
   2: "&Send To World	Shift+Ctrl+S" 
   2: "The connection to %s is not open. Attempt to reconnect?" 
   2: "Connecting to %s, port %d" 
   2: "--- Connected on %A, %B %d, %Y, %#I:%M %p ---" 
   1: "Trigger: %s" 
   1: "For information and assistance about MUSHclient visit our forum at:" 
   1: "Go to forum" 
   1: "Execution of line %i column %i" 
   1: "--- Disconnected on %A, %B %d, %Y, %#I:%M %p ---" 
   1: "Opening world \"%s\"" 
   1: "World: %s" 
   1: "No active world" 
   1: "Script error" 
   1: "--- Connected for %i day%s, %i hour%s, %i minute%s, %i second%s. ---" 
   1: "The \"%s\" server has closed the connection" 
   1: "Immediate execution" 
   1: "Written by Nick Gammon." 


Thus we see that in this (short) session, the above messages needed to be translated. In particular "Ready" was the most often used, thus we might translate that first.

The script shown above has enough complexity so that it doesn't count messages that are already translated. Thus, as you translate some, they will drop off the list, and the next most frequent ones can be translated.
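
The counting technique can also be tried outside MUSHclient. This is a stripped-down illustration only: the names real, hits and proxy are invented for the example, and an empty string stands for "not translated yet" as in the script above.

```lua
-- stand-alone demonstration; the table names are illustrative only
real = { Ready = "", Cancel = "Abbrechen" }  -- "" means not translated yet
hits = {}                                    -- counts of untranslated lookups

proxy = setmetatable ({}, {
  -- called for every key, because the proxy table itself stays empty
  __index = function (t, name)
    local s = rawget (real, name)
    if s == nil or #s == 0 then
      hits [name] = (hits [name] or 0) + 1
    end -- not translated yet
    return s
  end
  })

-- simulate the client looking up some messages
for _, name in ipairs { "Ready", "Ready", "Cancel", "Missing" } do
  local s = proxy [name]
end -- for

-- Ready was counted twice, Missing once, Cancel not at all
print (hits.Ready, hits.Cancel, hits.Missing)
```

The proxy table never stores anything itself, which is what guarantees that __index fires on every lookup.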

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #19 on Sat 16 Jun 2007 04:59 AM (UTC)
Message
Locale codes

I have been searching around for a definitive list of locale codes (eg EN=English, FR=French etc.).

So far I have found this one:

http://www.loc.gov/standards/iso639-2/php/code_list.php

Anyone who is planning to localize MUSHclient as described earlier will probably want to use a code from that list.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Isthiriel   (113 posts)
Date Reply #20 on Sat 16 Jun 2007 08:59 AM (UTC)
Message
And you aren't using the default Windows locale codes (ie EN-US = 1033)?

http://www.microsoft.com/globaldev/reference/winxp/xp-lcid.mspx

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #21 on Sat 16 Jun 2007 10:06 AM (UTC)
Message
Well, I thought English was English.

I mean, if the messages come out in 1033, I imagine someone who speaks 3081 or 4105 or 5129 will understand them.

Another example of Windows making things too complicated, I think.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #22 on Sat 16 Jun 2007 08:38 PM (UTC)
Message
Just to elaborate on that answer. I think if someone speaks English then they will understand a message like:


"This variable name is already in the list of variables."


I don't think that message needs to be translated into:


  • English_United_States
  • English_United_Kingdom
  • English_Australian
  • English_Canadian
  • English_New_Zealand
  • English_Ireland
  • English_South_Africa
  • English_Jamaica
  • English_Caribbean
  • English_Belize
  • English_Trinidad
  • English_Zimbabwe
  • English_Philippines


Even if it was, I can't offhand think of what the different translations would be.

And, if we did start using locale codes, and the file was renamed 1033.lua (English_United_States), and someone from England started using it, the program would find the locale to be 2057 (English_United_Kingdom), and not process the file.

So, we have actually made things harder for people doing that.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Shadowfyr   USA  (1,786 posts)
Date Reply #23 on Sun 17 Jun 2007 01:32 AM (UTC)

Amended on Sun 17 Jun 2007 01:37 AM (UTC) by Shadowfyr

Message
Usually things have default region codes, so if you "don't" have the correct code for 2057, it defaults to 1033, in most programs. In fact, 1033 is probably the default-default, so if you lack "any" support for other languages installed, then it would automatically drop to 1033, the same way fonts fall back from the specific name, to the general family, to the system default, if each one in the chain is missing and can't be loaded.

Mind you, it occurs to me after posting that you might actually want to treat them as "families", so if some fool deletes 1033, you try to set it to 2057 and there is some "other" English set available, it will grab that instead, being the only "available" one to use in that family. Either way, if you wanted to allow for that granularity, instead of just jumping straight to the "default" English file, you would need to have some sort of table showing what the "closest" set of possible alternatives were, so you could load one that is available, if the one requested wasn't there.
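
A table of that kind could be sketched in Lua as follows. The families table and the choose_locale function are hypothetical, and the three IDs are only the ones named in this thread:

```lua
-- hypothetical table: each Windows locale ID mapped to its language family
families = {
  [1033] = "EN",  -- English_United_States
  [2057] = "EN",  -- English_United_Kingdom
  [3081] = "EN",  -- English_Australian
  }

-- pick a file: exact locale first, then any available member of the family
function choose_locale (wanted, available)
  if available [wanted] then
    return wanted
  end -- exact match

  local family = families [wanted]

  -- walk the IDs in sorted order so the choice is repeatable
  local ids = {}
  for id in pairs (families) do
    table.insert (ids, id)
  end -- for
  table.sort (ids)

  for _, id in ipairs (ids) do
    if families [id] == family and available [id] then
      return id
    end -- same family and installed
  end -- for

  return nil  -- nothing suitable installed
end -- choose_locale
```

Here available is assumed to be a table keyed by the locale IDs for which a file exists, for example { [1033] = true }.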

Posted by Isthiriel   (113 posts)
Date Reply #24 on Sun 17 Jun 2007 10:29 AM (UTC)
Message
I'm in Australia and all the XP localization settings are EN-AU (except the keyboard is EN-US) and Office and the localized MS Games that I have all installed a 1033.dll for their resources.

*shrug*

You are going to need a translation table from machine locale to MUSHclient locale anyway, and there's presumably going to be an option somewhere to change it if MUSHclient's first guess is wrong.

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #25 on Sun 17 Jun 2007 08:59 PM (UTC)

Amended on Sun 17 Jun 2007 09:08 PM (UTC) by Nick Gammon

Message
Well, that is a Microsoft game. :)

I am calling GetLocaleInfo to get the abbreviated language name (eg. EN), based on the current locale.

However, this may not be what the user actually wants to use (what happens if they speak multiple languages?), so there is a configuration option to override it.

In Lua scripting, if you type:


/print (os.setlocale (""))


You see:


English_Australia.1252


Now I'm not sure where 1252 comes into it (I am in Australia too), as it isn't 1033.

From that table you suggested, this is the line for Australian English:


English_Australian / 0c09 / 3081 / 0c09:00000409


Now 1033 in decimal is 0409 in hex. But that entry (0c09:00000409) is listed as "Valid Locale ID:InputLocale combinations".

Thus 1033 (0x409) is really the input locale. I was under the impression the input locale was the keyboard locale, not the locale for displaying messages. Anyway you can see how confusing it is if you start looking at Microsoft's table.
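
The decimal and hex forms are easy to check from Lua, using only the numbers quoted above:

```lua
print (string.format ("%04x", 1033))   -- US English locale ID in hex: 0409
print (tonumber ("0c09", 16))          -- Australian English locale ID in decimal: 3081
```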

To quote from Microsoft's documentation:


Input locales, implemented in Win9x, NT4, Windows 2000, and Windows XP are pairings of an input language with an input method (which might be a particular keyboard layout, an Input Method Editor, or speech-to-text converter, for example). Specifically, an input locale describes the language being entered, and how it is being entered.

Input locales are added on a per user basis. For each account it is possible to install multiple input locales and switch between them when entering text, allowing for the composition of multilanguage documents.

...

An Arabic user using Arabic Windows 2000 wants to type an email message in a mixture of Arabic and Russian. The user already has an Arabic input locale, and installs a second input locale for the Russian language (with an Arabic keyboard layout). When entering text, the user is able to switch between the Arabic input locale and the Russian input locale.


Thus, the input locale is really to help you enter text in different languages. However in my case I am trying to solve displaying of text in the desired language.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #26 on Sun 17 Jun 2007 09:04 PM (UTC)
Message
The PennMUSH localization files seem to be named using the languagecode_COUNTRYCODE.po convention. See:

http://download.pennmush.org/Source/translations/1.8.3p2/

These are their file names:


da_DK.po
de_DE.po
eo.po
es_ES.po
fr_FR.po
hr_HR.po
hu_HU.po
id_ID.po
nl_NL.po
no_NO.po
pl_PL.po
pt_BR.po
pt_PT.po
ro_RO.po
ru_RU.po
sv_SE.po
zh_TW.po


I suppose there could be an argument for translating French slightly differently depending on which country you are in (or English for that matter), but in the case of a MUD client, I don't think that level of detail is needed. In a spell-checker, maybe.
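
If only the language part matters, the country suffix can simply be ignored when grouping such names. A small sketch, using a few of the names from the list above (the table names are made up for the example):

```lua
names = { "da_DK", "eo", "pt_BR", "pt_PT", "zh_TW" }

-- group the file names by their leading lower-case language code
by_language = {}
for _, name in ipairs (names) do
  local language = string.match (name, "^(%l+)")
  by_language [language] = by_language [language] or {}
  table.insert (by_language [language], name)
end -- for

-- only "pt" ends up with more than one country variant
for language, list in pairs (by_language) do
  if #list > 1 then
    print (language, table.concat (list, ", "))
  end -- if
end -- for
```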

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (22,975 posts)   Forum Administrator
Date Reply #27 on Mon 18 Jun 2007 12:36 AM (UTC)
Message
Version 4.12 has now been released, which fixes a few niggling problems with the earlier interim releases. See:

http://www.gammon.com.au/forum/?id=7967

The resource file en.dll has not changed at all from the earlier release, so if you have been localizing 4.11 resources, the same file can be used for 4.12.

http://www.gammon.com.au/files/mushclient/resources/mushclient_resources_4.11.zip

Some of the Lua code posted earlier in this thread is now distributed as part of the locale directory, to make it easier to do localization. There is also a locale_notes.txt file which explains the purpose of the various files.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Isthiriel   (113 posts)
Date Reply #28 on Mon 18 Jun 2007 05:38 AM (UTC)
Message
1252 is probably the codepage. Though I'm not completely sure why that would be specified in the locale name.
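
If so, the name returned by os.setlocale would have the form Language_Country.codepage. Assuming that layout, here is a sketch of pulling the parts apart (split_locale is a made-up name):

```lua
-- split a Windows-style locale name into language, country and code page
function split_locale (name)
  local language, country, codepage =
      string.match (name, "^([^_.]+)_?([^.]*)%.?(%d*)$")
  -- tonumber gives nil when there is no code page part
  return language, country, tonumber (codepage)
end -- split_locale

print (split_locale ("English_Australia.1252"))
```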

Posted by StuDraKi   Germany  (17 posts)
Date Reply #29 on Sat 29 Dec 2007 05:29 AM (UTC)
Message
I have two issues with the localisation; maybe they are not related to each other. I'm using MUSHclient 4.19 on a German WinXP.

1) Cannot load DE.dll: I've used this article and created a de.lua, but restarting MUSHclient does not create a DE.DLL. Also, changing the parameter in Global Preferences does not influence anything. (Personally I'm used to the English version and I don't want to change it.)

2) In a world file I have a timer running every 30 seconds. When it is saved, it is second="30,00". But during loading there is an error message: Line 682: Invalid character ',' in numeric attribute named 'second' (timer not loaded)
During reading it is expecting a '.', so if I manually change it to '.', or delete the ',' and the decimal zeros, it works.
',' and '.' are exchanged in German and French compared with the English localisation.
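
As an illustration of reading such a value without depending on the locale, here is a Lua sketch. It is not MUSHclient's own loader, and attribute_to_number is a hypothetical name; it accepts either ',' or '.' as the decimal separator and only ever hands whole-number strings to tonumber.

```lua
-- parse a decimal written with either ',' or '.' as the separator
function attribute_to_number (s)
  local sign, whole, frac = string.match (s, "^(%-?)(%d+)[.,]?(%d*)$")
  if not whole then
    return nil  -- not a plain decimal number
  end -- if

  local n = tonumber (whole)
  if #frac > 0 then
    n = n + tonumber (frac) / 10 ^ #frac
  end -- has a fractional part

  if sign == "-" then
    n = -n
  end -- negative
  return n
end -- attribute_to_number
```

With this, both "30,00" and "30.00" come back as 30.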

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.
