
Gammon Forum


New experimental spell-checker

It is now over 60 days since the last post. This thread is closed.


Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Tue 10 Oct 2006 03:19 AM (UTC)
Message
Version 3.81 of MUSHclient, just released, incorporates the start of what is intended to be a replacement for the existing spell-checker. The current spell-checker is a proprietary one, so making changes to it is difficult, and adding different dictionaries (eg. for other languages) is expensive.

Version 3.81 still incorporates the old spell-checker. To enable the new one go to the File menu -> Global Preferences -> General tab, and check the "Custom" checkbox.

However, before you do that, you need to grab some dictionary files for use with the spell-checker. To save space in the download, and because the new spell-checker is experimental right now, they are available as a separate file.

You need to download this 150 Kb file:

http://www.gammon.com.au/files/mushclient/dicts.zip


Inside are 9 plain text files. Unzip them and put them in the "spell" directory, which sits directly under the directory containing the MUSHclient executable.


I got these files from:

http://wordlist.sourceforge.net/

You are welcome to go to that page and grab other dictionary files (eg. Canadian English, British English), or indeed use any files in any language.

For simplicity, the dictionaries are straight text, that is, one word per line, like this:


a
abilities
ability
ability's
able
about
above
absence
absence's
absolute
...
yet
you
young
your
yours
yourself
zero
zero's
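
A minimal Lua sketch of reading this format, assuming nothing about how spellchecker.lua actually does it: each line becomes a key in a table used as a set. The function name parse_wordlist and the choice to lower-case the keys are illustrative, not taken from the supplied script.

```lua
-- Build a set (word -> true) from text holding one word per line.
-- parse_wordlist is a hypothetical helper, not part of spellchecker.lua.
local function parse_wordlist(text, words)
  words = words or {}
  for line in text:gmatch("[^\r\n]+") do
    -- trim surrounding whitespace; skip lines that end up empty
    local word = line:match("^%s*(.-)%s*$")
    if word ~= "" then
      words[word:lower()] = true
    end
  end
  return words
end

local sample = "a\nabilities\nability\nability's\nable\n"
local words = parse_wordlist(sample)
print(words["ability's"])   -- true
print(words["abaptiston"])  -- nil
```

Keeping the words as table keys makes each lookup a single table access, and passing an existing table as the second argument lets several word lists be merged into one set.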


To make the spell checker easy to modify, the bulk of its operation is coded in Lua, in a file supplied with MUSHclient called "spellchecker.lua", which resides in the same directory as the MUSHclient executable.

When you enable the "custom" spell checker, MUSHclient attempts to load that file into an application-wide Lua script space. That is, it is shared between all open worlds. This is intended to save memory in case you have dozens of worlds open, so only one copy of the dictionary is kept in memory.

Because of this, the spell checker scripts cannot access any of the MUSHclient "world" functions, as they are relative to a particular world. However the Lua script extensions (like the utils library) are still available.

Near the start of the file spellchecker.lua is a list of the dictionaries it will try to open. You can add or remove files from the list. They are expected to be found in the "spell" directory directly under the MUSHclient executable; however, you can change the script to look somewhere else (just change the directory variable).

The 9 files listed below that are not commented out are the ones supplied in the dicts.zip file mentioned earlier. The userdict.txt file is not in that download, because an empty one is created automatically if it is not found.


-- dictionaries
local files = {

-- lower-case words

  "english-words.10",
  "english-words.20",
  "english-words.35",
  "english-words.40",
--   "english-words.50",
--   "english-words.55",
--   "english-words.60",
--   "english-words.70",
--   "english-words.80",
--   "english-words.95",

-- upper case words

  "english-upper.10",
  "english-upper.35",
  "english-upper.40",
--   "english-upper.50",
--   "english-upper.60",
--   "english-upper.70",
--   "english-upper.80",
--   "english-upper.95",

-- contractions (eg. aren't, doesn't)

  "english-contractions.10",
  "english-contractions.35",
  
-- user dictionary, we read and write this one
  userdict,
  }


You can obtain further dictionary files from that site; simply add their names to the list above as necessary.
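
As a rough sketch of how a file list and a directory variable could work together (this is not the code in spellchecker.lua; load_dictionaries and its parameters are hypothetical names), each listed file is read from the one directory into a single word set, and an empty user dictionary is created if it is missing:

```lua
-- Hypothetical sketch: load every listed file from one directory into a
-- single word set, creating an empty user dictionary if it is missing.
-- load_dictionaries and its arguments are illustrative names only.
local function load_dictionaries(directory, files, userdict)
  local words, count = {}, 0
  for _, name in ipairs(files) do
    local path = directory .. name
    local f = io.open(path, "r")
    if not f and name == userdict then
      -- user dictionary not found: create an empty one, then reopen it
      f = assert(io.open(path, "w"))
      f:close()
      f = io.open(path, "r")
    end
    if f then
      for line in f:lines() do
        -- one word per line; trim whitespace and any stray "\r"
        local word = line:match("^%s*(.-)%s*$")
        if word ~= "" and not words[word:lower()] then
          words[word:lower()] = true
          count = count + 1
        end
      end
      f:close()
    else
      io.stderr:write("Skipping missing dictionary: ", path, "\n")
    end
  end
  return words, count
end
```

The returned count gives a quick feel for how much each extra file adds, which matters because every additional dictionary costs load time and memory.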

Later on I envisage a GUI for adding dictionaries.

The more dictionaries, the slower it will be to load, and the more memory it will take. Probably for MUD purposes obscure words like "abaptiston" are not required, so I suggest keeping the number of words as low as possible. Too many words will also result in a long list of suggestions when you make a misspelling.

Once you have successfully installed the dictionaries, you should see a progress bar appear as MUSHclient reads them into memory. In my case, with the 9 supplied dictionary files, it takes around a second.

Then you should be able to spell-check on send, or on request (Ctrl+J).

You will see a similar dialog box to the earlier spell-checker. For each word it considers misspelt you can choose to:


  • Retype the correct word manually (then click "Change").

  • Select a suggested alternative word, and click "Change".

  • Double-click a suggested word.

  • Click "Ignore" to ignore this word once.

  • Click "Ignore All" to ignore all instances of this word in this session (that is, this line).

  • Click "Add" to accept the word, and add it into the userdict.txt file. Thus it will be considered correctly spelt in future.

  • Click "Change" to change this word once from the misspelling to either what you typed, or the selected suggestion.

  • Click "Change All" to change all instances of this word from the misspelling to either what you typed, or the selected suggestion, for this session (that is, this line).

  • Click "Cancel" to exit the spell checker and make no changes. However any words added to the user dictionary before you click Cancel will remain added.
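
One way to picture the "Ignore All" and "Change All" choices, which last only for the line being checked, is as two small tables consulted for each later occurrence of a word. The sketch below is illustrative only: the session table and the apply_session function are hypothetical names, and nothing here describes how the dialog is actually implemented.

```lua
-- Hypothetical per-line session: decisions are forgotten once the
-- line has been checked.
local function new_session()
  return { ignore_all = {}, change_all = {} }
end

-- Rewrite a line using the "Change All" decisions recorded so far.
-- Words marked "Ignore All", and words with no decision, are left alone.
local function apply_session(line, session)
  return (line:gsub("[%a']+", function(word)
    local key = word:lower()
    if session.ignore_all[key] then
      return nil                      -- nil keeps the original word
    end
    return session.change_all[key]    -- nil if no decision was made
  end))
end

local session = new_session()
session.change_all["teh"] = "the"         -- user clicked "Change All"
session.ignore_all["mushclient"] = true   -- user clicked "Ignore All"
print(apply_session("teh cat saw teh MUSHclient window", session))
-- the cat saw the MUSHclient window
```

Because the session is created afresh for each line, nothing carries over to the next spell check; only words added to the user dictionary persist.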





Differences to the current spell checker


  • Right now it doesn't detect duplicated words (eg. the the foot)

  • The code does a case-insensitive compare, due to problems with detecting whether a word at the start of a sentence is a proper noun or not.

  • There is no provision for ignoring words in all caps; this could easily be added to the Lua code.

  • In order to detect words with apostrophes in them, the minimum word size processed is 2 characters. That is, single-character words are not checked.

  • It doesn't update the text in the command window on-the-fly like the current one does. I'm not sure if this is a big problem or not.

  • There are no configuration dialogs set up yet; for now you configure it by amending the spellchecker.lua file.

  • If you use the world.SpellCheck script function, the existing spell checker is called.

  • If you use the world.AddSpellCheckWord script function, it applies to the existing spell checker, not the new one.
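
The case-insensitive compare, the two-character minimum and the suggested all-caps exclusion could look roughly like this in Lua. This is a sketch under assumptions: the word pattern, the function name find_misspellings and the skip_all_caps option are hypothetical and not taken from spellchecker.lua.

```lua
-- Hypothetical sketch of the checks described in the list above.
-- dictionary is a set of lower-case words (word -> true).
local function find_misspellings(line, dictionary, skip_all_caps)
  local bad = {}
  for token in line:gmatch("[%a']+") do
    -- drop quoting apostrophes at either end, keep ones inside the word
    local word = token:match("^'*(.-)'*$")
    if #word >= 2 then                        -- single characters are not checked
      local all_caps = (word == word:upper())
      if not (skip_all_caps and all_caps) then
        if not dictionary[word:lower()] then  -- case-insensitive compare
          bad[#bad + 1] = word
        end
      end
    end
  end
  return bad
end

local dictionary = { the = true, foot = true, ["isn't"] = true }
for _, word in ipairs(find_misspellings("The foot isn't hte same", dictionary)) do
  print(word)  -- prints hte, then same
end
```

Lower-casing only at lookup time means a capitalised word at the start of a sentence is accepted without deciding whether it is a proper noun. Note that %a follows the current C locale, so accented letters may need a wider pattern.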


Please report any suggested amendments, or problems found, in this thread.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Zeno   USA  (2,871 posts)  Bio
Date Reply #1 on Wed 11 Oct 2006 02:03 AM (UTC)
Message
Nothing happens when I click "Options". Is that normal?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Nick Gammon   Australia  (23,133 posts)  Bio   Forum Administrator
Date Reply #2 on Wed 11 Oct 2006 02:40 AM (UTC)
Message
Yes, not implemented yet.

Before I got into making an options dialog I wanted to make sure the basics were working.

- Nick Gammon

www.gammon.com.au, www.mushclient.com


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.