Gammon Forum

Modifying the Spellchecker

It is now over 60 days since the last post. This thread is closed.


Posted by Balerion   (18 posts)  Bio
Date Thu 12 Apr 2007 12:11 PM (UTC)
Message
Hi,

Has anyone worked on making the Lua-based spellchecker function more like the proprietary spellchecker, in terms of options? Right now it's a bit beyond me, but I figured perhaps others have already tackled it. I know that Nick was kind enough to add a way within the script to tell it to ignore capitalized words, but it'd be very useful if the script had some of the other options from the proprietary spellchecker in it:

Ignore All-Caps words
Ignore words with numbers
Ignore words with mixed-case
Report doubled words (e.g. the the)
Case-sensitive

Ideally, it'd be done in such a way that you could uncomment the ones you want to use. Not sure if that's doable, though.

Thanks.

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #1 on Thu 12 Apr 2007 07:28 PM (UTC)
Message
That shouldn't be too hard to do. Just let me know how the old spellchecker handled these things, and I'll get to it. I mostly need to know how you used to set the flags in the old one. I never use spellcheckers in my MUD, although I probably should start, with the area building I'm doing now.

It is much easier to fight for one's ideals than to live up to them.

Posted by Balerion   (18 posts)  Bio
Date Reply #2 on Thu 12 Apr 2007 07:36 PM (UTC)
Message
In the proprietary spellchecker, there was a GUI with checkboxes for the options you wanted enabled.

Here's a URL to a screenshot:

http://tinypic.com/view.php?pic=2uzeds7

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #3 on Thu 12 Apr 2007 08:24 PM (UTC)
Message
Well, I can't get that dialog exactly, but I'm sure I can approximate it with some of the provided MUSHclient functions.

It is much easier to fight for one's ideals than to live up to them.

Posted by Balerion   (18 posts)  Bio
Date Reply #4 on Thu 12 Apr 2007 08:42 PM (UTC)
Message
That'd be cool. It doesn't really matter so much whether it's done via a GUI or some other way. For example, in the script you can define IGNORE_CAPITALIZED as true, and the script will then take that into account when evaluating text to be spellchecked. I thought approaching it that way wouldn't be a bad idea. I tried to do an IGNORE_MIXED based on IGNORE_CAPITALIZED, with some help on the regexp, but it doesn't quite seem to work (possibly I've got the regexp wrong).

For reference, my try at ignoring mixed case words:

  if IGNORE_MIXED then
    -- ignore words featuring mixed case
    if string.find (word, "^([a-z].*[A-Z])|^(.*([a-z].*[A-Z])|([A-Z].*[a-z]))") then
      return word, "ignore"
    end -- this round, ignore this word
  end -- if IGNORE_MIXED

Posted by Nick Gammon   Australia  (23,140 posts)  Bio   Forum Administrator
Date Reply #5 on Thu 12 Apr 2007 10:01 PM (UTC)
Message
Lua patterns do not support the "or" feature (alternation with "|") that the MUSHclient regular expressions do. However, this will work:


  if IGNORE_MIXED  then
    -- ignore words in mixed case
    if string.find (word, "[A-Z]") and
       string.find (word, "[a-z]") then
      return word, "ignore"
    end -- this round, ignore this word
  end -- if IGNORE_MIXED
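
A minimal sketch of why the original attempt fails and what the two-call test accepts. The helper names is_mixed_case and is_inner_mixed_case are hypothetical and are not part of the spellchecker script; the second helper is only an assumption about what "mixed case" might be meant to cover, since the test above also matches ordinary capitalized words such as "Hello".

```lua
-- In a Lua pattern "|" has no special meaning; it only matches a literal bar.
print (string.find ("ab", "a|b"))     -- nil (no "|" in the subject string)
print (string.find ("a|b", "a|b"))    -- 1   3 (matched literally)

-- Same test as above, wrapped in a hypothetical helper so it can be tried out.
local function is_mixed_case (word)
  return string.find (word, "[A-Z]") ~= nil and
         string.find (word, "[a-z]") ~= nil
end

print (is_mixed_case ("MUSHclient"))  -- true
print (is_mixed_case ("hello"))       -- false
print (is_mixed_case ("Hello"))       -- true (capitalized words count too)

-- Assumed stricter variant: require an upper-case letter after the first
-- character, plus a lower-case letter somewhere in the word.
local function is_inner_mixed_case (word)
  return string.find (word, "^.+[A-Z]") ~= nil and
         string.find (word, "[a-z]") ~= nil
end

print (is_inner_mixed_case ("Hello"))       -- false
print (is_inner_mixed_case ("MUSHclient"))  -- true
```

If capitalized words should still be spellchecked while IGNORE_MIXED is on, the stricter variant may be closer to what is wanted.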


Quote:

Ignore words with numbers


It already ignores words which are all numbers.

However if you mean to take something like "route66" as a single word, and then ignore it, you need to do something like this:

First, change the "word" regexp near the start of the spellcheck file from:


-- this is the pattern we use to find "words" in the text to be spell-checked
local pattern = "%a+'?%a+" -- regexp to give us a word with a possible single imbedded quote



to:


-- this is the pattern we use to find "words" in the text to be spell-checked
local pattern = "%a+'?[%a%d]+" -- regexp to give us a word with a possible single imbedded quote


Now it recognises imbedded numbers in a word.
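
As a rough illustration of the difference, the two patterns can be compared on a sample sentence. The helper name words and the sample text are made up for this sketch; only the two pattern strings come from the spellcheck file.

```lua
local old_pattern = "%a+'?%a+"       -- original word pattern
local new_pattern = "%a+'?[%a%d]+"   -- changed pattern that allows digits

-- Hypothetical helper: collect every match of a pattern into a table.
local function words (text, pattern)
  local found = {}
  for w in string.gmatch (text, pattern) do
    found [#found + 1] = w
  end
  return found
end

local text = "Take route66 and don't stop"

print (table.concat (words (text, old_pattern), ", "))
-- Take, route, and, don't, stop

print (table.concat (words (text, new_pattern), ", "))
-- Take, route66, and, don't, stop
```

With the old pattern "route66" is checked as the word "route"; with the new one the whole token reaches the later tests.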

Further down (where you had the IGNORE_MIXED stuff), add a test that now drops those words:

  
  -- ignore words with numbers in them
  if string.find (word, "%d") then
    return word, "ignore"
  end -- this round, ignore this word


In other words, the first change makes numbers be considered as part of a word, the second change makes us ignore such words.
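
The other options asked about at the top of the thread could perhaps be handled in the same style. This is only a sketch under assumptions: IGNORE_ALL_CAPS and IGNORE_NUMBERS are made-up flag names, and should_ignore and doubled_words are hypothetical helpers, not functions from the shipped spellchecker script. Commenting out a flag line leaves that name nil, which switches the option off.

```lua
-- Option flags: comment out (or set to false) the ones you do not want.
local IGNORE_ALL_CAPS = true   -- hypothetical flag name
local IGNORE_NUMBERS  = true   -- hypothetical flag name
local IGNORE_MIXED    = true   -- name used earlier in this thread

-- Hypothetical helper: true if the word should be skipped this round.
local function should_ignore (word)
  if IGNORE_ALL_CAPS and string.find (word, "[A-Z]")
     and not string.find (word, "[a-z]") then
    return true
  end
  if IGNORE_NUMBERS and string.find (word, "%d") then
    return true
  end
  if IGNORE_MIXED and string.find (word, "[A-Z]")
     and string.find (word, "[a-z]") then
    return true
  end
  return false
end

-- Hypothetical helper: list words that directly repeat the previous word.
-- It compares case-insensitively and does not notice punctuation between
-- the two words, so "that. That" would also be reported.
local function doubled_words (text)
  local doubled, previous = {}, nil
  for w in string.gmatch (text, "%a+'?[%a%d]+") do
    local lower = string.lower (w)
    if lower == previous then
      doubled [#doubled + 1] = w
    end
    previous = lower
  end
  return doubled
end

print (should_ignore ("NASA"))      -- true
print (should_ignore ("route66"))   -- true
print (should_ignore ("hello"))     -- false
print (table.concat (doubled_words ("fix the the typo"), ", "))  -- the
```

Inside the real script each test would presumably use return word, "ignore" in the same way as the IGNORE_MIXED block, rather than returning a boolean.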

- Nick Gammon

www.gammon.com.au, www.mushclient.com

The dates and times for posts above are shown in Coordinated Universal Time (UTC).

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.