
Gammon Forum


Spell checker improvements?

It is now over 60 days since the last post. This thread is closed.



Posted by Linda   Sweden  (164 posts)  Bio
Date Reply #15 on Sat 07 Oct 2006 06:33 PM (UTC)
Message
Yes, dialog boxes popping up would be an issue, I think. All the applications I have seen that do live checking do it as live marking of errors, not live offers to change via a dialog box.

Regarding the uppercase and lowercase issue, btw, it doesn't hold true for MUSHes that most everything is typed in lowercase.

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #16 on Sun 08 Oct 2006 05:43 AM (UTC)
Message
Quote:

Yes, dialog boxes popping up would be an issue, I think.


So you don't use the current spell checker then? That pops up a dialog box if there are misspelt words.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Linda   Sweden  (164 posts)  Bio
Date Reply #17 on Sun 08 Oct 2006 05:40 PM (UTC)
Message
I do, but that's an on demand spellchecker, not an automatic, live spell-checker. If boxes popped up automatically as I was typing and happened to get something wrong, it would be a problem. :)

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #18 on Sun 08 Oct 2006 09:03 PM (UTC)
Message
Well, for simplicity, if the existing spellchecker was replaced by our own one (thus supporting multiple dictionaries for instance), then at the very least, having a spell-check on send, with dialog boxes, would be no worse than we currently have.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Linda   Sweden  (164 posts)  Bio
Date Reply #19 on Sun 08 Oct 2006 09:08 PM (UTC)
Message
Oh, that's true. :)

Though, I think it should probably be optional whether to have it automatically spell-check on send or whether it should be called up manually.

I wonder, though, how some of the other programs accomplish the live marking of misspelled words. ICQ's input window, for example, seems very similar to that of MUSHclient, and it doesn't seem to slow things down.

Posted by Zeno   USA  (2,871 posts)  Bio
Date Reply #20 on Sun 08 Oct 2006 10:03 PM (UTC)
Message
Trillian does as well. What I think it does is only check the box after a space is typed.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #21 on Sun 08 Oct 2006 10:22 PM (UTC)
Message
Thunderbird also does highlight-as-you-type, and checks words only after a word-boundary character has been typed. A word-boundary character is something like a space, punctuation mark, new line, etc. It's quite fast.
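
As a rough illustration of checking only at word boundaries, here is a minimal Python sketch. It is not how Thunderbird or Trillian actually do it; the tiny word list, the boundary set (whitespace plus punctuation, minus the apostrophe) and the function name are all assumptions made for the example.

```python
import string

# Hypothetical word list; a real checker would load its dictionary files.
DICTIONARY = {"i", "want", "a", "boat", "now"}

# Assumed boundary set: whitespace plus punctuation, minus the apostrophe
# so that contractions stay in one piece.
BOUNDARIES = set(string.whitespace) | (set(string.punctuation) - {"'"})


def check_on_keystroke(buffer, typed_char):
    """Return the word just completed if it is misspelt, else None.

    `buffer` is the text already in the input box and `typed_char` is
    the character that has just been typed.
    """
    if typed_char not in BOUNDARIES:
        return None  # still inside a word, so nothing to check yet
    # Walk back from the end of the buffer to the previous boundary.
    start = len(buffer)
    while start > 0 and buffer[start - 1] not in BOUNDARIES:
        start -= 1
    word = buffer[start:]
    if word and word.lower() not in DICTIONARY:
        return word
    return None
```

With this, typing a space after "I want a bote" reports "bote", while typing a letter in the middle of a word does no work at all, which is why the per-keystroke cost stays small.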

I think that if there's a 0.5 second delay in checking spelling, something very wrong is happening. It should be very, very quick to see if a word is in the dictionary or not.

I did a quick test, and it took 5 seconds to load up 127k words, find suggestions for "bote", and then look up the word "transformation" a million times. If I change that to 10,000 lookups, it takes 1 second. (Interesting result, actually.) This is without optimization. If I turn on optimization, with -O3, it takes 2.8 seconds to do everything with one million lookups, and 0.6 seconds to do everything with ten thousand lookups. Recall that in both cases, I load up the entire dictionary of 127k words first.

So I think that we can get much better performance than 0.5s. :-)
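
For anyone who wants to try a comparable measurement, here is a small Python sketch. It is not the test program described above (that one is not shown in this thread); the synthetic word list and the function name are made up for illustration, and the timings will differ from machine to machine.

```python
import time


def time_lookups(words, target, count):
    """Time `count` membership tests for `target` against a set of words."""
    dictionary = set(words)  # hash set: average lookup cost is independent of size
    start = time.perf_counter()
    hits = 0
    for _ in range(count):
        if target in dictionary:
            hits += 1
    elapsed = time.perf_counter() - start
    return hits, elapsed


# Synthetic stand-in for a 127k-word dictionary (made-up entries).
words = ["word%d" % n for n in range(127_000)] + ["transformation"]
hits, elapsed = time_lookups(words, "transformation", 10_000)
print("%d hits in %.4f seconds" % (hits, elapsed))
```

Only the lookup loop is timed here; loading the words and finding suggestions are left out, so the figure is not directly comparable with the numbers quoted above.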

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #22 on Mon 09 Oct 2006 05:39 AM (UTC)
Message
I am not worried about the speed, as much as making a custom input box that allows words to be highlighted (eg. with a red squiggle) as you type.

The current edit control just doesn't allow that, and if it was changed to a rich-edit control, then that would have other ramifications (eg. if you pasted styled text into it).

I have a prototype going, I might release that for analysis. This is a straight "dialog box" one that pops up when you hit <enter> to send your text, similarly to the current one.

There are still some administrative issues, like the question about upper/lower case words. Also cases like this:


I want you to type 'east' now.


The problem is the single quotes. If they are considered part of the word, then we get an error that: east' is not in the dictionary.

However, if single quotes are not considered part of a word then something like this will raise an error:


Hold the gun at arm's length.


Given that "s" is not a word, it would accept "arm" but not "s".

A similar problem occurs with the word "I'll".

Any suggestions here are welcome.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Onoitsu2   USA  (248 posts)  Bio
Date Reply #23 on Mon 09 Oct 2006 11:14 AM (UTC)
Message
Well, the fix to that is to just ignore contractions, or to have a separate contraction checker for words with an apostrophe in them. Also check whether the "word" (using the term loosely here, to mean characters separated by white space or by a punctuation character other than an apostrophe) begins and ends with an apostrophe, and if so ignore it, as it is a quoted phrase.

I think a little logic workaround like that would work for the English language, but for others, well, let's just say one word on that: 'Havoc!'

Laterzzz,
Onoitsu2

Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #24 on Mon 09 Oct 2006 07:08 PM (UTC)
Message
My guess is that this is solved by a slight hack:

- if quotation marks appear at the boundaries of words, ignore them.
- if quotation marks appear in the middle of a word, require that word to exist in the dictionary.

It's not sufficient to see if a single word is enclosed in quotation marks, because sometimes you have a quoted phrase that's longer than a word.

I find quite often that as I add somewhat unusual names to, say, OpenOffice's spell checker (e.g. Anscombe) it will not accept the possessive (e.g. Anscombe's) unless I add that too. However once I add Anscombe it will happily accept 'Anscombe', 'Anscombe, Anscombe' and so forth, so that I can use the name in a sentence or quoted sentence normally.

I think that this is a fairly safe rule, and far preferable to simply throwing out contraction checking.
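
A minimal Python sketch of the two rules, assuming whitespace tokenization and a made-up word list; the helper names are hypothetical. Quotation marks at the edges of a token are stripped (along with sentence punctuation, which is an extra assumption), while a quotation mark inside a token is kept, so the whole token has to be in the dictionary.

```python
# Characters ignored at the edges of a token: quotation marks, plus
# sentence punctuation (the latter is an assumption for this sketch).
EDGE_CHARS = "'\"" + ".,;:!?"


def core_of(token):
    """Strip edge characters only; anything in the middle is kept."""
    return token.strip(EDGE_CHARS)


def misspelt_words(text, dictionary):
    """Return the tokens of `text` whose core is not in `dictionary`."""
    bad = []
    for token in text.split():
        core = core_of(token)
        if core and core.lower() not in dictionary:
            bad.append(core)
    return bad


# Hypothetical dictionary for the examples in this thread.
words = {"i", "want", "you", "to", "type", "east", "now", "hold", "the",
         "gun", "at", "arm's", "length", "and", "anscombe"}
print(misspelt_words("I want you to type 'east' now.", words))
print(misspelt_words("'Anscombe, Anscombe' and Anscombe's", words))
```

The second call flags only "Anscombe's", which mirrors the OpenOffice behaviour described above: the quoted forms pass once the name is known, but the possessive has to be added separately.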

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #25 on Tue 10 Oct 2006 01:17 AM (UTC)
Message
I got around it a bit by searching for the (Lua) regexp:


%a+'?%a+


That allows for a single imbedded quote. The disadvantage is that it won't match single-letter words. I suppose you could argue that single letter words are not exactly misspelt, so much as "noise" (oops, my finger hit the keyboard).

I mean, how many people say to themselves "I wonder how you spell the word 'I' (or 'A')"?

Also, in a MUD you might want words like N, S, E, W and so on.
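
To see what that pattern picks up, here is a rough Python equivalent. It is a translation for illustration only: Lua's %a follows the C locale, whereas this sketch assumes plain ASCII letters, and the function name is made up.

```python
import re

# Python rendering of the Lua pattern %a+'?%a+ (ASCII letters assumed).
WORD = re.compile(r"[A-Za-z]+'?[A-Za-z]+")


def words_to_check(text):
    """Return the words the pattern would hand to the spell checker."""
    return WORD.findall(text)


print(words_to_check("I want you to type 'east' now."))
print(words_to_check("Hold the gun at arm's length."))
print(words_to_check("I'll go N"))
```

The surrounding quotes around 'east' are left out, "arm's" and "I'll" come through whole, and the single letters "I" and "N" are never matched, so they are never flagged either.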

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #26 on Tue 10 Oct 2006 03:42 AM (UTC)
Message
The new spell checker is available for testing, see:

http://www.gammon.com.au/forum/?id=7403

The release announcement for version 3.81 is here:

http://www.gammon.com.au/forum/?id=7404




User dictionary

There are other issues too. For example, the user dictionary.

Now, say there is a word the checker doesn't recognise, and you want it to (so you click "Add").

Currently the behaviour is to add it to the spell checker dictionary space, in the same way as existing words.

The significance of this is that it is also considered now as a "sound-alike" for other misspellings.

Here is an example. Say you type "ROFL" and you get a spell-check error. If you click "add" then "ROFL" is now in the dictionary. Now, if you type "ROFL" it is accepted, of course. But if you type "RFL" then one of the suggestions is now "ROFL" as that is now in the dictionary.

I'm not sure if that is a good thing or not. Possibly it is.
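
The effect can be reproduced with a small Python sketch. Note the swap: this uses difflib's similarity matching from the standard library rather than the sound-alike matching the spell checker itself uses, and the class, its methods and the starting word list are all hypothetical.

```python
import difflib


class Checker:
    """Toy checker where added words share one space with built-in words."""

    def __init__(self, words):
        self.words = {w.lower() for w in words}

    def add(self, word):
        # An added word goes into the same set as everything else.
        self.words.add(word.lower())

    def is_known(self, word):
        return word.lower() in self.words

    def suggest(self, word, limit=5):
        # Similarity matching stands in for sound-alike matching here.
        return difflib.get_close_matches(
            word.lower(), sorted(self.words), n=limit, cutoff=0.6)


checker = Checker(["the", "attack", "group"])
before = checker.suggest("RFL")
checker.add("ROFL")
after = checker.suggest("RFL")
print(before, after)
```

Because the added word lives in the same set as the rest, it becomes a candidate suggestion straight away. Keeping user words in a separate set that is consulted for lookups but not for suggestions would be the alternative design.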




Auto-replacement

A feature that is not there at present is an "auto-replace" dictionary. Perhaps we could have a dictionary like this:


teh=the
attck=attack


If there was such a thing, then perhaps the spell checker could silently replace any such words it finds with their replacements.
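
A sketch of how such a dictionary might be read and applied, in Python. The wrong=right line format is taken from the example above; the function names, the case-insensitive matching and the word pattern are assumptions, since the feature does not exist yet.

```python
import re


def load_replacements(lines):
    """Parse lines of the form wrong=right into a lookup table."""
    table = {}
    for line in lines:
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        wrong, right = line.split("=", 1)
        table[wrong.strip().lower()] = right.strip()
    return table


# Whole words only, with an optional embedded apostrophe.
WORD = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?")


def auto_replace(text, table):
    """Silently swap any word found in the table for its replacement."""
    def swap(match):
        word = match.group(0)
        return table.get(word.lower(), word)
    return WORD.sub(swap, text)


table = load_replacements(["teh=the", "attck=attack"])
print(auto_replace("attck teh orc", table))
```

Matching whole words means "tehran" is left alone even though it starts with "teh". One limitation of this sketch is that capitalisation is lost: "Teh" becomes "the", not "The".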

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Shadowfyr   USA  (1,787 posts)  Bio
Date Reply #27 on Tue 10 Oct 2006 11:09 PM (UTC)
Message
Quote:
Currently the behaviour is to add it to the spell checker dictionary space, in the same way as existing words.


Not entirely. One thing that bugs the heck out of me is that most won't let you specify, in the case of completely new words, what type of word it is. For example, you might want it to be smart enough to tell that 'ing' is a valid addon for 'frack', but not 'smurf' and, more to the point, that 'smurf' takes 's', not 'es', when plural, not to mention the simple fact that it is something that "should" be allowed to be plural. This has always bugged me about user dictionaries.

Then again, it could always be worse. They also started designing a non-proprietary spellchecker for the news reader I use, and somehow failed to add the, "No, I don't like any of those choices, but use this...", option. Nothing like misspelling a word, having the checker completely fail to find the "correct" spelling, then being stuck with "only" the wrong version or 10 unrelated choices, because the bloody spellchecker won't let you manually change the spelling to correct the mistake (or at least not in the dialog itself).

There are definitely some screwy problems with spellcheckers of all types, though the funniest has got to be the MS ones that are a) ignorant of MS product names and basic computer terms and b) missing other common, if 'maybe' slightly less used words. lol

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #28 on Tue 10 Oct 2006 11:44 PM (UTC)
Message
Quote:

For example, you might want it to be smart enough to tell that 'ing' is a valid addon for 'frack', but not 'smurf' ...


Well in my case you could just add the word "fracking" when required.

Quote:

"No, I don't like any of those choices, but use this...", option.


In my spellchecker you get around this by using "add" to add the word. Otherwise if you simply type a correction it rechecks it. For example, you type: "grp" and it gives an error. You correct it to "grup" and it still gives an error. You either Add grup to the dictionary, or keep going until you get the correct spelling (eg. "group").
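
In outline, that recheck loop looks something like the following Python sketch. The function name and the way the typed corrections are passed in are assumptions made for the example; the real spell checker drives this from its dialog box.

```python
def resolve(word, dictionary, corrections):
    """Recheck each typed correction until one is in the dictionary.

    `corrections` stands in for what the user types into the dialog,
    in order. Returns the accepted word, or None if the user runs out
    of corrections (at which point "Add" would be the other way out).
    """
    candidate = word
    pending = iter(corrections)
    while candidate.lower() not in dictionary:
        candidate = next(pending, None)
        if candidate is None:
            return None
    return candidate


print(resolve("grp", {"group"}, ["grup", "group"]))
```

Here "grp" fails, the correction "grup" fails as well, and "group" is finally accepted.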

Quote:

... though the funniest has got to be the MS ones ...


The thing I like about my spellchecker is that the dictionaries are under your control, and are human-readable. So, you can browse them and check that the words in them are what you expect.

Someone might like to make a "MUD dictionary" of common mudding terms and abbreviations (eg. LOL, ROFL etc.)

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,046 posts)  Bio   Forum Administrator
Date Reply #29 on Tue 10 Oct 2006 11:46 PM (UTC)
Message
The next step will be a thesaurus. Maybe Linda can tell us what she has in mind.


  • Just alternative words?
  • Or the definition as well?
  • How about opposites?

- Nick Gammon

www.gammon.com.au, www.mushclient.com

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).


Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.
