Spell checker improvements?
|
It is now over 60 days since the last post. This thread is closed.
Posted by
| Linda
Sweden (164 posts) Bio
|
Date
| Reply #15 on Sat 07 Oct 2006 06:33 PM (UTC) |
Message
| Yes, dialog boxes popping up would be an issue, I think. All the applications I have seen that do live checking do it as live marking of errors, not live offers to change via a dialog box.
Regarding the uppercase and lowercase issue, btw, it doesn't hold true for MUSHes that most everything is typed in lowercase. | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #16 on Sun 08 Oct 2006 05:43 AM (UTC) |
Message
|
Quote:
Yes, dialog boxes popping up would be an issue, I think.
So you don't use the current spell checker then? That pops up a dialog box if there are misspelt words. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Linda
Sweden (164 posts) Bio
|
Date
| Reply #17 on Sun 08 Oct 2006 05:40 PM (UTC) |
Message
| I do, but that's an on-demand spellchecker, not an automatic, live spell-checker. If boxes popped up automatically as I was typing and happened to get something wrong, it would be a problem. :) | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #18 on Sun 08 Oct 2006 09:03 PM (UTC) |
Message
| Well, for simplicity, if the existing spellchecker was replaced by our own one (thus supporting multiple dictionaries for instance), then at the very least, having a spell-check on send, with dialog boxes, would be no worse than we currently have. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Linda
Sweden (164 posts) Bio
|
Date
| Reply #19 on Sun 08 Oct 2006 09:08 PM (UTC) |
Message
| Oh, that's true. :)
Though, I think it should probably be optional whether to have it automatically spell-check on send or whether it should be called up manually.
I wonder, though, how some of the other programs accomplish the live marking of misspelled words. ICQ's input window, for example, seems very similar to that of MUSHclient, and it doesn't seem to slow things down. | Top |
|
Posted by
| Zeno
USA (2,871 posts) Bio
|
Date
| Reply #20 on Sun 08 Oct 2006 10:03 PM (UTC) |
Message
| Trillian does as well. I think it only checks the box after a space is typed. |
Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org | Top |
|
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #21 on Sun 08 Oct 2006 10:22 PM (UTC) |
Message
| Thunderbird also does highlight-as-you-type, and checks words only after a word-boundary character has been typed. A word-boundary character is something like a space, punctuation mark, new line, etc. It's quite fast.
I think that if there's a 0.5 second delay in checking spelling, something very wrong is happening. It should be very, very quick to see if a word is in the dictionary or not.
I did a quick test, and it took 5 seconds to load up 127k words, find suggestions for "bote", and then look up the word "transformation" a million times. If I change that to 10,000 lookups, it takes 1 second. (Interesting result, actually.) This is without optimization. If I turn on optimization, with -O3, it takes 2.8 seconds to do everything with one million words, and 0.6 seconds to do everything with ten thousand words. Recall that in both cases, I load up the entire dictionary of 127k words first.
So I think that we can get much better performance than 0.5s. :-) |
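A minimal sketch of the check-on-word-boundary idea described above, in Python rather than whatever was used for the timing test. The names `BOUNDARIES`, `load_dictionary` and `check_on_boundary` are made up for illustration, and treating the apostrophe as a non-boundary character is an assumption. The point is only that a set lookup per completed word is cheap:

```python
import string

# Assumption: boundaries are whitespace plus ASCII punctuation,
# except the apostrophe (so "arm's" stays one word).
BOUNDARIES = set(string.whitespace) | (set(string.punctuation) - {"'"})


def load_dictionary(words):
    """Build a set of lower-cased words; membership tests are O(1) on average."""
    return {w.strip().lower() for w in words if w.strip()}


def check_on_boundary(buffer, dictionary):
    """Return the misspelt word just completed, or None.

    Work is done only when the last character typed is a boundary character.
    """
    if not buffer or buffer[-1] not in BOUNDARIES:
        return None  # still in the middle of a word: nothing to check
    # Walk back from the boundary to find the start of the previous word.
    end = len(buffer) - 1
    start = end
    while start > 0 and buffer[start - 1] not in BOUNDARIES:
        start -= 1
    word = buffer[start:end]
    if word and word.lower() not in dictionary:
        return word
    return None
```

Calling `check_on_boundary` after every keystroke does nothing until a space or punctuation mark arrives, and then performs a single lookup.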
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #22 on Mon 09 Oct 2006 05:39 AM (UTC) |
Message
| I am not worried about the speed, as much as making a custom input box that allows words to be highlighted (eg. with a red squiggle) as you type.
The current edit control just doesn't allow that, and if it was changed to a rich-edit control, then that would have other ramifications (eg. if you pasted styled text into it).
I have a prototype going, I might release that for analysis. This is a straight "dialog box" one that pops up when you hit <enter> to send your text, similarly to the current one.
There are still some administrative issues, like the question about upper/lower case words. Also cases like this:
I want you to type 'east' now.
The problem is the single quotes. If they are considered part of the word, then we get an error that: east' is not in the dictionary.
However, if single quotes are not considered part of a word then something like this will raise an error:
Hold the gun at arm's length.
Given that "s" is not a word, it would accept "arm" but not "s".
A similar problem occurs with the word "I'll".
Any suggestions here are welcome.
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Onoitsu2
USA (248 posts) Bio
|
Date
| Reply #23 on Mon 09 Oct 2006 11:14 AM (UTC) |
Message
| Well, the fix to that is to just ignore contractions, or have a separate contraction checker for words with an apostrophe in them, and check whether the "word" (using the term loosely here, to mean characters separated by white space or by a punctuation character other than an apostrophe) begins and ends with an apostrophe, and if so then ignore it, as it is a quoted phrase.
I think a little logic workaround like that would work for the English language, but for others, well, let's just say one word on that: 'Havoc!'
Laterzzz,
Onoitsu2 | Top |
|
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #24 on Mon 09 Oct 2006 07:08 PM (UTC) |
Message
| My guess is that this is solved by a slight hack:
- if quotation marks appear at the boundaries of words, ignore them.
- if quotation marks appear in the middle of a word, require that word to exist in the dictionary.
It's not sufficient to see if a single word is enclosed in quotation marks, because sometimes you have a quoted phrase that's longer than a word.
I find quite often that as I add somewhat unusual names to, say, OpenOffice's spell checker (e.g. Anscombe) it will not accept the possessive (e.g. Anscombe's) unless I add that too. However once I add Anscombe it will happily accept 'Anscombe', 'Anscombe, Anscombe' and so forth, so that I can use the name in a sentence or quoted sentence normally.
I think that this is a fairly safe rule, and much more preferable than simply throwing out contraction checking. |
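The two-part rule above could look roughly like this in Python. This is only a sketch: `words_to_check`, `misspelt` and the set of edge characters are illustrative assumptions, not anything taken from MUSHclient or OpenOffice.

```python
def words_to_check(text):
    """Split on whitespace, then strip quotes and punctuation at word edges.

    An apostrophe inside a word (arm's, I'll) is kept, so the whole
    token has to be in the dictionary.
    """
    edge_chars = "'\"" + ".,;:!?()"  # assumption: what counts as an edge
    for token in text.split():
        word = token.strip(edge_chars)
        if word:
            yield word


def misspelt(text, dictionary):
    """Return the words from text that are not in the (lower-case) dictionary."""
    return [w for w in words_to_check(text) if w.lower() not in dictionary]
```

With only `anscombe` in the dictionary, `'Anscombe, Anscombe'` passes but `Anscombe's` is reported, which matches the OpenOffice behaviour described above.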
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #25 on Tue 10 Oct 2006 01:17 AM (UTC) |
Message
| I got around it a bit by searching for the (Lua) regexp:
%a+'?%a+
That allows for a single imbedded quote. The disadvantage is that it won't match single-letter words. I suppose you could argue that single letter words are not exactly misspelt, so much as "noise" (oops, my finger hit the keyboard).
I mean, how many people say to themselves "I wonder how you spell the word 'I' (or 'A')"?
Also, in a MUD you might want words like N, S, E, W and so on.
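To see what that pattern picks up, here is a rough Python equivalent. It assumes Lua's `%a` means ASCII letters only (in Lua it depends on the locale), and `WORD` and `find_words` are illustrative names:

```python
import re

# Approximate Python translation of the Lua pattern %a+'?%a+ :
# one or more letters, an optional single quote, one or more letters.
WORD = re.compile(r"[A-Za-z]+'?[A-Za-z]+")


def find_words(text):
    """Return the words the pattern would hand to the spell checker."""
    return WORD.findall(text)
```

`find_words("I want you to type 'east' now.")` returns the words without the surrounding quotes and without the single-letter "I", while "arm's" and "I'll" come through whole. Single letters such as N, S, E and W are never matched, so they are never flagged.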
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #26 on Tue 10 Oct 2006 03:42 AM (UTC) |
Message
| The new spell checker is available for testing, see:
http://www.gammon.com.au/forum/?id=7403
The release announcement for version 3.81 is here:
http://www.gammon.com.au/forum/?id=7404
User dictionary
There are other issues too. For example, the user dictionary.
Now, say there is a word the checker doesn't recognise, and you want it to (so you click "Add").
Currently the behaviour is to add it to the spell checker dictionary space, in the same way as existing words.
The significance of this is that it is also considered now as a "sound-alike" for other misspellings.
Here is an example. Say you type "ROFL" and you get a spell-check error. If you click "add" then "ROFL" is now in the dictionary. Now, if you type "ROFL" it is accepted, of course. But if you type "RFL" then one of the suggestions is now "ROFL" as that is now in the dictionary.
I'm not sure if that is a good thing or not. Possibly it is.
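A small sketch of that effect. Note that this uses Python's `difflib` string similarity as a stand-in; it is not the sound-alike method the new spell checker uses, and the `UserDictionary` class is hypothetical:

```python
import difflib


class UserDictionary:
    """Toy dictionary where added words share the space of existing words."""

    def __init__(self, words):
        self.words = set(words)

    def add(self, word):
        # The user clicked "Add": the word joins the ordinary word list.
        self.words.add(word)

    def is_known(self, word):
        return word in self.words

    def suggest(self, word, limit=5):
        # Stand-in for sound-alike matching: closest strings by similarity.
        return difflib.get_close_matches(word, sorted(self.words), n=limit)
```

Before `add("ROFL")`, `suggest("RFL")` cannot offer "ROFL"; afterwards it does, because the added word is treated like any other dictionary entry. The comparison here is case-sensitive, which is a simplification.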
Auto-replacement
A feature that is not there at present is an "auto-replace" dictionary. Perhaps we could have a dictionary like this:
teh=the
attck=attack
If there was such a thing, then perhaps the spell checker can silently replace any words found, by the replacement. |
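A possible shape for this, as a Python sketch. The `wrong=right` line format comes from the example above; `parse_replacements`, `auto_replace` and the word pattern are assumptions made for illustration:

```python
import re


def parse_replacements(lines):
    """Parse 'wrong=right' lines into a dict, skipping blank or malformed lines."""
    table = {}
    for line in lines:
        wrong, sep, right = line.strip().partition("=")
        if sep and wrong and right:
            table[wrong.lower()] = right
    return table


def auto_replace(text, table):
    """Silently replace whole words found in the table; leave the rest alone."""
    def swap(match):
        word = match.group(0)
        return table.get(word.lower(), word)
    # Letters with an optional embedded apostrophe count as one word.
    return re.sub(r"[A-Za-z]+(?:'[A-Za-z]+)?", swap, text)
```

Only whole words are replaced, so "tehran" is untouched even though it starts with "teh". The sketch ignores the capitalisation of the original word ("Teh" becomes "the"), which a real implementation would probably want to preserve.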
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Shadowfyr
USA (1,787 posts) Bio
|
Date
| Reply #27 on Tue 10 Oct 2006 11:09 PM (UTC) |
Message
|
Quote: Currently the behaviour is to add it to the spell checker dictionary space, in the same way as existing words.
Not entirely.. One thing that bugs the heck out of me is that most won't let you specify, in the case of completely new words, what type of word it is. For example, you might want it to be smart enough to tell that 'ing' is a valid addon for 'frack', but not 'smurf' and, more to the point, that 'smurf' takes 's', not 'es', when plural, not to mention the simple fact that it is something that "should" be allowed to be plural. This has always bugged me about user dictionaries.
Then again, it could always be worse. They also started designing a non-proprietary spellchecker for the news reader I use, and somehow failed to add the "No, I don't like any of those choices, but use this..." option. Nothing like misspelling a word, having the checker completely fail to find the "correct" spelling, then being stuck with "only" the wrong version or 10 unrelated choices, because the bloody spellchecker won't let you manually change the spelling to correct the mistake (or at least not in the dialog itself).
There are definitely some screwy problems with spellcheckers of all types, though the funniest has got to be the MS ones that are a) ignorant of MS product names and basic computer terms and b) missing other common, if 'maybe' slightly less used words. lol | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #28 on Tue 10 Oct 2006 11:44 PM (UTC) |
Message
|
Quote:
For example, you might want it to be smart enough to tell that 'ing' is a valid addon for 'frack', but not 'smurf' ...
Well in my case you could just add the word "fracking" when required.
Quote:
"No, I don't like any of those choices, but use this...", option.
In my spellchecker you get around this by using "add" to add the word. Otherwise if you simply type a correction it rechecks it. For example, you type: "grp" and it gives an error. You correct it to "grup" and it still gives an error. You either Add grup to the dictionary, or keep going until you get the correct spelling (eg. "group").
Quote:
... though the funniest has got to be the MS ones ...
The thing I like about my spellchecker is that the dictionaries are under your control, and are human-readable. So, you can browse them and check that the words in them are what you expect.
Someone might like to make a "MUD dictionary" of common mudding terms and abbreviations (eg. LOL, ROFL etc.)
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nick Gammon
Australia (23,046 posts) Bio
Forum Administrator |
Date
| Reply #29 on Tue 10 Oct 2006 11:46 PM (UTC) |
Message
| The next step will be a thesaurus. Maybe Linda can tell us what she has in mind.
- Just alternative words?
- Or the definition as well?
- How about opposites?
|
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
This is page 2; the subject is 3 pages long.