Notice: Any messages purporting to come from this site telling you that your password has expired, or that you need to verify your details, confirm your email, resolve issues, making threats, or asking for money, are
spam. We do not email users with any such messages. If you have lost your password you can obtain a new one by using the
password reset link.
Due to spam on this forum, all posts now need moderator approval.
Entire forum
➜ MUSHclient
➜ General
➜ How do I make an Autocleric?
How do I make an Autocleric?
|
It is now over 60 days since the last post. This thread is closed.
Refresh page
Pages: 1
2
3
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #30 on Fri 02 Mar 2007 07:15 PM (UTC) |
Message
|
Quote: but also the fact that, while sections of memory might go unused, the chunks don't "necessarilly", I assume, need to be fixed sizes, so you use "less" space for data that doesn't need fixed sizes in it. Unless I am completely wrong about that. Well... :-)
What you're really talking about here is the difference between pointers and storing the actual data. If I had an array of pointers, (hint: a list of buckets) then I don't suffer from data elements that are larger than others in the sense that I don't have to preallocate all necessary space.
But this has nothing to do with hash tables per se.
One advantage of hash tables over arrays regarding storage efficiency is that if you have a very sparse array you are wasting lots of space. E.g., if all you have is a[1], a[50], a[102] then it would be wasteful to store than in an array of size 102 because you'd have 99 empty slots. If you had a hash table with a reasonably sized (or dynamically growing) bucket list you would be using a lot less.
Quote: Though, I admit I don't know if hashes help to solve that problem or not, but it just seems reasonable that, to save on space, you could/would do something like that. Yes, it's a reasonable thing to do, which is what I was referring to just above here regarding pointers. Still it is not a property of hash tables and is rather the difference between storing pointers to data versus the actual data. |
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
Posted by
| Nick Gammon
Australia (23,158 posts) Bio
Forum Administrator |
Date
| Reply #31 on Fri 02 Mar 2007 07:52 PM (UTC) Amended on Fri 02 Mar 2007 07:53 PM (UTC) by Nick Gammon
|
Message
|
Quote:
... all anyone had to do, which Nick did quite handilly, is explain that a hash was being used, not a linear address space ...
I don't really want to get bogged down here with claims and counter claims about what people know, or don't know.
The forum is intended to be an informational and educational tool, and thus I am happy to clarify things like that.
It is easy - if you have come from a history, as I have, of using simpler languages (if you can call assembler simple) - to assume that language implementors still do things the same way as they always have. Which in a sense they do, deep down.
You can see from the fact that what look like arrays in Lua, but support negative subscripts, must really use something else.
Interestingly, the Standard Template Library (STL), when storing similar things (in a map or set), does not actually use hashes but a B-Tree, which has the interesting property that regardless of the order in which you put things, they come out in sequence (alphabetic order for strings).
If you expected the keys to be hashed you would be surprised by that.
It shows that taking an interest in the implementation details can clarify some confusion, when the behaviour of the overall algorithm is not quite what you expect. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| Nick Gammon
Australia (23,158 posts) Bio
Forum Administrator |
Date
| Reply #32 on Fri 02 Mar 2007 07:55 PM (UTC) |
Message
| Oh, and how is the Autocleric going? |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #33 on Fri 02 Mar 2007 09:13 PM (UTC) |
Message
|
Quote: Interestingly, the Standard Template Library (STL), when storing similar things (in a map or set), does not actually use hashes but a B-Tree, which has the interesting property that regardless of the order in which you put things, they come out in sequence (alphabetic order for strings). This is indeed a pretty interesting design decision on their part, mainly because it requires that whatever you are storing be sortable/comparable. If you're inserting pointers, then everything is fine because you can compare pointer addresses. String objects also come with the comparison operator (lexicographical comparison). But if you're trying to key on a new data structure (or store it in a set) you need to have your own comparison.
Of course, you have the same problem with hash functions, in that it's unclear how you would go about hashing an unknown data structure without additional information about it.
So I'm guessing the btree choice had to do not only with wanting the properties of binary trees (or not), but also with how to actually implement a binary tree versus a hash table. As soon as you stop storing pointers you need to worry about your "keying" function, and it's probably easier to define an ordering on things than to define a hash function -- because, in part, a hash function has to be much more carefully crafted in order to distribute elements more or less equally over the buckets and not have (too many) collisions. |
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
Posted by
| Shaun Biggs
USA (644 posts) Bio
|
Date
| Reply #34 on Sat 03 Mar 2007 08:08 AM (UTC) |
Message
|
Quote: This is indeed a pretty interesting design decision on their part, mainly because it requires that whatever you are storing be sortable/comparable. If you're inserting pointers, then everything is fine because you can compare pointer addresses. String objects also come with the comparison operator (lexicographical comparison). But if you're trying to key on a new data structure (or store it in a set) you need to have your own comparison.
I'm pretty sure the Lua developers came up with on their own. Comparison by strings is definitely not what is going on, because you can have functions as keys and values for tables in Lua. |
It is much easier to fight for one's ideals than to live up to them. | Top |
|
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #35 on Sat 03 Mar 2007 12:13 PM (UTC) |
Message
| They use a hash function, not an ordering. I'm guessing that since Lua has a relatively small number of pure data types, it's possible to code in advance the hash function for all those types. But right, a straight comparison is definitely not what they use. |
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
Posted by
| Nick Gammon
Australia (23,158 posts) Bio
Forum Administrator |
Date
| Reply #36 on Wed 07 Mar 2007 03:39 AM (UTC) |
Message
| Interestingly, I think you will find that all Lua types can be readily hashed because they are stored internally as a union. See the source for this:
/*
** Union of all Lua values
*/
typedef union {
GCObject *gc;
void *p;
lua_Number n;
int b;
} Value;
Depending on the size of lua_Number, this value is going to be a small size (say around 4 bytes, maybe slightly more). Lua internally stores strings in such a way that identical strings are stored in the same place. Thus if the *address* of two strings is the same, the strings are the same. Similarly for tables, which are really a pointer to the table. And, userdata is simply an address. Once you take the type of data into account, in addition to the value, you have a small field that needs to be hashed to index into a table. |
- Nick Gammon
www.gammon.com.au, www.mushclient.com | Top |
|
Posted by
| David Haley
USA (3,881 posts) Bio
|
Date
| Reply #37 on Wed 07 Mar 2007 05:40 AM (UTC) |
Message
| Oh, that is interesting. I guess it's what makes the most sense, too. :-)
I'm guessing that lua_Number is closer to 8 bytes, since the default number size is a double.
I wonder if their hashing function simply operates on a sequence of bytes. After all, it appears that all of their values are 4 to 8 bytes (the pointer sizes depend on 32 vs 64 bit).
One of these days, I'd like to really look through the Lua implementation and see what it's like under the hood... |
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
http://david.the-haleys.org | Top |
|
The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
To show them in your local time you can join the forum, and then set the 'time correction' field in your profile to the number of hours difference between your location and UTC time.
97,538 views.
This is page 3, subject is 3 pages long:
1
2
3
It is now over 60 days since the last post. This thread is closed.
Refresh page
top