Gammon Forum

Copying tables


Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Fri 20 Jul 2007 08:35 AM (UTC)
Message
I'm writing a script that will compare certain statistics over time. I have a set of triggers set up to capture all the data I need and dump all of it into a table sorted by players. This works great, no issues with it. I have no issues until I try to copy the table over to another variable, "oldranks", and then I try to edit data on the main table. Same issue when I just grab the statistics again later, rewriting the original table. The problem is this: the variables are holding a reference to the same table. If I edit one, they both get changed, since it's the same data structure.

Is there any way to nicely just copy over the data into a new table without having to make a function to traverse the entire tree? I'm just hoping that there is a way built into Lua to do this easily.
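
To make the aliasing concrete, here is a minimal sketch. The table contents are made up for illustration and are not the real data; the point is only that assigning a table to a second variable copies the reference, not the table.

```lua
-- Hypothetical data, for illustration only
local ranks = { Balaam = { kills = 33 } }

local oldranks = ranks          -- copies the reference, not the table
ranks.Balaam.kills = 34         -- change made through the first variable

print(oldranks.Balaam.kills)    --> 34
print(oldranks == ranks)        --> true
```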

It is much easier to fight for one's ideals than to live up to them.

Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #1 on Fri 20 Jul 2007 05:11 PM (UTC)
Message
Shallow table copies are pretty easy, but you do need to iterate over all pairs in the table and copy them into a new table. Deep copies are a little bit harder, but not much; you can do it with a simple recursive function that recurses whenever it finds a table. With a flag to say which kind of copy you want, the same function can straightforwardly handle both cases.

The only catch I can think of is that you might need to copy the metatable.
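
A minimal sketch of the shallow case, assuming plain tables with no metatable; `shallow_copy` is a name picked for this example only. It shows why a shallow copy is often not enough: nested tables are still shared with the original.

```lua
-- Shallow copy: new top-level table, but nested tables are still shared
local function shallow_copy(t)
  local result = {}
  for k, v in pairs(t) do
    result[k] = v
  end
  return result
end

local original = { name = "Balaam", stats = { kills = 33 } }
local copy = shallow_copy(original)

copy.name = "Someone"          -- does not affect the original
copy.stats.kills = 99          -- does: both share the same 'stats' table

print(original.name)           --> Balaam
print(original.stats.kills)    --> 99
```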

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #2 on Fri 20 Jul 2007 08:16 PM (UTC)
Message
Well, so far I've just been using a recursive function that I made myself, but I was hoping that there was some sort of table.clone() function that I was missing somewhere in the documentation. I haven't mucked about with metatables, but I'm sure that they would be ignored by the function I made.

Not having a way to clone the data in the standard Lua libraries seems kind of odd to me. Kind of a pain to have to copy/paste a cloning function into whatever project needs it.


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #3 on Fri 20 Jul 2007 10:04 PM (UTC)
Message
You could make one file and "require" it wherever you need it. That is what I did with serialization (which recurses), and the "tprint" (which recursively prints a table for debugging).

There are so many "ifs" and "buts" with a deep copy that this is probably the reason there is no library routine. You could say the same about the serialization. I recently did a different serialization for certain table types, which results in a better looking result, providing the tables conform to certain restrictions.

One potential problem (and you can probably find many) is if a table contained a userdata (for example, a file handle to an open file). How do you copy that? If you simply copy the handle then if the clone closes the file, it gets closed in the original. If you try to make a new one, how do you make a copy of a file handle, especially to an output file?
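
A small sketch of the file handle problem, using a temporary file from io.tmpfile so nothing is left on disk. The field name `log` is made up for the example. The copy loop duplicates the reference to the handle, so closing it through the clone closes it for the original too.

```lua
-- temporary file, removed automatically when the program ends
local original = { log = assert(io.tmpfile()) }

-- a field-by-field copy: the handle itself is copied by reference
local clone = {}
for k, v in pairs(original) do
  clone[k] = v
end

clone.log:close()

print(io.type(original.log))   --> closed file
```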

Then there is the issue of metatables. For example, with a metatable you can make __index refer to a different table if an item is not found. Do you clone the metatable? Do you clone the target table as well?
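
To illustrate the metatable question, here is a sketch in which the original falls back to a `defaults` table through __index (all names are hypothetical). The clone reuses the same metatable, so it is still tied to the same fallback table as the original.

```lua
local defaults = { colour = "white" }
local original = setmetatable({ name = "Balaam" }, { __index = defaults })

-- copy the fields and reuse the same metatable
local clone = {}
for k, v in pairs(original) do
  clone[k] = v
end
setmetatable(clone, getmetatable(original))

defaults.colour = "red"        -- change the fallback table

print(rawget(clone, "colour")) --> nil (pairs never saw it, so it was not copied)
print(clone.colour)            --> red (still found through the shared __index)
print(original.colour)         --> red
```

Whether that sharing is what you want depends on the case, which is rather the point.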

Probably they have left it as something the end user does, who will know the exact sort of copy they need to make in a particular case.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #4 on Fri 20 Jul 2007 10:26 PM (UTC)

Amended on Fri 20 Jul 2007 10:27 PM (UTC) by Nick Gammon

Message
There is another interesting consideration. Table keys can be tables. For example:


key = { foo = "bar", answer = 42 }

t = {
   [key] = "test",
   another = 43,
    }


Now to "deep copy" this (table "t" that is) the "key" table (which is itself a table, but is being used as a key in table "t"), should probably be copied - that is, if you really want a full copy.

But if you do copy it, what name will it have?

I think the answer is that you make yourself some restrictions, for example:


  • Keys will be strings or numbers
  • Values will be strings, numbers, booleans, or sub-tables which do not have cycles, and which themselves conform to these same restrictions


Then you make a copier that copies that sort of table.
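
One way to enforce restrictions like these is to check a table before copying it. The sketch below is only an illustration; `is_simple` and its `seen` argument are hypothetical names. It treats any table reached a second time as a violation, which covers both a table inside itself and a sub-table that is referenced from two places.

```lua
-- Returns true if t obeys the restrictions, or false plus a reason.
local function is_simple(t, seen)
  seen = seen or {}
  if seen[t] then
    return false, "table appears more than once"
  end
  seen[t] = true
  for k, v in pairs(t) do
    local kt, vt = type(k), type(v)
    if kt ~= "string" and kt ~= "number" then
      return false, "bad key type: " .. kt
    end
    if vt == "table" then
      local ok, why = is_simple(v, seen)
      if not ok then return false, why end
    elseif vt ~= "string" and vt ~= "number" and vt ~= "boolean" then
      return false, "bad value type: " .. vt
    end
  end
  return true
end

print(is_simple({ a = 1, b = { c = true } }))  --> true
print(is_simple({ [{}] = "x" }))               --> false   bad key type: table
```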

What I mean by "cycles" is an example like this:


x = { a = 1, b = 2 }

t1 = {
  m = x,
  n = x,
  }

t2 = {
  m = { a = 1, b = 2 },
  n = { a = 1, b = 2 },
  }


Now t1 and t2 both have elements m and n, and in both cases m and n are tables with { a = 1, b = 2 } in them. However t1 shares a common table, whereas t2 has a copy.

If you clone t1 do you get something that looks like t1 again (that is, it refers to the external table x - and if so, is it the same x, or a clone of x?) or do you look up the values of x, and get a table that looks like t2?

I don't necessarily think there is a "right" answer to these questions.
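
The difference between t1 and t2 is easy to see if you run something like the following sketch (same tables as above, just declared local):

```lua
local x = { a = 1, b = 2 }
local t1 = { m = x, n = x }
local t2 = { m = { a = 1, b = 2 }, n = { a = 1, b = 2 } }

print(t1.m == t1.n)   --> true  (one table, two references)
print(t2.m == t2.n)   --> false (two tables with equal contents)

t1.m.a = 100
print(t1.n.a)         --> 100   (n sees the change made through m)

t2.m.a = 100
print(t2.n.a)         --> 1     (n is unaffected)
```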



Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #5 on Sat 21 Jul 2007 12:41 AM (UTC)
Message
This is what I do. I assume the following when copying tables:

(a) keys can be tables, but in that case they just get aliased. Arguably, though, that is a good thing, because you can then use the same key to access both copies of the table, and you won't be modifying the key anyhow.

(b) everything must be safe to copy, that is, you don't have unsafe userdata copying like Nick brought up. (If you did, you would need to do something else.) "Everything" here means both keys and values.

(c) you can copy the table's metatable around, which has been ok thus far for me but might cause trouble in some odd situations.


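-- cache the globals we need; module() without package.seeall hides them
local assert, type, pairs = assert, type, pairs
local setmetatable, getmetatable = setmetatable, getmetatable

module "util"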

function _M.deepCopyTable(t)
    return copyTable(t, "deep")
end

function _M.shallowCopyTable(t)
    return copyTable(t, "shallow")
end

function _M.copyTable(t, depth)
    assert(type(t) == "table", "You must specify a table to copy")
    assert(depth ~= nil, "You must specify either a deep or shallow copy")
    assert(depth == "deep" or depth == "shallow", "depth must be either 'deep' or 'shallow'")

    local result = {}

    for k,v in pairs(t) do
        if type(v) == "table" and depth == "deep" then
            result[k] = _M.copyTable(v,depth)
        else
            result[k] = v
        end
    end

    -- copy the metatable, if there is one
    setmetatable(result, getmetatable(t))
    
    return result
end


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #6 on Sat 21 Jul 2007 01:42 AM (UTC)
Message
That looks nice, but isn't this unnecessarily wordy? :


module "util"

function _M.deepCopyTable(t)
    return copyTable(t, "deep")
end


Isn't that the same as:


module "util"

function deepCopyTable(t)
    return copyTable(t, "deep")
end


The idea of modules is that every variable inside them is relative to the module environment (_M), so you don't need to explicitly state it. That is like putting "this" everywhere inside a C++ class.

I am tempted to rework it a bit like this:


module (..., package.seeall)

local copyTable  -- forward declaration

function deepCopyTable(t)
    return copyTable(t, "deep")
end

function shallowCopyTable(t)
    return copyTable(t, "shallow")
end

function copyTable(t, depth)
    assert(type(t) == "table", "You must specify a table to copy")
    assert(depth ~= nil, "You must specify either a deep or shallow copy")
    assert(depth == "deep" or depth == "shallow", "depth must be either 'deep' or 'shallow'")

    local result = {}

    for k,v in pairs(t) do
        if type(v) == "table" and depth == "deep" then
            result[k] = copyTable(v,depth)
        else
            result[k] = v
        end
    end

    -- copy the metatable, if there is one
    setmetatable(result, getmetatable(t))
    
    return result
end


You need the "seeall" or it doesn't find things like "type".

This version makes the helper function copyTable a local declaration (and thus a private function).
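
The forward declaration trick does not depend on module(). As a rough sketch, with a plain table `M` standing in for the module and made-up function names, a later `function helper` assigns to the local declared earlier rather than creating a global or a module field:

```lua
local M = {}                    -- hypothetical stand-in for the module table

local helper                    -- forward declaration

function M.double_all(t)        -- public: reachable through M
  return helper(t, 2)
end

function helper(t, factor)      -- assigns to the local declared above
  local result = {}
  for k, v in pairs(t) do
    result[k] = v * factor
  end
  return result
end

print(M.double_all({ 1, 2, 3 })[3])  --> 6
print(M.helper)                      --> nil
print(rawget(_G, "helper"))          --> nil
```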


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #7 on Sat 21 Jul 2007 01:46 AM (UTC)
Message
Then I start to worry that we are testing for "deep" and "shallow" for every subtable, which is really a decision that needs to be made once only. So I reworked it a bit more:


module (..., package.seeall)

function deepCopyTable(t)
  assert(type(t) == "table", "You must specify a table to copy")

  local result = {}

  for k,v in pairs(t) do
    if type(v) == "table"  then
        result[k] = deepCopyTable (v)
    else
        result[k] = v
    end
  end

  -- copy the metatable, if there is one
  return setmetatable(result, getmetatable(t))

end

function shallowCopyTable(t)
  assert(type(t) == "table", "You must specify a table to copy")

  local result = {}

  for k,v in pairs(t) do
    result[k] = v
  end

  -- copy the metatable, if there is one
  return setmetatable(result, getmetatable(t))

end



Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #8 on Sat 21 Jul 2007 01:50 AM (UTC)

Amended on Sat 21 Jul 2007 01:51 AM (UTC) by Nick Gammon

Message
By using "..." in the module declaration, the name of the file is also the name of the module. So in my case I saved it as "copytable.lua", and then tested it like this:


require "copytable"
require "tprint"

t1 = {
  m = { a = 1, b = 2 },
  n = { a = 1, b = 2 },
  }

t2 = copytable.deepCopyTable (t1)

tprint (t2)



Of course, if all that file did was copy tables, you could rename the functions as "deep" and "shallow", so the call would simply be:


t2 = copytable.deep (t1)


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #9 on Sat 21 Jul 2007 02:45 AM (UTC)

Amended on Sat 21 Jul 2007 02:54 AM (UTC) by Nick Gammon

Message
An interesting problem with the deep table copy is that you can send it into a loop if a table appears within itself. It recurses to make the new copy, which starts the table again from the start, and eventually you overflow the stack. An example is:


t = copytable.deep (_G)


Since _G is defined inside _G, we get the loop. The modified version below is safer because it keeps an internal table of what nested tables are being copied. A check disallows a recursive copy of a table that we are already in the middle of copying.

This version renames the functions as "deep" and "shallow" for easier use.





-- copytable.lua

--[[

Table copying functions. 

See: http://www.gammon.com.au/forum/?id=8042

Ideas by Shaun Biggs, David Haley, Nick Gammon

Date: 21st July 2007

This is intended to copy tables (make a real copy, rather than just the table
reference).

You can do a deep or shallow copy.

Shallow: Simply copies the keys and values. 
If a value is a table, you will get the same table as in the original.

Deep: Copies keys and values recursively.
If a value is a table, makes a copy of that table, and so on.

Restrictions: Items must be "safe" to copy (eg. not file IO userdata). 
If doing a deep copy, a table may not appear inside itself.

Example:

t1 = {
  m = { a = 1, b = 2 },
  n = { c = 3, d = 4 },
  }

require "copytable"  -- load this file

t2 = copytable.shallow (t1)  -- shallow copy
t3 = copytable.deep (t1)  -- copies sub tables as well

--]]

module (..., package.seeall)

function deep (t, done)
  assert (type (t) == "table", "You must specify a table to copy")

  local result = {}
  done = done or {}

  assert (not done [t], "table is defined within itself")

  done [t] = true -- we will disallow recopying table t

  for k, v in pairs (t) do
    if type (v) == "table"  then
      result [k] = deep (v, done)
    else
      result [k] = v
    end -- if table or not
  end -- for each table element

  done [t] = nil -- table t can be copied again

  -- copy the metatable
  return setmetatable (result, getmetatable (t))

end -- function deep

function shallow (t)
  assert (type (t) == "table", "You must specify a table to copy")
    
  local result = {}

  for k, v in pairs (t) do
    result [k] = v
  end -- for each table element

  -- copy the metatable
  return setmetatable (result, getmetatable (t))

end  -- function shallow
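
To see what the "done" check does and does not reject, here is a condensed standalone sketch of the same deep copy, written as a local function so it does not need module(). A sub-table that is merely shared is accepted, because its entry is cleared once it has been copied; a table that contains itself trips the assertion.

```lua
local function deep(t, done)
  done = done or {}
  assert(not done[t], "table is defined within itself")
  done[t] = true                -- t is now being copied
  local result = {}
  for k, v in pairs(t) do
    if type(v) == "table" then
      result[k] = deep(v, done)
    else
      result[k] = v
    end
  end
  done[t] = nil                 -- t may be copied again elsewhere
  return setmetatable(result, getmetatable(t))
end

local shared = { a = 1 }
local ok1 = pcall(deep, { m = shared, n = shared })  -- shared, but no cycle

local loop = {}
loop.self = loop                                     -- a real cycle
local ok2, err = pcall(deep, loop)

print(ok1)   --> true
print(ok2)   --> false
print(err)   -- the assertion message
```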


Posted by David Haley   USA  (3,881 posts)  Bio
Date Reply #10 on Sat 21 Jul 2007 03:12 AM (UTC)
Message
Ah, yes, I forgot the assumption of not containing cycles. I meant to write that, actually...

Quote:
The idea of modules is that every variable inside them is relative to the module environment (_M), you don't need to explicitly state it.

By using _M in front of functions that are meant for the public, it is quickly clear to me what is what. A function that is declared local up at the top won't be obviously local when redeclared further down. So, my convention is that a function meant for the public is _M and a function meant for just the module will be local, and later declared without a _M.

It's wordy for small files like a table copying, but it makes things really easy when you are designing large classes for OOP type stuff, where you have more than just a few functions lying around.

Quote:
You need the "seeall" or it doesn't find things like "type".


Indeed; I didn't paste all that stuff because it's just a heap of locals at the beginning that I thought were unnecessary obfuscation for pasting the code.

I don't like using seeall for a small self-contained module because I don't necessarily want to inherit the global environment for indexing. That is a very big deal when defining objects. If you are writing an object module, and you use seeall, then if you try to get a field of the object that doesn't exist (i.e. is nil) it'll go into _G to get it. If you are then also using the 'strict' module, you will get an error if it doesn't exist. And if it *does* exist, you are not getting what you think you are getting.
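
A sketch of the effect, without calling module() itself: package.seeall gives the module table a metatable whose __index is the global environment, and the code below sets up the same thing by hand. Here `obj` is a hypothetical module table doubling as an object, and `colour` is assumed not to exist as a global.

```lua
-- Emulate what package.seeall does to a module table
local obj = setmetatable({ name = "widget" }, { __index = _G })

print(obj.name)             --> widget
print(obj.colour)           --> nil (no such field, and no such global)
print(obj.print == print)   --> true: a missing field falls through to _G
```

So a typo in a field name can quietly hand you a global instead of nil.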

The main reason I don't use '...' but actually name the module is because it's meant to be a library-type thing that sits around and is called by lots of people; its name is essentially static so I figure I might as well just name it. I don't feel terribly strongly about this, though, unlike the 'seeall' problems I describe above.

Quote:
This version makes the helper function copyTable a local declaration (and thus a private function).

I suppose you could do that. I left it public so that people could call it directly if they wanted to. I think it was for historic reasons: at first, I didn't have the helper interfaces so all my calls were to copyTable; had I made it private it would have required going through and changing an awful lot of code, which I didn't feel like doing.

Quote:
This version renames the functions as "deep" and "shallow" for easier use.

I don't do that because what I pasted is actually just a subset of a whole utility file with various other routines; it wouldn't make sense for me to have just 'deep' and 'shallow' in there.

Quote:
Then I start to worry that we are testing for "deep" and "shallow" for every subtable, which is really a decision that needs to be made once only.

Yes, I thought about doing that, but I figured it didn't matter a whole lot (for now) since string checks are essentially free in Lua. And my tables were fairly "shallow" in depth, so it wasn't a big issue. But it is something I was thinking of getting around to doing, but now you've done it for me. :)


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #11 on Sat 21 Jul 2007 04:26 AM (UTC)
Message
It wasn't really a criticism, just nitpicking. ;)

Look at this for example:


for k,v in pairs(t) do
    if type(v) == "table" and depth == "deep" then
        result[k] = copyTable(v,depth)
    else
        result[k] = v
    end
end


Based on the if check, it may as well read:


for k,v in pairs(t) do
    if type(v) == "table" and depth == "deep" then
        result[k] = copyTable(v, "deep")
    else
        result[k] = v
    end
end


I guessed you were posting part of a larger utils file, but once I started working on it as a standalone "copy table" file, I started mucking around with it.


Posted by Shaun Biggs   USA  (644 posts)  Bio
Date Reply #12 on Sat 21 Jul 2007 05:57 AM (UTC)
Message
heh, this got a bit deeper than I had meant it to be. I was testing data display using tprint, and even found an odd workaround with serialize. You can just substitute certain portions of data within the serialized string if you are EXACTLY sure of what you are starting with. Just a "\n%s*(oldranks)%." for the example I posted below.

Just to display a very small section of data (this repeats out to about 60 people), I have a quick example of my ranking at the start of a clan competition, but other data sets I'm building have 3-4 more layers of depth.

<variables>
  <variable name="oldranks">oldranks = {}
  oldranks.Balaam = {}
    oldranks.Balaam.kills = "33"
    oldranks.Balaam.pups = "14726"
    oldranks.Balaam.levels = "4221"
    oldranks.Balaam.wars = "24"
    oldranks.Balaam.qp = "332067"
</variable>
</variables>
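
A rough sketch of that kind of substitution, as a guess at how it might be done rather than the exact code in use. It assumes serialized text in the style shown above (shortened here), renames the variable at the start of each line to a made-up name `newranks`, and loads the result, which gives an independent copy. The pattern is a slight variation on the one quoted earlier so that the leading whitespace is kept.

```lua
-- Serialized text in the style shown above (shortened)
local serialized = table.concat({
  'oldranks = {}',
  '  oldranks.Balaam = {}',
  '    oldranks.Balaam.kills = "33"',
  '    oldranks.Balaam.qp = "332067"',
}, "\n")

-- Rename the variable at the start of every line; 'newranks' is hypothetical.
-- A newline is prepended so the first line matches the same pattern.
local renamed = ("\n" .. serialized):gsub("(\n%s*)oldranks", "%1newranks")

local compile = loadstring or load   -- loadstring in Lua 5.1, load in 5.2+
assert(compile(renamed))()           -- creates the global 'newranks'

print(newranks.Balaam.kills)         --> 33
```

As noted, this is only safe when you know exactly what the serialized text looks like; a value that happened to contain a matching line would be rewritten too.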


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #13 on Sat 21 Jul 2007 07:08 AM (UTC)
Message
If you use the newly published serialize.save_simple, the results look like this:


{
  Balaam = {
    levels = "4221",
    pups = "14726",
    kills = "33",
    wars = "24",
    qp = "332067",
    },
  } 


Since the names aren't embedded all through the table, it is easy to copy and paste it to make a duplicate.


Posted by Nick Gammon   Australia  (23,122 posts)  Bio   Forum Administrator
Date Reply #14 on Sat 28 Jul 2007 06:31 AM (UTC)
Message
An interesting variation on the deep copy idea is presented at the Lua Wiki site:

http://lua-users.org/wiki/CopyTable

The nice thing about this function is it copies any object (including tables), and rather than looping or giving an error on a nested table, simply includes the same table again.

In other words, you can "deep copy" the _G table, because when it recursively refers to itself, the copy gets a reference to the copy.


function deepcopy(object)
    local lookup_table = {}
    local function _copy(object)
        if type(object) ~= "table" then
            return object
        elseif lookup_table[object] then
            return lookup_table[object]
        end  -- if
        local new_table = {}
        lookup_table[object] = new_table
        for index, value in pairs(object) do
            new_table[_copy(index)] = _copy(value)
        end  -- for
        return setmetatable(new_table, getmetatable(object))
    end  -- function _copy
    return _copy(object)
end  -- function deepcopy


The author comments: "But the 2 tables share the same metatable, you can avoid this by changing this 'getmetatable(object)' to '_copy( getmetatable(object) )'.".
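
A quick check of that behaviour, using the same function (repeated here as a local so the sketch runs on its own) and a made-up test table that contains both a shared sub-table and a reference to itself:

```lua
local function deepcopy(object)
  local lookup_table = {}
  local function _copy(object)
    if type(object) ~= "table" then
      return object
    elseif lookup_table[object] then
      return lookup_table[object]
    end
    local new_table = {}
    lookup_table[object] = new_table
    for index, value in pairs(object) do
      new_table[_copy(index)] = _copy(value)
    end
    return setmetatable(new_table, getmetatable(object))
  end
  return _copy(object)
end

local shared = { a = 1 }
local t = { m = shared, n = shared }
t.self = t                 -- the table contains itself

local c = deepcopy(t)

print(c ~= t)              --> true (a new table)
print(c.self == c)         --> true (the cycle points at the copy)
print(c.m == c.n)          --> true (sharing is preserved inside the copy)
print(c.m ~= shared)       --> true (but it is not the original sub-table)
```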



The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

This is page 1, subject is 2 pages long: 1 2  [Next page]
