Gammon Software Solutions forum

Linux, file sizes

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Wed 15 Jun 2005 07:02 PM (UTC)
Message
I've been looking over this for a while. And I'm still confused. This is about my /quests dir.

-According to "du" it's 900. (900 KB)
-According to "du -h" it's 900K. (Same)
-According to "du -b" it's 13908. (bytes)
-According to FTP, right click->Properties, it's... 5 KB.

Can someone explain to me which is correct? I was afraid it was getting too large for the quests we have in, then thought I was wrong, then looked at it again and began panicking again, then double checked and was relaxed. Then looked again. And repeat. I'm really confused. I had asked before on a *nix forum, but did not understand what they said. Or at least they were explaining something other than what I wanted to understand.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)   Moderator
Date Reply #1 on Wed 15 Jun 2005 09:25 PM (UTC)
Message
The best way to get the size of a directory is to cd into it, and then type "du -sh ." That will give you the space used on disk by your directory and its contents. If I were you, I would not trust the FTP client. The system knows better.

Recall that du -b prints out the apparent size, meaning the byte count of the file contents. It does not take into account such things as cluster (block) allocation and filesystem overhead. If you have lots of small files, each of which takes up at least one whole cluster, then the actual size taken on disk will be much larger.
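The gap between apparent size and space used on disk can also be seen from a script. Here is a minimal Python sketch, assuming a Unix-like system where os.stat reports st_blocks in 512-byte units. The function name dir_sizes and the sample file names are made up for illustration, and unlike du it only looks at the files directly inside the directory, not the directory entry itself.

```python
import os
import tempfile


def dir_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for regular files directly in path."""
    apparent = 0
    allocated = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                info = entry.stat(follow_symlinks=False)
                apparent += info.st_size           # byte count of the contents (what du -b adds up)
                allocated += info.st_blocks * 512  # space reserved on disk (Unix only)
    return apparent, allocated


# Demo with three tiny, hypothetical quest files in a scratch directory.
with tempfile.TemporaryDirectory() as tmp:
    for name, size in [("alice.quest", 17), ("bob.quest", 84), ("carol.quest", 20)]:
        with open(os.path.join(tmp, name), "wb") as f:
            f.write(b"x" * size)
    apparent, allocated = dir_sizes(tmp)

print(apparent, "bytes of content,", allocated, "bytes allocated on disk")
```

The allocated figure depends on the file system, so it will differ from machine to machine. Calling dir_sizes on the real quests directory gives both numbers side by side.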

How many files are we talking about here?

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Reply #2 on Wed 15 Jun 2005 09:35 PM (UTC)

Amended on Wed 15 Jun 2005 09:37 PM (UTC) by Zeno

Message
"du -sh" gives me 916K. There are 227 files in the quest dir. According to "du -ha" each file is 4.0K, which makes no sense. Which is why I'm still confused. I'm really hoping the directory is not almost 1 MB. Each file should be a different size because not all players have done the same number of quests.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)   Moderator
Date Reply #3 on Wed 15 Jun 2005 10:54 PM (UTC)
Message
227 * 4k = 908k. Makes sense to me that with overhead, that'll be 916k.

As for every file taking up at least 4k, that is the file system overhead that I was talking about. Recall that if a file needs even just 1 byte, it still has to take up an entire disk cluster, not to mention all the overhead needed to store it in the file system.
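The rounding can be worked through in a few lines. This is only a sketch of the arithmetic, assuming a 4096-byte cluster as reported in this thread. The name allocated_size is made up for illustration, and it ignores directory and inode overhead.

```python
BLOCK_SIZE = 4096  # assumed cluster (block) size in bytes


def allocated_size(apparent_bytes, block_size=BLOCK_SIZE):
    """Round a file's byte count up to a whole number of clusters."""
    blocks = -(-apparent_bytes // block_size)  # ceiling division
    return blocks * block_size


# 227 files of 84 bytes each, the figures used in this thread.
file_sizes = [84] * 227
total_apparent = sum(file_sizes)
total_allocated = sum(allocated_size(size) for size in file_sizes)

print(total_apparent, "bytes of content")            # 19068
print(total_allocated // 1024, "KB taken on disk")   # 908
```

A file of 4096 bytes still fits in one cluster, while a file of 4097 bytes needs two.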

What do you get if you do du -ab?

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Reply #4 on Thu 16 Jun 2005 01:07 AM (UTC)
Message
The math part made sense, but the 4 KB file sizes/total size did not.

Yeah, I checked -ab before; some were 84, others were 17, etc. So let's say each file is 84 bytes. 227*84=19068. 19068 bytes/1024 =~ 18.6 KB. This is where I'm confused. It says 908 KB while it can't be any more than 18 KB.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Meerclar   USA  (584 posts)
Date Reply #5 on Thu 16 Jun 2005 03:18 AM (UTC)
Message
Unless your file system is set up with a different cluster (block) size, your minimum cluster size will typically be ~4k regardless of the actual size of the files. When you check du -ab it's reporting back the apparent byte count of each file, not the cluster usage (which is what plain du -a shows). For small files like these, the byte count will be significantly lower than the cluster usage indicates, because every file takes up at least one whole cluster and the file system is highly inefficient at storing lots of tiny files.

Meerclar - Lord of Cats
Coder, Builder, and Tormenter of Mortals
Stormbringer: Rebirth
storm-bringer.org:4500
www.storm-bringer.org

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Reply #6 on Thu 16 Jun 2005 03:24 AM (UTC)
Message
So which file size is correct and which total file directory size is correct?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)   Moderator
Date Reply #7 on Thu 16 Jun 2005 07:10 AM (UTC)
Message
du -sh will give you the actual space used on disk, including overhead and everything. In other words, if du -sh says it's 900k, then it's taking up 900k.

du -ba and du -b do the same thing as far as sizes are concerned, btw: with GNU du, both report the apparent size in bytes, i.e. the byte count of the file contents. It's du without -b that reports the actual space usage; -B only changes the block size used to display the numbers.

In other words, if you want to be storage efficient, you should group multiple players into one large file. That way, the cluster size penalty won't be nearly as bad. In your system, every 20 bytes is taking up a whopping 4k. You could fit 200 times 20 bytes into 4k...

If you don't feel like writing the code to manage grouped files, it might be worth your time to get a library that does table storage for you, such as BerkeleyDB. Your system might even have BerkeleyDB already installed - a lot of programs use it for its flat-file storage functionality.

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Reply #8 on Thu 16 Jun 2005 03:48 PM (UTC)
Message
I see. So now it doesn't matter how many quests are in the game since each file will be at 4k unless we have so many quests it goes past 4k, right? It just matters how many quest files there are. And there can only be as many quest files as there are player files, which means it'll never go past the total size of the pfile dir. BerkeleyDB eh? What would it do? Store the files into one file? Or something else?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)   Moderator
Date Reply #9 on Thu 16 Jun 2005 08:27 PM (UTC)
Message
That's right. No matter how small the file is, it'll take up 4k. And if you ever have 4k+1 bytes of actual size, the file will take up 8k. This doesn't matter a lot for large files (what is 4k when you already have 10 MB?) but it really kills on the smaller files like the ones you have.

BerkeleyDB provides a simple table storage mechanism. You define key-value pairs. It's not a 'database' in the SQL sense of the term. That is, you can't have multiple columns in your tables, etc. You just map a unique key to a value.

In your case, the key could be the player name, and the value could be the quest data. You could even, conceivably, store the file you're currently writing to disk in string form in the database; you would load that string and parse it using the code you already have.
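To give a rough idea of what the key-value approach looks like, here is a minimal sketch in Python. It uses the dbm.dumb module from Python's standard library as a stand-in, not BerkeleyDB's own API, and the function names, database name and quest strings are all made up for illustration.

```python
import dbm.dumb
import os
import tempfile


def save_quests(db_path, player, quest_text):
    """Store one player's quest data under the player's name."""
    with dbm.dumb.open(db_path, "c") as db:  # "c" creates the database if needed
        db[player.lower().encode("utf-8")] = quest_text.encode("utf-8")


def load_quests(db_path, player):
    """Return the stored quest text, or an empty string for an unknown player."""
    with dbm.dumb.open(db_path, "c") as db:
        return db.get(player.lower().encode("utf-8"), b"").decode("utf-8")


# Demo in a scratch directory; a real MUD would use a fixed path instead.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "quests")
    save_quests(db_path, "Zeno", "rescue_kitten\nfind_sword\n")
    restored = load_quests(db_path, "zeno")
    missing = load_quests(db_path, "nobody")

print(repr(restored))
print(repr(missing))
```

The value is just the text that would otherwise go into the per-player file, so the existing parsing code can be reused on the string that comes back. All players end up in a handful of database files instead of one 4k cluster each.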

The advantage to using BerkeleyDB for you would be that you wouldn't get the massive overhead of storing lots of small files. In exchange, however, you'll have to deal with the BerkeleyDB stuff, making sure you open/close the file correctly. I remember from using BerkeleyDBXML (an actual XML database built around BerkeleyDB) that it wasn't very tolerant of crashes; if the database wasn't closed properly you couldn't open it without doing some voodoo on it first. BerkeleyDB is supposed to be fairly lightweight; I've seen it used in many places but have never used it myself. My suggestion would be to check it out and see if it seems reasonable.

If that doesn't work, my suggestion would be to either find another library, or write some code that stores quest data in groups. For example, all players whose names start with 'a' could be in one file. That will help a great deal with your overhead problem.
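Grouping by first letter could look something like the following Python sketch. The on-disk format (one "name:quest1,quest2" line per player), the ".quests" file names and the function names are assumptions for illustration only. It also assumes player names never contain ':' and quest names never contain ','.

```python
import os
import tempfile


def group_file(directory, player):
    """Path of the shared file for all players with the same first letter."""
    first = player[0].lower()
    bucket = first if first.isalpha() else "_"
    return os.path.join(directory, bucket + ".quests")


def load_group(path):
    """Read one group file into a dict of player name -> list of quest names."""
    players = {}
    if not os.path.exists(path):
        return players
    with open(path) as f:
        for line in f:
            name, _, quests = line.rstrip("\n").partition(":")
            players[name] = quests.split(",") if quests else []
    return players


def save_player(directory, player, quests):
    """Rewrite the player's group file with this player's quests updated."""
    path = group_file(directory, player)
    players = load_group(path)
    players[player] = list(quests)
    with open(path, "w") as f:
        for name in sorted(players):
            f.write(name + ":" + ",".join(players[name]) + "\n")


# Demo: three players end up in two files instead of three.
with tempfile.TemporaryDirectory() as tmp:
    save_player(tmp, "Alice", ["rescue_kitten"])
    save_player(tmp, "Aaron", ["find_sword", "slay_dragon"])
    save_player(tmp, "Bob", [])
    files = sorted(os.listdir(tmp))
    a_group = load_group(group_file(tmp, "alice"))

print(files)
print(a_group)
```

With at most 27 group files, the cluster penalty is paid per group rather than per player. The trade-off is that saving one player rewrites the whole group file.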

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Zeno   USA  (2,871 posts)   Moderator
Date Reply #10 on Thu 16 Jun 2005 09:53 PM (UTC)
Message
Well, this was the snippet (I think from MudMagic) that has the questcomplete mprogs etc. I don't feel ready to attempt any sort of table storage or anything like that. I think it's best to leave it how it is now. Do you have any idea how many lines (one word per line) it would take to go past 4 KB?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)   Moderator
Date Reply #11 on Thu 16 Jun 2005 10:30 PM (UTC)
Message
4k is 4096 bytes. Assuming an average word length of 5 letters, plus one for the newline, each line takes 6 bytes, so you would need 683 lines to go past 4k (682 lines come to 4092 bytes, which still fits).
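For other word lengths, the estimate can be redone with a short calculation. This Python sketch assumes one word per line and a one-byte newline; the function name lines_to_exceed is made up for illustration.

```python
CLUSTER = 4096  # bytes in one 4k cluster


def lines_to_exceed(limit, avg_word_len, newline_len=1):
    """Smallest number of lines whose total size is strictly greater than limit."""
    bytes_per_line = avg_word_len + newline_len
    return limit // bytes_per_line + 1


needed = lines_to_exceed(CLUSTER, avg_word_len=5)
print(needed, "lines of 5-letter words to go past", CLUSTER, "bytes")
```

Longer quest names bring the threshold down: with 9-letter words each line takes 10 bytes, so far fewer lines fit in a cluster.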

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
