Definitely looks like time for "old school" debugging: lots of log messages sprinkled about. Start general: a log message at the beginning of game_loop, one between the descriptor input and the descriptor output, one each at the beginning and end of update_handler, and one at the end of game_loop. Then when it just up and dies, if you haven't killed the thing with 3 days' worth of log spam, you can begin to narrow down the search with log messages in more specific parts. Pain in the ass, but some bugs just demand this kind of attention.
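A rough sketch of that checkpoint logging, in case it helps. Everything here is a stand-in: trace_to, game_loop_once and the three empty phase functions are hypothetical names, not the real SMAUG code, and the phase order is only an assumption about how the loop is laid out. The one part that matters is the fflush after every line, so the last checkpoint written before the crash actually reaches the disk.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical trace helper: writes one timestamped line and flushes
   right away, so the final checkpoint survives a crash. */
static void trace_to(FILE *fp, const char *where)
{
    fprintf(fp, "%ld %s\n", (long)time(NULL), where);
    fflush(fp);
}

/* Empty stand-ins for the real phases; the names are placeholders. */
static void process_input(void)  {}
static void update_handler(void) {}
static void process_output(void) {}

/* One pass through the loop with a checkpoint around each phase. */
static void game_loop_once(FILE *fp)
{
    trace_to(fp, "game_loop: begin");
    process_input();
    trace_to(fp, "game_loop: input done");
    trace_to(fp, "update_handler: begin");
    update_handler();
    trace_to(fp, "update_handler: end");
    process_output();
    trace_to(fp, "game_loop: end");
}
```

After a crash, the last line in the file names the phase that never finished. To keep the log from growing for 3 days, one option is to rewind and overwrite the same file on every pass, so only the most recent pass is kept.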
SmaugMuds.org: http://www.smaugmuds.org - The Smaug MUDs Community Center
"The past was erased, the erasure was forgotten, the lie became truth." -- George Orwell, 1984
The memory quota? I have this installed on my local machine. Is there really a memory quota included, or is it normally set by the admin? Because I haven't set it.
Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org
It's weird that it's happening so regularly after 3 days. Any chance of an overflow somewhere on a timer?
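One way to check the overflow idea is plain arithmetic: for each counter in the code, divide its maximum value by how fast it is incremented and see whether anything lands near 3 days. A small sketch, where the helper names and the example rates are assumptions for illustration and not values taken from the codebase:

```c
#include <stdint.h>

/* Days until a counter that tops out at max_value wraps, when it is
   incremented ticks_per_second times each second. */
static double days_until_wrap(double max_value, double ticks_per_second)
{
    return max_value / ticks_per_second / 86400.0;
}

/* Reverse question: how fast would a counter with this maximum have to
   tick in order to wrap after the given number of days? */
static double rate_for_wrap(double max_value, double days)
{
    return max_value / (days * 86400.0);
}
```

For instance, days_until_wrap(INT32_MAX, 1000.0) is a little under 25 days, so a signed 32-bit millisecond counter would not explain a 3-day period, while rate_for_wrap(INT32_MAX, 3.0) shows that such a counter would need to tick roughly 8,285 times per second to wrap after 3 days.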
As for hostnames: it's possible that the feature doesn't quite work, or that you have to type 'users' to see the hostnames because they are still being resolved. But the point of the feature is to prevent the entire MUD from hanging if the DNS lookup hangs. Generally that happens if the DNS resolver is having issues.
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
OK, it looks like you create a thread called "thread_lookup"; I imagine you are using that to do the DNS lookups so that the whole MUD doesn't hang if the DNS is slow? That's something I've been meaning to do for a while.
Anyhow, have you observed any correlation at all between someone connecting and the MUD crashing a short time later?
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
Nick's right about the local variables being smashed. Valgrind would help you with that.
As for pthread, it should be easy to find out where you're using it... just grep for 'pthread'. (And I actually meant what you're linking with at link time i.e. what -l flags you send to g++ at link phase.)
David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone
A crash that does not produce helpful information can often be caused by a badly corrupted stack. If the stack itself is wiped out, then doing a backtrace can be a problem.
Wiping the stack isn't that hard; overrunning a local variable can easily do it (that is, a memset or strcpy that writes into more space than was allocated).
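As an illustration of the kind of fix involved, here is a minimal sketch of a bounded copy that can stand in for a strcpy into a local buffer. The name copy_bounded is hypothetical; the only library call it relies on is the standard snprintf, which never writes more than the size it is given and always NUL-terminates when that size is non-zero.

```c
#include <stdio.h>
#include <string.h>

/* Copies src into dst, which holds dst_size bytes, always leaving dst
   NUL-terminated. Returns 1 if the whole string fit, 0 if it had to be
   truncated (or if dst_size is 0 and nothing could be written). */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int n;

    if (dst_size == 0)
        return 0;
    /* snprintf returns the length it wanted to write, not counting the
       terminating NUL, so n >= dst_size means truncation. */
    n = snprintf(dst, dst_size, "%s", src);
    return n >= 0 && (size_t)n < dst_size;
}
```

Called as copy_bounded(buf, sizeof buf, input) on a local array, an oversized input gets cut short and reported instead of running past the end of buf and into the rest of the stack frame. Logging the 0 return is a cheap way to find which caller is passing strings that are too long.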