talk_channel crash

It is now over 60 days since the last post. This thread is closed.


Posted by Zeno   USA  (2,871 posts)
Date Mon 07 Feb 2005 07:19 PM (UTC)
Message
I can't seem to figure out why this crashed, and if it's a Smaug bug or not.

#0  talk_channel (ch=0x9c65a58, argument=0xbff61c75 "I'm on a roll today", channel=2, verb=0x64 "") at act_comm.c:488
488             och = d->original ? d->original : d->character;
(gdb) bt
#0  talk_channel (ch=0x9c65a58, argument=0xbff61c75 "I'm on a roll today", channel=2, verb=0x64 "") at act_comm.c:488
#1  0x0805a7d0 in do_chat (ch=0x9a0838, argument=0xbff61c75 "I'm on a roll today") at act_comm.c:673
#2  0x080e6044 in interpret (ch=0x9c65a58, argument=0xbff61c75 "I'm on a roll today") at interp.c:577
#3  0x080bb9fa in game_loop () at comm.c:687
#4  0x080bb29c in main (argc=8, argv=0xbff620b0) at comm.c:316


Both d->original and d->character are null, but I don't know what it's expecting.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)
Date Reply #1 on Mon 07 Feb 2005 08:29 PM (UTC)
Message
Where's the crash? What's the value of d?

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Zeno   USA  (2,871 posts)
Date Reply #2 on Mon 07 Feb 2005 08:42 PM (UTC)

Amended on Mon 07 Feb 2005 08:43 PM (UTC) by Zeno

Message
Well, this is the core dump from the crash. The crash is in talk_channel.

(gdb) p *d
$1 = {next = 0x0, prev = 0x0, snoop_by = 0x0, character = 0x0, original = 0x0,
  host = 0x0, port = 0, descriptor = 0, connected = 0, idle = 0, lines = 0,
  scrlen = 0, fcommand = 0 '\0', inbuf = '\0' <repeats 1023 times>,
  incomm = '\0' <repeats 1023 times>, inlast = '\0' <repeats 1023 times>,
  repeat = 0, outbuf = 0x0, outsize = 0, outtop = 0, pagebuf = 0x0,
  pagesize = 0, pagetop = 0, pagepoint = 0x0, pagecmd = 0 '\0',
  pagecolor = 0 '\0', user = 0x0, newstate = 0, prevcolor = 0 '\0'}


Er, that's not good...

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by David Haley   USA  (3,881 posts)
Date Reply #3 on Tue 08 Feb 2005 01:08 AM (UTC)
Message
Err, no, it's not. :P But I meant, what is the value of d, not the value of what it points to. I suppose that if it let you dereference it, it couldn't be null, though.

But what's really weird is that if the pointer points to some bogus memory (but does point somewhere) dereferencing it should not cause anything to crash.

David Haley aka Ksilyan
Head Programmer,
Legends of the Darkstone

http://david.the-haleys.org

Posted by Samson   USA  (683 posts)
Date Reply #4 on Tue 08 Feb 2005 01:22 AM (UTC)
Message
Zeno,

Your gdb output indicates you have a cleaned out descriptor that hasn't been unlinked somehow. Though I can't see why this would be, clearly it happened.

Not only is it cleaned out, it looks like it was also zero-initialized in the process, almost as if it were a fresh one that had just been calloc'd.

Maybe you have an improperly loaded linkdead character trying to use a channel?

Posted by Zeno   USA  (2,871 posts)
Date Reply #5 on Tue 08 Feb 2005 01:46 AM (UTC)
Message
Not sure, I wasn't on at the time. Let's see, the core was dumped at "Feb 7 01:52". The logs are this:

Mon Feb  7 01:52:37 2005 :: [*****] BUG: Buffer overflow. Closing (Takashe).
Mon Feb  7 01:52:37 2005 :: Closing link to Takashe.


Mmmm, that looks like the problem. But I don't see why... Did it force the character to close the link twice?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Samson   USA  (683 posts)
Date Reply #6 on Tue 08 Feb 2005 02:00 AM (UTC)
Message
No, but the log does make sense if you've missed a bug fix for pager overflows. The exact fix is on the mudplanet.org site but basically what happened is you closed the link before sending the bug message when it should be closed after. I know it seems silly but that's probably all it is.

Posted by Zeno   USA  (2,871 posts)
Date Reply #7 on Tue 08 Feb 2005 02:06 AM (UTC)
Message
While my SmaugFUSS version isn't exactly new, the fix is in the code.
    while ( d->outtop + length >= d->outsize )
    {
        if (d->outsize > 32000)
        {
            /* empty buffer */
            d->outtop = 0;
            /* Bugfix by Samson - moved bug() call up */
            bug("Buffer overflow. Closing (%s).", d->character ? d->character->name : "???" );
            close_socket(d, TRUE);
            return;
        }
        d->outsize *= 2;
        RECREATE( d->outbuf, char, d->outsize );


Anywhere to go from here?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Zeno   USA  (2,871 posts)
Date Reply #8 on Sat 12 Feb 2005 05:00 PM (UTC)

Amended on Sat 12 Feb 2005 05:02 PM (UTC) by Zeno

Message
Actually, here's a reason, it seems.
#0  to_channel (argument=0x0, channel=512, verb=0x81afb4d "Log", level=107) at act_comm.c:637
637             och = d->original ? d->original : d->character;
(gdb) bt
#0  to_channel (argument=0x0, channel=512, verb=0x81afb4d "Log", level=107) at act_comm.c:637
#1  0x080cbfbc in log_string_plus (str=0x8242340 "Closing link to Zeno.", log_type=0, level=107) at db.c:4213
#2  0x0816a5a5 in fread_pfile (fp=0x917c6f0, tdiff=265, fname=0xbff69ce0 "../player/k/Khalam", count=0 '\0') at pfiles.c:281
#3  0x0816a7de in read_pfile (dirname=0xbff69e20 "../player/k", filename=0x917eb8f "Khalam", count=0 '\0') at pfiles.c:360
#4  0x0816a90c in pfile_scan (count=0 '\0') at pfiles.c:413
#5  0x0816aaec in do_pfiles (ch=0x916caa8, argument=0xbff69ec0 "tar -cf ../player/pfiles.tar ../player/*") at pfiles.c:472
#6  0x080e6f90 in interpret (ch=0x916caa8, argument=0xbff6bb56 "") at interp.c:577
#7  0x080bc857 in game_loop () at comm.c:688
#8  0x080bc0f9 in main (argc=8, argv=0xbff6bf90) at comm.c:317


I installed the pfiles snippet and used the pfiles command. It forced me linkdead (?) and there was a crash.

(gdb) p *d
$2 = {next = 0x0, prev = 0x0, snoop_by = 0x0, character = 0x0, original = 0x0, host = 0x0, port = 0, descriptor = 0,
  connected = 0, idle = 0, lines = 0, scrlen = 0, fcommand = 0 '\0', inbuf = '\0' <repeats 1023 times>,
  incomm = '\0' <repeats 1023 times>, inlast = '\0' <repeats 1023 times>, repeat = 0, outbuf = 0x0, outsize = 0,
  outtop = 0, pagebuf = 0x0, pagesize = 0, pagetop = 0, pagepoint = 0x0, pagecmd = 0 '\0', pagecolor = 0 '\0', user = 0x0,
  newstate = 0, prevcolor = 0 '\0'}
(gdb) p d
$3 = (DESCRIPTOR_DATA *) 0x18


This looks like a bug: if someone goes linkdead and a channel is used, it crashes.

    for ( d = first_descriptor; d; d = d->next )
    {
        CHAR_DATA *och;
        CHAR_DATA *vch;

        och = d->original ? d->original : d->character;
        vch = d->character;


So if both d->original and d->character are null, should I do this:
        if ( !d->character && !d->original )
            continue;


To prevent this?
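
A minimal sketch of the proposed guard, using cut-down stand-in structs rather than SMAUG's real CHAR_DATA and DESCRIPTOR_DATA. The struct layouts and the function name count_channel_listeners are assumptions for illustration only. It shows the loop skipping any descriptor that has neither a character nor an original attached.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for SMAUG's CHAR_DATA and DESCRIPTOR_DATA. */
typedef struct char_data
{
    const char *name;
} CHAR_DATA;

typedef struct descriptor_data
{
    struct descriptor_data *next;
    CHAR_DATA *character;
    CHAR_DATA *original;
} DESCRIPTOR_DATA;

/* Count the descriptors a channel message would be delivered to,
 * skipping any descriptor with no character attached. */
static int count_channel_listeners( DESCRIPTOR_DATA *first )
{
    DESCRIPTOR_DATA *d;
    int count = 0;

    for ( d = first; d; d = d->next )
    {
        CHAR_DATA *och;

        /* The proposed guard: nothing to talk to on this descriptor. */
        if ( !d->character && !d->original )
            continue;

        och = d->original ? d->original : d->character;
        assert( och != NULL );   /* holds because of the guard above */
        count++;
    }
    return count;
}
```

A guard like this only helps when d itself points at a valid descriptor. With d printed as 0x18, as in the gdb output above, reading d->character would most likely fault before the check could do anything.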

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Samson   USA  (683 posts)
Date Reply #9 on Sat 12 Feb 2005 05:19 PM (UTC)
Message
The first thing I'd check before you get too far along is whether you have set your pfile delete times to something other than zero. The only time I've ever seen the code throw someone offline when using it is if the times have not been set and it proceeded to delete *ALL* of the pfiles.

Then I would check your close_socket function for the secondary bug and see why it's not unlinking the NULL descriptor like it should. It appears to be properly deallocating it, but it's not being taken out of the list.

Posted by Zeno   USA  (2,871 posts)
Date Reply #10 on Sat 12 Feb 2005 05:47 PM (UTC)

Amended on Sat 12 Feb 2005 06:42 PM (UTC) by Zeno

Message
In close_socket...
    if ( !DoNotUnlink )
    {
        /* make sure loop doesn't get messed up */
        if ( d_next == dclose )
          d_next = d_next->next;
        UNLINK( dclose, first_descriptor, last_descriptor, next, prev );
    }

But DoNotUnlink was false:
Sat Feb 12 09:56:45 2005 :: [*****] BUG: Buffer overflow. Closing (Zeno).
Sat Feb 12 09:56:45 2005 :: Closing link to Zeno.


My close_socket function is the exact same as the latest SmaugFUSS, I see no problems with it.
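
One pattern that can produce this kind of symptom is a list walk that reads d->next after the node was unlinked and freed inside the loop body. Whether that is what happens here is an assumption, it is not confirmed in this thread. The sketch below uses hypothetical NODE and LIST types (not SMAUG's descriptor list or its UNLINK macro) to show the usual defence: save the next pointer before doing anything that might free the current node, which is the same idea as the d_next adjustment in close_socket.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical doubly linked list, standing in for the descriptor list. */
typedef struct node
{
    struct node *next;
    struct node *prev;
    int closing;            /* non-zero: this node should be removed */
} NODE;

typedef struct
{
    NODE *first;
    NODE *last;
} LIST;

/* Allocate a zeroed node; returns NULL if allocation fails. */
static NODE *node_new( int closing )
{
    NODE *n = calloc( 1, sizeof( *n ) );
    if ( n )
        n->closing = closing;
    return n;
}

static void list_append( LIST *l, NODE *n )
{
    assert( l && n );
    n->next = NULL;
    n->prev = l->last;
    if ( l->last )
        l->last->next = n;
    else
        l->first = n;
    l->last = n;
}

static void list_unlink( LIST *l, NODE *n )
{
    assert( l && n );
    if ( n->prev )
        n->prev->next = n->next;
    else
        l->first = n->next;
    if ( n->next )
        n->next->prev = n->prev;
    else
        l->last = n->prev;
    n->next = n->prev = NULL;
}

/* Free every node flagged as closing; returns how many were freed. */
static int reap_closing( LIST *l )
{
    NODE *n, *n_next;
    int freed = 0;

    for ( n = l->first; n; n = n_next )
    {
        n_next = n->next;   /* read before n can be freed */
        if ( n->closing )
        {
            list_unlink( l, n );
            free( n );
            freed++;
        }
    }
    return freed;
}
```

Writing the loop as `n = n->next` instead would read freed memory on the step after a removal, which may appear to work for a while and then crash with a bogus pointer.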

As for pfiles.
Log: Total pfiles scanned: 2550

There are a lot of unused ones, but that shouldn't matter, right? I'm pretty sure I have pfile fixed.
[EDIT] Or not.
Sat Feb 12 11:35:04 2005 :: Player Kiege was deleted. Exceeded time limit of 50 days.
Sat Feb 12 11:35:04 2005 :: Player Kitty was deleted. Exceeded time limit of 30 days.
Sat Feb 12 11:35:04 2005 :: [*****] BUG: Buffer overflow. Closing (Zeno).
Sat Feb 12 11:35:04 2005 :: [*****] BUG: Buffer overflow. Closing (Hobson).
Sat Feb 12 11:35:04 2005 :: [*****] BUG: Buffer overflow. Closing (Torath).
Sat Feb 12 11:35:04 2005 :: Closing link to Torath.
Sat Feb 12 11:35:04 2005 :: Closing link to Hobson.
Sat Feb 12 11:35:04 2005 :: Closing link to Zeno.

And then a crash.
#0  0x080dc112 in get_trust (ch=0xa44ee08) at handler.c:140
140         if ( ch->desc && ch->desc->original )
(gdb) bt
#0  0x080dc112 in get_trust (ch=0xa44ee08) at handler.c:140
#1  0x0805a7ec in to_channel (argument=0xa0d2e <Address 0xa0d2e out of bounds>, channel=512, verb=0x81afb8d "Log",
    level=107) at act_comm.c:642
#2  0x080cbfbc in log_string_plus (str=0x82422e0 "Closing link to Zeno.", log_type=0, level=107) at db.c:4213
#3  0x0816a5b9 in fread_pfile (fp=0xa460cc8, tdiff=90, fname=0xbff4c1c0 "../player/k/Kitty", count=0 '\0') at pfiles.c:281
#4  0x0816a7f2 in read_pfile (dirname=0xbff4c300 "../player/k", filename=0xa459a6f "Kitty", count=0 '\0') at pfiles.c:360
#5  0x0816a920 in pfile_scan (count=0 '\0') at pfiles.c:413
#6  0x0816ab28 in do_pfiles (ch=0xa44e280, argument=0xbff4c3a0 "tar -cf ../player/pfiles.tar ../player/*") at pfiles.c:478
#7  0x080e6f90 in interpret (ch=0xa44e280, argument=0xbff4e036 "cleanup") at interp.c:577
#8  0x080bc857 in game_loop () at comm.c:688
#9  0x080bc0f9 in main (argc=8, argv=0xbff4e470) at comm.c:317

50 days for newbies, 30 days for normal players.

Also this happens:
tar: Member names contain `..'
tar: ../player/pfiles.tar: file is the archive; not dumped

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Greven   Canada  (835 posts)
Date Reply #11 on Sun 13 Feb 2005 01:50 AM (UTC)
Message
I get the same error with tar on my system; it's trying to put the .. and . "files" into the tarball. I just made it skip any entry whose name starts with a .
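
A rough sketch of that kind of skip, assuming a POSIX system with opendir/readdir. The helper names starts_with_dot and count_scannable_entries are made up for this example and are not part of the pfiles snippet.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Return 1 for ".", ".." and any other name beginning with a dot, else 0. */
static int starts_with_dot( const char *name )
{
    assert( name != NULL );
    return name[0] == '.';
}

/* Count the entries in dirname that a scan would keep.
 * Returns -1 if the directory cannot be opened. */
static int count_scannable_entries( const char *dirname )
{
    DIR *dp = opendir( dirname );
    struct dirent *de;
    int count = 0;

    if ( !dp )
        return -1;

    while ( ( de = readdir( dp ) ) != NULL )
    {
        /* Skip ".", ".." and dotfiles so they never reach the archive step. */
        if ( starts_with_dot( de->d_name ) )
            continue;
        count++;
    }
    closedir( dp );
    return count;
}
```

The same test on the first character can sit at the top of whatever loop walks the player directory, before any per-file work is done.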

Nobody ever expects the spanish inquisition!

darkwarriors.net:4848
http://darkwarriors.net

Posted by Zeno   USA  (2,871 posts)
Date Reply #12 on Sun 13 Feb 2005 04:56 PM (UTC)
Message
Hmm, although why not use gzip to save more space?

Also, since I cannot find the cause of this crash, would my ifcheck be "safe" to do? I know it may be a bandaid fix, but it's better than nothing.

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Samson   USA  (683 posts)
Date Reply #13 on Sun 13 Feb 2005 06:55 PM (UTC)
Message
Easiest way to find out is to apply the bandaid and see what happens. If it works, then leave it and mark it in some way you'll notice so that when you figure out the real cause you can remove it.

Smaug is full of bandaid code in many places. To this day I am *STILL* finding bits and pieces of them and taking them out, after fixing the reason for the bandaids of course. Most often they are due to linked list issues, which seem to go away when switching to std::list.

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).

Information and images on this site are licensed under the Creative Commons Attribution 3.0 Australia License unless stated otherwise.
