Gammon Software Solutions forum

Subject: Baffled by seemingly-mysterious segfault error(s)
Subject review (reverse sequence)

Posted by Zeno   USA  (2,867 posts)   Moderator
Date Wed 07 Mar 2012 04:13 AM (UTC)
Message
Do you get a SIGINT even when you run it without the debugger?

If so, check your MUD code for any SIGINT use. Sometimes copyover-crashes send specific signals, but I don't know if SIGINT is one of them.
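One way to see where a SIGINT comes from is to install a handler with sigaction() and SA_SIGINFO, then record the si_pid and si_code fields. The following is only a rough sketch assuming a POSIX system. It is not code from this MUD, and the names on_sigint, install_sigint_logger, and the sigint_* variables are made up for illustration. Note that si_pid is only meaningful when another process sent the signal (for example with kill); a signal generated by the terminal is not expected to name a sender.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping: written by the handler, read by normal code. */
static volatile sig_atomic_t sigint_seen = 0;
static volatile sig_atomic_t sigint_sender_pid = 0;
static volatile sig_atomic_t sigint_code = 0;

/* Three-argument handler form, selected by SA_SIGINFO. */
static void on_sigint(int signo, siginfo_t *info, void *context)
{
    (void) signo;
    (void) context;
    sigint_sender_pid = (sig_atomic_t) info->si_pid; /* sender, if a process sent it */
    sigint_code = (sig_atomic_t) info->si_code;      /* e.g. SI_USER for kill() */
    sigint_seen = 1;
}

/* Returns 0 on success, -1 on failure (same as sigaction). */
static int install_sigint_logger(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigint;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}
```

The main loop could call install_sigint_logger() once at startup and, whenever sigint_seen is set, write sigint_sender_pid and sigint_code to the log before deciding whether to shut down. The handler itself only stores values, since logging functions are generally not safe to call from inside a signal handler.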

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Nick Gammon   Australia  (18,770 posts)   Forum Administrator
Date Mon 05 Mar 2012 08:23 PM (UTC)
Message
A search for "dl_sysinfo_int80" reveals about 20,000 hits.

However your problem seems to have changed. Now you are getting SIGINT not SIGSEGV.

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Lmclarke   (7 posts)
Date Mon 05 Mar 2012 07:19 PM (UTC)
Message
Hmm. I am the only person with shell access. I didn't kill the process or use the kill command, and I didn't hit Ctrl+C. I checked who was logged in, though, and found this, which seemed odd:


myusername  pts/4    NOT myIP    09:14    3:57m  0.01s  0.01s -bash
myusername  pts/8    myIP    11:01    2:14m  0.01s  0.01s python ./flashpolicyd.py --file=flashpolicy.xml --por
myusername  pts/21   myIP    11:35   40:35   0.15s  0.00s login -- myusername
myusername  pts/26   myIP    13:16    0.00s  0.00s  0.00s w



Posted by Twisol   USA  (2,229 posts)
Date Mon 05 Mar 2012 06:46 PM (UTC)
Message
I don't claim to be an expert on Linux, but - SIGINT? That's not something that just happens like a SIGSEGV, someone had to actively send that signal to the process. Did you or someone else with shell access to the server interrupt the process (such as by hitting Ctrl+C from the shell, or via the 'kill' program)?

If that's not the case - and again, not a Linux expert - someone might have unauthorized access to the server. Run 'w' from the shell to see if there are any other users online. You may want to change the passwords on your shell accounts, too.

'Soludra' on Achaea

Blog: http://jonathan.com/
GitHub: http://github.com/Twisol

Posted by Lmclarke   (7 posts)
Date Mon 05 Mar 2012 06:33 PM (UTC)

Amended on Mon 05 Mar 2012 06:36 PM (UTC) by Lmclarke

Message
Bah. I thought that worked. Seemed stable for a little while, there.

Suddenly:


Program received signal SIGINT, Interrupt.
0x001cb7f2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2



(gdb) bt
#0  0x001cb7f2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x0038902d in ___newselect_nocancel () from /lib/libc.so.6
#2  0x08099991 in game_loop_unix (control=8) at comm.c:949
#3  0x08098fc3 in main (argc=2, argv=0xbfffd834) at comm.c:439


The backtrace is different this time...

EDIT:

Alright, I started us back up and logged in. I was the only person online this time and it crashed.


Program received signal SIGINT, Interrupt.
0x001cb7f2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2




#0  0x001cb7f2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
#1  0x002b502d in ___newselect_nocancel () from /lib/libc.so.6
#2  0x08099991 in game_loop_unix (control=8) at comm.c:949
#3  0x08098fc3 in main (argc=2, argv=0xbfffd834) at comm.c:439

Posted by Nick Gammon   Australia  (18,770 posts)   Forum Administrator
Date Mon 05 Mar 2012 06:22 AM (UTC)
Message
Get him to make a new world file from scratch. Just enter the MUD address/port. See if it still happens.

I suspect something along the lines of, he changed his terminal type to something longer than you are allowing for.
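If an over-long string turns out to be the trigger, a bounded append in place of a raw strcat() would at least stop the buffer overflow. Here is a minimal sketch in standard C. The helper name append_bounded is hypothetical and not part of the MUD's code base, and the real size of telbuf is not shown in this thread, so the caller has to pass it in.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string in dst without ever writing
 * past dst_size bytes. Returns 1 if all of src fit, 0 if it was truncated
 * or if dst was not a terminated string within its buffer. */
static int append_bounded(char *dst, size_t dst_size, const char *src)
{
    /* Find the current terminator, looking no further than the buffer. */
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL)
        return 0;

    size_t used = (size_t) (end - dst);
    size_t room = dst_size - used - 1;   /* keep one byte for the terminator */
    size_t len = strlen(src);
    size_t n = len < room ? len : room;

    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
    return n == len;
}
```

A call such as append_bounded(telbuf, sizeof telbuf, name) only works as intended when telbuf is a real array at the call site. If it is a pointer, the buffer size has to come from wherever the buffer was allocated. A return value of 0 is a useful thing to log, since it shows which client sent more than was expected.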

- Nick Gammon

www.gammon.com.au, www.mushclient.com

Posted by Lmclarke   (7 posts)
Date Mon 05 Mar 2012 12:14 AM (UTC)
Message
Line 1397:

strcat(telbuf, telopts[(unsigned char)d->inbuf]);

Posted by Zeno   USA  (2,867 posts)   Moderator
Date Sun 04 Mar 2012 11:18 PM (UTC)
Message
Which is line 1397? Can you print all variables on that line?

Zeno McDohl,
Owner of Bleached InuYasha Galaxy
http://www.biyg.org

Posted by Lmclarke   (7 posts)
Date Sun 04 Mar 2012 08:46 PM (UTC)

Amended on Sun 04 Mar 2012 08:50 PM (UTC) by Lmclarke

Message
Valgrind leak check (standard):

Write_to_descriptor: Bad file descriptor
Write_to_descriptor: Bad file descriptor
Write_to_descriptor: Bad file descriptor
Write_to_descriptor: Bad file descriptor
Write_to_descriptor: Bad file descriptor

==28441== HEAP SUMMARY:
==28441==     in use at exit: 16,661,251 bytes in 95 blocks
==28441==   total heap usage: 122 allocs, 27 frees, 16,669,134 bytes allocated
==28441==
==28441== LEAK SUMMARY:
==28441==    definitely lost: 0 bytes in 0 blocks
==28441==    indirectly lost: 0 bytes in 0 blocks
==28441==      possibly lost: 0 bytes in 0 blocks
==28441==    still reachable: 16,661,251 bytes in 95 blocks
==28441==         suppressed: 0 bytes in 0 blocks
==28441== Reachable blocks (those to which a pointer was found) are not shown.
==28441== To see them, rerun with: --leak-check=full --show-reachable=yes
==28441==
==28441== For counts of detected and suppressed errors, rerun with: -v
==28441== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 13 from 8)


Valgrind leak check (full):

New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket
New_descriptor: accept: Socket operation on non-socket /* There were about 10 dozen more of these */
==1702==
==1702== HEAP SUMMARY:
==1702==     in use at exit: 16,661,251 bytes in 95 blocks
==1702==   total heap usage: 123 allocs, 28 frees, 16,669,486 bytes allocated
==1702==
==1702== LEAK SUMMARY:
==1702==    definitely lost: 0 bytes in 0 blocks
==1702==    indirectly lost: 0 bytes in 0 blocks
==1702==      possibly lost: 0 bytes in 0 blocks
==1702==    still reachable: 16,661,251 bytes in 95 blocks
==1702==         suppressed: 0 bytes in 0 blocks
==1702== Reachable blocks (those to which a pointer was found) are not shown.
==1702== To see them, rerun with: --leak-check=full --show-reachable=yes
==1702==
==1702== For counts of detected and suppressed errors, rerun with: -v
==1702== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 13 from 8)


ANY help on this is so terribly appreciated. I feel like overturning my desk. The lines of code in comm.c that popped up in gdb haven't been changed in years, and looking at them reveals nothing to my relatively untrained eye. We're going 52 hours without stability and I'm no closer to figuring out what's wrong. :/

Posted by Lmclarke   (7 posts)
Date Sun 04 Mar 2012 08:44 PM (UTC)

Amended on Mon 05 Mar 2012 12:17 AM (UTC) by Lmclarke

Message
Alright, in and of itself, this shouldn't be too terribly strange, but the problem seems to be narrowed down to one single player who crashes the game every time he shows up. It isn't his pfile: the same error occurs every time he connects, whether to his own pfile, the Guest pfile, random new pfiles, etc. Others and I can connect to the same pfiles with no difficulty.

Troubleshooting that has already been done:
- As mentioned above, I checked and doublechecked that it wasn't an error with his pfile. No matter which pfile he signs into, the game crashes with the same error. Other users, including myself, can use the same pfiles without issue.
- Just in case, I deleted his pfile and remade it. The same thing occurred: I can sign into it without issue, but when he signs into it, we crash.
- His client is MUSHclient, which is the most commonly used client on our game. It seems highly unlikely to be his client because we (almost) all use it.
- He is not inputting any unusual commands, and he can occasionally stay logged in long enough to have a brief conversation. The timing seems fairly random as to when the error decides to occur, but it always occurs eventually.
- He can sign into the game from The Mud Connector FMud in-browser client without any problems. The game does not crash. This is using the mudconnector.com proxy. However, having him use a free proxy through MUSHclient does not have the same effect: he connects through the proxy and the same error occurs. Only the in-browser FMud client on TMC seems immune.

I am a fairly new coder, so this is a little outside my knowledge. I did what I could to try and pin the problem down, but the results are difficult for me to interpret, let alone act on.

This is what I get in gdb when it crashes:

Program received signal SIGSEGV, Segmentation fault.
0x00259bb9 in strcat () from /lib/libc.so.6


Backtrace:

(gdb) bt
#0  0x00259bb9 in strcat () from /lib/libc.so.6
#1  0x0809a76a in read_from_buffer (d=0xb70460e4) at comm.c:1397
#2  0x080995f6 in game_loop_unix (control=8) at comm.c:835
#3  0x08098fc3 in main (argc=2, argv=0xbfffd834) at comm.c:439


The portions of comm.c referenced in the backtrace:

Line 1397 specified in context below

if ((signed char) 255 > d->inbuf > (signed char) 250)
		  {
			  i++;
			  strcat(telbuf, " ");
			  if (IS_TELOPT(d->inbuf))
				  strcat(telbuf, telopts[(unsigned char)d->inbuf]); /* Line 1397 */
			  else
			  {
				  sprintf(buf, "(%d)", d->inbuf);
				  strcat(telbuf, buf);
			  }
		  }
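A note on the condition at the top of that block: C does not chain comparisons the way mathematics does. An expression of the form a > x > b is parsed as (a > x) > b, so the inner result (0 or 1) is what gets compared against b. The sketch below illustrates this with single bytes. The function names are hypothetical, and it assumes a typical platform with 8-bit char where (signed char) 255 is -1 and (signed char) 250 is -6. It takes one byte as its argument and makes no claim about how d->inbuf is indexed in the real source.

```c
#include <stdio.h>

/* Same shape as the posted condition, with the implicit grouping written out.
 * On the assumed platform this is (-1 > c) > -6, i.e. (0 or 1) > -6. */
static int chained_check(signed char c)
{
    return ((signed char) 255 > c) > (signed char) 250;
}

/* One possible intended meaning (an assumption): the byte value lies
 * strictly between 250 and 255, compared as unsigned. */
static int range_check(signed char c)
{
    unsigned char u = (unsigned char) c;
    return u > 250 && u < 255;
}

/* Prints both results for one byte so the difference is visible. */
static void show(signed char c)
{
    printf("byte %3u: chained=%d range=%d\n",
           (unsigned) (unsigned char) c, chained_check(c), range_check(c));
}
```

Calling show('a') and show((signed char) 251) shows the difference: under the stated assumptions chained_check accepts every byte, including ordinary text, while range_check accepts only 251 through 254. If the real condition behaves like chained_check, every input byte would take the strcat path, which could overrun telbuf whatever the client sends.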


Line 835 specified in context below

if (d->character != NULL && d->character->wait > 0)
         {
            --d->character->wait;
            continue;
         }

         read_from_buffer (d);  /* Line 835 */
         if (d->incomm[0] != '\0')
         {
            d->fcommand = TRUE;
            stop_idling (d->character);

Line 439 specified in context below

#if defined(unix)
   if (!fCopyOver)
      control = init_socket (port);
   boot_db (fCopyOver);
   sprintf (log_buf, "The Requiem has docked at port %d.", port);
   log_string (log_buf, CHANNEL_LOG_NORMAL);
   game_loop_unix (control); /* Line 439 */
   close (control);
#endif


Continued...

The dates and times for posts above are shown in Universal Co-ordinated Time (UTC).
