Discussion:
[hercules-390] define NO_SIGABEND_HANDLER - for debugging ?
Mike Stramba mikestramba@gmail.com [hercules-390]
2018-08-09 09:34:52 UTC
Is NO_SIGABEND_HANDLER designed for debugging ?

It looks like just what I need for single stepping through code.

Mike
Mike Stramba mikestramba@gmail.com [hercules-390]
2018-08-09 10:36:43 UTC
If so, where should I #define it ?

Mike
'Fish' (David B. Trout) david.b.trout@gmail.com [hercules-390]
2018-08-09 20:22:33 UTC
Post by Mike Stramba ***@gmail.com [hercules-390]
If so, where should I #define it ?
You shouldn't. It gets defined automatically if required signal handling support exists for the platform being built.(*)


------
(*) Note however that it is purposely NOT defined on Windows. This is not because Windows doesn't support signals (it does!), but rather because I personally think "eating" such signals and allowing Hercules to continue running is AN EXTREMELY BAD IDEA. In my humble opinion Hercules SHOULD immediately crash and create a dump in such situations. How else is the problem -- whatever it is -- ever going to be discovered and fixed?

Yeah, sure, real hardware may throw a Machine Check Interrupt, etc, but real hardware also more than likely gathers all the information needed for the hardware engineers to debug the problem too. If Hercules "masks" (covers up) the problem by simply throwing a Machine Check but otherwise allowing the process to continue running, we will have ZERO CHANCE of ever finding or fixing the problem.

Maybe if we could somehow freeze all threads and generate a core dump file or something whenever it happened, then and ONLY then could I agree with the current sigabend-handler and watch-dog thread design. But until that happens, the WINDOWS BUILD of Hercules will always *crash*. And there's special code in bootstrap.c to automatically generate a "MiniDump" whenever that happens too, so that we CAN discover *exactly* WHERE the crash occurred and, hopefully, as a result, WHY it occurred as well.
--
"Fish" (David B. Trout)
Software Development Laboratories
http://www.softdevlabs.com
mail: ***@softdevlabs.com
'Fish' (David B. Trout) david.b.trout@gmail.com [hercules-390]
2018-08-09 20:07:44 UTC
Post by Mike Stramba ***@gmail.com [hercules-390]
Is NO_SIGABEND_HANDLER designed for debugging ?
Not really, no.

For non-Windows builds, Hercules has an "abend" signal handler in place that is designed to intercept "abend" type signals (SIGBUS, SIGFPE, SIGILL, SIGSEGV, etc) which indicate a programming error. That is, if any of these signals occur, it means Hercules is sick, having done something "wrong" programmatically speaking.

Such errors normally immediately terminate (crash) the process in question, but if you have a "signal handler" in place, it can intercept such error conditions instead and decide whether the error, whatever it is, is serious enough to terminate the process or whether the error can possibly be recovered from such that the process can continue running.
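
As a rough illustration of that general mechanism (this is NOT the Hercules code, just a minimal POSIX sketch), a handler can be installed with sigaction() and can return control to a known recovery point with siglongjmp(). All names here (demo_abend_handler, install_demo_handler, run_guarded, faulty_work, healthy_work) are hypothetical, and raise(SIGSEGV) merely stands in for a real programming error:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf recover_point;            /* where to resume after a caught signal */
static volatile sig_atomic_t caught_signo;  /* last signal seen by the handler */

/* Hypothetical handler: remember the signal, then jump to the recovery point
   instead of letting the default action terminate the process. */
static void demo_abend_handler(int signo)
{
    caught_signo = signo;
    siglongjmp(recover_point, 1);
}

/* Install the handler for the "abend" type signals. Returns 0 on success. */
static int install_demo_handler(void)
{
    static const int signals[] = { SIGBUS, SIGFPE, SIGILL, SIGSEGV };
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = demo_abend_handler;
    sigemptyset(&sa.sa_mask);

    for (size_t i = 0; i < sizeof signals / sizeof signals[0]; i++)
        if (sigaction(signals[i], &sa, NULL) != 0)
            return -1;
    return 0;
}

/* Run work(). Returns 0 if it completed normally, or the signal number
   if one of the handled signals interrupted it. */
static int run_guarded(void (*work)(void))
{
    if (sigsetjmp(recover_point, 1) != 0)   /* 1 = also restore the signal mask */
        return caught_signo;
    work();
    return 0;
}

/* Stand-ins for real work: one behaves, one simulates a crash. */
static void healthy_work(void) { }
static void faulty_work(void)  { raise(SIGSEGV); }
```

After install_demo_handler() succeeds, run_guarded(faulty_work) returns SIGSEGV rather than killing the process, which is the "intercept and decide" behaviour described above. Whether continuing is actually wise is a separate question.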

This "abend handler" function in Hercules is the "sigabend_handler" function in source file 'machchk.c'. It is designed to catch abend signals, and if one occurs in one of the CPU threads, it simply terminates that CPU thread, takes that CPU offline, and presents the guest (z/OS, z/VM, etc) with a "Machine Check" interrupt. This allows Hercules (and thus the guest!) to continue running (rather than crash), letting you do a graceful shutdown of your guest, thereby preventing damage to your dasd, etc.

There is also a "watchdog_thread" in source file 'impl.c' that monitors all CPU threads to make sure they are still executing instructions. If any one of them fails to increment its "regs->instcount" value within a certain length of time (currently 20 seconds?? minutes??), then it presumes there's a bug in Hercules that is causing that CPU thread to get stuck in a loop, and sends an abend signal for that thread (which the "sigabend_handler" function in 'machchk.c' then handles by generating a Machine Check interruption as previously explained).
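
The progress check such a watchdog performs can be sketched roughly as follows. This is an assumption-laden illustration, not the code in 'impl.c': the struct, the function name watchdog_check and the timeout parameter are hypothetical, and the timeout is left as a parameter because the actual value is not certain here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU bookkeeping kept by the watchdog. */
struct cpu_watch {
    uint64_t last_count;    /* instruction count seen on the previous check */
    unsigned stalled_secs;  /* seconds accumulated with no progress */
};

/* Called periodically with the CPU's current instruction count.
   Returns true once the count has not changed for timeout_secs or longer.
   A real watchdog thread would then signal the stuck thread (for example
   with pthread_kill) so that the abend handler can deal with it. */
static bool watchdog_check(struct cpu_watch *w, uint64_t instcount,
                           unsigned elapsed_secs, unsigned timeout_secs)
{
    if (instcount != w->last_count) {
        /* Progress was made: remember the new count and reset the timer. */
        w->last_count = instcount;
        w->stalled_secs = 0;
        return false;
    }
    w->stalled_secs += elapsed_secs;
    return w->stalled_secs >= timeout_secs;
}
```

Note that a check like this cannot tell a CPU thread stuck in a loop from one that is simply stopped at a debugger breakpoint: in both cases the instruction count stops advancing.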
Post by Mike Stramba ***@gmail.com [hercules-390]
It looks like just what I need for single stepping through code.
Nope. Has nothing to do with stepping through code. Sorry.
--
"Fish" (David B. Trout)
Software Development Laboratories
http://www.softdevlabs.com
mail: ***@softdevlabs.com
Mike Stramba mikestramba@gmail.com [hercules-390]
2018-08-09 20:32:40 UTC
On 8/9/18, ''Fish' (David B. Trout)' ***@gmail.com
[hercules-390] <hercules->
Post by 'Fish' (David B. Trout) ***@gmail.com [hercules-390]
Post by Mike Stramba ***@gmail.com [hercules-390]
It looks like just what I need for single stepping through code.
Nope. Has nothing to do with stepping through code. Sorry.
I beg to differ :)

Without NO_SIGABEND_HANDLER defined, if I single step, then pause to
"look at the scenery", the watchdog thread fires, and aborts my
hercules session :/

After defining NO_SIGABEND_HANDLER in CFLAGS : ..... -DNO_SIGABEND_HANDLER,
the watchdog thread doesn't exist, and I can single step "at my leisure" ;)
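
For readers unfamiliar with the technique, this is the usual effect of a -D flag: code guarded by the macro is simply not compiled. A minimal sketch, assuming a hypothetical helper watchdog_enabled() (the real Hercules sources guard their own code, not this function):

```c
/* Build normally:                     cc demo.c
   Build with the watchdog compiled out: cc -DNO_SIGABEND_HANDLER demo.c */
#include <stdbool.h>

/* Hypothetical helper: reports whether the watchdog code was compiled in. */
static bool watchdog_enabled(void)
{
#if defined(NO_SIGABEND_HANDLER)
    return false;   /* macro defined on the command line: no watchdog */
#else
    return true;    /* default build: watchdog present */
#endif
}
```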

Mike
Mike Stramba mikestramba@gmail.com [hercules-390]
2018-08-09 20:33:54 UTC
Btw, I'm running on a Linux box ... using GDB.

Mike
'Fish' (David B. Trout) david.b.trout@gmail.com [hercules-390]
2018-08-09 20:59:02 UTC
[...]
Post by Mike Stramba ***@gmail.com [hercules-390]
Nope. Has nothing to do with stepping through code. Sorry.
I beg to differ :)
Without NO_SIGABEND_HANDLER defined, if I single step,
then pause to "look at the scenery", the watchdog thread
fires, and aborts my hercules session :/
After defining NO_SIGABEND_HANDLER in CFLAGS : .....
-DNO_SIGABEND_HANDLER, the watchdog thread doesn't exist,
and I can single step "at my leisure" ;)
Okay, fine! Touché. :)

But you shouldn't be looking at the scenery when you have Hercules debugging to do!

;-)
--
"Fish" (David B. Trout)
Software Development Laboratories
http://www.softdevlabs.com
mail: ***@softdevlabs.com