The problem I have been working on for the last week is finally seeing some progress. It is actually the manifestation of two different problems, the kind of overlap that can make debugging embedded software awfully difficult.
First, there is the matter of adjusting the timeout for the watchdog. It turns out that the software is very sensitive to how much data it receives over the RS232 port: heavy traffic makes processing take longer, which stretches the interval between "petting" the watchdog. The right answer is somewhere between "not too short" and "not too long". The timeout must have enough wiggle room that we don't get a reset just because a task takes a little bit longer when we've bombarded it with serial port messages. On the other hand, it must be short enough to catch a real error where a task totally goes out in the weeds.
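One way to put a number on "not too short" is to measure the longest gap between pets while flooding the serial port, then add margin. The following is only a minimal sketch of that idea, not the code from this project: `pet_tracker_t`, `pet_tracker_record`, `suggest_timeout_ms`, and the millisecond tick passed in are all hypothetical names and assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical instrumentation: remembers the longest gap between pets. */
typedef struct {
    uint32_t last_pet_ms;  /* tick value at the previous pet */
    uint32_t max_gap_ms;   /* longest gap observed so far */
    int      primed;       /* 0 until the first pet has been recorded */
} pet_tracker_t;

static void pet_tracker_init(pet_tracker_t *t)
{
    t->last_pet_ms = 0;
    t->max_gap_ms = 0;
    t->primed = 0;
}

/* Call this at the spot where the real code pets the watchdog.
 * now_ms is assumed to come from a free-running millisecond counter. */
static void pet_tracker_record(pet_tracker_t *t, uint32_t now_ms)
{
    if (t->primed) {
        /* Unsigned subtraction stays correct across counter wraparound. */
        uint32_t gap = now_ms - t->last_pet_ms;
        if (gap > t->max_gap_ms) {
            t->max_gap_ms = gap;
        }
    }
    t->last_pet_ms = now_ms;
    t->primed = 1;
}

/* Suggest a timeout: the worst observed gap plus a percentage of margin. */
static uint32_t suggest_timeout_ms(const pet_tracker_t *t, uint32_t margin_percent)
{
    assert(margin_percent <= 1000u);  /* sanity limit, chosen arbitrarily */
    uint64_t extra = ((uint64_t)t->max_gap_ms * margin_percent) / 100u;
    return t->max_gap_ms + (uint32_t)extra;
}
```

Running the system under the heaviest serial traffic you can generate and then reading `max_gap_ms` gives the "not too short" bound; how much margin to add on top is still a judgment call.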
The side effect of the watchdog timeout that has caused me so much grief lately is that the software apparently doesn't cleanly reset everything on a "warm" reboot. In particular, it seems to leave interrupts enabled. The result is a race condition after a watchdog reset: an interrupt fires before the software has had a chance to finish its initialization, resulting in a really bizarre error in a part of the software that shouldn't even be executing yet, all because of wrong or uninitialized data following the reset.
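The race can be illustrated with a small host-side simulation. This is a sketch under assumptions, not the actual firmware: `irq_enabled`, `serial_isr`, `init_data`, and `startup` are hypothetical stand-ins, and the "hardware" is just a flag that survives the simulated reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated state (all names hypothetical). */
static bool irq_enabled;           /* interrupt enable flag, survives a warm reset */
static int  rx_storage[8];
static int *rx_buffer;             /* NULL until init_data() has run */
static int  faults;                /* counts ISR runs that saw uninitialized data */

static void irq_disable(void) { irq_enabled = false; }
static void irq_enable(void)  { irq_enabled = true; }

/* Simulated serial ISR: only runs if interrupts are enabled. */
static void serial_isr(int byte)
{
    if (!irq_enabled) {
        return;                    /* masked: real hardware would not call us */
    }
    if (rx_buffer == NULL) {
        faults++;                  /* the bizarre error: ISR ran before init */
        return;
    }
    rx_buffer[0] = byte;
}

static void init_data(void) { rx_buffer = rx_storage; }

/* State as the previous run left it after a watchdog reset. */
static void simulate_warm_reset(void)
{
    irq_enabled = true;            /* interrupts were never turned off */
    rx_buffer = NULL;              /* data not yet initialized */
}

/* Startup sequence; mask_first selects whether interrupts are masked
 * before anything else happens. */
static void startup(bool mask_first)
{
    if (mask_first) {
        irq_disable();
    }
    serial_isr(0x55);              /* a byte arrives during the init window */
    init_data();
    assert(rx_buffer != NULL);
    irq_enable();                  /* only now is it safe to take interrupts */
}
```

Calling `simulate_warm_reset()` followed by `startup(false)` bumps `faults`, while `startup(true)` does not. Masking interrupts as the very first step of startup is one common way to close this kind of window; whether it is the right fix here depends on what else the warm reboot leaves behind.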
I have given the MAME debugger quite the workout lately: breakpoints, data watchpoints, execution traces - all these features are indispensable.