  #7  
January 29th 04, 07:35 PM
Mark Hittinger
Photo received from recovering Spirit rover

JXStern writes:
I hope that this flash/files program is the whole problem, but if so,
it still sounds like they never did any serious system test, not to
mention missing some important design rules from the start. The "less
surgical" technique would be "delete all files" as on any digital
camera. Very sophisticated.


Given the success of the two rovers it was indeed a wet blanket to learn
about the file problem - a very typical problem on Earth. It's evident
that testing at resource limit conditions wasn't done.

One explanation for this is that the rover reboots each morning. Many
Earth-based applications, from Windows to mainframes, must be rebooted
often in order to "hide" the gradual buildup of problems from memory leaks
or whatnot.

In testing they may have started from scratch, operated for a day, and
cleared it out when their shift ended.
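
To make that concrete, here is a minimal sketch of why a reset-every-shift
test would never see the problem while a long run would. Everything in it
is hypothetical: the function name, the 50 files per day, and the capacity
of 1000 are made-up illustration values, not anything known about the
rover's flash file system.

```python
def days_until_failure(files_per_day, capacity, days, reset_each_day):
    """Return the 1-based day on which stored files exceed capacity, or None.

    Hypothetical model: a fixed number of files is written every day and
    nothing is ever deleted unless the tester wipes storage.
    """
    stored = 0
    for day in range(1, days + 1):
        if reset_each_day:
            stored = 0  # tester clears everything when the shift ends
        stored += files_per_day
        if stored > capacity:
            return day
    return None


# One-day test cycles: the limit is never reached.
one_day_cycles = days_until_failure(50, 1000, 30, reset_each_day=True)

# Same workload left running without a wipe: the limit is eventually hit.
long_run = days_until_failure(50, 1000, 30, reset_each_day=False)
```

With these assumed numbers the wiped-daily test passes for all 30 days,
while the uninterrupted run goes over capacity partway through the month.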

Now they will have to be extra careful with the rovers so they don't push
them up to the resource limit they now know about. Of course there could
be other resource limits yet to be discovered. If it takes a week or two
to recover from each one, that will be a significant drain.

Clearly JPL needs to have a team of loud, jack-booted, non-academic software
guys to take their software up to these frequently encountered resource
limits and make sure the software can handle them. Constrain one resource
and see what dies, then two resources and see what dies, etc. Much better
to do this on Earth!
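
A rough sketch of what "constrain one resource, then two" could look like
as a test sweep. The resource names, nominal limits, and the toy
run_system() below are all assumptions invented for illustration; a real
harness would launch the actual flight software under the squeezed limits
instead.

```python
from itertools import combinations

# Hypothetical nominal limits for a toy system under test (not rover values).
NOMINAL = {"flash_files": 1000, "ram_kb": 512, "cpu_ms": 100}


def run_system(limits):
    """Toy stand-in for the software under test; returns a list of failures."""
    failures = []
    files_needed = 300      # files the assumed workload creates
    ram_per_file_kb = 1     # RAM assumed to be used to track each file
    if limits["flash_files"] < files_needed:
        failures.append("flash full")
    if limits["ram_kb"] < files_needed * ram_per_file_kb:
        failures.append("out of memory tracking files")
    if limits["cpu_ms"] < 20:
        failures.append("missed deadline")
    return failures


def constrain(names, factor=0.1):
    """Copy the nominal limits, squeezing the named resources by factor."""
    limits = dict(NOMINAL)
    for name in names:
        limits[name] = int(NOMINAL[name] * factor)
    return limits


def sweep(max_together=2):
    """Constrain resources one at a time, then in pairs, recording what dies."""
    results = {}
    for k in range(1, max_together + 1):
        for names in combinations(sorted(NOMINAL), k):
            results[names] = run_system(constrain(names))
    return results


if __name__ == "__main__":
    for names, failures in sweep().items():
        print(names, "->", failures or "survived")
```

The point of sweeping pairs as well as singles is that some failures only
show up when two limits are tight at once; raising max_together extends the
same idea to triples and beyond.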

I really wish the interpretation of what this means were different, because
if we are right then it looks like we will have a roller coaster ride of
software crashes followed by two-week recovery cycles. It will be hell.

Later

Mark Hittinger