March 7th, 2012, 10:16 AM, posted to sci.space.history, sci.electronics.design
B0b Mosley
Did non space rated chips made Phobos-Grunt fail?

On Feb 2, 2:34 am, miso wrote:

One of the sleazier semis I worked at would do the initial run in both
bulk and epi. If there was a latch up issue in bulk but the epi worked,
you sold the epi product until pricing demands made a redesign on bulk
profitable.


....These "sleazy semis" were far more the norm than the exception,
especially about 17 years ago when I was helping design the SummaJet
for SummaGraphics. This one particular common IC - and for the life of
me I can't recall what sort of amp chip it was - when bought in bulk,
over 80% of the chips would lock up within 96 hours of continuous
operation. And by "continuous operation", this meant either continuous
torture-test printing - a real "Romulan Engineering" test that one -
or just sitting there waiting for a print job. There were a dozen
manufacturers we went through, and every single bulk batch we got had
either the same lockup rates, or exceeded those rates by as much as
15% even though they'd run a few days longer before locking up.

....But that wasn't the punch line. Not only could we not determine
just *what* was triggering the lockups - it wasn't any errant signals
and/or voltage issues coming down the line from the main proc on the
processor/RAM board that was the plotter's brains - but if you let the
plotter sit there locked up for more than 48 hours, it was pretty much
guaranteed that within the next 24 the IC in question would actually
*catch fire*, spewing flame out like a gas jet when it first ignited.
I actually witnessed this on four separate occasions, and we actually
caught it not only on videotape with a camera about a foot from the
IC, but on the lab's security cameras just before one set the whole
proc board alight and triggered the fire alarms. Luckily the sprinkler
systems didn't go off, because about 300 rolls of various plotter
paper and 25 years worth of stored yet-to-be-digitized blueprints
would have been ruined. And trust me, nothing stinks worse in a print
environment than mildewed vellum :O

....However, when we got samples that had been run through extended
pretest by the manufacturers instead of just the bulk testing, the
lockup rates dropped to less than 5% of the samples supplied. And
where a bulk chip that locked up once was pretty much guaranteed to
lock up again and again and again without fail, the pretested ones that
did experience a lockup usually did not experience one again. And out of all the
samples received from the various chip houses, not one single IC ever
caught fire. We still stuck a heat sink on it just to play safe when
we shipped the final product - especially the custom jobs we did for
Mutoh - and AFAIK not a single SummaJet ever caught fire in the field
even after CalComp bought the place out and finished up the sales
contracts.

....And alas, we never *did* find out just what in the bulk test
process was causing those chips to become so damaged that they would
immolate. The only in-depth testing we had time to do, other than to
verify that the signal and voltage issues were either not there or
not our fault, was to strip back the outer housing and take a look at
the inner circuit before and after lockup - and in one case, after
immolation had started but was suffocated before too much damage was
done - and see if anything was out of the norm. The only damage we saw
before IC install was the "arclight" or "craters of the Moon" effect
that occurs when too much static discharge has been passed through an
IC through improper handling - read: touching the pins with bare
fingers, no grounding strap at the very least, etc - and the ICs that
had this microdamage were also ones prone to lockup. Subsequent
analysis showed no additional evidence of "arclighting", which ruled
out damage caused by our assembly team during installation and
soldering, and the one that briefly caught fire also showed no
additional "arclighting" - at least when we compared the unimmolated
areas to photos of the pre-installed IC.

....So, to Bottom Line it all, it was something in the bulk test
process that was damaging the ICs that wasn't duplicated in the
samples sent to us that were extensively tested at the foundries. What
that was, however, was never determined, and due to time constraints
and shipping schedules, it was determined that we'd pay the chip
houses extra to ship us the e-test samples as if they were the
requested bulk shipments, stretch out our delivery dates a bit, and
pass the cost per chip on to the consumer, which only added about
$20.00 USD per unit. We were later able to offset that by about $5.00
USD by switching to a different resin for the inkjet cartridge holder
wedges - that's a clusterfrack that deserves a whole book chapter to
fully describe the history of - and trimmed another $5 to $7.00 USD
per unit for those who bought a SummaJet with the full 32MB of RAM
installed, as we got a price break from a RAM supplier. That left ~
$4.00 USD to pass on to Joe Customer, but for an inkjet that performed
95% identically to an HP job, using the same carts, operating at the
same speed, and all for about 30% less cost, that four bucks was just
a drop in the bucket compared to the sales tax on such an item.

OM