[TriLUG] Dealing with bit flips from cosmic rays
David Burton via TriLUG
trilug at trilug.org
Mon May 31 22:51:41 EDT 2021
If you're building spaceships, I think you need to be willing to go
higher-end than a Pi Zero. It seems silly to spend thousands of dollars to
send a $70 computer to space.
You need to start with a radiation-hardened, or at least
radiation-tolerant, ARM processor, in place of the stock chip:
https://duckduckgo.com/?q=radiation-hardened+ARM+processor
One of the articles that search finds is encouragingly titled,
"Radiation hardened ARM that doesn’t cost an ARM and a LEG." It is about
this one, which is $1350 at qty one, from Mouser:
https://www.mouser.com/ProductDetail/VORAGO/VA10820-PQ128F0PCA?qs=AQlKX63v8RsaZ0n5FC8cxg%3D%3D
It includes triple redundancy and internal 2-of-3 voting circuits on all
internal registers:
https://www.mouser.com/new/vorago-technologies/vorago-va108x0-mcus/
https://www.mouser.com/pdfdocs/VA10820_DS_12.pdf
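As a rough illustration of what 2-of-3 voting does, here is a minimal
software sketch of a bitwise majority vote over three copies of a value.
The function name vote_2_of_3 is made up for this example; the chip does
this in hardware on its registers, and this is not taken from its
datasheet.

```python
def vote_2_of_3(a: int, b: int, c: int) -> int:
    """Bitwise majority: each output bit is the value that at least
    two of the three inputs agree on."""
    return (a & b) | (a & c) | (b & c)


good = 0b1011_0010
flipped = good ^ 0b0000_1000          # one copy suffers a single bit flip
assert vote_2_of_3(good, flipped, good) == good

# Flips in *different* bit positions of two copies are still outvoted.
assert vote_2_of_3(good ^ 0b01, good ^ 0b10, good) == good
```

The vote only fails when two copies are wrong in the same bit position.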
It sounds pretty great except:
1. It doesn't appear to support external RAM, and it doesn't have anywhere
near enough internal RAM to run Linux.
2. It is slow as mud.
If you're using off-chip memory, you'd also want it to be ECC RAM. I don't
know whether the available rad-hardened ARM CPUs support ECC, but since
some ARM core designs do support ECC, I'd be surprised if that feature
weren't included in at least some of the rad-hardened CPUs. I know that
some ARM core designs support ECC because this ARM-based machine uses ECC
RAM:
https://kobol.io/helios4/
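To show the kind of correction ECC performs, here is a toy Hamming(7,4)
sketch: 4 data bits are stored with 3 parity bits, and any single flipped
bit in the 7-bit word can be located and fixed. Real ECC DIMMs use wider
codes (typically 64 data bits plus 8 check bits) and do this in the memory
controller, so this is only an illustration of the principle; all names in
it are invented for the example.

```python
DATA_POS = (3, 5, 6, 7)    # 1-based positions holding data bits
PARITY_POS = (1, 2, 4)     # 1-based positions holding parity bits


def syndrome(code):
    """XOR of the 1-based positions of all set bits; 0 for a valid word."""
    s = 0
    for pos, bit in enumerate(code, start=1):
        if bit:
            s ^= pos
    return s


def encode(nibble):
    """Encode a 4-bit value as a 7-bit Hamming codeword (list of 0/1)."""
    code = [0] * 7
    for i, pos in enumerate(DATA_POS):
        code[pos - 1] = (nibble >> i) & 1
    s = syndrome(code)                 # parity bits are still 0 here
    for pos in PARITY_POS:
        code[pos - 1] = 1 if s & pos else 0
    return code


def decode(code):
    """Correct at most one flipped bit, then return the 4-bit value."""
    code = list(code)
    s = syndrome(code)
    if s:                              # nonzero syndrome = position of the flip
        code[s - 1] ^= 1
    return sum(code[pos - 1] << i for i, pos in enumerate(DATA_POS))


word = encode(0b1010)
word[4] ^= 1                           # simulate a single-event upset
assert decode(word) == 0b1010
```

This code corrects one flipped bit per word; two flips in the same word
would be miscorrected, which is why real ECC adds an extra parity bit to
detect double errors.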
For dealing with permanent memory errors, Linux supports (or at least used
to support?) mapping them out, via the memmap
<https://www.google.com/search?q=memmap+kernel+parameter> kernel parameter:
https://web.archive.org/web/20140806175048/https://bryanquigley.com/planet-ubuntu/bad-memory-howto
I think it was mainly intended for cheapskates who don't want to replace
their expensive defective DIMMs, but it has obvious applicability to use
cases in which replacing DIMMs is impossible.
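As a sketch of how that could be automated, the helper below turns a list
of known-bad physical addresses into memmap=SIZE$START arguments (the
"reserve this region" form of the parameter), merging adjacent pages. The
function name, the 4 KiB page size, and the example addresses are
assumptions for illustration. Whether a given kernel honors memmap varies
by architecture, so check that for the target board, and note that the "$"
may need escaping in boot loader config files.

```python
PAGE = 4096  # assumed page size in bytes


def memmap_args(bad_addrs, page=PAGE):
    """Return memmap=SIZE$START strings reserving every page that
    contains a bad physical address, with adjacent pages merged."""
    pages = sorted({addr // page * page for addr in bad_addrs})
    args = []
    start = end = None
    for p in pages:
        if start is None:
            start, end = p, p + page
        elif p == end:                 # contiguous with the current region
            end += page
        else:
            args.append(f"memmap={(end - start) // 1024}K${start:#x}")
            start, end = p, p + page
    if start is not None:
        args.append(f"memmap={(end - start) // 1024}K${start:#x}")
    return args


# Two bad addresses in neighbouring pages collapse into one 8K region.
print(" ".join(memmap_args([0x12345678, 0x12346000])))
```

The resulting strings would be appended to the kernel command line.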
SSDs have a lot of redundancy and error-correction built into them already,
so maybe stock SLC SSDs will be fine? (Maybe RAID1 mirrored.)
I can also envision having multiple identical computers aboard. Apollo did
that and used 2-of-3 voting logic on the control lines coming from the
CPUs, but that sounds hard. However, having two or more computers and using
only one at a time would be pretty simple. You could have external logic
which would periodically switch between them, perhaps daily. If one of them
went casters-up, then your satellite would be off-line only temporarily,
until the next scheduled computer swap, at which time your satellite would
come back on-line, and "ground control" could decide what to do about the
problem computer: perhaps disable it permanently, or perhaps identify bad
memory, map it out, and run that machine with a bit less RAM.
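The scheduled-swap idea can be modeled in a few lines. This is only a
sketch of the logic, with made-up names (scheduled, online) and a daily
swap period assumed; the real thing would be external hardware, not
Python. It shows that a crashed unit costs you only its own scheduled
days until ground control disables it.

```python
from typing import Optional


def scheduled(day: int, n_computers: int,
              disabled=frozenset()) -> Optional[int]:
    """Round-robin over the computers ground control has left enabled."""
    enabled = [i for i in range(n_computers) if i not in disabled]
    return enabled[day % len(enabled)] if enabled else None


def online(day: int, n_computers: int, crashed,
           disabled=frozenset()) -> bool:
    """The satellite is up if the unit scheduled for that day still works."""
    unit = scheduled(day, n_computers, disabled)
    return unit is not None and unit not in crashed


# Two computers, unit 1 has crashed: the satellite is down every other day.
assert [online(d, 2, crashed={1}) for d in range(4)] == [True, False, True, False]

# After ground control disables unit 1, unit 0 runs every day.
assert all(online(d, 2, crashed={1}, disabled={1}) for d in range(4))
```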
Dave
On Mon, May 31, 2021 at 11:29 AM John Franklin via TriLUG <trilug at trilug.org>
wrote:
> On May 30, 2021, at 13:37, Charles West via TriLUG <trilug at trilug.org>
> wrote:
> > TL;DR: Are there software ways to harden a Raspberry Pi Zero against bit
> > flips?
> >
> > I've been looking into spacecraft design and found some interesting
> > things related to computing for space missions. The common way to do
> > computation is to have special hardened hardware that can handle a lot
> > more radiation. These things can mass kilograms and run at ~200 MHz
> > while costing $200,000+.
>
> Water is known to be a good radiation shield [1]. Putting a Pi in a hollow
> surrounded by several centimeters of water would lengthen the serviceable
> lifespan of the Pi. The problem is water is heavy, and getting it to orbit
> is expensive.
>
> Software tricks, such as parity pages of memory, would protect against bit
> flips in main memory, but that gets harder to do in L2/L3 cache, and near
> impossible to do with CPU registers, at least on a stock Pi. Also, that
> protects against bit flips, but not permanent damage to the DRAM resulting
> in stuck bits.
>
> NASA has a really cool flame experiment where they point a camera towards
> a pair of wires with loops at the end to hold a bit of flammable material
> and light it. One of the images from the experiment is here [2]. The dots
> in the background of the image aren’t stars. They're pixels of the CCD
> that were damaged by hard radiation.
>
> If there were a cheaper way, they would already be using it.
>
> jf
> [1] https://space.stackexchange.com/questions/1336/what-thickness-depth-of-water-would-be-required-to-provide-radiation-shielding-i
> [2] https://www.flickr.com/photos/nasamarshall/9935162654/
> --
> John Franklin
> franklin at elfie.org