[TriLUG] Dealing with bit flips from cosmic rays
Matt Flyer via TriLUG
trilug at trilug.org
Mon May 31 07:49:02 EDT 2021
I think trying to physically stop radiation and cosmic rays will be
futile. It would be like trying to catch a sneeze with a pasta
strainer. The mass required from dense materials, which are still
mostly empty space, would be astronomical.
Instead I can think of three things:
1) Go back to higher-voltage, slower logic. The old 5 V parts were a
lot more noise resilient than the current 1 V or lower parts that
switch in nanoseconds or even picoseconds.
2) Redundancy. There is a reason the space shuttle flew redundant
computers that voted on their results. The simplest form of this is
three units with a 2 out of 3 vote.
3) Software algorithms and processing power are such that hashing and
checksumming (and encrypting / decrypting) are a lot faster now, so I
think some sort of extra check values could add a lot of resilience.
Kind of like ECC memory but on steroids.
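
To make points 2 and 3 a bit more concrete, here is a rough Python
sketch, nowhere near flight software. It keeps three copies of a block
of data, each with a CRC32 check value appended, and falls back to a
bitwise 2 out of 3 vote if every copy fails its check. The names
protect, crc_ok and recover are made up for illustration; zlib.crc32
is from the Python standard library.

```python
import zlib

CRC_LEN = 4  # CRC32 is 4 bytes


def protect(data):
    """Return three separate mutable copies of data + CRC32 check value."""
    record = data + zlib.crc32(data).to_bytes(CRC_LEN, "big")
    return [bytearray(record) for _ in range(3)]


def crc_ok(record):
    """True if the stored check value still matches the payload."""
    payload = bytes(record[:-CRC_LEN])
    stored = bytes(record[-CRC_LEN:])
    return zlib.crc32(payload).to_bytes(CRC_LEN, "big") == stored


def recover(copies):
    """Return the original payload, or raise ValueError if it is lost."""
    # Fast path: trust any copy whose check value still matches.
    for copy in copies:
        if crc_ok(copy):
            return bytes(copy[:-CRC_LEN])
    # Slow path: every copy is damaged, so take a bitwise 2 out of 3
    # majority vote and verify the result against its own check value.
    a, b, c = copies
    voted = bytes((x & y) | (x & z) | (y & z) for x, y, z in zip(a, b, c))
    if crc_ok(voted):
        return voted[:-CRC_LEN]
    raise ValueError("uncorrectable: copies disagree beyond repair")


if __name__ == "__main__":
    copies = protect(b"hello pi zero")
    copies[0][0] ^= 0x01  # simulate a bit flip in each copy,
    copies[1][5] ^= 0x80  # each at a different position
    copies[2][9] ^= 0x04
    print(recover(copies))
```

Note this only protects data at rest. A flip in the code itself, or in
the CPU while it is doing the voting, is a different problem, which is
where the separate redundant computers come in.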
On Sun, 2021-05-30 at 13:37 -0400, Charles West via TriLUG wrote:
> Hello!
>
> TL;DR: Are there software ways to harden a Raspberry Pi Zero against
> bit flips?
>
> I've been looking into spacecraft design and found some interesting
> things related to computing for space missions. The common way to do
> computation is to have special hardened hardware that can handle a
> lot more radiation. These things can mass kilograms and run at
> ~200 MHz while costing $200,000+.
>
> This has been fine as long as launch costs are really high, but the
> hardware is likely to be a bigger part of the cost as launches get
> cheaper (looking at you, Starship). This has me wondering if you
> could shrink the Pi Zero form factor as much as possible (though it's
> already pretty small) and spend the mass on thick/cheap rad
> shielding. This would eliminate most of the low energy radiation.
>
> However, cosmic radiation is really hard to stop. You will sometimes
> get bit flips and more serious failures. If I may ask, do you have
> any ideas on how you could harden the system against software errors?
> Could you store/retrieve from RAM with forward error correction?
> Maybe scan and correct flash storage with forward error correction
> and then reboot occasionally?
>
> What do you think?
>
> Thanks,
> Charlie