[TriLUG] Intel bug in the news today

Michael Wright via TriLUG trilug at trilug.org
Thu Jan 4 19:52:45 EST 2018


> Variation 3 is the Intel specific variation which the kernel mitigates through KPTI.

Small addendum: Variation 3 (Meltdown) appears to also affect Apple's
mobile SoCs, given that they patched iOS but not watchOS[1]. The key
point is that it's not ISA-specific but implementation-specific: it
depends on how each vendor builds its speculative execution pipeline.
Given that, it's certainly possible that other SoC manufacturers are
vulnerable (e.g. Samsung's Exynos line), even though the initial paper
didn't have any success with an ARM chipset.
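
For anyone who wants to check a particular box rather than guess from
the vendor: a minimal sketch, assuming a kernel new enough (4.15, or a
distro backport) to expose /sys/devices/system/cpu/vulnerabilities. The
helper names classify() and report() are mine and purely illustrative;
on older kernels the directory is simply absent and the script prints
nothing.

```python
from pathlib import Path

# Directory where newer Linux kernels expose one status file per CPU
# vulnerability (e.g. "meltdown", "spectre_v1", "spectre_v2").
# Assumption: the running kernel actually provides it.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def classify(status: str) -> str:
    """Map one sysfs status line to a coarse category."""
    text = status.strip()
    if text.startswith("Not affected"):
        return "not affected"
    if text.startswith("Mitigation:"):
        return "mitigated"
    if text.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def report(vuln_dir: Path = VULN_DIR) -> dict:
    """Return {name: (category, raw status)} for each file in vuln_dir."""
    if not vuln_dir.is_dir():
        return {}  # kernel too old, or not Linux
    result = {}
    for entry in sorted(vuln_dir.iterdir()):
        if entry.is_file():
            raw = entry.read_text().strip()
            result[entry.name] = (classify(raw), raw)
    return result


if __name__ == "__main__":
    for name, (category, raw) in report().items():
        print(f"{name:12} {category:13} {raw}")
```

On a machine with KPTI enabled I would expect the meltdown entry to read
something like "Mitigation: PTI", but treat the exact strings as
kernel-version dependent.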

[1]: https://support.apple.com/en-us/HT208394

Michael


On Thu, Jan 4, 2018 at 11:25 PM, Michael Wright <mdwrigh2 at ncsu.edu> wrote:
>> Matt, it appears that this *problem* only affects Intel CPUs.
>
> This is not strictly true; yesterday's announcement was about a new
> vector of attack but there were three variations of it with working
> PoCs[1]. Variations 1 and 2 are referred to as Spectre in their
> original paper, and variation 3 is known as Meltdown in its original
> paper, so you may see those names for the vulnerabilities as well.
>
> Variation 1 is applicable to *all* modern processors (AMD, Intel, ARM,
> etc), but only affects OS code and applications which execute
> untrusted code (e.g. JavaScript in the browser, eBPF in the kernel,
> etc). Mitigation is basically recompilation with a patched compiler
> that doesn't emit vulnerable code. That is expected to have a small to
> negligible impact on performance in the long run, but for the moment
> the stopgap mitigations have real costs (e.g. a 10-20% memory increase
> for Chrome due to site isolation, default disabling of the
> SharedArrayBuffer feature, etc).
>
> Variation 2 has so far only been demonstrated on Intel processors, and
> AMD claims they have a "near zero risk of exploitation", but it's
> still an open question at the moment. This can be fixed with basically
> no performance impact via a microcode update.
>
> Variation 3 is the Intel specific variation which the kernel mitigates
> through KPTI. The patchset mitigating this has been merged upstream[2]
> and is disabled by default for AMD[3]. It was also specifically
> requested to be a config option by Linus[4] to avoid living with this
> performance burden forever. That being said, I think the performance
> concerns for most applications are overblown; you'll see a large
> performance hit for anything with frequent syscalls or
> interrupts, but the vast majority of compute intensive operations
> specifically avoid syscalls anyways because they're kind of slow. From
> Google's blog post[5]:
>
> "There has been speculation that the deployment of KPTI causes
> significant performance slowdowns. Performance can vary, as the impact
> of the KPTI mitigations depends on the rate of system calls made by an
> application. On most of our workloads, including our cloud
> infrastructure, we see negligible impact on performance.
> In our own testing, we have found that microbenchmarks can show an
> exaggerated impact. Of course, Google recommends thorough testing in
> your environment before deployment; we cannot guarantee any particular
> performance or operational impact."
>
> Hope this helps clear up some of the confusion,
> Michael
>
> [1]: https://googleprojectzero.blogspot.com/2018/01/reading-privileged-memory-with-side.html
> [2]: https://lkml.org/lkml/2017/12/28/748
> [3]: https://lkml.org/lkml/2017/12/27/2
> [4]: https://lkml.org/lkml/2018/1/3/797
> [5]: https://security.googleblog.com/2018/01/more-details-about-mitigations-for-cpu_4.html
>
> On Thu, Jan 4, 2018 at 10:25 PM, David Burton via TriLUG
> <trilug at trilug.org> wrote:
>> Matt, it appears that this *problem* only affects Intel CPUs.
>>
>> I hope that is true of the *fix*, as well. It seems possible that a
>> workaround to solve this problem on Intel CPUs *could* clobber performance
>> on both Intel and AMD CPUs.
>>
>> Probably not. If the performance hit is as big as they're saying, I would
>> hope that fix implementers would write in processor-specific checks to
>> ensure that AMD CPUs (and the newest Intel CPUs) aren't impacted.
>>
>> Here's a statement from Lenovo:
>> https://support.lenovo.com/us/en/solutions/len-18282
>>
>> And please accept my apology for mixing two topics in the same email, and
>> thus making a hash of this thread.  😞
>>
>> Dave
>>
>>
>> On Thu, Jan 4, 2018 at 8:09 AM, Matt Flyer via TriLUG <trilug at trilug.org>
>> wrote:
>>
>>> Back in the late 90's I was working on a masters in computer
>>> engineering.  They put a lot of emphasis on a technique called "score
>>> boarding" that would look forward into the execution path and determine
>>> if there were dependencies - either code or data and rearrange
>>> execution of non dependent items to fill bubbles in the pipeline
>>> process.  For example, if it had to do a multiply operation that would
>>> take 3 micro-clock cycles it would pull non dependent opcodes into the
>>> processor registers and fill the gaps.
>>>
>>> It sounds like the technique has advanced to where it attempts to guess
>>> at the dependency value and unwind the operation when it gets it wrong
>>> as a means of getting more ergs out of this type of process.
>>> Unfortunately, it looks like there is a fundamental design flaw that
>>> all the manufacturers adopted.
>>>
>>>  On Thu, 4 Jan 2018 05:33:17 -0500
>>> Steve Holton <sph0lt0n at gmail.com> wrote:
>>>
>>> > This is probably the best one-paragraph summary we're likely to find
>>> > at this point.
>>> >
>>> > From: https://security.googleblog.com/2018/01/todays-cpu-vulnerability-what-you-need.html
>>> >
>>> > In order to improve performance, many CPUs may choose to speculatively
>>> > execute instructions based on assumptions that are considered likely
>>> > to be true. During speculative execution, the processor is verifying
>>> > these assumptions; if they are valid, then the execution continues.
>>> > If they are invalid, then the execution is unwound, and the correct
>>> > execution path can be started based on the actual conditions. It is
>>> > possible for this speculative execution to have side effects which
>>> > are not restored when the CPU state is unwound, and can lead to
>>> > information disclosure.
>>>
>>
>>
>> On Wed, Jan 3, 2018 at 1:54 PM, David Burton <ncdave4life at gmail.com> wrote:
>>
>>> ...
>>> Now let's talk about a possible *real* problem. Does anyone know anything
>>> about what the big Intel bug in the news today is? Breathless headlines say
>>> the fix could slow some workloads by up to 30%:
>>>
>>>    - https://www.pcmag.com/news/358249/intel-chips-have-a-major-design-flaw-and-the-fix-means-slowe
>>>    - https://hothardware.com/news/intel-cpu-bug-kernel-memory-isolation-linux-windows-macos
>>>    - https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/
>>>    - http://pythonsweetness.tumblr.com/post/169166980422/the-mysterious-case-of-the-linux-page-table
>>>
>>>
>>> Dave
>>>

