Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok

From: Alan Cox
Date: Sat Jan 06 2018 - 18:05:44 EST


> Even if it would be practical the speed probably going to be in bytes per second,
> so to read anything meaningful an attack detection techniques (that people
> are actively working on) will be able to catch it.
> At the end security cannot be absolute.
> The current level of paranoia shouldn't force us to make hastily decisions.

I think there are at least three overlapping problem spaces here:

1. This is a new field. That could mean that it turns out to be
really hard and everyone discovers that eBPF was pretty much the only
interesting attack. It could also mean we are going to see several years
of refinement by evil geniuses all over the world and what we see now is
the tip of the iceberg in cleverness.

2. It is very, very complicated to answer a question like "is
sequence x safe on all of a vendor's microprocessors", even for the vendor.

3. A lot of people outside of the professional security space are
very new to the concept of security economics and risk management as
opposed to seeing the fake binary nice green tick that says their
computers are secure that they can pass to their senior management or
show to audit.

> So how about we do array_access() macro similar to above by default
> with extra CONFIG_ to convert it to lfence ?

We don't have to decide today. Intel's current position is 'lfence'. Over
time we may see vendors provide more sequences. We will see vendors add
new instruction hints and the like (See the ARM whitepaper for example)
and the array access code will change and change again for the better.

The important thing is that there is something clean that works on all
architectures and can be used today, and that doesn't keep forcing
everyone to change drivers when new/better ways to do the speculation
management appear.
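
As a rough illustration of that point, the sketch below hides the choice
between the "and" (masking) approach and a fence behind a single helper, so
callers would not need to change if the preferred sequence changes later.
Everything here is an assumption for illustration: the names idx_mask(),
array_ptr_sketch() and SKETCH_USE_LFENCE are hypothetical and are not the
interface from this patch series, and the mask trick relies on arithmetic
right shift of negative values, which C leaves implementation-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns ~0UL when idx < size and 0 otherwise,
 * computed without a conditional branch in the source. Assumes
 * size <= LONG_MAX and an arithmetic right shift for signed long.
 */
static inline unsigned long idx_mask(unsigned long idx, unsigned long size)
{
	return ~(long)(idx | (size - 1UL - idx)) >> (sizeof(long) * CHAR_BIT - 1);
}

/*
 * Hypothetical accessor: callers use this one entry point, and the
 * speculation management behind it can be swapped without touching them.
 */
static inline const int *array_ptr_sketch(const int *base,
					  unsigned long idx,
					  unsigned long size)
{
	assert(size <= (unsigned long)LONG_MAX);	/* idx_mask() precondition */

	if (idx >= size)
		return NULL;

#if defined(SKETCH_USE_LFENCE) && defined(__x86_64__)
	/* Fence variant: wait for the bounds check to resolve first. */
	__asm__ __volatile__("lfence" ::: "memory");
	return &base[idx];
#else
	/* "and" variant: a mispredicted out-of-range idx is clamped to 0. */
	return &base[idx & idx_mask(idx, size)];
#endif
}
```

A caller would write something like array_ptr_sketch(table, i, nr_entries)
and check the result for NULL. Whether a given compiler keeps the mask
computation branch-free is not guaranteed by this sketch.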

> Why default to AND approach instead of lfence ?
> Because the kernel should still be usable. If security
> sacrifices performance so much such security will be turned off.
> Ex: kpti suppose to add 5-30%. If it means 10% on production workload
> and the datacenter capacity cannot grow 10% overnight, kpti will be off.

The differences between "lfence", "and", and the code as it was before
are not likely to be anywhere near that order of magnitude. As I said I want to
take a hard look at the IPv4/6 ones but most of them are not in places
where you'd expect a lot of data to be in flight in a perf critical path.

Alan