Re: Bricked x86 CPU with software?

From: Tim Mouraveiko
Date: Thu Jan 04 2018 - 20:20:01 EST


> On Thu 2018-01-04 14:13:56, Tim Mouraveiko wrote:
> > > > As I mentioned before, I repeatedly and fully power-cycled the motherboard and reset BIOS
> > > > and etc. It made no difference. I can see that the processor was not drawing any power. The
> > > > software code behaved in a similar fashion on other processors, until I fixed it so that it would
> > > > not kill any more processors.
> > > >
> > >
> > > So you have code that killed more than one processor? Save it! We want
> > > a copy.
> > >
> > > Do you have model numbers of affected CPUs?
> >
> >
> > Why would you want a copy? Last time I checked bricked CPUs do not work well, even as
> > decorations.
> >
> > I believe the processors were Intel Xeon series. The code would likely run on others too.
>
> Well... Intel's shares are overpriced, and you have code to fix that
> :-).
>
> Actually... I don't think your code works. That's why I'm curious. But
> if it works, its rather a big news... and I'm sure Intel and cloud
> providers are going to be interested.
>

I first discovered this issue over a year ago, quite by accident. I changed the code I was
working on so as not to kill the CPU (as that is not what I was trying to do). We made Intel aware
of it. They didn't care much; one of their personnel suggested that they already knew about it
(whether this is true or not I couldn't say). It popped up again later, so I had to fix the code
again. It could be a buggy implementation of a certain x86 functionality, but I left it at that
because I had better things to do with my time.

Now this news came up about Meltdown and Spectre and I was curious if anyone else had
experienced a CPU killed by software, too. Meltdown and Spectre are undeniably a problem,
but their magnitude and practicality are questionable.

I suspect that what I discovered is either a kill switch or an unintentional flaw that was
implemented at the time the original feature was built into x86 functionality and kept
propagating through successive generations of processors, or it could well be that I have a
very destructive and targeted solar flare that is after my CPUs. So, I figured I would put the
question out there, to see if anyone else had a similar experience. Putting the solar flare idea
aside, I can't conclusively say whether it is a flaw or a feature. Both options are supported at
this time by my observations of the CPU behavior.