Re: [PATCH 16/16 v6] PCI: document the new PCI boot parameters

From: Greg KH
Date: Sat Nov 08 2008 - 00:28:45 EST


On Sat, Nov 08, 2008 at 01:00:29PM +0800, Yu Zhao wrote:
> Greg KH wrote:
>> On Fri, Nov 07, 2008 at 04:35:47PM +0800, Zhao, Yu wrote:
>>> Greg KH wrote:
>>>> On Fri, Nov 07, 2008 at 04:17:02PM +0800, Zhao, Yu wrote:
>>>>>> Well, to do it "correctly" you are going to have to tell the driver to
>>>>>> shut itself down, and reinitialize itself.
>>>>>> Turns out, that doesn't really work for disk and network devices
>>>>>> without dropping the connection (well, network devices should be
>>>>>> fine probably).
>>>>>> So you just can't do this, sorry. That's why the BIOS handles all of
>>>>>> these issues in a PCI hotplug system.
>>>>>> How do the hardware people think we are going to handle this in
>>>>>> the OS? It's not something that any operating system can do. Is it
>>>>>> part of the IOV PCI spec somewhere?
>>>>> No, it's not part of the PCI IOV spec.
>>>>>
>>>>> I just want IOV (and the whole PCI subsystem) to have more
>>>>> flexibility across various BIOSes. So can we reconsider resource
>>>>> rebalancing as a boot option, or should we forget about this idea?
>>>> As you have proposed it, the boot option will not work at all, so I
>>>> think we need to forget about it. Especially if it is not really
>>>> needed.
>>> I guess at least one thing would work if people don't want to boot twice:
>>> give the bus number 0 as rebalance starting point, then all system
>>> resources would be reshuffled :-)
>> Hm, but don't we do that today with our basic resource reservation logic
>> at boot time? What would be different about this kind of proposal?
>
> The generic PCI core can do this, but the feature is effectively
> disabled by the low-level PCI code in x86. The low-level code tries to
> reserve resources according to the configuration from the BIOS. If the
> BIOS is wrong, the allocation fails and the generic PCI core can't
> repair it, because the bridge resources may already have been allocated
> by the low-level code and the PCI core can't expand them to find enough
> resources for the subordinates.

Yes, we do this on purpose.

> The proposal is to stop the x86 low-level PCI code from allocating
> resources according to the BIOS, so the PCI core can fully control
> resource allocation. The PCI core takes all resources from the BARs it
> knows about into account and configures the resource windows on the
> bridges according to its own calculation.

Ah, so you mean we should revert to the way we used to do x86 PCI
resource allocation, from about 8 years ago until about a year and a
half ago?

Hint, there was a reason why we switched over to using the BIOS instead
of doing it ourselves. Turns out we have to trust the BIOS here, as
that is exactly what other operating systems do. Trying to do it on our
own was too fragile and resulted in too many problems over time.

Go look at the archives for when this all was switched, you'll see the
reasons why.

So no, we will not be going back to the way we used to do things. We
changed for a reason :)

thanks,

greg k-h