Re: [BUG] Build error for 4.15-rc3 kernel caused by patch "kbuild: Add a cache for generated variables"

From: Masahiro Yamada
Date: Wed Dec 20 2017 - 23:02:28 EST


Hi Doug

2017-12-21 2:07 GMT+09:00 Doug Anderson <dianders@xxxxxxxxxxxx>:
> Hi,
>
> On Tue, Dec 19, 2017 at 6:29 PM, Masahiro Yamada
> <yamada.masahiro@xxxxxxxxxxxxx> wrote:
>> 2017-12-19 2:17 GMT+09:00 Doug Anderson <dianders@xxxxxxxxxxxx>:
>>> Hi,
>>>
>>> On Mon, Dec 18, 2017 at 7:50 AM, Masahiro Yamada
>>> <yamada.masahiro@xxxxxxxxxxxxx> wrote:
>>>> 2017-12-18 23:56 GMT+09:00 Masahiro Yamada <yamada.masahiro@xxxxxxxxxxxxx>:
>>>>> 2017-12-17 7:35 GMT+09:00 Yang Shi <yang.s@xxxxxxxxxxxxxxx>:
>>>>>> Hi folks,
>>>>>>
>>>>>> I just upgraded gcc to 6.4 on my CentOS 7 machine at Arnd's suggestion, but
>>>>>> I ran into the compile error below with the 4.15-rc3 kernel:
>>>>>>
>>>>>> In file included from ./include/uapi/linux/uuid.h:21:0,
>>>>>> from ./include/linux/uuid.h:19,
>>>>>> from ./include/linux/mod_devicetable.h:12,
>>>>>> from scripts/mod/devicetable-offsets.c:2:
>>>>>> ./include/linux/string.h:8:20: fatal error: stdarg.h: No such file or directory
>>>>>> #include <stdarg.h>
>>>>>>
>>>>>> I bisected to commit 3298b690b21cdbe6b2ae8076d9147027f396f2b1 ("kbuild: Add
>>>>>> a cache for generated variables"). Once I revert this commit, kernel build
>>>>>> is fine.
>>>>>>
>>>>>> gcc 4.8.5 builds the kernel fine with this commit.
>>>>>>
>>>>>> I'm not quite sure whether this is a bug or my gcc install is skewed.
>>>>>> It can build the kernel without that commit, but the commit might just
>>>>>> exacerbate an existing problem.
>>>>>>
>>>>>> Any hint is appreciated.
>>>>>
>>>>>
>>>>> Today, I was also hit by the same error
>>>>> when I was compiling linux-next.
>>>>> I am not sure why this error happens, but
>>>>> "make clean" will probably fix the problem.
>>>>>
>>>>> You need to do "make clean" to blow away .cache.mk
>>>>> when you upgrade your compiler.
>>>>> This is nasty, though...
>>>>>
>>>>
>>>>
>>>> I got it.
>>>>
>>>> The culprit is the following line in the top-level Makefile:
>>>>
>>>> NOSTDINC_FLAGS += -nostdinc -isystem $(call shell-cached,$(CC) -print-file-name=include)
>>>>
>>>>
>>>> If a stale result of -print-file-name is stored in the cache file,
>>>> -isystem points at the old compiler's include directory, so
>>>> the compiler fails to find <stdarg.h>.
>>>
>>> Nice catch! Do you have any idea how we can fix it? I suppose we
>>> could add a single (non-cached) call to CC somewhere in there to get
>>> CC's version and clobber the cache if the version changes. Is that
>>> the best approach here?
>>>
>>> In general I remember thinking about the gcc upgrade problem when I
>>> was first experimenting with the cache. At the time my assumption was
>>> that if someone updated their gcc then they really ought to be doing a
>>> clean anyway (I wasn't sure if the build system somehow enforced this,
>>> but I didn't think so). Doing an incremental build after a compiler
>>> upgrade just seems (to me) to be asking for trouble, or at
>>> the very least seems like it's not what the user wanted (if you update
>>> your compiler you almost certainly want it to be used to build all of
>>> your code, don't you?)
>>
>> I agree.
>> When you upgrade your compiler,
>> you need to remove not only cache files, but also all object files.
>> So, "make clean" is the most reasonable way.
>>
>>
>>> Even if it's wise to do a clean after a compiler upgrade, it still
>>> seems pretty non-ideal that a user has to decipher an arcane error
>>> like this, so it seems like we should see what we can do to detect
>>> this case for the user and help them out. Perhaps rather than
>>> clobbering the cache we should actually suggest that the user run a
>>> "make clean"?
>>>
>>
>> Right. I think it's a good thing to do.
>
> Are you planning on doing this, or is this something you'd like me to
> attempt? I'm a bit busy in the last two days before I go on Christmas
> break, but I can try to squeeze something like this in since the root
> of the issue is a patch that I authored. Let me know.

I am busy these days too.
Your contribution would be much appreciated.


> If this is something you'd like me to do, let me know if you think the
> right solution is to detect the problem and warn the user or if the
> right solution is to just blow away the cache. It would be up to you,
> but I'd tend to go the route of warning the user because:
>
> * The user should almost certainly do a "make clean" to really ensure
> no mismatch between object files.
>
> * I could imagine that trying to invoke "make clean" automatically
> might be complicated.

I agree with both.


When a compiler upgrade is detected,
we can terminate the build
with a hint message prompting users to run "make clean".
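
To make the idea concrete, here is a rough, untested sketch in plain GNU make.
It is not an actual kbuild patch. The names CC_VERSION_FILE and cc-version-text
are made up for illustration, and a real version would have to sit next to
.cache.mk and be removed by the clean rule. It asks the compiler for its
version without going through the cache, remembers it on the first build,
and stops with a hint if it differs later:

```make
# Hypothetical sketch, not the real kbuild change.
# CC_VERSION_FILE and cc-version-text are made-up names.
CC_VERSION_FILE := .cc-version

# Skip the check when "clean" is requested so the user can always recover.
ifeq ($(filter clean,$(MAKECMDGOALS)),)

# Deliberately uncached: query the compiler on every invocation.
cc-version-text := $(shell $(CC) --version 2>/dev/null | head -n 1)

ifeq ($(wildcard $(CC_VERSION_FILE)),)
# First build: remember which compiler was used.
$(shell echo '$(cc-version-text)' > $(CC_VERSION_FILE))
else ifneq ($(cc-version-text),$(shell cat $(CC_VERSION_FILE)))
$(error Compiler changed since the last build; run "make clean" first)
endif

endif

# Stand-in targets so the sketch can be tried in an empty directory.
all: ; @echo "compiler unchanged, build would proceed"
clean: ; rm -f $(CC_VERSION_FILE)
```

The check is skipped when "clean" is among the goals. Otherwise the user
could never run the "make clean" that the message asks for.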


>
>> BTW, could "sudo make install" or "sudo make modules_install"
>> add some cache entries with superuser privileges?
>>
>> (For example, if you run build targets with CROSS_COMPILE
>> but run install targets without CROSS_COMPILE,
>> the install targets will produce different cache entries.)
>>
>>
>> If so, "make clean" run as a normal user
>> cannot remove the cache files...
>
> Hrm. That doesn't sound nice. I guess this could be solved by
> something like your "no-compiler-targets" patch, but IIUC that didn't
> include "install" or "modules_install". I guess the other option would
> be to somehow detect "UID=0" specifically and not generate the cache?
>
> -Doug

That would be a solution.
We can skip cache generation for certain targets.
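
Something along these lines might work. Again, this is only an untested
sketch in plain GNU make. The names no-cache-targets and use-cache are
hypothetical, and the real shell-cached macro would still need to be taught
to honor such a switch. It turns the cache off when every requested goal is
an install target, or when make runs as root:

```make
# Hypothetical sketch: no-cache-targets and use-cache are made-up names.
no-cache-targets := install modules_install

use-cache := 1

# If goals were given and all of them are install targets, do not write
# cache entries. A bare "make" has empty MAKECMDGOALS and keeps the cache.
ifneq ($(MAKECMDGOALS),)
ifeq ($(filter-out $(no-cache-targets),$(MAKECMDGOALS)),)
use-cache :=
endif
endif

# Alternatively (or additionally), never write the cache as root.
ifeq ($(shell id -u),0)
use-cache :=
endif

# Stand-in targets that only report the decision.
all install modules_install: ; @echo "use-cache=$(use-cache)"
```

A mixed command line such as "make all install" keeps the cache enabled in
this sketch, since a build target is being run anyway.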


--
Best Regards
Masahiro Yamada