Re: [PATCH v2 0/4] riscv: enable HAVE_LD_DEAD_CODE_DATA_ELIMINATION

From: Conor Dooley
Date: Wed Jun 21 2023 - 12:42:51 EST


On Wed, Jun 21, 2023 at 07:53:59AM -0700, Palmer Dabbelt wrote:
> On Tue, 20 Jun 2023 17:13:17 PDT (-0700), Palmer Dabbelt wrote:
> > On Tue, 20 Jun 2023 14:08:33 PDT (-0700), Palmer Dabbelt wrote:
> >> On Tue, 20 Jun 2023 13:47:07 PDT (-0700), ndesaulniers@xxxxxxxxxx wrote:
> >>> On Tue, Jun 20, 2023 at 4:41 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
> >>>>
> >>>> On Tue, 20 Jun 2023 13:32:32 PDT (-0700), ndesaulniers@xxxxxxxxxx wrote:
> >>>> > On Tue, Jun 20, 2023 at 4:13 PM Conor Dooley <conor@xxxxxxxxxx> wrote:
> >>>> >>
> >>>> >> On Tue, Jun 20, 2023 at 04:05:55PM -0400, Nick Desaulniers wrote:
> >>>> >> > On Mon, Jun 19, 2023 at 6:06 PM Palmer Dabbelt <palmer@xxxxxxxxxxx> wrote:
> >>>> >> > > On Thu, 15 Jun 2023 06:54:33 PDT (-0700), Palmer Dabbelt wrote:
> >>>> >> > > > On Wed, 14 Jun 2023 09:25:49 PDT (-0700), jszhang@xxxxxxxxxx wrote:
> >>>> >> > > >> On Wed, Jun 14, 2023 at 07:49:17AM -0700, Palmer Dabbelt wrote:
> >>>> >> > > >>> On Tue, 23 May 2023 09:54:58 PDT (-0700), jszhang@xxxxxxxxxx wrote:
> >>>> >>
> >>>> >> > > >> Commit 3b90b09af5be ("riscv: Fix orphan section warnings caused by
> >>>> >> > > >> kernel/pi") touches vmlinux.lds.S, so to make the merge easy, this
> >>>> >> > > >> series is based on 6.4-rc2.
> >>>> >> > > >
> >>>> >> > > > Thanks.
> >>>> >> > >
> >>>> >> > > Sorry to be so slow here, but I think this is causing LLD to hang on
> >>>> >> > > allmodconfig. I'm still getting to the bottom of it, there's a few
> >>>> >> > > other things I have in flight still.
> >>>> >> >
> >>>> >> > Confirmed with v3 on mainline (linux-next is pretty red at the moment).
> >>>> >> > https://lore.kernel.org/linux-riscv/20230517082936.37563-1-falcon@xxxxxxxxxxx/
> >>>> >>
> >>>> >> Just FYI Nick, there's been some concurrent work here from different
> >>>> >> people working on the same thing & the v3 you linked (from Zhangjin) was
> >>>> >> superseded by this v2 (from Jisheng).
> >>>> >
> >>>> > Ah! I've been testing the deprecated patch set, sorry I just looked on
> >>>> > lore for "dead code" on riscv-linux and grabbed the first thread,
> >>>> > without noticing the difference in authors or new version numbers for
> >>>> > distinct series. ok, nevermind my noise. I'll follow up with the
> >>>> > correct patch set, sorry!
> >>>>
> >>>> Ya, I hadn't even noticed the v3 because I pretty much only look at
> >>>> patchwork these days. Like we talked about in IRC, I'm going to go test
> >>>> the merge of this one and see what's up -- I've got it staged at
> >>>> <https://git.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git/commit/?h=for-next&id=1bd2963b21758a773206a1cb67c93e7a8ae8a195>,
> >>>> though that won't be a stable hash if it's actually broken...
> >>>
> >>> Ok, https://lore.kernel.org/linux-riscv/20230523165502.2592-1-jszhang@xxxxxxxxxx/
> >>> built for me. If you're seeing a hang, please let me know what
> >>> version of LLD you're using and I'll build that tag from source to see
> >>> if I can reproduce, then bisect if so.
> >>>
> >>> $ ARCH=riscv LLVM=1 /usr/bin/time -v make -j128 allmodconfig vmlinux
> >>> ...
> >>> Elapsed (wall clock) time (h:mm:ss or m:ss): 2:35.68
> >>> ...
> >>>
> >>> Tested-by: Nick Desaulniers <ndesaulniers@xxxxxxxxxx> # build
> >>
> >> OK, it triggered enough of a rebuild that it might take a bit for
> >> anything to filter out.
> >
> > I'm on LLVM 16.0.2
> >
> > $ git describe
> > llvmorg-16.0.2
> > $ git log | head -n1
> > commit 18ddebe1a1a9bde349441631365f0472e9693520
> >
> > that seems to hang for me -- or at least run for an hour without
> > completing, so I assume it's hung. I'm not wed to 16.0.2, it just
> > happens to be the last time I bumped the toolchain. I'm moving to
> > 16.0.5 to see if that changes anything.
>
> That also takes at least an hour to link. I tried running on LLVM trunk
> from last night
>
> $ git log | head -n1
> commit 5e9173c43a9b97c8614e36d6f754317f731e71e9
>
> and that completed. Just as a curiosity I tried to re-spin it to see
> how long it takes, and it's been running for 23 minutes so far.

After some misdirection through stupid user error, I have also
reproduced this for an LLVM=1 build w/ llvmorg-16.0.0.

> So I'm no longer actually sure there's a hang, just something slow.
> That's even more of a grey area, but I think it's sane to call a 1-hour
> link time a regression -- unless it's expected that this is just very
> slow to link?

I dunno, if it was only a thing for allyesconfig, then whatever - but
it's gonna significantly increase build times for any large kernels if LLD
is this much slower than LD. Regression in my book.

I'm gonna go and experiment with mixed toolchain builds, I'll report
back.
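
A rough sketch of how such a comparison could be timed, not necessarily
how it was done in this thread. It wraps a command in timeout(1) so that
a link that never finishes is reported rather than waited on forever.
The helper name run_limited, the 2h cutoff, and the cross-linker name
riscv64-linux-gnu-ld are assumptions for illustration; overriding LD on
the command line alongside LLVM=1 is likewise assumed to behave as a
normal make variable override.

```shell
# Run a command under a time limit and report whether it finished.
# GNU timeout(1) exits with status 124 when the limit is reached.
# run_limited is a hypothetical helper name, not part of Kbuild.
run_limited() {
    limit="$1"; shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "gave up after $limit: $*"
    else
        echo "exited with status $status: $*"
    fi
    return "$status"
}

# Possible usage (not executed here; linker name and cutoff are assumptions):
#   run_limited 2h make ARCH=riscv LLVM=1 -j"$(nproc)" allmodconfig vmlinux
#   run_limited 2h make ARCH=riscv LLVM=1 LD=riscv64-linux-gnu-ld \
#       -j"$(nproc)" allmodconfig vmlinux
```

Switching LD between runs is likely to trigger a reconfigure and a large
rebuild, so separate output directories (O=...) may give cleaner numbers.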

Cheers,
Conor.
