Re: [PATCH v2 0/4] dedupe smb unicode files

From: Dr. David Alan Gilbert
Date: Fri Jul 21 2023 - 09:26:32 EST


* Tom Talpey (tom@xxxxxxxxxx) wrote:
> On 7/20/2023 7:57 PM, Dr. David Alan Gilbert wrote:
> > * Tom Talpey (tom@xxxxxxxxxx) wrote:
> > > On 7/19/2023 6:06 PM, Dave Kleikamp wrote:
> > > > On 7/19/23 4:58PM, Dr. David Alan Gilbert wrote:
> > > > > * Steve French (smfrench@xxxxxxxxx) wrote:
> > > > > > The related question is which tree to send it from, if no problems
> > > > > > reported (presumably mine since it mostly affects cifs.ko and ksmbd.ko,
> > > > > > and because there hasn't been activity in fs/nls for years)
> > > > >
> > > > > That was my hope, given that ~half of the patches are directly on that
> > > > > code, and it's the only very active tree this touches as far as I can
> > > > > tell.
> > > > >
> > > > > > On Wed, Jul 19, 2023 at 12:56 PM Steve French
> > > > > > <smfrench@xxxxxxxxx> wrote:
> > > > > > >
> > > > > > > No objections to this on my part.  If Shaggy is ok with the JFS
> > > > > > > change, we could target it for 6.6-rc1 if it tests out ok
> > > >
> > > > For the series:
> > > > Reviewed-by: Dave Kleikamp <dave.kleikamp@xxxxxxxxxx>
> > > >
> > > > Steve,
> > > > Feel free to pull in even the 4th patch into your tree with my consent.
> > > > Or if you're more comfortable, I could submit it after yours hits
> > > > mainline.
> > > >
> > > > Shaggy
> > >
> > > The changes look good to me but there is one quirk with the
> > > copyrights and SPDX in patch 2.
> > >
> > > In the new fs/nls/nls_ucs2_utils.c, the SPDX line changes from
> > > a "/* ... */" form to "// ...", which may be a proper update, but
> > > then partway down, adds the same SPDX in "/* ... */" form. These
> > > should at least be consistent.
> > >
> > > > +++ b/fs/nls/nls_ucs2_utils.c
> > > > @@ -1,19 +1,25 @@
> > > > -/* SPDX-License-Identifier: GPL-2.0-or-later */
> > > > +// SPDX-License-Identifier: GPL-2.0-or-later
> > >
> > > vs
> > >
> > > > +++ b/fs/nls/nls_ucs2_utils.h
> > > > @@ -0,0 +1,297 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> >
> > Yeah, that's an easy fix. So it's just that the .h has
> > the older /* form where I'd fixed up the .c?
>
> Yep, for consistency that sounds good.
>
> > > Second, the copyright in fs/nls/nls_ucs2_utils.c is a bit of
> > > a mash-up (adding 2009 especially).
> > >
> > > I think it's better to keep the exact text of both copyrights,
> > > perhaps with a note as to which files had them previously, and
> > > adding some new note/blank line to separate the recent contributions
> > > from Namjae and you from the ancient history.
> >
> > How about the following:
> >
> > * This file has taken chunks from a few other files
> > * smb/server/uniupr.h had the declaration:
>
> These two lines above aren't needed, because the lines below
> contain the copyright and where they originated. So just omit
> the two above.

OK, will do; although the reason I included them was because the
path included in the declaration isn't the file the declaration is
currently in.

> > *
> > * Some of the source code in this file came from fs/cifs/uniupr.h
> > * Copyright (c) International Business Machines Corp., 2000,2002
> > *
> > * fs/smb/server/unicode.c had the declaration:
>
> And this one above - not needed.
>
> > *
> > * Some of the source code in this file came from fs/cifs/cifs_unicode.c
> > *
> > * Copyright (c) International Business Machines Corp., 2000,2009
> > * Modified by Steve French (sfrench@xxxxxxxxxx)
> > * Modified by Namjae Jeon (linkinjeon@xxxxxxxxxx)
> > *
> >
> > I haven't added the extra line above Namjae's line, since it's now
> > a straight copy from the unicode.c entry.
>
> Straight copy is what's important. No deletion, no edit in a copyright.
> So, ok.
>
> > I'm not particularly fussed about adding my own line unless you think
> > it's needed; git keeps better history!
>
> In fact, since you technically didn't add any code, just deleted,
> moved or renamed, I think it might be best to leave yourself out.

Oh that question could keep copyright lawyers occupied for years :-)

> But, totally your choice.

Thanks

I'll post a modified version of this patch in a few hours.

Dave

> Tom.
>
> > > > +++ b/fs/nls/nls_ucs2_utils.c
> > > > ...
> > > > - * Some of the source code in this file came from fs/cifs/uniupr.h
> > > > - * Copyright (c) International Business Machines Corp., 2000,2002
> > > > - *
> > > > - * uniupr.h - Unicode compressed case ranges
> > > > + * Some of the source code in this file came from fs/cifs/cifs_unicode.c
> > > > + * via fs/smb/unicode.c and fs/smb/uniupr.h and fs/cifs/uniupr.h
> > > > + * Copyright (c) International Business Machines Corp., 2000,2002,2009
> > > > + * Modified by Steve French (sfrench@xxxxxxxxxx)
> > > > + * Modified by Namjae Jeon (linkinjeon@xxxxxxxxxx)
> > > > + * Modified by Dr. David Alan Gilbert <linux@xxxxxxxxxxx>
> > >
> > > Apart from considering these:
> > >
> > > Reviewed-by: Tom Talpey <tom@xxxxxxxxxx>
> >
> > Thanks!
> >
> > Dave
> >
> > > Nice work!
> > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > Dave
> > > > >
> > > > > > > On Wed, Jul 12, 2023 at 6:28 PM Dr. David Alan Gilbert
> > > > > > > <dave@xxxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > > > * linux@xxxxxxxxxxx (linux@xxxxxxxxxxx) wrote:
> > > > > > > > > From: "Dr. David Alan Gilbert" <linux@xxxxxxxxxxx>
> > > > > > > > >
> > > > > > > > > The smb client and server code have (mostly) duplicated code
> > > > > > > > > for unicode manipulation, in particular upper case handling.
> > > > > > > > >
> > > > > > > > > Flatten this lot into shared code.
> > > > > > > >
> > > > > > > > Gentle two week ping on this please.
> > > > > > > >
> > > > > > > > Dave
> > > > > > > >
> > > > > > > > (Apologies to the 3 of you who already got a copy of this ping,
> > > > > > > > resent due to a missing header ',' )
> > > > > > > >
> > > > > > > > > There's some code that's slightly different between the two, and
> > > > > > > > > I've not attempted to share that - this should be strictly a no
> > > > > > > > > behaviour change set.
> > > > > > > > >
> > > > > > > > > In addition, the same tables and code are shared in jfs, however
> > > > > > > > > there's very little testing available for the unicode in there,
> > > > > > > > > so just share the raw data tables.
> > > > > > > > >
> > > > > > > > > I suspect there's more UCS-2 code that can be shared, in the NLS code
> > > > > > > > > and in the UCS-2 code used by the EFI interfaces.
> > > > > > > > >
> > > > > > > > > Lightly tested with a module and a monolithic build, and just mounting
> > > > > > > > > itself.
> > > > > > > > >
> > > > > > > > > This dupe was found using PMD:
> > > > > > > > >    https://pmd.github.io/pmd/pmd_userdocs_cpd.html
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > > > Version 2
> > > > > > > > >    Moved the shared code to fs/nls after v1 feedback.
> > > > > > > > >    Renamed shared tables from Smb to Nls prefix
> > > > > > > > >    Move UniStrcat as well
> > > > > > > > >    Share the JFS tables
> > > > > > > > >
> > > > > > > > > Dr. David Alan Gilbert (4):
> > > > > > > > >    fs/smb: Remove unicode 'lower' tables
> > > > > > > > >    fs/smb: Swing unicode common code from smb->NLS
> > > > > > > > >    fs/smb/client: Use common code in client
> > > > > > > > >    fs/jfs: Use common ucs2 upper case table
> > > > > > > > >
> > > > > > > > >   fs/jfs/Kconfig               |   1 +
> > > > > > > > >   fs/jfs/Makefile              |   2 +-
> > > > > > > > >   fs/jfs/jfs_unicode.h         |  17 +-
> > > > > > > > >   fs/jfs/jfs_uniupr.c          | 121 -------------
> > > > > > > > >   fs/nls/Kconfig               |   8 +
> > > > > > > > >   fs/nls/Makefile              |   1 +
> > > > > > > > >   fs/nls/nls_ucs2_data.h       |  15 ++
> > > > > > > > >   fs/nls/nls_ucs2_utils.c      | 144 +++++++++++++++
> > > > > > > > >   fs/nls/nls_ucs2_utils.h      | 285 ++++++++++++++++++++++++++++++
> > > > > > > > >   fs/smb/client/Kconfig        |   1 +
> > > > > > > > >   fs/smb/client/cifs_unicode.c |   1 -
> > > > > > > > >   fs/smb/client/cifs_unicode.h | 330 +----------------------------------
> > > > > > > > >   fs/smb/client/cifs_uniupr.h  | 239 -------------------------
> > > > > > > > >   fs/smb/server/Kconfig        |   1 +
> > > > > > > > >   fs/smb/server/unicode.c      |   1 -
> > > > > > > > >   fs/smb/server/unicode.h      | 325 +---------------------------------
> > > > > > > > >   fs/smb/server/uniupr.h       | 268 ----------------------------
> > > > > > > > >   17 files changed, 467 insertions(+), 1293 deletions(-)
> > > > > > > > >   delete mode 100644 fs/jfs/jfs_uniupr.c
> > > > > > > > >   create mode 100644 fs/nls/nls_ucs2_data.h
> > > > > > > > >   create mode 100644 fs/nls/nls_ucs2_utils.c
> > > > > > > > >   create mode 100644 fs/nls/nls_ucs2_utils.h
> > > > > > > > >   delete mode 100644 fs/smb/client/cifs_uniupr.h
> > > > > > > > >   delete mode 100644 fs/smb/server/uniupr.h
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > 2.41.0
> > > > > > > > >
> > > > > > > > --
> > > > > > > >   -----Open up your eyes, open up your mind, open up your code -------
> > > > > > > > / Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
> > > > > > > > \        dave @ treblig.org |                               | In Hex /
> > > > > > > >   \ _________________________|_____ http://www.treblig.org    |_______/
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Steve
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Thanks,
> > > > > >
> > > > > > Steve
> > > >
--
-----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert | Running GNU/Linux | Happy \
\ dave @ treblig.org | | In Hex /
\ _________________________|_____ http://www.treblig.org |_______/