Re: [RFC PATCH v2 1/1] kernfs: replace deprecated strlcpy() with strscpy()

From: Mirsad Todorovac
Date: Sat Nov 25 2023 - 18:33:40 EST


Hello, Mr. Tejun Heo,

On 11/25/23 20:35, Tejun Heo wrote:
Hello,

On Thu, Nov 23, 2023 at 12:37:03AM +0100, Mirsad Todorovac wrote:
...
141 static int kernfs_path_from_node_locked(struct kernfs_node *kn_to,
142 struct kernfs_node *kn_from,
143 char *buf, size_t buflen)
...
172 /* Calculate how many bytes we need for the rest */
173 for (i = depth_to - 1; i >= 0; i--) {
174 for (kn = kn_to, j = 0; j < i; j++)
175 kn = kn->parent;
176 len += strscpy(buf + len, "/",
177 len < buflen ? buflen - len : 0);
178 len += strscpy(buf + len, kn->name,
179 len < buflen ? buflen - len : 0);
180 }
181
182 return len;
183 }
...
This is safe, for we see that in case of count == 0 strscpy() just like
strlcpy() turns to a virtual NOP.

The conversion itself isn't dangerous but it changes the return value of the
function. The comment is not updated and the callers are still assuming that
the function returns full length when the buffer is too short. e.g. Take a
look at cgroup_show_path(). All those paths seem safe but the code is more
more confusing because the conversions are half-way. I'm not necessarily
against the conversion but the benefit to risk / churn ratio doesn't seem
too attractive. If you wanna push this through, please make the conversion
complete including the comments and the callers and include a short summary
of the changes and why they're safe in the commit message.

Thanks.

Thank you kindly for your review of my lengthy analysis of the return value.

I apologise for another long email, but most of it is the code you already
understand or wrote, so it should be easy for you to read.

Apparently, what troubles you is:

1. Lack of comment that the return value of kernfs_name_locked() changed for
strlen(kn->node) >= buflen. This is noted and it is a legitimate concern.

54 static int kernfs_name_locked(struct kernfs_node *kn, char *buf, size_t buflen)
55 {
56 size_t len;
57
58 if (!kn)
59 return strscpy(buf, "(null)", buflen);
60
61 len = strscpy(buf, kn->parent ? kn->name : "/", buflen);
62
63 if (unlikely(len == -E2BIG)) {
64 return buflen - 1;
65 } else
66 return len;
67 }

Returning strlen(kn->name) that is greater than buflen really does not make any
sense IMHO, as by the first character after buflen we already know we are in
overrun.

Maybe the comment should be like this:

/*
* The function kernfs_name_locked() returns the name of kernfs_node *kn truncated
* to buflen - 1 or a "/" if it is the root node.
*
* The resulting buffer buf is in any case zero-terminated, and the value returned
* is the number of actual characters copied (excluding the termination NUL byte).
*/

Would this be clear enough in your opinion?

2. The example with cgroup_show_path() that is of your concern. I have carefully
examined it and there is nothing to worry about.

So, the stacktrace goes like this:

kernel/cgroup/cgroup.c:1879: cgroup_show_path(sf, kf_node, kf_root)
1893: len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);

fs/kernfs/dir.c:228: kernfs_path_from_node(to, from, buf, buflen)
235: ret = kernfs_path_from_node_locked(to, from, buf, buflen);

fs/kernfs/dir.c:141: kernfs_path_from_node_locked(kn_to, kn_from, buf, buflen)

Let us analyse the stacktrace:

kernel/cgroup/cgroup.c:
=======================
1879 int cgroup_show_path(struct seq_file *sf, struct kernfs_node *kf_node,
1880 struct kernfs_root *kf_root)
1881 {
1882 int len = 0;
1883 char *buf = NULL;
1884 struct cgroup_root *kf_cgroot = cgroup_root_from_kf(kf_root);
1885 struct cgroup *ns_cgroup;
1886
1887 buf = kmalloc(PATH_MAX, GFP_KERNEL);
1888 if (!buf)
1889 return -ENOMEM;
1890
1891 spin_lock_irq(&css_set_lock);
1892 ns_cgroup = current_cgns_cgroup_from_root(kf_cgroot);
→ 1893 len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
1894 spin_unlock_irq(&css_set_lock);
1895
1896 if (len >= PATH_MAX)
1897 len = -ERANGE;
1898 else if (len > 0) {
1899 seq_escape(sf, buf, " \t\n\\");
1900 len = 0;
1901 }
1902 kfree(buf);
1903 return len;
1904 }

fs/kernfs/dir.c:
================
228 int kernfs_path_from_node(struct kernfs_node *to, struct kernfs_node *from,
229 char *buf, size_t buflen)
230 {
231 unsigned long flags;
232 int ret;
233
234 read_lock_irqsave(&kernfs_rename_lock, flags);
→ 235 ret = kernfs_path_from_node_locked(to, from, buf, buflen);
236 read_unlock_irqrestore(&kernfs_rename_lock, flags);
237 return ret;
238 }
239 EXPORT_SYMBOL_GPL(kernfs_path_from_node);

The problem seems to be this function:
======================================

141 static int kernfs_path_from_node_locked(struct kernfs_node *kn_to,
142 struct kernfs_node *kn_from,
143 char *buf, size_t buflen)
144 {
145 struct kernfs_node *kn, *common;
146 const char parent_str[] = "/..";
147 size_t depth_from, depth_to, len = 0;
148 int i, j;
149
150 if (!kn_to)
→ 151 return strscpy(buf, "(null)", buflen);
152
153 if (!kn_from)
154 kn_from = kernfs_root(kn_to)->kn;
155
156 if (kn_from == kn_to)
→ 157 return strscpy(buf, "/", buflen);
158
159 common = kernfs_common_ancestor(kn_from, kn_to);
160 if (WARN_ON(!common))
161 return -EINVAL;
162
163 depth_to = kernfs_depth(common, kn_to);
164 depth_from = kernfs_depth(common, kn_from);
165
166 buf[0] = '\0';
167
168 for (i = 0; i < depth_from; i++)
→ 169 len += strscpy(buf + len, parent_str,
170 len < buflen ? buflen - len : 0);
171
172 /* Calculate how many bytes we need for the rest */
173 for (i = depth_to - 1; i >= 0; i--) {
174 for (kn = kn_to, j = 0; j < i; j++)
175 kn = kn->parent;
→ 176 len += strscpy(buf + len, "/",
177 len < buflen ? buflen - len : 0);
→ 178 len += strscpy(buf + len, kn->name,
179 len < buflen ? buflen - len : 0);
180 }
181
182 return len;
183 }

We have five (5) uses of strlcpy() -> strscpy() conversion.

Line 151: strscpy(buf, "(null)", buflen);
cannot overrun and immediately returns - return value just as if strlcpy() was used.

Line 157: strscpy(buf, "/", buflen);
cannot overrun and immediately returns - return value just as if strlcpy() was used.

Line 169: len += strscpy(buf + len, parent_str, len < buflen ? buflen - len : 0);

Line 176: len += strscpy(buf + len, "/", len < buflen ? buflen - len : 0);

Line 178: len += strscpy(buf + len, kn->name, len < buflen ? buflen - len : 0);

As you will clearly see, all of those cases are already carefully and cleverly protected already against
overrun. This was apparently done by you and Mr. Aditya Kali. Nothing changes here with replacement
one-for-one strlcpy() -> strscpy(), as there is never a buffer overrun.

In case of overrun, the construct (len < buflen ? buflen - len : 0) will simply give a "count" of
zero, and this will cause an early exit from strscpy().

So, my conclusion is that for callers of kernfs_path_from_node_locked() and kernfs_path_from_node()
nothing changes in semantics or the return value.

For the kernfs_name_locked(), we get a truncated value of kn->name in buf as before, only return
value changes for this special case, but I have explained and I think proven that it is used
nowhere in code.

For the function:

fs/kernfs/dir.c
===============
856 static struct kernfs_node *kernfs_walk_ns(struct kernfs_node *parent,
857 const unsigned char *path,
858 const void *ns)
859 {
860 size_t len;
861 char *p, *name;
862
863 lockdep_assert_held_read(&kernfs_root(parent)->kernfs_rwsem);
864
865 spin_lock_irq(&kernfs_pr_cont_lock);
866
867 len = strscpy(kernfs_pr_cont_buf, path, sizeof(kernfs_pr_cont_buf));
868
869 if (unlikely(len == -E2BIG)) {
870 spin_unlock_irq(&kernfs_pr_cont_lock);
871 return NULL;
872 }
873
874 p = kernfs_pr_cont_buf;
875
876 while ((name = strsep(&p, "/")) && parent) {
877 if (*name == '\0')
878 continue;
879 parent = kernfs_find_ns(parent, name, ns);
880 }
881
882 spin_unlock_irq(&kernfs_pr_cont_lock);
883
884 return parent;
885 }

Nothing changes. Neither the truncation of buffer and neither the return value (NULL or parent).

The previous check was:

860 len = strlcpy(kernfs_pr_cont_buf, path, sizeof(kernfs_pr_cont_buf));
861
862 if (len >= sizeof(kernfs_pr_cont_buf)) {
863 spin_unlock_irq(&kernfs_pr_cont_lock);
864 return NULL;
865 }

The string is truncated on overrun just as with strlcpy(), and the test

if (len == -E2BIG)

is equivalent to

if (len >= sizeof(kernfs_pr_cont_buf))

NOTE:

The forementioned situation in cgroup_show_path():

→ 1893 len = kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);
1894 spin_unlock_irq(&css_set_lock);
1895
1896 if (len >= PATH_MAX)
1897 len = -ERANGE;
1898 else if (len > 0) {
1899 seq_escape(sf, buf, " \t\n\\");
1900 len = 0;
1901 }

len > PATH_MAX can never happen because:

kernfs_path_from_node(kf_node, ns_cgroup->kn, buf, PATH_MAX);

calls

234 read_lock_irqsave(&kernfs_rename_lock, flags);
→ 235 ret = kernfs_path_from_node_locked(to, from, buf, buflen);
236 read_unlock_irqrestore(&kernfs_rename_lock, flags);
237 return ret;

which is evaluated to:

234 read_lock_irqsave(&kernfs_rename_lock, flags);
→ 235 ret = kernfs_path_from_node_locked(to, from, buf, PATH_MAX);
236 read_unlock_irqrestore(&kernfs_rename_lock, flags);
237 return ret;

and then to:

141 static int kernfs_path_from_node_locked(struct kernfs_node *kn_to,
142 struct kernfs_node *kn_from,
143 char *buf, size_t buflen)
144 {
145 struct kernfs_node *kn, *common;
146 const char parent_str[] = "/..";
147 size_t depth_from, depth_to, len = 0;
148 int i, j;
149
150 if (!kn_to)
→ 151 return strscpy(buf, "(null)", PATH_MAX);
152
153 if (!kn_from)
154 kn_from = kernfs_root(kn_to)->kn;
155
156 if (kn_from == kn_to)
→ 157 return strscpy(buf, "/", PATH_MAX);
158
159 common = kernfs_common_ancestor(kn_from, kn_to);
160 if (WARN_ON(!common))
161 return -EINVAL;
162
163 depth_to = kernfs_depth(common, kn_to);
164 depth_from = kernfs_depth(common, kn_from);
165
166 buf[0] = '\0';
167
168 for (i = 0; i < depth_from; i++)
→ 169 len += strscpy(buf + len, parent_str,
170 len < PATH_MAX ? PATH_MAX - len : 0);
171
172 /* Calculate how many bytes we need for the rest */
173 for (i = depth_to - 1; i >= 0; i--) {
174 for (kn = kn_to, j = 0; j < i; j++)
175 kn = kn->parent;
→ 176 len += strscpy(buf + len, "/",
177 len < PATH_MAX ? PATH_MAX - len : 0);
→ 178 len += strscpy(buf + len, kn->name,
179 len < PATH_MAX ? PATH_MAX - len : 0);
180 }
181
182 return len;
183 }

len will never exceed PATH_MAX, because it is increased by (PATH_MAX - len) or 0 (zero),
as strscpy will copy either PATH_MAX - len or 0 (zero) characters.

The second use of kernfs_path_from_node() appears safe:

2350 int cgroup_path_ns_locked(struct cgroup *cgrp, char *buf, size_t buflen,
2351 struct cgroup_namespace *ns)
2352 {
2353 struct cgroup *root = cset_cgroup_from_root(ns->root_cset, cgrp->root);
2354
→ 2355 return kernfs_path_from_node(cgrp->kn, root->kn, buf, buflen);
2356 }

According to Bootlin, this covers all uses of the function kernfs_path_from_node():

Defined in 1 files as a prototype:

include/linux/kernfs.h, line 396 (as a prototype)

Defined in 2 files as a function:

fs/kernfs/dir.c, line 221 (as a function)
include/linux/kernfs.h, line 476 (as a function)

Documented in 1 files:

fs/kernfs/dir.c, line 205

Referenced in 3 files:

fs/kernfs/dir.c
line 232
line 265
include/linux/kernfs.h, line 596
kernel/cgroup/cgroup.c
line 1893
line 2355

Hope this helps.

Best regards,
Mirsad Todorovac