Re: [PATCH net-next] net: ethtool: Fix out-of-bounds copy to user

From: Alexander Duyck
Date: Fri Jun 02 2023 - 11:31:37 EST


On Thu, Jun 1, 2023 at 6:46 PM Ding Hui <dinghui@xxxxxxxxxxxxxx> wrote:
>
> On 2023/6/1 23:04, Alexander H Duyck wrote:
> > On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote:
> >> When we fetch statistics with ethtool while the number of NIC
> >> channels is being increased, the utility may crash due to memory corruption.
> >>
> >> The NIC driver callback get_sset_count() may return a calculated
> >> length that depends on the current number of channels (e.g. i40e, igb).
> >>
> >
> > The drivers shouldn't be changing that value. If the drivers are doing
> > this they should be fixed to provide a fixed length in terms of their
> > strings.
> >
>
> Is this rule explicitly stated anywhere?
> I searched the Ethernet drivers and found that many of the drivers that
> support multiple queues return a length calculated from the number of
> queues rather than a fixed value. So pushing all of these drivers to
> follow the rule would be hard for me.
>
> >> The ethtool allocates a user buffer with the first ioctl returned
> >> length and invokes the second ioctl to get data. The kernel copies
> >> data to the user buffer but without checking its length. If the length
> >> returned by the second get_sset_count() is greater than the length
> >> allocated by the user, it will lead to an out-of-bounds copy.
> >>
> >> Fix it by restricting the copy length so that it does not exceed the
> >> buffer length specified by userspace.
> >>
> >> Signed-off-by: Ding Hui <dinghui@xxxxxxxxxxxxxx>
> >
> > Changing the copy size would not fix this. The problem is the driver
> > will be overwriting with the size that it thinks it should be using.
> > Reducing the value that is provided for the memory allocations will
> > cause the driver to corrupt memory.
> >
>
> I noticed that. In fact, I do use the returned length to allocate
> kernel memory, and only use the adjusted length for the copy to user.

Ah, okay, I hadn't noticed that part. That does raise the question of
whether any of the drivers might be modifying the length values stored
in the structures. We may want to add a new stack variable to track the
clamped value for these rather than storing it back in the structure.
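
As a rough illustration of the separate-variable idea, here is a minimal
userspace sketch. Everything in it is hypothetical and not taken from the
patch: fill_stats() stands in for the driver callback, memcpy() stands in
for copy_to_user(), and get_stats_clamped() is not a kernel function.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver callback: writes drv_count entries. */
static void fill_stats(uint64_t *data, uint32_t drv_count)
{
    for (uint32_t i = 0; i < drv_count; i++)
        data[i] = 100 + i;
}

/*
 * user_buf has room for user_count entries; the driver currently reports
 * drv_count. Returns the number of entries copied, or -1 if the
 * allocation fails.
 */
static int get_stats_clamped(uint64_t *user_buf, uint32_t user_count,
                             uint32_t drv_count)
{
    /* The clamped value lives on the stack; drv_count is left untouched. */
    uint32_t copy_count = user_count < drv_count ? user_count : drv_count;
    uint64_t *data;

    if (!drv_count)
        return 0;

    /* Allocate for what the driver will write, not for what we copy. */
    data = calloc(drv_count, sizeof(*data));
    if (!data)
        return -1;

    fill_stats(data, drv_count);

    /* memcpy() stands in for copy_to_user() in this sketch. */
    memcpy(user_buf, data, (size_t)copy_count * sizeof(*data));
    free(data);
    return (int)copy_count;
}
```

The point of the split is that the allocation and the driver call only
ever see drv_count, while the copy only ever sees copy_count.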

> >> ---
> >> net/ethtool/ioctl.c | 16 +++++++++-------
> >> 1 file changed, 9 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> >> index 6bb778e10461..82a975a9c895 100644
> >> --- a/net/ethtool/ioctl.c
> >> +++ b/net/ethtool/ioctl.c
> >> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
> >> if (copy_from_user(&test, useraddr, sizeof(test)))
> >> return -EFAULT;
> >>
> >> - test.len = test_len;
> >> + test.len = min_t(u32, test.len, test_len);
> >> data = kcalloc(test_len, sizeof(u64), GFP_USER);
> >> if (!data)
> >> return -ENOMEM;
> >
> > This is the wrong spot to be doing this. You need to use test_len for
> > your allocation as that is what the driver will be writing to. You
> > should look at adjusting after the allocation call and before you do
> > the copy
>
> data = kcalloc(test_len, sizeof(u64), GFP_USER); // yes, **test_len** for kernel memory
> ...
> copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))) // **test.len** only for copy to user

One other thought on this. Would we ever expect the length value to
change? For many of these I wonder if we shouldn't just return an
error in the case that there isn't enough space to store the test
results.

It might make sense to look at adding a return of ENOSPC or EFBIG when
we encounter a size difference where our output is too big for the
provided userspace buffer. At least with that we are not losing data.
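
A minimal userspace sketch of the error-return alternative, under the
assumption that ENOSPC is the code chosen. copy_results_or_enospc() is a
hypothetical helper, not an existing kernel function, and memcpy() again
stands in for copy_to_user().

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: copy drv_count results into a user buffer that has
 * room for user_count entries. memcpy() stands in for copy_to_user().
 */
static int copy_results_or_enospc(uint64_t *user_buf, uint32_t user_count,
                                  const uint64_t *results, uint32_t drv_count)
{
    /* Refuse the request instead of silently truncating the output. */
    if (drv_count > user_count)
        return -ENOSPC;

    memcpy(user_buf, results, (size_t)drv_count * sizeof(*results));
    return 0;
}
```

With this shape userspace never receives a partial result set; it gets
either everything or an error telling it to retry with a larger buffer.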

> >
> >> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
> >> if (copy_to_user(useraddr, &test, sizeof(test)))
> >> goto out;
> >> useraddr += sizeof(test);
> >> - if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
> >> + if (test.len &&
> >> + copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
> >> goto out;
> >> ret = 0;
> >>
> >
> > I don't believe this is adding any value. I wouldn't bother with
> > checking for lengths of 0.
> >
>
> Yes, we already check that the data pointer is not NULL, so we don't need to check test.len.
> I'll remove it in v2.
>
> >> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
> >> return -ENOMEM;
> >> WARN_ON_ONCE(!ret);
> >>
> >> - gstrings.len = ret;
> >> + gstrings.len = min_t(u32, gstrings.len, ret);
> >>
> >> if (gstrings.len) {
> >> - data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
> >> + data = vzalloc(array_size(ret, ETH_GSTRING_LEN));
> >> if (!data)
> >> return -ENOMEM;
> >>
> >
> > Same here. We should be using the returned value for the allocations
> > > and tests, and then doing the min adjustment after the allocation is
> > > completed.
> >
>
> gstrings.len = min_t(u32, gstrings.len, ret); // adjusting
>
> if (gstrings.len) { // checking the adjusted gstrings.len can avoid unnecessary vzalloc and __ethtool_get_strings()
> vzalloc(array_size(ret, ETH_GSTRING_LEN)); // **ret** for kernel memory, rather than adjusted length
>
> Finally, the adjusted gstrings.len is used for the copy to user

I see what you are talking about now.

> >> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
> >> if (copy_from_user(&stats, useraddr, sizeof(stats)))
> >> return -EFAULT;
> >>
> >> - stats.n_stats = n_stats;
> >> + stats.n_stats = min_t(u32, stats.n_stats, n_stats);
> >>
> >> - if (n_stats) {
> >> + if (stats.n_stats) {
> >> data = vzalloc(array_size(n_stats, sizeof(u64)));
> >> if (!data)
> >> return -ENOMEM;
> >
> > Same here. We should be using n_stats, not stats.n_stats and adjust
> > before you do the final copy.
> >
> >> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
> >> if (copy_to_user(useraddr, &stats, sizeof(stats)))
> >> goto out;
> >> useraddr += sizeof(stats);
> >> - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
> >> + if (stats.n_stats &&
> >> + copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
> >> goto out;
> >> ret = 0;
> >>
> >
> > Again. I am not sure what value is being added. If n_stats is 0 then I
> > am pretty sure this will do nothing anyway.
> >
>
> Not quite. n_stats is the returned value and stats.n_stats is the adjusted value.
> If the adjusted stats.n_stats is 0, we skip the memory allocation and leave the data
> pointer NULL, and calling copy_to_user() with a NULL pointer may trigger a warning.

I see now. So data is NULL if stats.n_stats is 0.
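
To make that last point concrete, a small userspace sketch of the
control flow, assuming the allocation is gated on the adjusted count.
The names fill_counters() and get_stats_guarded() are hypothetical, and
memcpy() stands in for copy_to_user().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver filling drv_count counters. */
static void fill_counters(uint64_t *data, uint32_t drv_count)
{
    for (uint32_t i = 0; i < drv_count; i++)
        data[i] = (uint64_t)i * 10;
}

/*
 * Returns 0 on success or -1 if the allocation fails. *copied reports how
 * many entries were written to user_buf.
 */
static int get_stats_guarded(uint64_t *user_buf, uint32_t user_count,
                             uint32_t drv_count, uint32_t *copied)
{
    uint32_t n = user_count < drv_count ? user_count : drv_count;
    uint64_t *data = NULL;

    /* Gate on the adjusted count: nothing to copy means nothing to fetch. */
    if (n) {
        data = calloc(drv_count, sizeof(*data));
        if (!data)
            return -1;
        fill_counters(data, drv_count);
    }

    /* data is NULL exactly when n is 0, so test n before copying. */
    if (n)
        memcpy(user_buf, data, (size_t)n * sizeof(*data));

    free(data); /* free(NULL) is a no-op */
    *copied = n;
    return 0;
}
```

Calling it with user_count set to 0 never touches user_buf or data,
which is the case the extra check in the patch is guarding.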