Re: sscanf: implement basic character sets

From: Jessica Yu
Date: Tue Feb 23 2016 - 14:26:43 EST


+++ Andy Shevchenko [23/02/16 12:56 +0200]:
On Mon, 2016-02-22 at 16:24 -0500, Jessica Yu wrote:
Implement basic character sets for the '%[]' conversion specifier.

The '%[]' conversion specifier matches a nonempty sequence of characters
from the specified set of accepted (or with '^', rejected) characters
between the brackets. The substring matched is to be made up of characters
in (or not in) the set. This implementation differs from its glibc
counterpart in that it does not support character ranges (e.g., 'a-z' or
'0-9'), the hyphen '-' is *not* a special character, and the brackets
themselves cannot be matched.

Signed-off-by: Jessica Yu <jeyu@xxxxxxxxxx>
---
Patch based on linux-next-20160222.

v2:
 - Use kstrndup() to copy the character set from fmt instead of using a
   statically allocated array
 
 lib/vsprintf.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 525c8e1..93a6f52 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2714,6 +2714,45 @@ int vsscanf(const char *buf, const char *fmt, va_list args)
  num++;
  }
  continue;
+ case '[':
+ {
+ char *s = (char *)va_arg(args, char *);
+ char *set;
+ size_t (*op)(const char *str, const char *set);
+ size_t len = 0;
+ bool negate = (*(fmt) == '^');
+
+ if (field_width == -1)
+ field_width = SHRT_MAX;

I'm not sure if it's needed here. It will count down to 0 in any case.

I think it might be good to be consistent with the '%s' specifier code
and have some sort of upper bound set, even if it is much more likely
that len will get to 0 before field_width does.

+
+ op = negate ? &strcspn : &strspn;
+ if (negate)
+ fmt++;

+
+ len = strcspn(fmt, "]");
+ /* invalid format; stop here */
+ if (!len)
+ return num;
+
+ set = kstrndup(fmt, len, GFP_KERNEL);
+ if (!set)
+ return num;
+
+ /* advance fmt past ']' */
+ fmt += len + 1;
+
+ len = (*op)(str, set);

Can we just use the normal call form:
 op();
?

+ /* no matches */
+ if (!len)

Memory leak here.

+ return num;
+
+ while (*str && len-- && field_width--)
+ *s++ = *str++;

Looks like a strcpy() variant. First of all, is it possible to have
*str == '\0' when len != 0?

Good point. The *str check is redundant: after the call to
strspn()/strcspn() we know str holds at least len bytes before its
terminating NUL, so that check can be removed.
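
That reasoning can be checked in userspace with the standard strspn() and
strcspn(). This is only a sketch: the helper names span_has_no_nul() and
span_guarantee_holds() are made up for illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): true if none of the first
 * len bytes of str is the terminating NUL.
 */
bool span_has_no_nul(const char *str, size_t len)
{
	return memchr(str, '\0', len) == NULL;
}

/*
 * Hypothetical helper: checks, for one input and one set, that the length
 * returned by strspn() and by strcspn() never reaches past the NUL.
 */
bool span_guarantee_holds(const char *str, const char *set)
{
	size_t accepted, rejected;

	assert(str && set);

	accepted = strspn(str, set);	/* leading run of bytes in set */
	rejected = strcspn(str, set);	/* leading run of bytes not in set */

	return span_has_no_nul(str, accepted) &&
	       span_has_no_nul(str, rejected);
}
```

Both functions stop at the terminating NUL at the latest, so a loop that
copies exactly len bytes cannot run past the end of str.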

+ *s = '\0';
+ kfree(set);
+ num++;
+ }
+ continue;
  case 'o':
  base = 8;
  break;
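
For readers following the thread, below is a minimal userspace sketch of the
'%[' handling with the review comments folded in. It is an illustration only,
not the kernel patch: scan_set() and its signature are hypothetical,
malloc()/free() stand in for kstrndup()/kfree(), and the check for a missing
']' is an extra assumption that the posted patch does not make.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace sketch, not the kernel code.
 * fmt points just past the '[' of a "%[...]" specifier; str is the input.
 * Copies at most field_width matching bytes into out (which must have room
 * for field_width + 1 bytes) and returns how many bytes were copied.
 * Returns 0, leaving out untouched, on an invalid set or when nothing matches.
 */
size_t scan_set(const char *fmt, const char *str, char *out, size_t field_width)
{
	bool negate;
	size_t len;
	char *set;

	assert(fmt && str && out);

	negate = (*fmt == '^');
	if (negate)
		fmt++;

	/* The set is everything up to ']'; no ranges, '-' is an ordinary byte. */
	len = strcspn(fmt, "]");
	if (!len || fmt[len] != ']')	/* empty set, or (assumed check) no ']' */
		return 0;

	/* malloc() + memcpy() stand in for kstrndup(fmt, len, GFP_KERNEL). */
	set = malloc(len + 1);
	if (!set)
		return 0;
	memcpy(set, fmt, len);
	set[len] = '\0';

	/* Length of the leading run of bytes in (or, with '^', not in) the set. */
	len = negate ? strcspn(str, set) : strspn(str, set);

	/* Free before any return so the "no match" path cannot leak. */
	free(set);

	if (!len)
		return 0;
	if (len > field_width)
		len = field_width;

	/* The first len bytes of str are non-NUL, so no *str check is needed. */
	memcpy(out, str, len);
	out[len] = '\0';
	return len;
}
```

With a 16-byte buffer out, scan_set("a-z]", "a-zq", out, 15) copies "a-z",
since the hyphen is taken literally rather than as a range, and
scan_set("^,]", "foo,bar", out, 15) copies "foo".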

--
Andy Shevchenko <andriy.shevchenko@xxxxxxxxxxxxxxx>
Intel Finland Oy