[PATCH] lib: vsprintf: 32-bit put_dec() fixes

From: Michal Nazarewicz
Date: Sat Mar 05 2011 - 14:41:53 EST


This commit fixes the 32-bit put_dec() function.

By mistake, I submitted an older version of the put_dec()
patch that contained a bug (Denys had spotted it, and it was
fixed in a subsequent version). As a result, Hugh had to find
the bug once again after experiencing a boot failure.

This commit fixes the bug once and for all and introduces some
additional optimisations and comments (which were present in
the fixed version of the put_dec() patch).

Signed-off-by: Michal Nazarewicz <mina86@xxxxxxxxxx>
Cc: Denys Vlasenko <vda.linux@xxxxxxxxxxxxxx>
Cc: Hugh Dickins <hughd@xxxxxxxxxx>
---
lib/vsprintf.c | 49 ++++++++++++++++++++++++-------------------------
1 files changed, 24 insertions(+), 25 deletions(-)

Hugh Dickins <hughd@xxxxxxxxxx> writes:
> mmotm 2011-03-02-16-52 is utterly broken on 32-bit: panics
> at boot with "Couldn't register console driver", and
> preceding warnings don't even print their line
> numbers... which leads to the vsprintf changes.

As I suspected, I mistakenly sent v3 of the patch instead of
v4. I did not spot any problems because I was building the
kernel for my 32-bit machine from a branch with the fix, but
then sent the patch from a branch for a 64-bit machine (which
was not affected by the bug).

Hugh proposed a version with cascades of ifs. I think the
version in this patch is better because it is smaller, and the
benchmarks do not show clearly which one is faster (on Intel
the version with ifs was faster, but on ARM the version without
ifs was).
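
For reference, the version without ifs can be tried in userspace
with a sketch along these lines. The arithmetic is copied from
the put_dec() hunk in this patch; everything else is an
assumption made for illustration: full4() is a plain-division
stand-in for put_dec_full4(), the loop that strips the padding
zeros is my guess at the part of the function the hunk does not
show, and format_u64() and matches_snprintf() are hypothetical
test helpers.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for put_dec_full4(): emit exactly four decimal digits
 * of q (0..9999), least significant digit first. */
static char *full4(char *buf, uint32_t q)
{
	int i;

	for (i = 0; i < 4; i++) {
		*buf++ = '0' + q % 10;
		q /= 10;
	}
	return buf;
}

/* Writes the digits of n in reverse order using only 32-bit
 * divisions. n must be non-zero. */
static char *put_dec_sketch(char *buf, uint64_t n)
{
	uint32_t d3, d2, d1, q;

	assert(n != 0);

	/* Split n into 16-bit chunks: n = d3*2^48 + d2*2^32 + d1*2^16 + d0 */
	d1 = (n >> 16) & 0xFFFF;
	d2 = (n >> 32) & 0xFFFF;
	d3 = (n >> 48) & 0xFFFF;

	/* 2^16 = 6,5536  2^32 = 42,9496,7296  2^48 = 281,4749,7671,0656 */
	q = 656 * d3 + 7296 * d2 + 5536 * d1 + (uint32_t)(n & 0xFFFF);
	buf = full4(buf, q % 10000);
	q = q / 10000;

	d1 = q + 7671 * d3 + 9496 * d2 + 6 * d1;
	q = d1 / 10000;
	buf = full4(buf, d1 % 10000);

	d2 = q + 4749 * d3 + 42 * d2;
	q = d2 / 10000;
	buf = full4(buf, d2 % 10000);

	d3 = q + 281 * d3;
	q = d3 / 10000;
	buf = full4(buf, d3 % 10000);

	buf = full4(buf, q);

	/* Assumed tail (not shown in the hunk): drop the padding zeros,
	 * which sit at the end because the digits are reversed. */
	while (buf[-1] == '0')
		--buf;
	return buf;
}

/* Hypothetical helper: format n into out in normal digit order. */
static void format_u64(uint64_t n, char *out)
{
	char tmp[24];
	char *end = put_dec_sketch(tmp, n);

	while (end > tmp)
		*out++ = *--end;
	*out = '\0';
}

/* Hypothetical helper: compare the sketch against snprintf(). */
static int matches_snprintf(uint64_t n)
{
	char got[32], want[32];

	format_u64(n, got);
	snprintf(want, sizeof(want), "%llu", (unsigned long long)n);
	return strcmp(got, want) == 0;
}
```

If the tail really looks like the loop above, it also shows why
zero must not be passed in: all twenty digits would be '0' and
the loop would walk past the start of the buffer.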

I pushed a unified version of the whole put_dec patch,
rebased on v2.6.38-rc7, to GitHub:

git://github.com/mina86/linux-2.6.git v2.6.38-rc7+put-dec

The attached patch is rebased on top of -mm (i.e. it is just a delta).

For the sake of elegance, I would probably recommend taking
the unified patch instead of what's already in -mm plus this
or Hugh's fix.

Again, sorry about all the problems.
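
A side note on the put_dec_full5() context in the diff below:
r = (q * (uint64_t)0xcccd) >> 19 stands in for q / 10 without a
division. A minimal sketch for checking this exhaustively over
five-digit inputs follows; the names div10_mulshift() and
first_mismatch() are hypothetical, and the 0..99999 range is my
assumption based on the function's name.

```c
#include <assert.h>
#include <stdint.h>

/* The expression from put_dec_full5(): 0xcccd / 2^19 is slightly
 * above 1/10, and the 64-bit cast keeps the product from
 * overflowing 32 bits for inputs above 81919. */
static uint32_t div10_mulshift(uint32_t q)
{
	assert(q <= 99999);	/* assumed input range: five decimal digits */
	return (uint32_t)((q * (uint64_t)0xcccd) >> 19);
}

/* Returns the first q in [0, limit) for which the multiply-and-shift
 * result differs from q / 10, or limit if there is none.
 * limit must not exceed 100000. */
static uint32_t first_mismatch(uint32_t limit)
{
	uint32_t q;

	for (q = 0; q < limit; q++)
		if (div10_mulshift(q) != q / 10)
			return q;
	return limit;
}
```

Calling first_mismatch(100000) checks every five-digit value.
The last digit is then recovered as q - 10 * r, as in the patch.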

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 344c03f..daa9209 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -157,7 +157,7 @@ static noinline_for_stack char *put_dec_full5(char *buf, unsigned q)
* without any branches.
*/

- r = (q * (uint64_t)0xcccd) >> 19;
+ r = (q * (uint64_t)0xcccd) >> 19;
*buf++ = (q - 10 * r) + '0';

/*
@@ -226,7 +226,14 @@ static noinline_for_stack char *put_dec_trunc5(char *buf, unsigned q)
return buf;
}

-/* No inlining helps gcc to use registers better */
+/*
+ * This function formats all integers correctly, however on 32-bit
+ * processors function below is used (not this one) which handles only
+ * non-zero integers. So be advised never to call this function with
+ * num == 0.
+ *
+ * No inlining helps gcc to use registers better
+ */
static noinline_for_stack
char *put_dec(char *buf, unsigned long long num)
{
@@ -293,36 +300,36 @@ char *put_dec_8bit(char *buf, unsigned q)
* permission from the author). This performs no 64-bit division and
* hence should be faster on 32-bit machines then the version of the
* function above.
+ *
+ * This function formats correctly all NON-ZERO integers. Passing
+ * zero makes daemons come out of your closet. This is OK, since
+ * number(), which calls this function, has a special case for zero
+ * anyways.
*/
static noinline_for_stack
char *put_dec(char *buf, unsigned long long n)
{
uint32_t d3, d2, d1, q;

- if (n < 10) {
- *buf++ = '0' + (unsigned)n;
- return buf;
- }
-
d1 = (n >> 16) & 0xFFFF;
d2 = (n >> 32) & 0xFFFF;
d3 = (n >> 48) & 0xFFFF;

- q = 656 * d3 + 7296 * d2 + 5536 * d1 + (n & 0xFFFF);
+ q = 656 * d3 + 7296 * d2 + 5536 * d1 + (n & 0xFFFF);

- q = q / 10000;
buf = put_dec_full4(buf, q % 10000);
+ q = q / 10000;

d1 = q + 7671 * d3 + 9496 * d2 + 6 * d1;
- q = d1 / 10000;
+ q = d1 / 10000;
buf = put_dec_full4(buf, d1 % 10000);

d2 = q + 4749 * d3 + 42 * d2;
- q = d2 / 10000;
+ q = d2 / 10000;
buf = put_dec_full4(buf, d2 % 10000);

d3 = q + 281 * d3;
- q = d3 / 10000;
+ q = d3 / 10000;
buf = put_dec_full4(buf, d3 % 10000);

buf = put_dec_full4(buf, q);
@@ -407,22 +414,14 @@ char *number(char *buf, char *end, unsigned long long num,
spec.field_width--;
}
}
- if (need_pfx) {
- spec.field_width--;
- if (spec.base == 16)
- spec.field_width--;
- }
+ if (need_pfx)
+ spec.field_width -= spec.base / 8;

/* generate full string in tmp[], in reverse order */
i = 0;
- if (num == 0)
- tmp[i++] = '0';
- /* Generic code, for any base:
- else do {
- tmp[i++] = (digits[do_div(num,base)] | locase);
- } while (num != 0);
- */
- else if (spec.base != 10) { /* 8 or 16 */
+ if (num < 8) {
+ tmp[i++] = '0' + (char)num;
+ } else if (spec.base != 10) { /* 8 or 16 */
int mask = spec.base - 1;
int shift = 3;

--
1.7.1