Re: hunt for 2.6.37 dm-crypt+ext4 corruption? (was: Re: dm-crypt barrier support is effective)

From: Jon Nelson
Date: Wed Dec 08 2010 - 10:26:56 EST


On Tue, Dec 7, 2010 at 9:37 PM, Jon Nelson <jnelson@xxxxxxxxxxx> wrote:
> On Tue, Dec 7, 2010 at 1:35 PM, Ted Ts'o <tytso@xxxxxxx> wrote:
>> On Tue, Dec 07, 2010 at 01:22:43PM -0500, Mike Snitzer wrote:
>>> > 1. create a database (from bash):
>>> >
>>> > createdb test
>>> >
>>> > 2. place the following contents in a file (I used 't.sql'):
>>> >
>>> > begin;
>>> > create temporary table foo as select x as a, ARRAY[x] as b FROM
>>> > generate_series(1, 10000000 ) AS x;
>>> > create index foo_a_idx on foo (a);
>>> > create index foo_b_idx on foo USING GIN (b);
>>> > rollback;
>>> >
>>> > 3. execute that sql:
>>> >
>>> > psql -f t.sql --echo-all test
>>> >
>>> > With 2.6.34.7 I can re-run [3] all day long, as many times as I want,
>>> > without issue.
>>> >
>>> > With 2.6.37-rc4-13 (the currently-installed KOTD kernel) it fails
>>> > pretty frequently.
>>
>> So I just tried to reproduce this on an Ubuntu 10.04 system running
>> 2.6.37-rc5 (completely stock except for a few apparmor patches that I
>> needed to keep the apparmor userspace from complaining). I'm using
>> Postgres 8.4.5-0ubuntu10.04.
>>
>> Using the above procedure, I wasn't able to reproduce. Then I
>> realized this might have been because I was using an SSD root file
>> system (which is secured using LUKS/dm-crypt, with LVM on top of
>> dm-crypt). So I mounted a file system on a 5400 rpm laptop disk, which
>> is also protected using LUKS/dm-crypt with LVM on top. I then
>> executed the PostgreSQL commands:
>>
>> CREATE TABLESPACE test LOCATION '/kbuild/postgres';
>> SET default_tablespace = test;
>> COMMIT
>> \quit
>>
>> I then re-ran the above procedure, and verified that all of the I/O
>> was going to the 5400rpm laptop disk.
>>
>> I then ran the above procedure a half-dozen times, and I still haven't
>> been able to reproduce any PostgreSQL errors or kernel errors.
>>
>> Jon, can you help me identify what might be different with your run
>> and mine? What version of Postgres are you using?
>
> One difference is the location of the transaction logs (pg_xlog). In
> my case, /var/lib/pgsql/data *is* the mount point for the test volume
> (actually, it's a symlink to the mount point). In your case, that is
> not so. Perhaps that makes a difference? pgsql_tmp might also be on
> two different volumes in your case (I can't be sure).


I grabbed a Kubuntu iso and installed Kubuntu 10.10, and then upgraded
to 'natty', and eventually to 2.6.37-8-generic.

With that install, and postgresql's "data" (/var/lib/postgresql/data)
being located on a LUKS+ext4 volume, I easily observe the behavior.
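
For anyone else trying to reproduce this, here is a minimal sketch of
one way to loop step [3] until it fails. The function name
repeat_until_fail and the run count of 20 are made up for illustration.
The ON_ERROR_STOP setting is an addition that is not in the original
command; without it, psql -f exits 0 even when a statement in the
script fails, so the loop would never notice an error.

```shell
#!/bin/sh
# Re-run a command up to MAX times, stopping at the first failure.
# Usage: repeat_until_fail MAX command [args...]
# (repeat_until_fail is a hypothetical helper, not part of any tool.)
repeat_until_fail() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        # Run the command exactly as given; a non-zero exit stops the loop.
        if ! "$@"; then
            echo "run $i failed" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $max runs passed"
    return 0
}

# Assumed invocation for the test case in this thread (left commented
# out so that sourcing this file does not touch a database):
# repeat_until_fail 20 psql -v ON_ERROR_STOP=1 -f t.sql --echo-all test
```

The loop only reports the first failing run number. Any kernel messages
would still have to be collected separately.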

Does this help?

--
Jon