Re: [PATCH] netfilter: per netns nf_conntrack_cachep

From: Patrick McHardy
Date: Wed Feb 03 2010 - 07:10:50 EST


Patrick McHardy wrote:
> Jon Masters wrote:
>> On Tue, 2010-02-02 at 19:58 +0200, Alexey Dobriyan wrote:
>>
>>> Yes, moving to init_net-only function is fine.
>> So moving the "setup up fake conntrack" bits to init_init_net from
>> init_net still results in the panic, which means that the use count
>> really is dropping to zero and we really are trying to free it when
>> using multiple namespaces. Per ns is probably an easier way to go.
>
> Agreed, that will also avoid problems in the future with the
> ct_net pointer pointing to &init_net. I'll take care of this
> tommorrow.

Unfortunately a per-namespace conntrack is not easily possible without
larger changes (most of which are already queued in nf-next-2.6.git
though). So for now I just moved the untrack handling to the init_net
setup and cleanup functions and we can try to fix the remainder in
2.6.34.

Jon, could you give this patch a try please?
commit 056ff3e3bd1563969a311697323ff929df94415c
Author: Patrick McHardy <kaber@xxxxxxxxx>
Date: Wed Feb 3 12:58:06 2010 +0100

netfilter: nf_conntrack: fix memory corruption with multiple namespaces

As discovered by Jon Masters <jonathan@xxxxxxxxxxxxxx>, the "untracked"
conntrack, which is located in the data section, might be accidentally
freed when a new namespace is instantiated while the untracked conntrack
is attached to a skb because the reference count it re-initialized.

The best fix would be to use a seperate untracked conntrack per
namespace since it includes a namespace pointer. Unfortunately this is
not possible without larger changes since the namespace is not easily
available everywhere we need it. For now move the untracked conntrack
initialization to the init_net setup function to make sure the reference
count is not re-initialized and handle cleanup in the init_net cleanup
function to make sure namespaces can exit properly while the untracked
conntrack is in use in other namespaces.

Signed-off-by: Patrick McHardy <kaber@xxxxxxxxx>

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 0e98c32..37e2b88 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1113,6 +1113,10 @@ static void nf_ct_release_dying_list(struct net *net)

static void nf_conntrack_cleanup_init_net(void)
{
+ /* wait until all references to nf_conntrack_untracked are dropped */
+ while (atomic_read(&nf_conntrack_untracked.ct_general.use) > 1)
+ schedule();
+
nf_conntrack_helper_fini();
nf_conntrack_proto_fini();
kmem_cache_destroy(nf_conntrack_cachep);
@@ -1127,9 +1131,6 @@ static void nf_conntrack_cleanup_net(struct net *net)
schedule();
goto i_see_dead_people;
}
- /* wait until all references to nf_conntrack_untracked are dropped */
- while (atomic_read(&nf_conntrack_untracked.ct_general.use) > 1)
- schedule();

nf_ct_free_hashtable(net->ct.hash, net->ct.hash_vmalloc,
nf_conntrack_htable_size);
@@ -1288,6 +1289,14 @@ static int nf_conntrack_init_init_net(void)
if (ret < 0)
goto err_helper;

+ /* Set up fake conntrack: to never be deleted, not in any hashes */
+#ifdef CONFIG_NET_NS
+ nf_conntrack_untracked.ct_net = &init_net;
+#endif
+ atomic_set(&nf_conntrack_untracked.ct_general.use, 1);
+ /* - and look it like as a confirmed connection */
+ set_bit(IPS_CONFIRMED_BIT, &nf_conntrack_untracked.status);
+
return 0;

err_helper:
@@ -1333,15 +1342,6 @@ static int nf_conntrack_init_net(struct net *net)
if (ret < 0)
goto err_ecache;

- /* Set up fake conntrack:
- - to never be deleted, not in any hashes */
-#ifdef CONFIG_NET_NS
- nf_conntrack_untracked.ct_net = &init_net;
-#endif
- atomic_set(&nf_conntrack_untracked.ct_general.use, 1);
- /* - and look it like as a confirmed connection */
- set_bit(IPS_CONFIRMED_BIT, &nf_conntrack_untracked.status);
-
return 0;

err_ecache: