This is an alternative version of ip/ip6/arp tables locking using
per-cpu locks. This avoids the overhead of synchronize_net() during
update but still removes the expensive rwlock in earlier versions.
The idea for this came from an earlier version done by Eric Duzamet.
Locking is done per-cpu, the fast path locks on the current cpu
and updates counters. The slow case involves acquiring the locks on
all cpu's.
The mutex that was added for 2.6.30 in xt_table is unnecessary since
there already is a mutex for xt[af].mutex that is held.
Tested basic functionality (add/remove/list), but don't have test cases
for stress, ip6tables or arptables.