Re: inode_lock "decorative"?

kuznet@ms2.inr.ac.ru
Fri, 26 Nov 1999 20:04:20 +0300 (MSK)


Hello!

> Actually for dummy inodes like for the sockets we could avoid queueing the
> inode in the inuse_list but then we should add a path in iput() to check
> that the inode is queued in a list before calling list_del. This thing is
> not very interesting IMHO.

If you make this check before grabbing inode_lock, it becomes to be a bit
more interesting.

It is also cleaner from viewpoint of sockets. 8) At least, the fact that
it is private and referenced only from "struct file" becomes explicit.
For sockets atomic f_count on struct file is enough to keep it safe.

But you are right in main. Another aspect is that VFS is not
BH safe and we cannot allocate new sockets inside net core.
It is not a secret that we lose lots of space in struct inode per socket,
which is reserved by FS instances. Do you see? If "dummy" inodes are private,
we can easily invent a way to allocate/release them from net and save
lots of space and dereferences sock <-> socket.

> What I ask you is _why_ you increment and the iput by hand. Just remove
> the i_count++ and the iput() and all should remains fine. Basically you
> run always with an i_count of 2 while you could run all the time with an
> i_count of 1 without differences (unless I am missing something... hmm).

No, you did not miss anything. It is pure paranoia. It accounts only for
dummy helper reference sock->inode. Explained only by absense of credit
to VFS 8) I would prefer to convert it to a macro, though.

> >Also, is it difficult to make similar thing with d_alloc_root?
>
> It's just possible even if the dcache is still signle threaded. The reason
> is that d_alloc remains local to you if the parent is NULL (and the parent
> is null when called by d_alloc_root()). Then there is d_instantitate()
> that will do a list_add on a list local to you as it's in your dummy inode
> (and you won't have more than one dentry per inode so you have not to
> protect either inside your subsystem). So basically you can run such piece
> of code without the big kernel lock.

Hmm, I believed it is referenced from a root dentry? No? OK.

Superb, Ingo blamed on some remnants of kernel lock contention.
Two of them are dead. 8) The only things to do is close()/poll()/ioctl().

> I also noticed an SMP race in a accounting var.

I know about this. No, atomic_* is bad choice. We have lots
of such useless accounting-only crap not-serialized properly
(snmp counters etc.). I will convert them to cpu local vars in one hit,
only have to invent a way to macroize bunching them to areas
without (or with) linker help.

BTW why not to create NR_CPUS bss sections?

Alexey

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/