RE: af_packet: use after free in prb_retire_rx_blk_timer_expired

From: liujian (CE)
Date: Sat Jul 22 2017 - 06:05:35 EST


I also hit this issue while testing with trinity.

The call trace:
[exception RIP: prb_retire_rx_blk_timer_expired+70]
RIP: ffffffff81633be6 RSP: ffff8801bec03dc0 RFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8801b49d0948 RCX: 0000000000000000
RDX: ffff8801b31057a0 RSI: a56b6b6b6b6b6b6b RDI: ffff8801b49d09ec
RBP: ffff8801bec03dd8 R8: 0000000000000001 R9: ffffffff83e1bf80
R10: 0000000000000002 R11: 0000000000000005 R12: ffff8801b49d09ec
R13: 0000000000000100 R14: ffffffff81633ba0 R15: ffff8801b49d0948
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#7 [ffff8801bec03de0] call_timer_fn at ffffffff8108cb76
#8 [ffff8801bec03e18] run_timer_softirq at ffffffff8108f87c
#9 [ffff8801bec03e90] __do_softirq at ffffffff8108629f
#10 [ffff8801bec03f00] call_softirq at ffffffff8166a01c
#11 [ffff8801bec03f18] do_softirq at ffffffff810172ad
#12 [ffff8801bec03f30] irq_exit at ffffffff81086655
#13 [ffff8801bec03f48] msa_irq_exit at ffffffff810b1ab3
#14 [ffff8801bec03f88] smp_apic_timer_interrupt at ffffffff8166aeae
#15 [ffff8801bec03fb0] apic_timer_interrupt at ffffffff816692dd
--- <IRQ stack> ---

From the vmcore, I can see that the pointer returned by GET_CURR_PBLOCK_DESC_FROM_CORE(pkc) is a56b6b6b6b6b6b6b:


struct packet_ring_buffer rx_ring = {
pg_vec = 0x0,
head = 0x0,
frames_per_block = 0x400,
frame_size = 0x0,
frame_max = 0xffffffff,
pg_vec_order = 0x0,
pg_vec_pages = 0x0,
pg_vec_len = 0x0,
pending_refcnt = 0x0,
prb_bdqc = {
pkbdq = 0xffff8801b31057a0,
feature_req_word = 0x1,
hdrlen = 0x44,
reset_pending_on_curr_blk = 0x1,
delete_blk_timer = 0x0,
kactive_blk_num = 0x0,
blk_sizeof_priv = 0x0,
last_kactive_blk_num = 0x0,
pkblk_start = 0xffff8800a7000000 struct: page excluded: kernel virtual address: ffff8800a7000000 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000000 type: "gdb_readmem_callback"
<Address 0xffff8800a7000000 out of bounds>,
pkblk_end = 0xffff8800a7200000 "\002",
kblk_size = 0x200000,
max_frame_len = 0x1fffd0,
knum_blocks = 0x1,
knxt_seq_num = 0x2,
prev = 0xffff8800a7000030 struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
<Address 0xffff8800a7000030 out of bounds>,
nxt_offset = 0xffff8800a7000030 struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
struct: page excluded: kernel virtual address: ffff8800a7000030 type: "gdb_readmem_callback"
<Address 0xffff8800a7000030 out of bounds>,
skb = 0x0,
blk_fill_in_prog = {
counter = 0x0

crash> struct pgv 0xffff8801b31057a0
struct pgv {
buffer = 0xa56b6b6b6b6b6b6b <Address 0xa56b6b6b6b6b6b6b out of bounds>
}
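
For reference, a56b6b6b6b6b6b6b is the pattern slab poisoning leaves in freed memory: include/linux/poison.h defines POISON_FREE as 0x6b and POISON_END as 0xa5, and a freed object is filled with 0x6b with 0xa5 in its last byte. The user-space sketch below reproduces that pattern. It assumes the pg_vec array was an 8-byte allocation (knum_blocks = 0x1 times a struct pgv holding one pointer); the helper names poison_object() and last_word_le() are hypothetical and are not kernel functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Slab poison bytes, values as in include/linux/poison.h. */
#define POISON_FREE 0x6b /* fills the body of a freed object */
#define POISON_END  0xa5 /* marks the last byte of a freed object */

/* Hypothetical helper: fill an object the way slab poisoning does on free. */
static void poison_object(unsigned char *obj, size_t size)
{
    memset(obj, POISON_FREE, size - 1);
    obj[size - 1] = POISON_END;
}

/*
 * Hypothetical helper: read the last 8 bytes of the object as a
 * little-endian 64-bit value, which is how x86-64 loads a pointer.
 */
static uint64_t last_word_le(const unsigned char *obj, size_t size)
{
    uint64_t v = 0;
    int i;

    for (i = 7; i >= 0; i--)
        v = (v << 8) | obj[size - 8 + i];
    return v;
}
```

Calling poison_object() on an 8-byte buffer and then last_word_le() on it gives 0xa56b6b6b6b6b6b6b, the same value seen in RSI and in pgv.buffer above. That is consistent with pg_vec having been freed before the timer ran.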


Best Regards,
liujian


> -----Original Message-----
> From: netdev-owner@xxxxxxxxxxxxxxx [mailto:netdev-owner@xxxxxxxxxxxxxxx]
> On Behalf Of Willem de Bruijn
> Sent: Wednesday, April 12, 2017 7:23 AM
> To: Dave Jones; alexander.levin@xxxxxxxxxxx; davem@xxxxxxxxxxxxx;
> edumazet@xxxxxxxxxx; willemb@xxxxxxxxxx; daniel@xxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: af_packet: use after free in prb_retire_rx_blk_timer_expired
>
> On Mon, Apr 10, 2017 at 3:23 PM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
> > On Mon, Apr 10, 2017 at 07:03:30PM +0000, alexander.levin@xxxxxxxxxxx wrote:
> > > Hi all,
> > >
> > > I seem to be hitting this use-after-free on a -next kernel using trinity:
> > >
> > > [ 531.036054] BUG: KASAN: use-after-free in prb_retire_rx_blk_timer_expired (net/packet/af_packet.c:688)
>
> The retire_blk_timer is called after the pg_vec struct for this ring was freed.
> This should not happen. packet_set_ring stops the timer with del_timer_sync
> when tearing down the ring before freeing that struct:
>
> if (closing && (po->tp_version > TPACKET_V2)) {
> /* Because we don't support block-based V3 on tx-ring */
> if (!tx_ring)
> prb_shutdown_retire_blk_timer(po, rb_queue);
> }
>
> if (pg_vec)
> free_pg_vec(pg_vec, order, req->tp_block_nr);
>
> This is a similar race to the use-after-free fixed by 84ac7260236a
> ("packet: fix race condition in packet_set_ring"). The previous race was
> triggered by a call to setsockopt PACKET_VERSION changing tp_version while
> the ring is active. It is not immediately obvious what is the cause now. I
> suppose trinity does not give a trace of such system calls on this file descriptor?
> That would be helpful.
>
> The bug report shows both a timer firing after the packet_set_ring call that
> freed the pg_vec, and later a CONFIG_DEBUG_OBJECTS_FREE warning that
> the timer is still active when the socket is closed on release of the last file
> descriptor.