Re: Request for net guru help: waitqueue oops

From: Hans Grobler (grobh@sun.ac.za)
Date: Tue Oct 03 2000 - 08:40:12 EST


Hi Petkan,

Thanks for your comment.

On Tue, 3 Oct 2000, Petko Manolov wrote:
> > A driver I'm working on seems to be doing/triggering something related
> > to waitqueues. This causes a perfectly reproducable oops (small mercies!).
> > Since the oops is not happening in my driver, I'm having a hard time
> > figuring out whats going wrong. I suspect a networking guru will take
> > one look and know what I'm doing wrong. Any suggestions please?
>
>
> It seems you're trying to sleep without process context (most likely in
> net_tx_action). It would be more clear if you send that part of the
> code.

Since I don't explictly sleep anywhere, I'm not sure which code fragment
would be useful... (net_tx_action is part of the networking layers). Which
network functions can sleep (netif_rx, netif_stop_queue, netif_wake_queue,
...) ?

After reading the softnet HOWTO, and some of the network drivers, I
was unsure about the netif_stop_queue and netif_wake_queue functions. The
howto indicated that these two should be protected from concurrent
execution by a private lock. Not all the drivers seem to do this. In my
case (although I'm running UP at the moment), I've used a driver global
spinlock, for example:

  spinlock_t driver_lock = SPIN_LOCK_UNLOCKED;

  int scc72_hard_xmit (struct sk_buff *skb, struct net_device *dev)
  {
    unsigned long flags;

    /* ... */
  
    spin_lock_irqsave (&driver_lock, flags);
    netif_stop_queue (dev);
    spin_unlock_irqrestore (&driver_lock, flags);

    /* ... */
  }

  /* Example timer callback, to wake the queue */
  void scc72_interframewait (unsigned long channel)
  {
    unsigned long flags;
    struct scc72_channel *scc = (struct scc72_channel *) channel;

    /* ... */

    spin_lock_irqsave (&driver_lock, flags);

    /* ... */
 
    if (netif_queue_stopped (scc->dev))
      netif_wake_queue (scc->dev);

    spin_unlock_irqrestore (&driver_lock, flags);
  }

I've just checked my driver, and below is the list of all the external
functions called. Any idea which of these could be trying to sleep?

  dev_kfree_skb_any (called from both hard IRQ and non IRQ context)
  dev_alloc_skb (called from both hard IRQ and non IRQ context)
  del_timer (called from both hard IRQ and non IRQ context)
  add_timer (called from both hard IRQ and non IRQ context)
  netif_rx (called from IRQ context)
  netif_start_queue (called from non hard IRQ context, ex: dev_open)
  netif_stop_queue (called from non hard IRQ context, ex: hard_start_xmit)
  netif_wake_queue (called from non hard IRQ context, ex: timer callbacks)
  netif_queue_stopped (called from non hard IRQ context, ex: timer callbacks)
  skb_queue_tail (called from non hard IRQ context, ex: hard_start_xmit)
  skb_dequeue (called from both hard IRQ and non IRQ context)
  skb_queue_head_init (called from non hard IRQ context, ex: dev_open)

and the standard functions dev_init_buffers, register_netdevice,
   copy_from_user, unregister_netdev, etc. called in the standard places.

skb_queue_tail, skb_dequeue and skb_queue_head_init are used to manage
an internal queue of outgoing skb's.

Thanks.
-- Hans
  

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Oct 07 2000 - 21:00:12 EST