Re: [PATCH net v2] atl1c: move tx cleanup processing out of interrupt

From: Eric Dumazet
Date: Fri Apr 02 2021 - 14:05:30 EST

On 4/2/21 7:20 PM, Gatis Peisenieks wrote:
> Tx queue cleanup happens in interrupt handler on same core as rx queue processing.
> Both can take considerable amount of processing in high packet-per-second scenarios.
>
> Sending big amounts of packets can stall the rx processing which is unfair
> and also can lead to an out-of-memory condition since __dev_kfree_skb_irq
> queues the skbs for later kfree in softirq which is not allowed to happen
> with heavy load in interrupt handler.
>
> This puts tx cleanup in its own napi and enables threaded napi to allow the rx/tx
> queue processing to happen on different cores.
>
> The ability to sustain equal amounts of tx/rx traffic increased:
> from 280Kpps to 1130Kpps on Threadripper 3960X with upcoming Mikrotik 10/25G NIC,
> from 520Kpps to 850Kpps on Intel i3-3320 with Mikrotik RB44Ge adapter.
>
> Signed-off-by: Gatis Peisenieks <gatis@xxxxxxxxxxxx>
> ---
>  drivers/net/ethernet/atheros/atl1c/atl1c.h    |  2 +
>  .../net/ethernet/atheros/atl1c/atl1c_main.c   | 43 +++++++++++++++++--
>  2 files changed, 41 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h
> index a0562a90fb6d..4404fa44d719 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c.h
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h
> @@ -506,6 +506,7 @@ struct atl1c_adapter {
>      struct net_device   *netdev;
>      struct pci_dev      *pdev;
>      struct napi_struct  napi;
> +    struct napi_struct  tx_napi;
>      struct page         *rx_page;
>      unsigned int        rx_page_offset;
>      unsigned int        rx_frag_size;
> @@ -529,6 +530,7 @@ struct atl1c_adapter {
>      u16 link_duplex;
>
>      spinlock_t mdio_lock;
> +    spinlock_t irq_mask_lock;
>      atomic_t irq_sem;
>
>      struct work_struct common_task;
> diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> index 3f65f2b370c5..f51b28e8b6dc 100644
> --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c
> @@ -813,6 +813,7 @@ static int atl1c_sw_init(struct atl1c_adapter *adapter)
>      atl1c_set_rxbufsize(adapter, adapter->netdev);
>      atomic_set(&adapter->irq_sem, 1);
>      spin_lock_init(&adapter->mdio_lock);
> +    spin_lock_init(&adapter->irq_mask_lock);
>      set_bit(__AT_DOWN, &adapter->flags);
>
>      return 0;
> @@ -1530,7 +1531,7 @@ static inline void atl1c_clear_phy_int(struct atl1c_adapter *adapter)
>      spin_unlock(&adapter->mdio_lock);
>  }
>
> -static bool atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
> +static unsigned atl1c_clean_tx_irq(struct atl1c_adapter *adapter,
>                  enum atl1c_trans_queue type)


This v2 is much better, thanks.

You might rename atl1c_clean_tx_irq(), because it no longer runs
in hard irq context?

Maybe merge atl1c_clean_tx_irq() and atl1c_clean_tx() into a single function?
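
To illustrate the shape a merged function could take, here is a rough
userspace sketch, not the driver's actual code. The struct mock_tx_ring
type, its fields, and mock_clean_tx() are hypothetical stand-ins for the
real atl1c structures; the only point is a single budget-limited routine
that returns the amount of work done, the way a NAPI poll callback does.

```c
/* Hypothetical stand-in for a tx ring; not the real atl1c structures. */
struct mock_tx_ring {
	unsigned int next_to_clean;	/* software consumer index */
	unsigned int hw_next_to_clean;	/* index hardware has completed up to */
	unsigned int count;		/* number of descriptors in the ring */
};

/*
 * Single poll-style cleanup routine: reclaim at most 'budget' completed
 * descriptors and return how many were reclaimed.
 *
 * In a real driver the caller would be the tx NAPI poll callback, which
 * would call napi_complete_done() and re-enable the tx interrupt when
 * the return value is smaller than the budget.
 */
static int mock_clean_tx(struct mock_tx_ring *ring, int budget)
{
	int work_done = 0;

	while (ring->next_to_clean != ring->hw_next_to_clean &&
	       work_done < budget) {
		/* A real driver would unmap the buffer and free the skb here. */
		ring->next_to_clean = (ring->next_to_clean + 1) % ring->count;
		work_done++;
	}

	return work_done;
}
```

With a return value below the budget meaning "ring drained", the caller
needs no separate bool result, which is what makes the merge natural.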