Re: [PATCH net-next v2] xfrm: Add ESN support for IPSec HW offload

From: Shannon Nelson
Date: Thu Jan 11 2018 - 14:16:18 EST

Next message: Dave Hansen: "Re: [RFC PATCH v2 6/6] x86/entry/pti: don't switch PGD on when pti_disable is set"
Previous message: Eric Dumazet: "Re: [RFC 1/2] softirq: Defer net rx/tx processing to ksoftirqd context"
In reply to: Aviad Yehezkel: "Re: [PATCH net-next v2] xfrm: Add ESN support for IPSec HW offload"

On 1/11/2018 5:51 AM, Aviad Yehezkel wrote:

On 1/11/2018 10:28 AM, Yossi Kuperman wrote:

From: Shannon Nelson [mailto:shannon.nelson@xxxxxxxxxx]
Sent: Thursday, January 11, 2018 5:21 AM

On 1/10/2018 3:09 PM, Yossi Kuperman wrote:

On 10 Jan 2018, at 19:36, Shannon Nelson <shannon.nelson@xxxxxxxxxx> wrote:

On 1/10/2018 2:34 AM, yossefe@xxxxxxxxxxxx wrote:
From: Yossef Efraim <yossefe@xxxxxxxxxxxx>
This patch adds ESN support to IPsec device offload.
Adding new xfrm device operation to synchronize device ESN.
Signed-off-by: Yossef Efraim <yossefe@xxxxxxxxxxxx>
---
Changes from v1:
  - Added documentation
---
 Documentation/networking/xfrm_device.txt |  3 +++
 include/linux/netdevice.h                |  1 +
 include/net/xfrm.h                       | 12 ++++++++++++
 net/xfrm/xfrm_device.c                   |  4 ++--
 net/xfrm/xfrm_replay.c                   |  2 ++
 5 files changed, 20 insertions(+), 2 deletions(-)

[...]

diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 7598250..704a055 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -147,8 +147,8 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
     if (!x->type_offload)
         return -EINVAL;

-    /* We don't yet support UDP encapsulation, TFC padding and ESN. */
-    if (x->encap || x->tfcpad || (x->props.flags & XFRM_STATE_ESN))
+    /* We don't yet support UDP encapsulation and TFC padding. */
+    if (x->encap || x->tfcpad)

As I mentioned before, this will cause issues when working with hardware that has no ESN support, such as Intel's x540: the stack will
expect the driver to do ESN, and nothing actually happens but a rollover of the numbers. Sure, the driver could look for the ESN attribute
and fail the add, but that's a mode where we have to update every driver to fend off problems every time we add a new feature. Much
better is to only update drivers that actively support the new feature.
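
To make the rollover concern concrete, here is a small illustrative sketch (not code from the patch or from any driver; all helper names are hypothetical). With ESN the sequence number is 64 bits wide and only its low 32 bits travel in the packet, so a counter that is only 32 bits wide wraps to zero at the point where the stack expects the high half to advance:

```c
#include <stdint.h>

/* Hypothetical helper: advance a plain 32-bit sequence counter.
 * Unsigned arithmetic wraps to 0 after 0xFFFFFFFF. */
static uint32_t demo_seq32_next(uint32_t seq)
{
    return (uint32_t)(seq + 1u);
}

/* Hypothetical helper: advance a 64-bit extended sequence number. */
static uint64_t demo_esn_next(uint64_t esn)
{
    return esn + 1u;
}

/* Low half: the 32 bits that are carried in the packet. */
static uint32_t demo_esn_lo(uint64_t esn)
{
    return (uint32_t)esn;
}

/* High half: tracked by both ends, never sent on the wire. */
static uint32_t demo_esn_hi(uint64_t esn)
{
    return (uint32_t)(esn >> 32);
}
```

At the 2^32 boundary both counters show the same low 32 bits, but only the 64-bit one records that the high half moved from 0 to 1.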

You are right.

I'm not sure why this check is here in the first place. IMO it should take place in xdo_dev_state_add, a driver-specific callback.

If you say I'm right, then why do you say it should take place in the
driver callback? I just wrote that it should *not*.

Sorry, I wasn't clear; you are right that this change will break Intel's x540 driver.

However, I do think that this is the purpose of xdo_dev_state_add(). Again, as far as I can understand, and please correct me if I'm wrong, this check shouldn't be here in the first place.

Please have a look at mlx5e_xfrm_validate_state(). Currently, it returns an error if the user requests ESN, regardless of the underlying device's capabilities. A subsequent patch to the mlx5 driver will allow such a request if the device does support it, maintaining backward compatibility.

Here is a code snippet:

-       if (x->props.flags & XFRM_STATE_ESN) {
+       if (x->props.flags & XFRM_STATE_ESN &&
+           !(mlx5_accel_ipsec_device_caps(priv->mdev) & MLX5_ACCEL_IPSEC_ESN)) {
                netdev_info(netdev, "Cannot offload ESN xfrm states\n");
                return -EINVAL;
        }

This code seems to be assuming that all drivers/NICs with the offload
will be able to do ESN, and this is not the case. If this code is put
into place, suddenly the ixgbe driver's offload will have a failure
case: the driver doesn't support ESN, and doesn't know to NAK the
state_add if the ESN bit is on. This is a generic capabilities issue
for which we already have a solution "pattern".

I guess you are right, but the ixgbe driver already checks many other caps in its add_sa callback (the code below is from the v3 patches for ixgbe ipsec):

+    if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) {
+        netdev_err(dev, "Unsupported protocol 0x%04x for ipsec offload\n",
+               xs->id.proto);
+        return -EINVAL;
+    }
+
+    if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) {
+        struct rx_sa rsa;
+
+        if (xs->calg) {
+            netdev_err(dev, "Compression offload not supported\n");
+            return -EINVAL;
+        }

What is the difference between checking whether xs->calg is set in the state and checking for ESN?

Yes, the two existing drivers are doing this: I imagine that mlx5e did it this way because it happened to be the first driver and it got things working; ixgbe followed because there wasn't any other way to do this. But this doesn't mean it is the right thing to do, and it is good that we're having this discussion before too many other drivers end up following this same example.

If I've read the patch correctly, the SA with ESN enabled will be added for offload, but nothing will happen when the ESN needs to be advanced if the driver hasn't implemented xdo_dev_state_advance_esn(). At this point the ipsec conversation will fail, correct? How do we protect the XFRM stack's new feature from drivers that don't support it?

The quick and dirty answer is for this patch to include code for ixgbe and any other ipsec-offload drivers. However, this becomes a burden for the author of any new feature, where every driver will need to be updated for it to work correctly, and every driver will need to have the same code to do it. This is opening the door to mistakes.

When we look at code like mlx5e_xfrm_validate_state(), and similar things in ixgbe, we can see there are many capabilities that every ipsec offload driver needs to check for. If drivers have to copy code to do the same checks, let's push these common requirements up the stack so we only need the code in one place, rather than code it in every driver.

I think in the long term we can refactor so that the driver declares a capability mask and add_sa is called only if the required bits are set in that mask, but
this can be a totally different patch.

Let's do this now before more drivers are enabled for ipsec, while the problem is still small.

In the meantime, while we're still hashing this out, please at least add something in xfrm_dev_state_add() to return -EINVAL if the driver hasn't implemented xdo_dev_state_advance_esn(). Perhaps something like this:

@@ -172,10 +172,12 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
         dst_release(dst);
     }

-    if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add) {
+    if (!dev->xfrmdev_ops || !dev->xfrmdev_ops->xdo_dev_state_add ||
+        ((x->props.flags & XFRM_STATE_ESN) &&
+         !dev->xfrmdev_ops->xdo_dev_state_advance_esn)) {
         xso->dev = NULL;
         dev_put(dev);
-        return 0;
+        return -EINVAL;
     }

     xso->dev = dev;
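
For readers following along outside the kernel tree, the ESN part of the guard in the hunk above can be sketched as a standalone program. This is only an illustration under stated assumptions: the `demo_*` types and the `DEMO_STATE_ESN` flag are hypothetical stand-ins for `struct xfrm_state`, `struct xfrmdev_ops` and `XFRM_STATE_ESN`, not the real kernel definitions.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_STATE_ESN 0x1u  /* hypothetical stand-in for XFRM_STATE_ESN */

struct demo_state {
    unsigned int flags;
};

/* Hypothetical stand-in for the driver's offload callback table. */
struct demo_xfrmdev_ops {
    int  (*state_add)(struct demo_state *x);
    void (*state_advance_esn)(struct demo_state *x);
};

/* Refuse the offload when ESN is requested but the driver has no
 * advance_esn callback; otherwise hand the state to the driver. */
static int demo_dev_state_add(const struct demo_xfrmdev_ops *ops,
                              struct demo_state *x)
{
    if (!ops || !ops->state_add)
        return -EINVAL;
    if ((x->flags & DEMO_STATE_ESN) && !ops->state_advance_esn)
        return -EINVAL;
    return ops->state_add(x);
}

/* Two toy drivers: one without ESN support, one with it. */
static int demo_add_ok(struct demo_state *x) { (void)x; return 0; }
static void demo_advance(struct demo_state *x) { (void)x; }

static const struct demo_xfrmdev_ops demo_no_esn_ops = {
    .state_add = demo_add_ok,
};
static const struct demo_xfrmdev_ops demo_esn_ops = {
    .state_add = demo_add_ok,
    .state_advance_esn = demo_advance,
};

/* Convenience wrapper: try to add a state with the given flags. */
static int demo_try(const struct demo_xfrmdev_ops *ops, unsigned int flags)
{
    struct demo_state x = { .flags = flags };
    return demo_dev_state_add(ops, &x);
}
```

With these toy drivers, `demo_try(&demo_no_esn_ops, DEMO_STATE_ESN)` is rejected, while the same request against `demo_esn_ops` reaches the driver. A driver that predates the feature needs no changes to be protected.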

sln

We weren't assuming that, please see above.

 > What do you suggest?
 >

There should be a capabilities/feature flag for the driver to set and
the XFRM code shouldn't try the state_add with ESN if the driver hasn't
set an ESN bit in its capabilities. Other capabilities that might make
sense here are IPv6, TSO, and CSUM; there may be others.

Look at how feature bits are added to netdev->features to signify what the driver can do. I think that's a much better approach.
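
A minimal sketch of what such a capability check might look like, assuming a per-device bitmask that the driver fills in at registration time. The `DEMO_CAP_*` names and `demo_check_caps()` are hypothetical and do not exist in the kernel; the sketch only shows the shape of the idea, with the check done once in common code rather than in every driver.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical capability bits a driver could advertise. */
enum {
    DEMO_CAP_ESN  = 1 << 0,
    DEMO_CAP_IPV6 = 1 << 1,
    DEMO_CAP_TSO  = 1 << 2,
    DEMO_CAP_CSUM = 1 << 3,
};

/* Common-code check: fail if the state needs any capability the
 * driver did not advertise. Drivers never see unsupported requests. */
static int demo_check_caps(uint32_t driver_caps, uint32_t required)
{
    if (required & ~driver_caps)
        return -EINVAL;
    return 0;
}
```

A driver that advertises only `DEMO_CAP_IPV6 | DEMO_CAP_CSUM` would have an ESN request turned away by the stack before its add callback is ever invoked, and adding a new capability bit later would not require touching that driver.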

It looks like an overkill?

Alternatively, just solve this by failing to add the SA that has ESN set
if the driver hasn't defined your new xdo_dev_state_advance_esn().

sln

sln

         return -EINVAL;
     dev = dev_get_by_index(net, xuo->ifindex);
diff --git a/net/xfrm/xfrm_replay.c b/net/xfrm/xfrm_replay.c
index 0250181..1d38c6a 100644
--- a/net/xfrm/xfrm_replay.c
+++ b/net/xfrm/xfrm_replay.c
@@ -551,6 +551,8 @@ static void xfrm_replay_advance_esn(struct xfrm_state *x, __be32 net_seq)
         bitnr = replay_esn->replay_window - (diff - pos);
     }

+    xfrm_dev_state_advance_esn(x);
+
     nr = bitnr >> 5;
     bitnr = bitnr & 0x1F;
     replay_esn->bmp[nr] |= (1U << bitnr);
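
As a side note on the last three context lines of the hunk above: the replay bitmap is stored as an array of 32-bit words, so `bitnr >> 5` selects the word and `bitnr & 0x1F` selects the bit inside it. A standalone sketch of that arithmetic (the `demo_*` helpers are hypothetical and not part of xfrm_replay.c):

```c
#include <stdint.h>

/* Index of the 32-bit word that holds bit number bitnr (bitnr / 32). */
static unsigned int demo_word_index(unsigned int bitnr)
{
    return bitnr >> 5;
}

/* Mask of the bit inside that word (bit position bitnr % 32). */
static uint32_t demo_bit_mask(unsigned int bitnr)
{
    return 1U << (bitnr & 0x1F);
}

/* Mark bitnr as seen in a bitmap made of 32-bit words. The caller
 * must make sure bmp has at least demo_word_index(bitnr) + 1 words. */
static void demo_bmp_set(uint32_t *bmp, unsigned int bitnr)
{
    bmp[demo_word_index(bitnr)] |= demo_bit_mask(bitnr);
}
```

For example, bit number 37 lands in word 1 at bit position 5.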
