Re: [PATCH] IBcore/CM: Issue DREQ when receiving REQ/REP for stale QP

From: Doug Ledford
Date: Wed Dec 14 2016 - 12:56:04 EST


On 10/28/2016 7:14 AM, Hans Westgaard Ry wrote:
> From "InfiniBand Architecture Specifications Volume 1":
>
> A QP is said to have a stale connection when only one side has
> connection information. A stale connection may result if the remote CM
> had dropped the connection and sent a DREQ but the DREQ was never
> received by the local CM. Alternatively the remote CM may have lost
> all record of past connections because its node crashed and rebooted,
> while the local CM did not become aware of the remote node's reboot
> and therefore did not clean up stale connections.
>
> and:
>
> A local CM may receive a REQ/REP for a stale connection. It shall
> abort the connection issuing REJ to the REQ/REP. It shall then issue
> DREQ with "DREQ:remote QPN" set to the remote QPN from the REQ/REP.
>
> This patch solves a problem with reuse of QPNs. The current codebase,
> that is IPoIB, relies on a REAP mechanism to clean up the structures
> in CM. The problem with this is the time constants governing the
> mechanism; they are up to 768 seconds, and the interface may look
> unresponsive for that period. Issuing a DREQ (and receiving a DREP)
> does the necessary cleanup and the interface comes up.
>
> Signed-off-by: Hans Westgaard Ry <hans.westgaard.ry@xxxxxxxxxx>
> Reviewed-by: Håkon Bugge <haakon.bugge@xxxxxxxxxx>

Thanks, applied.
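
For anyone reading this thread later, the sequence in the spec text quoted
above (REJ first, then DREQ carrying the remote QPN, with cleanup once the
DREP arrives) can be sketched roughly as follows. This is a standalone toy
model, not the applied patch and not the kernel CM code: every type and
function name here (cm_conn, cm_out, cm_handle_stale, cm_handle_drep) is
hypothetical, and matching on the remote QPN alone is a simplifying
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message types; these are not the kernel's CM definitions. */
enum cm_msg { CM_MSG_NONE, CM_MSG_REJ, CM_MSG_DREQ };

/* One outgoing message produced by the handler. */
struct cm_out {
	enum cm_msg type;
	uint32_t remote_qpn;	/* only meaningful for CM_MSG_DREQ */
};

/* Minimal local record of a connection (assumed layout). */
struct cm_conn {
	uint32_t remote_qpn;
	int established;	/* local side still believes it is connected */
	int dreq_sent;		/* DREQ issued, waiting for the DREP */
};

/*
 * Called when a REQ/REP arrives carrying remote_qpn. If the local side
 * still holds an established connection for that QPN, the connection is
 * stale: queue a REJ, then a DREQ with the remote QPN from the REQ/REP.
 * Returns the number of messages written to out[] (0 or 2).
 */
size_t cm_handle_stale(struct cm_conn *conns, size_t nconns,
		       uint32_t remote_qpn, struct cm_out out[2])
{
	assert(conns != NULL || nconns == 0);

	for (size_t i = 0; i < nconns; i++) {
		if (conns[i].established &&
		    conns[i].remote_qpn == remote_qpn) {
			out[0].type = CM_MSG_REJ;
			out[0].remote_qpn = 0;
			out[1].type = CM_MSG_DREQ;
			out[1].remote_qpn = remote_qpn;
			conns[i].dreq_sent = 1;
			return 2;
		}
	}
	return 0;	/* no stale connection for this QPN */
}

/* Called when the DREP for remote_qpn arrives: drop the stale state. */
void cm_handle_drep(struct cm_conn *conns, size_t nconns,
		    uint32_t remote_qpn)
{
	for (size_t i = 0; i < nconns; i++) {
		if (conns[i].dreq_sent &&
		    conns[i].remote_qpn == remote_qpn) {
			conns[i].established = 0;
			conns[i].dreq_sent = 0;
		}
	}
}
```

The point of the model is only the ordering: the stale entry is cleared by
the DREQ/DREP exchange itself instead of waiting for a timer to expire.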


--
Doug Ledford <dledford@xxxxxxxxxx>
GPG Key ID: 0E572FDD
