Re: [PATCH v5 09/27] IB/Verbs: Reform IB-core verbs/uverbs_cmd/sysfs

From: ira.weiny
Date: Wed Apr 22 2015 - 12:29:11 EST


On Wed, Apr 22, 2015 at 09:38:57AM +0200, Michael Wang wrote:
> Hi, Ira
>
> Thanks for the review :-)
>
> On 04/22/2015 01:19 AM, ira.weiny wrote:
> [snip]
> >> diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
> >> index cbd0383..8570180 100644
> >> --- a/drivers/infiniband/core/sysfs.c
> >> +++ b/drivers/infiniband/core/sysfs.c
> >> @@ -248,14 +248,10 @@ static ssize_t phys_state_show(struct ib_port *p, struct port_attribute *unused,
> >> static ssize_t link_layer_show(struct ib_port *p, struct port_attribute *unused,
> >> char *buf)
> >> {
> >> - switch (rdma_port_get_link_layer(p->ibdev, p->port_num)) {
> >> - case IB_LINK_LAYER_INFINIBAND:
> >> + if (rdma_tech_ib(p->ibdev, p->port_num))
> >
> > Is the final intention to remove Link Layer from the rdma stack entirely?
> >
> > I know that the use of link layer in userspace is just as convoluted as what we
> > are trying to fix here in the kernel. So it would be nice if we can eventually
> > get user space cleaned up to not use link layer as it currently does.
> >
> > However, standard networking tools can report the link layer. So while the
> > current use of "link layer" via userspace software is wrong I don't think it is
> > wrong to report this information _to_ userspace.
> >
> > So unless we intend to completely hide the link layer from userspace I don't
> > think we should be removing the rdma_port_get_link_layer call. It is still
> > valid information even though we don't want to use it in most places.
>
> This series won't erase the rdma_port_get_link_layer(), although
> currently only mlx4 still using it in kernel...

But this patch does not return that information to the user. So we have the
drivers reporting a link layer which is no longer exposed userspace.

I think we need to separate the rdma_tech_ib (or cap_tech_ib or whatever) from
reporting the link layer.

>
> link_layer_show() was supposed to report the same info to user
> space as usual, so user tool don't have to change anything :-)

We need to expose the "cap_*" functionality to userspace which can then convert
to this interface and stop relying on inferring support based on the link
layer. But that is a separate issue from correctly reporting the link layer.

The link layer should be reported correctly from the drivers "get_link_layer"
call.

Ira

>
> Regards,
> Michael Wang
>
> >
> > Ira
> >
> >> return sprintf(buf, "%s\n", "InfiniBand");
> >> - case IB_LINK_LAYER_ETHERNET:
> >> + else
> >> return sprintf(buf, "%s\n", "Ethernet");
> >> - default:
> >> - return sprintf(buf, "%s\n", "Unknown");
> >> - }
> >> }
> >>
> >> static PORT_ATTR_RO(state);
> >> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> >> index a9f0489..5dc90aa 100644
> >> --- a/drivers/infiniband/core/uverbs_cmd.c
> >> +++ b/drivers/infiniband/core/uverbs_cmd.c
> >> @@ -515,8 +515,10 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
> >> resp.active_width = attr.active_width;
> >> resp.active_speed = attr.active_speed;
> >> resp.phys_state = attr.phys_state;
> >> - resp.link_layer = rdma_port_get_link_layer(file->device->ib_dev,
> >> - cmd.port_num);
> >> + resp.link_layer = rdma_tech_ib(file->device->ib_dev,
> >> + cmd.port_num) ?
> >> + IB_LINK_LAYER_INFINIBAND :
> >> + IB_LINK_LAYER_ETHERNET;
> >>
> >> if (copy_to_user((void __user *) (unsigned long) cmd.response,
> >> &resp, sizeof resp))
> >> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> >> index 626c9cf..7264860 100644
> >> --- a/drivers/infiniband/core/verbs.c
> >> +++ b/drivers/infiniband/core/verbs.c
> >> @@ -200,11 +200,9 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num, struct ib_wc *wc,
> >> u32 flow_class;
> >> u16 gid_index;
> >> int ret;
> >> - int is_eth = (rdma_port_get_link_layer(device, port_num) ==
> >> - IB_LINK_LAYER_ETHERNET);
> >>
> >> memset(ah_attr, 0, sizeof *ah_attr);
> >> - if (is_eth) {
> >> + if (rdma_tech_iboe(device, port_num)) {
> >> if (!(wc->wc_flags & IB_WC_GRH))
> >> return -EPROTOTYPE;
> >>
> >> @@ -873,7 +871,7 @@ int ib_resolve_eth_l2_attrs(struct ib_qp *qp,
> >> union ib_gid sgid;
> >>
> >> if ((*qp_attr_mask & IB_QP_AV) &&
> >> - (rdma_port_get_link_layer(qp->device, qp_attr->ah_attr.port_num) == IB_LINK_LAYER_ETHERNET)) {
> >> + (rdma_tech_iboe(qp->device, qp_attr->ah_attr.port_num))) {
> >> ret = ib_query_gid(qp->device, qp_attr->ah_attr.port_num,
> >> qp_attr->ah_attr.grh.sgid_index, &sgid);
> >> if (ret)
> >> --
> >> 2.1.0
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/