Re: [PATCH net-next v2 1/1] net: bridge: switchdev: Improve error message clarity for switchdev_port_obj_add/del_deffered operations

From: Joe Perches
Date: Sat Mar 09 2024 - 09:36:08 EST


On Fri, 2024-03-08 at 11:47 +0100, Oleksij Rempel wrote:
> Enhance the error reporting mechanism in the switchdev framework to
> provide more informative and user-friendly error messages.
>
> Following feedback from users struggling to understand the implications
> of error messages like "failed (err=-28) to add object (id=2)", this
> update aims to clarify what operation failed and how this might impact
> the system or network.
>
> With this change, error messages now include a description of the failed
> operation, the specific object involved, and a brief explanation of the
> potential impact on the system. This approach helps administrators and
> developers better understand the context and severity of errors,
> facilitating quicker and more effective troubleshooting.
>
> Example of the improved logging:
>
> [ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast
> Database entry (object id=2) with error: -ENOSPC (-28).
> [ 70.516446] Failure in updating the port's Multicast Database could
> lead to multicast forwarding issues.
> [ 70.516446] Current HW/SW setup lacks sufficient resources.

In general, I think the "problem" is being written with 80
columns in mind in the source and is not well thought out
how someone might read the log.

Generally, I think it's better to have single line statements
in the log.

[]

> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
[]
> @@ -244,6 +244,106 @@ static int switchdev_port_obj_notify(enum switchdev_notifier_type nt,
> return 0;
> }

>
> +static void switchdev_obj_id_to_helpful_msg(struct net_device *dev,
> + enum switchdev_obj_id obj_id,
> + int err, bool add)
> +{
> + const char *action = add ? "add" : "del";
> + const char *reason = "";
> + const char *problem;
> + const char *obj_str;
> +
> + switch (obj_id) {
> + case SWITCHDEV_OBJ_ID_UNDEFINED:
> + obj_str = "Undefined object";
> + problem = "Attempted operation is undefined, indicating a "
> + "possible programming error.\n";

My preference would be to write
problem = "Attempted operation is undefined indicating a possible programming error\n";

> + break;
> + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> + obj_str = "VLAN entry";
> + problem = "Failure in VLAN settings on this port might disrupt "
> + "network segmentation or traffic isolation, affecting\n"
> + "network partitioning.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_PORT_MDB:
> + obj_str = "Port Multicast Database entry";
> + problem = "Failure in updating the port's Multicast Database "
> + "could lead to multicast forwarding issues.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_HOST_MDB:
> + obj_str = "Host Multicast Database entry";
> + problem = "Failure in updating the host's Multicast Database"
> + "may impact multicast group memberships or\n"

No space after Database makes the output "Databasemay"

> + "traffic delivery, affecting multicast communication.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_MRP:
> + obj_str = "Media Redundancy Protocol configuration for port";
> + problem = "Failure to set MRP ring ID on this port prevents"
> + "communication with the specified redundancy ring,\n"

portcommunication

> + "resulting in an inability to engage in MRP-based "
> + "network operations.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_RING_TEST_MRP:
> + obj_str = "MRP Test Frame Operations for port";
> + problem = "Failure to generate/monitor MRP test frames may lead"
> + "to inability to assess the ring's operational\n"

leadto

> + "integrity and fault response, hindering proactive "
> + "network management.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_RING_ROLE_MRP:
> + obj_str = "MRP Ring Role Configuration";
> + problem = "Improper MRP ring role configuration may create "
> + "conflicts in the ring, disrupting communication\n"
> + "for all participants, or isolate the local system "
> + "from the ring, hindering its ability to communicate "
> + "with other participants.\n";

A bunch of unnecessary commas.

> + break;
> + case SWITCHDEV_OBJ_ID_RING_STATE_MRP:
> + obj_str = "MRP Ring State Configuration";
> + problem = "Failure to correctly set the MRP ring state can "
> + "result in network loops or leave segments without\n"
> + "communication. In a Closed state, it maintains loop "
> + "prevention by blocking one MRM port, while an Open\n"
> + "state activates in response to failures, changing "
> + "port states to preserve network connectivity.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_TEST_MRP:
> + obj_str = "MRP_InTest Frame Generation Configuration";
> + problem = "Failure in managing MRP_InTest frame generation can "
> + "misjudge the interconnection ring's state, leading\n"
> + "to incorrect blocking or unblocking of the I/C port."
> + "This misconfiguration might result in unintended\n"
> + "network loops or isolate critical network segments, "
> + "compromising network integrity and reliability.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_ROLE_MRP:
> + obj_str = "Interconnection Ring Role Configuration";
> + problem = "Failure in incorrect assignment of interconnection "
> + "ring roles (MIM/MIC) can impair the formation of the\n"
> + "interconnection rings.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_STATE_MRP:
> + obj_str = "Interconnection Ring State Configuration";
> + problem = "Failure in updating the interconnection ring state "
> + "can lead in case of Open state to incorrect blocking\n"
> + "or unblocking of the I/C port, resulting in unintended"
> + "network loops or isolation of critical network\n";
> + break;
> + default:
> + obj_str = "Unknown object";
> + problem = "Indicating a possible programming error.\n";
> + }
> +
> + switch (err) {
> + case -ENOSPC:
> + reason = "Current HW/SW setup lacks sufficient resources.\n";

And adding a newline here puts an unnecessary newline between
logging output as the format also has a trailing newline.


> + break;
> + }
> +
> + netdev_err(dev, "Failed to %s %s (object id=%d) with error: %pe (%d).\n%s%s\n",
> + action, obj_str, obj_id, ERR_PTR(err), err, problem, reason);
> +}
> +