[PATCH] optoe: driver to read/write SFP/QSFP EEPROMs

From: Don Bollinger
Date: Mon Jun 11 2018 - 00:48:23 EST


optoe is an i2c based driver that supports read/write access to all
the pages (tables) of MSA standard SFP and similar devices (conforming
to the SFF-8472 spec) and MSA standard QSFP and similar devices
(conforming to the SFF-8436 spec).

These devices provide identification, operational status and control
registers via an EEPROM model. These devices support one or three fixed
pages (128 bytes) of data, and one page that is selected via a page
register on the first fixed page. Thus the driver's main task is
to map these pages onto a simple linear address space for user space
management applications. See the driver code for a detailed layout.

EEPROM data is accessible via a bin_attribute file called 'eeprom',
e.g. /sys/bus/i2c/devices/24-0050/eeprom.

Signed-off-by: Don Bollinger <don@xxxxxxxxxxxxxxxxx>
---

Why should this driver be in the Linux kernel? SFP and QSFP devices plug
into switches to convert electrical to optical signals and drive the
optical signal over fiber optic cables. They provide status and control
registers through an i2c interface similar to other EEPROMs. However,
they have a unique paging mechanism, which requires a different
driver from (for example) at24. Various drivers have been developed for
this purpose, but none of them supports both SFP and QSFP, provides both
read and write access, and accesses all 256 architected pages. optoe does
all of these.
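
For reviewers who want to check the linear-to-paged mapping without
reading the whole driver, here is a minimal user-space sketch of the
translation described above. It mirrors the arithmetic of
optoe_translate_offset() in this patch, but the names sff_addr and
sff_translate are illustrative only and do not appear in the driver.

```c
/*
 * Illustrative sketch only: struct sff_addr and sff_translate() are
 * hypothetical names, not part of optoe.c.
 */
#define SFF_PAGE_SIZE 128	/* bytes per page, as in OPTOE_PAGE_SIZE */

struct sff_addr {
	int client;		/* 0 = i2c addr 0x50, 1 = i2c addr 0x51 */
	unsigned int page;	/* value for the page select register */
	unsigned int offset;	/* offset within the 256 byte i2c window */
};

/* two_addr: nonzero for SFP style devices, zero for QSFP style */
static struct sff_addr sff_translate(unsigned long linear, int two_addr)
{
	struct sff_addr a = { 0, 0, 0 };

	/* SFP style: everything past the first 256 bytes lives at 0x51 */
	if (two_addr && linear > 255) {
		a.client = 1;
		linear -= 256;
	}

	/* lower half is never paged, offset is used as is */
	if (linear < SFF_PAGE_SIZE) {
		a.offset = (unsigned int)linear;
		return a;
	}

	/* paged area: pick the page, then map into the upper half */
	a.page = (unsigned int)(linear / SFF_PAGE_SIZE) - 1;
	a.offset = SFF_PAGE_SIZE + (unsigned int)(linear % SFF_PAGE_SIZE);
	return a;
}
```

With this sketch, linear offset 256 on a QSFP style device lands on
page 1 at window offset 128, while the same linear offset on an SFP
style device lands in the unpaged lower half of 0x51.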

optoe has been adopted and is shipping to customers as a base module,
available to all platforms (switches) and used by multiple vendors and
platforms on both ONL (Open Network Linux) and SONiC (Microsoft's
'Software for Open Networking in the Cloud').

This patch has been built on the latest staging-testing kernel. It has
been built and tested with SFP and QSFP devices on an ARM platform with a
4.9 kernel, and an x86 switch with a 3.16 kernel. This patch should install
and build cleanly on any kernel from 3.16 up to the latest (as of 6/10/2018).
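
For anyone testing the patch, the 'eeprom' file can be read at an
arbitrary linear offset with pread(). The following is a rough sketch,
not code from the patch; the helper name eeprom_read_at is made up for
illustration, and the path passed to it is whatever sysfs node the
device was instantiated at (for example the 24-0050 path mentioned in
the commit message).

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to len bytes starting at linear offset
 * off from the given eeprom file. Returns the number of bytes read
 * (which may be short), or -1 if the file cannot be opened or read.
 */
static ssize_t eeprom_read_at(const char *path, off_t off,
			      unsigned char *buf, size_t len)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;

	/* pread() takes the offset directly, no separate lseek() needed */
	n = pread(fd, buf, len, off);
	close(fd);
	return n;
}
```

A caller should be prepared for a short count or an error, since the
driver trims requests that run past the pages the module supports and
returns an error when no module is present.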


Documentation/misc-devices/optoe.txt | 56 ++
drivers/misc/eeprom/Kconfig | 18 +
drivers/misc/eeprom/Makefile | 1 +
drivers/misc/eeprom/optoe.c | 1141 ++++++++++++++++++++++++++++++++++
4 files changed, 1216 insertions(+)
create mode 100644 Documentation/misc-devices/optoe.txt
create mode 100644 drivers/misc/eeprom/optoe.c

diff --git a/Documentation/misc-devices/optoe.txt b/Documentation/misc-devices/optoe.txt
new file mode 100644
index 000000000000..496134940147
--- /dev/null
+++ b/Documentation/misc-devices/optoe.txt
@@ -0,0 +1,56 @@
+optoe driver
+
+Author Don Bollinger (don@xxxxxxxxxxxxxxxxx)
+
+Optoe is an i2c based driver that supports read/write access to all
+the pages (tables) of MSA standard SFP and similar devices (conforming
+to the SFF-8472 spec) and MSA standard QSFP and similar devices
+(conforming to the SFF-8436 spec).
+
+i2c based optoelectronic transceivers (SFP, QSFP, etc) provide identification,
+operational status, and control registers via an EEPROM model. Unlike the
+EEPROMs that at24 supports, these devices access data beyond byte 256 via
+a page select register, which must be managed by the driver. See the driver
+code for a detailed explanation of how the linear address space provided
+by the driver maps to the paged address space provided by the devices.
+
+The EEPROM data is accessible via a bin_attribute file called 'eeprom',
+e.g. /sys/bus/i2c/devices/24-0050/eeprom
+
+This driver also reports the port number for each device, via a sysfs
+attribute: 'port_name'. This is a read/write attribute. It should be
+explicitly set as part of system initialization, ideally at the same time
+the device is instantiated. Write an appropriate port name (any string, up
+to 19 characters) to initialize. If not initialized explicitly, all ports
+will have the port_name of 'unitialized'. Alternatively, if the driver is
+called with platform_data, the port_name will be read from eeprom_data->label
+(if the EEPROM CLASS driver is configured) or from platform_data.port_name.
+
+This driver can be instantiated with 'new_device', per the convention
+described in Documentation/i2c/instantiating-devices. It wants one of
+two possible device identifiers. Use 'optoe1' to indicate this is a device
+with just one i2c address (all QSFP type devices). Use 'optoe2' to indicate
+this is a device with two i2c addresses (all SFP type devices).
+
+Example:
+# echo optoe1 0x50 > /sys/bus/i2c/devices/i2c-64/new_device
+# echo port54 > /sys/bus/i2c/devices/i2c-64/port_name
+
+This will add a QSFP type device to i2c bus i2c-64, and name it 'port54'
+
+Example:
+# echo optoe2 0x50 > /sys/bus/i2c/devices/i2c-11/new_device
+# echo port1 > /sys/bus/i2c/devices/i2c-11/port_name
+
+This will add an SFP type device to i2c bus i2c-11, and name it 'port1'
+
+The second parameter to new_device is an i2c address, and MUST be 0x50 for
+this driver to work properly. This is part of the spec for these devices.
+(It is not necessary to create a device at 0x51 for SFP type devices; the
+driver does that automatically.)
+
+Note that SFP type and QSFP type devices are not plug-compatible. The
+driver expects the correct ID for each port (each i2c device). It does
+not check because the port will often be empty, and the only way to check
+is to interrogate the device. An incorrect choice of ID will lead to correct
+data being reported for the first 256 bytes and incorrect data after that.
diff --git a/drivers/misc/eeprom/Kconfig b/drivers/misc/eeprom/Kconfig
index 68a1ac929917..9a08e12756ee 100644
--- a/drivers/misc/eeprom/Kconfig
+++ b/drivers/misc/eeprom/Kconfig
@@ -111,4 +111,22 @@ config EEPROM_IDT_89HPESX
This driver can also be built as a module. If so, the module
will be called idt_89hpesx.

+config EEPROM_OPTOE
+ tristate "read/write access to SFP* & QSFP* EEPROMs"
+ depends on I2C && SYSFS
+ help
+ If you say yes here you get support for read and write access to
+ the EEPROM of SFP and QSFP type optical and copper transceivers.
+ Includes all devices which conform to the SFF-8436 and SFF-8472
+ specs, including SFP, SFP+, SFP28, SFP-DWDM, QSFP, QSFP+, QSFP28
+ or later. These devices are usually found in network switches.
+
+ This driver only manages read/write access to the EEPROM; all
+ other features should be accessed via i2c-dev.
+
+ This driver can also be built as a module. If so, the module
+ will be called optoe.
+
+ If unsure, say N.
+
endmenu
diff --git a/drivers/misc/eeprom/Makefile b/drivers/misc/eeprom/Makefile
index 2aab60ef3e3e..00288d669017 100644
--- a/drivers/misc/eeprom/Makefile
+++ b/drivers/misc/eeprom/Makefile
@@ -7,3 +7,4 @@ obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
obj-$(CONFIG_EEPROM_93XX46) += eeprom_93xx46.o
obj-$(CONFIG_EEPROM_DIGSY_MTC_CFG) += digsy_mtc_eeprom.o
obj-$(CONFIG_EEPROM_IDT_89HPESX) += idt_89hpesx.o
+obj-$(CONFIG_EEPROM_OPTOE) += optoe.o
diff --git a/drivers/misc/eeprom/optoe.c b/drivers/misc/eeprom/optoe.c
new file mode 100644
index 000000000000..7cdf1a0a5299
--- /dev/null
+++ b/drivers/misc/eeprom/optoe.c
@@ -0,0 +1,1141 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * optoe.c - A driver to read and write the EEPROM on optical transceivers
+ * (SFP, QSFP and similar I2C based devices)
+ *
+ * Copyright (C) 2014 Cumulus networks Inc.
+ * Copyright (C) 2017 Finisar Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+/*
+ * Description:
+ * a) Optical transceiver EEPROM read/write transactions are just like
+ * the at24 eeproms managed by the at24.c i2c driver
+ * b) The register/memory layout is up to 256 128 byte pages defined by
+ * a "pages valid" register and switched via a "page select"
+ * register as explained in the diagram below.
+ * c) 256 bytes are mapped at a time. 'Lower page 00h' is the first 128
+ * bytes of address space, and always references the same
+ * location, independent of the page select register.
+ * All mapped pages are mapped into the upper 128 bytes
+ * (offset 128-255) of the i2c address.
+ * d) Devices with one I2C address (eg QSFP) use I2C address 0x50
+ * (A0h in the spec), and map all pages in the upper 128 bytes
+ * of that address.
+ * e) Devices with two I2C addresses (eg SFP) have 256 bytes of data
+ * at I2C address 0x50, and 256 bytes of data at I2C address
+ * 0x51 (A2h in the spec). Page selection and paged access
+ * only apply to this second I2C address (0x51).
+ * f) The address space is presented, by the driver, as a linear
+ * address space. For devices with one I2C client at address
+ * 0x50 (eg QSFP), offset 0-127 are in the lower
+ * half of address 50/A0h/client[0]. Offset 128-255 are in
+ * page 0, 256-383 are page 1, etc. More generally, offset
+ * 'n' resides in page (n/128)-1. ('page -1' is the lower
+ * half, offset 0-127).
+ * g) For devices with two I2C clients at address 0x50 and 0x51 (eg SFP),
+ * the address space places offset 0-127 in the lower
+ * half of 50/A0/client[0], offset 128-255 in the upper
+ * half. Offset 256-383 is in the lower half of 51/A2/client[1].
+ * Offset 384-511 is in page 0, in the upper half of 51/A2/...
+ * Offset 512-639 is in page 1, in the upper half of 51/A2/...
+ * Offset 'n' is in page (n/128)-3 (for n > 383)
+ *
+ * One I2C address (eg QSFP) Memory Map
+ *
+ * 2-Wire Serial Address: 1010000x
+ *
+ * Lower Page 00h (128 bytes)
+ * =====================
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * | |
+ * |Page Select Byte(127)|
+ * =====================
+ * |
+ * |
+ * |
+ * |
+ * V
+ * ------------------------------------------------------------
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * | | | |
+ * V V V V
+ * ------------ -------------- --------------- --------------
+ * | | | | | | | |
+ * | Upper | | Upper | | Upper | | Upper |
+ * | Page 00h | | Page 01h | | Page 02h | | Page 03h |
+ * | | | (Optional) | | (Optional) | | (Optional |
+ * | | | | | | | for Cable |
+ * | | | | | | | Assemblies) |
+ * | ID | | AST | | User | | |
+ * | Fields | | Table | | EEPROM Data | | |
+ * | | | | | | | |
+ * | | | | | | | |
+ * | | | | | | | |
+ * ------------ -------------- --------------- --------------
+ *
+ * The SFF 8436 (QSFP) spec only defines the 4 pages described above.
+ * In anticipation of future applications and devices, this driver
+ * supports access to the full architected range, 256 pages.
+ *
+ **/
+
+/* #define DEBUG 1 */
+
+#undef EEPROM_CLASS
+#ifdef CONFIG_EEPROM_CLASS
+#define EEPROM_CLASS
+#endif
+#ifdef CONFIG_EEPROM_CLASS_MODULE
+#define EEPROM_CLASS
+#endif
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/delay.h>
+#include <linux/mutex.h>
+#include <linux/sysfs.h>
+#include <linux/jiffies.h>
+#include <linux/i2c.h>
+
+#ifdef EEPROM_CLASS
+#include <linux/eeprom_class.h>
+#endif
+
+#include <linux/types.h>
+
+/* The maximum length of a port name */
+#define MAX_PORT_NAME_LEN 20
+
+struct optoe_platform_data {
+ u32 byte_len; /* size (sum of all addr) */
+ u16 page_size; /* for writes */
+ u8 flags;
+ void *dummy1; /* backward compatibility */
+ void *dummy2; /* backward compatibility */
+
+#ifdef EEPROM_CLASS
+ struct eeprom_platform_data *eeprom_data;
+#endif
+ char port_name[MAX_PORT_NAME_LEN];
+};
+
+/* fundamental unit of addressing for EEPROM */
+#define OPTOE_PAGE_SIZE 128
+/*
+ * Single address devices (eg QSFP) have 256 pages, plus the unpaged
+ * low 128 bytes. If the device does not support paging, it is
+ * only 2 'pages' long.
+ */
+#define OPTOE_ARCH_PAGES 256
+#define ONE_ADDR_EEPROM_SIZE ((1 + OPTOE_ARCH_PAGES) * OPTOE_PAGE_SIZE)
+#define ONE_ADDR_EEPROM_UNPAGED_SIZE (2 * OPTOE_PAGE_SIZE)
+/*
+ * Dual address devices (eg SFP) have 256 pages, plus the unpaged
+ * low 128 bytes, plus 256 bytes at 0x50. If the device does not
+ * support paging, it is 4 'pages' long.
+ */
+#define TWO_ADDR_EEPROM_SIZE ((3 + OPTOE_ARCH_PAGES) * OPTOE_PAGE_SIZE)
+#define TWO_ADDR_EEPROM_UNPAGED_SIZE (4 * OPTOE_PAGE_SIZE)
+#define TWO_ADDR_NO_0X51_SIZE (2 * OPTOE_PAGE_SIZE)
+
+/* a few constants to find our way around the EEPROM */
+#define OPTOE_PAGE_SELECT_REG 0x7F
+#define ONE_ADDR_PAGEABLE_REG 0x02
+#define ONE_ADDR_NOT_PAGEABLE BIT(2)
+#define TWO_ADDR_PAGEABLE_REG 0x40
+#define TWO_ADDR_PAGEABLE BIT(4)
+#define TWO_ADDR_0X51_REG 92
+#define TWO_ADDR_0X51_SUPP BIT(6)
+#define OPTOE_ID_REG 0
+#define OPTOE_READ_OP 0
+#define OPTOE_WRITE_OP 1
+#define OPTOE_EOF 0 /* used for access beyond end of device */
+
+struct optoe_data {
+ struct optoe_platform_data chip;
+ int use_smbus;
+ char port_name[MAX_PORT_NAME_LEN];
+
+ /*
+ * Lock protects against activities from other Linux tasks,
+ * but not from changes by other I2C masters.
+ */
+ struct mutex lock;
+ struct bin_attribute bin;
+ struct attribute_group attr_group;
+
+ u8 *writebuf;
+ unsigned int write_max;
+
+ unsigned int num_addresses;
+
+#ifdef EEPROM_CLASS
+ struct eeprom_device *eeprom_dev;
+#endif
+
+ /* dev_class: ONE_ADDR (QSFP) or TWO_ADDR (SFP) */
+ int dev_class;
+
+ struct i2c_client *client[2];
+};
+
+/*
+ * This parameter is to help this driver avoid blocking other drivers out
+ * of I2C for potentially troublesome amounts of time. With a 100 kHz I2C
+ * clock, one 256 byte read takes about 1/43 second which is excessive;
+ * but the 1/170 second it takes at 400 kHz may be quite reasonable; and
+ * at 1 MHz (Fm+) a 1/430 second delay could easily be invisible.
+ *
+ * This value is forced to be a power of two so that writes align on pages.
+ */
+static unsigned int io_limit = OPTOE_PAGE_SIZE;
+
+/*
+ * specs often allow 5 msec for a page write, sometimes 20 msec;
+ * it's important to recover from write timeouts.
+ */
+static unsigned int write_timeout = 25;
+
+/*
+ * flags to distinguish one-address (QSFP family) from two-address (SFP family)
+ * If the family is not known, figure it out when the device is accessed
+ */
+#define ONE_ADDR 1
+#define TWO_ADDR 2
+
+static const struct i2c_device_id optoe_ids[] = {
+ { "optoe1", ONE_ADDR },
+ { "optoe2", TWO_ADDR },
+ { "sff8436", ONE_ADDR },
+ { "24c04", TWO_ADDR },
+ { /* END OF LIST */ }
+};
+MODULE_DEVICE_TABLE(i2c, optoe_ids);
+
+/*-------------------------------------------------------------------------*/
+/*
+ * This routine computes the addressing information to be used for
+ * a given r/w request.
+ *
+ * Task is to calculate the client (0 = i2c addr 50, 1 = i2c addr 51),
+ * the page, and the offset.
+ *
+ * Handles both single address (eg QSFP) and two address (eg SFP).
+ * For SFP, offset 0-255 are on client[0], >255 is on client[1]
+ * Offset 256-383 are on the lower half of client[1]
+ * Pages are accessible on the upper half of client[1].
+ * Offset >383 are in 128 byte pages mapped into the upper half
+ *
+ * For QSFP, all offsets are on client[0]
+ * offset 0-127 are on the lower half of client[0] (no paging)
+ * Pages are accessible on the upper half of client[0].
+ * Offset >127 are in 128 byte pages mapped into the upper half
+ *
+ * Callers must not read/write beyond the end of a client or a page
+ * without recomputing the client/page. Hence offset (within page)
+ * plus length must be less than or equal to 128. (Note that this
+ * routine does not have access to the length of the call, hence
+ * cannot do the validity check.)
+ *
+ * Offset within Lower Page 00h and Upper Page 00h are not recomputed
+ */
+
+static uint8_t optoe_translate_offset(struct optoe_data *optoe,
+ loff_t *offset,
+ struct i2c_client **client)
+{
+ unsigned int page = 0;
+
+ *client = optoe->client[0];
+
+ /* if SFP style, offset > 255, shift to i2c addr 0x51 */
+ if (optoe->dev_class == TWO_ADDR) {
+ if (*offset > 255) {
+ /* like QSFP, but shifted to client[1] */
+ *client = optoe->client[1];
+ *offset -= 256;
+ }
+ }
+
+ /*
+ * if offset is in the range 0-127...
+ * page doesn't matter (using lower half), return 0.
+ * offset is already correct (don't add 128 to get to paged area)
+ */
+ if (*offset < OPTOE_PAGE_SIZE)
+ return page;
+
+ /* note, page will never be negative since *offset >= 128 */
+ page = (*offset >> 7) - 1;
+ /* 0x80 places the offset in the top half, offset is last 7 bits */
+ *offset = OPTOE_PAGE_SIZE + (*offset & 0x7f);
+
+ return page; /* note also returning client and offset */
+}
+
+static ssize_t optoe_eeprom_read(struct optoe_data *optoe,
+ struct i2c_client *client,
+ char *buf, unsigned int offset, size_t count)
+{
+ struct i2c_msg msg[2];
+ u8 msgbuf[2];
+ unsigned long timeout, read_time;
+ int status, i;
+
+ memset(msg, 0, sizeof(msg));
+
+ switch (optoe->use_smbus) {
+ case I2C_SMBUS_I2C_BLOCK_DATA:
+ /* smaller eeproms can work given some SMBus extension calls */
+ if (count > I2C_SMBUS_BLOCK_MAX)
+ count = I2C_SMBUS_BLOCK_MAX;
+ break;
+ case I2C_SMBUS_WORD_DATA:
+ /* Check for odd length transaction */
+ count = (count == 1) ? 1 : 2;
+ break;
+ case I2C_SMBUS_BYTE_DATA:
+ count = 1;
+ break;
+ default:
+ /*
+ * When we have a better choice than SMBus calls, use a
+ * combined I2C message. Write address; then read up to
+ * io_limit data bytes. msgbuf is u8 and will cast to our
+ * needs.
+ */
+ i = 0;
+ msgbuf[i++] = offset;
+
+ msg[0].addr = client->addr;
+ msg[0].buf = msgbuf;
+ msg[0].len = i;
+
+ msg[1].addr = client->addr;
+ msg[1].flags = I2C_M_RD;
+ msg[1].buf = buf;
+ msg[1].len = count;
+ }
+
+ /*
+ * Reads fail if the previous write didn't complete yet. We may
+ * loop a few times until this one succeeds, waiting at least
+ * long enough for one entire page write to work.
+ */
+ timeout = jiffies + msecs_to_jiffies(write_timeout);
+ do {
+ read_time = jiffies;
+
+ switch (optoe->use_smbus) {
+ case I2C_SMBUS_I2C_BLOCK_DATA:
+ status = i2c_smbus_read_i2c_block_data(client, offset,
+ count, buf);
+ break;
+ case I2C_SMBUS_WORD_DATA:
+ status = i2c_smbus_read_word_data(client, offset);
+ if (status >= 0) {
+ buf[0] = status & 0xff;
+ if (count == 2)
+ buf[1] = status >> 8;
+ status = count;
+ }
+ break;
+ case I2C_SMBUS_BYTE_DATA:
+ status = i2c_smbus_read_byte_data(client, offset);
+ if (status >= 0) {
+ buf[0] = status;
+ status = count;
+ }
+ break;
+ default:
+ status = i2c_transfer(client->adapter, msg, 2);
+ if (status == 2)
+ status = count;
+ }
+
+ dev_dbg(&client->dev, "eeprom read %zu@%u --> %d (%lu)\n",
+ count, offset, status, jiffies);
+
+ if (status == count) /* happy path */
+ return count;
+
+ if (status == -ENXIO) /* no module present */
+ return status;
+
+ /* REVISIT: at HZ=100, this is sloooow */
+ usleep_range(1000, 2000);
+ } while (time_before(read_time, timeout));
+
+ return -ETIMEDOUT;
+}
+
+static ssize_t optoe_eeprom_write(struct optoe_data *optoe,
+ struct i2c_client *client,
+ const char *buf,
+ unsigned int offset, size_t count)
+{
+ struct i2c_msg msg;
+ ssize_t status;
+ unsigned long timeout, write_time;
+ unsigned int next_page_start;
+ int i = 0;
+ u16 writeword;
+
+ /* write max is at most a page
+ * (In this driver, write_max is actually one byte!)
+ */
+ if (count > optoe->write_max)
+ count = optoe->write_max;
+
+ /* shorten count if necessary to avoid crossing page boundary */
+ next_page_start = roundup(offset + 1, OPTOE_PAGE_SIZE);
+ if (offset + count > next_page_start)
+ count = next_page_start - offset;
+
+ switch (optoe->use_smbus) {
+ case I2C_SMBUS_I2C_BLOCK_DATA:
+ /* smaller eeproms can work given some SMBus extension calls */
+ if (count > I2C_SMBUS_BLOCK_MAX)
+ count = I2C_SMBUS_BLOCK_MAX;
+ break;
+ case I2C_SMBUS_WORD_DATA:
+ /* Check for odd length transaction */
+ count = (count == 1) ? 1 : 2;
+ break;
+ case I2C_SMBUS_BYTE_DATA:
+ count = 1;
+ break;
+ default:
+ /* If we'll use I2C calls for I/O, set up the message */
+ msg.addr = client->addr;
+ msg.flags = 0;
+
+ /* msg.buf is u8 and casts will mask the values */
+ msg.buf = optoe->writebuf;
+
+ msg.buf[i++] = offset;
+ memcpy(&msg.buf[i], buf, count);
+ msg.len = i + count;
+ break;
+ }
+
+ /*
+ * Writes fail if the previous write didn't complete yet. We may
+ * loop a few times until this one succeeds, waiting at least
+ * long enough for one entire page write to work.
+ */
+ timeout = jiffies + msecs_to_jiffies(write_timeout);
+ do {
+ write_time = jiffies;
+
+ switch (optoe->use_smbus) {
+ case I2C_SMBUS_I2C_BLOCK_DATA:
+ status = i2c_smbus_write_i2c_block_data(client,
+ offset,
+ count,
+ buf);
+ if (status == 0)
+ status = count;
+ break;
+ case I2C_SMBUS_WORD_DATA:
+ if (count == 2) {
+ writeword = (buf[1] << 8) | buf[0];
+ status = i2c_smbus_write_word_data(client,
+ offset,
+ writeword);
+ } else {
+ /* count = 1 */
+ status = i2c_smbus_write_byte_data(client,
+ offset,
+ buf[0]);
+ }
+ if (status == 0)
+ status = count;
+ break;
+ case I2C_SMBUS_BYTE_DATA:
+ status = i2c_smbus_write_byte_data(client, offset,
+ buf[0]);
+ if (status == 0)
+ status = count;
+ break;
+ default:
+ status = i2c_transfer(client->adapter, &msg, 1);
+ if (status == 1)
+ status = count;
+ break;
+ }
+
+ dev_dbg(&client->dev, "eeprom write %zu@%u --> %ld (%lu)\n",
+ count, offset, (long int)status, jiffies);
+
+ if (status == count)
+ return count;
+
+ /* REVISIT: at HZ=100, this is sloooow */
+ usleep_range(1000, 2000);
+ } while (time_before(write_time, timeout));
+
+ return -ETIMEDOUT;
+}
+
+static ssize_t optoe_eeprom_update_client(struct optoe_data *optoe,
+ char *buf, loff_t off,
+ size_t count, int opcode)
+{
+ struct i2c_client *client;
+ ssize_t retval = 0;
+ u8 page = 0;
+ loff_t phy_offset = off;
+ int ret = 0;
+
+ page = optoe_translate_offset(optoe, &phy_offset, &client);
+ dev_dbg(&client->dev,
+ "%s off %lld page:%d phy_offset:%lld, count:%ld, opcode:%d\n",
+ __func__, off, page, phy_offset, (long int)count, opcode);
+ if (page > 0) {
+ ret = optoe_eeprom_write(optoe, client, &page,
+ OPTOE_PAGE_SELECT_REG, 1);
+ if (ret < 0) {
+ dev_dbg(&client->dev,
+ "Write page register for page %d failed ret:%d!\n",
+ page, ret);
+ return ret;
+ }
+ }
+
+ while (count) {
+ ssize_t status;
+
+ if (opcode == OPTOE_READ_OP) {
+ status = optoe_eeprom_read(optoe, client, buf,
+ phy_offset, count);
+ } else {
+ status = optoe_eeprom_write(optoe, client, buf,
+ phy_offset, count);
+ }
+ if (status <= 0) {
+ if (retval == 0)
+ retval = status;
+ break;
+ }
+ buf += status;
+ phy_offset += status;
+ count -= status;
+ retval += status;
+ }
+
+ if (page > 0) {
+ /* return the page register to page 0 (why?) */
+ page = 0;
+ ret = optoe_eeprom_write(optoe, client, &page,
+ OPTOE_PAGE_SELECT_REG, 1);
+ if (ret < 0) {
+ dev_err(&client->dev,
+ "Restore page register to 0 failed:%d!\n", ret);
+ /* error only if nothing has been transferred */
+ if (retval == 0)
+ retval = ret;
+ }
+ }
+ return retval;
+}
+
+/*
+ * Figure out if this access is within the range of supported pages.
+ * Note this is called on every access because we don't know if the
+ * module has been replaced since the last call.
+ * If/when modules support more pages, this is the routine to update
+ * to validate and allow access to additional pages.
+ *
+ * Returns updated len for this access:
+ * - entire access is legal, original len is returned.
+ * - access begins legal but is too long, len is truncated to fit.
+ * - initial offset exceeds supported pages, return OPTOE_EOF (zero)
+ */
+static ssize_t optoe_page_legal(struct optoe_data *optoe,
+ loff_t off, size_t len)
+{
+ struct i2c_client *client = optoe->client[0];
+ u8 regval;
+ int status;
+ size_t maxlen;
+
+ if (off < 0)
+ return -EINVAL;
+ if (optoe->dev_class == TWO_ADDR) {
+ /* SFP case */
+ /* if only using addr 0x50 (first 256 bytes) we're good */
+ if ((off + len) <= TWO_ADDR_NO_0X51_SIZE)
+ return len;
+ /* if offset exceeds possible pages, we're not good */
+ if (off >= TWO_ADDR_EEPROM_SIZE)
+ return OPTOE_EOF;
+ /* in between, are pages supported? */
+ status = optoe_eeprom_read(optoe, client, &regval,
+ TWO_ADDR_PAGEABLE_REG, 1);
+ if (status < 0)
+ return status; /* error out (no module?) */
+ if (regval & TWO_ADDR_PAGEABLE) {
+ /* Pages supported, trim len to the end of pages */
+ maxlen = TWO_ADDR_EEPROM_SIZE - off;
+ } else {
+ /* pages not supported, trim len to unpaged size */
+ if (off >= TWO_ADDR_EEPROM_UNPAGED_SIZE)
+ return OPTOE_EOF;
+
+ /* will be accessing addr 0x51, is that supported? */
+ /* byte 92, bit 6 implies DDM support, 0x51 support */
+ status = optoe_eeprom_read(optoe, client, &regval,
+ TWO_ADDR_0X51_REG, 1);
+ if (status < 0)
+ return status;
+ if (regval & TWO_ADDR_0X51_SUPP) {
+ /* addr 0x51 is OK */
+ maxlen = TWO_ADDR_EEPROM_UNPAGED_SIZE - off;
+ } else {
+ /* addr 0x51 NOT supported, trim to 256 max */
+ if (off >= TWO_ADDR_NO_0X51_SIZE)
+ return OPTOE_EOF;
+ maxlen = TWO_ADDR_NO_0X51_SIZE - off;
+ }
+ }
+ len = (len > maxlen) ? maxlen : len;
+ dev_dbg(&client->dev,
+ "page_legal, SFP, off %lld len %ld\n",
+ off, (long int)len);
+ } else {
+ /* QSFP case */
+ /* if no pages needed, we're good */
+ if ((off + len) <= ONE_ADDR_EEPROM_UNPAGED_SIZE)
+ return len;
+ /* if offset exceeds possible pages, we're not good */
+ if (off >= ONE_ADDR_EEPROM_SIZE)
+ return OPTOE_EOF;
+ /* in between, are pages supported? */
+ status = optoe_eeprom_read(optoe, client, &regval,
+ ONE_ADDR_PAGEABLE_REG, 1);
+ if (status < 0)
+ return status; /* error out (no module?) */
+ if (regval & ONE_ADDR_NOT_PAGEABLE) {
+ /* pages not supported, trim len to unpaged size */
+ if (off >= ONE_ADDR_EEPROM_UNPAGED_SIZE)
+ return OPTOE_EOF;
+ maxlen = ONE_ADDR_EEPROM_UNPAGED_SIZE - off;
+ } else {
+ /* Pages supported, trim len to the end of pages */
+ maxlen = ONE_ADDR_EEPROM_SIZE - off;
+ }
+ len = (len > maxlen) ? maxlen : len;
+ dev_dbg(&client->dev,
+ "page_legal, QSFP, off %lld len %ld\n",
+ off, (long int)len);
+ }
+ return len;
+}
+
+static ssize_t optoe_read_write(struct optoe_data *optoe,
+ char *buf, loff_t off,
+ size_t len, int opcode)
+{
+ struct i2c_client *client = optoe->client[0];
+ int chunk;
+ int status = 0;
+ ssize_t retval;
+ size_t pending_len = 0, chunk_len = 0;
+ loff_t chunk_offset = 0, chunk_start_offset = 0;
+
+ dev_dbg(&client->dev,
+ "%s: off %lld len:%ld, opcode:%s\n",
+ __func__, off, (long int)len,
+ (opcode == OPTOE_READ_OP) ? "r" : "w");
+ if (unlikely(!len))
+ return len;
+
+ /*
+ * Read data from chip, protecting against concurrent updates
+ * from this host, but not from other I2C masters.
+ */
+ mutex_lock(&optoe->lock);
+
+ /*
+ * Confirm this access fits within the device supported addr range
+ */
+ status = optoe_page_legal(optoe, off, len);
+ if (status == OPTOE_EOF || status < 0) {
+ mutex_unlock(&optoe->lock);
+ return status;
+ }
+ len = status;
+
+ /*
+ * For each (128 byte) chunk involved in this request, issue a
+ * separate call to optoe_eeprom_update_client(), to
+ * ensure that each access recalculates the client/page
+ * and writes the page register as needed.
+ * Note that chunk to page mapping is confusing, is different for
+ * QSFP and SFP, and never needs to be done. Don't try!
+ */
+ pending_len = len; /* amount remaining to transfer */
+ retval = 0; /* amount transferred */
+ for (chunk = off >> 7; chunk <= (off + len - 1) >> 7; chunk++) {
+ /*
+ * Compute the offset and number of bytes to be read/write
+ *
+ * 1. start at offset 0 (within the chunk), and read/write
+ * the entire chunk
+ * 2. start at offset 0 (within the chunk) and read/write less
+ * than entire chunk
+ * 3. start at an offset not equal to 0 and read/write the rest
+ * of the chunk
+ * 4. start at an offset not equal to 0 and read/write less than
+ * (end of chunk - offset)
+ */
+ chunk_start_offset = chunk * OPTOE_PAGE_SIZE;
+
+ if (chunk_start_offset < off) {
+ chunk_offset = off;
+ if ((off + pending_len) < (chunk_start_offset +
+ OPTOE_PAGE_SIZE))
+ chunk_len = pending_len;
+ else
+ chunk_len = OPTOE_PAGE_SIZE - (off & 0x7f);
+ } else {
+ chunk_offset = chunk_start_offset;
+ if (pending_len > OPTOE_PAGE_SIZE)
+ chunk_len = OPTOE_PAGE_SIZE;
+ else
+ chunk_len = pending_len;
+ }
+
+ dev_dbg(&client->dev,
+ "sff_r/w: off %lld, len %ld, chunk_start_offset %lld, chunk_offset %lld, chunk_len %ld, pending_len %ld\n",
+ off, (long int)len, chunk_start_offset, chunk_offset,
+ (long int)chunk_len, (long int)pending_len);
+
+ /*
+ * note: chunk_offset is from the start of the EEPROM,
+ * not the start of the chunk
+ */
+ status = optoe_eeprom_update_client(optoe, buf, chunk_offset,
+ chunk_len, opcode);
+ if (status != chunk_len) {
+ /* This is another 'no device present' path */
+ dev_dbg(&client->dev,
+ "o_u_c: chunk %d c_offset %lld c_len %ld failed %d!\n",
+ chunk, chunk_offset, (long int)chunk_len, status);
+ if (status > 0)
+ retval += status;
+ if (retval == 0)
+ retval = status;
+ break;
+ }
+ buf += status;
+ pending_len -= status;
+ retval += status;
+ }
+ mutex_unlock(&optoe->lock);
+
+ return retval;
+}
+
+static ssize_t optoe_bin_read(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *attr,
+ char *buf, loff_t off, size_t count)
+{
+ struct i2c_client *client = to_i2c_client(container_of(kobj,
+ struct device, kobj));
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+
+ return optoe_read_write(optoe, buf, off, count, OPTOE_READ_OP);
+}
+
+static ssize_t optoe_bin_write(struct file *filp, struct kobject *kobj,
+ struct bin_attribute *attr,
+ char *buf, loff_t off, size_t count)
+{
+ struct i2c_client *client = to_i2c_client(container_of(kobj,
+ struct device, kobj));
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+
+ return optoe_read_write(optoe, buf, off, count, OPTOE_WRITE_OP);
+}
+
+static int optoe_remove(struct i2c_client *client)
+{
+ struct optoe_data *optoe;
+
+ optoe = i2c_get_clientdata(client);
+ sysfs_remove_group(&client->dev.kobj, &optoe->attr_group);
+ sysfs_remove_bin_file(&client->dev.kobj, &optoe->bin);
+
+ if (optoe->num_addresses == 2)
+ i2c_unregister_device(optoe->client[1]);
+
+#ifdef EEPROM_CLASS
+ eeprom_device_unregister(optoe->eeprom_dev);
+#endif
+
+ kfree(optoe->writebuf);
+ kfree(optoe);
+ return 0;
+}
+
+static ssize_t dev_class_show(struct device *dev,
+ struct device_attribute *dattr, char *buf)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+ ssize_t count;
+
+ mutex_lock(&optoe->lock);
+ count = sprintf(buf, "%d\n", optoe->dev_class);
+ mutex_unlock(&optoe->lock);
+
+ return count;
+}
+
+static ssize_t dev_class_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+ int dev_class;
+
+ /*
+ * dev_class is actually the number of i2c addresses used, thus
+ * legal values are "1" (QSFP class) and "2" (SFP class)
+ */
+
+ if (kstrtoint(buf, 0, &dev_class) != 0 ||
+ dev_class < 1 || dev_class > 2)
+ return -EINVAL;
+
+ mutex_lock(&optoe->lock);
+ optoe->dev_class = dev_class;
+ mutex_unlock(&optoe->lock);
+
+ return count;
+}
+
+/*
+ * if using the EEPROM CLASS driver, we don't report a port_name,
+ * the EEPROM CLASS driver handles that. Hence all this code is
+ * only compiled if we are NOT using the EEPROM CLASS driver.
+ */
+#ifndef EEPROM_CLASS
+
+static ssize_t port_name_show(struct device *dev,
+ struct device_attribute *dattr, char *buf)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+ ssize_t count;
+
+ mutex_lock(&optoe->lock);
+ count = sprintf(buf, "%s\n", optoe->port_name);
+ mutex_unlock(&optoe->lock);
+
+ return count;
+}
+
+static ssize_t port_name_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct i2c_client *client = to_i2c_client(dev);
+ struct optoe_data *optoe = i2c_get_clientdata(client);
+ char port_name[MAX_PORT_NAME_LEN];
+
+ /* no checking, this value is not used except by port_name_show */
+
+ if (sscanf(buf, "%19s", port_name) != 1)
+ return -EINVAL;
+
+ mutex_lock(&optoe->lock);
+ strcpy(optoe->port_name, port_name);
+ mutex_unlock(&optoe->lock);
+
+ return count;
+}
+
+static DEVICE_ATTR_RW(port_name);
+#endif /* if NOT defined EEPROM_CLASS, the common case */
+
+static DEVICE_ATTR_RW(dev_class);
+
+static struct attribute *optoe_attrs[] = {
+#ifndef EEPROM_CLASS
+ &dev_attr_port_name.attr,
+#endif
+ &dev_attr_dev_class.attr,
+ NULL,
+};
+
+static struct attribute_group optoe_attr_group = {
+ .attrs = optoe_attrs,
+};
+
+static int optoe_probe(struct i2c_client *client,
+ const struct i2c_device_id *id)
+{
+ int err;
+ int use_smbus = 0;
+ struct optoe_platform_data chip;
+ struct optoe_data *optoe;
+ int num_addresses;
+ char port_name[MAX_PORT_NAME_LEN];
+
+ if (client->addr != 0x50) {
+ dev_dbg(&client->dev, "probe, bad i2c addr: 0x%x\n",
+ client->addr);
+ err = -EINVAL;
+ goto exit;
+ }
+
+ if (client->dev.platform_data) {
+ chip = *(struct optoe_platform_data *)client->dev.platform_data;
+ /* take the port name from the supplied platform data */
+#ifdef EEPROM_CLASS
+ strlcpy(port_name, chip.eeprom_data->label, MAX_PORT_NAME_LEN);
+#else
+ memcpy(port_name, chip.port_name, MAX_PORT_NAME_LEN);
+#endif
+ dev_dbg(&client->dev,
+ "probe, chip provided, flags:0x%x; name: %s\n",
+ chip.flags, client->name);
+ } else {
+ if (!id->driver_data) {
+ err = -ENODEV;
+ goto exit;
+ }
+ dev_dbg(&client->dev, "probe, building chip\n");
+ strcpy(port_name, "uninitialized");
+ chip.flags = 0;
+#ifdef EEPROM_CLASS
+ chip.eeprom_data = NULL;
+#endif
+ }
+
+ /* Use I2C operations unless we're stuck with SMBus extensions. */
+ if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
+ if (i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_READ_I2C_BLOCK)) {
+ use_smbus = I2C_SMBUS_I2C_BLOCK_DATA;
+ } else if (i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_READ_WORD_DATA)) {
+ use_smbus = I2C_SMBUS_WORD_DATA;
+ } else if (i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_READ_BYTE_DATA)) {
+ use_smbus = I2C_SMBUS_BYTE_DATA;
+ } else {
+ err = -EPFNOSUPPORT;
+ goto exit;
+ }
+ }
+
+ optoe = kzalloc(sizeof(*optoe), GFP_KERNEL);
+ if (!optoe) {
+ err = -ENOMEM;
+ goto exit;
+ }
+
+ mutex_init(&optoe->lock);
+
+ /* determine whether this is a one-address or two-address module */
+ if ((strcmp(client->name, "optoe1") == 0) ||
+ (strcmp(client->name, "sff8436") == 0)) {
+ /* one-address (e.g. QSFP) family */
+ optoe->dev_class = ONE_ADDR;
+ chip.byte_len = ONE_ADDR_EEPROM_SIZE;
+ num_addresses = 1;
+ } else if ((strcmp(client->name, "optoe2") == 0) ||
+ (strcmp(client->name, "24c04") == 0)) {
+ /* SFP family */
+ optoe->dev_class = TWO_ADDR;
+ chip.byte_len = TWO_ADDR_EEPROM_SIZE;
+ num_addresses = 2;
+ } else { /* those were the only two choices */
+ err = -EINVAL;
+ goto exit_kfree;
+ }
+
+ dev_dbg(&client->dev, "dev_class: %d\n", optoe->dev_class);
+ optoe->use_smbus = use_smbus;
+ optoe->chip = chip;
+ optoe->num_addresses = num_addresses;
+ memcpy(optoe->port_name, port_name, MAX_PORT_NAME_LEN);
+
+ /*
+ * Export the EEPROM bytes through sysfs, since that's convenient.
+ * The file is world-readable; owner write access is added below.
+ */
+ sysfs_bin_attr_init(&optoe->bin);
+ optoe->bin.attr.name = "eeprom";
+ optoe->bin.attr.mode = 0444;
+ optoe->bin.read = optoe_bin_read;
+ optoe->bin.size = chip.byte_len;
+
+ if (!use_smbus ||
+ i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_WRITE_I2C_BLOCK) ||
+ i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_WRITE_WORD_DATA) ||
+ i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_WRITE_BYTE_DATA)) {
+ /*
+ * NOTE: AN-2079
+ * Finisar recommends that the host implement 1 byte writes
+ * only since this module only supports 32 byte page boundaries.
+ * 2 byte writes are acceptable for PE and Vout changes per
+ * Application Note AN-2071.
+ */
+ unsigned int write_max = 1;
+
+ optoe->bin.write = optoe_bin_write;
+ optoe->bin.attr.mode |= 0200;
+
+ if (write_max > io_limit)
+ write_max = io_limit;
+ if (use_smbus && write_max > I2C_SMBUS_BLOCK_MAX)
+ write_max = I2C_SMBUS_BLOCK_MAX;
+ optoe->write_max = write_max;
+
+ /* buffer (data + address at the beginning) */
+ optoe->writebuf = kmalloc(write_max + 2, GFP_KERNEL);
+ if (!optoe->writebuf) {
+ err = -ENOMEM;
+ goto exit_kfree;
+ }
+ } else {
+ dev_warn(&client->dev,
+ "cannot write due to controller restrictions.\n");
+ }
+
+ optoe->client[0] = client;
+
+ /* SFF-8472 spec requires that the second I2C address be 0x51 */
+ if (num_addresses == 2) {
+ optoe->client[1] = i2c_new_dummy(client->adapter, 0x51);
+ if (!optoe->client[1]) {
+ dev_err(&client->dev, "address 0x51 unavailable\n");
+ err = -EADDRINUSE;
+ goto err_struct;
+ }
+ }
+
+ /* create the sysfs eeprom file */
+ err = sysfs_create_bin_file(&client->dev.kobj, &optoe->bin);
+ if (err)
+ goto err_struct;
+
+ optoe->attr_group = optoe_attr_group;
+
+ err = sysfs_create_group(&client->dev.kobj, &optoe->attr_group);
+ if (err) {
+ dev_err(&client->dev, "failed to create sysfs attribute group.\n");
+ sysfs_remove_bin_file(&client->dev.kobj, &optoe->bin);
+ goto err_struct;
+ }
+
+#ifdef EEPROM_CLASS
+ optoe->eeprom_dev = eeprom_device_register(&client->dev,
+ chip.eeprom_data);
+ if (IS_ERR(optoe->eeprom_dev)) {
+ dev_err(&client->dev, "error registering eeprom device.\n");
+ err = PTR_ERR(optoe->eeprom_dev);
+ goto err_sysfs_cleanup;
+ }
+#endif
+
+ i2c_set_clientdata(client, optoe);
+
+ dev_info(&client->dev, "%zu byte %s EEPROM, %s\n",
+ optoe->bin.size, client->name,
+ optoe->bin.write ? "read/write" : "read-only");
+
+ if (use_smbus == I2C_SMBUS_WORD_DATA ||
+ use_smbus == I2C_SMBUS_BYTE_DATA) {
+ dev_notice(&client->dev,
+ "Falling back to %s reads, performance will suffer\n",
+ use_smbus == I2C_SMBUS_WORD_DATA ? "word" : "byte");
+ }
+
+ return 0;
+
+#ifdef EEPROM_CLASS
+err_sysfs_cleanup:
+ sysfs_remove_group(&client->dev.kobj, &optoe->attr_group);
+ sysfs_remove_bin_file(&client->dev.kobj, &optoe->bin);
+#endif
+
+err_struct:
+ if (num_addresses == 2) {
+ if (optoe->client[1])
+ i2c_unregister_device(optoe->client[1]);
+ }
+
+ kfree(optoe->writebuf);
+exit_kfree:
+ kfree(optoe);
+exit:
+ dev_dbg(&client->dev, "probe error %d\n", err);
+
+ return err;
+}
+
+/*-------------------------------------------------------------------------*/
+
+static struct i2c_driver optoe_driver = {
+ .driver = {
+ .name = "optoe",
+ .owner = THIS_MODULE,
+ },
+ .probe = optoe_probe,
+ .remove = optoe_remove,
+ .id_table = optoe_ids,
+};
+
+static int __init optoe_init(void)
+{
+ if (!io_limit) {
+ pr_err("optoe: io_limit must not be 0!\n");
+ return -EINVAL;
+ }
+
+ io_limit = rounddown_pow_of_two(io_limit);
+ return i2c_add_driver(&optoe_driver);
+}
+module_init(optoe_init);
+
+static void __exit optoe_exit(void)
+{
+ i2c_del_driver(&optoe_driver);
+}
+module_exit(optoe_exit);
+
+MODULE_DESCRIPTION("Driver for optical transceiver (SFP, QSFP, ...) EEPROMs");
+MODULE_AUTHOR("DON BOLLINGER <don@xxxxxxxxxxxxxxxxx>");
+MODULE_LICENSE("GPL");
--
2.11.0