Re: [PATCHv5] staging: vme_user: provide DMA functionality

From: Martyn Welch
Date: Sun Oct 18 2015 - 18:04:41 EST




On 18/10/15 18:53, Dmitry Kalinkin wrote:
On Sun, Oct 18, 2015 at 10:31 AM, Martyn Welch <martyn@xxxxxxxxxxxx> wrote:

On 11/10/15 01:13, Dmitry Kalinkin wrote:
This introduces a new dma device that exposes a single ioctl call providing
DMA read and write functionality to user space.

Signed-off-by: Dmitry Kalinkin <dmitry.kalinkin@xxxxxxxxx>
Cc: Igor Alekseev <igor.alekseev@xxxxxxx>
---

In the last reply Martyn suggested reworking this to use the existing
bus/vme/ctl instead of creating a new bus/vme/dma%i device, and also to
dynamically allocate a DMA resource for each request.

I think this doesn't need those adjustments. Dynamic allocation doesn't
solve any practical problem that isn't caused by the current kernel API.

That would depend on what resources had already been allocated to other
drivers using the kernel API and what resources the underlying bridge had to
make available. This driver will currently only load if all the resources it
wishes to expose to user space are available. That said, such a modification
is clearly separate from adding DMA support to user space, so the argument
is rather academic.
Other drivers, meaning vme_pio? I don't see any others. All this time
we have been discussing how many GE PIO boards one can plug into a crate
with or without vme_user. Most people have zero of them.
Also, the VME DMA API has no in-kernel users; we are just adding one now.

Unfortunately, not all users of Linux upstream or even publicise their drivers. This is especially true of some industries where VME gets used. The number of VME windows is limited, so having a user space shim hog resources, or limit the number available to either kernel space or user space, is not an optimal solution.
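
For illustration only, the per-request approach I had in mind would look
roughly like the sketch below: the DMA channel is claimed around each
transfer rather than being held from probe() onwards, so in-kernel users
can still get at it in between. The vme_user_dma_transfer() helper is
hypothetical; treat this as a sketch, not tested code.

/* Sketch: allocate the DMA channel per request instead of at probe time. */
static ssize_t vme_user_dma_ioctl(unsigned int minor,
				  const struct vme_dma_op *dma_op)
{
	struct vme_resource *dma_res;
	ssize_t ret;

	dma_res = vme_dma_request(vme_user_bridge,
				  VME_DMA_VME_TO_MEM | VME_DMA_MEM_TO_VME);
	if (!dma_res)
		return -EBUSY;	/* no DMA channel free right now */

	/* hypothetical helper doing the pin/map/list/exec work */
	ret = vme_user_dma_transfer(dma_res, dma_op);

	vme_dma_free(dma_res);
	return ret;
}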

I also think that a separate device is a good feature because it allows
for discovery of DMA capability from userspace.

Given the current user space API, that's true.

An interface with a separate chardev also makes it possible to provide DMA
read() and write() syscalls, which can come in handy paired with /bin/dd.

But this patch doesn't implement such a feature, does it?
Actually, the initial (never published) version of this patch exposed
read(), write(), and an ioctl to set the access cycle. It worked, but with a
subtlety for A64 addressing: I ran into problems with large offsets that
would not fit in a signed long long. I was using FMODE_UNSIGNED_OFFSET
to fix the kernel side of things, but it seemed like userspace didn't like
the "negative" offsets. I looked a bit at the glibc sources and decided
to give up.
Now that I remember this, my original argument is kind of busted.
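
For what it's worth, the kernel side amounted to little more than the
sketch below (from memory, not the actual code; the function name is just
illustrative). It was the userspace handling of the "negative" A64 offsets
that I gave up on.

static int vme_user_dma_open(struct inode *inode, struct file *file)
{
	/*
	 * Allow file offsets above LLONG_MAX so lseek()/pread() can
	 * address the full 64-bit A64 range. Offsets in the upper half
	 * still show up in userspace as negative off_t values, which is
	 * where glibc and dd got unhappy.
	 */
	file->f_mode |= FMODE_UNSIGNED_OFFSET;
	return 0;
}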
(Generally happy with this for now, however a couple of comments below.)


v5:
Added validation of the dma_op argument in vme_user_sg_to_dma_list(). It is
already checked in the caller, but we would like to silence a warning:

drivers/staging/vme/devices/vme_user.c: In function 'vme_user_ioctl.isra.4':
drivers/staging/vme/devices/vme_user.c:324:7: warning: 'dest' may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = vme_dma_list_add(dma_list, src, dest, hw_len);
  ^
drivers/staging/vme/devices/vme_user.c:296:52: note: 'dest' was declared here
  struct vme_dma_attr *pci_attr, *vme_attr, *src, *dest;
  ^
drivers/staging/vme/devices/vme_user.c:324:7: warning: 'src' may be used uninitialized in this function [-Wmaybe-uninitialized]
  ret = vme_dma_list_add(dma_list, src, dest, hw_len);
  ^
drivers/staging/vme/devices/vme_user.c:296:46: note: 'src' was declared here
  struct vme_dma_attr *pci_attr, *vme_attr, *src, *dest;

---
drivers/staging/vme/devices/vme_user.c | 205 ++++++++++++++++++++++++++++++++-
drivers/staging/vme/devices/vme_user.h | 11 ++
2 files changed, 213 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/vme/devices/vme_user.c b/drivers/staging/vme/devices/vme_user.c
index 8e61a3b..2434e5f 100644
--- a/drivers/staging/vme/devices/vme_user.c
+++ b/drivers/staging/vme/devices/vme_user.c
@@ -79,15 +79,18 @@ static unsigned int bus_num;
* We shall support 4 masters and 4 slaves with this driver.
*/

The comment just above here (cropped in the patch) describes the interface
that this driver exposes and what is documented in
Documentation/devices.txt.
I came across it a long time ago, and at the time I realized that this
document is generally outdated and is not required to be kept updated.
First, "Last revised: 6th April 2009".
Second, the device path information is long obsolete in the light of udev.
Third, they want submissions on a separate list <device@xxxxxxxxxx>.
Fourth, "20 block Hitachi CD-ROM (under development) 0 = /dev/hitcd"
-- this is not for real.

Yup, I agree, devices.txt is probably well out of date and is a product of long since deprecated practices.

That said, the comment in vme_user.c is at odds with what the code implements after this patch is applied, and that is something we very much have control over.

I think this comment either needs updating to reflect the changes introduced
in this patch, or deleting.

(As an aside:

The interface in Documentation/devices.txt is an interesting oddity - it
existed before any VME drivers were present in the kernel. Given that the
driver at vmelinux.org hasn't been updated since some time in the 2.4 kernel
series and the lack of mainlined drivers other than this one using that
interface, should we update that file to reflect the additions?

If we were to modify this driver sufficiently, so that chrdevs were
dynamically allocated for example, should we delete that entry?
)
I had the same thoughts. I've concluded that it's not worth it (see above).

#define VME_MAJOR 221 /* VME Major Device Number */
-#define VME_DEVS 9 /* Number of dev entries */
+#define VME_DEVS 10 /* Number of dev entries */
#define MASTER_MINOR 0
#define MASTER_MAX 3
#define SLAVE_MINOR 4
#define SLAVE_MAX 7
#define CONTROL_MINOR 8
+#define DMA_MINOR 9
-#define PCI_BUF_SIZE 0x20000 /* Size of one slave image buffer */
+#define PCI_BUF_SIZE 0x20000 /* Size of one slave image buffer */
+
+#define VME_MAX_DMA_LEN 0x4000000 /* Maximal DMA transfer length */
/*
* Structure to handle image related parameters.
@@ -112,7 +115,7 @@ static const int type[VME_DEVS] = { MASTER_MINOR, MASTER_MINOR,
MASTER_MINOR, MASTER_MINOR,
SLAVE_MINOR, SLAVE_MINOR,
SLAVE_MINOR, SLAVE_MINOR,
- CONTROL_MINOR
+ CONTROL_MINOR, DMA_MINOR
};
struct vme_user_vma_priv {
@@ -281,6 +284,172 @@ static loff_t vme_user_llseek(struct file *file, loff_t off, int whence)
return -EINVAL;
}
+static int vme_user_sg_to_dma_list(const struct vme_dma_op *dma_op,
+ struct sg_table *sgt,
+ int sg_count, struct vme_dma_list *dma_list)
+{
+ ssize_t pos = 0;
+ struct scatterlist *sg;
+ int i, ret;
+
+ if ((dma_op->dir != VME_DMA_MEM_TO_VME) &&
+ (dma_op->dir != VME_DMA_VME_TO_MEM))
+ return -EINVAL;
+

Would this not be better implemented as a "default" case in the switch
below?
I preferred to exit early so as to not need any cleanup.

True, adding the default case would require extra code to clean up the attributes.
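
Roughly something like the below, just to illustrate the extra cleanup a
default case would need (sketch only):

		switch (dma_op->dir) {
		case VME_DMA_MEM_TO_VME:
			src = pci_attr;
			dest = vme_attr;
			break;
		case VME_DMA_VME_TO_MEM:
			src = vme_attr;
			dest = pci_attr;
			break;
		default:
			/* both attributes are already allocated here,
			 * so they must be freed before bailing out */
			vme_dma_free_attribute(vme_attr);
			vme_dma_free_attribute(pci_attr);
			return -EINVAL;
		}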


+ for_each_sg(sgt->sgl, sg, sg_count, i) {
+ struct vme_dma_attr *pci_attr, *vme_attr, *src, *dest;
+ dma_addr_t hw_address = sg_dma_address(sg);
+ unsigned int hw_len = sg_dma_len(sg);
+
+ vme_attr = vme_dma_vme_attribute(dma_op->vme_addr + pos,
+ dma_op->aspace,
+ dma_op->cycle,
+ dma_op->dwidth);
+ if (!vme_attr)
+ return -ENOMEM;
+
+ pci_attr = vme_dma_pci_attribute(hw_address);
+ if (!pci_attr) {
+ vme_dma_free_attribute(vme_attr);
+ return -ENOMEM;
+ }
+
+ switch (dma_op->dir) {
+ case VME_DMA_MEM_TO_VME:
+ src = pci_attr;
+ dest = vme_attr;
+ break;
+ case VME_DMA_VME_TO_MEM:
+ src = vme_attr;
+ dest = pci_attr;
+ break;
+ }
+
+ ret = vme_dma_list_add(dma_list, src, dest, hw_len);
+
+ /*
+ * XXX VME API doesn't mention whether we should keep
+ * attributes around
+ */
+ vme_dma_free_attribute(vme_attr);
+ vme_dma_free_attribute(pci_attr);
+
+ if (ret)
+ return ret;
+
+ pos += hw_len;
+ }
+
+ return 0;
+}
+
+static enum dma_data_direction vme_dir_to_dma_dir(unsigned vme_dir)
+{
+ switch (vme_dir) {
+ case VME_DMA_VME_TO_MEM:
+ return DMA_FROM_DEVICE;
+ case VME_DMA_MEM_TO_VME:
+ return DMA_TO_DEVICE;
+ }
+
+ return DMA_NONE;
+}
+
+static ssize_t vme_user_dma_ioctl(unsigned int minor,
+ const struct vme_dma_op *dma_op)
+{
+ unsigned int offset = offset_in_page(dma_op->buf_vaddr);
+ unsigned long nr_pages;
+ enum dma_data_direction dir;
+ struct vme_dma_list *dma_list;
+ struct sg_table *sgt = NULL;
+ struct page **pages = NULL;
+ long got_pages;
+ ssize_t count;
+ int retval, sg_count;
+
+ /* Prevent WARN from dma_map_sg */
+ if (dma_op->count == 0)
+ return 0;
+
+ /*
+ * This is a voluntary limit to prevent huge allocation for pages
+ * array. VME_MAX_DMA_LEN is not a fundamental VME constraint.
+ */
+ count = min_t(size_t, dma_op->count, VME_MAX_DMA_LEN);
+ nr_pages = (offset + count + PAGE_SIZE - 1) >> PAGE_SHIFT;
+
+ dir = vme_dir_to_dma_dir(dma_op->dir);
+ if (dir == DMA_NONE)
+ return -EINVAL;
+
+ pages = kmalloc_array(nr_pages, sizeof(pages[0]), GFP_KERNEL);
+ if (!pages) {
+ retval = -ENOMEM;
+ goto free;
+ }
+
+ sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
+ if (!sgt) {
+ retval = -ENOMEM;
+ goto free;
+ }
+
+ dma_list = vme_new_dma_list(image[minor].resource);
+ if (!dma_list) {
+ retval = -ENOMEM;
+ goto free;
+ }
+
+ got_pages = get_user_pages_fast(dma_op->buf_vaddr, nr_pages,
+ dir == DMA_FROM_DEVICE, pages);
+ if (got_pages != nr_pages) {
+ pr_debug("Not all pages were pinned\n");
+ retval = (got_pages < 0) ? got_pages : -EFAULT;
+ goto release_pages;
+ }
+
+ retval = sg_alloc_table_from_pages(sgt, pages, nr_pages,
+ offset, count, GFP_KERNEL);
+ if (retval)
+ goto release_pages;
+
+ sg_count = dma_map_sg(vme_user_bridge->dev.parent,
+ sgt->sgl, sgt->nents, dir);
+ if (!sg_count) {
+ pr_debug("DMA mapping error\n");
+ retval = -EFAULT;
+ goto free_sgt;
+ }
+
+ retval = vme_user_sg_to_dma_list(dma_op, sgt, sg_count, dma_list);
+ if (retval)
+ goto dma_unmap;
+
+ retval = vme_dma_list_exec(dma_list);
+
+dma_unmap:
+ dma_unmap_sg(vme_user_bridge->dev.parent, sgt->sgl, sgt->nents, dir);
+
+free_sgt:
+ sg_free_table(sgt);
+
+release_pages:
+ if (got_pages > 0)
+ release_pages(pages, got_pages, 0);
+
+ vme_dma_list_free(dma_list);
+
+free:
+ kfree(sgt);
+ kfree(pages);
+
+ if (retval)
+ return retval;
+
+ return count;
+}
+
/*
* The ioctls provided by the old VME access method (the one at vmelinux.org)
* are most certainly wrong as the effectively push the registers layout
@@ -297,6 +466,7 @@ static int vme_user_ioctl(struct inode *inode, struct file *file,
struct vme_master master;
struct vme_slave slave;
struct vme_irq_id irq_req;
+ struct vme_dma_op dma_op;
unsigned long copied;
unsigned int minor = MINOR(inode->i_rdev);
int retval;
@@ -406,6 +576,19 @@ static int vme_user_ioctl(struct inode *inode, struct file *file,
break;
}
break;
+ case DMA_MINOR:
+ switch (cmd) {
+ case VME_DO_DMA:
+ copied = copy_from_user(&dma_op, argp,
+ sizeof(dma_op));
+ if (copied != 0) {
+ pr_warn("Partial copy from userspace\n");
+ return -EFAULT;
+ }
+
+ return vme_user_dma_ioctl(minor, &dma_op);
+ }
+ break;
}
return -EINVAL;
@@ -615,6 +798,15 @@ static int vme_user_probe(struct vme_dev *vdev)
}
}
+ image[DMA_MINOR].resource = vme_dma_request(vme_user_bridge,
+ VME_DMA_VME_TO_MEM | VME_DMA_MEM_TO_VME);
+ if (!image[DMA_MINOR].resource) {
+ dev_warn(&vdev->dev,
+ "Unable to allocate dma resource\n");
+ err = -ENOMEM;
+ goto err_master;
+ }
+
/* Create sysfs entries - on udev systems this creates the dev files */
vme_user_sysfs_class = class_create(THIS_MODULE, driver_name);
if (IS_ERR(vme_user_sysfs_class)) {
@@ -637,6 +829,9 @@ static int vme_user_probe(struct vme_dev *vdev)
case SLAVE_MINOR:
name = "bus/vme/s%d";
break;
+ case DMA_MINOR:
+ name = "bus/vme/dma0";
+ break;
default:
err = -EINVAL;
goto err_sysfs;
@@ -661,6 +856,8 @@ err_sysfs:
}
class_destroy(vme_user_sysfs_class);
+ vme_dma_free(image[DMA_MINOR].resource);
+
/* Ensure counter set correcty to unalloc all master windows */
i = MASTER_MAX + 1;
err_master:
@@ -701,6 +898,8 @@ static int vme_user_remove(struct vme_dev *dev)
}
class_destroy(vme_user_sysfs_class);
+ vme_dma_free(image[DMA_MINOR].resource);
+
for (i = MASTER_MINOR; i < (MASTER_MAX + 1); i++) {
kfree(image[i].kern_buf);
vme_master_free(image[i].resource);
diff --git a/drivers/staging/vme/devices/vme_user.h b/drivers/staging/vme/devices/vme_user.h
index b8cc7bc..252b1c9 100644
--- a/drivers/staging/vme/devices/vme_user.h
+++ b/drivers/staging/vme/devices/vme_user.h
@@ -48,11 +48,22 @@ struct vme_irq_id {
__u8 statid;
};
+struct vme_dma_op {
+ __u64 vme_addr; /* Starting Address on the VMEbus */
+ __u64 buf_vaddr; /* Pointer to userspace memory */
+ __u32 count; /* Count of bytes to copy */
+ __u32 aspace; /* Address Space */
+ __u32 cycle; /* Cycle properties */
+ __u32 dwidth; /* Data transfer width */
+ __u32 dir; /* Transfer Direction */
+};
+
#define VME_GET_SLAVE _IOR(VME_IOC_MAGIC, 1, struct vme_slave)
#define VME_SET_SLAVE _IOW(VME_IOC_MAGIC, 2, struct vme_slave)
#define VME_GET_MASTER _IOR(VME_IOC_MAGIC, 3, struct vme_master)
#define VME_SET_MASTER _IOW(VME_IOC_MAGIC, 4, struct vme_master)
#define VME_IRQ_GEN _IOW(VME_IOC_MAGIC, 5, struct vme_irq_id)
+#define VME_DO_DMA _IOW(VME_IOC_MAGIC, 7, struct vme_dma_op)

Why not 6?
I wanted to reserve 6 for a "VME_IRQ_WAIT" kind of operation.
But, I guess, that is never coming around anyway.
#endif /* _VME_USER_H_ */
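
For completeness, user space would drive the new ioctl roughly as in the
sketch below. The /dev/bus/vme/dma0 path follows the device name created
above, the address and the VME_A32/VME_SCT/VME_D32 attribute values are just
example choices, and the sketch assumes the attribute and direction constants
from vme.h are visible to user space:

/* Example only: DMA 'len' bytes from VME A32 address 0x10000 into buf. */
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include "vme.h"	/* VME_A32, VME_SCT, VME_D32, VME_DMA_* */
#include "vme_user.h"

static int vme_dma_read_example(void *buf, size_t len)
{
	struct vme_dma_op op;
	int fd, ret;

	fd = open("/dev/bus/vme/dma0", O_RDWR);
	if (fd < 0)
		return -1;

	memset(&op, 0, sizeof(op));
	op.vme_addr = 0x10000;
	op.buf_vaddr = (uintptr_t)buf;
	op.count = len;
	op.aspace = VME_A32;
	op.cycle = VME_SCT;
	op.dwidth = VME_D32;
	op.dir = VME_DMA_VME_TO_MEM;

	ret = ioctl(fd, VME_DO_DMA, &op);	/* bytes transferred, or -1 */
	close(fd);
	return ret;
}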


