Re: [PATCHv2 10/10] zram: add dynamic device add/remove functionality

From: Minchan Kim
Date: Wed Apr 22 2015 - 23:06:40 EST


On Thu, Apr 16, 2015 at 08:55:56PM +0900, Sergey Senozhatsky wrote:
> We currently don't support on-demand device creation. The one and only way
> to have N zram devices is to specify num_devices module parameter (default
> value 1). That means that if, for some reason, at some point, user wants
> to have N + 1 devices he/she must umount all the existing devices, unload
> the module, load the module passing num_devices equals to N + 1. And do
> this again, if needed.
>
> This patch introduces zram control sysfs class, which has two sysfs
> attrs:
> - zram_add -- add a new zram device
> - zram_remove -- remove a specific (device_id) zram device
>
> zram_add sysfs attr is read-only and has only automatic device id assignment
> mode (as requested by Minchan Kim). read operation performed on this attr
> creates a new zram device and returns back its device_id or error status.
>
> Usage example:
> # add a new specific zram device
> cat /sys/class/zram-control/zram_add
> 2
>
> # remove a specific zram device
> echo 4 > /sys/class/zram-control/zram_remove
>
> Returning zram_add() error code back to user (-ENOMEM in this case)
>
> cat /sys/class/zram-control/zram_add
> cat: /sys/class/zram-control/zram_add: Cannot allocate memory
>
> NOTE, there might be users who already depend on the fact that at
> least zram0 device gets always created by zram_init(). Preserve this
> behavior.
>
> [Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>: fix comment layout]
> Reported-by: Minchan Kim <minchan@xxxxxxxxxx>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> ---
> Documentation/ABI/testing/sysfs-class-zram | 24 ++++++
> Documentation/blockdev/zram.txt | 23 +++++-
> drivers/block/zram/zram_drv.c | 124 ++++++++++++++++++++++++++++-
> 3 files changed, 166 insertions(+), 5 deletions(-)
> create mode 100644 Documentation/ABI/testing/sysfs-class-zram
>
> diff --git a/Documentation/ABI/testing/sysfs-class-zram b/Documentation/ABI/testing/sysfs-class-zram
> new file mode 100644
> index 0000000..6f62ef5
> --- /dev/null
> +++ b/Documentation/ABI/testing/sysfs-class-zram
> @@ -0,0 +1,24 @@
> +What: /sys/class/zram-control/
> +Date: August 2015
> +KernelVersion: 4.1
> +Contact: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> +Description:
> + The zram-control/ class sub-directory belongs to zram
> + device class
> +
> +What: /sys/class/zram-control/zram_add
> +Date: August 2015
> +KernelVersion: 4.1
> +Contact: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> +Description:
> + RO attribute. Read operation will cause zram to add a new
> + device and return its device id back to user (so one can
> + use /dev/zram<id>), or error code.
> +
> +What: /sys/class/zram-control/zram_remove
> +Date: August 2015
> +KernelVersion: 4.1
> +Contact: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> +Description:
> + Remove a specific /dev/zramX device, where X is a device_id
> + provided by user
> diff --git a/Documentation/blockdev/zram.txt b/Documentation/blockdev/zram.txt
> index 2ccc741..44b1a77 100644
> --- a/Documentation/blockdev/zram.txt
> +++ b/Documentation/blockdev/zram.txt
> @@ -99,7 +99,24 @@ size of the disk when not in use so a huge zram is wasteful.
> mkfs.ext4 /dev/zram1
> mount /dev/zram1 /tmp
>
> -7) Stats:
> +7) Add/remove zram devices
> +
> +zram provides a control interface, which enables dynamic (on-demand) device
> +addition and removal.
> +
> +In order to add a new /dev/zramX device, perform read operation on zram_add
> +attribute. This will return either new device's device id (meaning that you
> +can use /dev/zram<id>) or error code.
> +
> +Example:
> + cat /sys/class/zram-control/zram_add

Why do we put zram-control there rather than /sys/block/zram?
> + 1
> +
> +To remove the existing /dev/zramX device (where X is a device id)
> +execute
> + echo X > /sys/class/zram-control/zram_remove
> +
> +8) Stats:
> Per-device statistics are exported as various nodes under /sys/block/zram<id>/
>
> A brief description of exported device attritbutes. For more details please
> @@ -178,11 +195,11 @@ line of text and contains the following stats separated by whitespace:
> num_migrated
>
>
> -8) Deactivate:
> +9) Deactivate:
> swapoff /dev/zram0
> umount /dev/zram1
>
> -9) Reset:
> +10) Reset:
> Write any positive value to 'reset' sysfs node
> echo 1 > /sys/block/zram0/reset
> echo 1 > /sys/block/zram1/reset
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index 2c2e7cc..848222a 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -33,10 +33,14 @@
> #include <linux/vmalloc.h>
> #include <linux/err.h>
> #include <linux/idr.h>
> +#include <linux/sysfs.h>
>
> #include "zram_drv.h"
>
> static DEFINE_IDR(zram_index_idr);
> +/* idr index must be protected */
> +static DEFINE_MUTEX(zram_index_mutex);
> +
> static int zram_major;
> static const char *default_compressor = "lzo";
>
> @@ -1168,8 +1172,15 @@ static int zram_add(int device_id)

Why does zram_add() need device_id?
We decided to drop the option of letting the user pass a device_id.
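
A rough userspace sketch of the idea, not driver code: if the allocator
always hands out the lowest free id (as idr_alloc() does in practice when
asked for the full range), the init path gets 0..N-1 without passing any
id in. The table, struct zram_stub, add_dev() and init_devs() below are
hypothetical stand-ins for zram_index_idr, struct zram, zram_add() and the
loop in zram_init().

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_DEVS 4			/* hypothetical table size */

struct zram_stub { int id; };		/* stand-in for struct zram */

/* stand-in for zram_index_idr */
static struct zram_stub *dev_table[MAX_DEVS];

/* Add a device under the lowest free id; return the id or -errno. */
static int add_dev(void)
{
	struct zram_stub *dev;
	int id;

	for (id = 0; id < MAX_DEVS; id++)
		if (!dev_table[id])
			break;
	if (id == MAX_DEVS)
		return -ENOSPC;

	dev = malloc(sizeof(*dev));
	if (!dev)
		return -ENOMEM;

	dev->id = id;
	dev_table[id] = dev;
	return id;
}

/* Init path: no id is passed in, devices come out as 0..n-1. */
static int init_devs(int n)
{
	int i, ret;

	for (i = 0; i < n; i++) {
		ret = add_dev();
		if (ret < 0)
			return ret;
	}
	return 0;
}
```

With that, both the module init loop and the zram_add attribute can call
the same function with no argument.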

> if (!zram)
> return -ENOMEM;
>
> - ret = idr_alloc(&zram_index_idr, zram, device_id,
> - device_id + 1, GFP_KERNEL);
> + if (device_id < 0) {
> + /* generate new device_id */
> + ret = idr_alloc(&zram_index_idr, zram, 0, 0, GFP_KERNEL);
> + device_id = ret;
> + } else {
> + /* use provided device_id */
> + ret = idr_alloc(&zram_index_idr, zram, device_id,
> + device_id + 1, GFP_KERNEL);
> + }
> if (ret < 0)
> goto out_free_dev;
>
> @@ -1278,6 +1289,105 @@ static void zram_remove(struct zram *zram)
> kfree(zram);
> }
>
> +/*
> + * Lookup if there is any device pointer that match the given device_id.
> + * return device pointer if so, or ERR_PTR() otherwise.
> + */
> +static struct zram *zram_lookup(int dev_id)
> +{
> + struct zram *zram;
> +
> + zram = idr_find(&zram_index_idr, dev_id);
> + if (zram)
> + return zram;
> + return ERR_PTR(-ENODEV);

Just return NULL, which is simpler and more readable.
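
Roughly what I mean, as a userspace sketch with stand-in names (dev_table,
struct zram_stub, lookup() and remove_dev() are hypothetical, they only
mimic the idr, struct zram, zram_lookup() and zram_remove_store()): the
lookup reports "not found" as NULL and the caller picks the error code.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_DEVS 8			/* hypothetical table size */

struct zram_stub { int id; };		/* stand-in for struct zram */

/* stand-in for zram_index_idr */
static struct zram_stub *dev_table[MAX_DEVS];

/* Return the device registered under dev_id, or NULL if there is none. */
static struct zram_stub *lookup(int dev_id)
{
	if (dev_id < 0 || dev_id >= MAX_DEVS)
		return NULL;
	return dev_table[dev_id];
}

/* The caller chooses the error code, no IS_ERR()/PTR_ERR() needed. */
static int remove_dev(int dev_id)
{
	struct zram_stub *dev = lookup(dev_id);

	if (!dev)
		return -ENODEV;

	dev_table[dev_id] = NULL;
	return 0;
}
```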

> +}
> +
> +/* zram module control sysfs attributes */
> +static ssize_t zram_add_show(struct class *class,
> + struct class_attribute *attr,
> + char *buf)
> +{
> + int ret;
> +
> + mutex_lock(&zram_index_mutex);
> + /* pick up available device_id */
> + ret = zram_add(-1);
> + mutex_unlock(&zram_index_mutex);
> +
> + if (ret < 0)
> + return ret;
> + return scnprintf(buf, PAGE_SIZE, "%d\n", ret);
> +}
> +
> +static ssize_t zram_remove_store(struct class *class,
> + struct class_attribute *attr,
> + const char *buf,
> + size_t count)
> +{
> + struct zram *zram;
> + int ret, err, dev_id;
> +
> + mutex_lock(&zram_index_mutex);
> +
> + /* dev_id is gendisk->first_minor, which is `int' */
> + ret = kstrtoint(buf, 10, &dev_id);
> + if (ret || dev_id < 0) {
> + ret = -EINVAL;
> + goto out;
> + }
> +
> + zram = zram_lookup(dev_id);
> + if (IS_ERR(zram)) {
> + ret = PTR_ERR(zram);
> + goto out;
> + }
> +
> + /*
> + * First, make ->disksize device attr RO, closing
> + * zram_remove() vs disksize_store() race window

Why don't you use zram->init_lock to protect against the race?

> + */
> + ret = sysfs_chmod_file(&disk_to_dev(zram->disk)->kobj,
> + &dev_attr_disksize.attr, S_IRUGO);
> + if (ret)
> + goto out;
> +
> + ret = zram_reset_device(zram);
> + if (ret == 0) {
> + /* ->disksize is RO and there are no ->bd_openers */
> + zram_remove(zram);
> + goto out;
> + }
> +
> + /*
> + * If there are still device bd_openers, try to make ->disksize
> + * RW again and return. even if we fail to make ->disksize RW,
> + * user still has RW ->reset attr. so it's possible to destroy
> + * that device.
> + */
> + err = sysfs_chmod_file(&disk_to_dev(zram->disk)->kobj,
> + &dev_attr_disksize.attr,
> + S_IWUSR | S_IRUGO);
> + if (err)
> + ret = err;
> +
> +out:
> + mutex_unlock(&zram_index_mutex);
> + return ret ? ret : count;
> +}
> +
> +static struct class_attribute zram_control_class_attrs[] = {
> + __ATTR_RO(zram_add),
> + __ATTR_WO(zram_remove),
> + __ATTR_NULL,
> +};
> +
> +static struct class zram_control_class = {
> + .name = "zram-control",
> + .owner = THIS_MODULE,
> + .class_attrs = zram_control_class_attrs,
> +};
> +
> static int zram_exit_cb(int id, void *ptr, void *data)
> {
> zram_remove(ptr);
> @@ -1286,6 +1396,7 @@ static int zram_exit_cb(int id, void *ptr, void *data)
>
> static void destroy_devices(void)
> {
> + class_unregister(&zram_control_class);
> idr_for_each(&zram_index_idr, &zram_exit_cb, NULL);
> idr_destroy(&zram_index_idr);
> unregister_blkdev(zram_major, "zram");
> @@ -1295,14 +1406,23 @@ static int __init zram_init(void)
> {
> int ret, dev_id;
>
> + ret = class_register(&zram_control_class);
> + if (ret) {
> + pr_warn("Unable to register zram-control class\n");
> + return ret;
> + }
> +
> zram_major = register_blkdev(0, "zram");
> if (zram_major <= 0) {
> pr_warn("Unable to get major number\n");
> + class_unregister(&zram_control_class);
> return -EBUSY;
> }
>
> for (dev_id = 0; dev_id < num_devices; dev_id++) {
> + mutex_lock(&zram_index_mutex);
> ret = zram_add(dev_id);
> + mutex_unlock(&zram_index_mutex);
> if (ret < 0)
> goto out_error;
> }
> --
> 2.4.0.rc2
>

--
Kind regards,
Minchan Kim