Re: [RFC PATCH V1] zram:calculate available memory when zram is used

From: yong w
Date: Wed Jun 02 2021 - 20:40:34 EST


It's a good suggestion!
I do think SwapAvailable is more reasonable too.
SwapAvailable is corresponded to Available!

I will remake the patch and submit it later.
Thanks very much!

David Hildenbrand <david@xxxxxxxxxx> 于2021年6月2日周三 下午3:44写道:
>
> On 02.06.21 02:06, yong w wrote:
> > Thanks for your reply!
> >
> > Indeed, when zram is not triggered, MemAvalible will be greater than Memtotal.
> > As you said, this modification will have program compatibility issues,
> > when there
> > are some user space scripts use MemAvailable for calculation.
> >
> > How about providing another proc interface?
> > Such as /proc/xxx, When execute “cat /proc/xxx”, it returns "available
> > + swapfree - swapfree * compress ratio" value.
>
> Good question. What would be an appropriate name and description for
> this entry?
>
> We do have in /proc/meminfo already:
>
> MemTotal:
> MemAvailable:
> SwapTotal:
> SwapFree:
>
> I wonder if we could add an entry either for "swapfree - swapfree *
> compress" or "swapfree * compress". But even then, I struggle to find a
> fitting name. We could either define something like
>
> "SwapAvailable:" An estimate of how much swap space is available for
> starting new applications. Note that SwapAvailable is different to
> SwapFree when swap compression, as implemented by zram, is being performed.
>
> So a user can compute
>
> "TotalAvailable = MemAvailable + SwapAvailable".
>
> or if that doesn't make any sense:
>
> "TotalAvailable:" An estimate of how much memory and swap space is
> available for starting new applications. Usually, the value corresponds
> to "MemAvailable + SwapFree", however, swap compression, as implemented
> by zram, might allow for making more efficient use of free swap space.
>
>
> I'd prefer the first -- if it makes any sense.
>
> You should CC linux-api on follow up patches.
>
> > Thanks in advance!
> >
> >
> > David Hildenbrand <david@xxxxxxxxxx> 于2021年5月31日周一 下午4:17写道:
> >>
> >> On 30.05.21 19:18, yong w wrote:
> >>> When zram is used, available+Swap free memory is obviously bigger than
> >>> I actually can use, because zram can compress memory by compression
> >>> algorithm and zram compressed data will occupy memory too.
> >>>
> >>> So, I count the compression rate of zram in the kernel. The available
> >>> memory is calculated as follows:
> >>> available + swapfree - swapfree * compress ratio
> >>> MemAvailable in /proc/meminfo returns available + zram will save space
> >>>
> >>
> >> This will mean that we can easily have MemAvailable > MemTotal, right?
> >> I'm not sure if there are some user space scripts that will be a little
> >> disrupted by that. Like calculating "MemUnavailable = MemTotal -
> >> MemAvailable".
> >>
> >> MemAvailable: "An estimate of how much memory is available for starting
> >> new applications, without swapping"
> >>
> >> Although zram isn't "traditional swapping", there is still a performance
> >> impact when having to go to zram because it adds an indirection and
> >> requires (de)compression. Similar to having very fast swap space (like
> >> PMEM). Let's not call something "memory" that doesn't have the same
> >> semantics as real memory as in "MemTotal".
> >>
> >> This doesn't feel right.
> >>
> >>> Signed-off-by: wangyong <yongw.pur@xxxxxxxxx <mailto:yongw.pur@xxxxxxxxx>>
> >>> ---
> >>> drivers/block/zram/zcomp.h | 1 +
> >>> drivers/block/zram/zram_drv.c | 4 ++
> >>> drivers/block/zram/zram_drv.h | 1 +
> >>> fs/proc/meminfo.c | 2 +-
> >>> include/linux/swap.h | 10 +++++
> >>> mm/swapfile.c | 95
> >>> +++++++++++++++++++++++++++++++++++++++++++
> >>> mm/vmscan.c | 1 +
> >>> 7 files changed, 113 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/block/zram/zcomp.h b/drivers/block/zram/zcomp.h
> >>> index 40f6420..deb2dbf 100644
> >>> --- a/drivers/block/zram/zcomp.h
> >>> +++ b/drivers/block/zram/zcomp.h
> >>> @@ -40,4 +40,5 @@ int zcomp_decompress(struct zcomp_strm *zstrm,
> >>> const void *src, unsigned int src_len, void *dst);
> >>>
> >>> bool zcomp_set_max_streams(struct zcomp *comp, int num_strm);
> >>> +int get_zram_major(void);
> >>> #endif /* _ZCOMP_H_ */
> >>> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> >>> index cf8deec..1c6cbd4 100644
> >>> --- a/drivers/block/zram/zram_drv.c
> >>> +++ b/drivers/block/zram/zram_drv.c
> >>> @@ -59,6 +59,10 @@ static void zram_free_page(struct zram *zram, size_t
> >>> index);
> >>> static int zram_bvec_read(struct zram *zram, struct bio_vec *bvec,
> >>> u32 index, int offset, struct bio *bio);
> >>>
> >>> +int get_zram_major(void)
> >>> +{
> >>> + return zram_major;
> >>> +}
> >>>
> >>> static int zram_slot_trylock(struct zram *zram, u32 index)
> >>> {
> >>> diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
> >>> index 6e73dc3..5d8701a 100644
> >>> --- a/drivers/block/zram/zram_drv.h
> >>> +++ b/drivers/block/zram/zram_drv.h
> >>> @@ -88,6 +88,7 @@ struct zram_stats {
> >>> atomic64_t bd_reads; /* no. of reads from backing device */
> >>> atomic64_t bd_writes; /* no. of writes from backing device */
> >>> #endif
> >>> + atomic_t min_compr_ratio;
> >>> };
> >>>
> >>> struct zram {
> >>> diff --git a/fs/proc/meminfo.c b/fs/proc/meminfo.c
> >>> index 6fa761c..f7bf350 100644
> >>> --- a/fs/proc/meminfo.c
> >>> +++ b/fs/proc/meminfo.c
> >>> @@ -57,7 +57,7 @@ static int meminfo_proc_show(struct seq_file *m, void *v)
> >>>
> >>> show_val_kb(m, "MemTotal: ", i.totalram);
> >>> show_val_kb(m, "MemFree: ", i.freeram);
> >>> - show_val_kb(m, "MemAvailable: ", available);
> >>> + show_val_kb(m, "MemAvailable: ", available + count_avail_swaps());
> >>> show_val_kb(m, "Buffers: ", i.bufferram);
> >>> show_val_kb(m, "Cached: ", cached);
> >>> show_val_kb(m, "SwapCached: ", total_swapcache_pages());
> >>> diff --git a/include/linux/swap.h b/include/linux/swap.h
> >>> index 032485e..3225a2f 100644
> >>> --- a/include/linux/swap.h
> >>> +++ b/include/linux/swap.h
> >>> @@ -514,6 +514,8 @@ extern int init_swap_address_space(unsigned int
> >>> type, unsigned long nr_pages);
> >>> extern void exit_swap_address_space(unsigned int type);
> >>> extern struct swap_info_struct *get_swap_device(swp_entry_t entry);
> >>> sector_t swap_page_sector(struct page *page);
> >>> +extern void update_zram_zstats(void);
> >>> +extern u64 count_avail_swaps(void);
> >>>
> >>> static inline void put_swap_device(struct swap_info_struct *si)
> >>> {
> >>> @@ -684,6 +686,14 @@ static inline swp_entry_t get_swap_page(struct page
> >>> *page)
> >>> return entry;
> >>> }
> >>>
> >>> +void update_zram_zstats(void)
> >>> +{
> >>> +}
> >>> +
> >>> +u64 count_avail_swaps(void)
> >>> +{
> >>> +}
> >>> +
> >>> #endif /* CONFIG_SWAP */
> >>>
> >>> #ifdef CONFIG_THP_SWAP
> >>> diff --git a/mm/swapfile.c b/mm/swapfile.c
> >>> index cbb4c07..93a9dcb 100644
> >>> --- a/mm/swapfile.c
> >>> +++ b/mm/swapfile.c
> >>> @@ -44,6 +44,7 @@
> >>> #include <asm/tlbflush.h>
> >>> #include <linux/swapops.h>
> >>> #include <linux/swap_cgroup.h>
> >>> +#include "../drivers/block/zram/zram_drv.h"
> >>>
> >>> static bool swap_count_continued(struct swap_info_struct *, pgoff_t,
> >>> unsigned char);
> >>> @@ -3408,6 +3409,100 @@ SYSCALL_DEFINE2(swapon, const char __user *,
> >>> specialfile, int, swap_flags)
> >>> return error;
> >>> }
> >>>
> >>> +u64 count_avail_swap(struct swap_info_struct *si)
> >>> +{
> >>> + u64 result;
> >>> + struct zram *z;
> >>> + unsigned int free;
> >>> + unsigned int ratio;
> >>> +
> >>> + result = 0;
> >>> + if (!si)
> >>> + return 0;
> >>> +
> >>> + //zram calculate available mem
> >>> + if (si->flags & SWP_USED && si->swap_map) {
> >>> + if (si->bdev->bd_disk->major == get_zram_major()) {
> >>> + z = (struct zram *)si->bdev->bd_disk->private_data;
> >>> + down_read(&z->init_lock);
> >>> + ratio = atomic_read(&z->stats.min_compr_ratio);
> >>> + free = (si->pages << (PAGE_SHIFT - 10))
> >>> + - (si->inuse_pages << (PAGE_SHIFT - 10));
> >>> + if (!ratio)
> >>> + result += free / 2;
> >>> + else
> >>> + result = free * (100 - 10000 / ratio) / 100;
> >>> + up_read(&z->init_lock);
> >>> + }
> >>> + } else
> >>> + result += (si->pages << (PAGE_SHIFT - 10))
> >>> + - (si->inuse_pages << (PAGE_SHIFT - 10));
> >>> +
> >>> + return result;
> >>> +}
> >>> +
> >>> +u64 count_avail_swaps(void)
> >>> +{
> >>> + int type;
> >>> + u64 result;
> >>> + struct swap_info_struct *si;
> >>> +
> >>> + result = 0;
> >>> + spin_lock(&swap_lock);
> >>> + for (type = 0; type < nr_swapfiles; type++) {
> >>> + si = swap_info[type];
> >>> + spin_lock(&si->lock);
> >>> + result += count_avail_swap(si);
> >>> + spin_unlock(&si->lock);
> >>> + }
> >>> + spin_unlock(&swap_lock);
> >>> +
> >>> + return result;
> >>> +}
> >>> +
> >>> +void update_zram_zstat(struct swap_info_struct *si)
> >>> +{
> >>> + struct zram *z;
> >>> + struct zram_stats *stat;
> >>> + int ratio;
> >>> + u64 orig_size, compr_data_size;
> >>> +
> >>> + if (!si)
> >>> + return;
> >>> +
> >>> + //update zram min compress ratio
> >>> + if (si->flags & SWP_USED && si->swap_map) {
> >>> + if (si->bdev->bd_disk->major == get_zram_major()) {
> >>> + z = (struct zram *)si->bdev->bd_disk->private_data;
> >>> + down_read(&z->init_lock);
> >>> + stat = &z->stats;
> >>> + ratio = atomic_read(&stat->min_compr_ratio);
> >>> + orig_size = atomic64_read(&stat->pages_stored) << PAGE_SHIFT;
> >>> + compr_data_size = atomic64_read(&stat->compr_data_size);
> >>> + if (compr_data_size && (!ratio
> >>> + || ((orig_size * 100) / compr_data_size < ratio)))
> >>> + atomic_set(&stat->min_compr_ratio,
> >>> + (orig_size * 100) / compr_data_size);
> >>> + up_read(&z->init_lock);
> >>> + }
> >>> + }
> >>> +}
> >>> +
> >>> +void update_zram_zstats(void)
> >>> +{
> >>> + int type;
> >>> + struct swap_info_struct *si;
> >>> +
> >>> + spin_lock(&swap_lock);
> >>> + for (type = 0; type < nr_swapfiles; type++) {
> >>> + si = swap_info[type];
> >>> + spin_lock(&si->lock);
> >>> + update_zram_zstat(si);
> >>> + spin_unlock(&si->lock);
> >>> + }
> >>> + spin_unlock(&swap_lock);
> >>> +}
> >>> +
> >>> void si_swapinfo(struct sysinfo *val)
> >>> {
> >>> unsigned int type;
> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> index eb31452..ffaf59b 100644
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -4159,6 +4159,7 @@ static int kswapd(void *p)
> >>> alloc_order);
> >>> reclaim_order = balance_pgdat(pgdat, alloc_order,
> >>> highest_zoneidx);
> >>> + update_zram_zstats();
> >>> if (reclaim_order < alloc_order)
> >>> goto kswapd_try_sleep;
> >>> }
> >>> --
> >>> 2.7.4
> >>
> >>
> >> --
> >> Thanks,
> >>
> >> David / dhildenb
> >>
> >
>
>
> --
> Thanks,
>
> David / dhildenb
>