Sunday, 28 March 2021
Enable root login over SSH:
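A minimal sketch (assumptions: OpenSSH; on Solaris 11 root is a role by default and must first be made a normal user):
# rolemod -K type=normal root            (Solaris 11: turn the root role into a normal user)
Set in /etc/ssh/sshd_config:
PermitRootLogin yes
Then restart the SSH service:
# svcadm restart svc:/network/ssh        (Solaris)
# systemctl restart sshd                 (Linux with systemd)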
Tuesday, 9 March 2021
How mdb calculates ZFS-related values and how those differ from the ZFS ARC size
Applies to:
Solaris Operating System - Version 10 6/06 U2 and later
Information in this document applies to any platform.
Purpose
This document describes how mdb calculates ZFS-related values and how those differ from the ZFS ARC size, so that users correctly understand the relationship between the two.
Details
ARC size reported by arcstats
The arcstats kernel statistics report the current ZFS ARC usage.
module: zfs instance: 0
name: arcstats class: misc
buf_size 37861488
data_size 7838309824
l2_hdr_size 0
meta_used 170464568
other_size 115650152
prefetch_meta_size 16952928
rawdata_size 0
size 8008774392
(The output is cut for brevity.)
'size' is the amount of active data in the ARC and it can be broken down as follows.
Solaris 11.x prior to Solaris 11.3 SRU 13.4 and Solaris 10 without 150400-46/150401-46
size = meta_used + data_size;
Solaris 11.3 SRU 13.4 or later and Solaris 10 with 150400-46/150401-46 or later
size = data_size;
meta_used = buf_size + other_size + l2_hdr_size + rawdata_size + prefetch_meta_size;
buf_size: size of in-core data to manage ARC buffers.
other_size: size of in-core data to manage ZFS objects.
l2_hdr_size: size of in-core data to manage L2ARC.
rawdata_size: size of raw data used for persistent L2ARC. (Solaris 11.2.8 or later)
prefetch_meta_size: size of in-core data to manage prefetch. (Solaris 11.3 or later)
data_size: size of cached on-disk file data and on-disk meta data.
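As a quick sanity check, the sample arcstats output above is consistent with these formulas (and with the pre-SRU 13.4 accounting):
meta_used = 37861488 + 115650152 + 0 + 0 + 16952928 = 170464568
size      = 170464568 + 7838309824 = 8008774392  (= meta_used + data_size)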
How ZFS ARC is allocated from kernel memory
The way ZFS ARC is allocated from kernel memory depends on the Solaris version.
Solaris 10, Solaris 11.0, Solaris 11.1
To cache on-disk file data, ARC is allocated from 'zio_data_buf_XXX' (XXX indicates cache unit size, such as '4096', '8192' etc.) kmem caches allocated from 'zfs_file_data_buf' virtual memory (vmem) arena.
To cache on-disk meta data, ARC is allocated from 'zio_buf_XXX' kmem caches allocated from 'kmem_default' vmem arena.
In-core data is allocated from other kmem caches, 'arc_buf_t', 'dmu_buf_impl_t', 'l2arc_buf_t', etc. allocated from 'kmem_default' vmem arena.
Note that 'zio_data_buf_XXX' and 'zio_buf_XXX' are not only used to cache on-disk file data and meta data; they are also used by ZFS I/O routines for purposes other than the ARC.
Pages for 'zio_data_buf_XXX' are associated with the 'zvp' vnode and in the 'kzioseg' kernel segment.
Pages for 'zio_buf_XXX' and other caches are associated with the 'kvp', usual kernel vnode.
On Solaris 11.1 with SRU 3.4 or later, in addition to the above, 'zfs_file_data_lp_buf' vmem arena is used to allocate large pages.
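As a sketch (live kernel and root privileges assumed), the per-size zio kmem caches and their memory usage can be inspected with the ::kmastat dcmd:
# echo "::kmastat" | mdb -k | egrep 'zio_buf|zio_data_buf'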
Solaris 11.2
To cache on-disk file data, ARC is allocated from 'zio_data_buf_XXX' kmem caches allocated from 'zfs_file_data_buf' vmem arena.
To cache on-disk meta data, ARC is allocated from 'zio_buf_XXX' kmem caches allocated from the 'zfs_metadata_buf' vmem arena.
In-core data is allocated from other kmem caches, 'arc_buf_t', 'dmu_buf_impl_t', 'l2arc_buf_t', 'zfetch_triggert_t', etc. allocated from 'kmem_default' vmem arena.
Note that 'zio_data_buf_XXX' and 'zio_buf_XXX' are not only used to cache on-disk file data and meta data; they are also used by ZFS I/O routines for purposes other than the ARC.
Pages for both 'zio_data_buf_XXX' and 'zio_buf_XXX' are associated with the 'zvp' vnode and in the 'kzioseg' kernel segment.
Pages for other caches are associated with the 'kvp', usual kernel vnode.
Solaris 11.3 prior to SRU 21.5
A new kernel memory allocation mechanism, the Kernel Object Manager (KOM), is introduced.
To cache on-disk file data, ARC is allocated from 'arc_data' kom class.
To cache on-disk meta data, ARC is allocated from 'arc_meta' kom class.
In-core data is allocated from other kmem caches, 'arc_buf_t', 'dmu_buf_impl_t', 'l2arc_buf_t', 'zfetch_triggert_t', etc. allocated from 'kmem_default' vmem arena.
Memory used by ZFS I/O routines for purposes other than the ARC is allocated as 'kmem_alloc_XXX' from the 'kmem_default' vmem arena.
'kzioseg' segment and 'zvp' vnode no longer exist.
Solaris 11.3 SRU 21.5 or later
To cache on-disk file data, ARC is allocated from 'arc_data' kom class.
To cache on-disk meta data, ARC is allocated from 'arc_meta' kom class.
The 'kmem_default_zfs' vmem arena is introduced to account for kernel memory used by ZFS for purposes other than caching on-disk data.
In-core data, 'arc_buf_t', 'dmu_buf_impl_t', 'l2arc_buf_t', 'zfetch_triggert_t', etc., are now allocated from 'kmem_default_zfs' vmem arena.
Memory used by ZFS I/O routines for purposes other than the ARC is also allocated as 'zio_buf_XXX' from the 'kmem_default_zfs' vmem arena.
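As a sketch (live kernel assumed), the vmem arenas, including 'kmem_default_zfs', can be listed with the ::vmem dcmd:
# echo "::vmem" | mdb -k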
ZFS information reported by ::memstat in mdb
::memstat also reports ZFS-related memory usage, but its figures are not exactly the same as arcstats, and its implementation depends on the OS version.
Solaris 10, Solaris 11.0, Solaris 11.1
Page Summary Pages MB %Tot
------------ ---------------- ---------------- ----
Kernel 540356 2110 13%
ZFS File Data 609140 2379 15%
Anon 41590 162 1%
Exec and libs 5231 20 0%
Page cache 2883 11 0%
Free (cachelist) 800042 3125 19%
Free (freelist) 2192512 8564 52%
Total 4191754 16374
Physical 4102251 16024
'ZFS File Data' shows the size of pages associated with the 'zvp' vnode, which is the size allocated from the 'zio_data_buf_XXX' kmem caches.
It does not include cached on-disk meta data or in-core data. It also contains some amount of memory used by ZFS I/O routines for non-ARC purposes.
Solaris 11.2
Page Summary Pages Bytes %Tot
----------------- ---------------- ---------------- ----
Kernel 237329 1.8G 23%
Guest 0 0 0%
ZFS Metadata 28989 226.4M 3%
ZFS File Data 699858 5.3G 67%
Anon 41418 323.5M 4%
Exec and libs 1366 10.6M 0%
Page cache 4782 37.3M 0%
Free (cachelist) 1017 7.9M 0%
Free (freelist) 33817 264.1M 3%
Total 1048576 8G
'ZFS File Data' shows the size allocated from the 'zfs_file_data_buf' vmem arena. 'ZFS Metadata' shows the size of pages associated with the 'zvp' vnode minus 'ZFS File Data'.
Solaris 11.3 prior to SRU 17.5.0
Page Summary Pages Bytes %Tot
----------------- ---------------- ---------------- ----
Kernel 558607 4.2G 7%
ZFS Metadata 27076 211.5M 0%
ZFS File Data 2743214 20.9G 33%
Anon 68656 536.3M 1%
Exec and libs 2067 16.1M 0%
Page cache 7285 56.9M 0%
Free (cachelist) 21596 168.7M 0%
Free (freelist) 4927709 37.5G 59%
Total 8372224 63.8G
> ::kom_class
ADDR FLAGS NAME RSS MEM_TOTAL
4c066e91d80 -L- arc_meta 211.5m 280m
4c066e91c80 --- arc_data 20.9g 20.9g
'ZFS File Data' shows the size of KOM statistics of 'arc_data'. 'ZFS Metadata' shows the size of KOM statistics of 'arc_meta'.
Solaris 11.3 SRU 17.5 or later, prior to SRU 21.5
Page Summary Pages Bytes %Tot
---------------------------- ---------------- ---------------- ----
Kernel 636916 4.8G 4%
Kernel (ZFS ARC excess) 16053 125.4M 0%
Defdump prealloc 291049 2.2G 2%
ZFS Metadata 137434 1.0G 1%
ZFS File Data 4244593 32.3G 25%
Anon 114975 898.2M 1%
Exec and libs 2000 15.6M 0%
Page cache 15548 121.4M 0%
Free (cachelist) 253689 1.9G 2%
Free (freelist) 11064959 84.4G 66%
Total 16777216 128G
::memstat on Solaris 11.3 SRU 17.5 or later has a '-v' option to show the details.
'ZFS File Data' and 'ZFS Metadata' show the same KOM statistics as before.
In addition, 'Kernel (ZFS ARC excess)' shows memory wasted on top of the sum of 'ZFS File Data' and 'ZFS Metadata':
KOM can keep allocated memory that is not actually in use at the moment, and such memory is considered wasted.
Solaris 11.3 SRU 21.5 or later
Page Summary Pages Bytes %Tot
---------------------------- ---------------- ---------------- ----
Kernel 671736 2.5G 6%
Kernel (ZFS ARC excess) 21159 82.6M 0%
Defdump prealloc 361273 1.3G 3%
ZFS Kernel Data 131699 514.4M 1%
ZFS Metadata 42962 167.8M 0%
ZFS File Data 8857479 33.7G 84%
Anon 99066 386.9M 1%
Exec and libs 2050 8.0M 0%
Page cache 9265 36.1M 0%
Free (cachelist) 14663 57.2M 0%
Free (freelist) 273905 1.0G 3%
Total 10485257 39.9G
In addition to the information shown prior to Solaris 11.3 SRU 21.5, 'ZFS Kernel Data' shows the size allocated from the 'kmem_default_zfs' arena (and its overhead).
Solaris 11.4 or later
Usage Type/Subtype Pages Bytes %Tot %Tot/%Subt
---------------------------- ---------------- -------- ----- -----------
Kernel 3669091 13.9g 7.2%
Regular Kernel 2602037 9.9g 5.1%/70.9%
ZFS ARC Fragmentation 14515 56.6m 0.0%/ 0.3%
Defdump prealloc 1052539 4.0g 2.0%/28.6%
ZFS 28359638 108.1g 56.3%
ZFS Metadata 116083 453.4m 0.2%/ 0.4%
ZFS Data 27959629 106.6g 55.5%/98.5%
ZFS Kernel Data 283926 1.0g 0.5%/ 1.0%
User/Anon 201462 786.9m 0.4%
Exec and libs 3062 11.9m 0.0%
Page Cache 29372 114.7m 0.0%
Free (cachelist) 944 3.6m 0.0%
Free 18033911 68.7g 35.8%
Total 50297480 191.8g 100%
'ZFS ARC Fragmentation' under 'Kernel' shows the wasted memory.
Why do the values reported by ::memstat differ from the size reported by arcstats?
There are a few factors.
ARC size includes cached on-disk file data, cached on-disk meta data, and various in-core data. But ::memstat does not report each of them. Prior to Solaris 11.2, only 'ZFS File Data' is reported.
Even on Solaris 11.2 and 11.3, in-core data is not reported. Also the accounting by arcstats and ::memstat does not completely match.
::memstat on Solaris 11.3 SRU 21.5 or later reports in-core data as 'ZFS Kernel Data', though in-core data counted by arcstats and by ::memstat are not exactly the same.
Another factor is wasted memory in kmem caches.
Consider a possible scenario: a customer ran a workload that was largely 128K-blocksize based, which filled the ARC with, say, X GB of 128K blocks. The customer then switched to an 8K-based workload; the ARC then filled up with Y GB of 8K blocks while the 128K blocks were evicted. When the 128K blocks are evicted from the ARC, they are returned to the 'zio_data_buf_131072' cache, where they stay (unused by the ARC) until either re-allocated or "reaped" by the VM system.
Under such a condition, 'ZFS File Data' shown by ::memstat can be much higher than the ARC size.
Especially from Solaris 11.1 with SRU 3.4 through Solaris 11.1 with SRU 21.4, large pages are used by default and the situation can be worse.
::memstat reports such waste as 'Kernel (ZFS ARC excess)' on Solaris 11.3 SRU 17.5 or later, or 'ZFS ARC Fragmentation' on Solaris 11.4 or later.
It can also happen that 'ZFS File Data' is higher than the ARC size even though 'ZFS ARC excess' / 'ZFS ARC Fragmentation' is not high.
In this case, the ARC memory has been freed but still has KOM objects associated with it.
As discussed above, the values reported by ::memstat do not have to match the ZFS ARC size. It is not an issue if the ::memstat values are larger or smaller than the ZFS ARC size.
-------------
Applies to:
Solaris Operating System - Version 8.0 to 11.4 [Release 8.0 to 11.0]
All Platforms
*** Checked for currency and updated for Solaris 11.2 11-March-2015 ***
Goal
This document is intended to give hints on where to look when checking and troubleshooting memory usage.
In principle, investigation of memory usage is split into checking the usage of kernel memory and of user memory.
Please be aware that in the case of a memory-usage problem on a system, corrective actions usually require deep knowledge and must be performed with great care.
Solution
A general system practice is to keep the system up to date with the latest Solaris releases and patches.
First, you need to check how much memory is used by the kernel and how much is used as user memory. This is important for deciding which further troubleshooting steps are required.
A very useful mdb dcmd is '::memstat' (this command can take several minutes to complete).
For more information on using the modular debugger, see the Oracle Solaris Modular Debugger Guide.
Solaris[TM] 9 Operating System or greater only! The format varies with the OS release. This example is from Solaris 11.2:
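On a live system the dcmd can be run against the running kernel (a sketch; root privileges assumed):
# echo "::memstat" | mdb -k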
Page Summary Pages Bytes %Tot
----------------- ---------------- ---------------- ----
Kernel 585584 4.4G 14%
Defdump prealloc 204802 1.5G 5%
Guest 0 0 0%
ZFS Metadata 21436 167.4M 0%
ZFS File Data 342833 2.6G 8%
Anon 56636 442.4M 1%
Exec and libs 1131 8.8M 0%
Page cache 4339 33.8M 0%
Free (cachelist) 8011 62.5M 0%
Free (freelist) 2969532 22.6G 71%
Total 4194304 32G
User Memory Usage : print out the processes using the most user memory
% prstat -s size # sorted by userland virtual memory consumption
% prstat -s rss # sorted by userland physical memory consumption
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
4051 user1 297M 258M sleep 59 0 1:35:05 0.0% mysqld/10
26286 user2 229M 180M sleep 59 0 0:05:07 0.0% java/53
27101 user2 237M 150M sleep 59 0 0:02:21 0.0% soffice.bin/5
23335 user2 193M 135M sleep 59 0 0:12:33 0.0% firefox-bin/10
3727 noaccess 192M 131M sleep 59 0 0:36:22 0.0% java/18
22751 root 165M 131M sleep 59 0 1:13:12 0.0% java/46
1448 noaccess 192M 108M sleep 59 0 0:34:47 0.0% java/18
10115 root 129M 82M sleep 59 0 0:31:29 0.0% java/41
20274 root 136M 77M stop 59 0 0:04:08 0.0% java/25
3397 root 138M 76M sleep 59 0 0:12:42 0.0% java/37
12949 pgsql 81M 70M sleep 59 0 0:09:36 0.0% postgres/1
12945 pgsql 80M 70M sleep 59 0 0:00:05 0.0% postgres/1
User Memory Usage : shows Shared Memory and Semaphores:
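This output is typically collected with the ipcs utility (a sketch; option support may vary by release):
% ipcs -a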
IPC status from
T ID KEY MODE OWNER GROUP CREATOR CGROUP CBYTES QNUM QBYTES LSPID LRPID STIME RTIME CTIME
Message Queues:
q 0 0x55460272 -Rrw-rw---- root root root root 0 0 4194304 1390 18941 14:12:20 14:12:21 10:23:32
q 1 0x41460272 --rw-rw---- root root root root 0 0 4194304 5914 1390 8:03:34 8:03:34 10:23:39
q 2 0x4b460272 --rw-rw---- root root root root 0 0 4194304 0 0 no-entry no-entry 10:23:39
T ID KEY MODE OWNER GROUP CREATOR CGROUP NATTCH SEGSZ CPID LPID ATIME DTIME CTIME
Shared Memory:
m 0 0x50000b3f --rw-r--r-- root root root root 1 4 738 738 18:50:36 18:50:36 18:50:36
m 1 0x52574801 --rw-rw---- root oracle root oracle 35 1693450240 2049 26495 10:30:00 10:30:00 18:51:13
m 2 0x52574802 --rw-rw---- root oracle root oracle 35 1258291200 2049 26495 10:30:00 10:30:00 18:51:16
m 3 0x52594801 --rw-rw---- root oracle root oracle 12 241172480 2098 14328 7:58:33 7:58:33 18:51:27
m 4 0x52594802 --rw-rw---- root oracle root oracle 12 78643200 2098 14329 7:58:32 7:58:33 18:51:27
m 5 0x52584801 --rw-rw---- root oracle root oracle 13 125829120 2125 27492 1:36:12 1:36:12 18:51:34
m 6 0x52584802 --rw-rw---- root oracle root oracle 13 268435456 2125 27487 1:36:10 1:36:11 18:51:34
m 7 0x525a4801 --rw-rw---- root oracle root oracle 15 912261120 2160 27472 1:36:09 1:36:09 18:51:40
m 8 0x525a4802 --rw-rw---- root oracle root oracle 15 268435456 2160 27467 1:36:08 1:36:09 18:51:42
m 8201 0x4d2 --rw-rw-rw- root root root root 0 32008 1528 1543 10:26:03 10:26:04 10:25:53
T ID KEY MODE OWNER GROUP CREATOR CGROUP NSEMS OTIME CTIME
Semaphores:
s 0 0x1 --ra-ra-ra- root root root root 1 16:17:35 18:50:33
s 1 0 --ra-ra---- root oracle root oracle 36 10:33:28 18:51:17
s 2 0 --ra-ra---- root oracle root oracle 13 10:33:28 18:51:27
s 3 0 --ra-ra---- root oracle root oracle 14 10:33:28 18:51:34
s 4 0 --ra-ra---- root oracle root oracle 16 10:33:27 18:51:42
s 5 0x4d2 --ra-ra-ra- root root root root 1 no-entry 10:25:53
s 6 0x4d3 --ra-ra-ra- root root root root 1 no-entry 10:25:53
User Memory Usage : lists user memory usage of all processes (except PIDs 0, 2, 3)
# pmap -x /proc/* > /var/tmp/pmap-x
Short list of the total usage of these processes:
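One way to produce such a short list from the saved output (a sketch; the egrep pattern is an assumption):
% egrep '^[0-9]+:|total Kb' /var/tmp/pmap-x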
1: /sbin/init
total Kb 2336 2080 128 -
1006: rlogin cores4
total Kb 2216 1696 80 -
1007: rlogin cores4
total Kb 2216 1696 104 -
115: /usr/sbin/nscd
total Kb 4208 3784 1704 -
-- snip --
User Memory Usage : check the usage of /tmp
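The usage is typically shown with df (a sketch):
% df -k /tmp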
Filesystem kbytes used avail capacity Mounted on
swap 1355552 2072 1353480 1% /tmp
print the biggest 10 files and dirs in /tmp
% du -akd /tmp/ | sort -n | tail -10
288 /tmp/SUNWut
328 /tmp/log
576 /tmp/ips2
584 /tmp/explo
608 /tmp/ipso
3408 /tmp/sshd-truss.out
17992 /tmp/truss.p
22624 /tmp/js
49208 /tmp
User Memory Usage : Overall Memory usage on system
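This paging breakdown is typically collected with vmstat (a sketch; the 5-second interval is arbitrary):
% vmstat -p 5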
memory page executable anonymous filesystem
swap free re mf fr de sr epi epo epf api apo apf fpi fpo fpf
19680912 27487976 21 94 0 0 0 0 0 0 0 0 0 14 0 0
3577608 11959480 0 20 0 0 0 0 0 0 0 0 0 0 0 0
3577328 11959240 0 5 0 0 0 0 0 0 0 0 0 0 0 0
3577328 11959112 38 207 0 0 0 0 0 0 0 0 0 0 0 0
3577280 11958944 0 1 0 0 0 0 0 0 0 0 0 0 0 0
User Memory Usage : Swap usage
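The listing of swap devices below is typically produced with swap -l:
% swap -l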
swapfile dev swaplo blocks free
/dev/dsk/c0t0d0s1 32,25 16 1946032 1946032
% swap -s
total: 399400k bytes allocated + 18152k reserved = 417552k used, 1355480k available
common kernel statistics
Print out all kernel statistics in a parseable format:
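% kstat -p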
kernel memory statistics:
% kstat -p -m vmem
% kstat -p -c vmem
% kstat -p | egrep zfs_file_data_buf | egrep mem_total
Alternatively to kstat, you can get kernel memory usage with kmastat.
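A sketch of generating the /var/tmp/kmastat file used below (live kernel and root privileges assumed):
# echo "::kmastat" | mdb -k > /var/tmp/kmastat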
Print the kmastat buffers:
% more /var/tmp/kmastat
cache buf buf buf memory alloc alloc
name size in use total in use succeed fail
------------------------- ------ ------ ------ --------- --------- -----
kmem_magazine_1 16 470 508 8192 470 0
kmem_magazine_3 32 970 1016 32768 1164 0
kmem_magazine_7 64 1690 1778 114688 1715 0
Look for the highest numbers in the "memory in use" column and for any numbers higher than '0' in the "alloc fail" column.
ZFS File Data:
Keep system up-to-date with latest Solaris releases and patches
Size memory requirements to actual system workload
With a known application memory footprint, such as for a database application, you might cap the ARC size so that the application will not need to reclaim its necessary memory from the ZFS cache.
Consider de-duplication memory requirements
Identify ZFS memory usage with the following command:
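(A sketch of the session: mdb is first started against the live kernel, which prints the 'Loading modules' banner shown below.)
# mdb -k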
Loading modules: [ unix genunix specfs dtrace zfs scsi_vhci sd mpt mac px ldc ip
hook neti ds arp usba kssl sockfs random mdesc idm nfs cpc crypto fcip fctl ufs
logindmux ptm sppp ipc ]
> ::memstat
Page Summary Pages Bytes %Tot
----------------- ---------------- ---------------- ----
Kernel 261969 1.9G 6%
Guest 0 0 0%
ZFS Metadata 13915 108.7M 0%
ZFS File Data 111955 874.6M 3%
Anon 52339 408.8M 1%
Exec and libs 1308 10.2M 0%
Page cache 5932 46.3M 0%
Free (cachelist) 16460 128.5M 0%
Free (freelist) 3701754 28.2G 89%
Total 4165632 31.7G
> $q
If the amount of ZFS File Data is too high on the system, you might consider limiting how much memory ZFS can consume.
For Solaris revisions prior to Solaris 11, the only way to accomplish this is to limit the ARC cache by setting zfs:zfs_arc_max in /etc/system:
set zfs:zfs_arc_max = [size]
For example, to limit the cache to 1 GB in size:
set zfs:zfs_arc_max = 1073741824
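To verify the resulting limit and the current ARC size, the arcstats kstats can be queried (a sketch; 'c_max' is the ARC maximum as reported by kstat):
% kstat -p zfs:0:arcstats:c_max
% kstat -p zfs:0:arcstats:size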
Please check the following documents on how to check and limit the ARC:
How to Understand "ZFS File Data" Value by mdb and ZFS ARC Size. (Doc ID 1430323.1)
Oracle Solaris Tunable Parameters Reference Manual
Starting with Solaris 11, a second method, reserving memory for applications, may be used to prevent ZFS from using too much memory.
The entry in /etc/system looks like this:
set user_reserve_hint_pct=60
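On a running system, the value can also be read or changed dynamically with mdb (a sketch; it assumes user_reserve_hint_pct is a 32-bit kernel variable and that writes are done with mdb -kw):
# echo "user_reserve_hint_pct/D" | mdb -k
# echo "user_reserve_hint_pct/W 0t60" | mdb -kw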
Configure the /dev/shm size on Linux
How do you configure the /dev/shm size on Linux?
To change the configuration for /dev/shm, add one line to /etc/fstab as follows.
tmpfs /dev/shm tmpfs defaults,size=8g 0 0
Here, the /dev/shm size is configured to be 8GB (make sure you have enough physical memory installed).
It will take effect at the next Linux reboot. If you would like it to take effect immediately, run:
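# mount -o remount /dev/shm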
========
For many facilities there are system calls, others are hidden behind netlink interfaces, and still others are exposed via virtual file systems such as /proc or /sys. These file systems are programming interfaces; they are not actually backed by real, persistent storage. They simply use the file system interface of the kernel as an interface to various unrelated mechanisms.
By default, systemd assigns a certain part of your physical memory to these partitions as a threshold. But what if your requirements call for changing a tmpfs partition size?
For some of the tmpfs partitions, you can change the threshold size by using fstab. For other partitions, such as /run/user/$UID, which are created at runtime, you cannot use fstab to change the tmpfs partition size.
Below is the list of tmpfs partitions available in RHEL 7:
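One way to collect this list (a sketch using df's filesystem-type filter):
# df -h -t tmpfs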
Filesystem Size Used Avail Use% Mounted on
tmpfs 187G 0 187G 0% /dev/shm
tmpfs 187G 41M 187G 1% /run
tmpfs 187G 0 187G 0% /sys/fs/cgroup
tmpfs 38G 0 38G 0% /run/user/1710
tmpfs 38G 0 38G 0% /run/user/0
Change tmpfs partition size for /dev/shm
If an application is POSIX compliant or it uses GLIBC (2.2 and above) on a Red Hat Enterprise Linux system, it will usually use /dev/shm for shared memory (shm_open, shm_unlink). /dev/shm is a temporary filesystem (tmpfs) which is mounted from /etc/fstab. Hence the standard options like "size" supported for tmpfs can be used to increase or decrease the size of tmpfs on /dev/shm (by default it is half of the available system RAM).
For example, to set the size of /dev/shm to 2GiB, change the following line in /etc/fstab:
Default:
none /dev/shm tmpfs defaults 0 0
To:
none /dev/shm tmpfs defaults,size=2G 0 0
For the changes to take effect immediately, remount /dev/shm:
# mount -o remount /dev/shm
Lastly validate the new size
# df -h /dev/shm
Filesystem Size Used Avail Use% Mounted on
tmpfs 2.0G 0 2.0G 0% /dev/shm
Change tmpfs partition size for /run
/run is a filesystem which is used by applications in the same way /var/run was used in previous versions of RHEL. Now /var/run is a symlink to the /run filesystem. Previously, early boot programs used to place runtime data in /dev under numerous hidden dot directories. The reason they used directories in /dev was because /dev was known to be available from very early in the machine boot process. Because /var/run was available very late during boot, as /var might reside on a separate file system, the /run directory was implemented.
By default you may not find any /etc/fstab entry for /run, so you can add the line below:
none /run tmpfs defaults,size=600M 0 0
For the changes to take effect immediately, remount /run:
# mount -o remount /run
Lastly, validate the new size:
# df -h /run
Filesystem Size Used Avail Use% Mounted on
tmpfs 600M 9.6M 591M 2% /run
Change tmpfs partition size for /run/user/$UID
/run/user/$UID is a filesystem used by pam_systemd to store files used by running processes for that user. In previous releases these files were typically stored in /tmp, as it was the only location specified by the FHS which is local and writable by all users. However, using /tmp can cause issues because it is writable by anyone, so access control was challenging. Using /run/user/$UID fixes the issue because it is only accessible by the target user.
You cannot change the tmpfs partition size for /run/user/$UID using /etc/fstab. The tmpfs partition size for /run/user/$UID is taken from the RuntimeDirectorySize value in /etc/systemd/logind.conf:
# grep -i runtime /etc/systemd/logind.conf
RuntimeDirectorySize=10%
By default, the threshold for these runtime directories is 10% of the total physical memory.
From the man page of logind.conf:
RuntimeDirectorySize=
Sets the size limit on the $XDG_RUNTIME_DIR runtime directory for each user who logs in. Takes a size in bytes, optionally suffixed with the usual K, G, M, and T suffixes, to the base 1024 (IEC). Alternatively, a numerical percentage suffixed by "%" may be specified, which sets the size limit relative to the amount of physical RAM. Defaults to 10%. Note that this size is a safety limit only. As each runtime directory is a tmpfs file system, it will only consume as much memory as is needed.
Modify this variable to your required value; for example, here the threshold is set to 100M:
# grep -i runtime /etc/systemd/logind.conf
RuntimeDirectorySize=100M
Next, restart the systemd-logind service.
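For example (assuming the standard systemd unit name):
# systemctl restart systemd-logind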
Change tmpfs partition size for /sys/fs/cgroup
/sys/fs/cgroup is an interface through which Control Groups can be accessed. By default there may or may not be an /etc/fstab entry for /sys/fs/cgroup, so add a new entry.
Current value for /sys/fs/cgroup
# df -h /sys/fs/cgroup
Filesystem Size Used Avail Use% Mounted on
tmpfs 63G 0 63G 0% /sys/fs/cgroup
Add the line below to your /etc/fstab to change the threshold to 2GB:
none /sys/fs/cgroup tmpfs defaults,size=2G 0 0
Remount the partition /sys/fs/cgroup
# mount -o remount /sys/fs/cgroup
Lastly validate the updated changes
# df -h /sys/fs/cgroup
Filesystem Size Used Avail Use% Mounted on
tmpfs 2.0G 0 2.0G 0% /sys/fs/cgroup