= Links =
*[http://open-zfs.org http://open-zfs.org]
*[http://www.edplese.com/samba-with-zfs.html http://www.edplese.com/samba-with-zfs.html]
*[http://wintelguy.com/zfs-calc.pl ZFS calculator]
*[https://www.raidz-calculator.com/default.aspx another zfs calculator]
*[https://bm-stor.com/index.php/blog/Linux-cluster-with-ZFS-on-Cluster-in-a-Box/ ZFS clustering]
*[https://jrs-s.net/2015/02/03/will-zfs-and-non-ecc-ram-kill-your-data/ Will ZFS and non-ECC RAM kill your data?]
*[https://docs.joyent.com/private-cloud/troubleshooting/disk-replacement ZFS troubleshooting/disk replacement]
*[https://www.high-availability.com/docs/Quickstart-ZFS-Cluster/ Creating a ZFS HA Cluster using shared or shared-nothing storage]
*[https://arstechnica.com/information-technology/2020/05/zfs-101-understanding-zfs-storage-and-performance/ ZFS 101]
*[https://arstechnica.com/gadgets/2021/06/raidz-expansion-code-lands-in-openzfs-master/ Raidz expansion]
*[https://somedudesays.com/2021/08/the-basic-guide-to-working-with-zfs/ Basic guide to working with zfs]


=Documentation=
*[https://openzfs.github.io/openzfs-docs/man/4/zfs.4.html zfs manpage]
*[http://zfsonlinux.org/ ZFS on Linux]
*[https://openzfs.org/wiki/ openzfs wiki]
*[https://wiki.gentoo.org/wiki/ZFS Gentoo wiki: ZFS]
*[https://blog.programster.org/zfs-cheatsheet ZFS cheatsheet]
*[http://wiki.freebsd.org/ZFSQuickStartGuide FreeBSD ZFS quick start guide]
*[http://www.opensolaris.org/os/community/zfs/intro/ Opensolaris ZFS intro]
*[http://en.wikipedia.org/wiki/ZFS Wikipedia: ZFS]
*[http://www.raidz-calculator.com/raidz-types-reference.aspx raidz types reference]

==ARC/Caching==
*[https://linuxhint.com/configure-zfs-cache-high-speed-io/ Configuring ZFS Cache for High-Speed IO]
*[https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSARCItsVariousSizes ZFS ARC and its various sizes]
*[http://dtrace.org/blogs/brendan/2012/01/09/activity-of-the-zfs-arc/ Activity of the ZFS ARC]
*[https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSUnderstandingARCHits Understanding ARC hits]
*[https://www.45drives.com/community/articles/zfs-caching/ ZFS Caching]
*[https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSZpoolFragmentationMeaning ZFS fragmentation]
*[https://klarasystems.com/articles/openzfs-all-about-l2arc/ OpenZFS: All about the cache vdev or L2ARC]
 
==ARC statistics==
*[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html Tuning module parameters]
 
  cat /proc/spl/kstat/zfs/arcstats
===data_size===
size of cached user data
 
===dnode_size===
size of cached dnodes (the per-object metadata for files and directories)
===hdr_size===
size of L2ARC headers stored in main ARC
 
===metadata_size===
size of cached metadata
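A quick way to pull just these fields out of arcstats (a sketch; the field names are the ones shown above, values are in bytes):
  awk '$1 ~ /^(data_size|metadata_size|hdr_size|dnode_size)$/ {printf "%-14s %8.1f MiB\n", $1, $3/1048576}' /proc/spl/kstat/zfs/arcstats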
 
==Tuning ZFS==
*[https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/index.html ZFS Performance and Tuning]
*[https://linuxhint.com/configure-zfs-cache-high-speed-io/ Configuring ZFS Cache for High-Speed IO]
 
=Tools=
*[https://github.com/asomers/ztop ztop]
*[https://github.com/jimsalterjrs/ioztat ioztat]
*[https://cuddletech.com/2008/10/explore-your-zfs-adaptive-replacement-cache-arc/ arc_summary]
*[https://github.com/richardelling/zfs-linux-tools zfs-linux-tools] kstat-analyzer is rather helpful
 
=Processes=
==arc_evict==
From the comment on this function in the ZFS source:

Evict buffers from the list until we've removed the specified number of bytes. Move the removed buffers to the appropriate evict state. If the recycle flag is set, then attempt to "recycle" a buffer:
*look for a buffer to evict that is `bytes' long.
*return the data block from this buffer rather than freeing it.
This flag is used by callers that are trying to make space for a new buffer in a full arc cache.

This function makes a "best effort". It skips over any buffers it can't get a hash_lock on, and so may not catch all candidates. It may also return without evicting as much space as requested.
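To see how eviction is behaving on a running system, the evict counters in arcstats are a reasonable first stop (a sketch; exact counter names vary a little between OpenZFS versions):
  grep ^evict /proc/spl/kstat/zfs/arcstats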
 
==arc_prune==
On Linux, the arc_prune threads ask the VFS layer to drop cached dentries and inodes that pin ARC metadata buffers, so those buffers become evictable. Sustained arc_prune activity usually points at metadata pressure (see '''arc_meta_used''' vs '''arc_meta_limit''').
 
=Commands=
 
==Getting arc statistics==
  arc_summary
Tip, for details use:
  arc_summary -d
There is also:
  cat /proc/spl/kstat/zfs/arcstats
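For a rolling view there is also the arcstat tool shipped with OpenZFS; the interval argument works like vmstat's:
  arcstat 5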
 
 
==Getting IO statistics==
Per-vdev IO statistics, repeating every 300 seconds:
  zpool iostat -v 300
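zpool iostat can also show per-vdev latency, assuming a reasonably recent OpenZFS (0.7 or later):
  zpool iostat -vl 5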
 
=Terms and acronyms=
==vdev==
'''V'''irtual '''Dev'''ice.
 
*[https://wiki.archlinux.org/title/ZFS/Virtual_disks ZFS Virtual disks]
==ARC==
'''A'''daptive '''R'''eplacement '''C'''ache
 
The portion of RAM used to cache data to speed up read performance.
 
==L2ARC==
'''L'''evel '''2''' '''A'''daptive '''R'''eplacement '''C'''ache
 
"L2ARC is usually considered if hit rate for the ARC is below 90% while having 64+ GB of RAM"
 
SSD cache
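Adding an L2ARC is just attaching a cache vdev to the pool (a sketch; pool and device names here are examples):
  # pool name and device path are examples
  zpool add poolname cache /dev/disk/by-id/ata-EXAMPLE-SSD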
 
==DMU==
Data Management Unit
 
 
==MFU==
Most Frequently Used
 
==MRU==
Most Recently Used
 
 
==Scrubbing==
Checking disk/data integrity. To see when the pool was last scrubbed:
  zpool status <poolname> | grep scrub
and to start one:
  zpool scrub <poolname>
probably taken care of by cron.
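A sketch of a cron entry for a regular scrub (pool name and schedule are examples; many distros ship a similar cron job or systemd timer already):
  # scrub 'tank' every Sunday at 02:00
  0 2 * * 0  /sbin/zpool scrub tank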
==ZIL==
'''Z'''FS '''I'''ntent '''L'''og

The space where synchronous writes are logged before the confirmation is sent back to the client.
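The ZIL can be moved off the pool disks onto a dedicated (ideally mirrored) log vdev, commonly called a SLOG. A sketch; device names are examples:
  # device paths are examples
  zpool add poolname log mirror /dev/disk/by-id/ata-SSD1 /dev/disk/by-id/ata-SSD2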


==prefetch==
*[https://svennd.be/tuning-of-zfs-module/ Tuning of the ZFS module]
*[https://cuddletech.com/2009/05/understanding-zfs-prefetch/ Understanding ZFS prefetch]
*[https://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSARCStatsAndPrefetch Some basic ZFS ARC statistics and prefetching]


= HOWTO =
==Create zfs filesystem==
  zfs create poolname/fsname
This also creates the mountpoint.
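Properties can be set at creation time (a sketch; the property values and the mountpoint are examples):
  # property values and mountpoint are examples
  zfs create -o compression=lz4 -o mountpoint=/data poolname/fsname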
==Add vdev to pool==
  zpool add mypool raidz1 sdg sdh sdi
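zpool add supports a dry run, which is worth doing since removing a top-level vdev again is not always possible:
  # -n only displays the resulting layout, nothing is changed
  zpool add -n mypool raidz1 sdg sdh sdi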


== Replace disk in zfs ==
Get information first. Name of the disk:
  zpool status
Find the uid of the disk to replace and take it offline:
  zpool offline poolname ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M5RLZC6V
Get the disk guid:
  zdb
  guid: 15233236897831806877
Get the list of disks by id:
  ls -al /dev/disk/by-id
Save the id, shut down, replace the disk, boot, then find the new disk:
  ls -al /dev/disk/by-id
Run the replace command. The id is the guid of the old disk, the name is that of the new disk:
  zpool replace tank 13450850036953119346 /dev/disk/by-id/ata-ST4000VN000-1H4168_Z302FQVZ
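The replace kicks off a resilver; progress shows up in:
  zpool status tank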
 
==Showing information about ZFS pools and datasets==
===Show pools with sizes===
  zpool list
or
  zpool list -H -o name,size

===Show reservations on datasets===
  zfs list -o name,reservation
 
==Swap on zfs==
https://askubuntu.com/questions/228149/zfs-partition-as-swap
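The usual recipe from that thread is a zvol with swap-friendly properties (a sketch; size and pool/dataset name are examples). Note that swap on a zvol has a history of deadlocking under memory pressure, so test carefully:
  # size and pool/dataset name are examples
  zfs create -V 4G -b $(getconf PAGESIZE) -o logbias=throughput -o sync=always -o primarycache=metadata rpool/swap
  mkswap /dev/zvol/rpool/swap
  swapon /dev/zvol/rpool/swap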
 
==vdevs==
===multiple vdevs===
Multiple vdevs in a zpool get striped.
Note that ZFS does not rebalance existing data when a vdev is added; new writes are biased towards the emptier vdev until the pool evens out.
 
===invalid vdev specification===
Probably means you need to pass -f (force).

===show balance between vdevs===
  zpool iostat -v 'pool' [interval in seconds]
or, to see allocation per vdev:
  zpool list -v 'pool'


== Tuning arc settings ==
See [https://openzfs.github.io/openzfs-docs/Performance%20and%20Tuning/Module%20Parameters.html Tuning ZFS module parameters]

===zfs_arc_max===
On Linux the ARC defaults to 50% of RAM; this is the case when zfs_arc_max is 0:
  cat /sys/module/zfs/parameters/zfs_arc_max
  0
The effective limit shows up as c_max:
  grep c_max /proc/spl/kstat/zfs/arcstats
To change it at runtime (here to 5 GiB):
  echo 5368709120 > /sys/module/zfs/parameters/zfs_arc_max
and, to make it persistent, add to /etc/modprobe.d/zfs.conf:
  options zfs zfs_arc_max=5368709120

If the ARC does not shrink right away, maybe you need:
  echo 3 > /proc/sys/vm/drop_caches

===Tune zfs_arc_dnode_limit_percent===
Assuming zfs_arc_dnode_limit = 0:
  echo 20 > /sys/module/zfs/parameters/zfs_arc_dnode_limit_percent
In /etc/modprobe.d/zfs.conf:
  options zfs zfs_arc_dnode_limit_percent=20
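To see whether dnodes are actually pressing that limit, compare the dnode fields in arcstats:
  grep dnode /proc/spl/kstat/zfs/arcstats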


= FAQ =
==Arc metadata size exceeds maximum==
So '''arc_meta_used''' > '''arc_meta_limit'''
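To check the current values (on OpenZFS versions that still expose these fields):
  grep arc_meta /proc/spl/kstat/zfs/arcstats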


== show status and disks ==
  zpool status


== show drives/pools ==
  zfs list

== check raid level ==
  zfs list -a
==Estimate raidz speeds==
  raidz1: N/(N-1) * IOPS
  raidz2: N/(N-2) * IOPS
  raidz3: N/(N-3) * IOPS
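Taking the rule above at face value for a 6-disk vdev: raidz1 gives 6/(6-1) = 1.2x, raidz2 gives 6/(6-2) = 1.5x, and raidz3 gives 6/(6-3) = 2x the IOPS of a single disk. (For what it's worth, the more common rule of thumb is that a raidz vdev delivers roughly the random IOPS of a single member disk, while streaming throughput scales with the number of data disks.)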
==VDEV cache disabled, skipping section==
Looks like you just don't have an L2ARC cache device.
