Hacker News

istjohn

I'm getting started with ZFS just now. The learning curve is steeper than I expected. I would love to have a dumbed down wrapper that made the common case dead-simple. For example:

- Use sane defaults for pool creation. ashift=12, lz4 compression, xattr=sa, acltype=posixacl, and atime=off. Don't even ask me.

- Make encryption just on or off instead of offering five or six options

- Generate the encryption key for me, set up the systemd service to decrypt the pool at start up, and prompt me to back up the key somewhere

- `zfs list` should show if a dataset is mounted or not, if it is encrypted or not, and if the encryption key is loaded or not

- No recursive datasets and use {pool}:{dataset} instead of {pool}/{dataset} to maintain a clear distinction between pools and datasets.

- Don't make me name pools or snapshots. Assign pools the name {hostname}-[A-Z]. Name snapshots {pool name}_{datetime created} and give them numerical shortcuts so I never have to type that all out

- Don't make me type disk IDs when creating pools. Store metadata on the disk so ZFS doesn't get confused if I set up a pool with `/dev/sda` and `/dev/sdb` references and then shuffle around the drives

- Always use `pv` to show progress

- Automatically set up weekly scrubs

- Automatically set up hourly/daily/weekly/monthly snapshots and snapshot pruning

- If I send to a disk without a pool, ask for confirmation and then create a new single disk pool for me with the same settings as on the sending pool

- collapse `zpool` and `zfs` into a single command

- Automatically use `--raw` when sending encrypted datasets, default to `--replicate` when sending, and use `-I` whenever possible when sending

- Provide an obvious way to mount and navigate a snapshot dataset instead of hiding the snapshot filesystem in a hidden directory
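Taken together, the pool-creation defaults in the list above would boil down to something like this (a hypothetical sketch of what the wrapper would run; `tank` and the disk paths are placeholders):

```shell
# Hypothetical wrapper sketch: create a pool with the sane defaults
# listed above, so the user never has to remember the flags.
# "tank" and the by-id device paths are placeholders.
zpool create \
    -o ashift=12 \
    -O compression=lz4 \
    -O xattr=sa \
    -O acltype=posixacl \
    -O atime=off \
    tank mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2
```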

gruturo

Uh, you are mixing sensible suggestions (ashift=12 as a default, {pool}:{dataset} syntax, though it would be hard to change so late) with very, very opinionated ones that break use cases you may not be aware of.

Naming pools after hostnames: I have pools on a SAN which can be imported by more than one host.

Weekly scrubs, periodic snapshots, periodic pruning: This is really the job of the OS' scheduler (an equally opinionated view, I admit)

collapsing zpool and zfs commands - sure but why? so you can have zfs -pool XXXX and zfs -volume XXXX?

No recursive datasets? I have use cases where it's very useful.

`zfs list` should show if a dataset is mounted or not, if it is encrypted or not, and if the encryption key is loaded or not: Fully agree!

Don't make me type disk IDs when creating pools: You can address them in 3-4 different ways (by id, by WWN, by label, by sdX etc), and you have to specify in _some_ way which disks you want to go there, so not sure what's the point here.

Store metadata on the disk so ZFS doesn't get confused if I set up a pool with `/dev/sda` and `/dev/sdb` references and then shuffle around the drives: Already happening. Swap a few drives around and import the pool, it will find them.

Some of your suggestions are genuinely OK, at least as defaults, but some indicate you aren't really considering much outside your own usage pattern and needs. ZFS caters to a lot more people than you.

istjohn

I'm not suggesting zfs itself change. I'd like a porcelain for people with very simple needs where zfs is almost overkill, for people who just want better data integrity and quicker backups on their main machine, for example.

I think zpool is unnecessary as an additional command. For example, `zfs scrub`, `zfs destroy [pool | dataset]`, `zfs add`, `zfs remove` would all have clear meanings. There may be a couple commands that would need explicit disambiguation with a flag like `zfs create`.
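The dispatch could be a thin shell function; here is a minimal sketch (the `zfsx` name and the exact subcommand split are illustrative, not a real tool, and it echoes instead of executing):

```shell
# Hypothetical dispatcher: route pool-level subcommands to zpool,
# everything else to zfs. Echoes the resolved command for illustration
# instead of actually running it.
zfsx() {
    case "$1" in
        scrub|iostat|attach|detach|online|offline|clear|replace)
            echo zpool "$@" ;;
        *)
            echo zfs "$@" ;;
    esac
}
```

So `zfsx scrub tank` would resolve to `zpool scrub tank`, while `zfsx list` resolves to `zfs list`; ambiguous commands like `create` would still need a disambiguating flag, as noted above.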

gruturo

I missed completely that you were discussing just a wrapper and not touching the original ZFS. Completely my bad! Obviously some of my objections become quite moot in that light.

simondotau

> ZFS caters to a lot more people than you.

And under the OP's proposal, those people would continue to use ZFS entirely unaffected. The OP wasn't proposing changing the behaviour of ZFS, but rather "wrapping" this defined set of defaults into a well-defined recipe which could be used by people who aren't so opinionated.

This "dumbed down wrapper" wouldn't even need to be called ZFS, to avoid confusion. Personally I'd like to propose the name ZzzFS: which is ZFS made so simple you can do it in your sleep...

istjohn

I like that. I was thinking Zoofus, easy enough for a doofus.

mceachen

Would also accept EZFS.

traceroute66

> so ZFS doesn't get confused if I set up a pool with `/dev/sda` and `/dev/sdb`

To be fair, that's not ZFS's problem, that is your problem for not keeping up with the times. PEBCAK.

For quite some time now, Linux has had fully-qualified references, e.g.: `/dev/disk/by-id/ata-$manufacturer-$serial-$whatever`

That is what you should be using when building your pools.

istjohn

My problem is that that's a pain to type out. I've read that it's necessary (and others have said it's not), but it'd be more convenient to just do `mirror /dev/sda /dev/sdb` than `mirror /dev/disk/by-id/ata-WDC_WDBNCE5000PNC_21365M802768 /dev/disk/by-id/ata-Samsung_SSD_850_EVO_500GB_S2RANX0HA52854X`

fluidcruft

Working with drive serial numbers is great. I would ban using /dev/sda etc. Nothing like being able to read the sticker on the disk to make sure you pulled the right one. Or that "sda" is actually what you think it is when you're booted from a recovery disk.

phpisthebest

Tab complete would not add that much overhead in typing... I have never understood the complaint that verbosity is bad. We have systems that enable you to be both verbose and quick, win-win...

I believe defining a storage array based on an immutable ID is far better than a dynamic OS-assigned ID like /dev/sda, at a cost of what, 5 keystrokes more when using tab complete?

deadbunny

Create the pool using `/dev/sda` etc then:

    sudo zpool export tank
    sudo zpool import -d /dev/disk/by-id -aN

tomxor

> I've read that it's necessary (and other's have said it's not)

It is not, it's outdated advice that stubbornly persists. Try it yourself if you are not convinced, create a pool using /dev/sdx and change the device order, zfs will still find it, because why wouldn't they just dereference sdx and store UUIDs?

quags

One and done. Once you type it, you can use `zfs list`/`zpool status` from there. There is nothing wrong with using sda and sdb, but if you need to replace a drive in the future, it's just easier with the by-id label.

jlouis

True, but in ZFS, you don't usually use these names unless you are shoving disks around. In which case it's nice to have an identity on the disk.

Most of your work is a logical naming scheme on top.

eitland

the tab key is my friend here.

tuetuopay

OTOH you have BTRFS where you can use whatever you want, and just find all disks using the filesystem ID to join the array. Works like a charm and you never have to think about it.

craftkiller

ZFS does that too, the top-level comment was just uninformed. You can shuffle your disks around, move them to other machines, and even move the contents to different physical drives and ZFS will still automatically find them all using metadata written to the disks called the "zpool label".

rollcat

A lot of these suggestions are heavily opinionated. Which is not necessarily bad, but they seem to mess with existing conventions just for the sake of it (why {pool}:{dataset}?).

> Don't make me name [...] snapshots.

You might like this little tool I wrote: https://github.com/rollcat/zfs-autosnap

You put "zfs-autosnap snap" in cron hourly (or however often you want a snapshot), and "zfs-autosnap gc" in cron daily, and it takes care of maintaining a rolling history of snapshots, per the retention policy.
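In crontab form, that setup might look like this (the minute/hour fields and install path are assumptions):

```shell
# Hypothetical crontab entries for zfs-autosnap as described above.
# The schedule fields and /usr/local/bin path are assumptions.
0 * * * *  /usr/local/bin/zfs-autosnap snap   # hourly snapshot
0 4 * * *  /usr/local/bin/zfs-autosnap gc     # daily retention cleanup
```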

It's not hard writing simple ZFS command wrappers, feel free to take my code and make your own tools.

istjohn

Fair point on the {pool}:{dataset} thing. I just don't like that the same {pool} name refers to both a pool of vdevs and the top-level dataset on that pool. It makes it that little bit harder to grok the distinction. Perhaps there's a better way to emphasize that difference.

E39M5S62

The distinction is easy. If you're using the 'zpool' binary, you're operating on the pool. If you use the 'zfs' binary, you're operating on the dataset.

Freaky

Nice, you reminded me of my own incomplete Rust rewrite of the Ruby ZFS snapshot script I wrote about a decade ago, and this bit of yak shaving that ended up derailing me: https://github.com/Freaky/command-limits

I ended up finishing neither, and should pick them back up!

(I snapshot in big chunks with xargs to try to minimise temporal smear - snapshots created in the same `zfs snapshot` command are atomic)
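The xargs batching trick might look roughly like this (a sketch; the snapshot name format and dataset selection are illustrative):

```shell
# Snapshot many datasets in as few `zfs snapshot` invocations as
# possible; all snapshots created by a single invocation are atomic
# with respect to each other. The "auto-" name prefix is illustrative.
stamp=$(date +%Y-%m-%d-%H%M)
zfs list -H -o name -t filesystem \
    | sed "s/\$/@auto-${stamp}/" \
    | xargs zfs snapshot
```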

cduzz

I've read that one of the last tasks of a blacksmith apprentice is to make all the tools a workaday blacksmith would need, i.e. your last lesson is to make your own anvil, your own hammers, tongs, etc.

At $DAYJOB I wrote a bunch of scripts to mechanize building ZFS arrays for whatever expected deployment I'd imagined on that day. Among the tasks was to make luks encrypted volumes on which to put the zvols, standardize the naming schemes, sane defaults like ashift=12, lz4 compression, etc. (This was well before encryption was part of ZFS; I haven't updated the scripts to support native ZFS encryption since the luks approach has not really been a problem.)

I don't remember many of these flags now, but have a script as reference for documentation, and others on the team don't need to know much about ZFS besides run make-zfs-big-mirror or make-big-zfs-undundant-raid0 and magic happens.

Eventually maybe even that stuff will be automated away by our provisioning, if we ever are in a position to provision systems more than 20 times per year.
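The luks-underneath-ZFS layering from the pre-native-encryption days can be sketched like this (device names, pool name, and key handling are placeholders; a real script would also handle keyfiles and unlock at boot):

```shell
# Hypothetical sketch: ZFS on top of LUKS, as was common before
# native ZFS encryption existed. Paths and names are placeholders.
cryptsetup luksFormat /dev/disk/by-id/ata-DISK1
cryptsetup open /dev/disk/by-id/ata-DISK1 crypt0
zpool create -o ashift=12 -O compression=lz4 tank /dev/mapper/crypt0
```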

godelski

Honestly this is one of the reasons I love ansible. I create scripts while doing it the first time. Then the rest is hitting go and forgetting. The scripts are the documentation, like you said. The hell if I'm remembering all those magic incantations. You only ever will remember what you frequently use, the rest is for off-brain storage.

feitingen

Last time I tested performance (~2 years ago), zfs on luks performed better than zfs encrypted datasets, on sequential reads, almost twice as good. This was on particularly slow hard drives.

Not sure why, and I should probably make the test reproducible.

mustache_kimono

As others have noted, these are really opinionated suggestions. And while it's perfectly fine to have an opinion, many of these range from "this isn't the way I'm used to Linux doing it" to the actually objectionable.

The ones I find most personally objectionable:

> - Don't make me name pools or snapshots. Assign pools the name {hostname}-[A-Z]. Name snapshots {pool name}_{datetime created} and give them numerical shortcuts so I never have to type that all out

Not naming pools is just bonkers. You don't create pools often enough to not simply name them.

Re: not naming snapshots, you could use `httm` and `zfs allow` for that[0]:

    $ httm -S .
    httm took a snapshot named: rpool/ROOT/ubuntu_tiebek@snap_2022-12-14-12:31:41_httmSnapFileMount

> - collapse `zpool` and `zfs` into a single command

`zfs` and `zpool` are just immaculate Unix commands, each of which has half a dozen subcommands. One of the smartest decisions the ZFS designers made was not giving you a more complicated single administrative command.

> - Provide an obvious way to mount and navigate a snapshot dataset instead of hiding the snapshot filesystem in a hidden directory

Again -- you can do this very easily via `zfs mount`, but you'll have to trust me that a stable virtual interface also makes it very easy to search for all file versions, something which is much more difficult to achieve with btrfs et al. See again `httm` [1].

[0]: https://kimono-koans.github.io/opinionated-guide/#dynamic-sn...

[1]: https://github.com/kimono-koans/httm

formerly_proven

> I would love to have a dumbed down wrapper that made the common case dead-simple.

TrueNAS

highpost

This. ZFS for Dummies == TrueNAS.

barrkel

It's interesting; I'm the kind of person who feels uncomfortable using something without understanding the shape of the stack underneath it. A magic black box is anathema; I want to know how to use the thing with mechanical sympathy, aligning my work with the grain of the implementation details. I want to understand error messages when they happen. I want to know how to diagnose something when it breaks, especially when it's something as important as my data.

I like how ZFS is put together. I've been running it for about 13 years. I started with Nexenta, a Solaris fork with Debian userland. I've ported my pool twice, had a bunch of HDD failures, and haven't lost a single byte.

I agree with you on most of the encryption stuff. That is very recent and not fully integrated and the user experience isn't fully baked. I don't agree on unifying zpool and zfs; for a good long time, I served zvols from my zpool, and dividing up storage management and its redundancy configuration from file system management makes sense to me. Similarly, recursive datasets make sense; you want inheritance or something very like it when managing more than a handful of filesystems. I don't agree on pool names (why anyone would want ordinal pool naming and just replicate the problem you just stated re sda, sdb etc. is a bit mysterious), and I don't agree on snapshots (to me this is like preferring commit IDs in git to branch and tag names - manually created snapshots outside periodic pruning should be named).

ZoL on Ubuntu does periodic scrubs by default now. Sometimes I have to stop them because they noticeably impact I/O too much. Periodic snapshots is one of the first cronjobs I created on Nexenta, and while there's plenty of tooling, it also needs configuration - if you are not aware of it, it's an easy way to retain references to huge volumes of data, depending on use case. Not all of my ZFS filesystems are periodically snapshotted the same way.

istjohn

I see the utility in recursive datasets and I wouldn't want them to go away, but if I were creating a zfs-for-dummies I wouldn't include the functionality. You'd have to drop down to the raw zpool/zfs commands to get that.

Likewise, I appreciate being able to name snapshots, but it's annoying to have to manually name the snapshot I create in order to zfs send. The solution there is probably to not make me take a manual snapshot in the first place. `zfs send` should automatically make the snapshot for me. But in general, I don't see why zfs can't default to a generic name and let me override it with a `--name` flag.
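The "snapshot for me before sending" idea could be a two-line wrapper; a sketch, where the dataset names, the generated snapshot name format, and the receiving host are all placeholders:

```shell
# Hypothetical: take a generically named snapshot, then send it,
# so the user never names the snapshot manually. All names are
# placeholders.
snap="tank/data@send_$(date +%Y-%m-%d-%H%M%S)"
zfs snapshot "$snap"
zfs send -w "$snap" | ssh backuphost zfs recv -u backup/data
```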

Giving it more thought, I think I would keep pool naming. What I don't like is the possibility of having pool name collisions which isn't something you have to think about with, say, ext4 filesystems. But the upshot, as you point out, is with zfs you aren't stuck using sda, sdb, etc.

fluidcruft

What's the lifecycle of an automatically-created snapshot? i.e. is the snapshot garbage? How does it get collected? Things like autosnapshot implement policy to take care of itself but... zfs send one-offs?

zfs send is a strange beast. It's more like differential tar than rsync (i.e. a stream intended for linear backup). zfs is cool because it unifies backup/restore. Have you tried restoring differential tar from unlabeled tapes?

gigatexal

I’m on the other end of the spectrum. I like knowing the flags and settings I use to create the pools.

For snapshots and replication take a look at sanoid (https://github.com/jimsalterjrs/sanoid).

vermaden

Other useful things about ZFS:

- get to know the difference between zpool-attach(8) and zpool-replace(8).

- this one will tell you where your space is used:

    # zfs list -t all -o space
    NAME                      AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
    (...)

- ZFS Boot Environments is the best feature to protect your OS before major changes/upgrades

--- this may be useful for a start: https://is.gd/BECTL

- this command will tell you all history about ZFS pool config and its changes:

    # zpool history poolname
    History for 'poolname':
    2023-06-20.14:03:08 zpool create poolname ada0p1
    2023-06-20.14:03:08 zpool set autotrim=on poolname
    2023-06-20.14:03:08 zfs set atime=off poolname
    2023-06-20.14:03:08 zfs set compression=zstd poolname
    2023-06-20.14:03:08 zfs set recordsize=1m poolname
    (...)

- the guide misses one important info:

  --- you can create 3-way mirror - requires 3 disks and 2 may fail - still no data lost

  --- you can create 4-way mirror - requires 4 disks and 3 may fail - still no data lost

  --- you can create N-way mirror - requires N disks and N-1 may fail - still no data lost

  (useful when data is most important and you do not have that many slots/disks)
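Creating an N-way mirror is just a matter of listing more disks after `mirror` (a sketch, using FreeBSD-style device names like those in the `zpool history` output above):

```shell
# 3-way mirror: 3 disks, any 2 may fail with no data loss.
zpool create poolname mirror ada0p1 ada1p1 ada2p1
# 4-way mirror: 4 disks, any 3 may fail.
# zpool create poolname mirror ada0p1 ada1p1 ada2p1 ada3p1
```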

65a

N-way mirrors also have the property that ZFS can shard reads across them, which mattered a lot on spinning rust, since iops can be limited.

customizable

We have been running a large multi-TB PostgreSQL database on ZFS for years now. ZFS makes it super easy to do backups, create test environments from past snapshots, and saves a lot of disk space thanks to built-in compression. In case anyone is interested, you can read our experience at https://lackofimagination.org/2022/04/our-experience-with-po...

photon_lines

Nice - thanks for the info! I had no idea about the Toy Story 2 fiasco as well so this was a great read :)

customizable

Thanks, glad you liked it.

qwertox

FreeBSD's Handbook on ZFS [0] and Aaron Toponce's articles [1] were what helped me the most when getting started with ZFS

[0] https://docs.freebsd.org/en/books/handbook/zfs/

[1] https://pthree.org/2012/04/17/install-zfs-on-debian-gnulinux...

CTDOCodebases

I love FreeBSD's docs.

I had an old HP Microserver with 1GB of ECC RAM lying around so I installed FreeBSD on it. I had 5 old 500GB hard drives lying around too so I set them up in a 5x mirror with help from the FreeBSD Handbook. First time using FreeBSD and it was a breeze.

philsnow

One of the diagrams under the bit about snapshotting has a typo reading "snapthot" and I immediately thought it was talking about instagram.

(I realize now after writing it that maybe snapchat should have occurred to me first, but I have never used it)

tomxor

I recently rebuilt a load of infrastructure (mainly LAMP servers) and decided to back them all with ZFS on Linux for the benefit of efficient backup replication and encryption.

I've been using ZFS in combination with rsync for backups for a long time, so I was fairly comfortable with it... and it all worked out, but it was a way bigger time sink than I expected - because I wanted to do it right - and there is a lot of misleading advice on the web, particularly when it comes to running databases and replication.

For databases (you really should at minimum do basic tuning like block size alignment), by far the best resource I found for mariadb/innoDB is from the Let's Encrypt people [0]. They give reasons for everything and cite multiple sources, which is gold. If you search around the web elsewhere you will find endless contradicting advice, anecdotes and myths accompanied by incomplete and baseless theories. Ultimately you should also test this stuff and understand everything you tune (it's ok to decide not to tune something).

For replication, I can only recommend the man pages... yeah, really! ZFS gives you solid replication tools, but they are too agnostic; they are like git plumbing. They don't assume you're going to be doing it over SSH (even though that's almost always how it's used), so you have to plug it together yourself, and this feels scary at first, especially because you probably want it to be automated, which means considering edge cases. Which is why everyone runs to something like syncoid.

But there's something horrible I discovered with replication scripts like syncoid: they don't use ZFS's send --replicate mode! They try to reimplement it in perl, for "greater flexibility", but incompletely. This is maddening when you are trying to test this stuff for the first time and find that all of the encryption roots break when you do a fresh restore, and that not all dataset properties are automatically synced. ZFS takes care of all of this if you simply use the built-in recursive "replicate" option.

It's not that hard to script manually once you commit to it. Just keep it simple; don't add a bunch of unnecessary crap into the pipeline like syncoid does (it actually slows things down if you test). Just use pv to monitor progress and it will fly.
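A minimal form of that pipeline, assuming key-based SSH, an existing earlier snapshot on the receiver, and placeholder names throughout:

```shell
# Minimal replication sketch: recursive --replicate send (-R), raw
# (-w) so encrypted datasets travel without being decrypted,
# incremental (-I) from the last common snapshot, and pv for progress.
# Pool, snapshot, and host names are placeholders.
zfs send -R -w -I tank@prev tank@now \
    | pv \
    | ssh backuphost zfs recv -Fu backuptank
```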

I might publish my replication scripts at some point because I feel like there are no good functional reference scripts for this stuff that deal with the basics without going nuts and reinventing replication badly like so many others.

[0] https://github.com/letsencrypt/openzfs-nvme-databases

sgarland

They mention tuning io_capacity and io_capacity_max, which the MySQL docs suggest is useful, until you click through to see what the parameters actually do [0]. They control background IO actions like change buffer merges, and in fact will take IO away from the main process that needs it for work.

IME with a decently busy (120K QPS) MySQL DB, you do not need to touch either of these. If you think you do, monitor the time to fill the redo log, and the dirty page percent in the buffer pool. There are probably other parameters you should tune instead.

[0] https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.ht...

tomxor

To be fair they aren't going nuts with them, I've seen worse examples. But I agree with you in principle, it's not necessary, and potentially harmful to overall performance. It also doesn't really belong in a ZFS tuning guide.

yjftsjthsd-h

> For databases (you really should at minimum do basic tuning like block size alignment),

One unexpected thing to check (and do check, because your mileage will vary) - the suggestion is usually to align record sizes, which in practice tends to mean reducing the record size on the ZFS filesystem holding the data. I don't doubt that this is at some level more efficient, but I can empirically tell you that it kills compression ratios. Now the funny knock-on effect is that it can - and again, I say can because it will vary by your workload - but it can actually result in worse throughput if you're bottlenecked on disk bandwidth, because compression lets you read/write data faster than the disk is physically capable of, so killing that compression can do bad things to your read/write bandwidth.
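Aligning the record size and then checking what it did to your compression ratio is straightforward (the `tank/db` dataset name is a placeholder; 16K matches InnoDB's default page size):

```shell
# Match recordsize to InnoDB's 16K pages on the database dataset,
# then inspect the observed compression ratio. "tank/db" is a
# placeholder dataset name.
zfs set recordsize=16k tank/db
zfs set compression=lz4 tank/db
zfs get recordsize,compression,compressratio tank/db
```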

tomxor

I know what you're getting at, I wondered the same thing, but my results were the opposite of what I expected for compression.

I enabled lz4 compression and set recordsize for database datasets to 16k to match innoDB... turns out even at 16k my databases are extremely compressible 3-4x AFAIR (I didn't write the DB schema for the really big DBs, they are not great, and I suspect that there is a lot of redundant data even within 16k of contiguous data)... maybe I could get even more throughput with larger record sizes, but seems unlikely.

As you say, mileage will vary, it's subjective, but then I wasn't using compression before ZFS, so I don't have a comparison. I have only done basic performance testing, overall it's an improvement over ext4, but I've not been trying to fine tune it, I'm just happy to not have made it worse so far while gaining ZFS.

yjftsjthsd-h

Oh, nice. In my case I sort of forgot to set the record sizes, got great compression, and then realized I'd missed the record sizes so tried lowering/matching them and watched compression ratio drop to... I think all the way to 1.0:( So that was a natural experiment, as it were. But of course it totally depends on the exact size and most of all your data.

guerby

I started to use ZFS (on Linux) a few years ago and it went smoothly.

My only surprise was the volblocksize default, which is pretty bad for most RAIDZ configurations: you need to increase it to avoid losing 50% of raw disk space...

Articles touching this topic :

https://jro.io/nas/#overhead

https://openzfs.github.io/openzfs-docs/Basic%20Concepts/RAID...

https://www.delphix.com/blog/zfs-raidz-stripe-width-or-how-i...

And you end up on one of the ZFS "spreadsheets" out there:

ZFS overhead calc.xlsx https://docs.google.com/spreadsheets/d/1tf4qx1aMJp8Lo_R6gpT6...

RAID-Z parity cost https://docs.google.com/spreadsheets/d/1pdu_X2tR4ztF6_HLtJ-D...

rhinoceraptor

In my opinion, the 50% efficiency of mirror vdevs is a fair price to pay for the simplicity and greatly improved performance. You can grow RAIDZ pools now, but it's still a lot more complicated and doesn't perform as well.

tweetle_beetle

Might not remember the details correctly, but when I was younger and stupider I read a lot from fervent fans about how great one of the open source NAS OSs (FreeNAS?) and ZFS were. I bought a very low spec second hand HP micro server on eBay and jumped straight in without really knowing what I was doing. I asked a few questions on the community forum but the vast majority of answers were "Have you read the documentation?!" "Do you have enough RAM?!".

The documentation in question was a PowerPoint presentation with difficult to read styling, somewhat evangelical language, lots of assumptions about knowledge and it was not regularly updated. It was vague on how much RAM was required, mainly just focused on having as much as possible. Needless to say I ignored all the red flags about the technology, the hype and my own knowledge and lost a load of data. Lots of lessons learnt.

andruby

Can you roughly remember how long ago that was? ZFS has been around since the early 2000s, with FreeNAS starting in 2005 IIRC.

The filesystem has gotten a lot more stable, and imo the documentation clearer.

That said, it's "more powerful and more advanced" than traditional journaling filesystems like ext3, and thus comes with more ways to shoot yourself in the foot.

unethical_ban

Some additional points for posterity, in case it isn't driven home here:

- All redundancy in ZFS is built into the vdev layer. Zpools are created from one or more vdevs, and no matter what, if you lose any single vdev in a zpool, the zpool is permanently destroyed.

- Historically, RAIDZs (parity RAIDs) could not be expanded by adding disks. The only way to grow a RAIDZ was to replace each disk in the array one at a time with a larger disk (and hope no disks fail during the rebuild). So in my very amateur opinion, I would only consider doing a RAIDZ if it is something like a RAIDZ2 or 3 with a large number of disks. For n<=6, and if the budget can stand it, I would do several mirrored vdevs. (Again, as an amateur I am less familiar with the read/write performance of the various RAID levels, so do more research for prod.)
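The several-mirrored-vdevs layout suggested above would be created like this (a sketch; pool name and by-id device paths are placeholders):

```shell
# Pool of three mirrored vdevs. Losing one disk per mirror is fine;
# losing both disks of any single mirror destroys the whole pool.
# All names are placeholders.
zpool create tank \
    mirror /dev/disk/by-id/ata-A /dev/disk/by-id/ata-B \
    mirror /dev/disk/by-id/ata-C /dev/disk/by-id/ata-D \
    mirror /dev/disk/by-id/ata-E /dev/disk/by-id/ata-F
```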

sgarland

Pool of mirrors is usually the safer way, yes.

If and only if you (a) have full, on-site backups and (b) are fairly sure of your abilities and monitoring, then I can suggest RAIDZ1. I have a pool of 3x3 drives, which ships its snapshots a few U down in my rack to the backup target that wakes up daily, and has a pool of 3x4 drives, also in RAIDZ1.

In the event that I suffer a drive failure in my NAS, my plan of action would be to immediately start up the backup, ingest snapshots, and then replace the drive. That should minimize the chance of a 2nd drive failure during resilvering destroying my data.

Truly important data, of course, has off-site as well.

mastax

I've run into a ZFS problem I don't understand. I have a zpool where zpool status prints out a list of detected errors, never in files or `<metadata>` but in snapshots (and hex numbers that I assume are deleted snapshots). If I delete the listed errored snapshots and run zpool scrub twice the errors disappear and the scrub finds no errors. Zpool status never listed any errors for any of the devices.

So there aren't any errors in files. There aren't any errors in devices. There aren't any errors detected in scrub(?). And yet at runtime I get a dozen new "errors" showing up in zpool status per day. How?

Modified3019

Damn good question. I don’t have time to search for duplicates myself right now, but you can look through/ask the mailing list: https://zfsonlinux.topicbox.com/groups/zfs-discuss (looks weird, but this is a legit web front end for the mailing list) and the github issues: https://github.com/openzfs/zfs/issues

unethical_ban

I've been running into the same issue, where occasionally files seem to get corrupted in the snapshot but also in the live version of the file. I cannot move it or modify it. I can only delete it. There's no indication as to why these files are getting corrupted. Thankfully they are all large Linux ISOs, so it hasn't been critical to my life.

totetsu

Nice. My gotchas form using zfs on my personal laptop with Ubuntu.

- if you want to copy files, for example, and connect your drive to another system and mount your zpool there, it sets a pool membership value on the filesystem, and when you put the drive back in your own system it won't boot unless you set that value back. Which involved chroot

- the default settings I had made a snapshot every time I apt installed something; because that snapshot included my home drive, when I deleted big files afterwards I didn't get any free space back until I figured out what was going on and arbitrarily deleted some old snapshots

- you can't just make a swap file and use it
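The usual workaround for the swap-file limitation is a zvol-backed swap device; a sketch based on the commonly circulated recipe (the pool name and 4G size are assumptions, and note that swap on a zvol has known deadlock caveats under memory pressure):

```shell
# Hypothetical zvol-backed swap; swap *files* on ZFS are unsupported.
# Pool name and size are placeholders.
zfs create -V 4G -b "$(getconf PAGESIZE)" \
    -o compression=zle \
    -o logbias=throughput \
    -o sync=always \
    -o primarycache=metadata \
    rpool/swap
mkswap /dev/zvol/rpool/swap
swapon /dev/zvol/rpool/swap
```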

Helmut10001

> - if you want to copy files for example and connect your drive to another system and mount your zpool there, it sets some pool membership value on the file system and when you put it back in your system it won’t boot unless you set it back. Which involved chroot

Isn't this what `zpool export` is for?

dizhn

Opensuse Tumbleweed comes with snapper, which works with btrfs in a similar fashion, and /home is not included in the snapshot by default. For your use case, too, you should exclude /home from your apt-triggered snapshots and set up a separate schedule for it. I had scheduled snapshots for my /home at one point, but since it's a very actively used directory (downloading isos, games, then deleting them) I had similar problems to yours. I guess we could both also have a dedicated separate directory for those short-lived huge files, which don't need a snapshot anyway.

totetsu

I found

  cat /etc/apt/apt.conf.d/90_zsys_system_autosnapshot
  // Takes a snapshot of the system before package changes.
  DPkg::Pre-Invoke {"[ -x /usr/libexec/zsys-system-autosnapshot ] && /usr/libexec/zsys-system-autosnapshot snapshot || true";};

  // Update our bootloader to list the new snapshot after the update is done to not block the critical path
  DPkg::Post-Invoke {"[ -x /usr/libexec/zsys-system-autosnapshot ] && /usr/libexec/zsys-system-autosnapshot update-menu || true";};

but how would I get this to not snapshot, say, /home/Downloads? Make that its own zpool?

dizhn

Dataset should be enough I think. Not zpool.

Hard to be sure without knowing what that script does.

Dylan16807

> I had scheduled snapshots for my /home at one point but since it's a very actively used directory (downloading isos, games then deleting them) I had similar problems to yours.

What kind of schedule was it? I feel like the low-impact alternative to no snapshots at all is daily snapshots for half a week to a week, and maybe some n-hourly snapshots that last a day or two. Which I would not expect to use up very much space.

dizhn

It was very frequent (around every 5 minutes) because what I was trying to get was a failsafe in case I delete or otherwise mess up a file I was currently working on. It happened exactly once since I disabled snapshots of /home and I was able to recover it from the terminal's scrollback buffer of all places. :)

As far as I know btrfs does get slower as the number of snapshots increases. Not sure about zfs in that regard. Your plan does sound sensible. I have a feeling you could make the snapshots even more frequent without any ill effects.
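The frequent-snapshot-plus-pruning idea above can be sketched as a small cron-driven script. Dataset name and retention count are hypothetical, and `head -n -N` assumes GNU coreutils; tools like sanoid do this more robustly:

```shell
#!/bin/sh
# Rolling hourly snapshots of a dataset, keeping only the newest 48.
DATASET=tank/home

# Take a new timestamped snapshot.
zfs snapshot "${DATASET}@hourly-$(date +%Y%m%d-%H%M)"

# List this dataset's hourly snapshots oldest-first, drop the newest 48
# from the list, and destroy the rest.
zfs list -t snapshot -o name -s creation -H "$DATASET" \
  | grep '@hourly-' \
  | head -n -48 \
  | xargs -r -n1 zfs destroy
```

Because ZFS snapshots only pin blocks that later get overwritten or deleted, a short retention window like this stays cheap even on a busy /home.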

idatum

I need 3 stores to feel I'm keeping years of digital family photos safe:

1) A live (local) FreeBSD ZFS server running for backups and snapshots, with 2 pairs of mirrored physical drives.

2) A USB device that takes 2 mirrored drives to recv ZFS snapshots from #1; I store that vdev backup in a safe place.

3) Backups of entire datasets to cloud storage from off-prem using rclone.

It's #3 where I need to do some more research/work. I need to spend some time sending snapshots/diffs to cloud blob storage and make sure I can restore. Yes, I know there is rsync.net.

Any experiences to share?
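One way to do the cloud leg (#3) is to pipe a ZFS send stream straight into object storage with `rclone rcat`, and test the restore by streaming it back. Remote name, dataset, and snapshot names below are hypothetical:

```shell
# Raw (-w) incremental (-I) send of everything between two snapshots,
# streamed into a cloud bucket as a single object.
zfs send -w -I tank/photos@2024-01 tank/photos@2024-02 \
  | rclone rcat b2backup:zfs-streams/photos_2024-01_to_2024-02.zfs

# Restore test: stream it back into a scratch dataset.
rclone cat b2backup:zfs-streams/photos_2024-01_to_2024-02.zfs \
  | zfs receive tank/photos_restore
```

The caveat with stream-as-blob backups is that a single corrupted object can invalidate the whole stream, which is why periodically exercising the receive path matters.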

zyberzero

I bought a cheap HP Microserver with four 4 TB spinning disks that I placed at a relative's house ~1000 km from where I live. I do nightly replication to the off-site location, with an account on the receiving end that only has enough permissions to create snapshots and receive data, so even if that ssh key somehow got out in the wild things could not be deleted from the remote store. I hope :)

Clarification: Remote end also uses ZFS, so I can use cheap replication with encryption
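The restricted receiving account described above can be set up with ZFS delegated permissions. A sketch, with hypothetical user and dataset names:

```shell
# On the remote box: grant a dedicated user only what replication needs.
zfs allow backupuser receive,create,mount backuppool/replica

# Deliberately NOT granted: destroy, rollback, etc. - so a leaked ssh key
# lets an attacker push new data at worst, not delete existing snapshots.
zfs allow backuppool/replica   # with no other args, prints current delegations
```

Pair this with a forced command in `authorized_keys` on the remote end and the key becomes useful for replication and nothing else.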

havnagiggle

My setup is similar. +1 that Restic is great. My cloud backup was sending blobs to Google Workspace, but they have clamped down on storage. I will be replacing that with another box at my parents' that will be tucked out of the way. I will just have that wireguard tunnel to my home network and send snapshots to it. At some point I'll turn down the Workspace solution and probably also unsubscribe.

oniony

I'm using Borg to back up to rsync.net.

Borg splits your files up into chunks, encrypts and dedupes them client-side, and then syncs them with the server. Because of the deduping, versioning is cheap and you can configure how many daily, weekly, monthly, &c. copies to keep. For example you could keep 7 days' worth of copies, 6 monthly copies and 10 yearly copies.
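That example retention policy maps directly onto `borg prune` flags. Repository path here is hypothetical:

```shell
# Thin out old archives according to the retention policy above:
# keep 7 daily, 6 monthly, and 10 yearly archives; delete the rest.
borg prune --list \
    --keep-daily 7 --keep-monthly 6 --keep-yearly 10 \
    ssh://user@rsync.net/./backups/myrepo
```

Since deduped chunks are shared between archives, pruning only frees space once no surviving archive references a chunk, so the reclaimed space is often less than you'd expect.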

Rsync.net has special pricing for customers using Borg/Restic:

https://www.rsync.net/products/borg.html

https://www.rsync.net/products/restic.html

CTDOCodebases

I use Restic and backup to rsync.net for remote backups. Works great.

I'm not working with much data though, so even if I wanted to I couldn't get a ZFS send/receive account with rsync.net. I like the way rsync.net gives you separate credentials for managing the snapshots. This way even if my NAS gets compromised I will still have all the periodic snapshots.

For me privacy is my main concern and Restic's security model is good for me. The backup testing features are good too, and rsync.net doesn't charge for traffic, so these two work well together. I don't use Restic's snapshot pruning though, because rsync.net already provides snapshots via ZFS.

Maakuth

I have a similar setup, though with Linux. From my experience I can recommend taking a look at restic (https://restic.readthedocs.io/). It does encrypted and deduplicated snapshots to local and remote repositories. There's a good selection of remote target options available, but you can also use it with rclone to use any weird remote. Just remember to keep a backup of your encryption key somewhere besides the machine you back up ;-)
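A minimal restic workflow along the lines described above might look like this. The repository location, password file, and paths are hypothetical; the rclone backend lets you target nearly any remote:

```shell
# Point restic at a repository reached through rclone, with the key in a
# file - and keep a copy of that key somewhere OTHER than this machine.
export RESTIC_REPOSITORY=rclone:myremote:backups/home
export RESTIC_PASSWORD_FILE=/root/.restic-pass

restic init                                     # once, to create the repo
restic backup /home --exclude '/home/*/Downloads'
restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 12 --prune
restic check                                    # verify repository integrity
```

`restic check` on a schedule is worth the bandwidth: an encrypted, deduplicated backup you have never test-read is only a hypothesis.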

undefined

[deleted]

neverartful

Are you running regular scrubs with ZFS and checking the results?

idatum

In your experience, what is a good schedule for scrubs?

I do one about every month or so. I should probably add a crontab entry for that.

neverartful

Weekly is good as a generic answer. The specifics of what you store and the value you place on them could warrant tweaks up or down. The time it takes to perform a full scrub could also be a factor.
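Automating the weekly cadence suggested above is a one-line cron job. Pool name is hypothetical; a system cron fragment (e.g. `/etc/cron.d/zfs-scrub`) might look like:

```shell
# Weekly scrub of the pool "tank", Sundays at 03:00, run as root.
0 3 * * 0  root  /sbin/zpool scrub tank

# Check the outcome afterwards; -x prints only pools with problems:
#   zpool status -x
```

Scrubs run in the background and throttle themselves under load, but on large spinning-disk pools they can take many hours, which is worth factoring into the schedule.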


ZFS for Dummies - Hacker News