How to write idempotent Bash scripts (2019)

Daily Digest email

Get the top HN stories in your inbox every day.

xonix

Idempotence is achievable elegantly with makesure tool [1] using @reached_if directive [2]. Full disclosure - I’m the author of the tool.

[1] https://github.com/xonixx/makesure

[2] https://github.com/xonixx/makesure#reached_if

julian_sark

This is, in my opinion, REALLY bad practice.

Never use potentially dangerous commands (rm -f) to avoid the prospect of an error!

Instead, one should practice good error handling. For some scripts, and depending on the audience, "set -e" may be sufficient. But this is usually better:

rm file.txt || echo "Warning: Deleting file.txt failed, continuing anyway."

Or, for longer shell scripts, I always include a little function that deals with error handling. You can call it in the same way:

rm file.txt || errorHandler("ignore")

As shown above, the function can be built to take various parameters, and can then for instance abort or ignore the error. It can also take care of updating a log file.

Somewhat related, I do logging in shell scripts with a function, too:

rm file.txt && log("info", "file successfully deleted")

That way, I only need to write the code that writes log lines with nice formatting and timestamps and such once.

Somewhat related, one can trap various signals in a shell script and call a bash function when they happen. Thus, one can trap EXIT and call a cleanup function that also triggers on Ctrl-C, and I'm almost certain (sorry, can't test right now) one can also trap ERR and catch errors in a shell script nicely (though I'm not sure if that also goes for external commands inside said shell script).
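A minimal sketch of the trap idea, for illustration only. In Bash the trap names are spelled `EXIT` and `ERR`. The function names `run_with_traps`, `cleanup` and `on_error` and the scratch file are hypothetical, not from the comment above.

```shell
#!/usr/bin/env bash
# Sketch only: all names here are made up for illustration.

run_with_traps() (
  # The body is a subshell (note the parentheses), so the traps set here
  # do not leak into the calling shell.
  scratch=$(mktemp)

  cleanup()  { rm -f "$scratch"; }
  on_error() { echo "error: command failed near line $1" >&2; }

  trap cleanup EXIT               # runs when the subshell exits
  trap 'on_error "$LINENO"' ERR   # runs when a simple command returns non-zero

  echo "working in $scratch"
  false    # any failing simple command, builtin or external, fires ERR
  echo "still running (no 'set -e' here)"
)
```

Without `set -e` the ERR trap only reports the failure and the script carries on. Combined with `set -e`, the trap runs just before the shell exits.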

xorcist

That's not it.

Between

  [ -f file.txt ] && rm file.txt

and

  rm -f file.txt

the latter form is preferable because it is easier and not as prone to concurrency problems.

Although it should be said that concurrency safety and idempotency are seldom achieved just by using force-flags in isolated spots; the whole operation needs to take them into account. There are probably more steps involved than just removing one file. The linked article leaves out the bigger picture so it's probably not very helpful. Often it is sufficient to construct the new state under a temporary name and then move it in place in one operation.
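One possible reading of "construct the new state under a temporary name and then move it in place" is sketched below. `write_atomically` is a hypothetical helper name, and the sketch assumes the temp file and the target live in the same directory, so that `mv` is a rename rather than a copy.

```shell
#!/usr/bin/env bash
# Sketch: write_atomically is a hypothetical helper, not from the article.

# Usage: some_command | write_atomically /path/to/target
write_atomically() {
  local target=$1 tmp
  # Create the temp file next to the target so the final mv stays within
  # one filesystem. Note: mktemp creates the file with mode 0600.
  tmp=$(mktemp "${target}.tmp.XXXXXX") || return 1
  if cat > "$tmp"; then
    mv -f "$tmp" "$target"   # readers see the old file or the new one
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Running it twice with the same input leaves the same result, and an interrupted run leaves at worst a stray temp file rather than a half-written target.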

bashonly

agreed w/ your general points, but passing arguments to bash functions does not work like this:

    errorHandler("ignore")

it should just be:

    errorHandler "ignore"
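For illustration, here is a guess at what such helper functions might look like with the corrected call syntax. The modes `ignore` and `abort` and the log line format are assumptions, not the original poster's code.

```shell
#!/usr/bin/env bash
# Sketch only: modes and message format are assumptions.

log() {
  # log LEVEL MESSAGE...  -> one timestamped line on stderr
  local level=$1; shift
  printf '%s [%s] %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$level" "$*" >&2
}

errorHandler() {
  # Call this directly after ||, so $? is still the failed command's status.
  local status=$? mode=${1:-abort}
  case $mode in
    ignore) log warn "command failed with status $status, continuing" ;;
    *)      log error "command failed with status $status, aborting"
            exit "$status" ;;
  esac
}
```

Usage then reads `rm file.txt || errorHandler ignore` and `rm file.txt && log info "file.txt deleted"`.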

julian_sark

You are certainly correct, thanks. I hastily typed this out, my bad.

reacweb

you will probably love perl. Perl encourages this programming style :

    unlink ('file.txt') or die "Can't remove file.txt: $!";

julian_sark

Somewhat paradoxically, my opinion on perl can't be (cleanly) reproduced here ;)

josho

I tried doing this once. It just wasn't worth the additional testing effort and increased complexity. Switching to Ansible was far more productive.

Interestingly, though, starting in Bash was a good MVP approach. Fast to get started. But as things started to slow down, the project switched to a better tool. I expect for many projects the number of bash scripts may never grow large enough to merit a switch. So, sure, write scripts that are safe to run multiple times, but if that effort grows, know that there are better tools.

pas

Ansible - in my experience (mostly with openstack-ansible years ago) - was too slow, hard to configure, debug and hard to extend with new modules. :(

Moving toward "declarative state with enforcing/monitoring (idempotent!?) control loops" is much, much better. (What k8s does well. Now all we need is one master bash script to setup k8s :D )

nathancahill

I might be misremembering, but I believe Ansible started this way too? And then switched from Bash to Python as the project grew?

zufallsheld

Ansible started as a Python project according to the first commit:

    commit f31421576b00f0b167cdbe61217c31c21a41ac02
    Author: Michael DeHaan <michael.dehaan@gmail.com>
    Date:   Thu Feb 23 14:17:24 2012 -0500

        Genesis.

    diff --git a/README.md b/README.md
    new file mode 100644
    index 00000000000..60bbc9f8137
    --- /dev/null
    +++ b/README.md
    @@ -0,0 +1,88 @@
    +Ansible
    +=======
    +
    +Ansible is a extra-simple Python API for doing 'remote things' over SSH.
    +

CraigJPerry

I only remember Python even from the very beginning. Also Cobbler (Michael DeHaan’s prior project) was in Python too.

I could be wrong though, the only reason this stuff sticks in my mind is because I was working on a CFEngine deployment (scarred for life) at the time Ansible popped up.

tryauuum

Even better than Ansible is Saltstack or Puppet or Chef. In other words, something that sits on a controlled machine and already knows facts about said machine

giobox

Not all idempotent automation operations need heavyweight centralized state management like a deployed Chef server... Ansible has a lot less boilerplate in many scenarios. Indeed this is exactly why Chef even has 'chef-solo'/local mode to let you avoid having a Chef Server at runtime:

https://docs.chef.io/chef_solo/

While it's a matter of opinion and will depend on the problem you are trying to solve, I've found teams get productive on Ansible a lot faster and Chef Cookbooks often end up requiring significant maintenance. I have similar complaints with Puppet too.

My biggest complaint with Chef is that its DSL so closely resembles Ruby that developers often assume the code in the cookbook is executed as pure Ruby rather than a DSL conversion; the transcoding phase in Chef is pretty complex and can lead to lots of debug head scratching.

tryauuum

Yeah, my favorite solution of them all is SaltStack. You can literally put jinja2 in any part of state, not into just specific parts like in Ansible.

And it's malleable like clay, you can turn SaltStack into a perfect fit for your case

paulirish

> Use the -f flag which ignores non-existent files.

> rm -f example.txt

Instead of using --force to clobber all sorts of permission scenarios, wouldn't you want to specifically avoid non-existing files and handle that? I'm thinking something like....

    stat -q example.txt && rm example.txt

With a literal filename, rimraf is safe enough, but I wouldn't mind some extra care around `rm -rf $somepath`.

SkittyDog

Your idea isn't wrong/bad... But you are aware it creates a potential race condition, right? The `stat` check and the `rm` are two separate operations, with a nonzero time interval between them. So it's entirely possible for another process to delete or rename that file, after the `stat` but before the `rm` can run.

Resolving the race robustly is actually kind of hard, in Bash... You can check whether `rm` returns a nonzero exit status, but that might fail for all sorts of other reasons (e.g., bad permissions). I guess you could case/select the exit status numerically, but that could turn into a real case of bedbugs, really quick.

I guess you could repeat the `stat` check afterwards, if the `rm` fails? In that case, if the path doesn't exist afterwards, you can just swallow the `rm` error and let it roll.

Practically speaking, I try to avoid using Bash for anything where automatic, reliable error handling is that important... I love cool Bash tricks, but it's just not designed for sophisticated control flow.
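The "try `rm`, and only check existence if it fails" idea could be sketched roughly as follows. `ensure_absent` is a hypothetical name, and one caveat applies: a path inside a directory the script cannot search would also look absent to this check.

```shell
#!/usr/bin/env bash
# Sketch: ensure_absent is a hypothetical name.

ensure_absent() {
  local path=$1 err
  # Try the removal first; only inspect the outcome if rm complains.
  if err=$(rm -- "$path" 2>&1); then
    return 0
  fi
  # rm failed. If nothing exists at the path (not even a dangling symlink),
  # the goal is met: it was never there or something else removed it.
  if [ ! -e "$path" ] && [ ! -L "$path" ]; then
    return 0
  fi
  # Still there: a real failure such as bad permissions. Surface it.
  printf '%s\n' "$err" >&2
  return 1
}
```

Unlike `rm -f`, this does not suppress prompts or other errors; it only forgives the case where the file is already gone.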

IgorPartola

As far as I am aware on Linux there is no way to stat and unlink in one syscall. And a second stat will create a second race condition. If you don’t want the file to exist why are you questioning whether you have permission or not? Just unlink the file via rm, if possible.

tzs

> As far as I am aware on Linux there is no way to stat and unlink in one syscall.

That just means you have to do a little more work.

1. Become root.

2. Fork until the process table is full.

3. Kill all other processes. After each kill, fork until the process table is again full.

4. When all processes other than the init process are yours, you can do the stat and unlink without worrying about race conditions since there is no one else to have a race with. (Assuming the file you want to stat and unlink isn't some file that your init process is interested in, and assuming it isn't on a network drive).

The side effects of this are annoying to deal with though and are probably worse than whatever problems an unhandled race condition would cause.

Another approach would be to create a new user, chown the directory containing the file to that user, become that user, chmod the directory to 0300, kill any process that has that file open, do your stat and unlink, chown and chmod the directory back to what they were, and delete the earlier created user.

SkittyDog

I believe you are correct about the lack of an atomic stat/unlink operation. Depending on how you implement the two steps, your potential errors are different.

`rm -f` handles some errors in a different way than the stat/rm approach. One can fail where the other would not.

Any given unhandled error may or may not be a problem, depending on the nature of the error, how the rest of the script is structured, and what your requirements are. There's nothing wrong with the stat/rm approach--it may be the better way to go.

kazinator

> But you are aware it creates a potential race condition, right?

If your script has to be idempotent in the face of concurrent executions of itself, then that's an extra requirement which you have to handle with locking or whatever.

It is not implied by idempotency; there is meaningful idempotency which excludes the concurrency requirement.

If something can rename example.txt in parallel with your script, at any time during its execution, there is nothing you can do to ensure that it's gone.

Whatever step you take to ensure that it is gone can be preceded by the parallel rename.

SkittyDog

I'm not addressing idempotency, that's a separate issue. The comparison we're discussing here is about the relative merits of `rm -f` versus the stat/rm (no '-f' option) approach. The parent post addressed a potential weakness of using the '-f' option--and the proposed alternative brings tradeoffs.

I believe your point about idempotency is correct, though.

MichaelMoser123

I doubt that a bash script will be concurrent. If you are running your scripts in parallel, then chances are that they will be screwed because of some other dependency/side effect.

gpderetta

Hum parallelism is pervasive in shell scripting.

kbenson

I think what you really want is test, which allows you to test various things about a string or variables, including existence, non-zero size, etc.

maest

The issue there is that testing then executing can introduce racing issues in a multi threaded scenario (unless I misunderstand how `test` works)

bell-cot

I would not trust things like `ln -sf` to be thread-safe either.

For shell scripts, a reasonable approach is to use a (correctly implemented) lock file at the start, then a bunch of `test` (aka `[`) blocks to check things. That does a decent job of documenting, too. Vs. everyone who looks at the script needing to know the less-common command line options.

Then, in the bigger picture, make sure that nobody else's script - maybe with its own lock file - starts overlapping with your script.
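One common way to get a lock file that cleans up after itself is the `flock` command from util-linux. The sketch below assumes `flock` is installed; `with_lock`, the exit code 75 and the lock path in the usage line are hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch: assumes the util-linux `flock` command; names are hypothetical.

with_lock() {
  local lockfile=$1; shift
  (
    # Open the lock file on fd 9 and try to take an exclusive lock without
    # waiting. The lock goes away when the subshell exits, so a crashed run
    # does not leave a stale lock behind.
    exec 9>"$lockfile" || exit 1
    if ! flock -n 9; then
      echo "another run holds $lockfile, giving up" >&2
      exit 75
    fi
    "$@"
  )
}
```

A script would then wrap its work as `with_lock /tmp/myscript.lock main_steps`, with the `test` blocks living inside `main_steps`.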

Macha

This is true for stat too, which was the grandparent example.

Though you're unlikely to be writing multi-threaded bash, it's entirely possible for another process to delete it too.

kbenson

Yes. I'm not making a case that test gets around that problem, just that in the cases where you want to test something about a file, such as existence, emptiness/non-emptiness, etc, test is the more versatile (and more standard) choice.

kazinator

I don't see a -q option documented for the GNU Coreutils stat in currently hosted online documentation.

https://www.gnu.org/software/coreutils/manual/html_node/stat...

(At time of writing, GNU Coreutils 9.0)

thayne

> Touch is by default idempotent.

maybe, depending on your definition of idempotent. It does update the modification time of the file, which does result in a different state if it is run multiple times. Although in practice, that probably doesn't matter most of the time.

medstrom

I've primarily used touch as a timestamp updater, not a file creator. But to be charitable, it is under the headline "Creating an empty file".

gorgoiler

Two other tips: a lot of the ip commands have idempotent versions too. Look for the “replace” versions of each command.

Also, if you have a complex firewall then putting your rules in your own chain and inserting that chain into a top level chain is the idempotent way:

  iptables -D FORWARD -j LOL || true
  iptables -F LOL || true
  iptables -X LOL || true
  iptables -N LOL
  iptables -I FORWARD -j LOL

…and then add your task specific rules to your own LOL chain and leave everyone else’s alone.

Much nicer than polluting the top level chains with the iptables equivalent of global variables. You can work on your own bit of the firewall without clobbering all the other stuff.

arminiusreturns

thoughts or tips on nftables? (especially with bpf) vs iptables?

gorgoiler

As I understand it, nft has the same system/user chains way of organising rules.

Havoc

First encountered this concept while writing ansible scripts. Makes quick intuitive sense but actually doing this is a lot harder.

e.g. Approx zero percent of the stackoverflow codesnippets you google are idempotent.

So as a new learner that's a show stopper right there. Barely hanging on to the stuff you're being taught let alone being able to see a big picture concept that is 20x miles above your current level.

It's just not happening

gregmac

I'm helping introduce some DevOps concepts to an ops team that has been doing things in the "traditional" way for a very long time. They've been building scripts in the past couple years to replace what had always been manual work, but it's been interesting to see their perspective. Nothing is idempotent on purpose: they know whether a database exists, whether schema changes are pending, whether a hostname needs to be updated in DNS, so their mindset is simply to not run those scripts if not needed.

I thought the big task would be introducing source control and CI/CD, but I quickly realized idempotency is actually the more fundamental and key concept that I need them to embrace.

mekster

Unless you can carefully code, it's not worth pretending a script to be idempotent.

The example checks for /mnt/dev in a file but it doesn't check whether the string is in a comment or if it's part of /mnt/develop, and pretending you've achieved idempotency is dangerous unless you've gone through the effort of checking every corner case.

It's easier and safer to pretend it's not idempotent in the first place.
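As a rough illustration of how much stricter the check would need to be, the sketch below skips comment lines and compares the mount point field exactly. `has_mount_point` is a hypothetical helper, and it assumes an fstab-style file with whitespace-separated fields and the mount point in field 2. It still ignores other corner cases, such as escaped spaces in paths.

```shell
#!/usr/bin/env bash
# Sketch: has_mount_point is a hypothetical helper for fstab-style files.

has_mount_point() {
  local file=$1 mount_point=$2
  # Skip comment lines, then compare the second field exactly, so a
  # commented-out entry or /mnt/develop does not count as /mnt/dev.
  awk -v mp="$mount_point" '
    /^[[:space:]]*#/ { next }
    $2 == mp         { found = 1 }
    END              { exit !found }
  ' "$file"
}
```

The append step would then run only when `has_mount_point /etc/fstab /mnt/dev` returns non-zero.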

AceJohnny2

> Unless you can carefully code, it's not worth pretending a script to be idempotent.

Just because you can't reach perfection doesn't mean it's useless to strive towards it.

Idempotency is a very useful property that more shell script writers should be made aware of, even if they only learn basic things like "mkdir -p" or "rm -f SOMEFILE"

VWWHFSfQ

This whole thing is also loaded with TOCTOU problems. It's almost always best just to try to do what you want to do and check for failure.
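A small sketch of "try it and check for failure" applied to creating a marker file: with the shell's `noclobber` option set, the redirection itself refuses to overwrite an existing file, so there is no separate existence test to race against. `create_once` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch: create_once is a hypothetical name.

create_once() {
  local path=$1
  # With noclobber set, the > redirection fails if the file already exists,
  # so checking and creating are not two separate steps in the script.
  # The subshell keeps the option from leaking into the caller.
  if ( set -o noclobber; : > "$path" ) 2>/dev/null; then
    echo created
  else
    echo "already present (or not creatable)"
  fi
}
```

The failure branch cannot tell "exists" apart from "no permission" on its own; a script that cares would inspect the situation only after the attempt fails.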

zymhan

I had to look up TOCTOU https://en.wikipedia.org/wiki/Time-of-check_to_time-of-use

Very interesting!

User23

That link also describes the Catholic schoolboy algorithm: ask forgiveness not permission.

pas

Orchestration/install/setup scripts are usually run while nothing else is running. Bash is not ideal for this. But it's not ideal for almost everything. Still, making things idempotent-ish is better than bailing on every recoverable error and making the admin do the cleanup manually so the script can finally do what it is supposed to do.

(Of course a few years ago immutable infrastructure was the rage, because it means if the script runs once, pack it up as a VM and you're done. And that's how the docker sausage images are made.)

dzhiurgis

Is there a tool that treats txt files as a Set?

Been looking for one a while ago and it feels like it’s something that should be built in.

sa46

You can use a combination of `comm` and `uniq` to implement set intersection and union. https://ss64.com/bash/comm.html

> Return the unique lines in the file words.txt that don't exist in countries.txt

    comm -23 <(sort words.txt | uniq) <(sort countries.txt | uniq)

>Return the lines that are in both words.txt and countries.txt:

    comm -12 <(sort words.txt | uniq) <(sort countries.txt | uniq)
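In the same spirit, union and an idempotent "add to set" can be sketched with `sort -u` and `grep -Fxq`. `set_union` and `set_add` are hypothetical helper names, and the check-then-append in `set_add` is not safe against concurrent writers.

```shell
#!/usr/bin/env bash
# Sketch: set_union and set_add are hypothetical helper names.

# Union of any number of line-oriented files, duplicates removed.
set_union() { sort -u -- "$@"; }

# Add a line to a file only if that exact line is not already present.
set_add() {
  local file=$1 line=$2
  # -F: fixed string, -x: match the whole line, -q: quiet
  grep -Fxq -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Calling `set_add hosts.txt example.org` any number of times leaves exactly one such line in the file (`hosts.txt` and `example.org` being placeholder values).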

kmstout

I wrote these years ago. They're damn handy. It's true that they're not implemented in Bash (that would be nuts), but having them on hand lets me do much more on the command line than would otherwise be possible.

  ~/bin/union
  ===========
  #! /usr/bin/awk -f

  !acc[$0]++


  ~/bin/intersection
  ==================
  #! /usr/bin/awk -f

  !buf[$0]++ {acc[$0] += 1}

  ENDFILE {
    delete buf;
    files++
  }

  END {
    for (k in acc) if (acc[k] == files) print k
  }

  ~/bin/set-diff
  ==============
  #! /usr/bin/awk -f

  ! filenum { acc[$0] = 1    }
  filenum   { delete acc[$0] }

  ENDFILE { filenum++ }

  END {
    for (k in acc) print k
  }

SavantIdiot

In bash? How would you even implement a set in bash without just doing linear greps? Or did someone add sets to bash 20 years ago and I never got the memo?

adrianmonk

You could use the 'look' command, which does a binary search.

It's basically meant to look up spellings in /usr/share/dict/words, but it can work on any file. It will match any line that your pattern is a prefix of, so you'd have to add logic to eliminate longer matches.

But if you had some huge file to search and you wanted to do it from a shell script, that would be one way. Caveat: although it's fairly standard, 'look' might not be installed on every system.

Also, you have to be sure to maintain your file in sorted order. So no adding things by appending to the end, and checking if something is in the set is much quicker at the expense of adding things being much slower.

User23

When you’re paying six figures careful coding isn’t too much to ask.

SkittyDog

I agree with you. Bash is a wonderful tool, but it's not very well designed for sophisticated control flow. Gotta know when to say when, and switch to something else.

gigatexal

Do-one-thing-and-only-one-thing-well bash scripts coupled with a Makefile gives one “sophisticated control flow”.

dataflow

For a lot of these I think idempotency is the wrong angle to look at it. For example, the problem with mkdir foo is not that it's non-idempotent, it's that (in most cases) the directory already existing is not an error to begin with: it still satisfies the same post-condition you desired. To put it another way, you're usually trying to say "ensure this directory exists", not "ensure you create this directory", and as such, unless you're really trying to test directory creation, mkdir without -p is a semantic bug regardless of whether you need to run it multiple times.

joebob42

I found "Good software is always written in an idempotent way" sort of hilarious.

Obviously idempotency is a powerful tool, but the confidence with which tfa demands it of all software was silly.

kody

Only Sith deal in absolutes -- I feel that with TDD. Yes, our tests should catch most if not all regressions - no that does not mean we're inherently writing bug-free code as long as it passes all our tests. If someone claims that a methodology is totally necessary, take it with a grain of salt and test your code.

emmelaich

There are a few dark corners, e.g.

   mkdir dir
   chmod a-x dir
   mkdir -p dir

Will not give you the permissions on dir that you want, typically.

I've come to the idea that the best thing to guarantee what you want is

  mkdir the-new-dir.temp-jnsjw23
  <do things to the-new-dir.temp-jnsjw23>
  rsync -a --delete the-new-dir.temp-jnsjw23/ the-new-dir/

teddyh

Instead of “mkdir -p dir”, use “install --directory -- dir”.

emmelaich

Yep, or use the '-m' (mode) option with mkdir.
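If the goal is "this directory exists with this mode" on every run, one conservative sketch is to set the mode explicitly after `mkdir -p`, since `mkdir -p` leaves an existing directory's permissions alone. `ensure_dir` is a hypothetical helper and 755 is only an example default.

```shell
#!/usr/bin/env bash
# Sketch: ensure_dir is a hypothetical helper.

ensure_dir() {
  local dir=$1 mode=${2:-755}
  # mkdir -p succeeds if the directory exists but does not touch its mode,
  # so apply the mode explicitly on every run.
  mkdir -p -- "$dir" && chmod -- "$mode" "$dir"
}
```

This repairs the `chmod a-x dir` case from the example above on the next run, at the cost of overriding any mode someone set on purpose.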

mmgutz

I have used every trick in this article over the years. Nice to have them in one place. Will save many hours of googling.

pkrumins

100% agree, this is the best article on writing great bash scripts.
