Hacker News

TeMPOraL

> The excellent book xUnit Test Patterns describes a test smell named Assertion Roulette. It describes situations where it may be difficult to determine exactly which assertion caused a test failure.

How is that even possible in the first place?

The entire job of an assertion is to wave a flag saying "here! condition failed!". In programming languages and test frameworks I worked with, this typically includes providing at minimum the expression put in the assertion, verbatim, and precise coordinates of the assertion - i.e. name of the source file + line number.

I've never seen a case where it would be hard to tell which assertion failed. On the contrary, the most common problem I see is knowing which assertion failed, but not how the code got there, because someone helpfully stuffed it into a helper function that gets called by other helper functions in the test suite, and the testing framework doesn't report the call stack. But it's not that big of a deal anyway; the main problem I have with it is that I can't glean the exact source of failure from CI logs, and have to run the thing myself.

hedora

> I've never seen a case where it would be hard to tell which assertion failed.

There is a whole set of unit testing frameworks that do everything they can to hide test output (JUnit), or vomit multiple screens of binary-control-code emoji soup to stdout (Ginkgo), or just hide the actual stdout behind an authwall in a UUID-named S3 object (CodeBuild).

Sadly, the people with the strongest opinions about using a "proper" unit test framework with lots of third party tooling integrations flock to such systems, then stack them.

I once saw a dozen-person team's productivity drop to zero for a quarter because junit broke backwards compatibility.

Instead of porting ~100,000 legacy (spaghetti) tests, I suggested forking + recompiling the old version for the new JDK. This was apparently heresy.

sitkack

You should write an episode of Seinfeld.

I was a TL on a project and I had two "eng" on the project that would make test with a single method and then 120 lines of Tasmanian Devil test cases. One of those people liked to write 600 line cron jobs to do critical business functions.

This scarred me.

ckastner

> One of those people liked to write 600 line cron jobs to do critical business functions.

I was a long-time maintainer of Debian's cron, a fork of Vixie cron (all cron implementations I'm aware of are forks of Vixie cron, or its successor, ISC cron).

There are a ton of reasons why I wouldn't do this, the primary one being that cron really just executes jobs, period. It doesn't serialize them, it doesn't check for load, logging is really rudimentary, etc.

A few years ago somebody noticed that the cron daemon could be DoS'ed by a user submitting a huge crontab. I implemented a 1000-line limit on crontabs, thinking "nobody would ever have a 1000-line crontab". I was wrong, and quickly received bug reports.

I then increased it to 10K lines, but as far as I recall, users were hitting even that limit. Crazy.

ch4s3

Junit is especially bad about this. I often wonder how many of these maxims are from people using substandard Java tools and confusing their workarounds with deeper insights.

turblety

Yup, it's why I built `just-tap` [1], which tries to minimise the magic that a lot of these frameworks use to "help" you.

1. https://github.com/markwylde/just-tap

hedora

Here are a few mistakes I've seen in other frameworks:

- Make it possible to disable timeouts. Otherwise, people will need a different runner for integration tests, long-running tests (e.g., finding slow leaks), and benchmark tests. At that point, your runner is automatically just tech debt.

- It is probably possible to nest befores and afters, and to have more than one nesting per process, either from multiple suites or due to class inheritance, etc. Now you have a tree of hooks. Document whether it is walked in breadth-first or depth-first order, then never change that decision (or disallow having trees of hooks, either by detecting them at runtime or by picking a hook registration mechanism that makes them inexpressible).

autarch

TAP is better than some things, but it has some serious issues that I wrote about on my blog a while back - https://blog.urth.org/2017/01/21/tap-is-great-except-when-it...

Basically it's nearly impossible to fully parse it correctly.

jlarocco

> There is a whole set of unit testing frameworks that do everything they can to hide test output (JUnit), or vomit multiple screens of binary-control-code emoji soup to stdout (Ginkgo), or just hide the actual stdout behind an authwall in a UUID-named S3 object (CodeBuild).

The test runner in VS2019 does this too, and it's incredibly frustrating. I get to see debug output about DLLs loading and unloading (almost never useful), but not the test's stdout and stderr (always useful). Brilliant. At least their command line tool does it right.

torginus

I remember writing a small .NET test library for exactly that problem: you could pass in a lambda with a complex condition, and it evaluated every piece of the expression separately and pretty-printed which part of the condition failed.

So essentially you could write

     Assert(()=>width>0 && x + width < screenWidth)
And you would get:

      Assertion failed: 
      x is 1500
      width is 600
      screenWidth is 1920

It used Expression<T> to do the magic. Amazing debug messages. No moralizing required.

This was a huge boon for us as it was a legacy codebase and we ran tens of thousands of automated tests and it was really difficult to figure out why they failed.

paphillips

Related to this, for anyone not fully up to date on recent C# features there is also the CallerArgumentExpression [1], [2] feature introduced in C# 10. While it is not a pretty printer for an expression, it does allow the full expression passed from the call site as an argument value to be captured and used within the method. This can be useful for custom assert extensions.

For example:

    public void CheckIsTrue(bool value, [CallerArgumentExpression("value")] string? expression = null)
    {
        if (!value) 
        { 
            Debug.WriteLine($"Failed: '{expression}'"); 
        }
    }
So if you call it like this: CheckIsTrue(foo != bar && baz == true), then when the value is false it prints "Failed: 'foo != bar && baz == true'".

[1] https://learn.microsoft.com/en-us/dotnet/csharp/language-ref... [2] https://learn.microsoft.com/en-us/dotnet/csharp/language-ref...

sasmithjr

I love using Unquote[0] in F# for similar reasons; it uses F#'s code quotations. Assuming the variables have been defined with the values you state, the assertion is written as:

  test <@ width > 0 && x + width < screenWidth @>
And part of the output is:

  width > 0 && x + width < screenWidth
  500 > 0 && x + width < screenWidth
  true && x + width < screenWidth
  true && 1500 + 500 < 1920
  true && 2000 < 1920
  true && false
  false
[0]: https://github.com/SwensenSoftware/unquote

nerdponx

This is what the Python test framework Pytest does, among many other similar useful and magical things. I believe that the Python developer ecosystem as a whole would be substantially less productive without it.
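As a rough illustration of the idea (this is not pytest's actual mechanism, which rewrites the assert statement itself at collection time), a hand-rolled helper can report the operand values of a failed condition. The `check` helper and the variable names below are hypothetical:

```python
def check(expr_text, **vals):
    """Evaluate expr_text against the given names; on failure, report each value."""
    if not eval(expr_text, {}, vals):  # fine for a demo; never eval untrusted input
        lines = [f"Assertion failed: {expr_text}"]
        lines += [f"  {name} is {value}" for name, value in vals.items()]
        raise AssertionError("\n".join(lines))

try:
    check("width > 0 and x + width < screenWidth",
          x=1500, width=600, screenWidth=1920)
except AssertionError as err:
    print(err)
```

pytest gives you roughly this (and much more) for a bare `assert`, with no helper required.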

jmount

Nice framework. Also, once you allow more than one assertion, there is no need for a top-level && in assertions (splitting each conjunct into its own assertion makes for simpler tests).

torginus

While there is no strict need, sometimes assertions logically belong together.

superjan

Is this shared somewhere?

torginus

Nah, it was some corporate project. But if you are interested, I could rewrite it. It would be a fun weekend project.

opminion

https://stackoverflow.com/a/700325 shows how to write out the internal structure of a LambdaExpression.

If you just want the assignments, it's simpler: add a test for MemberExpression to the Evaluate method. The variable name is then:

    ((MemberExpression)expr).Member.Name

and the value is:

    Expression.Lambda(expr).Compile().DynamicInvoke()

mananaysiempre

FWIW, in Python, this is pytest’s entire thing (although it ends up grovelling into bytecode to achieve it).

throwaway6977

Not quite the same, but available on NuGet: FluentAssertions gives you something akin to this. I've had decent success having our juniors use it vs. less verbose assertion libraries. I don't know about evaluating individual expressions in a line separately, but it does give you clean syntax and similar error messages that are very readable:

"GetProductPage().GetProductPrice().Should().BeLessThan(...)"

beezlewax

Is there something like this available for javascript?

xeromal

I'm not aware of any related to testing, but perhaps you could use something like this in tandem with some tests to pull it off?

https://github.com/gvergnaud/ts-pattern

tobyhinloopen

Not really, no.

I like to use Jest’s toMatchObject to combine multiple assertions in a single assertion. If the assertion fails, the full object on both sides is shown in logs. You can easily debug tests that way.

The only way to make it even possible is to do some eval magic or to use a pre-processor like babel or a typescript compiler plugin.

But if you find something, Lemme know.

dalmo3

It might not be the general solution you're looking for, but idiomatic Jest does the job for me.

   expect(width).toBeGreaterThan(0);
   expect(x + width).toBeLessThan(screenWidth);
When an assertion fails it will tell you: "Expected: X; Received: Y"

eurasiantiger

You could parse the failing expression using babel and get an AST back.

fomine3

power-assert

magic_hamster

> How is that even possible in the first place? The entire job of an assertion is to wave a flag saying "here! condition failed!".

I envy you for never having seen tests atrocious enough where this is not only possible, but the common case.

Depending on the language, framework, and obviously usage, assertions might not be informative at all; they might provide only the basic functionality of failing the test, and that's it.

Now imagine this barebones use of assertions in tests which are entirely too long, not isolating the test cases properly, or even completely irrelevant to what's (supposedly) being tested!

If that's not enough, imagine this nightmare failing not right after it was written but, let's say, 18 months later, as part of a massive test suite that has been running for a while. All you have is the name of the test that failed; you look into it and find a 630-line test "case" with 22 nondescript assertions along the way. You might know which line failed the test, but not always. And of course debugging the test function line by line doesn't work, because the test depends on intricate timing for some reason. The person who wrote this might not be around, and now this is your dragon to slay.

I think I should stop here before triggering myself any further. Therapy is expensive.

FartyMcFarter

> You might know which line failed the test, but not always.

If that's the case, the test framework itself is severely flawed and needs fixing even more than the tests do.

There's no excuse to have an assert function that doesn't print out the location of the failure.

magic_hamster

Even if the framework is fine, you can see something like an elaborate if-else tree, or even a try-catch block, and after it's all done, there's a condition check with `fail()`. So the reported point of failure can be manually detached from the actual point of failure.
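A contrived Python sketch of that shape (the function and values are made up): the condition that actually decides the outcome sits far from the `fail()`-style assertion that gets reported.

```python
def run_case(x):
    """Deep conditional logic flips a flag; the assertion fires much later."""
    ok = True
    if x > 0:
        if x % 2:
            ok = False  # the real point of failure is decided here
    # ... imagine many more lines of setup and checks here ...
    if not ok:
        raise AssertionError("fail()")  # but this is the line the report shows

try:
    run_case(3)
except AssertionError as err:
    print("test failed at:", err)
```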

Granted, this is not the way to do things. But it happens anyway.

ramraj07

So because some idiot somewhere wrote a 100 assertion unit test we should ban anyone from writing even 2 assertions in one test?

magic_hamster

Not at all. It makes sense in some tests. I addressed the part asking how it's even possible to not know what happened.

As for multiple asserts, the count itself is really meaningless. The test case should test one thing. If that requires several asserts, that's okay. But a very long test function with a lot of assertions strongly indicates that you're testing more than one thing, and when the test fails it will be harder to know what actually happened.

mjul

Just use your best judgement and you will be fine.

Patterns are context specific advice about a solution and its trade-offs, not hard rules for every situation.

Notice how pattern books will often state the relevant Context, the Problem itself, the Forces influencing it, and a Solution.

(This formulaic approach is also why pattern books tend to be a dry read).

robertlagrant

The previous commenter was describing a (normal) scenario where a unit test is not precise. No need to follow up with an aggressive "so what you're saying is".


simplotek

> So because some idiot somewhere wrote a 100 assertion unit test we should ban anyone from writing even 2 assertions in one test?

You're falling prey to slippery slope fallacy, which at best is specious reasoning.

The rationale is easy to understand. Running 100 assertions in a single test renders tests unusable. Running 10 assertions suffers from the same problem. Test sets are user-friendly if they dump a single specific error message for a single specific failed assertion, thus allowing developers to quickly pinpoint root causes by simply glancing through the test logs.

Arguing whether two or three or five assertions should be banned misses the whole point and completely ignores the root cause that led to this guideline.

marcosdumay

> Depending on language, framework and obviously usage, assertions might not be as informative as providing the basic functionality of failing the test - and that's it.

Well, avoiding ecosystems where people act dumb is a sure way to improve one's life. For a start, you won't need to do stupid things in reaction to your tools.

Yes, it's not always possible. But the practices you create for surviving it are part of the dumb ecosystem survival kit, not part of any best practices BOK.

ethbr0

> ... where this is not only possible, but the common case.

Couldn't not read that in Peter Sellers' voice https://m.youtube.com/watch?v=2yfXgu37iyI&t=2m36s

> ... to find a 630 lines long test "case" with 22 nondescript assertions along the way.

This is where tech team managers are abdicating their responsibility and job.

It's the job of the organization to set policy standards to outlaw things like this.

It's the job of the developer to cut as many corners of those policies as possible to ship code ASAP.

And it's the job of a tech team manager to set up a detailed but efficient process (code review sign offs!) that paper over the gap between the two in a sane way.

... none of which helps immediately with a legacy codebase that's @$&@'d, though.

RHSeeger

> It's the job of the developer to cut as many corners of those policies as possible to ship code ASAP.

I can't tell if this is supposed to be humor, or if you actually believe it. It's certainly not my job as a developer to ship worse code so that I can release it ASAP. Rather, it's my job to push back against ASAP where it conflicts with writing better code.

mrits

22 assertions in a test is a lot better than 22 separate tests that fail for the same reason.

Jabbles

With 22 separate tests you have the possibility of knowing that only a subset of them fail. Knowing which fail and which pass may help you debug.

In Go, in general, tests fail and continue, rather than causing the test to stop early, so you can tell which of those 22 checks failed. Other languages may have the option to do something similar.

https://pkg.go.dev/testing#T.Fail

bluGill

They are the same. I don't care, as I'll fix them one at a time, and if the fix happens to fix more than one great.

Kuinox

Tech debt can often be fixed by more tech debt instead of fixing the root problem.

torginus

xUnit is terrible. It has a horrible culture, the author seems to have a god complex. It is overly complex and opinionated in the worst way possible.

Many times I searched for 'how do I do something with xUnit' and found a github issue with people struggling with the same thing, and the author flat out refusing to incorporate the feature as it was against his principles.

Other times I found that what I needed to do was override some core xUnit class so it would do the thing I wanted it to do. Sounds complex; all right, let's see the docs. Oh, there are none: "just read the source", according to the author.

Another thing that bit us in the ass is they refused to support .NET Standard, a common subset of .NET Framework and Core, making migration hell.

HereBeBeasties

NUnit isn't much better. You'd think that they would have good test coverage and therefore high confidence to make changes, especially to fix actual bugs, but I gave up trying to get patches in because the core devs seem so afraid of breaking anything, even when it's an obviously-isolated private, hidden bug fix or performance improvement.

"We can't land this tiny fix because we have a release planned within three months" sort of thing.

torginus

Tbh, one of the differences between xUnit and NUnit is the way generated test cases work, like specifying test cases in an XML file. NUnit has the TestCaseSource attribute for this, while xUnit has MemberData (on a Theory).

One of the key differences is what happens when no test cases are generated: the NUnit tests just won't run, while xUnit will throw.

Since it's completely legal and sensible for a certain kind of test to have no entries in an XML file, we needed to hack around this quirk. When I (and countless others) mentioned this on the xUnit GitHub, the author berated us for daring to request a change.

So nUnit might be buggy, but xUnit is fundamentally unfixable.

zdragnar

How ironic: a unit test framework so fragile and so poorly covered by tests that its maintainers can't determine whether a change would cause any regressions.

codenesium

I'm curious what things you're trying to do that require you to override xUnit classes? We use xUnit for everything and haven't found any gaps so far.

danuker

I use the following pattern for testing regexes:

    expected_positive = [
        'abc',
        'def', 
        ...]
    for text in expected_positive:
        self.assertTrue(matcher(text), f"Failed: {text}")
Before I added the assertion error message, `f"Failed: {text}"`, it was quite difficult to tell WHICH example failed.

profunctor

If you're using pytest, you just parametrize the tests and it tells you the exact failing case. This seems like such a basic feature that I would be surprised to learn it doesn't exist in almost all commonly used frameworks.
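For instance (a hypothetical matcher under test; with `parametrize`, each input becomes its own named test case such as `test_matches[abc]`):

```python
import re

import pytest

# hypothetical matcher under test
matcher = re.compile(r"[a-d]+").fullmatch

@pytest.mark.parametrize("text", ["abc", "abcd"])
def test_matches(text):
    # pytest runs this once per parameter, so a failure report names
    # the exact input that broke, e.g. test_matches[xyz]
    assert matcher(text), f"Failed: {text}"
```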

danuker

We are using unittest.

But thanks for bringing it up, it seems it also has "subtest" support, which might be easier than interpolating the error message in some cases:

https://docs.python.org/3/library/unittest.html#distinguishi...
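A minimal sketch of the subtest variant (using a stand-in matcher, not the real one from above): each failing input is reported individually, with its `text=` value attached, and the loop keeps going past failures.

```python
import unittest

class TestMatcher(unittest.TestCase):
    def test_expected_positive(self):
        matcher = str.isalpha  # stand-in for the real regex matcher
        for text in ['abc', 'def']:
            with self.subTest(text=text):
                # a failure here is recorded with text=... attached,
                # and the loop continues to the remaining inputs
                self.assertTrue(matcher(text))

result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(TestMatcher).run(result)
print("failures:", len(result.failures))
```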

olex

This seems like a test design issue to me. Best practice is to avoid for-each loops with assertions inside tests; using parametrized tests and feeding the looped values in as input is almost always a better option. Figuring out which case failed, and why, is one advantage that gives you. Another is that all inputs will always be tested: your example stops at the first one that fails and never evaluates the rest.

dd82

not really. one thing this is useful for is extracting out various attributes in an object when you really don't want to compare the entire thing. Or comparing dict attributes, and figuring which one is the incorrect one.

for example,

    expected_results = {...}
    actual_obj = some_instance.method_call(...)

    for key, val in expected_results.items():
        assert getattr(actual_obj, key) == val, f"Mismatch for {key} attribute"
You could shift this off to a parametrized test, but that means you're making N more calls to the method being tested, which can have its own issues with cost of test setup and teardown. With this method, you see which key breaks, and re-run after fixing.

danuker

> your example stops on the first one that fails, and does not evaluate the others after that.

I don't think this is a big problem; trying to focus on multiple examples at once is difficult.

It might be a problem if tests are slow and you are forced to work on all of them at once.

But in that case I'd try to make the tests faster (getting rid of network requests, disk/DB access by faking them away or hoisting to the caller).

lancebeet

That looks like a case where I would use a parameterized test rather than a for loop inside the test.

dd82

this has downsides if you're comparing attributes with a method result and checking whether said attrs match what you expect. Either you run each test N times for N attr comparisons, accepting the cost of setup/teardown, or do a loop and fire off an assert error with text on which comparison failed.

Since you already have the object right there, why not do the latter approach?


kazinator

Think about the personality of someone who is so dissatisfied with the lack of verbosity in his test suites, that he needs a side project of writing a book about unit testing. Of course they will advocate testing one assertion per function, and make up nonsense to justify their recommendation.

Secretly, they would have the reader write 32 functions to separately test every bit of a uint32 calculation, only refraining from that advice due to the nagging suspicion that it might be loudly ridiculed.

Phrodo_00

I think the advantage of having an assertion per test is that it makes sure that all of your assertions are executed. In a lot of test frameworks (that use exceptions for assertions for example) the first assertion fail will stop the test.

That doesn't mean you have to duplicate code; you can deal with it in other ways. In JUnit I like to use @TestFactory [1], where I'll write most of the test in the factory and then each assertion becomes a test the factory creates; since they're lambdas, they have access to the TestFactory's closure.

[1] https://junit.org/junit5/docs/current/user-guide/#writing-te...

edgyquant

This is a feature. I want it to fail fast so I can see the first issue not crawl the stack for error logs.

un_montagnard

I personally prefer seeing everything that I broke, not just the first one

Izkata

I vaguely remember a test framework I saw a decade+ ago that had both "assert*" to fail immediately and something else ("expect*" maybe?) to check and continue.

411111111111111

    it('fails', async () => {
      expect(await somePromise()).toBe(undefined)
      expect(await someOtherPromise()).toBeDefined()
    })
No chance figuring out where it failed, it's likely just gonna run into a test suite timeout with no line reference or anything.

latch

I haven't heard of the single-assertion thing in at least 10 years, probably 15. In the early 2000s, when I was starting out and doing .NET, it used to be something you'd hear in the community as a very general guideline, more like "there's something to be said about very focused tests, and too many assertions might be a smell." At the time, I got the impression that the practice had come over from Java and converted from a rule to a guideline (hardly the only bad practice that the .NET community adopted from Java, but thankfully they largely did move the needle forward in most cases).

(I wrote Foundations of Programming for any 2000s .NET developer out there!)

To hear this is still a fight people are having...It really makes me appreciate the value of having deep experience in multiple languages/communities/frameworks. Some people are really stuck in the same year of their 10 (or 20, or 30) years of experience.

ed25519FUUU

Like you I was surprised to hear this is a thing or is even controversial. Admittedly I've only been programming for about 10 years, but I haven't heard (or seen) this come up even one time. Every test I've ever seen has usually had multiple mutations and assertions, all of them testing the same premise.

asabla

> I wrote Foundations of Programming for any 2000s .NET developer out there!

Holy moly! Think I still have your book somewhere. So thank you for that.

In my last +10 years of .net development I haven't heard anything about single-assertion.

> Some people are really stuck in the same year of their 10 (or 20, or 30) years of experience.

I think this has manifested even more with the transition into .net core and now .net 5 and beyond. There are so many things changing all the time (not that I complain), which can make it difficult to pick up what's the current mantra for the language and framework.

choeger

What? People really would criticize that code because it has two assertions? How are they ever testing any state changes?

And to the author: Your bubble is significantly different from mine. Pretty much every competent developer I've worked with would laugh at you for the idea that the second test case would not be perfectly fine. (But that first iteration would never pass code review either because it does nothing and thus is a waste of effort.)

pydry

There's a lot of not very competent people in the industry who cling tightly to dogma.

Testing (especially unit) is an area of tech weirdly with a lot of dogmatism. I think Uncle Bob is the source of some of it.

musingsole

I'm convinced if you read Uncle Bob carefully and follow all his suggestions... you'll have completely incapacitated whatever organization you infiltrated.

klysm

Then you need to hire consultants to come fix it!

drewcoo

> But that first iteration would never pass code review either because it does nothing and thus is a waste of effort.)

That first iteration would not be subject to code review. The author is using TDD.

https://en.wikipedia.org/wiki/Test-driven_development

mollerhoj

To answer your question: We zealots test for the fact that something changes to some degree. E.g with rubys rspec library:

    expect { foo.call() }.to change { bar.value }.by(2)

That is, regardless of the absolute value of bar.value, I expect foo.call() to increment it by 2.

The point of the 1-assertion-per-test guideline is to end up with tests that are more focused. Given that you did not seem to think of the above technique, I'd say this guideline might just have helped you discover a way to write better specs ;-)

Guidelines (that is, not rules) are of course allowed to be broken if you have a good reason to do so. But not knowing about common idioms is not a good reason.

You might argue that the above code is just sugar for 2 assertions, but that's beside the point: the test is more focused, there -appears- to be only one assertion, and that's what matters.
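In Python terms, the same idiom can be sketched with a context manager. The `changes_by` helper below is hypothetical (it is not a standard library or pytest API):

```python
from contextlib import contextmanager

@contextmanager
def changes_by(getter, delta):
    """Assert that the wrapped block changes getter() by exactly delta."""
    before = getter()
    yield
    actual = getter() - before
    assert actual == delta, f"expected change of {delta}, got {actual}"

counter = {"value": 40}

def bump():
    counter["value"] += 2

# regardless of the absolute value, we only assert the delta
with changes_by(lambda: counter["value"], 2):
    bump()
print("changed by 2")
```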

philliphaydon

That’s because the example test only requires 1 assertion.

Any rule that says there should be only 1 assertion ever is stupid.

mollerhoj

OP asked how any state change would be tested with a single 'assertion' and I provided an answer. Absolute rules are stupid, but our codebase has just short of 10k tests, and very few have more than one assertion.

The only reason I can really see to have more than one assertion would be to avoid having to run the setup/teardown multiple times. However, it's usually a desirable goal to write code that requires little setup/teardown to test anyway, because that comes with other benefits. Again, it might not be practical or even possible, but that goes for almost all programming "rules".

krona

You tested a postcondition. What about preconditions and invariants, do you have separate tests for those assertions too, or just not bother?

mollerhoj

Please correct me if I'm wrong, but wouldn't a precondition just be the postcondition of the setup?

Invariants would either have to be publicly available, and thus easily testable with similar methods, or one would have to use assertions in the implementation.

I try to avoid the latter, as it mixes implementation and tests/invariants. Granted, there are situations (usually in code that implements something very 'algorithm'-ish) where inline assertions are so useful that it would be silly to avoid them. (But implementing algos from scratch is rare in commercial code.)

AlphaSite

I think a much better rule of thumb is: “A lot of small unit tests are better than a few big ones”. Same thing, but clearer intent and less rigid.

hinkley

Unit tests should be cheap. Cheap to write, cheap to run, cheap to read, cheap to replace.

Near as I can tell, many people are made uncomfortable by this in practice because these tests feel childish and, dare I say, demeaning. So they try to do something "sophisticated" instead, which is a slow and lingering death where tests are concerned.

Lacking self consciousness, you can whack out hundreds of unit tests in a couple of days, and rewrite ten of someone else’s for a feature or a bug fix. That’s fine and good.

But when your test looks like an integration test, rewriting it misses boundary conditions because the test isn't clear about what it's doing. And then you have silent regressions in code with high coverage. What a mess.

choeger

I think you forgot at least one valid assertion and implied another one:

foo.call() might have a return value.

Also, the whole story invocation shouldn't throw an exception, if your language has them. This assertion is often implied (and that's fine), but it's still there.

Finally, the test case is a little bit stupid, because code very seldom has no input that changes the behavior/result. So your assertion would usually involve that input.

If you follow that thought consistently, you end up with property-based tests very soon. But property-based tests should have as many assertions as possible for a single point of data. Say you test addition. When writing property-based tests you would end up with three specifications: one for one number, testing the identity element and the relationship to increments. Another one for two numbers, testing commutativity and inversion via subtraction, and one for three numbers, testing associativity. In every case it would be very weird to not have all n-ary assertions for the addition operation in the same spot.
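The three specifications described above can be sketched as plain randomized property checks (no property-testing library, just loops over random ints):

```python
import random

for _ in range(100):
    a, b, c = (random.randint(-1000, 1000) for _ in range(3))
    # one number: identity element and the relationship to increments
    assert a + 0 == a
    assert (a + 1) - 1 == a
    # two numbers: commutativity and inversion via subtraction
    assert a + b == b + a
    assert (a + b) - b == a
    # three numbers: associativity
    assert (a + b) + c == a + (b + c)
print("all addition properties hold")
```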

mollerhoj

When you say I "forgot" an assertion, are you implying that tests should include all possible assertions on the code? That would perhaps cover more surface, but my goal (read: zealot ideology) here is to have the tests help document the code:

test "pressing d key makes mario move 2 pixels right" {

expect { keyboard.d() }.to change { mario.x }.by(2)

}

I could test the value of the d() function, but I dont because I don't care what it returns.

Didn't understand the "whole story invocation" and exception part; am I missing some context?

Sure, property-based testing can be invaluable in many situations. The only downside is if the tests become so complex to reason about that bugs become more likely in the tests than in the implementation.

I've sometimes made tests with a manual list of inputs and a list of expected outputs for each. I'd still call those 1-assertion tests (just run multiple times), so my definition of 1 assertion might be too broad.

hinkley

When you get the suites nested and configured right, and the code decomposed properly to support it, each of these assertions is two lines of code, plus the description of each constraint. So you just write four or five tests covering each one, in descending likelihood of breakage.

alkonaut

An assert message says what went wrong, and on which code line. How on earth does it help to have just one? The arrange part might take seconds for a nontrivial test, and that would need to be duplicated, both in code and in execution time, to make two asserts.

If you painstakingly craft a scenario where you create a rectangle of a specific expected size why wouldn’t it be acceptable to assert both the width and height of the the rectangle after you have created it?

assert_equal(20, w, …

assert_equal(10, h, …

A dogmatic rule would just lead to an objectively worse test where you assert an expression containing both width and height in a single assert?

assert_true(w == 20 && h == 10,…)

So I can only assume the rule also prohibits any compound/Boolean expressions in the asserts then? Otherwise you can just combine any number of asserts into one (including mutating state within the expression itself to emulate multiple asserts with mutation between)!
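The point can be made concrete in a short Python sketch (the names and expected values are illustrative): folding the two asserts into one boolean expression satisfies a one-assertion rule, but the asserted expression no longer names which conjunct failed.

```python
def assert_size(w, h):
    # Two asserts: each failure message pinpoints the offending dimension.
    assert w == 20, f"width: expected 20, got {w}"
    assert h == 10, f"height: expected 10, got {h}"

def assert_size_combined(w, h):
    # One "assertion": the expression `w == 20 and h == 10` no longer tells
    # you which conjunct failed without re-reading and re-evaluating it.
    assert w == 20 and h == 10, f"size: expected 20x10, got {w}x{h}"
```

Both functions check the same facts; only the granularity of the failure report differs.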

ljm

I’ve seen people take a dogmatic approach to this in Ruby without really applying any critical thought, because one assertion per test means your test is ‘clean’.

The part that is glossed over is that the test suite takes several hours to run on your machine, so you delegate it to a CI pipeline and then fork out for parallel execution (pun intended) and complex layers of caching so your suite takes 15 minutes rather than 2 and a half hours. It’s monumentally wasteful and the tests aren’t any easier to follow because of it.

The suite doesn’t have to be that slow, but it’s inevitable when every single assertion requires application state to be rebuilt from scratch, even when no state is expected to change between assertions, especially when you’re just doing assertions like ‘assert http status is 201’ and ‘assert response body is someJson’.
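One stdlib way to keep assertions separate without rebuilding state per assertion is to build the expensive state once per suite. A sketch using Python's unittest, where `fake_signup` is a placeholder for whatever slow arrange step (DB insert, HTTP call) a real suite has:

```python
import unittest

def fake_signup():
    # Stand-in for an expensive arrange step; a real suite would hit a
    # database or an HTTP endpoint here.
    return {"status": 201, "body": {"ok": True}}

class SignupResponseTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Run the expensive arrange step exactly once for the whole class.
        cls.response = fake_signup()

    def test_status_created(self):
        self.assertEqual(self.response["status"], 201)

    def test_body_ok(self):
        self.assertEqual(self.response["body"], {"ok": True})
```

Each assertion still gets reported individually, but the state is only built once; the trade-off is that the tests now share fixture state and must not mutate it.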

hamandcheese

> I’ve seen people take a dogmatic approach to this in Ruby

I came up in ruby, heard this, and quickly decided it was stupid.

mollerhoj

Yes, you got us rubyists there. :-( It's the unfortunate result of trying to avoid premature optimization and strive for clarity instead, something that's usually sound advice.

Engineering decisions have tradeoffs. When the test suite becomes too slow, it might be time to reconsider those tradeoffs.

Usually, though, I find that the road to fast tests is to reduce/remove slow things (almost always some form of IO), not to combine 10 small tests into one big one.

ljm

I think it’s a sound strategy more often than not, it’s just that RSpec’s DSL can make those trade-offs unclear, especially if you use Rubocop and follow its default RSpec rules.

It just so happens that your tests become IO bound because those small tests in aggregate hammer your DB and the network purely to set up state. So if you only do it once by being more deliberate with your tests, you’re in a better place.

int_19h

I'd argue that it's the unfortunate result of Ruby being at the center of the Agile and XP scene back when it first became prominent (the manifesto etc) - because that scene is also where the more cultish varieties of TDD originated.

OJFord

> I’ve seen people take a dogmatic approach to this in Ruby without really applying any critical thought, because one assertion per test means your test is ‘clean’.

I can't speak for Ruby, but what I would call 'clean' and happily dogmatise is that assertions should come at the end, after setup and exercise.

I don't care how many there are, but they come last. I really hate tests that look like:

    setup()
    
    exercise(but_not_for_the_last_time)

    assert state.currently = blah

    state.do_other()
    something(state)

    assert state.now = blegh
And so on. It stinks of an integration test forced into a unit testing framework.

I like them to look like:

    foo = FooFactory(prop="whatever")

    result = do(foo)

    assert result == "bar"
I.e. some setup, something clearly under test, and then the assertion(s) checking the result.

ljm

I think even with integration tests they should still be treated similarly - at the end of the day you are setting expectations on an output given a certain input, there’s just a lot more going on in between.

There’s no avoiding it though when you want something end-to-end, or a synthetic test. You’re piling up a whole succession of stateful actions and if you tested them in isolation you would fail to capture bugs that depend on state. In that sense, better to run a ‘signup, authenticate and onboard’ flow in one test instead of breaking it down.

int_19h

Here's a trivial rewrite that satisfies your dogmatic requirement without any meaningful difference:

    setup()
    
    exercise(but_not_for_the_last_time)

    was_blah = (state.currently == blah)

    state.do_other()
    something(state)

    assert was_blah && state.now == blegh
In fact, this last version is worse: if do_other() can fail when state wasn't blah, the exception from that failure interrupts the test before the assert is ever reported.

TeMPOraL

> So I can only assume the rule also prohibits any compound/Boolean expressions in the asserts then? Otherwise you can just combine any number of asserts into one

That's what's bound to happen under that rule. People just start writing their complex tests in helper functions, and then write

  assert_true(myTotallySimpleAndFocusedTestOn(result))

ImPleadThe5th

The way it's been explained to me is that because one assert failing blocks the other assertions from running, you don't get a "full" picture of what went wrong.

So instead of:

- error W doesn't equal 20

Fix that

Run test again

- error H doesn't equal 10

Fix that

Run test again

It's:

- Error Width doesn't equal 20

- Error Height doesn't equal 10

Fix both

Run test

I think the time savings are negligible though. And it makes testing even more tedious, as if people needed any additional reasons to avoid writing tests.

hinkley

Only a few programming languages have a facility to render that second assertion in a human readable way (python surprised me with this). Most C influenced languages will just present you with “assertion failed” or “expected true to be false” which means nothing. Test failure messages should be actionable, and that action is not, “read the test to see what went wrong”.
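Python illustrates both sides of this. Without pytest's assertion rewriting, a bare assert carries no information at all (the exception's message is the empty string), whereas an explicit message makes the failure actionable on its own:

```python
def check_width(w):
    # Bare assert: plain Python reports nothing beyond "AssertionError".
    assert w == 20

def check_width_described(w):
    # With a message, the failure is actionable without reading the test.
    assert w == 20, f"expected width 20, got {w}"

try:
    check_width(19)
except AssertionError as exc:
    print(repr(str(exc)))  # prints ''
```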

jeremyjh

Apparently this is yet another “best practice” preached by people using crappy tools.

flavius29663

That can be considered one logical assertion though. You're asserting the size of the rectangle. You can even extract an assert helper function AssertRectangleSize(20,10)
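A hypothetical version of that helper in Python (the dict-based rectangle is purely illustrative): one logical assertion about the size, spelled as two physical asserts so a failure still names the offending dimension.

```python
def assert_rectangle_size(rect, expected_w, expected_h):
    # One logical assertion (the rectangle's size), two physical asserts,
    # each with a message pinpointing the failing dimension.
    assert rect["w"] == expected_w, (
        f"width: expected {expected_w}, got {rect['w']}")
    assert rect["h"] == expected_h, (
        f"height: expected {expected_h}, got {rect['h']}")

assert_rectangle_size({"w": 20, "h": 10}, 20, 10)  # passes silently
```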

alkonaut

Exactly. But if I assert N properties of an object is that then one assert logically for any N? At what point does it stop?

Applying any rule dogmatically is often bad, and this is no exception. The problem is that we don't like lacking rules. It especially goes to hell when people start adding code analysis to enforce the rule, and then developers start writing poor code that passes the analysis.

IMO, one assert isn't even a good starting point that merely needs occasional exceptions.

flavius29663

It shouldn't be dogmatic, but I think it should be something to think about when writing the test or reviewing one. A test should be single responsibility too, in order for it to not be brittle.

I disagree; it can be a good starting point for most cases. You should be able to condense your test into 3 steps (arrange, act, assert), each one a single line. Even when you don't manage it, because the setup is too complicated to be worth simplifying for a single test, or because you assert more things, etc., I think the mental exercise of asking "can this be made a 3-line test?" is invaluable in writing maintainable tests.

> is that then one assert logically for any N? At what point does it stop?

This is one of the hard things about good tests; they are a little bit of art too. Maybe you can apply single responsibility like I said before: the test should change for one reason only. By one reason I mean one "person/role": it should change if the CFO of one of our clients wants something different, or if Mark from IT wants some change.

I don't stress or enforce single asserts too much; I feel tests allow a little bit of leeway in many ways, as long as the decision enhances expressiveness. If the extra lines are not making the test clearer, if a single assert would be clearer for the story the test is telling, then it should go into a single assert. If I can break the story into multiple stories that still make sense, then I do that, so that each story has its own strong storyline.

msaltz

I think two important requirements for good unit tests are that 1) If you look at a particular test, you can easily tell exactly what it's testing and what the expected behavior is, and 2) You are able to easily look at the tests in aggregate and determine whether they're covering all the behavior that you want to test.

To that end, I think a better guideline than "only have one assertion per test" would be "only test one behavior per test". So if you're writing a test for appending an element to a vector, it's probably fine to assert that the size increased by one AND assert that the last element in the vector is now the element that you inserted.
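That vector example might look like this in Python (using a list as the stand-in for a vector):

```python
def test_append_adds_element_at_end():
    v = [1, 2]
    v.append(3)
    # One behavior (append), verified with two closely related assertions.
    assert len(v) == 3   # the size increased by one
    assert v[-1] == 3    # the new element is at the end

test_append_adds_element_at_end()
```

Both assertions describe the same behavior, so a failure of either points at the same operation.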

The thing I see people do that's more problematic is to basically pile up assertions in a single test, so that the inputs and outputs for the behavior become unclear, and you have to keep track of the intermediate state of the object being tested in your head (assuming they're testing an object). For instance, they might use the same vector, which starts out empty, test that it's empty; then add an element, then test that its size is one; then remove the element, test that the size is 0 again; then resize it, etc. I think that's the kind of testing that the "one assertion per test" rule was designed to target.

With a vector it's easy enough to track what's going on, but it's much harder to see what the discrete behaviors being tested are. With a more complex object, tracking the internal state as the tests go along can be way more difficult. It's a lot better IMO to have a bunch of different tests with clear names for what they're testing that properly set up the state in a way that's explicit. It's then easier to satisfy the above two requirements I listed.

I want to be able to look at a test and know exactly what it's testing. I don't mind if a little bit of code is repeated - you can make functions if you need help with the test setup and teardown.

jmillikin

Is it weird that not only have I never heard of the "rule" this post argues against, but I can't even conceive of a code structure where it would make sense?

How would a test suite with one assertion per test work? Do you have all the test logic in a shared fixture and then dozens of single-assertion tests? And does that rule completely rule out the common testing pattern of a "golden checkpoint"?

I tried googling for that rule and just came up with page after page of people arguing against it. Who is for it?

mollerhoj

I believe theres a rubocop that checks for it:

https://docs.rubocop.org/rubocop-minitest/cops_minitest.html...

raverbashing

Nothing like machine-enforced nitpicking of the worst kind.

But even this rule/plugin has a default value of 3 which is a more sane value.

littlecranky67

> Is it weird that not only have I never heard of the "rule" this post argues against

This "rule" is known mostly because it is featured in the "Clean Code" book by Robert C. Martin (Uncle Bob). You should have heard of it ;)

jmillikin

Heard of it, but never read it.

Looking at the Amazon listing and a third-party summary[0] it seems to be the sort of wool-brained code astrology that was popular twenty years ago when people were trying to push "extreme programming" and TDD.

[0] https://gist.github.com/wojteklu/73c6914cc446146b8b533c0988c...

willsmith72

Wait what's wrong with XP and TDD? Where I'm from those are absolutely considered best practice for most situations

TeMPOraL

Perhaps the author is better off not having heard of it then, and by implication, not having read "Clean Code" in the first place. The book is full of anti-patterns.

matthews2

Are there any "rules" in Clean Code that don't need to be disregarded and burned? https://qntm.org/clean

pydry

There's plenty of sensible advice in there; it's just that he argues for the sensible stuff and the idiotic stuff with equal levels of conviction, and if you're junior you aren't going to be able to distinguish them.

It would be easier if it were all terrible advice.

undefined

[deleted]

teddyh

Where in that book is the rule stated? I ask because I have heard the author explicitly state that multiple assertions are fine (using essentially the same explanation as TrianguloY did in this comment: https://news.ycombinator.com/item?id=33480120).

MikeDelta

Chapter 9 talks about unit tests and there is a paragraph called 'Single Assert per Test', where he says it is a good concept but that he is not afraid to put more asserts in his tests.

That paragraph is followed by 'Single Concept per Test' where he starts with: 'Perhaps a better rule is that we want to test a single concept per test.'

So, technically he doesn't say it.

daviddever23box

It feels as if folks are splitting hairs where a haircut (or at least a half-hearted attempt at grooming) is required. Use good judgment, and do not blindly follow "rules" without understanding their effects.

Kwpolska

The book is so full of bad advice I'm not surprised this "rule" comes from there as well.

teddyh

As MikeDelta reports¹, the book doesn’t actually say that.

I’ve come to learn to completely disregard any non-specific criticism of that book (and its author). There is apparently a large group of people who hate everything he does and also, seemingly, him personally. Everywhere he (or any of his books) is mentioned, the haters come out, with their vague “it’s all bad” and the old standard “I don’t know where to begin”. Serious criticism can be found (if you look for it), and the author himself welcomes it, but the enormous hate parade is scary to see.

1. https://news.ycombinator.com/item?id=33480517

undefined

[deleted]

lodovic

Wait - people are doing real http calls in unit tests, over a network, and complaining about multiple asserts in the test code?

flqn

It's not unusual to spin up a local dev server in the same environment (or on the same machine, i.e. at localhost) as the tests. There's an argument to say these aren't "unit" tests, but your definition of "unit" may vary.

chopin

Where I work, milestone and release builds cannot make HTTP calls other than to our Maven repo. It's meant to further reproducible builds, but it also means your tests can't make such calls. I fire up an in-memory database to make my tests self-contained.

mollerhoj

Hopefully the responses are stored locally and replayed on subsequent runs.

tinyspacewizard

Still a bit flaky. In an OOP language, mocking is appropriate; in an FP language, defunctionalization.

mollerhoj

Heh, yeah, but it can be used to write tests that check your assumptions about a third-party API. Granted, it'll only fail once the tests are rerun without the cache, but it can still be a valuable technique. It can be valuable to have a test suite that a) helps check assumptions when implementing the connection and b) helps locate what part of it later starts to behave unexpectedly.
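A minimal record/replay sketch of that idea, using only the Python stdlib; `do_fetch` stands in for whatever real HTTP client the suite uses, and the cache path is arbitrary:

```python
import json
import os

CACHE_PATH = "api_cache.json"  # arbitrary location for recorded responses

def cached_fetch(url, do_fetch):
    # First run: call do_fetch (the real HTTP call) and record the response.
    # Later runs: replay the stored response, so the suite stays fast and
    # deterministic until the cache is deliberately cleared.
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
    if url not in cache:
        cache[url] = do_fetch(url)
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache[url]
```

Libraries like Ruby's VCR implement the same pattern with proper request matching and cassette management; this sketch only shows the shape of it.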

batiste

I have met people who think HTTP calls are great!

- "But it tests more things!"

Well ok, but those are integration tests, not unit tests... It is unacceptable that a unit test can fail because of an external system...

int_19h

It's far better to have integration tests without unit tests than the other way around.

undefined

[deleted]

warrenmiller

They can be mocked surely?

drewcoo

It's not nice to mock people "doing real http calls in unit tests," even if they deserve it.

nsoonhui

For mocking, the parent comment means "swapping out the external calls with dummy calls", not "laughing at the developer"

tester756

define "real"

yes, we perform "somewhat real tests" - tests start the app, fake db, and call HTTP APIs

it's really decent

wccrawford

First off, I do put more than 1 assertion in a test. But it definitely leads to situations where you have to investigate why a test failed, instead of it just being obvious. Like the article, I test 1 thing per test, but sometimes that means multiple assertions about the outcome of a test.

IMO there's no point in checking that you got a response in 1 test, and then checking the content/result of that response in another test. The useful portion of that test is the response bit.

tetha

IMO, the opposite also has to be considered. I've briefly worked with some code bases that absolutely did 1 assert per test. Essentially you'd have a helper method like "doCreateFooWithoutBarAttribute", and 3-4 tests around that - "check that response code is 400", "check that error message exists", and so on. Changes easily caused 4-5 tests to fail all at once, for example because the POST now returned a 404, but the 404 response also doesn't contain the error message and so on.

This also wasted time, because you always had to look at the tests, and eventually realized that they all failed from the same root cause. And sure, you can use test dependencies if your framework has that and do all manner of things... or you just put the asserts in the same test with a good message.

_lambdalambda

Even with multiple assertions, the test failure reason should be quite clear, as most testing frameworks allow you to specify a message that is then output in the testing summary.

E.g. `assertEqual(actual_return_code, 200, "bad status code")` should lead to output like `FAILED: test_when_delete_user_then_ok (bad status code, expected 200 got 404)`
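Python's unittest behaves this way out of the box; the probe helper below is illustrative and just captures the message assertEqual would report:

```python
import unittest

class _Probe(unittest.TestCase):
    # Minimal TestCase so assertEqual can be called outside a runner.
    def runTest(self):
        pass

def failure_message(actual, expected, msg):
    # Returns the message unittest would report for a failing assertEqual.
    try:
        _Probe().assertEqual(actual, expected, msg)
    except AssertionError as exc:
        return str(exc)

# failure_message(404, 200, "bad status code")
# → something like '404 != 200 : bad status code'
```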

TeMPOraL

Even if you don't specify the message, at the very minimum, the output should look like:

  FAILED: test_when_delete_user_then_ok
    Assertion failed: `actual_return_code' expected `200', got `400'.
Note it mentions the actual expression put in the assert. Which makes it almost always uniquely identifiable within the test.

That's the bare minimum I'd expect of a testing framework - if it can't do that, then what's the point of having it? It's probably better to just write your own executable and throw exceptions in conditionals.

What I expect from a testing framework is at least this:

  FAILED: test_when_delete_user_then_ok
    Assertion failed: `actual_return_code' expected `200', got `400'.
    In file: '/src/blah/bleh/blop/RestApiTests.cs:212'.
I.e. to also identify the file and the line containing the failing assertion.

If your testing framework doesn't do that, then again, what's even the point of using it? Throwing an exception or calling language's built-in assert() on a conditional will likely provide at least the file+line.

philliphaydon

Maybe it’s different in other languages but in JS and .NET the failed assertion fails and you investigate the failed assertion. You wouldn’t ever have a situation that isn’t obvious.

If an assertion says “expected count to be 5 but got 4” you wouldn’t be looking at the not null check assertion getting confused why it’s not null…

iLoveOncall

> IMO there's no point in checking that you got a response in 1 test, and then checking the content/result of that response in another test. The useful portion of that test is the response bit.

If I understood this part correctly, you are making the dangerous assumption that your tests will run in a particular order.

wccrawford

No, I definitely am not making that assumption. With a bad response, but a good response code, 1 test would fail and the other would succeed, no matter the order. I just don't think that that valid response code is a useful test on its own. It's much better with both assertions in the same test, unless you have some reason to think that response code failure would signify something special on its own.

daviddever23box

...which is why it may be worthwhile to chain some assertions together in a single test.

commandlinefan

> have to investigate why a test failed

Still better than investigating why the whole system failed, though.

bazoom42

Unit tests are one of the many great ideas in our industry that have been undermined by people treating them as a set of rituals rather than a tool.

seadan83

I wonder how much of this is the journeyman problem (aka, the expert beginner)

I believe writing test code is its own skill. Hence, like a coder learning SRP and dogmatically applying it, so does a person that is forced to write unit tests without deep understanding. (And of course, bad abstractions are worse than code duplication)

I think it's very possible to have a developer with 10 years of experience but effectively only 2 years of experience building automated test suites. (Particularly if they came from a time before automated testing, or if the testing and automated tests were someone else's job.)

rhdunn

I view it more as "only test one operation per unit test". If that needs multiple asserts (status code, response content, response mime type, etc.) to verify, that is fine.

IIUC, the guideline is so that when a test fails you know what the issue is. Therefore, if you are testing more than one condition (missing parameter, invalid value, negative number, etc.) it is harder to tell which of those conditions is failing, whereas if you have each condition as a separate test it is clear which is causing the failure.

Separate tests also mean that the other tests still run, so you don't have any hidden failures. You will get hidden failures if you use multiple assertions for the conditions, so you'll need to re-run the tests multiple times to pick up and fix all the failures. If you are happy with that (e.g. your build-test cycle is fast), then having multiple assertions is fine.
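Python's `unittest.subTest` is one stdlib way to get per-condition reporting without writing separate tests: every failing condition is reported in a single run. A sketch, where `is_valid_quantity` is an invented example validator:

```python
import unittest

def is_valid_quantity(s):
    # Illustrative validator: accepts strings of positive integers only.
    return s.isdigit() and int(s) > 0

class QuantityValidationTest(unittest.TestCase):
    def test_rejects_bad_inputs(self):
        cases = [("", "empty"), ("-1", "negative"), ("abc", "non-numeric")]
        for value, label in cases:
            # subTest reports *every* failing condition in one run,
            # instead of stopping at the first and hiding the rest.
            with self.subTest(condition=label):
                self.assertFalse(is_valid_quantity(value))
```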

Ultimately, structure your tests in a way that best conveys what is being tested and the conditions it is being tested under (e.g. partition class, failure condition, or logic/input variant).

ronnier

I’ve come to the conclusion that none of this matters for most parts of a system. I’ve worked in the most horrendous code and systems you can imagine, yet they turned into a multi-billion-dollar company. Then everyone starts talking about code quality, rewrites, etc., and new features stall while beautiful systems are written and high test coverage is met, and competing companies surpass us and take market share with new and better features. We’ve gotten religious over code and tests in the software industry and should probably shift back some.

dxdm

I've worked on shitty code with shitty tests that ran the core of the business. Even while doing that, it was horrible to work with, held important features back and drove talented people away, leaving everything to stagnate in a "this works enough" state. When the wind changed, it was hard to turn the ship around, important people got nervous, and things got into a bad spiral.

None of this is the failure of code and tests alone; but both can be indicative of the structural health and resilience of the wider situation.

ascendantlogic

I've always operated on the principle that each test should be testing one logical concept. That usually translates into one concrete assertion in the test code itself, but if you need to assert multiple concrete conditions to test the logical concept, it's not a bad thing. At the end of the day, tests are just there to make you more confident in code changes, and they are a tax you have to pay for that comfort. However you arrive at "I feel comfortable making changes", go with it.

BlargMcLarg

Out of everything coming out of testing and the pain of testing old code, this seems like such a trivial thing to discuss. Then again, automated testing seems to be a breeding ground for inane discussions that waste more time than picking a less-than-ideal solution and moving on.


Multiple assertions are fine in a unit test - Hacker News