Favor real dependencies for unit testing

January 12, 2022


Amen to the idea.

- Prefer real objects, or fakes, over mocks. It will usually make your tests more robust.

- Use mocks when you must: to avoid networking, or other flaky things such as storage.

- Use mocks for “output-only objects”, for example listeners, or when verifying the output of some logging. (But prefer a good fake.)

- Use mocks when you “need to get shit done”, it’s the easiest way to add tests in an area that has almost none, and the code is not designed to be easily testable. But remember this is tech debt, and try to migrate towards real objects over time.

That’s the short advice I’ve given many times, so I might as well comment with it here.
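A minimal Python sketch of the contrast (all names here are made up for illustration): the fake-based test pins down behavior, while the mock-based test also pins down the exact calls, which is what tends to make it brittle.

```python
from unittest import mock

# Hypothetical production code under test.
class UserService:
    def __init__(self, store):
        self.store = store

    def register(self, name):
        self.store.save(name)
        return self.store.count()

# A fake: a real, if simplified, implementation of the store contract.
class InMemoryStore:
    def __init__(self):
        self.items = []

    def save(self, item):
        self.items.append(item)

    def count(self):
        return len(self.items)

# Fake-based test: survives refactors that keep behavior the same.
def test_register_with_fake():
    service = UserService(InMemoryStore())
    assert service.register("alice") == 1

# Mock-based test: pinned to the exact calls made, so it breaks if the
# implementation is restructured even when behavior is unchanged.
def test_register_with_mock():
    store = mock.Mock()
    store.count.return_value = 1
    service = UserService(store)
    assert service.register("alice") == 1
    store.save.assert_called_once_with("alice")
```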


I think the original ideas of mocks (if you go and read Growing Object-Oriented Software, Guided by Tests) had some merit: In that style of TDD, mocks are used to discover (hopefully somewhat stable) interfaces between components, and in theory, it fits with the idea that OOP is about "objects sending messages to each other". I can believe that it's possible to write good systems with this kind of approach.

Unfortunately, in practice, mocks are rarely used like that and most "OOP" designs have horrible boundaries and are really not much about message passing anymore. That leads to brittle mocks where you constantly have to change tests when you change implementation details.

I have also gravitated away from classic OOP and much more towards the "functional core, imperative shell" concept as outlined in the article (although it's difficult to keep this pattern throughout a codebase, especially if you have team members). In such a system you really rarely need mocks.

Agreed that fakes, when you have them, are nicer than mocks, especially when the system to be faked has a large API (i.e. use a redis fake, instead of checking the exact commands you send to redis).

However, for some outside systems, writing a fake can be a lot of effort. In such a case, I think it's totally valid to write a "gateway class" that isn't unit tested (you can cover it with integration tests instead) which exposes a nice API (e.g. "storeFile(...)") and then to use mocks of that class in other tests.
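A sketch of that gateway idea (the class, the SDK call, and the bucket name are all invented): the gateway hides the messy client behind one small method, and other tests mock only that narrow surface.

```python
from unittest import mock

# Hypothetical "gateway class" wrapping an external blob store.
# The gateway itself is covered by integration tests, not unit tests.
class FileGateway:
    def __init__(self, client):
        self.client = client  # e.g. some cloud-storage SDK client

    def store_file(self, name, data):
        # All the messy SDK details live behind this one simple call.
        self.client.put_object(bucket="uploads", key=name, body=data)

# Code under test depends only on the gateway's narrow API.
def export_report(rows, gateway):
    body = "\n".join(",".join(r) for r in rows).encode()
    gateway.store_file("report.csv", body)

def test_export_report():
    gateway = mock.Mock(spec=FileGateway)
    export_report([("a", "b"), ("1", "2")], gateway)
    gateway.store_file.assert_called_once_with("report.csv", b"a,b\n1,2")
```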


In test cases where you extensively involve mocks, more often than not in my experience, you end up testing that your mocks do the thing you told them to.
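A tiny illustration of that trap (hypothetical code): the stub's value flows straight into the assertion, so the test mostly re-states the stub rather than exercising any interesting logic.

```python
from unittest import mock

def total_price(pricer, items):
    return sum(pricer.price_of(i) for i in items)

def test_total_price_tautology():
    pricer = mock.Mock()
    pricer.price_of.return_value = 10
    # We stubbed price_of to return 10, then "verify" that two items
    # cost 20 -- the assertion is largely an echo of the stub itself.
    assert total_price(pricer, ["a", "b"]) == 20
```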


Yeah, particularly the case with a lot of "glue" type code that's really just passing stuff back and forth and not really making any decisions with it. I've always struggled to feel that mock-based testing in this scenario was anything other than busy-work.


Yes, but the coverage numbers look great!


most of the bugs are in the joints of the system, not in the components

it's much easier to write modules that are internally consistent, much harder to be globally consistent across modules. mocks ensure you only test for internal consistency


Wouldn't the integration (or global consistency as you call it) of objects be exactly what was worth testing in that's the hard part?

Generally objects are simple enough that I can reason about them in my head. That's the whole point of encapsulating the state after all. That means tests are really less critical, since a thorough inspection should do. Between components it's MUCH harder to get any sort of coverage in your internal model, so tests that can be repeated become more useful.

I want the tests to fail if the software doesn't work. Not if some object doesn't do what it says it will in a way that doesn't matter to the system.

The hard part is exactly what I want tests to cover.


Mocks don't test the joints, which you have just suggested are where the bugs are.

Fakes get you a lot further without needing more comprehensive integration testing.

Fakes simulate dependencies - so you can test the joints - but can be tested themselves, and most importantly can have conformance tests that validate that the fake acts like what it is faking.
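One common shape for such conformance tests is a single contract-checking function run against every implementation. A hypothetical sketch:

```python
# A fake key-value store with a tiny, real implementation.
class InMemoryKeyValueStore:
    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

# The conformance ("contract") test: one function, exercised against
# both the fake and the real store, so the fake can't silently drift.
def check_kv_contract(store):
    store.put("k", "v")
    assert store.get("k") == "v"
    assert store.get("missing", "fallback") == "fallback"

# Fast path: run against the fake on every commit. A slower integration
# job would run the same function against the real dependency.
check_kv_contract(InMemoryKeyValueStore())
```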


I have not found storage to be flaky, so I don't mock it. Tmpfile always gives me a unique file, and that's all the fake I need. I don't even look up the various forms of temp file to see which ones don't have a race condition, as in practice they never do (if I was writing encryption or other such production code I would, but for a unit test the odds of a race causing a failure are low enough to ignore)
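For illustration, a test in that spirit using Python's tempfile module: real files on a real filesystem, no mock, and the temporary directory cleans itself up.

```python
import os
import tempfile

# Hypothetical code under test: trivial settings persistence.
def save_settings(path, text):
    with open(path, "w") as f:
        f.write(text)

def load_settings(path):
    with open(path) as f:
        return f.read()

def test_settings_roundtrip():
    # A throwaway directory per test run; removed when the block exits.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "settings.conf")
        save_settings(path, "debug=true")
        assert load_settings(path) == "debug=true"
```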


At minimum, you need to be able to choose where the files are stored in order to use temp files.

I find there is still the problem of complicated setup. Also, if you are going for particular semantics (e.g. cross-process interactions), isolating that for more targeted testing can be helpful.

For me, I look for higher-level abstractions and mock when possible, and don't sweat testing against temp files otherwise. I've had several TDD people jump to wanting to mock each filesystem call. One was for a cross-process storage API. I was trying to get them to just have a Backend interface for reading and writing instead, with tests using an InMemory implementation as a test double.
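A sketch of that Backend idea (names invented): one small interface, a real filesystem implementation in production, and an in-memory double in tests, instead of mocking individual filesystem calls.

```python
from abc import ABC, abstractmethod

# The higher-level abstraction: a small Backend interface for
# reading and writing, rather than mocks of each filesystem call.
class Backend(ABC):
    @abstractmethod
    def read(self, key):
        ...

    @abstractmethod
    def write(self, key, data):
        ...

# The in-memory test double: a real implementation of the interface.
class InMemoryBackend(Backend):
    def __init__(self):
        self._blobs = {}

    def read(self, key):
        return self._blobs[key]

    def write(self, key, data):
        self._blobs[key] = data

# Production would add e.g. a FilesystemBackend with the same
# interface; tests use the in-memory double and never touch the disk.
def test_roundtrip():
    b = InMemoryBackend()
    b.write("doc", b"hello")
    assert b.read("doc") == b"hello"
```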


- Prefer simulators over mocks

We've been running a business-critical system for several years now; creating simulators has been one of our "secrets" that contributed to the success.

Not only do our tests run in an environment as close to production as possible, we also use the simulators for local development, where developers can spin up a functioning system on their own dev machines.


What do you mean by simulators? An anonymized copy of a database or something similar?


It means external services your project depends on in production should:

1. be used directly if possible/practical, if not...

2. ...then it's likely better to write simulators for them as opposed to mocking individual methods on the client in individual tests

There are several factors to consider. External services need to be able to run in isolated, temporary environments with short bootstrap time – if that's not possible, it's better to write a simulator which provides this functionality. Services must provide determinism; if they don't, the simulator should probably be written with control APIs to provide it, etc.

In general the idea can be summarized as opting for the highest level of functionality available, so tests capture as wide a code surface (used in production) as possible.

It complements, not replaces, lower-level methods – i.e. it still makes perfect sense to structure code as a composition of pure functions with unit tests, and to rely on static analysis rather than testing in unit tests what is guaranteed by the static type system, etc.

A side effect of simulators is that you can bootstrap your project locally in a lightweight fashion for development – with all simulated functionality available.

For example, if you're working on a trading application, instead of mocking low-level prices, order-creation calls, etc., it's better to write an exchange simulator and use it – where the full lifecycle of an order will work as expected – in tests and locally, when developing the application.
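To make the idea concrete, here is a toy exchange simulator (entirely hypothetical): orders go through a real open-to-filled lifecycle, and a control API lets tests decide deterministically when fills happen.

```python
# A toy exchange simulator: full order lifecycle, deterministic control.
class ExchangeSimulator:
    def __init__(self):
        self.orders = {}
        self._next_id = 1

    def submit(self, side, qty, price):
        oid = self._next_id
        self._next_id += 1
        self.orders[oid] = {"side": side, "qty": qty,
                            "price": price, "status": "open"}
        return oid

    # Control API: the test, not the market, decides when fills happen,
    # which is what makes the simulator deterministic.
    def fill(self, oid):
        self.orders[oid]["status"] = "filled"

    def status(self, oid):
        return self.orders[oid]["status"]

# An order lives through its lifecycle as it would in production.
sim = ExchangeSimulator()
oid = sim.submit("buy", 100, 9.5)
assert sim.status(oid) == "open"
sim.fill(oid)
assert sim.status(oid) == "filled"
```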


I agree - but it can be hard to get good ones, e.g. if you need to simulate sql query execution.


We spin up a temporary and local MySQL instance per test run with helpers to generate necessary data on-demand. All our tests therefore use real queries on a real db. Due to speed constraints, this db is shared between all the tests you run in that session, so it’s possible to influence tests after yours runs in CI. The reality is that that is pretty easy to detect and it’s caused us to write less brittle code.


SQLite is a great simulation for SQL. You need to limit your SQL to the subset that is supported by both it and your target database, though, which might be a problem.
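For illustration, Python's standard library gives you an in-memory SQLite database, so real queries execute with no server to manage (as long as you stay within the shared SQL subset):

```python
import sqlite3

# An in-memory database: real SQL execution, zero setup, gone when
# the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

# The query under test runs for real, instead of being mocked out.
row = conn.execute("SELECT name FROM users WHERE id = 1").fetchone()
assert row == ("alice",)
```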


For simulating SQL query execution, the best simulator is the SQL server itself.


It all sounds great. I agree totally in principle! I am finding that testing my fairly small Go project (static site generator, because the world definitely needs another one) chews up massive amounts of time. So I tend to avoid the testing pass for longer than I should. Any thoughts on that issue?


It isn't immediately obvious to me why a small Go static site generator would require "massive amounts of time" to run its tests, so it's hard to answer what you're doing wrong.

Are your tests perhaps just too darned big? You don't, in general, need to render 5000 pages of something to test your template doesn't crash or something.

It could also just be disk access. Consider trying an in-memory file system, or if you're on linux, look at using /dev/shm which is a RAM disk.

It is also possible you've snuck in a quadratic or worse time algorithm. There's nothing fundamental to the problem of a static site generator that would require such algorithms, but speaking from experience it is an environment where it's easy to loop over the return value of one function, which itself loops over the return value of some other function (quite likely over the same data), which itself loops over the same structure, and it's easy to end up with O(n^3) or O(n^4) without realizing it. It's especially easy to end up with that being a "for each page" type of loop. Static site generation should be O(n), give or take small factors (maybe O(n log n) for some things technically, but at a scale where O(n log n) is practically O(n) anyhow).


Make it CI’s problem. You should only really be running tests regularly for the component you’re currently working on with CI making sure you don’t have accidental regressions.


If you don't mind spending the effort, do a quick profile to see where the tests are taking a long time – is it initializing the real components that could be mocked?

Or are the tests themselves taking a long time because of other factors, such as IO etc?

For IO, maybe there should be an abstraction over the IO API, and you can use an in-memory option instead for testing.


If IO is the problem, I generally solve it by testing smaller data sets. I have not found IO to the local disk to be slow. Of course network IO is bad, but local disks are fine for the size of data in my tests.


I try to write the test first, or at least stub in the test how I think it should work – or at least enough notes to pick it up "tomorrow". Then I have a reference for usage I can build with that in mind.


Thanks all for super helpful feedback. This was a generous gift of your time and brainpower.


Generally the more difficult a test is to write and run, the more useful it is.


As a potential counter-argument, the use of mocks can enable testing of functionality that the current concrete implementation doesn't exercise. It's easier than one would think to accidentally rely on implementation details rather than coding just to the interface (and optionally any documented restrictions to that interface).

They explicitly call out clocks as a source of non-determinism that probably should be mocked, but I'll re-use them as an example anyway because everyone is familiar with them: it's extraordinarily useful for the tests to execute nearly immediately rather than actually waiting on a clock, and rare behavior like a clock running backward, two consecutive timings being identical within the clock's resolution, or whatever other weird artifacts your code should handle are definitely better tested explicitly, which you cannot do without mocking the clock. Other domain-specific interfaces are often similarly able to exhibit a weird edge case that ought to be explicitly tested (rather than accidentally relying on a "nice" implementation) if you really want to unit test the callers and not integration test the coupled system.
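A sketch of the clock example (names are invented): a hand-rolled fake clock makes those rare artifacts trivial to provoke, including a clock that runs backward.

```python
# A fake clock that replays whatever readings the test scripts.
class FakeClock:
    def __init__(self, times):
        self._times = iter(times)

    def now(self):
        return next(self._times)

# Hypothetical code under test: defensive against weird clocks,
# clamping a backward-running clock to zero elapsed time.
def elapsed(clock):
    start = clock.now()
    end = clock.now()
    return max(0.0, end - start)

# Normal case, identical consecutive readings, and a backward clock --
# all tested instantly, with no real waiting.
assert elapsed(FakeClock([1.0, 2.5])) == 1.5
assert elapsed(FakeClock([3.0, 3.0])) == 0.0
assert elapsed(FakeClock([5.0, 4.0])) == 0.0
```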


Also error conditions - telling a fake to give you a specific error before you run the test is no different from telling a mock to do it.


It is different. You tell the mock to give you a specific error with specific information, while you tell the fake to give you the error for a specific condition.

It is subtle and may not be worth the time, but again, it may be.
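A small sketch of that distinction (invented names): with a mock you script the error itself; with a fake you set up the condition, and the fake derives the error from its own (simplified) rules, which is closer to how production actually fails.

```python
from unittest import mock

# Mock: the error is scripted directly.
client = mock.Mock()
client.fetch.side_effect = TimeoutError("scripted")

# Fake: the test sets up a condition, and the error follows from the
# fake's own rules (here, a made-up latency budget).
class FakeServer:
    def __init__(self, latency_ms):
        self.latency_ms = latency_ms

    def fetch(self, key):
        if self.latency_ms > 1000:  # the fake's own timeout rule
            raise TimeoutError(f"{key}: exceeded 1000ms budget")
        return f"value-of-{key}"

# Slow condition -> the fake produces the timeout itself.
timed_out = False
try:
    FakeServer(latency_ms=5000).fetch("orders")
except TimeoutError:
    timed_out = True
assert timed_out

# Fast condition -> normal behavior from the same fake.
assert FakeServer(latency_ms=10).fetch("orders") == "value-of-orders"
```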


The heart of the argument presented is that using mocks in unit tests is problematic because if an interface is changed, that will possibly break every test that involves mocking that interface and that's friction to making changes. This is silly. If you use real implementations and change the interface, you have the exact same problem except that your configuration/setup is also likely to have to change in subtle ways.

Let's just assume the absurd notion that a choice of creating tech debt or changing a common interface is a real choice. If there was a serious change that could not be accounted for with easy test changes, I'm not the only one to see the tests commented out with a "TODO: fix these". Developers tend to be pragmatic.

If you want to change code you will always measure the effect it will have on the code and tests are incidental, not the primary concern. Make it work. Make it right. Make it fast. I want to be able to trigger all code paths and exceptions (make it good). Using a real dependency, I would be left in the unfortunate situation of depending on knowledge of the internals of that dependency. It may not allow me to execute specific paths via pure configuration at all.

I don't think using real dependencies is a good idea at all for unit tests. Integration tests are a different matter, and I do fear that the two are being confused.


Dependencies are why we have functional tests.

Write as much code as you can that has no dependencies. Unit test that code exhaustively. Fake all inputs that don't contain behavior. Mock all interactions that do. Then write functional tests that check that the glue and state management actually work with the real things.

The plumber is still going to run water and check for leaks before they leave, no matter how many certifications the copper piping came with. But that's only at the end of a long process of work and inspections.


Nothing pisses me off like finding a suite of tests that has fakes with logic in them. By the time I find them, the fakes are longer than the tests. Often the commit history shows that this accumulated by accretion, and nobody ever pulled the emergency stop lever. Other times it's people who are wrong-headed about what problems tests are trying to solve (coverage chasers are but one category).


> The heart of the argument presented is that using mocks in unit tests is problematic (...) If you use real implementations

No. As you say, the moment you have a real out-of-process dependency, you no longer have a unit test, but an "integration-with-external world" test.

The heart of the argument is to not mock out internal (in-process) business logic. Instead:

a) do not use mocks (e.g. leveraging Moq); use fakes (aka "simulators"), i.e. proper classes having in-memory implementations of the out-of-process dependencies and crucially

b) replace only out-of-process dependencies, which usually are flaky/nondeterministic. Never replace internal business logic.

If you feel you need to replace internal business logic, you are not following the functional core architectural pattern. After you refactor to functional core you will no longer need to replace any dependencies in your unit test, as there won't be any - you will just call your tested pure method and make assertions on the returned value.
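A miniature example of what that looks like (the business rule is made up): once the logic is a pure function, the unit test needs no doubles of any kind — just arguments in, assertions on the return value.

```python
# Functional core in miniature: the result depends only on the
# arguments, so the test replaces nothing.
def apply_discount(subtotal_cents, loyalty_years):
    rate = min(0.05 * loyalty_years, 0.25)  # 5% per year, capped at 25%
    return round(subtotal_cents * (1 - rate))

assert apply_discount(10000, 0) == 10000
assert apply_discount(10000, 2) == 9000
assert apply_discount(10000, 10) == 7500  # capped at 25%
```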

A special case here is testing "top-level/controller" logic. Here you need to use the in-memory fakes, but there will be only a few, reused across all tests, and such a test will be an "end-to-end internal-business-logic integration test" – but it will still have all the properties of a unit test: it will be fast and deterministic, and you will be able to run it as part of a unit test suite out of the box, with no environment setup necessary.

Further reading: https://enterprisecraftsmanship.com/posts/when-to-mock/

> If you want to change code you will always measure the effect it will have on the code and tests are incidental, not the primary concern

Exactly. Mocks are very brittle and break all the time, causing unnecessary rework. If you instead rely on a small set of fakes of out-of-process dependencies, you will drastically reduce test suite rework and improve signal-to-noise ratio.

> Using a real dependency, I would be left in the unfortunate situation of depending on knowledge of the internals of that dependency.

And what happens if the real dependency changes behavior, but you forget to update your mock? You will have a test running against a mock that simulates obsolete behavior, no longer present in production. In the worst-case scenario the test will be green/passing while there is a bug in production. Will you remember to always comb over your entire test suite and review all mocks to faithfully simulate actual production behavior?

> It may not allow me to execute specific paths via pure configuration at all.

If you write fakes, you will have full control over how to configure them.


> Will you remember to always comb over your entire test suite and review all mocks to faithfully simulate actual production behavior?

This doesn’t make sense. If I write unit tests for my ‘FunctionExecutor’, and it has a dependency with a function ‘shouldExecute’, I don’t care how that function is implemented, only that it returns a boolean. If the implementation of my ‘shouldExecute’ function changes, there is no need to update any other test as long as it still returns a boolean.

I do agree that what you call fakes are better, because you get the extra guarantee of them implementing the same interface (e.g. your compiler will scream bloody murder if you don’t update both).


> I don’t care how that function is implemented, only that it returns a boolean

Let's take your example of "shouldExecute". I assume your unit test operates on some inputs (with the values provided inside the unit test itself, naturally), and "shouldExecute" has potentially some nontrivial logic in it. Say, it reads the value of some environment variable and, if it is right, returns "true".

Now there are two possibilities:

a) your test inputs are made up. For example, you never set the environment variable, and coerce "shouldExecute" to always return "true" anyway. The problem with such a test is that it is fiction, not a test of actual production behavior. Sure, you will test what would happen if the logic determined it should execute with no environment variable set, but this will never happen in production. In production, the lack of the environment variable would result in "shouldExecute" returning "false", and you should test for *that*. So you do care about the details of "shouldExecute", because you need to be aware that it returns "true" only if the appropriate environment variable is set. And if you don't care whether "shouldExecute" returns true or false, then why do you call it in the first place? What are you even testing? I hope your "shouldExecute" doesn't have any side effects you depend on, and I do hope you do not use code coverage as a goal unto itself.

Plus, such a test cannot be used as "executable documentation", because the inputs are simplified to the point of irrelevance, and cannot help in understanding actual program behavior.

b) your test inputs are realistic, and reflect actual production behavior. This means you will have properly set up the environment variable value so that "shouldExecute" returns true. With that, the test is similar to actual production behavior, has good bug-catching ability, and can serve as executable specification. But here again you will have to worry about "shouldExecute" implementation.
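In Python terms, option (b) might look like this (the names mirror the commenter's hypothetical "shouldExecute"): the test sets up the real condition rather than coercing the return value, and also covers the production-relevant negative case.

```python
import os
from unittest import mock

# The hypothetical "shouldExecute": nontrivial logic reading an
# environment variable.
def should_execute():
    return os.environ.get("EXECUTOR_ENABLED") == "1"

def run_if_enabled(task):
    return task() if should_execute() else None

# Realistic inputs: set up the actual condition instead of stubbing
# should_execute to return True.
with mock.patch.dict(os.environ, {"EXECUTOR_ENABLED": "1"}):
    assert run_if_enabled(lambda: "ran") == "ran"

# And the case that actually happens in production when the variable
# is absent: nothing runs.
with mock.patch.dict(os.environ, {}, clear=True):
    assert run_if_enabled(lambda: "ran") is None
```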


Let me offer you another example. Imagine we want to test:

Compile(SourceCode sourceCode) { ValidateSyntax(sourceCode); /* complex logic post-processing the sourceCode here */ }

I could say here "I am testing Compile, and I do not care if ValidateSyntax throws an exception; I will just coerce it to return without throwing an exception".

And then I can write a test that takes as input some simple sourceCode like "blablabla" and claim I have somehow tested the "/* complex logic post-processing the sourceCode here */". But this is silly; in reality such fake input would never survive the validation, and thus testing what happens after is just a waste of time. Hence I need to ensure sourceCode passes the *actual* validation, and hence I need to understand the logic inside the "ValidateSyntax" method.

It gets worse. Now imagine we have thoroughly tested the complex logic while using a mocked-out ValidateSyntax, but then the syntax changes and thus the validate method behaves differently. If I have a mock of the "ValidateSyntax" method I might still feel good – the test is green, the coverage is still high. Except it is all a lie. I run "Compile" on some production data and it blows up. Why? Because the test was working against a "ValidateSyntax" mock that was mocking the result of validating the obsolete syntax, which, now, with the changed behavior, would actually not pass validation and would blow up. Basically my test was telling me my "Compile" method works IF I assume the syntax is still the obsolete one, but it doesn't tell me how my code behaves with the current syntax.

So every time program behavior changes, I need to go through all my mocks that were duplicating that behavior, and see if they are still faithfully reflecting it. Otherwise I risk ending up with a green test suite that tests nonexistent, impossible program executions.


In other words, shouldn't we prefer integration testing, or perhaps partial integration testing. This requires an initial effort of setting up your test framework/environment, but in my experience integration tests provide good value for the time you put into testing.

Rather than a granular test on a single class, test the orchestration of many classes. You end up hitting a big % of the code. As always, it depends on the project. If we're building a rocket ship, you need both granular testing and coarse testing.


This is exactly how I feel as well! Units in isolation certainly deserve testing, but the actual long-lived value comes from testing the public interface of the module/program/etc. In one compiler project I maintain, I stopped writing unit tests years ago in favor of just feeding it input like users would and asserting on the outputs. It gives me both great freedom to refactor quickly, and great confidence that I'm not regressing.

As you say, depends on the project. What I described above is entirely free of side effects - I wouldn't dream of testing a web service this way.

Of course integration is never in conflict with unit testing - they're different and can happily coexist.


People keep saying this all the time, but apart from the fact that nobody can agree on what an "integration test" is (because there's almost always some part of the application flow that you're stubbing out), it just becomes immediately apparent in a code base of sufficient size that "just use integration tests for everything" is only possible if you severely under-test (which usually includes things like not properly testing for error conditions etc.).


What? Nobody is advocating for integration tests to the exclusion of unit tests.

> but apart from the fact that nobody can agree on what an "integration test" is

Not being precise doesn't invalidate a guideline. The ideas that "stubbing less is better" and "testing functionality end to end is good bang-for-buck" aren't crappy because people don't agree on the details.

> it just becomes immediately apparent in a code base of sufficient size that "just use integration tests for everything" is only possible if you severely under-test

Your parent comment said "If we're building a rocket ship, you need both granular testing and coarse testing." Nobody is advocating for integration tests to the exclusion of unit tests. If you write a hash table or a CSV parser, yes you should unit test it.

But for most application-ish functionality you should reach for integration tests first. For example, testing direct message functionality in an app, checking that after a send there's a notification email queued and the recipient inbox endpoint says there's 1 unread will get you really far in 20 lines of code. Is it exhaustive? Of course not. But the simplicity is a huge virtue.

> if you severely under-test (which usually includes things like not properly testing for error conditions etc.)

I advocate "default to integration testing for application functionality". Those focused on unit tests often mock exactly the things most likely to break: integration points between systems. "Unit" tests of systems are often really verbose, prescriptive about internal state, and worst of all don't catch the bits that actually break.


> Nobody is advocating for integration tests to the exclusion of unit tests.

Oh, I got here just now, but well, let me advocate it. (Well, not all unit tests, but most of them.)

The ideal layer to test is the one that gives you visible behavior. You should test there, and compare the behavior with the specification.

Invisible behavior is almost never well defined, and as a consequence any test there has a very high maintenance cost and low-confidence results. Besides, the large freedom of choice there gives it a huge test area. It is a bad thing to test, in general.

Now, of course there are exceptions where the invisible behavior is well defined, or where it has a smaller test area than the visible one. In that case it's well worth testing there. But those are the exceptions.


> I advocate "default to integration testing for application functionality". Those focused on unit tests often mock exactly the things most likely to break: integration points between systems. "Unit" tests of systems are often really verbose, prescriptive about internal state, and worst of all don't catch the bits that actually break.

You're writing this as a comment to an article that explains exactly how to write unit tests that aren't brittle and avoid mocking.

"Only integration tests" vs. "brittle unit tests" is a false dichotomy.


> advocate "default to integration testing for application functionality".

I agree. I've also found that starting at the top of the test pyramid keeps the focus on the goal.

I have sometimes found that starting at too low a level brings a danger of losing sight of the bigger picture.


Nobody can agree on what a unit test is either. I've seen people say that:

* If it uses an xUnit framework it's a unit test.

* That since the whole application constitutes a unit then a test for the whole application is a unit test.

* That anything that doesn't use the UI is a unit test.

Both names should be trashed, IMO. They both lack clear boundaries.


I wholeheartedly agree that people can't agree on what "unit test" means precisely, either (even though I think your specific examples are a bit disingenuous). In particular, classical and London-school / mockist TDD have rather different definitions of it.

That's why it's important to have a well-rounded test strategy with different types of tests that have different purposes, instead of using some blanket approaches.


Great article – it's not the first time I've read this argument against the usage of mocks. I never understood, though, how to solve the problem of the explosion of paths that must be tested (usually exponential).

As an example, consider a single API that uses ~3 services, and these 3 services each have from 1 to 5 further internal or external dependencies underneath (such as time, an external API service, a DB repository, and so on). How can I test this API, the 3 services, and their underlying dependencies without an exponential number of paths to test? I want to be able to cover all the paths of my code, and ensure that each test only tests a single thing (either the API, the service, or the dependency interface); otherwise, it is not a unit test.

I always felt that these kinds of mockless tests work super nicely in situations without any external, or even complex internal, dependencies; otherwise, it becomes very, very hard to test ONLY what I want, and not all the dependencies underneath.

Mocks allow me to stub the behaviour of a service/dependency that I can test in a separate fashion, covering all the paths, and ensuring that each unit test covers a single unit of my code, and not the integration of all my components.


Your dependencies do need to work reliably if you're not going to use mocks. But if they don't, then mocks might be necessary to ensure test reliability. That said if the interaction between your system and its deps is sufficiently unreliable that you need mocks for test reliability, how are you going to have any confidence in the system's production behavior?

Like, if I test one of the workflows of my service, it might involve executing queries and transactions on an underlying database. But these operations have essentially 100% reliability, so the fact that these operations are being "tested" at the same time as my service do not impact test reliability. I gain nothing by mocking them out.


Well, I strongly believe in testing all the queries/transactions (for example, the repository pattern is super helpful to separate concerns and let you test only what concerns your queries and the DB); but those are tested separately from the workflow of your service, if we are talking about unit tests. Why? Because otherwise either you write a gazillion tests, or they are unreliable. As an example, if the service you want to test has various code paths, and your queries have paths of their own (what happens if you don't retrieve any object? if some property is null? and so on), I just think it gets messy quickly.

My solution is to test things separately, at least in unit tests. Obviously, this has pitfalls (it's easy and fun to write green tests, so tests don't always respect the interface of the components, and they don't go red even if something is wrong); but that's where integration testing comes into play.

Have a few code paths where you touch multiple external services, and want to test that everything actually works? Create integration tests that use either a fake or the actual service in a `staging` environment; they take 10x the time to run, but they test the path that is most important for your logic. Obviously, if you have many unit tests, you will have far fewer integration tests, but they serve different scopes, and one cannot substitute for the other!


> I gain nothing by mocking [database queries and transactions] out.

What about the fact that you will need to spin up, migrate and possibly seed a database in your CI pipeline? What about the toll this will take on the execution speed of your test suite? Additionally, consider that you also need to test the behavior of your system when the query fails, and using a mock implementation that always throws an exception is a trivial and reliable way of achieving this.
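A minimal sketch of that last point (the code under test is hypothetical): a mock whose query always throws reaches the failure path trivially and reliably, which is hard to provoke on demand with a real database.

```python
from unittest import mock

# Hypothetical code under test: degrade gracefully when the query fails.
def load_profile(db, user_id):
    try:
        return db.query("SELECT * FROM profiles WHERE id = ?", user_id)
    except ConnectionError:
        return {"id": user_id, "name": "<unavailable>"}

# A mock that always throws: a trivial, reliable way to exercise the
# error-handling branch.
db = mock.Mock()
db.query.side_effect = ConnectionError("db down")
assert load_profile(db, 7) == {"id": 7, "name": "<unavailable>"}
```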

> That said if the interaction between your system and its deps is sufficiently unreliable that you need mocks for test reliability, how are you going to have any confidence in the system's production behavior?

Sometimes your codebase depends on external services which are flaky for reasons beyond your control, just the way it be sometimes. Mocks are useful to ensure the system behaves a certain way when everything goes right as well as when everything goes wrong.

Ultimately the article raises many good points about avoiding mocks if it can be helped, but don't forget a test that only tests the happy path of your system is not very useful. Mock an error in that dependency you expect to always work and understand what would happen, make the necessary provisions.


> What about the fact that you will need to spin up, migrate and possibly seed a database in your CI pipeline?

The database system I use can be configured to start up reasonably quickly, and can be configured to operate on in-memory files to reduce I/O pressure on the CI system. In fact, testing against the full-scale local database is the only supported methodology for this particular RDBMS.
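As a tiny illustration of this style, here is a query tested against a real database engine rather than a mock; `sqlite3` with `:memory:` stands in for whatever RDBMS the project actually uses, and the table and function names are hypothetical.

```python
import sqlite3

def fetch_overdue_ids(conn, cutoff):
    """Return ids of invoices due strictly before `cutoff` (ISO date string)."""
    rows = conn.execute(
        "SELECT id FROM invoices WHERE due_date < ? ORDER BY id", (cutoff,)
    ).fetchall()
    return [r[0] for r in rows]

# An in-memory database: no server to spin up, no I/O pressure on CI.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER, due_date TEXT)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, "2021-01-01"), (2, "2023-01-01")],
)
assert fetch_overdue_ids(conn, "2022-01-01") == [1]
```

The SQL itself is exercised for real, so a typo in the `WHERE` clause fails the test, which a mocked query layer would never catch.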

> Sometimes your codebase depends on external services which are flaky for reasons beyond your control, just the way it be sometimes.

No doubt no doubt. As I mentioned, mocks or fakes might be necessary in a condition like this.

> Ultimately the article raises many good points about avoiding mocks if it can be helped, but don't forget a test that only tests the happy path of your system is not very useful.

My team uses interception and error injection for this case. We still have the real backend, but requests can be forced to fail either before or after executing on the backend.
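One hypothetical way to implement such interception (the class names here are assumptions, not the commenter's actual setup): wrap the real client and force selected calls to fail either before or after they reach the backend.

```python
class FaultInjectingClient:
    """Wraps a real client; injects failures before or after the backend call."""

    def __init__(self, real_client, fail_before=False, fail_after=False):
        self.real = real_client
        self.fail_before = fail_before
        self.fail_after = fail_after

    def request(self, *args, **kwargs):
        if self.fail_before:
            raise ConnectionError("injected failure before backend call")
        result = self.real.request(*args, **kwargs)
        if self.fail_after:
            raise ConnectionError("injected failure after backend call")
        return result

class RealClient:  # stands in for the actual backend client
    def __init__(self):
        self.calls = 0

    def request(self):
        self.calls += 1
        return "ok"

real = RealClient()
flaky = FaultInjectingClient(real, fail_after=True)
try:
    flaky.request()
except ConnectionError:
    pass
# The request still executed on the real backend before the injected failure.
assert real.calls == 1
```

The "fail after" case is the interesting one: it simulates a write that succeeded on the backend but whose response was lost, which pure mocks rarely model.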


I don’t get it. When I unit test, I want to only test what I’m testing. A dependency or the result of a dependency is not what I’m testing. So I mock the result of that dependency to unlink the test from the dependency. If an interface changes I generally want to know anyway, as the test may be different or even obsolete depending on the change.


I hear you, and that's how things work at my present shop.

However, let me give a strong defense of the point.

It turns out that the only thing you can test is a pure function. This is just the nature of a test: we pin down the inputs to a piece of code and we see what outputs it produces. All of the mocking that you are doing is an attempt to turn an impure function into a pure function. Even more extreme setups, where you connect your container to a container running Postgres, are trying to turn that container into a pure function in a different way. You don't need to do any of this if the function is pure in the first place.
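The point in miniature (names are illustrative): the impure version can only be tested by patching the clock, i.e. mocking; once the impure input becomes an argument, the test just passes values.

```python
import datetime

# Impure: testing this requires patching datetime.now (a mock).
def greeting_impure():
    hour = datetime.datetime.now().hour
    return "good morning" if hour < 12 else "good afternoon"

# Pure: the impure input is an argument, so no mock is needed.
def greeting(now):
    return "good morning" if now.hour < 12 else "good afternoon"

assert greeting(datetime.datetime(2022, 1, 12, 9)) == "good morning"
assert greeting(datetime.datetime(2022, 1, 12, 15)) == "good afternoon"
```

The imperative shell then does the one impure thing, `greeting(datetime.datetime.now())`, and needs no unit test of its own.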

Once we've established that "functional core" is a lazier means to the same ends, the question of dependencies still comes up, and the "should I mock dependencies" question becomes "should I promote this internal function call to the main I/O section and pass in its result as an argument?"... And the answer is that doing so changes the language in which this outermost level is written. This outermost shell should probably read like the business logic you are implementing, so you should never promote a call whose result is not part of that business vocabulary.

If that's the logic, then it means that the sort of testing you're doing is extremely particular and fussy, because it suggests that every test should really be constructed at the business level: a story about your product that you want to hold fast even when the internals are changed.

This works really well with domain-driven design, because that says your modules should also be business-level entities, so each module comes with some tests that say "here is how this sort of person interacts with the system." You are always testing the integration of the pure functions, and why would you not? If those pure functions do not integrate together, you want to know about it, and you do not want to find out through a persnickety test that just fixes the inputs and outputs of something with no business relevance. You know what happens in those cases: the developer just rewrites the test to say the opposite of what it used to say, so that it passes now. There is no semantic check on the test output, because there cannot be if it is scoped too small.

I think you can quibble a lot with those details but I think that's the strongest case you can make for it?


I feel like you are still making the case for it. "Should I promote the logic of the dependency to the calling module?" No, I shouldn't. That's why I made it a dependency. It could be a dependency of many modules, or there could be many implementations behind an IoC container or some factory or strategy pattern. Or maybe I do not control the dependency at all, because it is external to our platform. I can understand all too well the engineering bias to do less work. "But now all my tests are broken" may be the correct result.

With that said, without concrete examples to argue about, it's hard to say if we are even disagreeing. The article's examples were wanting, and to what level you break up your code is a hard-fought learning exercise. Some don't care, some say no more than fits on a screen, and at the other end some people use an npm package to find out if a number is even.


When I unit test, there is normally not much going on in the “unit” besides its use of dependencies, so the tests are largely circular (asserting things about mock expectations).

I still have to write them, because Thou Shalt Have Unit Tests. I can’t consolidate the overly-trivial units, because that would not be Architecture Best Practices, and might even be Spaghetti Code.


> besides its use of dependencies

Isn't that sort of the point of the unit test in that case? You test the use of those dependencies:

Write a test where a mocked dependency returns something unexpected and see how your 'unit' responds.

This is why mocks are useful because often you rely on implementation details of your dependencies and only test the happy cases. With a mock you can return whatever edge cases you come up with and ensure your unit handles everything as expected.
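A small sketch of that idea, with hypothetical names (`display_name` and the cache API are assumptions): set the mock's `return_value` to each edge case the real dependency rarely produces and check the unit's response.

```python
from unittest.mock import Mock

def display_name(cache, user_id):
    user = cache.get(user_id)
    if user is None:                              # cache miss
        return "<unknown>"
    return user.get("name") or "<anonymous>"      # missing or empty name

cache = Mock()
for returned, expected in [
    (None, "<unknown>"),
    ({}, "<anonymous>"),
    ({"name": ""}, "<anonymous>"),
    ({"name": "Ada"}, "Ada"),
]:
    cache.get.return_value = returned
    assert display_name(cache, 42) == expected
```

With the real cache you would mostly ever see the last case; the mock lets you pin down the other three on purpose.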


No? The point of a unit test is to tell me something non-obvious about how the function behaves in certain circumstances. When unit testing glue code there is no information in the test result that is not also literally written out in the unit under test (it makes these calls in this order).


The way I think about tests is: what’s the cost of failure here vs the cost of testing?

Sometimes, bugs in this part of the codebase would not be a showstopper, so why test as heavily?

The most important tests by far are smoke-style integration tests for the critical paths through the system. I tend to care much less about other tests in many cases.

Obviously, if I was writing a compiler or database or medical software, that’d be different. But I’m generally writing web applications where if the entire application were to fail for a day, we probably wouldn’t even lose a customer.


This is almost right except for two points:

For fakes, spin up the real thing. If you're not able to model your database transactions deterministically, then your transactions could themselves be flawed, and tests are a great way to catch that.

Deterministic tests are not a goal in and of themselves; controlled non-determinism is valuable. This is popularized in various frameworks under the names of property-based testing and fuzzing, which report the seed that produced a failure, so that while runs don't have the exact same input/output on every invocation, you get better coverage of your test space and can revisit the problematic points at any time. If you're doing numeric simulation, make sure you are using a seedable PRNG, that you log the seed at the start, and that time is an input parameter. Why is this technique valuable? You transitively get increasing code coverage for free through CI and coworkers running the tests, AND you have a sane way to investigate issues.
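A minimal version of this "controlled non-determinism", sketched with Python's stdlib (the `TEST_SEED` variable name and `sort_ints` function are assumptions): pick a fresh seed per run, log it, and allow it to be pinned via the environment to replay a failure.

```python
import os
import random

def sort_ints(xs):
    """Function under test; a stand-in for real logic."""
    return sorted(xs)

# Fresh seed per run unless TEST_SEED is set to reproduce a failure.
seed = int(os.environ.get("TEST_SEED", random.randrange(2**32)))
print(f"TEST_SEED={seed}")          # logged so a failing run can be replayed
rng = random.Random(seed)           # seedable PRNG, independent of global state

for _ in range(100):
    xs = [rng.randint(-1000, 1000) for _ in range(rng.randint(0, 50))]
    out = sort_ints(xs)
    # Properties that must hold for any input:
    assert out == sorted(xs)
    assert len(out) == len(xs)
```

Each CI run and each coworker's run explores different inputs, but `TEST_SEED=<logged value>` makes any particular run exactly reproducible.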


"Non-deterministic tests" usually refers to a test whose output depends on the execution environment in a way that cannot be controlled. For example, a multithreaded test with a race condition, or a test that uses the real-time clock.

This is fundamentally different from a test that uses random numbers. Another way to look at it: Deterministic tests can be used in `git blame`. A unit test that uses a specific PRNG algorithm and sets the seed is fine. A test with a race condition is not.

Fuzzing is very useful, but it's not unit testing. If you want to run a fuzzer or some other kind of endless randomized testing in CI, it should be a separate job from the unit tests (IMO).


This article espouses what is known as the "classicist" school of testing and rejects "mockist"/London-style testing. I wholeheartedly agree with it. I have been championing it in my team of 20+ devs for many years now, to great effect. You can read more about it in the excellent book by Vladimir Khorikov from January 2020, titled "Unit Testing Principles, Practices, and Patterns".


If you are testing the externalities, they aren't unit tests, but integration tests.

The better advice is: continue to isolate unit tests away from real dependencies, and ALSO have integration tests that test the way the package connects to dependencies (and the other packages in the software dependent on it).


I would argue that if you need mocks, the code isn’t well suited to unit testing and you’re better off with integration tests.

If you want to unit test it, then refactor in such a way that you no longer need mocks.


I have two services that communicate over the network. Neither does anything useful without data from the other. So I write my tests like this article suggests and inject at the boundaries of the services, the network boundary in this case. I've built tests that now only test my idea of what the services will return. IMO these aren't particularly useful tests: what happens when the remote service starts returning unexpected data or errors due to load problems? I guess my point is that this article gives good advice: prefer testing your actual code and dependencies, but unit testing isn't a replacement for integration testing.


I worked somewhere with a hard rule that you are not allowed to test internal classes.

So if I write my own sort algorithm I am not allowed to unit test that class unless I make it public.

But if the sort algorithm is to solve a specific sub problem in a library there is no need to make it public - it may not make much sense.

So I had to test it "through" its consumer class(es) that are public. For illustration's sake, let's say an MVC controller mocked up to the eyeballs with ORM mocks, logging mocks, etc.

This bugged me because having direct access to a functional core I can quickly amplify the number of test cases against it and find hidden potential bugs much quicker.


I would say that testing only the public contract is generally very helpful.

When the logic gets super complex, I would think that a good compromise is to modularize your code and test the public API of each module. A good heuristic is to ask whether the modules are (or could be) useful by themselves in the future.

You then write tests for a module assuming its dependencies cover their own edge cases.


As I wrote it I thought the same thing! I think some of the problem was the org friction of adding a new module and getting approval to do so. This was a place with about 400 devs, so they couldn't really let anyone freely add modules, and waiting for architecture approval would take too long for a typical ticket.


> So if I write my own sort algorithm I am not allowed to unit test that class unless I make it public.

I think there's a slow movement away from this kind of straitjacket. The compiler sees it all anyway. Class access modifiers serve two purposes.

1. They are a social tool; and this is only if you believe that developers who work on source code, which they can read, cannot be trusted to call methods to do what they need.

2. They are a convenience for opaque modules, so that users who may not have access to the source code, can avoid using APIs that may have no effect or problematic side effects, while also decluttering the API.

Python's access control is purely conventional: a leading underscore (e.g. `_backdoor()`) marks a method as internal, but callers, including tests, can still invoke it. Go allows any code in the same package, including that package's tests, to access any identifier regardless of whether it is exported. In JavaScript, you can use rewire.js to mock methods in imported modules (which are effectively methods in closures).

These solutions are an improvement over the classic rigid class-access pattern that many languages are stuck with, which forces the kind of theatre you have experienced.


In database land you can even freely bring up a number of _proprietary_ databases without needing an account or license. I spin up MySQL, Postgres, Oracle, and SQL Server databases in containers in GitHub Actions for tests. Unfortunately, IBM Db2 seems to require a license key to spin up its container. :(