
Monad Confusion and the Blurry Line Between Data and Computation


I find it a bit strange that I didn't have this experience at all once I found out what monads are.

In fact I felt quite a bit underwhelmed, because (in the context of programming) it turned out they're just a generic interface (like e.g. Iterable) with a `flatMap` function, a `wrap` function, and some "property tests" (laws) that must always pass. Whatever implementation of that interface you can make... well, that's a monad.
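To make the "just a generic interface" view concrete, here is a minimal sketch in Python. The `Maybe` class and its method names are invented for illustration (not from any library); the point is just that `wrap`, `flat_map`, and the laws are all there is to the interface.

```python
# A hypothetical Maybe type: an implementation of the "monad interface",
# i.e. a `wrap` function, a `flat_map` function, and laws that must hold.

class Maybe:
    def __init__(self, value, present):
        self.value = value
        self.present = present

    @staticmethod
    def wrap(value):                 # aka return / pure / unit
        return Maybe(value, True)

    @staticmethod
    def nothing():
        return Maybe(None, False)

    def flat_map(self, f):           # aka bind / >>=
        return f(self.value) if self.present else self

    def __eq__(self, other):
        return (self.present, self.value) == (other.present, other.value)

# The "property tests" (monad laws), spot-checked on one example each:
f = lambda x: Maybe.wrap(x + 1)
g = lambda x: Maybe.wrap(x * 2)
m = Maybe.wrap(3)

assert Maybe.wrap(3).flat_map(f) == f(3)           # left identity
assert m.flat_map(Maybe.wrap) == m                 # right identity
assert m.flat_map(f).flat_map(g) == m.flat_map(lambda x: f(x).flat_map(g))  # associativity
```

Any type with those two functions and those three laws qualifies; nothing more is required.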


I think a part of the whole monad confusion comes from the fact that monads "add something" to a computation, which is valuable in Haskell but not in many other languages. Any function in C/Java/Python/Ruby is already free to log to stdout, truncate tables in the database, phone home to the NSA, modify files on the filesystem, throw exceptions, modify global state in the program, etc etc etc. You could say they all live in some `ErrorT (StateT s) IO` type of monad by default.

In Haskell, it's almost exactly the opposite: by default a function can do nothing but access its argument and do computations, but by adding more and more monadic contexts it can do more. This is useful for things like lazy evaluation, because the compiler can infer many more optimizations for pure functions than for functions that might do "anything". It also has benefits for security: a function with type `a -> a` is (almost) certainly not going to access the database or the filesystem. You can have the compiler confirm many assertions about the program when all side effects have to be declared through the type system.

For (say) a method in Ruby, which can already do anything, monads are not very useful IMO.


Additionally, monads, applicatives, and functors look very different with and without currying. In a language that differentiates between single-variable functions and multivariate functions, even functors are weirder.

Haskell simplifies all of this by just currying everything, so that a function of two variables looks like a single-variable function. Then you just have to define “applicative” and you get an fmap that can apply to functions of any arity.

In Python, by contrast, any equivalent definition of Applicative would be super non-idiomatic. On the other hand, you could write a very idiomatic ApplicativeOfTwoVariables or ApplicativeOfThreeVariables and so on.

The difference between programming with curried and non-curried functions runs really, really deep, and I don’t think we’ve found the right abstractions yet that make functor/applicative/monad analogs actually simplify developers’ lives in uncurried languages.
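As a sketch of what the "one definition per arity" situation looks like, here is what a hypothetical ApplicativeOfTwoVariables and ApplicativeOfThreeVariables might be for an Optional-like type in Python (using `None` as the absent value; all names here are invented for illustration):

```python
# Without currying, each arity needs its own lifting function.

def apply_maybe2(f, ma, mb):
    """ApplicativeOfTwoVariables: lift a 2-argument function over optionals."""
    if ma is None or mb is None:
        return None
    return f(ma, mb)

def apply_maybe3(f, ma, mb, mc):
    """...and a separate ApplicativeOfThreeVariables, and so on for each arity."""
    if ma is None or mb is None or mc is None:
        return None
    return f(ma, mb, mc)

assert apply_maybe2(lambda a, b: a + b, 2, 3) == 5
assert apply_maybe2(lambda a, b: a + b, 2, None) is None
```

In curried Haskell, a single `liftA2` plus `<*>` covers every arity; in an uncurried language, this family of functions never bottoms out.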


The real challenge for newcomers, I think, is to reconcile this almost magical "monads allow side effects in pure functions" with the very mundane "a monad is a type with a 'return' and a 'flatMap' aka >>= method".


The reconciliation is simple: Monads don't allow side effects in pure functions (in Haskell). I know it's a very popular idea but it's also just wrong.

(Yet another in the long list of Very Popular But Also Just Wrong ideas surrounding this not-that-complicated concept.)

What the monadic interface to the IO type (very carefully phrased) in Haskell allows is building up a value, to be interpreted at run time, with a convenient API that looks very much like an imperative programming language. But it actually constructs a large, complicated value that (per the linked article) is really just a big data structure, which Haskell lazily walks through at runtime to run the program. Laziness is key here: the vast majority of programs contain effectively infinite loops, so the data structure representing the program would be infinite in size if strictly manifested in memory.

In practice, you don't need to think about this distinction and you program with the IO type in the top-level IO context as if it is "really" an imperative language that is really doing what you tell it to do.

But "monad" has nothing to do with enabling side effects. Monad is just an interface. Interfaces lack the power to fundamentally rewrite the nature of a language. If you just hacked out the place in the Haskell base library where IO declares that it is an instance of Monad, you could still do everything Haskell does today. Nothing about IO itself would change. What would change is that all the generic functions with a Monad constraint would no longer work on IO, and you'd have to clone all of them into a new specialized set of functions that work on IO. But that's all that would happen, and a whole lot of code would have to be mechanically translated to use these new non-generic functions. Haskell would not break. Haskell would not suddenly become unable to speak on the network or read files. It would merely become less convenient to use, for no benefit.
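The "program as a data structure walked by the runtime" idea can be sketched in Python. This is a toy model only, not how GHC actually represents IO; the tuple encoding and the `run` interpreter are invented for illustration:

```python
# A "program" is plain data: ("pure", value) finishes with a result,
# ("print", text, rest) emits a line then continues with `rest`, and
# ("read", cont) consumes an input line and continues with cont(line).
# The `run` function plays the role of the runtime that walks the structure.

def run(program, input_lines, output):
    while True:
        tag = program[0]
        if tag == "pure":
            return program[1]
        elif tag == "print":
            _, text, rest = program
            output.append(text)
            program = rest
        elif tag == "read":
            _, cont = program
            program = cont(input_lines.pop(0))

# Building the value is a separate step from running it:
greet = ("print", "name?",
         ("read", lambda name:
          ("print", "hi " + name,
           ("pure", 0))))

out = []
result = run(greet, ["ada"], out)
assert out == ["name?", "hi ada"]
assert result == 0
```

Nothing about `greet` performs effects; all the effects happen when the interpreter walks it. `flatMap` for such a structure would just be "graft another program onto the `pure` leaves", which is why deleting the Monad instance would remove convenience, not capability.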


> Any function in C/Java/Python/Ruby is already free to log to stdout, truncate tables in the database, phone home to the NSA, modify files on the filesystem, throw exceptions, modify global state in the program, etc etc etc. You could say they all live in some `ErrorT (StateT s) IO` type of monad by default.

I think it's worth expanding this point out a bit, as well as your other point about knowing that a function of type `a -> a` is sound. It's the declarative nature of monads that is truly valuable, especially when communicating with other devs through code. The explicit declaration of side effects really comes into its own when applications are large: often it means we don't have to go and 'look inside' a function to see what its interaction with the world will be.

Obviously there's the 'other stuff' too:

* Compositionality

* Package up complex boilerplate so it's never written again

* Ability to optimise the core implementations later without rewrites (I'm thinking for more complex domain managing monads, rather than the 'built-ins')

> For (say) a method in Ruby, which can already do anything, monads are not very useful IMO.

That's probably true for dynamic languages where there's no declarative value to using monads; although the encapsulation of common behaviour is still useful.

I develop language-ext [1], a functional framework for C# that tries to bring the benefits of monads (and other functional goodness) into C#. It is true that a programmer can still launch nuclear missiles in between lines of code if they want to. But with a bit of self-discipline it's entirely possible to reap similar benefits to what you'd see in Haskell. LINQ in particular is very powerful for this. The compiler doesn't get a look in, like it does for Haskell, and that is definitely a downside, because it can't ever optimise as well as Haskell can, but that's just a trade-off.

Of course it isn't Haskell, but it's still possible to get 95% of the benefits.



Oh I understand the usefulness of individual monads, and in fact am very happy that the NodeJS community finally switched to promises from callbacks (even though they violate some of the monad laws, you still get a lot of the same benefits).

I just don't think the generalization of those concepts is very interesting. Monoids seem a lot more interesting to me, especially when it comes to splitting up computations (map-reduce) or partial computation caching; you can really come up with quite a few useful ones.

Monads in comparison seem boring.


I tend to agree on this. Understanding the power/introspection trade-off is important. Monads, while supplying a lot of power, aren't terribly introspectable. Applicative functors have some degree of introspectability, while functors are the most introspectable but have the least amount of power. Understanding these trade-offs is important when deciding on the right level of power for solving a problem.


That's pretty typical for understanding mathematical concepts. You start knowing the definition (interface), but with absolutely no intuition for it, so it seems mysterious. Then you work through some examples, theorems, use-cases etc. to build understanding, and the concept seems to become more than the sum of its parts. At some point, you look back at the definition with your new intuition and realize: There really isn't anything more to it than this definition! Everything else just follows from that!


I do know what concepts map to it, I understand all common monads (including IO) as well as monad transformers. I think ZIO-style monads (combined reader, error and effect monad) are pretty awesome for building backend apps, especially the reader bit, which solves the common "contextual data" problem typically solved by thread-local storage or CLS or OOP in other languages.

I still find "monad" as a concept underwhelming and meaningless. I can put `flatMap` and `wrap` on a lot of things but that won't make them interesting or useful. The individual use cases are interesting, the concept isn't - seems like an unnecessary overgeneralization.


I agree with you that there's no inherent magic to the monad concept, but there really is something neat about the monad generalization -- it's not just a senseless abstraction. That meme "a monad is just a monoid in the category of endofunctors" has merit here. If you already believe that monoids are a useful enough concept, and also if you believe functors are useful enough, then this is a basic structure that pops out. It's a higher-order one, which makes it that much harder to grasp, which only amplifies any possible disappointment :-)

One place this generalization is useful is in defining other structures, for example traversable functors, which as part of their definition need to commute with every other monad. Sure, you won't be using most monads, but it's easier to define a traversable functor that works for all monads instead of the small subset of computationally useful ones.

The deep part to monads is that they (well, a large class of them, the "finitary" ones) end up classifying Lawvere theories. This means that each monad is secretly encoding some suite of composition laws for operators of arbitrary arity. I'll just give an example with the list monad. If you have a list [1,2,4,2], then you can think of it as being a function, where `map` takes each integer (a "variable") and substitutes it for some value, yielding a list -- so in this way the list is an operator with an input per int. The monad part is that you can compose it with other operators by using `flatMap`. For example, if 1 -> [1], 2 -> [2, 3], and 4 -> [2], then the `flatMap` gives a new "function" [1,2,3,2,2,3] from integers to lists by splicing in other operators (and `wrap` is a way to tell `flatMap` that you want to preserve a particular variable, so to speak). I think this is a way to think about why monads have anything to do with computation -- a value of `m a` where `m` is a monad is a computation with "`a`-shaped holes," and `flatMap` is a way to plug other computations into these holes.
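The list example above translates directly into runnable Python, which may make the "substitution" reading easier to see (the `flat_map`/`wrap` helpers here are just the list monad spelled out, not any library's API):

```python
# flat_map splices each "variable's" replacement list into place,
# exactly the substitution described above.

def flat_map(xs, f):
    return [y for x in xs for y in f(x)]

def wrap(x):
    return [x]          # preserve a variable unchanged

# The substitution 1 -> [1], 2 -> [2, 3], 4 -> [2] from the comment:
subst = {1: [1], 2: [2, 3], 4: [2]}

assert flat_map([1, 2, 4, 2], lambda n: subst[n]) == [1, 2, 3, 2, 2, 3]
assert flat_map([1, 2, 4, 2], wrap) == [1, 2, 4, 2]   # wrap is the identity substitution
```

Viewed this way, a list is an "operator" with one hole per element, and `flat_map` plugs other operators into those holes.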

Anyway, that's a lot to say for this, but I'd come across Lawvere theories earlier this year and it gave me a new appreciation for monads after years of just using them while programming. (Disclosure: I'm also a pure mathematician, so I understand if others won't share this appreciation.)


It is a pretty good concept to scare away non-monadic ppl. Cool word too


This is an underrated observation IMO. Mathematics gets mystified so often when it really is just a very useful (and aesthetically appealing) method to approach problems. There's no secret sauce.


In the definitional sense it is underwhelming; like with many algebraic structures, the diversity of instances leads to interesting results. The behaviour of these instantiations has a completely different character even though they instantiate the same pattern:

  replicateM :: Int -> [a]      -> [[a]]
  replicateM :: Int -> Cont r a -> Cont r [a]

  sequenceA :: [IO a]             -> IO [a]
  sequenceA :: Maybe (x -> a)     -> (x -> Maybe a)
  sequenceA :: Map key (Parser a) -> Parser (Map key a)
In fact both of them require only the Applicative pattern (aka n-ary lifting), which is weaker than Monad:

  liftA0 :: Applicative f => (a)                -> (f a)
  liftA1 :: Functor     f => (a -> b)           -> (f a -> f b)
  liftA2 :: Applicative f => (a -> b -> c)      -> (f a -> f b -> f c)
  liftA3 :: Applicative f => (a -> b -> c -> d) -> (f a -> f b -> f c -> f d)
  liftA4 :: Applicative f => .. 
    liftA0 = pure
    liftA1 = fmap
but has more interesting properties than Monad. Because there is no dependency between Applicative computations, they can be run Concurrently[1] or Backwards[2]. They are also closed under Compose-ition[3]. This is not even mentioning what makes something an Applicative or a Monad: Functor instances are unique[4], but a single type constructor can have different law-abiding Applicative instances.[5]







Yes. They aren't much more complicated in any other context, either.

But it is interesting how widely that interface is supported. E.g. functions are also monads (where function composition corresponds to the functor's fmap).


> it turned out they're just a generic interface

Yep. It's a generic interface for the interpreter pattern. (That one code pattern from OOP fame that is as powerful as macros.)

All of the importance of the type comes from what it represents, not on its own complexity. All the hype people get once they understand it comes from learning how to write interpreters.


The interpretation I have come to use, which I'm sure isn't a perfect one, is that a Monad is a means to wrap Data in a Context before applying Computation to it.

A Maybe monad's context is that the data might not be there. For a List or Stream, the context is multiple values. For a Future/Promise, it's asynchronous application of the computation. Writing software this way is a bit more generic because you can swap the data, the computation, or the context while leaving the rest the same.
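A small Python sketch of "swap the context, keep the computation": the same pipeline written once against the monad interface, then run with two different contexts. All names are invented for illustration (`None` stands in for Maybe's absent value):

```python
# One computation, parameterized over the context's flat_map/wrap:
def add_one_then_double(flat_map, wrap, m):
    return flat_map(m, lambda x:
           flat_map(wrap(x + 1), lambda y:
           wrap(y * 2)))

# Context 1: the value might not be there (Maybe-like, None = absent).
def maybe_wrap(x): return x
def maybe_flat_map(m, f): return None if m is None else f(m)

# Context 2: multiple values (List).
def list_wrap(x): return [x]
def list_flat_map(m, f): return [y for x in m for y in f(x)]

assert add_one_then_double(maybe_flat_map, maybe_wrap, 3) == 8
assert add_one_then_double(maybe_flat_map, maybe_wrap, None) is None
assert add_one_then_double(list_flat_map, list_wrap, [1, 2]) == [4, 6]
```

The pipeline never changes; only the context (and therefore the meaning of "then") does.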

This is very much a "what is it good for?" answer for anyone who doesn't have a strong functional programming background.


I think the monad-confusion stems largely from the fact that they are usually explained in Haskell. It's hard to understand an explanation in Haskell if you don't know the language Haskell.


Monads aren't really hard to understand if they're explained with a programmer's vocabulary. The only reason it seems hard is because "Haskell communicators"(like that dude on youtube in the fuckin fedora) insist on importing all this mathematical jargon that's completely unnecessary. Then they have to explain all the jargon, and by the time they actually get to monads, you're thinking about what to have for dinner.

Now I don't use them for anything really, so I don't care enough to remember what made sense to me back then. But if you know LISP, there's a good guide on the internet somewhere explaining how to implement monads in CL. It's essentially a PROGN that passes values around (IIRC).


Agreed. For such explanations, I generally turn to Brian Beckman:


IMO the Beckman video is very good. Recently I liked this video too: "The Absolute Best Intro to Monads For Software Engineers" by "Studying With Alex". I also suggested a few resources which helped me a lot.


Funny, that's literally the video I was referring to. Guess it's not quite a fedora, I don't know anything about hats.


Hah, they didn't even notice how Maybe is algebraic and State is coalgebraic ...

(I completely made this up, it could be true according to my limited understanding, but more likely it's completely wrong.)

I think the confusion comes from a different source: obviously, monads are an extremely powerful interface. But such an interface only makes sense when we have some form of hiding of implementation details: as long as I know the constructors of Maybe, I can completely ignore or reinvent the monadic functions. So State is actually the better example here, because in a parametrically polymorphic type system, we don't know anything about the concrete state when implementing return and flatmap. We need a composition technique that works for every possible state.

So unless your language has a parametric type system and/or allows for hiding implementation details, you won't ever benefit fully from monads, and thus never develop an intuition for them.
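The State point can be made concrete with a minimal sketch in Python: a stateful computation is a function `state -> (result, new_state)`, and `wrap`/`flat_map` compose such functions without ever inspecting the state, so they work for every possible state type. (Illustrative only; the names are invented.)

```python
# State monad: wrap/flat_map are fully parametric in the state.

def wrap(value):
    return lambda state: (value, state)

def flat_map(m, f):
    def stepped(state):
        value, state2 = m(state)     # run the first computation
        return f(value)(state2)      # feed its result and new state onward
    return stepped

# Primitive operations for one particular concrete state (a counter):
get = lambda state: (state, state)
put = lambda new: lambda state: (None, new)

# "increment and return the old value", built from the generic combinators:
tick = flat_map(get, lambda n: flat_map(put(n + 1), lambda _: wrap(n)))

assert tick(10) == (10, 11)
assert flat_map(tick, lambda _: tick)(0) == (1, 2)
```

Note that `wrap` and `flat_map` would type-check against any state; only `get` and `put` know the state is an integer here.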


I'd say monads are useful for encoding computation in a way that ensures, through the type system, that no data escapes its wrapping; this can be used for tracking computations with side effects, but not only that.

I am not sure about the Clojure macros vs Haskell monads comparison; there are no macros in Haskell. Also, the article does not mention that Haskell is non-strict while Clojure is strict, or static typing vs dynamic typing.

Those are very different languages and comparisons can be very misleading. "Code is data" is a LISP idiom expressed with S-expressions; in Haskell, I think, code is code and data is data, which makes it less powerful for some use cases, but possibly safer etc. I've heard of a different rite of passage for Haskell programmers: contributing to GHC.

Clojure is a LISP dialect; the basics of LISP are arguably eval, quote, lambda, and apply: the functions needed for a meta-circular interpreter.

Haskell is a little different, but also in the functional paradigm; the basics of Haskell are its type system, algebraic data types, lazy evaluation, and type classes.

It is possible to emulate monadic chaining in other languages, but a static type system and lazy evaluation are a different matter.


> I am not sure about the Clojure macros vs Haskell monads, there are no macros in Haskell.

Haskell has macros, they are just less prominent than in Lisps.

There's even multiple approaches to macros in Haskell. One of them is template Haskell.


Isn't it about runtime macros vs compile-time macros (aka templates)? Haskell has no runtime macros; LISP has them. I'd guess you can manipulate and redefine mostly everything at runtime in LISP, not so in Haskell?


Huh? Which Lisp has runtime macros? I am only aware of 'compile time' macros.

I am most familiar with Scheme (and Racket) and Common Lisp. Could you explain how 'runtime macros' work in any of these systems? Perhaps give some example code?


Maybe a

Just a


The conceptual issue is that the center of gravity moved away from `a`, which is where our imperative programs would have it.

Now, much of the logic has transitioned over to Maybe, Just, and Nothing.

The profound effect is that `a` matters less. "Where is the `a`?" the procedural programmer wants to know.

"Meh. Have this monad and let the runtime execution of the code kiss your `a` on an as-needed basis," says Haskell.


So Haskell is among my favorite programming languages, but…

Monad tutorials suck and confuse people because they are usually thinly-veiled sales pitches for a solution to a set of problems that readers don’t feel that they have, often written by intermediate Haskellers.

Haskell is a tough sell: no one feels that they write dangerous, buggy code and should enthusiastically accept weird, foreign seeming constraints. Add a very different syntax and lots of terminology from outer space and you get, well, the adoption Haskell has.

Rust is doing Haskell a real favor here I think, because it’s normalizing the idea of “constraints as power”, has limited monadic structure, and is smashing the “has to look like C” syntax monopoly.

Some percentage of new Rustaceans are going to be like Neo after the Kung Fu disk: “I want more”, and those people are the new crop of Haskellers, I think.


I was with you until your point about Rust smashing the “has to look like C” syntax monopoly. One of the things I don’t like about Rust is that it tries so hard to look like C.


It's all relative I guess, and Rust is not the only language eroding C-like syntax, just probably the most visible one.

I like most things about Rust and hate a few, syntax being one of them. But I can see that it was a very pragmatic tradeoff between being sufficiently familiar to not badly hinder adoption, while also sufficiently different to strongly encourage thinking in new ways.

It feels a bit 1990s to a Haskeller I guess, but to a C++ programmer it's like, whoa there buddy, WTF is this match gizmo and what's with the weird question mark. To me personally it's in an uncanny valley between the old and the new, but I get why they did it and it seems to be working out.


It's not clear to me, from the studies I have seen, that Haskell leads to fewer unintended exceptions than Clojure.

Haskell might certainly lead to fewer nil pointer exceptions, but that doesn't mean the faulty logic is easier to resolve, right?

What I'm saying is, no one pays to avoid nil pointer exceptions. If they did, I would simply write no code and call it a day.

The question is, does a language let you express your intent accurately? IMO this is what Clojure gets right, because it helps me talk to the machine in the machine's terms: data structures.

Clojure makes using well-understood types easy by putting them into the syntax. It wires them into your brain.

My worry is, and I hear this at every level, that Haskell is too open-ended; people will choose to reinvent lists and sets and hashmaps, making it harder and harder to understand how anything can compose together.


What do you mean by 'open ended', are you talking about literals for data structures or something deeper?


I'm asking a question about human psychology, really. Maybe in a too-cheeky way, because I feel some language enthusiasts unintentionally do the same by implying their programs are "safer" by being more constrained, without really justifying the claim.

I'm asking how Haskell encourages people to use composable, well-understood abstractions and data structures.

Clojure does this by putting them into the syntax itself as primitives, e.g. ([] () #{} {}). See? The list () contains a list of things, one of which is itself a list. This cuts away the translation barrier; I don't have to label the ocean as wet if you're in it.

In this way all Clojure developers are led to think and talk to the machine the same way; this unifies how all Clojure programmers often choose to express intent.

It's at the heart of what Rich wanted for the language: making simple easy. Or as I see it now, making simple useful structures so easy to reach for that you're discouraged, before you know better, from doing anything else. That's why I view Clojure as more constrained than Haskell: I'm not sure Haskell has that level of built-in encouragement in its design. I think the authors were interested in what could be, whereas Rich has taken a much narrower stance on how to deliver high-level programs.




A soft boundary between data and code raises interesting legal and political implications once viable homomorphic encryption rolls out.

We talk of "data centres" as if they were warehouses and distribution depots for goods. And we talk of "cloud computing" as if it were an online sausage factory where things get sent to be "processed". This is the paradigm into which our current trust models are fitted. Along with it comes the idea of moving or limiting PII across jurisdictional borders. But it's not the only paradigm.

If (a BIG if) we could actually have endpoint security for the user/owner, trusted platform technology might one day work in a fashion rather favourable to privacy and science (benevolent technological society and digital governance).

You send me the computation (as ostensible data) that you want to operate on my (secret) "data" (as ostensible code/function) which I apply and return results. You never see the data. If there's a way you cannot even in principle discover the data by reverse engineering or probing then it could enable the good uses of big data (voting and population scale medical research) to be achieved while eliminating the evils of surveillance capitalism and targeted advertising.

Any thoughts as to the theoretical feasibility of this? (I'm not interested in "big thugs will never let that happen" or "nobody gives a fig, privacy is dead anyway" whines)


> You send me the computation (as ostensible data) that you want to operate on my (secret) "data" (as ostensible code/function) which I apply and return results.

Isn't that basically what we do with web applications? The server sends me JavaScript code which I run on my machine with my data (which the server doesn't need to see) and send the result to the server (either for further computation or for storage).

I mean, in its current form it's not a perfect separation. It's more of a hybrid model where computation happens both on the client and the server, and some data is actually sent to the server without client-side computation. But the basic model is there.


> Isn't that basically what we do with web applications? ... It's more of a hybrid model where computation happens both on the client

Yes, I think that's a valid observation of the essential paradigm, but you nail the limitations with "hybrid". You're sending potentially untrustworthy (tampered-with or obfuscated) code in the clear, and I have to trust your code and execute it. Flipping data and code, it's my code that runs on your data.

Without sequencing the crypto primitives clearly yet, it feels like with a set of homomorphic operations we could get a "zero-knowledge result": you don't get to see my data, I don't get to see your "ask", and nobody in the middle gets to see anything, yet useful, mutually beneficial computation is obtained.

Seems a Haskell/LISP-like paradigm using remotely evaluated first-class functions/monads would be the ecosystem within which to start experimenting with such ideas.


>> Any thoughts as to the theoretical feasability of this?

And what if I request the identity function as the computation of interest?

The use case I'm most familiar with is petabyte storage of raw satellite feeds (along with digitised air photos, multi spectral data, etc) and a requested computation that georeferences and mosaics raw data and combines and enhances spectral bands in various ROIs for specific purposes.

It's a pipelined functional programming data transformation .. with no specific reason to keep the raw inputs secret; the intent is to make them useful via transformation.


> once viable homomorphic encryption rolls out

It should be noted that there is currently no reason to believe homomorphic encryption will ever be fast enough to be applicable to anything except niche problems that require the utmost secrecy for tiny amounts of computation.

It would be great if some improvement is found that would speed it up, but there is no reason to believe such an improvement must exist.


cf "KeyKOS factory"


KeyKOS: "mechanism for secure sharing of programs among mutually suspicious users"

Interesting, thank you.


This seemed like a really great explanation. But I have no idea if it actually is, or if it just feels that way because I finally have some Haskell coding time under my belt.

Regardless, I enjoyed the read and found it a useful way to think about monads.


Maybe as a monad? Now I'm even more confused.


Maybe is a monad because you can wrap something in a Maybe (that’s ‘return’), and you can also bind a ‘Maybe a’ to an action that takes an ‘a’ (that’s ‘bind’, or ‘>>=’).