Math on GitHub: The Good, the Bad and the Ugly

rectang

Did Github ever solicit feedback from the community for this feature? Was there ever a beta before they rolled it out?

Because some of these critiques really should have been dealt with beforehand and I'm concerned that we're now stuck with lousy defaults that they won't ever be able to change.

The visual font size problem seems particularly disastrous:

> The math font is a really small,

> MathJax’s default font MJXTEX-I and GitHub’s default text Helvetica have a different x-height/cap-height ratio.

I don't understand how Github could launch this feature with this problem unaddressed. LaTeX isn't just about getting formulas right, it's about communicating ideas.

edflsafoiewq

Small font size seems the easiest to fix. Kerning seems worse since people will manually tweak spacing with \, etc, so you can't just change it whenever you want.

nschloe

Blog post author here. The kerning is probably a problem with their font config that can hopefully be fixed. I don't think -- and that's my personal estimate -- that people will write tons of `a\,=\,b` to work around the bad kerning, so that's a change I would still recommend making.

kmill

I thought kerning specifically referred to small adjustments in spacing between pairs of letters. This just seems to be a failure in honoring the spacing for the different TeX math mode symbol classes. The rel and op classes each have a certain amount of spacing they are supposed to have on either side (unless they aren't used as binary operators), and somehow this is broken in their implementation -- the spacing doesn't come from the font configuration.

For example, a\mathbin{foo}b is supposed to render as "a foo b", but on GitHub it comes out as "afoob".
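
To see the class-based spacing in isolation, here is a small LaTeX sketch. The `\mathrm` wrapper is only there to make "foo" readable and is not part of the example above. The spacing values in the comments are the standard defaults.

```latex
\documentclass{article}
\begin{document}
% Ordinary atom: no extra space is inserted on either side.
$a \mathord{\mathrm{foo}} b$

% Binary operator: \medmuskip on both sides
% (default: 4mu plus 2mu minus 4mu).
$a \mathbin{\mathrm{foo}} b$

% Relation: \thickmuskip on both sides (default: 5mu plus 5mu).
$a \mathrel{\mathrm{foo}} b$

% The manual workaround: \, inserts \thinmuskip (default: 3mu).
$a \, \mathord{\mathrm{foo}} \, b$
\end{document}
```

If the second and third lines come out looking like the first, the renderer is ignoring the atom classes, which is a different failure from bad kerning tables in the font.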

When I looked at the MathJax configuration GitHub uses, nothing pops out as being odd. It would be funny if a minifier totally messed up the part of the MathJax source code specifically for this.

rectang

I agree that kerning is also a problem, and I also agree that it can't practically be fixed.

However, even if it's theoretically possible to change the font size, would Github ever do so? My impression is that as an organization they place a high value on interface stability. Very little about how I interact with the site has ever changed.

I suppose that individual publishers might stick in a hack to increase the font size, but a bad default means that as a consumer, I'll be stuck looking at tiny math on Github for the vast majority of documents.

enw

> Very little about how I interact with the site has ever changed.

GitHub tunes the UI all the time.

nschloe

Author here.

> Did Github ever solicit feedback from the community for this feature? Was there ever a beta before they rolled it out?

There actually was a closed preview for this, but it wasn't long and the feedback I gave them (which is almost everything that's in the blog post) wasn't implemented.

> The visual font size problem seems particularly disastrous:

They made the decision to match the capital heights of the two fonts, not the x-heights. If they match the x-heights, the capitals in math mode will be too large.

Since the small letters are way too small, perhaps they'll increase the math font size a little bit in the future to balance it out.
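
To put rough numbers on that trade-off, here is a small Python sketch. The metrics are approximate, illustrative values in em units (roughly Helvetica for text, roughly Computer Modern italic for math). They are assumptions, not measurements taken from GitHub's stylesheet, and the function names are made up for this example.

```python
# Approximate font metrics in em units. These are illustrative assumptions,
# not values measured from GitHub's CSS or MathJax's font files.
TEXT_FONT = {"x_height": 0.523, "cap_height": 0.718}  # roughly Helvetica
MATH_FONT = {"x_height": 0.431, "cap_height": 0.683}  # roughly Computer Modern italic


def scale_to_match(metric):
    """Scale factor for the math font so that `metric` equals the text font's."""
    return TEXT_FONT[metric] / MATH_FONT[metric]


def relative_size(metric, scale):
    """Scaled math-font metric as a fraction of the text font's metric."""
    return MATH_FONT[metric] * scale / TEXT_FONT[metric]


cap_scale = scale_to_match("cap_height")  # what GitHub reportedly chose
x_scale = scale_to_match("x_height")      # the alternative

print(f"match caps:     lowercase at {relative_size('x_height', cap_scale):.0%} of text")
print(f"match x-height: capitals at {relative_size('cap_height', x_scale):.0%} of text")
```

With these assumed metrics, matching cap heights leaves math lowercase at about 87% of the text x-height, while matching x-heights makes math capitals about 115% of the text cap height. A compromise scale somewhere in between is the "balance it out" option.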

pxeger1

This does seem like a remarkably poor implementation, especially given the alternatives: GitLab, as mentioned; StackExchange, which also has a pretty decent implementation; and the many Markdown-based static site generators that have good support too.

thaumasiotes

> advantages of KaTeX:

> It’s faster.

> You can copy-and-paste math.

Someone mentioned that you can theoretically copy-and-paste KaTeX output in an earlier thread about GitHub's new math rendering too. But I think calling that an "advantage" is crazy.

LaTeX will transform your one-dimensional textual formula specification into a two-dimensional graphical formula. The concept of copying and pasting the output as text is a category error. It isn't text and if you try to paste it, you'll get something other than what you wanted.

Just to be sure I'm not crazy, I tried copying the output of the demo on the KaTeX homepage. Here it is:

> f(x)=∫−∞∞f^(ξ) e2πiξx dξ

This is much, much, much worse, if you want a textual representation, than copying the LaTeX source:

    \f\relax{x} = \int_{-\infty}^\infty
        \f\hat\xi\,e^{2 \pi i \xi x}
        \,d\xi

And it's worse even though the raw source includes the quirk that LaTeX isn't able to provide proper spacing for the differential over which an integral is being calculated, so you have to space it yourself with \, commands.

Royi

KaTeX also supports only a subset of the features of MathJax. Some of them are really important. While speed is great, missing basic features is worse. I'm happy with GitHub's choice.

JadeNB

> \f\relax{x}

I know it's not your point, and the design choice is KaTeX's, not yours, but making a macro that has to be invoked as `\f\relax{x}` to substitute for `f(x)` is … kind of crazy.

(Of course, in regular TeX, at least, you could do `\f\relax x`, saving one token at the expense of looking even less like a function invocation.)

thaumasiotes

`f(x)` works just fine. As far as I can see, the entire point of their \f macro is to let you write `\f\hat x` instead of `\hat{f}(x)`.

Leaving aside whether that's a good idea, it's not at all clear why the example then goes on to use `\f\relax{x}` to display a function with no diacritic in its name. The diacritic was the only reason to use \f in the first place. And the advertised definition, `#1f(#2)`, doesn't only require \relax when you want to omit a diacritic. It also prevents you from doing perfectly normal things like `f\left( something-hairy \right)`.
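
For reference, here is what that definition does when written as a LaTeX macro. This is a sketch of the advertised `#1f(#2)` definition using `\newcommand`; the KaTeX demo defines it through its own macro option, so the `\newcommand` line is a translation, not their code.

```latex
\documentclass{article}
\newcommand{\f}[2]{#1f(#2)} % two arguments: an accent (or \relax) and the variable
\begin{document}
\[ \f\hat\xi \]     % expands to \hat f(\xi)
\[ \f\relax{x} \]   % expands to \relax f(x), which prints as plain f(x)
\[ \hat{f}(\xi) \]  % the same as the first line, written without the macro

% The parentheses are baked into the macro, so \left...\right delimiters
% have to be written without it:
\[ f\left( \frac{x}{2} \right) \]
\end{document}
```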

hprotagonist

A source block type would have been a very natural fit, it’s surprising that isn’t how they chose to go.

  ```equation
  e^{i\pi}+1=0
  ```

feels like what I’d try to do right off the bat.

ajnin

A code block is used to display text literally, except adding syntax coloring. You would use it to display TeX source notation for example. A math block transforms TeX into something else, it's entirely different. I think it's correct that they did not use the code block syntax for this.

edflsafoiewq

It's how Github already handles Mermaid diagrams, so there's precedent, although you're correct that logically there should be a distinction between "display highlighted Mermaid source" and "render Mermaid diagram".

ajnin

Yes, and now they've reached a dead end, since they have no way to render Mermaid source code except as plain text or by introducing a new name like mermaid-src. It's minor but it's unsatisfying conceptually. I'd rather have them introduce a new "run delimited text through an external program and replace it with the output" tag.

nschloe

Author here.

The main idea of using code blocks for math is to protect its content from being messed with. Markdown parsers do that by default, so that's how it's "natural". As you mention, the drawback is that you can't have a codeblock with syntax highlighting for a language called "math" (should there ever be one), but that seems like a small price to pay.

ly

I prefer the $ $ way, as it makes it possible to do inline equations, while keeping the source easily readable.

hprotagonist

you can do both, the normal markdown way:

  `$a$` squared is `$a^2$`, which is good to know for the pythagorean theorem:
  ```equation
  a^2+b^2=c^2
  ```

IshKebab

That doesn't work because then how do you display $a$ as literal inline code?

ly

Ah yes, agreed, then that does indeed seem like the optimal solution here.

samwillis

They suggest this for inline math, using a combination of the code backtick and dollar syntaxes:

Inline math: $`a^2 + b^2 = c^2`$.
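
A minimal sketch of how a parser could pick out that syntax, assuming a plain regex pass is enough. The pattern and function name are made up for illustration and are not GitHub's or GitLab's implementation.

```python
import re

# A dollar sign, a backtick-delimited body, then a closing dollar sign.
# The body may not contain backticks or newlines.
INLINE_MATH = re.compile(r"\$`([^`\n]+)`\$")


def extract_inline_math(markdown):
    """Return the TeX bodies of all $`...`$ spans in a line of Markdown."""
    return INLINE_MATH.findall(markdown)


print(extract_inline_math("Inline math: $`a^2 + b^2 = c^2`$."))
print(extract_inline_math("Literal code: `$a$` stays as it is."))
```

Because the dollar signs sit outside the backticks, a literal inline code span such as `$a$` is not picked up, which sidesteps the ambiguity raised above.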

Hendrikto

Especially since this approach has already been tried and proven by competitors like GitLab.

darkscape

Yes, could use different tags for literal display and rendering eg ```tex and ```eq.

patrick451

> The reason why I’m so excited about this feature is that, in combination with version control and the issues/discussions capabilities in GitHub, I can see tectonic changes in how we’re publishing science. At last, science can really reap the benefits of a connected internet by moving away from static PDFs to living, breathing repositories which render like PDFs and provide a central place where one can actually talk about the article. – And fix bugs!

I'm skeptical of this take. Gitlab has had math rendering support for quite a while now, so this is hardly novel and doesn't seem to have resulted in the utopia the author is hoping for.

Beldin

I had the same feeling for a different reason: I don't think this reduces friction enough compared to sharing LaTeX files (on github or overleaf), which we can already do. So I don't think this will usher in a new era.

I realise this could very well become the next "dropbox comment" - and I'll be happy to be proven wrong.

patrick451

It would be cool if they would just render entire LaTeX manuscripts that are part of the repo. It sort of sucks that you need to duplicate what you wrote for a paper submission in Markdown to get GitHub to render it.

disgruntledphd2

They do this for org mode files, but that's a much simpler format than LaTeX.

nschloe

Author here.

You're right, the big dreams always turn out smaller in real life when they come true, right? I'm hoping that the popularity of GitHub will aid a shift of publication style though. :)

jaltekruse

One thing that the article gets wrong is accusing mathjax of being abandoned. Development has moved to a new repo for the next version.

https://github.com/mathjax/MathJax-src/graphs/contributors

nschloe

Author here.

Thanks for the hint! I had indeed overlooked that. I have updated the article accordingly.

fiddlosopher

Note also that TeX math can contain \text{..} which can itself contain $-delimited TeX math, e.g. $x = \text{my $y$}$. This currently breaks the GitHub implementation.
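
A minimal Python sketch of why this case is hard, assuming the parser pairs up `$` delimiters without tracking braces. That is an assumption, since GitHub's parser isn't public, and both function names are hypothetical.

```python
import re

SAMPLE = r"$x = \text{my $y$}$"


def naive_inline_math(src):
    """Pair each $ with the next $, ignoring TeX grouping."""
    return re.findall(r"\$(.+?)\$", src)


def brace_aware_inline_math(src):
    """Only treat a $ as the closing delimiter when brace depth is zero."""
    spans = []
    i, n = 0, len(src)
    while i < n:
        if src[i] == "\\":      # skip an escaped character outside math
            i += 2
            continue
        if src[i] != "$":
            i += 1
            continue
        depth, j = 0, i + 1     # src[i] is an opening $
        while j < n:
            ch = src[j]
            if ch == "\\":      # skip control sequences such as \{ or \$
                j += 2
                continue
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            elif ch == "$" and depth == 0:
                break
            j += 1
        if j >= n:              # unterminated math: stop scanning
            break
        spans.append(src[i + 1:j])
        i = j + 1
    return spans


print(naive_inline_math(SAMPLE))
print(brace_aware_inline_math(SAMPLE))
```

The naive version splits the formula into two broken fragments, while the brace-aware scan returns the whole formula as one span. The price is that the Markdown parser has to understand a little TeX.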

bewuethr

I tried all the examples using

  pandoc --from markdown --to html --mathjax

i.e., using pandoc Markdown[1] with MathJax enabled, and it has none of the problems described in the article (see output[2]). The problem doesn't seem to be due to MathJax, and even using GitHub Flavored Markdown as the input format to pandoc produces the correct results.

[1]: https://pandoc.org/MANUAL.html#pandocs-markdown

[2]: https://gist.github.com/bewuethr/691b4870828d7b2261113f14eef...

nschloe

The difference between GitHub and pandoc with GitHub-flavored Markdown as input is that pandoc doesn't sanitize as aggressively as GitHub does. For example, pandoc doesn't remove the backslash-escapes before non-letters as in `\{`.

Not sure if GitHub will compromise here, though.
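
A small Python sketch of what that stripping does to math, assuming it follows the CommonMark rule that a backslash before any ASCII punctuation character is an escape. Whether GitHub's pipeline does exactly this is an assumption, and the function name is made up for illustration.

```python
import re
import string

# CommonMark allows a backslash to escape any ASCII punctuation character.
# string.punctuation is used here as that character set.
_BACKSLASH_ESCAPE = re.compile(r"\\([" + re.escape(string.punctuation) + r"])")


def apply_backslash_escapes(text):
    """Drop the backslash in front of every escapable punctuation character."""
    return _BACKSLASH_ESCAPE.sub(r"\1", text)


for tex in (r"\{a, b\}", r"a\,=\,b", r"a \\ b", r"\alpha + \beta"):
    print(f"{tex!r:20} -> {apply_backslash_escapes(tex)!r}")
```

Control words such as `\alpha` survive because letters can't be escaped, but `\{`, `\,` and `\\` all lose their backslash before the math renderer ever sees them.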

atomashevic

I completely agree with the conclusions, GitHub repositories can really be the focal point of scientific publishing in years to come.

I always wondered why it took so long to implement math. There are better implementations out there for sure, it's weird they waited for so long and implemented something that is below the standard of alternatives.

The next step is to render citations from .bib files. I hope they get that right in the future.

throwaway981572

I’m very happy they went with simple $..$ and $$…..$$

This is much more natural for anyone doing math or science and also makes it easy to copy and paste math from elsewhere

It’s better that the parser works harder than to force an inconvenient syntax on the user.

anonydsfsfs

The problem with adding so much complexity to the parser is that it's now impossible to predict how any given Markdown is going to render on Github. If Github's parser was open-source, you could at least look at the source to figure out how this stuff is handled, but it isn't, so the only alternative is tedious experimentation.

One of the main draws of Markdown was that it was so simple you were rarely left guessing how any given input would be handled (though the lack of standardization hampered that).

colejohnson66

Its simplicity is also its crutch. It lacks so many niche things (like math) that every implementation has its own syntax.

IshKebab

The problem with heuristic parsers is that you can't learn the rules, so the only way to use it now is with live preview.

nschloe

Author here.

One problem is that the parser doesn't work harder yet, and it will be difficult even for GitHub to make it so. The $-syntax is familiar to TeXies indeed, but if the cost is that math won't ever work properly, I'd rather have a markdowny syntax.

runarberg

The author fails to mention MathML as an alternative choice. The options aren’t just MathJax and KaTeX, but also raw MathML. MathML gives you extremely fast rendering, more font choices, copy-paste, a11y, etc. out of the box. The only downside is that Chrome is lagging behind in implementation. For that they could use MathJax as a polyfill (MathJax understands and is able to transform MathML) and allow Safari and Firefox users the benefits of using browsers that can render math natively.

forgotpwd16

It isn't mentioned because they aren't really the same thing. MathML is a web-native math markup language and, being XML, is meant to be written by machines rather than humans[^0]. TeX is a markup language that MathJax and KaTeX render to a web-suitable format, but it is meant to be written by humans[^0]. Both MathJax and KaTeX have support for rendering TeX to MathML.

[^0]: Compare Pythagorean theorem written in MathML and TeX.

MathML:

  <math>
  <mrow><msup><mi>  a </mi><mn>2</mn>
  </msup><mo>+ </mo><msup><mi>b </mi><mn>2</mn>
  </msup><mo>= </mo><msup><mi>c </mi><mn>2</mn>
  </msup></mrow></math>.

TeX:

  $$
  a^2 + b^2 = c^2
  $$

I'll be surprised if someone preferred to write the first rather than the second.

runarberg

Nobody writes the first example by hand[^1]. They write it in TeX (or some other easy to write dialect, or use a graphic WYSIWYG editor) and then use a tool to translate it to MathML.

What I’m advocating here is that GitHub translates TeX to MathML on their servers and serve us the MathML, as opposed to leaving the TeX as is and ship a JavaScript library to render it after it reaches our browsers. Chrome users won’t see a difference as they don’t support MathML, so they need MathJax as a polyfill. But there is a world of difference for us Firefox and Safari users.

Another benefit is that you could allow us to include the pure MathML (by hand or authored by some other tool) if we preferred that.

[^1]: And if they did write it by hand, it would be written as

    <math display="block">
      <msup>
        <mi>a</mi>
        <mn>2</mn>
      </msup>
      <mo>+</mo>
      <msup>
        <mi>b</mi>
        <mn>2</mn>
      </msup>
      <mo>=</mo>
      <msup>
        <mi>c</mi>
        <mn>2</mn>
      </msup>
    </math>
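
To make the server-side idea concrete, here is a toy Python sketch that turns a tiny TeX subset into MathML of the shape above. Everything about it is an assumption for illustration: the function name is hypothetical, and the grammar only covers single-letter identifiers, integers, `+`, `-`, `=` and `^`. A real service would run a full converter (MathJax itself can emit MathML).

```python
import re
import xml.etree.ElementTree as ET

# Toy grammar (illustrative assumption): single-letter identifiers, integers,
# the operators + - =, and ^ applied to the previous atom.
_TOKEN = re.compile(r"\s*(?:([A-Za-z])|(\d+)|([+\-=])|(\^))")


def _leaf(tag, text):
    node = ET.Element(tag)
    node.text = text
    return node


def toy_tex_to_mathml(tex):
    """Convert a tiny TeX subset to a MathML string (hypothetical helper)."""
    root = ET.Element("math", {"display": "block"})
    tex = tex.strip()
    pos = 0
    pending_sup = False
    while pos < len(tex):
        match = _TOKEN.match(tex, pos)
        if match is None:
            raise ValueError(f"unsupported input at offset {pos}: {tex[pos:]!r}")
        pos = match.end()
        ident, number, operator, caret = match.groups()
        if caret:
            if pending_sup or len(root) == 0:
                raise ValueError("'^' needs a base to attach to")
            pending_sup = True
            continue
        if ident:
            node = _leaf("mi", ident)
        elif number:
            node = _leaf("mn", number)
        else:
            node = _leaf("mo", operator)
        if pending_sup:
            # Wrap the previous atom and this one in <msup>base exponent</msup>.
            base = root[-1]
            root.remove(base)
            sup = ET.SubElement(root, "msup")
            sup.append(base)
            sup.append(node)
            pending_sup = False
        else:
            root.append(node)
    if pending_sup:
        raise ValueError("dangling '^'")
    return ET.tostring(root, encoding="unicode")


print(toy_tex_to_mathml("a^2 + b^2 = c^2"))
```

The output is the same element tree as the hand-written version, just without the indentation, and the browser receives it with no client-side rendering step.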

forgotpwd16

>What I’m advocating here is that GitHub translates TeX to MathML on their servers and serve us the MathML

I see. What confused me was the "options aren’t just MathJax and KaTeX, but also raw MathML" since those tools can be used server-side and they output some HTML or MathML or images. MathJax moreover besides TeX also takes MathML and AsciiMath input.

>Another benefit is that you could allow us to include the pure MathML

I guess they went with what most people work with since rendering TeX in Markdown complements their Jupyter notebook rendering.

>and if they did write it by hand it would be written as

Maybe. I copied my example from MDN docs git repo.

nschloe

Author here. Thanks for the comment!

The main problems pointed out in the blog post are problems in _parsing_ the Markdown+TeX pages. The output could be MathML indeed, but this is an issue they could always fix later.

nmalaguti

My theory is that there’s already a lot of existing content using $ and $$ that GitHub wants to start rendering without requiring any changes.

I agree the code block approach would be less ambiguous, but there is an advantage in going where people already are.

runarberg

The pick of $ and $$ as delimiters seems rushed, to be frank. Although I’m not a big fan of mixing LaTeX in Markdown, I understand that choice (alternatively you could go with an AsciiMath-like syntax, which mixes way better with Markdown IMO). But $ and $$ don't make a lot of sense other than that LaTeX does it this way. It would have been easy e.g. to use $$ for inline math and $$$ + newline or ```math for block, and that would have gotten rid of many of the warts of mixing Markdown and LaTeX.

In my opinion the familiarity of $ and $$ is sacrificing a lot for not much benefit.

auggierose

Given that the main point of using LaTeX in Markdown is familiarity of users, using $ and $$ is actually the ONLY proper choice. But yeah, it leads to problems, which is why I would not use Markdown in the first place, but some Markdown inspired format which mixes better with $ and $$.

nvrspyx

This is why I prefer AsciiDoc. It's consistent because there's only one implementation, it's less ambiguous, and more predictable. Although it takes a bit longer to remember all the syntax, it's not difficult, especially if you're only going to use the same subset of features that markdown supports since it supports most of the markdown syntax as well. I also much prefer the flexibility with tables compared to markdown. I just wish there were more parsers/converters other than the main ruby one and the transpiled JS one, although I know there's work being done on other language implementations.

As an example for math/equations, inline math is stem:[sqrt(4)], which defaults to AsciiMath, but can be changed with a page attribute. To specify the notation inline, LaTeX is latexmath:[\sqrt{2}] and AsciiMath is asciimath:[sqrt(2)].

For blocks (where you can replace stem with either latexmath or asciimath to specify the notation):

  [stem]
  ++++
  sqrt(2)
  ++++

nschloe

Author here.

I've been a managing editor for scientific journals for a number of years, and I can tell you that -- while $ is still popular -- (almost) nobody uses $$ anymore. So I wouldn't say this is "where people already are".