<ruby>: The Ruby Annotation element

dalke

Interesting!

I noticed "It can also be used for annotating other kinds of text" and wanted to experiment with being able to number specific letters in a string.

More specifically, SMILES is a linear molecular structure notation. "O" is water, "COO" is ethyl alcohol, "c1ccccc1" is a benzene ring, and much more. (See https://en.wikipedia.org/wiki/Simplified_molecular-input_lin... .)

I want to annotate atom positions in the SMILES string. I currently do this by placing text over the string (with smiview, "pip install smiview"), as in this example using phenol.

  1 23456 7
  c1ccccc1O
I wanted to try it with ruby so I used:

  <ruby>
  c<rt>1</rt>1c<rt>2</rt>c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt>1
  </ruby>
The "2" is located over the center of "1c" instead of over the second "c" like I wanted.

How do I get it to center only over the "c"?

I tried changing the CSS too, using this catch-all:

  ruby {
      font-size: 2em;
      ruby-align: center;
      text-align: center;
  }
  * {ruby-align: center;}
No luck. I also tried wrapping things in a span, like:

  c<rt>1</rt>1<span>c<rt>2</rt></span>c<rt>3</rt>
but got the 3 as a new ruby line, centered over the "1cc", itself with a ruby "2" between the second and third "c".

I tried other combinations of <span>, to no avail.

vore

Multiple ruby spans, perhaps?

  <ruby>
   c<rt>1</rt>
  </ruby>1<ruby>
   c<rt>2</rt>
  </ruby><ruby>
   c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt>1
  </ruby>

dalke

Yes, that was the solution. Thanks!

twic

CCO is ethanol. COO is methyl hydroperoxide.

SMILES is a lovely standard, really simple and easy enough to write by hand, but powerful enough to describe real molecules in detail. Not long ago, i saw a chemical structure used as part of an illustration, and wondered what it was. I transcribed it as SMILES, put it into a chemical search engine, and found out what it was (nothing interesting!).

dalke

D'oh! Yup. You'd think I would know that by now. I'll attribute it to trying this out over the weekend, in the downtime around the kids' bedtime routine.

gvx

As far as I can tell, the only content inside a ruby tag should be annotated text or its annotation. The 1, which should not be annotated, should not be inside the <ruby> tag.

I get a good result with this:

    <ruby>c<rt>1</rt></ruby>1<ruby>c<rt>2</rt>c<rt>3</rt>c<rt>4</rt>c<rt>5</rt>c<rt>6</rt></ruby>1
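
The same idea can be generated rather than typed by hand. Below is a minimal Python sketch, not part of smiview: `smiles_to_ruby` and the `ATOM` pattern are hypothetical names, and the tokenizer is an assumption that only covers bracket atoms, Cl/Br, and single-letter organic-subset atoms. Each atom gets its own <ruby> element, and everything else (ring-closure digits, bonds, parentheses) stays outside.

```python
import re
from html import escape

# Hypothetical, deliberately small tokenizer: bracket atoms, Cl/Br, and
# single-letter organic-subset atoms (aliphatic or aromatic). It is not a
# full SMILES parser.
ATOM = re.compile(r"\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops]")

def smiles_to_ruby(smiles: str) -> str:
    """Wrap each atom in its own <ruby>, with a 1-based index as <rt>."""
    parts = []
    pos = 0
    index = 0
    for match in ATOM.finditer(smiles):
        # Text between atoms (digits, bonds, parentheses) stays unannotated
        # and outside any <ruby> element.
        parts.append(escape(smiles[pos:match.start()]))
        index += 1
        parts.append(f"<ruby>{escape(match.group())}<rt>{index}</rt></ruby>")
        pos = match.end()
    parts.append(escape(smiles[pos:]))
    return "".join(parts)

print(smiles_to_ruby("c1ccccc1O"))
```

For phenol this numbers the six ring carbons 1 to 6 and the oxygen 7, with both ring-closure "1" digits left bare.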

alin23

Indeed, this looks quite good: https://cln.sh/5MWsIR

dalke

Thanks to you for the working demo, and to gvx for figuring it out!

UPDATE: Here's an example for theobromine - https://jsfiddle.net/j84z1kyb/ .

It looks great!

But it's not as useful as I hoped it would be. Copy&paste captures the numeric annotations. I probably should have expected that, but didn't.

And highlighting is wonky. In Safari and Firefox, I seem to get a character and the ruby annotation for the character next to it, more often than I do the one overhead.
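
On the copy&paste problem, one workaround outside the browser is to strip the annotations again when the plain string is needed. This is only a sketch using Python's standard `html.parser`; the `RubyBaseText` class and `base_text` function are hypothetical names, and it assumes each <rt> is either closed explicitly or ended by </ruby>.

```python
from html.parser import HTMLParser

class RubyBaseText(HTMLParser):
    """Collect text outside <rt>/<rp>, i.e. the base text only."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.skipping = False

    def handle_starttag(self, tag, attrs):
        if tag in ("rt", "rp"):
            self.skipping = True

    def handle_endtag(self, tag):
        # </ruby> also ends an <rt> whose closing tag was omitted.
        if tag in ("rt", "rp", "ruby"):
            self.skipping = False

    def handle_data(self, data):
        if not self.skipping:
            self.chunks.append(data)

def base_text(markup: str) -> str:
    parser = RubyBaseText()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)

annotated = ("<ruby>c<rt>1</rt></ruby>1<ruby>c<rt>2</rt>c<rt>3</rt>"
             "c<rt>4</rt>c<rt>5</rt>c<rt>6</rt></ruby>1")
print(base_text(annotated))  # c1ccccc1
```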

OJFord

I know nothing about this, but use a fixed width font perhaps (if you weren't)? Sounds like it could be because the '1' is narrower.

danschuller

This is cool.

It's something I've toyed with putting into my toy font renderers, but it always seemed to have a lot of edge cases. The ruby text can overflow the width of the parent; in some to most cases a little overflow is OK, but that's certainly not guaranteed. Scaling down the ruby text isn't an ideal solution because it quickly becomes unreadable. The other option is to scale the spacing in the parent text, which seems to be done for <ruby>境界面<rt>インターフェース</ruby> in the specification https://html.spec.whatwg.org/multipage/text-level-semantics.... but then that's going to impact line wrapping and so on. Kudos to the implementers!

robin_reala

Weirdly I’ve only used this in anger outside of Japanese text, to replicate a semantic layout in the original printing of Tristram Shandy for Standard Ebooks (see book 3, chapter 11, the Latin version: https://standardebooks.org/ebooks/laurence-sterne/the-life-a...).

kingcharles

What does the "ruby" gloss mean in the linked text? Why do some words have it? It's been many a decade since I took Latin...

hcayless

At a quick glance, it looks like it’s giving plural forms as alternatives. So the document can refer to a person or a group.

kazinator

I used that for the furigana over the made-up words in a Jabberwocky translation.

http://www.kylheku.com/~kaz/gayabōkin.html

wodenokoto

The `<rp>` tag shown in the examples isn't explained on the page, but it is a fallback: something that should be rendered if the ruby tag is not understood.

Sadly, the rp page doesn't show any examples of what fallback behavior might look like.

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/rp

kelnos

In general, browsers will just drop tags they don't understand, while still rendering the tags' contents (as normal in-line text). So if, like in the example, you did:

    <ruby>
    明日 <rp>(</rp><rt>Ashita</rt><rp>)</rp>
    </ruby>
But the browser didn't support <ruby>, it would just render as:

明日 (Ashita)

The idea is that a <ruby>-supporting browser wouldn't need to render the parentheses, because it's going to display "Ashita" in a special way that sets it off from the regular text (so you "annotate" the parens with <rp>). But in a browser that doesn't support <ruby>, you'd want it to still display in a sane, understandable way, where it would still be easy to understand that the added text is a pronunciation hint.
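
To make the "drop the tags, keep the contents" behavior concrete, here is a rough Python sketch that imitates it with the standard `html.parser` module. The `TextOnly` class and `fallback_text` function are hypothetical names, and the whitespace collapsing is only an approximation of inline rendering.

```python
from html.parser import HTMLParser

class TextOnly(HTMLParser):
    """Ignore every tag and keep the text, like a browser without <ruby>."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def fallback_text(markup: str) -> str:
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    # Collapse runs of whitespace, roughly as inline HTML rendering would.
    return " ".join("".join(parser.chunks).split())

markup = """<ruby>
明日 <rp>(</rp><rt>Ashita</rt><rp>)</rp>
</ruby>"""
print(fallback_text(markup))  # 明日 (Ashita)
```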

shawnz

Simply remove all the tags to see how the fallback behaviour would look, for example data:text/html;charset=utf-8,%E6%BC%A2%20(kan)%20%E5%AD%97%20(ji)
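
A data: URL like that can be built with Python's `urllib.parse.quote`; this is just a sketch, and `fallback_data_url` is a hypothetical helper name. Parentheses are marked safe so the result matches the URL above.

```python
from urllib.parse import quote

def fallback_data_url(text: str) -> str:
    # Percent-encode the text as UTF-8, leaving parentheses literal.
    return "data:text/html;charset=utf-8," + quote(text, safe="()")

print(fallback_data_url("漢 (kan) 字 (ji)"))
```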

kingcharles

I assume this was added as a way to implement language features such as Furigana, which is a minor, but useful, feature of written Japanese: https://en.wikipedia.org/wiki/Furigana

Would this make sense for putting romanized text above non-roman languages?

Currently the standard is just to write the native text and then the romanized. See, e.g.: https://en.wikipedia.org/wiki/Weekly_Sh%C5%8Dnen_Jump

I'm thinking Hepburn above the kana: https://en.wikipedia.org/wiki/Hepburn_romanization

Anyone with more language knowledge want to cuss me out for this idea?

divbzero

MDN describes <ruby>’s typical usage for showing pronunciation of East Asian characters but only gives examples for Japanese.

Wikipedia [1] offers a few additional examples for other languages.

Chinese (pinyin):

  <ruby>
    北 <rp>(</rp><rt>běi</rt><rp>)</rp>
    京 <rp>(</rp><rt>jīng</rt><rp>)</rp>
  </ruby>
Chinese (zhuyin):

  <ruby>
    北 <rp>(</rp><rt>ㄅㄟˇ</rt><rp>)</rp>
    京 <rp>(</rp><rt>ㄐ丨ㄥ</rt><rp>)</rp>
  </ruby>
Korean (hangul):

  <ruby>
    韓 <rp>(</rp><rt>한</rt><rp>)</rp>
    國 <rp>(</rp><rt>국</rt><rp>)</rp>
  </ruby>
Vietnamese (chữ Quốc ngữ):

  <ruby>
    河 <rp>(</rp><rt>Hà</rt><rp>)</rp>
    內 <rp>(</rp><rt>Nội</rt><rp>)</rp>
  </ruby>
[1]: https://en.wikipedia.org/wiki/Ruby_character
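
All four examples share the same shape, so the markup can be generated from (base, reading) pairs. A minimal Python sketch follows; `ruby_markup` is a hypothetical helper, not an existing library function, and it omits the whitespace used in the hand-written examples.

```python
from html import escape

def ruby_markup(pairs, open_paren="(", close_paren=")"):
    """Build one <ruby> element with <rp> fallbacks around each <rt>."""
    body = "".join(
        f"{escape(base)}<rp>{escape(open_paren)}</rp>"
        f"<rt>{escape(reading)}</rt><rp>{escape(close_paren)}</rp>"
        for base, reading in pairs
    )
    return f"<ruby>{body}</ruby>"

print(ruby_markup([("北", "běi"), ("京", "jīng")]))
```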

antonkar

I used it in my old and free iOS web browser to put translation (Spanish, French…) or Pinyin on top of English or Chinese words https://apps.apple.com/app/id932996489

akaBruce

For those studying a language that might benefit from this, I have this CSS in my Anki cards. I use the ruby tag to remind me of readings for things that aren't the main focus of the card I'm working with, for example when a vocab word is used in an example sentence but one of the other words in the sentence is unfamiliar to me.

It shows the rt tag on hover or focus and works for me for both mouse and touch on Anki and AnkiDroid. Maybe this or some variation might help others as well.

  ruby {
   text-decoration: underline dotted;
  }
  ruby rt {
   visibility: hidden;
  }
  ruby:hover rt, ruby:focus rt {
   visibility: visible;
  }

nyuszika7h

When I read the headline I thought it's some weird new syntax for embedding Ruby code snippets in HTML.

This is definitely not going to be confusing... /s

alin23

My initial title was:

    The <ruby> HTML element
But the ruby tag got stripped by HN and I ended up with

    The  HTML element

Izkata

Hum... Does it accept

  &lt;ruby&gt;

?

alin23

Not really, it renders:

    The andlt;Ruby> HTML element
https://cln.sh/LteC0S

makach

I thought the same. Now that I've read the documentation, I think I will be fine and confusion is less likely.

7c7599bfe5df

It's been around for a decade in browsers[1], and the terminology itself predates the Ruby language.

You haven't been confused for the last 11 years, so this doesn't seem to have been a problem.

[1] The W3C spec is even older.

notreallyserio

> You haven't been confused for the last 11 years

You underestimate me! Or over.

kingcharles

Now I'm even more confused!

thrashh

I think I used it in 2005 so it’s definitely olddd.

jhvkjhk

Although Ruby element is basically a Japanese thing, it can be used to display both original text and translated text. I think using ruby rather than two-column-view is far better.

For example, this script[1] will show the original English word above Japanese loan words, using the ruby element.

[1]: https://greasyfork.org/en/scripts/33268-katakana-terminator

LAC-Tech

> Although Ruby element is basically a Japanese thing

It's commonly used in textbooks for different Chinese languages. I have ministry of education textbooks from Taiwan, and ruby characters are used for both Hokkien and Mandarin (the Hokkien one has two different ruby character scripts which is quite visually busy).

I would imagine it would be handy for Hindi learners as well. And probably hundreds of other languages, though I can't say whether it is actually used for them.

philsnow

This is neat, and immediately made me think of the annotations that show up when you hit the play button on https://lowerquality.com/gentle/ , but it turns out those are made with absolutely-positioned divs and a lot of offline-precalculated px math.