Lexbor – An open source HTML Renderer library

chearon

The title made me think this could actually lay out and paint HTML, but I couldn't find anything remotely layout-related in the source tree. Then I found this comment saying even block sizing isn't done: https://github.com/lexbor/lexbor/issues/219#issuecomment-207.... Looks like nice groundwork, though. It's nice to see things like parsing and Unicode being part of the same source tree.

nicoburns

We have a decent chunk of layout and paint implemented in an HTML renderer I'm working on (https://github.com/DioxusLabs/blitz), which is targeting the "Electron" use case (but with a Rust scripting interface rather than a JS one).

The implementation is currently very immature and there are a lot of bugs and missing features (I only got a first cut of inline layout working yesterday, though we already have flexbox and grid implemented), but we're already seeing pretty decent results on a bunch of real-world web pages and hope to be at the point where we can render most of the web (excl. JS) in the next 6-12 months.

There are some screenshots on the PR for the inline layout branch https://github.com/DioxusLabs/blitz/pull/63

yencabulator

Sometimes it's really hard to tell the exact boundary between current day software development and elaborate jokes:

> Blitz builds upon:

> Parley for text/inline-level layout

> Currently, Parley directly depends on four crates: Fontique, Swash, Skrifa, and Peniko.

> Peniko builds on top of kurbo

Kiro

I interpreted your comment as this being unfinished but then I heard that PHP has already switched from libxml2 to Lexbor so I guess it's production-ready.

lmz

I guess PHP isn't using it for rendering (as in the title), just the parsing parts.

bratao

We have been using https://github.com/rushter/selectolax as a faster alternative to BeautifulSoup with html5lib because many malformed webpages in the wild don't work with lxml.

nwellnhof

The problem is that libxml2's 20-year-old HTML parser never supported HTML5 [1], leading to more and more problems with downstream consumers like lxml, PHP, or Nokogiri. PHP recently switched to Lexbor [2] and Nokogiri to libgumbo [3]. That said, I'm hopeful to receive enough funding to implement an HTML5 parser in libxml2.

[1] https://gitlab.gnome.org/GNOME/libxml2/-/issues/211

[2] https://wiki.php.net/rfc/domdocument_html5_parser

[3] https://github.com/sparklemotion/nokogiri/issues/2204

postepowanieadm

libxml is an XML parser; HTML5 is not XML.
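
To illustrate the difference, here is a minimal sketch using only Python's standard library (not libxml2 or Lexbor; the helper names `parses_as_xml` and `html_start_tags` are made up for this example). A strict XML parser rejects a void element such as `<br>` that has no end tag, while an HTML parser accepts it. Note that `html.parser` is a lenient tokenizer rather than a full HTML5 tree builder, so this only shows the syntactic mismatch.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Valid HTML: <br> is a void element and takes no end tag.
SNIPPET = "<p>one<br>two</p>"


def parses_as_xml(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


class TagCollector(HTMLParser):
    """Hypothetical helper: records every start tag the HTML tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_start_tags(markup: str) -> list:
    """Return the start tags found by Python's lenient HTML tokenizer."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    print(parses_as_xml(SNIPPET))              # False: </p> closes an open <br>
    print(parses_as_xml("<p>one<br/>two</p>"))  # True: XML-style self-closing tag
    print(html_start_tags(SNIPPET))            # ['p', 'br']
```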

tedunangst

It's a bit late to be saying that to people already using libxml because "It should be able to parse "real world" HTML." https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2...

sgc

Speaking of which, I don't understand why not. It seems like it would have been trivial to keep html5 a true xml. I do not understand what the actual technical reason for not doing that was. Naively, it just seems like breaking compatibility out of disdain rather than actually useful progress. Saving a couple of characters every once in a while does not justify the change, so I presume there must be a better reason?

thomasfromcdnjs

Ah this answers my question in another comment.

Thanks!

hliyan

Rarely does one see a C++ quick start guide that's actually this quick: https://lexbor.com/docs/lexbor/#quick_start

lelanthran

> Rarely does one see a C++ quick start guide that's actually this quick: https://lexbor.com/docs/lexbor/#quick_start

Could be because it isn't C++?

zamadatix

Step 1 is a bit of a "draw the rest of the owl" step in that it's either done for you on your specific platform with default settings already or you have to go do all of the actually hard stuff of building the app (and sure enough that's where the typical cmake build step is hidden as well). Step 2 is just "and remember to link your code against the hard part when you compile it, by the way here's a single minimal example".

Maxatar

Step 1 is:

  cmake .
  make
  make install

boxed

C, not C++

hartator

We open sourced our Ruby bindings and port:

- https://github.com/serpapi/nokolexbor

- https://serpapi.com/blog/nokolexbor-a-performance-focused-ht...

It is super fast compared to Nokogiri with libxml.

thomasfromcdnjs

Inspiring infrastructure.

The module aspect is super cool, is there much adoption with any other projects using the individual modules? e.g. a webparser using the dom module

troupo

Quite unusual to see Elixir among languages supported via bindings

lelanthran

> Quite unusual to see Elixir among languages supported via bindings

Not due to difficulty, usually. Bindings to non-mainstream languages are unusual to see.

I've never heard of a language that couldn't interface with C in one way or another; it's one of the advantages of using C over (say) C++.
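
As a rough sketch of what "interfacing with C" can look like from another language, the snippet below calls the C library's `strlen` from Python through the standard `ctypes` module. It assumes a POSIX system where the C library can be found (or is already loaded into the process); `load_libc` and `c_strlen` are names invented for this example, and real Lexbor bindings are of course far more involved.

```python
import ctypes
import ctypes.util


def load_libc() -> ctypes.CDLL:
    """Load the C library (assumption: POSIX system).

    find_library may return None; CDLL(None) then exposes the symbols
    already loaded into the current process, which include libc on POSIX.
    """
    return ctypes.CDLL(ctypes.util.find_library("c"))


def c_strlen(data: bytes) -> int:
    """Call C's strlen on a NUL-terminated byte string."""
    libc = load_libc()
    # Declare the C signature so ctypes converts arguments and result correctly.
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t
    return libc.strlen(data)


if __name__ == "__main__":
    print(c_strlen(b"lexbor"))  # 6
```

C's plain function symbols and simple calling convention are what make this kind of call possible without any glue code on the C side.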
