Jenga Stack

als@lemmy.blahaj.zone · 1 day ago

chellomere@lemmy.world · edit-2 7 hours ago

HarfBuzz does OpenType shaping, that is, transforming strings of Unicode characters into lists of positioned glyphs. The significance of this can be hard to appreciate for someone used to the Latin script, which needs very little shaping: kerning is often the only thing that’s absolutely necessary.

But in complex scripts, most notably the Indic ones, there’s a lot going on. Several Unicode characters can merge into one glyph under certain circumstances, or one character can split into several glyphs, and relative positioning along both the x and y axes is imperative.

One reason OpenType shaping is complex is that some of the rules live in the font, while others have to be hard-coded in the shaper that implements it.

If you’re going to roll your own text renderer, you’ll have to care about the following areas:

Rasterization/rendering to bitmaps, including hinting (notoriously difficult; old-style TrueType hinting instructions are bytecode, so you’ll be writing a tiny VM for this)
Shaping (Kerning at a minimum, full OpenType shaping for international support)
BiDi (for full international support, primarily Hebrew and Perso-Arabic)
A caching system for rendered glyphs and shaped text runs, as it will be too slow to redo this work each time you want to render some text
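
To illustrate the last item, here is a minimal sketch of a least-recently-used cache for shaped runs. The names ShapeCache and fake_shape are hypothetical, and fake_shape is only a stand-in for a real shaper such as HarfBuzz; the cache key (font, size, text) is an assumption about what would influence the shaping result in a simple setup.

```python
from collections import OrderedDict


class ShapeCache:
    """LRU cache for shaped text runs, keyed by (font id, size, text)."""

    def __init__(self, shape_fn, capacity=1024):
        self._shape_fn = shape_fn      # the expensive shaping function
        self._capacity = capacity      # maximum number of cached runs
        self._entries = OrderedDict()  # oldest entry first
        self.hits = 0
        self.misses = 0

    def shape(self, font_id, size_px, text):
        key = (font_id, size_px, text)
        if key in self._entries:
            # Mark the entry as most recently used and reuse the result.
            self._entries.move_to_end(key)
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        result = self._shape_fn(font_id, size_px, text)
        self._entries[key] = result
        if len(self._entries) > self._capacity:
            # Evict the least recently used run.
            self._entries.popitem(last=False)
        return result


def fake_shape(font_id, size_px, text):
    # Hypothetical placeholder: one "glyph" per code point, no positioning.
    return tuple(ord(ch) for ch in text)


cache = ShapeCache(fake_shape, capacity=2)
cache.shape("mono", 14, "let x = 1;")
cache.shape("mono", 14, "let x = 1;")  # second call is served from the cache
```

A code editor redraws the same lines many times while scrolling, so keying on whole lines (or runs) tends to give a high hit rate; a separate cache for rasterized glyph bitmaps would follow the same pattern.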

Let’s just say that I do not recommend going this route unless you’re prepared to spend a lot of time on it.

calcopiritus@lemmy.world · edit-2 5 hours ago

I’ve got all that. I just needed to convert a string of characters into a list of glyph IDs.

For context, I’m doing a code editor.

I don’t use HarfBuzz for shaping or whatever, since I planned on rendering single lines of monospaced text. I can do everything except the string->glyphs conversion.
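
For readers following along, the simplest form of string->glyphs conversion is a per-character lookup in the font’s character map (the cmap table). The sketch below uses a made-up dictionary, TOY_CMAP, with invented glyph IDs in place of a parsed cmap table; the only non-hypothetical detail is that glyph ID 0 is the “missing glyph” (.notdef) in OpenType fonts.

```python
NOTDEF = 0  # glyph ID 0 is reserved for the missing glyph (.notdef)

# Hypothetical code point -> glyph ID mapping; a real one comes from the
# font's cmap table and the IDs differ from font to font.
TOY_CMAP = {ord("-"): 16, ord(">"): 33, ord("a"): 68}


def nominal_glyphs(text, cmap):
    """Map each code point to its nominal glyph, falling back to .notdef."""
    return [cmap.get(ord(ch), NOTDEF) for ch in text]


print(nominal_glyphs("a->", TOY_CMAP))
```

This one-to-one mapping is all that plain monospaced ASCII needs. Ligatures and anything script-specific happen afterwards, as substitutions on the resulting glyph list, which is where the hard part starts.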

Just trying to implement basic features such as ligatures is incredibly hard, since there’s almost no documentation. Without it, you can’t safely make the assumptions you’d need to take shortcuts and optimize. I don’t know if HarfBuzz uses a source of documentation that I haven’t been able to find, or if they are just way smarter than me, or if fonts are made to work with HarfBuzz instead of the other way around.

As someone trying to have as few dependencies as possible, it is a struggle. But at the same time, HarfBuzz saved me so much time.

EDIT: I don’t do my own glyph rasterization, but that’s because I haven’t gotten to it yet, so I do use a library. I don’t know if it’s going to be harder than string->glyphs, but I doubt it.

chellomere@lemmy.world · edit-2 3 hours ago

It would make sense that a code editor could use a more limited subset of text rendering that could be more optimized.

Perhaps a bit surprisingly, Microsoft actually has pretty good documentation on OpenType. Here’s info on what shaping applies to “standard” scripts:

https://learn.microsoft.com/en-us/typography/script-development/standard

And here’s the landing page for the latest OpenType spec:

https://learn.microsoft.com/en-us/typography/opentype/spec/

Specifically for ligatures, you’re looking for the liga feature, which is defined in the font’s GSUB (glyph substitution) table.
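
As a rough illustration of what a ligature substitution does once the lookups have been read from GSUB, here is a toy sketch loosely modeled on the ligature substitution lookup type (LookupType 4): ligatures are grouped by their first glyph, and the first entry whose remaining components match wins. The function name, the data layout, and all glyph IDs are invented for the example; a real implementation also has to parse the binary tables, pick lookups by script, language, and feature, and honor lookup flags.

```python
# Hypothetical ligature sets: first glyph -> ordered list of
# (remaining component glyphs, ligature glyph). Entries are tried in order,
# so longer ligatures are listed before shorter ones.
TOY_LIGATURES = {
    16: [((16, 33), 901),  # "-", "-", ">"  ->  "-->" ligature
         ((33,), 900)],    # "-", ">"       ->  "->" ligature
}


def apply_ligatures(glyphs, ligature_sets):
    """Replace matching glyph sequences with their ligature glyph."""
    out = []
    i = 0
    while i < len(glyphs):
        for rest, ligature in ligature_sets.get(glyphs[i], ()):
            end = i + 1 + len(rest)
            if tuple(glyphs[i + 1:end]) == tuple(rest):
                out.append(ligature)
                i = end  # skip over all consumed components
                break
        else:
            # No ligature starts here; keep the glyph unchanged.
            out.append(glyphs[i])
            i += 1
    return out


print(apply_ligatures([68, 16, 33], TOY_LIGATURES))
```

One consequence for a monospaced editor is that the output can contain fewer glyphs than the input, so column positions have to be tracked separately from glyph indices.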