How do I design a multi-language database?

orsetto@lemmy.dbzer0.com · 3 months ago


TehPers@beehaw.org · 3 months ago

Localization is a hard problem, but storing your translations in the DB is a bit unusual unless you’re trying to translate user data or something.

I’d recommend looking into tools like Project Fluent or similar that are designed around translating.

As for the schema you have, if you’re sticking with it, I would store the language as an IETF language tag (such as en-US or pt-BR) or similar instead. The important part is that it separates language variants. For example, US English and British (or international) English have differences, Brazilian Portuguese and European Portuguese have differences, Mexican Spanish and European Spanish have differences, etc.
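To make the variant point concrete, here is a minimal sketch of looking up a translation by tag and falling back from a regional variant to its base language. The function names and the sample strings are made up for illustration, and the truncation logic is a simplification in the spirit of BCP 47 lookup, not a full implementation of it.

```python
from __future__ import annotations


def fallback_chain(tag: str) -> list[str]:
    """Return the tag followed by progressively shorter prefixes.

    Simplified: subtags are dropped from the right, one at a time.
    """
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def pick_translation(requested: str, available: dict[str, str]) -> str | None:
    """Pick the most specific available translation for the requested tag."""
    # Language tags are case-insensitive, so compare in lowercase.
    by_lower = {key.lower(): text for key, text in available.items()}
    for candidate in fallback_chain(requested.lower()):
        if candidate in by_lower:
            return by_lower[candidate]
    return None


# Hypothetical sample data: one regional variant and one base language.
translations = {"pt-BR": "Olá, pessoal", "en": "Hello, everyone"}

print(pick_translation("pt-BR", translations))  # exact match on the variant
print(pick_translation("en-GB", translations))  # falls back to "en"
print(pick_translation("it", translations))     # nothing available
```

Note that a request for pt-PT would not fall back to pt-BR here, which is the point: the two variants are kept apart unless you decide otherwise.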

Using an ID instead of the text content itself as part of the PK should be a no-brainer. Languages evolve over time, and translations change. PKs should not. Your choice of PK = (TextContentId, Language) is the most reasonable to me, though I still think that translations should live as assets of your application instead, so they integrate better with existing localization tools.
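A rough sketch of that PK choice, using SQLite through Python just to keep it runnable. The table and column names are my guesses at what your schema looks like, not something taken from it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables; names are assumptions for illustration.
    CREATE TABLE text_content (
        id INTEGER PRIMARY KEY
    );
    CREATE TABLE translation (
        text_content_id INTEGER NOT NULL REFERENCES text_content(id),
        language        TEXT    NOT NULL,  -- IETF tag such as 'en-US'
        body            TEXT    NOT NULL,
        PRIMARY KEY (text_content_id, language)
    );
""")

conn.execute("INSERT INTO text_content (id) VALUES (1)")
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?)",
    [(1, "en-US", "Pick a color"), (1, "it", "Scegli un colore")],
)

# Rewording a translation only touches the body; the key stays the same.
conn.execute(
    "UPDATE translation SET body = ? WHERE text_content_id = ? AND language = ?",
    ("Choose a color", 1, "en-US"),
)


def lookup(text_content_id, language):
    """Return the translated body, or None if there is no such row."""
    row = conn.execute(
        "SELECT body FROM translation WHERE text_content_id = ? AND language = ?",
        (text_content_id, language),
    ).fetchone()
    return row[0] if row else None


print(lookup(1, "en-US"))
```

The composite key also means a second row for the same text and language is rejected, so each piece of content has at most one translation per language tag.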

One last thing: people tend to believe that translating is enough to localize. It is not. For example, RTL languages often swap the entire UI direction to RTL, not just the text direction. Also, different cultures sometimes attach different meanings to colors and icons, so those may need to change too.
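As a small illustration of the RTL point, this sketch derives a UI direction from a language tag. The set of languages is a deliberately short, non-exhaustive sample, and the function name is hypothetical. A real application would also have to account for script subtags, since some languages are written in more than one script.

```python
# Illustrative, non-exhaustive sample of languages usually written right-to-left.
RTL_LANGUAGES = {"ar", "he", "fa", "ur"}


def ui_direction(tag: str) -> str:
    """Return 'rtl' or 'ltr' based on the primary language subtag only."""
    primary = tag.split("-")[0].lower()
    return "rtl" if primary in RTL_LANGUAGES else "ltr"


for tag in ("en-US", "it", "ar-EG", "he"):
    print(tag, ui_direction(tag))
```

On a website, a value like this would typically end up in the `dir` attribute of the root element, which flips the layout as well as the text.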

orsetto@lemmy.dbzer0.com · 3 months ago

Sorry, I didn’t think to add in the post that the translations are in fact of user generated content, and are themselves provided by users.

Project Fluent is still a good resource tho, thank you.

And also yeah, I’ll use a better schema for language tags, that’s a clear fault

> Using an ID instead of the text content itself as part of the PK should be a no-brainer. Languages evolve over time, and translations change. PKs should not.

~~I still don’t get why having a separate table for languages is useful. I mean, even if the translation changes, the language itself will remain the same, right?~~

Oh, right. Taking into account language variants makes VERY obvious why I’d want to use a table to store them.
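A quick sketch of what a separate language table could buy here, again with SQLite through Python. The table names, column names, and the list of tags are placeholders I made up. The idea is only that the table defines which variants are accepted, and a foreign key rejects everything else.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical tables; names are assumptions for illustration.
    CREATE TABLE language (
        tag  TEXT PRIMARY KEY,   -- IETF tag, one row per supported variant
        name TEXT NOT NULL
    );
    CREATE TABLE translation (
        text_content_id INTEGER NOT NULL,
        language        TEXT    NOT NULL REFERENCES language(tag),
        body            TEXT    NOT NULL,
        PRIMARY KEY (text_content_id, language)
    );
""")

conn.executemany(
    "INSERT INTO language VALUES (?, ?)",
    [
        ("en-US", "English (United States)"),
        ("en-GB", "English (United Kingdom)"),
        ("it", "Italian"),
    ],
)


def add_translation(text_content_id, language, body):
    """Insert a translation; return False if the language tag is not supported."""
    try:
        conn.execute(
            "INSERT INTO translation VALUES (?, ?, ?)",
            (text_content_id, language, body),
        )
        return True
    except sqlite3.IntegrityError:
        return False


print(add_translation(1, "en-GB", "Choose a colour"))  # known variant
print(add_translation(1, "en-AU", "Choose a colour"))  # not in the language table
```

Adding a new variant later is then a single row in the language table, with no changes to the translation table itself.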

> people tend to believe that translating is enough to localize. It is not.

Honestly, I just hope that won’t be something I have to worry about. The rest of the codebase is as shitty as it gets, and I don’t want to be the one to refactor it for proper localization. I’m implementing a new feature that gives me some room to think about a good design for it and for future features, but this is as far as I’ll go (Yes I know I probably sound like an ass but it really is that bad)

TehPers@beehaw.org · 3 months ago

> I know I probably sound like an ass but it really is that bad

Nah I work in shitty codebases on a regular basis, and the less I need to touch them, the happier I am.

With regards to other localization changes, it’s not important to localize everything perfectly, but it’s good to be aware of what you can improve and what might cause some users to be less comfortable with the interface. That way you’re informed and can properly justify a sacrifice (like “it’d cost us a lot of time to support RTL interfaces but only 0.1% of users would use them”) rather than be surprised that there even is one being made.

Also, user-generated content explains why these are in a DB, and now it makes a lot more sense to me. Using user-generated translations as-is makes more sense than trying to force Project Fluent (or other similar tools) into it.

orsetto@lemmy.dbzer0.com · 3 months ago

I mean, for now nobody is asking for languages besides Italian and English, and I’m pretty sure my employer will never care about languages he doesn’t speak, so the chances of needing languages that require work beyond translation are basically nil.

xianjam@programming.dev · 3 months ago

Why is storing resources in the database unusual? I’ve done that my entire career, and I’ve believed it to be a best practice.

TehPers@beehaw.org · 3 months ago

Storing UI assets in a database is unusual because assets aren’t data, they are part of your UI. This is of course assuming a website - an application may choose to save assets in a local sqlite database or similar for convenience.

It’s the same reason I wouldn’t store static images in a database though - there’s no reason to do so. Databases provide no additional value over just storing the images next to the code, and same with localizations.

User-generated content changes things because that data is now dynamically generated, not static assets for a frontend.