Open Clonk in Polish.

Topic Development / Developer's Corner / Open Clonk in Polish.

By Don

Date 2011-03-27 15:31

Hello, I come from the Polish Clonk forum. As Clonk Rage isn't being developed anymore, OpenClonk is our forum's last hope. I come with an offer: I can translate the OpenClonk files from English to Polish, so that the game is translated. What do you think about that? I only need the files to translate.

(By the way, the address of the Polish Clonk forum is: www.clonk.glt.pl)

By Günther [de]

Date 2011-03-27 17:55

Welcome to OpenClonk! We haven't done any work on the translation facilities yet, but the old mechanisms from Clonk Endeavour should still be working. We'd need to get the scripts from Redwolf Design to extract the strings, but that shouldn't be a problem. That said, the old translation method was rather ad hoc. Perhaps the arrival of actual interest in translations will motivate someone to write gettext support?

In the meantime, the files are all available with hg.

By bahamada [de]

Date 2011-03-27 20:49

I don't know if it helps but in Launchpad there is a quite good community tool to do translations via a web interface. I don't know how exactly it works but this could be a possibility to provide localizations to many other countries too.

Edit:
Explanation on translations with launchpad.

By Günther [de]

Date 2011-03-27 21:42

Launchpad is indeed an example of the advantages that using gettext has. I'm not that fond of Launchpad itself, but there are lots and lots of other tools that support gettext.

By bahamada [de]

Date 2011-03-28 14:20

When you scroll down a bit, they write:

>With Launchpad, all you'd need to do is publish a "translation template"
>-- in the form of a GNU Gettext .pot file -- via a web form when it best
>suits you (for example, when your next version is feature complete). Once
>published in Launchpad Translations, it would be made available to the
>community of translators who would be then able to translate it. When
>you're ready to release your next version, you would request an 'export' of
>all the available translations, which you would get in a tar.gz archive.

So would the use of gettext make sense for us?

By Caesar

Date 2011-03-27 21:09

We could also just include the translation files into the game code.

By AlteredARMOR [ua]

Date 2011-03-28 07:47

By the way, that leads us to an important question.

As OpenClonk is developed further and our community grows larger and larger, we will at some point need to translate OpenClonk into languages other than English and German. And when that happens we would not be satisfied with the current "translation mechanism" (which is currently not much better than no mechanism at all).

We will need an easy and comfortable system for providing adequate translations of all strings available in-game. Many (mainly open-source) projects provide a list of all strings present in the application, with the possibility for anyone to post an appropriate translation in any language they like. Aside from that, we can consider developing a tool to help create translation packs in an elegant manner.

Not an issue of high urgency, though. We can consider it at a later time.

P.S. I personally can (and would love to) help translate OpenClonk into Russian and Ukrainian ;-)

By PeterW

Date 2011-03-28 15:31

> And when that happens we would not be satisfied with the current "translation mechanism" (which is currently not much better than no mechanism at all).

Huh? How much better than having string tables can we do? As far as I remember, that's what gettext does as well, only they provide a few more tools to work with them. Maybe we'd just make our format match that "po" stuff more closely?
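To make the "match that po stuff more closely" idea concrete, here is a minimal Python sketch that turns a key-based string table into PO-style entries. It assumes the string table is plain `Key=Value` lines; the keys (`IDS_EAT`, `IDS_ACTIVATE`) and the helper names are made up for illustration. Using the old key as `msgctxt` is just one possible mapping, not something the thread settles on.

```python
def parse_string_table(text):
    """Parse 'Key=Value' lines into a dict, skipping blank lines and lines without '='."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        table[key.strip()] = value.strip()
    return table


def po_quote(s):
    """Escape backslashes and double quotes, then wrap the string in double quotes."""
    return '"' + s.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_po(english, translated):
    """Emit one PO entry per key: key as msgctxt, English text as msgid,
    translation (or an empty string if missing) as msgstr."""
    entries = []
    for key, source in english.items():
        entries.append(
            "msgctxt {}\nmsgid {}\nmsgstr {}\n".format(
                po_quote(key), po_quote(source), po_quote(translated.get(key, ""))
            )
        )
    return "\n".join(entries)


# Hypothetical sample tables (assumption: one 'Key=Value' pair per line).
us = parse_string_table("IDS_EAT=Eat\nIDS_ACTIVATE=Activate item\n")
de = parse_string_table("IDS_EAT=Essen\n")
print(to_po(us, de))
```

An untranslated key ends up with an empty `msgstr`, which is how PO files conventionally mark a string that still needs translating.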

By Günther [de]

Date 2011-03-28 19:02

gettext also has some features to deal with different rules for embedded numbers (not all languages simply append an s for the plural, or even have just two forms for singular and plural), and the format uses the original English string (optionally with an additional key for disambiguation) as the key to look up the string. It also provides a more efficient binary format to use at runtime. There's probably more.
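Polish, the language this thread started with, is a good illustration of the plural point: it has three forms rather than two. The sketch below is plain Python, not gettext itself; the catalog layout and the function names are assumptions made for the example. It only shows the idea of choosing a form by count and keying the catalog on the English source string.

```python
def polish_plural_index(n):
    """Return 0, 1 or 2: which of the three Polish plural forms fits the count n."""
    if n == 1:
        return 0
    # Counts ending in 2, 3 or 4 take the second form, except 12, 13 and 14.
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


# Hypothetical catalog: English singular string as key, list of plural forms as value.
CATALOG_PL = {
    "%d file": ["%d plik", "%d pliki", "%d plików"],
}


def ngettext_sketch(catalog, singular, plural, n):
    """Pick the translated form for n, falling back to English if untranslated."""
    forms = catalog.get(singular)
    if forms is None:
        return (singular if n == 1 else plural) % n
    return forms[polish_plural_index(n)] % n


for count in (1, 3, 5, 12, 22):
    print(ngettext_sketch(CATALOG_PL, "%d file", "%d files", count))
```

With the sample catalog, 1 gives "1 plik", 3 and 22 give "pliki", and 5 and 12 give "plików". A string table that stores a single value per key cannot express this.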

By AlteredARMOR [ua]

Date 2011-03-29 07:57

Yep, string tables are perfectly fine.
Now all we need is a user-friendly tool for editing them :-)

By PeterW

Date 2011-03-29 10:54

Like, say, a good text editor? :P

What more than "here's the English text, what's that in your language?" can a tool realistically do? The only problem with that approach can be the lack of context, but that's difficult to establish... Showing the source code probably won't help most non-technical folks, for example.

By AlteredARMOR [ua]

Date 2011-03-29 11:07

> "here's the English text, what's that in your language?"

That is exactly how I imagined "the tool". It should display two columns of text, where the first column contains the English words/phrases/sentences and the second one contains edit fields for the translation into another language.
The above-mentioned "user-friendliness" could consist of automatically applying the changes made (so the user will not have to worry about where to place their text files).

By PeterW

Date 2011-03-29 12:32

Just open two text editors, one with the original string table and one you're working on?

By Caesar

Date 2011-03-30 15:55

Some text editors (e.g. N++) feature opening two files in split screen mode with synchronized scrolling. That's about what you want.

By AlteredARMOR [ua]

Date 2011-03-31 07:54

Yep, this is :-)

By Günther [de]

Date 2011-03-29 14:33

And how is a text editor going to help you when the original text slightly changed? Even the puny Clonk Endeavour translation solution had more tools than a text editor, and it only worked because the changes to CE after release were limited. Take a look at tools/language in the redwolf repository.

By PeterW

Date 2011-03-29 15:13

Yes, that's what I was thinking about as well. But that's not really the translator's task - we would probably just give out lists of new strings, so the translators would not have to think about this. And we can be bothered to use diff and grep once in a while.

By Sven2

Date 2011-03-29 11:20

What bugs me about the current string table solution is that all strings used in definitions are local to the object definition they're used in. It would be really useful (and easy to implement?) to have string table entries that are not found fall back to the parent definition's string table - and ultimately, to the main System.c4g string table.

That way, you wouldn't have to redefine generic strings like "Activate item" or "Eat" a hundred times within the same object pack.

It would also be possible for an object pack to have ALL strings ever used in a single file, which would make the job much easier for translators.
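The fallback lookup described above could look roughly like this. It is a minimal Python sketch rather than engine code: the `StringTable` class, its `lookup` method, and the sample keys and definitions are all hypothetical.

```python
class StringTable:
    """A string table that defers to its parent when a key is missing."""

    def __init__(self, name, entries, parent=None):
        self.name = name
        self.entries = dict(entries)
        self.parent = parent

    def lookup(self, key):
        """Walk from the most specific table up to the root; raise KeyError if no table has the key."""
        table = self
        while table is not None:
            if key in table.entries:
                return table.entries[key]
            table = table.parent
        raise KeyError(key)


# Hypothetical hierarchy: object definition -> object pack -> System.c4g.
system = StringTable("System.c4g", {"Activate": "Activate item", "Eat": "Eat"})
pack = StringTable("Objects", {"Eat": "Eat it"}, parent=system)
bread = StringTable("Bread", {"Name": "Bread"}, parent=pack)

print(bread.lookup("Name"))      # found locally
print(bread.lookup("Eat"))       # found in the pack, which overrides System.c4g
print(bread.lookup("Activate"))  # found only in System.c4g
```

Because a more specific table wins, a definition can still override a generic string where the context demands a different wording.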

By PeterW

Date 2011-03-29 12:29

Agreed, that could be a worthwhile thing to have. But note that reusing strings can be a mixed blessing: It might well be that the same short phrase means something different per context. We already had a documentation mistranslation due to the German "löschen" meaning both "delete" and "extinguish".

By Günther [de]

Date 2011-03-29 14:40

Gettext solves this, of course, with a feature called (surprise!) contexts.
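For readers who have not used contexts, the idea is that the lookup key is the pair (context, source string) rather than the source string alone. The sketch below is plain Python with an invented catalog and invented context names ("file", "fire"); it mirrors the "löschen" example from the previous post and does not use the gettext library.

```python
def pgettext_sketch(catalog, context, msgid):
    """Look up (context, msgid) first, then the context-free entry,
    and return msgid itself if neither exists."""
    if (context, msgid) in catalog:
        return catalog[(context, msgid)]
    return catalog.get((None, msgid), msgid)


# Hypothetical German-to-English catalog with one ambiguous source string.
CATALOG_EN = {
    (None, "Löschen"): "Delete",        # default when no context is given
    ("fire", "Löschen"): "Extinguish",  # disambiguated by context
}

print(pgettext_sketch(CATALOG_EN, "file", "Löschen"))
print(pgettext_sketch(CATALOG_EN, "fire", "Löschen"))
```

A context only has to be added for the strings that turn out to be ambiguous. Everything else keeps sharing one translation.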

By PeterW

Date 2011-03-29 14:58

Well, we get the context for free right now. I'm arguing that we might not want to throw that away - reintroducing it using "particular" functions doesn't really seem that appealing to me.

By Günther [de]

Date 2011-03-29 18:54 Edited 2011-03-29 19:06

It's really a straightforward tradeoff: The cost of duplicate translation effort versus the cost of noticing an ambiguity and fixing it by introducing the context. Weighted by the particular distribution of strings in the project and the amount of code movement that makes the "free" context change. I think it's clear that the advantage is on the side of the "gettext" method, but we could of course try to measure it: Just count the duplicate strings in planet/ and how many of them have different translations, and how many of those differences are unwanted. Then estimate how many translations we're going to get, and multiply the cost of duplication by that. And if you're assuming that we're using gettext, you can subtract a little due to the tools helping with duplication via databases of common translations. (Making a gettext-based solution automatically add the file path based context would be trivial.)
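The measurement proposed above is easy to sketch. The Python below assumes the strings have already been collected as (path, English source, translation) triples; collecting them from planet/ is left out, and the sample data is invented. It counts which source strings occur more than once and which of those have more than one distinct translation.

```python
from collections import defaultdict


def duplicate_report(triples):
    """triples: iterable of (path, source, translation).
    Returns (duplicated, conflicting) as sorted lists of source strings."""
    counts = defaultdict(int)
    translations = defaultdict(set)
    for _path, source, translation in triples:
        counts[source] += 1
        translations[source].add(translation)
    duplicated = sorted(s for s, c in counts.items() if c > 1)
    conflicting = sorted(s for s in duplicated if len(translations[s]) > 1)
    return duplicated, conflicting


# Invented sample data standing in for strings gathered from the object packs.
SAMPLE = [
    ("Bread", "Eat", "Essen"),
    ("Mushroom", "Eat", "Essen"),
    ("Cannon", "Fire", "Feuern"),
    ("Torch", "Fire", "Feuer"),
    ("Hut", "Build", "Bauen"),
]

duplicated, conflicting = duplicate_report(SAMPLE)
print("duplicated:", duplicated)
print("conflicting:", conflicting)
```

Deciding whether a conflicting pair is a wanted difference or an accident still needs a human, so the script only narrows down what to look at.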

By PeterW

Date 2011-03-29 19:22

You still miss my point: With Sven's suggestion we already have both contextual and less contextual strings. So gettext is just overhead, minus some tools.

By Günther [de]

Date 2011-03-29 22:16

The tradeoff is the same: You either collapse strings or not. If you do, the difference between having every translator adding a more specific translation or a developer adding a context is minimal - the work is in finding the ambiguous text.

What do you mean by "overhead"? gettext is almost certainly faster, especially with large stringtables, and would require less code in the engine than the current solution (some of which is in c4group...)

By PeterW

Date 2011-03-30 00:07 Edited 2011-03-30 00:10

That we need to spend time on actually doing it. And discussing how we should do it. It just seems a bit pointless to me to constantly shift our technologies around in pursuit of marginally better solutions. I mean, if it was an elegant way with clear advantages I would be all for it. But gettext has the typical look of something that started as a quick hack and then "organically" grew into something that can theoretically do everything, but practically fails at doing anything well. The documentation is about 250 A4 pages long, for god's sake. Do we really want to replace a simple string-string mapping with this monstrosity?

Sigh. It just seems to me that there are so many more productive ways to work on the engine. But well, if someone finds this sort of work motivating, by all means...

By Günther [de]

Date 2011-03-30 00:51

We don't actually have a "simple string-string mapping". We have at least three, and they aren't simple. There's lots of support code in the engine and the redwolf repository for it. I even wrote a hash table to fix a real performance issue! (I probably should have used std::map instead, but well.) If every file in tools/language is a separate tool, we have about the same number of tools gettext has, except that the gettext ones are maintained by somebody else, have documentation (which is a good thing ;-)), and have more features, some of which we want but don't have yet.

When you ignore all the irrelevant aspects of the documentation, gettext is also simple. Take a look at docs/Makefile: There's one line to extract the source strings to doku.pot, one to update the .po files (commenting out no-longer-needed translations, deactivating translations whose source has slightly changed, adding new strings), one to create the binary file to speed up lookups, and one to create the translated output files. (xml2po.py itself is more complex, but that is XML for you.)

It doesn't seem to me that extending our custom code to reimplement already existing features, or reading through the old tools to learn how they work, is that productive either. At least learning gettext will help you with working on basically any other open source project.

By PeterW

Date 2011-03-31 12:41

> We don't actually have a "simple string-string mapping". We have at least three, and they aren't simple

Well, one reason that they aren't simple is that they're designed to - at least in theory - work as "add-ons" that could theoretically even translate user-defined contents. To support that, I suppose (hope) even gettext wouldn't end up doing everything with just one file.

> When you ignore all the irrelevant aspects of the documentation, gettext is also simple.

For sufficiently complex definitions of "simple" ;)

By Günther [de]

Date 2011-03-31 14:33

> Well, one reason that they aren't simple is that they're designed to - at least in theory - work as "add-ons" that could theoretically even translate user-defined contents.

No, mostly because they "'organically' grew". Even the old Script.c+ScriptDE.c/ScriptUS.c stuff was still in there, so there are at least four approaches, though some share at least some code with each other. But there's no excuse for LoadResStr and the stringtables using two entirely different approaches and implementations.

> To support that, I suppose (hope) even gettext wouldn't end up doing everything with just one file.

Of course.

> For sufficiently complex definitions of "simple" ;)

Unfortunately, internationalization is not a simple problem.

By bahamada [de]

Date 2011-03-30 19:06

As I posted earlier, there is a great interface available which is used by a great number (1202) of projects.
Reviews are available here and here
Due to our Ubuntu PPA, we also have that option already available.
Detailed explanations can be found here.

By Newton

Date 2011-03-31 01:55

Let's say Clonk has been translated completely into Ukrainian at some point. If the English strings now change, or new strings are added and others removed, how are individual strings in the Ukrainian translation marked as obsolete in the current system?

By Günther [de]

Date 2011-03-31 13:51

I think there's a script in tools/language in the redwolf repository that does that, though it obviously has to fail when original strings changed but their key stayed the same.

By PeterW

Date 2011-03-31 14:04

Obviously, because we have no "original" string. You'd have to look for the changes in the stringtable(s).

By Günther [de]

Date 2011-03-31 14:32

Okay, it could work if it got the information out of the repository. Well, nobody implemented that yet.
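A minimal Python sketch of what such a repository-based check might do: compare two revisions of the English string table and report keys that were added, removed, or whose text changed. The two tables are assumed to be already parsed into dicts (fetching the old revision from the repository is not shown), and the keys are invented.

```python
def diff_string_tables(old, new):
    """Compare two {key: English text} tables from different revisions.
    'changed' lists keys whose text differs, so their translations need a review."""
    return {
        "added": sorted(k for k in new if k not in old),
        "removed": sorted(k for k in old if k not in new),
        "changed": sorted(k for k in new if k in old and old[k] != new[k]),
    }


# Invented example revisions of an English string table.
old_us = {"Eat": "Eat", "Activate": "Activate item", "Dig": "Dig"}
new_us = {"Eat": "Eat food", "Activate": "Activate item", "Build": "Build"}

report = diff_string_tables(old_us, new_us)
print(report)
```

The "changed" list is exactly the case that cannot be detected from the translation files alone, because the key stays the same while the original text moves underneath it.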

By Caesar

Date 2011-03-31 14:51

It's not that easy. The script shouldn't go off if you just fixed a misspelling.

By PeterW

Date 2011-03-31 15:12

Wouldn't hurt too much - translators would easily see that.

By Caesar

Date 2011-03-31 15:28

Well, how would it be turned off?

By PeterW

Date 2011-03-31 16:58

It doesn't need to? Even if it "goes off", there's a human looking at the result who can just ignore the small change.

By Günther [de]

Date 2011-03-31 23:44

gettext uses the full original string as the key instead of something made up, and has heuristics to detect small changes and to make it easy to reuse an existing translation. Sure, that means a typo fix creates some work for the translators, but that's considered better than missing substantial changes to the original text, and missing out on the saved work from duplicate original texts.
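The effect of keying on the original string can be illustrated with a small Python sketch. It uses `difflib.get_close_matches` from the standard library as a stand-in for gettext's own matching heuristic, which is not reproduced here. The function name, the status labels, and the 0.8 cutoff are assumptions for the example.

```python
import difflib


def merge_catalog(old_translations, new_sources, cutoff=0.8):
    """Carry translations over to a new list of source strings.
    Returns {source: (translation, status)} with status 'ok' for an exact match,
    'fuzzy' for a near match that a translator should review, and 'new' otherwise."""
    merged = {}
    known = list(old_translations)
    for source in new_sources:
        if source in old_translations:
            merged[source] = (old_translations[source], "ok")
            continue
        close = difflib.get_close_matches(source, known, n=1, cutoff=cutoff)
        if close:
            merged[source] = (old_translations[close[0]], "fuzzy")
        else:
            merged[source] = ("", "new")
    return merged


# Invented example: one string reworded slightly, one unchanged, one brand new.
old = {"Activate item": "Gegenstand aktivieren", "Eat": "Essen"}
new = ["Activate the item", "Eat", "Build a hut"]

for source, (translation, status) in merge_catalog(old, new).items():
    print(status, repr(source), "->", repr(translation))
```

A reworded source string keeps its old translation as a suggestion but is flagged for review, which is the extra work for translators mentioned above. Old strings that no longer appear in the new list simply drop out of the result.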
