Plural forms in Russian

Denpa

Chevereto Member
I can see how this change aimed at improving customization, but does this updated gettext library finally process the Plural-Forms gettext header correctly?

As a half-Russian Finnish citizen currently living in Japan and, more importantly, as the Russian translator for Chevereto, I need to know ;)

P. S.
A long time ago in a galaxy far, far away, I provided a little insight into this problem with the current
gettext implementation in Chevereto here.
The final modified version, which can parse the aforementioned header, is attached below this message.
See the parsePlurals(), selectPluralForm() and parenthesizePluralExpression() functions in this file.
 

Attachments

  • php-gettext.php.zip
    3.9 KB · Views: 2
The problem is that I'm quite confused by how OneSky handles the Russian language. As you may notice, the .po (or .mo) generated by OneSky outputs something like this:

Code:
#: ../../../app/lib/classes/class.image.php:954
msgid "view"
msgid_plural "views"
msgstr[0] "просмотр"
msgstr[1] "просмотра"
msgstr[2] "просмотра"
msgstr[3] "просмотров"

That contains 4 indexes, which I assume are: 0->one, 1->few, 2->many, 3->other. Is that correct? I ask because in other systems I see .po files like this (without the "other" index):
Code:
#: modules/upload/upload.module:65,318,318
msgid "1 attachment"
msgid_plural "@count attachments"
msgstr[0] "@count файл"
msgstr[1] "@count файла"
msgstr[2] "@count[2] файлов"

And the link that you provided also mentions 3 plurals = 3 indexes, not 4. So besides the fact that the class implementation doesn't handle anything other than n=2, I'm seeing that OneSky is handling the Russian language incorrectly. Can you confirm this?
 
Those particular strings are in fact incorrect.

The correct strings for the above translation, according to the formula in the Plural-Forms header for the Russian language, would be
Code:
msgstr[0] "просмотр"
msgstr[1] "просмотра"
msgstr[2] "просмотров"
msgstr[3] "not used"

But those forms I can fix myself through the translation site (in fact I just did ;)).

What I can't fix is that in the Russian language plural forms are not as simple as 0->one, 1->few, 2->many.
It's more like
msgstr[0] for 1, 21, 31, 41, ... etc. - not only for 1 but also for numbers ending in 1 (except those ending in 11)
msgstr[1] for 2, 3, 4, 22, 23, 24, 32, 33, 34 ... etc. - basically for numbers ending in 2, 3 and 4 (except those ending in 12, 13 and 14)
msgstr[2] for all other cases, including 0 (zero)
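
The rule above can be sketched in PHP roughly as follows. This is only an illustration, not the program mentioned in this post; the function name `russianPluralIndex` is hypothetical, and the arithmetic assumes the three-form Plural-Forms expression commonly used for Russian in gettext files. Note that 11 through 14 land in the third form even though they end in 1 to 4.

```php
<?php
// Hypothetical helper: returns the msgstr index (0, 1 or 2) for a count,
// assuming the usual three-form gettext expression for Russian.
function russianPluralIndex(int $n): int
{
    $n = abs($n);
    $lastDigit = $n % 10;
    $lastTwo = $n % 100;

    if ($lastDigit === 1 && $lastTwo !== 11) {
        return 0; // 1, 21, 31, ... (but not 11)
    }
    if ($lastDigit >= 2 && $lastDigit <= 4 && ($lastTwo < 10 || $lastTwo >= 20)) {
        return 1; // 2-4, 22-24, ... (but not 12-14)
    }
    return 2; // everything else, including 0 and 5-20
}

// Print the form chosen for a few sample counts.
$forms = ['просмотр', 'просмотра', 'просмотров'];
foreach ([0, 1, 3, 5, 11, 21, 22, 25] as $count) {
    echo $count, ' ', $forms[russianPluralIndex($count)], "\n";
}
```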

Try running this little program that I wrote and see for yourself how plural forms should be generated in Russian.
So the major issue is not the incorrect order of the strings, which can easily be fixed through the translation interface,
but the plural-picking algorithm itself, as it ignores the correct formula present in the po files.

For example, current gettext library returns msgstr[1] for numbers 3, 4 which is incorrect and msgstr[2] for 21 which is also incorrect.

And Russian isn't even the only language that uses a different expression for generating plural forms.
Check out this table http://docs.translatehouse.org/proj...est/l10n/pluralforms.html?id=l10n/pluralforms.

The modified version, on the other hand, parses the correct expression from the po header and creates a plural-form-picking function based on that expression.
What's awesome about it is that it's fast, because it uses create_function only once during the lifetime of the gettext object, and
pretty secure, as the header is filtered by a regular expression that restricts the characters used in the expression to 0-9, n, %, &, !, ?, (, ), >, =, <, :, |, ; and space.
PHP:
    /**
     * Parse the number of forms and the plural expression from the headers.
     *
     * @return array Number of forms and expression
     */
    private function parsePlurals() {
        if ( preg_match('/^\s*nplurals\s*=\s*(\d+)\s*;\s+plural\s*=\s*([0-9n%&!?\(>=<\)\\:|; ]+)$/', $this->headerTable['Plural-Forms'], $matches) ) {
            $nplurals = (int)$matches[1];
            $expression = trim( $this->parenthesizePluralExpression($matches[2]) );
            return array($nplurals, $expression);
        }

        return array(2, 'n != 1'); // Fallback to two forms for 'one' and 'many'.
    }

    /**
     * Create a function to evaluate the plural expression and call it.
     *
     * @param int $count The number to pick a plural form for.
     *
     * @return int Index of the plural form in the translation array
     */
    private function selectPluralForm($count) {
        // Build the function only once and keep it for the lifetime of the object.
        // Note: create_function() was deprecated in PHP 7.2 and removed in PHP 8.0.
        if ( is_null($this->_selectPluralForm) ) {
            list( $nplurals, $expression ) = $this->parsePlurals();

            $expression = str_replace('n', '$n', $expression);
            $func = "\$index = (int)($expression); return (\$index < $nplurals)? \$index : $nplurals - 1;";
            $this->_selectPluralForm = create_function('$n', $func);
        }

        return call_user_func($this->_selectPluralForm, $count);
    }
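
Since create_function() no longer exists in PHP 8, here is a rough sketch of how the same idea might look on newer versions: whitelist the expression, parenthesize the nested ternaries (PHP rejects them unparenthesized, while C reads them right to left), build one closure with eval, and cache it per header. All names here (`buildPluralSelector`, `parenthesizeTernaries`) are hypothetical, and the parenthesizing loop is only an assumption about what parenthesizePluralExpression() does. It handles the common header shapes, not every possible nesting, and a malformed expression that passes the filter would still raise a ParseError.

```php
<?php
// Hypothetical sketch; not the code from the attached class.

// Wrap each ternary branch in parentheses so PHP evaluates the
// C-style expression with C's right-associative meaning.
function parenthesizeTernaries(string $expression): string
{
    $result = '';
    $depth = 0;
    foreach (str_split($expression) as $char) {
        if ($char === '?') {
            $result .= ' ? (';
            $depth++;
        } elseif ($char === ':') {
            $result .= ') : (';
        } else {
            $result .= $char;
        }
    }
    return $result . str_repeat(')', $depth);
}

// Build (once per header) a closure mapping a count to a msgstr index.
function buildPluralSelector(string $header): Closure
{
    static $cache = [];
    if (isset($cache[$header])) {
        return $cache[$header];
    }

    $nplurals = 2;
    $expression = 'n != 1'; // same fallback as parsePlurals()
    $pattern = '/^\s*nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*([0-9n%&!?()>=<:| ]+);?\s*$/';
    if (preg_match($pattern, $header, $m)) {
        $nplurals = max(1, (int) $m[1]);
        $expression = $m[2];
    }

    $body = str_replace('n', '$n', parenthesizeTernaries($expression));
    // eval only ever sees the whitelisted characters matched above.
    $evaluate = eval('return function (int $n): int { return (int) (' . $body . '); };');

    return $cache[$header] = function (int $count) use ($evaluate, $nplurals): int {
        // Clamp to the last declared form, as selectPluralForm() does.
        return min($evaluate($count), $nplurals - 1);
    };
}

$russian = 'nplurals=3; plural=(n%10==1 && n%100!=11 ? 0 : n%10>=2 && n%10<=4 && (n%100<10 || n%100>=20) ? 1 : 2);';
$select = buildPluralSelector($russian);
echo $select(1), $select(3), $select(5), $select(21), "\n";
```

Using eval on filtered input is the same trade-off create_function made internally; a tokenizer-based evaluator avoids it at the cost of speed.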

Sorry if I wasn't as clear as I would have liked.
Explaining languages is quite difficult, and it's super late here, which is not helping either.

Hope you understand.
 
I know what the problem with the class is; I was wondering why I was seeing 4 keys instead of 3. I will find a way to handle those plural forms correctly and to save them in the cached translations.

Also, there is an issue in the OneSky files, because [2] should contain what is currently in [3], and [3] shouldn't exist. Apart from parsing the plural forms, the .po file must be corrected to have 3 plural forms, not 4. If I add the plural parser and OneSky keeps outputting 4 keys, the thing won't work as expected. This seems to be an issue with the Russian settings in OneSky, because other languages (Arabic, for instance) have 6 plural forms and 6 keys, not 7.
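
A mismatch like this can be detected mechanically. The following is a minimal sketch, assuming the PO entry is available as plain text; `countPluralKeys` is a hypothetical helper and not part of Chevereto or OneSky, and the nplurals value is hard-coded here rather than read from a real header.

```php
<?php
// Hypothetical helper: count the distinct msgstr[N] keys in one PO entry.
function countPluralKeys(string $poEntry): int
{
    preg_match_all('/^msgstr\[(\d+)\]/m', $poEntry, $matches);
    return count(array_unique($matches[1]));
}

// The four-key entry quoted earlier in this thread.
$entry = <<<'PO'
msgid "view"
msgid_plural "views"
msgstr[0] "просмотр"
msgstr[1] "просмотра"
msgstr[2] "просмотра"
msgstr[3] "просмотров"
PO;

$nplurals = 3; // assumed value of nplurals in the Plural-Forms header
$keys = countPluralKeys($entry);
if ($keys !== $nplurals) {
    echo "Entry has $keys msgstr keys but the header declares nplurals=$nplurals\n";
}
```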

I've already contacted OneSky asking them to fix this, so in the meantime I will be adding the plural parser to Chevereto.
 
I see.
You're totally right then, there should indeed only be three forms for Russian.

I'm also looking forward to seeing your way of handling plurals.
(cause I actually mainly bought Chevereto for education :p)
Should be very interesting.

Thank you very much for looking into this problem.
Glad I was of some help.

P. S.
Will I need to redo the translation of plurals when OneSky sorts everything out?
 
My first idea was to keep your base, which I later noticed was taken from WordPress. I've made small tweaks to the code and it seems to be working fine. We will have to wait for OneSky now.
 
There were a few tweaks to fit the existing library, plus my own regex, but you're right,
the create_function method itself and the parenthesizePluralExpression function do come from WordPress, with a little bit of Joomla mixed in ;)

I also found an alternative method (a tokenizer without eval) in the Drupal source code, but it was too slow, so I decided that a single function created on the fly with regex filtering was good enough.

By the way, I have been using the class attached in the OP ever since 3.0.5, so it's pretty well tested.
Strangely enough, OneSky correctly sets nplurals in the header to 3 despite having four keys, so it (the class) never complained. :confused:
 
Same here, I looked at the Drupal code because I didn't like the call_user_func alternative, but like you said, it is slower and only evaluates from 0 to 100.

In the end I used the same logic as WordPress, but I cached the function so it doesn't need to re-parse anything.
 
Sorry to interrupt the conversation, but the same problem exists in the Ukrainian language.

It would be nice if Ukrainian were handled the same way as Russian. ;)
 
@Rodolfo
Apparently OneSky also has the same issue with Ukrainian (and possibly other Slavic languages).
This language also uses only 3 forms (nplurals=3 in the header) but has four keys in the translation table.
It also seems that the translator incorrectly used a single form for all cases.

@newkos
The header discussed above is present in all gettext files (both mo and po), so the new version of the class will process it for every language.

Judging by the formula, Ukrainian forms its plurals much like Russian does.
If so, these forms are probably entered incorrectly in the translation (a single form for all numbers > 1).
Unfortunately, I'm not confident enough in my knowledge of Ukrainian (or Russian, to be honest :oops:), so
could you check and, if possible, fix the translation at http://translate.chevereto.com/

According to the expression in the Plural-Forms header for Ukrainian,
the first field is used for 1, 21, 31, etc.,
the second for 2, 3, 4, 22, 23, 24, 32, 33, 34, etc.,
the third for all other cases, including 0,
and the fourth is not used and, judging by this thread, shouldn't exist at all. ;)
 
Someone at OneSky told me that they used the specs from the Unicode standard, which indeed has more than 3 plural forms, but they are just applying it wrong. We will have to wait a little bit more.
 
@Rodolfo
Oh my−
I swear, the wording in some standards makes the simplest of things confusing as heck. (~_~;)
For example, the so-called "few" tag. How do they like 1000002 for "a few"? σ(^_^)

From what I read here, I gather that the fourth "other" form is needed for fractions.
However, as you said, gettext has no standard way of handling fractions,
so adding this form to the translation table is impractical and confusing for translators and software alike.

Hope they'll understand that.

@newkos
You're always welcome!
 

The problem is that they are not following the gettext formulas at all. They just go to the Unicode standard website and set those keys as things that people need to translate. The result is 4 keys instead of 3, which causes gettext not to work as expected.
 
Yeah, I understand that.

Kinda weird, as they're a translation service and not some Unicode converter.
The gettext standard should take precedence over the Unicode standard for them.

In case they decide to leave it as it is, can't we just set the 3rd and 4th forms to the same value and be done with it?
I mean, they have the right expression, which can only return 0, 1 or 2, and nplurals set to 3, so the fourth form won't ever be used anyway.
 
I'm sorry Rodolfo, but plural forms still appear to be broken. :(
Chevereto only uses the first and second forms, like it did before.
For example, here is a number of views ending in five (actually it is 5 in this case), so the third form (просмотров) must be used.
I got the second one instead. Same with minutes.

edit.
Try the attached po file, which I confirmed works correctly with the modified class after converting it to mo.
Maybe the problem was with exporting from OneSky.

1b5830bacc5dba4b140455f274f5714f.png


From OneSky:
1d39761a21d9403272bf6849ed23ea87.png
 

Attachments

  • ru.po.zip
    30 KB · Views: 1
I never said that this is already fixed.
Ahhh... I see.
My apologies then, I incorrectly assumed it from
- Fixed bug with translations that have complex plural forms
It also seems that OneSky now generates a PO file with only three forms instead of four, so they kinda fixed it.

Take your time then.
It's not such a big problem anyway.
 
@Rodolfo
Yep, everything is working as expected now.
Thanks again for looking into this.
For a project as complex as Chevereto, your customer service is truly outstanding! ❤
 