Jun 23, 2015

Experiment on Roslyn C# compiler: Translatable Strings

Basically anyone can use resources, gettext, or managed-commons-core to translate (localize) the strings in their C# code, and it can even be fairly terse, as in this sample using managed-commons-core:
using System.Collections.Generic;
using Commons.GetOptions;
using static System.Console;
using static Commons.Translation.TranslationService;

namespace TestApp
{
    class AppCommand
    {
        // Returns the translated form of "First mock command"
        public virtual string Description { get { return _("First mock command"); } }

        public virtual string Name { get { return "alpha"; } }

        // Writes the translated form of "Command {0} executed!" with Name substituted
        public virtual void Execute(IEnumerable<string> args, ErrorReporter ReportError)
        {
            WriteLine(TranslateAndFormat("Command {0} executed!", Name));
        }
    }
}
Then along comes C# 6.0 with its fantastic new feature, interpolated strings, and that last method in our example can't be rewritten to use it, because:
public virtual void Execute(IEnumerable<string> args, ErrorReporter ReportError)
{
    WriteLine(_($"Command {Name} executed!"));
}
would in fact format the string first and only then try to look up a translation, using the already-substituted text as the key, which is exactly the wrong thing to happen...
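To make the ordering problem concrete, here is a minimal sketch. The `Catalog` dictionary and the two helper methods are hypothetical stand-ins written for this illustration; they are not the managed-commons-core implementation, only an approximation of what `_` and `TranslateAndFormat` are described as doing above.

```csharp
using System;
using System.Collections.Generic;

static class LookupOrderDemo
{
    // Hypothetical catalog; a real application would ask its translation backend instead.
    static readonly Dictionary<string, string> Catalog = new Dictionary<string, string>
    {
        { "Command {0} executed!", "Comando {0} executado!" }
    };

    // Stand-in for _(): returns the translation, or the key itself when none is found.
    public static string Translate(string key)
    {
        string translated;
        return Catalog.TryGetValue(key, out translated) ? translated : key;
    }

    // Stand-in for TranslateAndFormat(): look up the template first, then substitute.
    public static string TranslateAndFormat(string template, params object[] args)
    {
        return string.Format(Translate(template), args);
    }

    static void Main()
    {
        string name = "alpha";

        // The compiler formats first, so the lookup key is "Command alpha executed!" (a miss).
        Console.WriteLine(Translate($"Command {name} executed!"));

        // Here the lookup happens on the template, so the catalog entry is found.
        Console.WriteLine(TranslateAndFormat("Command {0} executed!", name));
    }
}
```

The first call falls back to the untranslated English text because no catalog can contain an entry for every possible value of `name`; the second finds the template and only then substitutes the argument.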
This experiment, which I've started, proposes a new syntax for translatable strings in C# 7 that would turn that snippet into:
using System.Collections.Generic;
using Commons.GetOptions;
using static System.Console;

namespace TestApp
{
    class AppCommand
    {
        // Returns the translated form of "First mock command"
        public virtual string Description { get { return $_"First mock command"; } }

        public virtual string Name { get { return "alpha"; } }

        // Writes the translated form of "Command {0} executed!" with Name substituted
        public virtual void Execute(IEnumerable<string> args, ErrorReporter ReportError)
        {
            WriteLine($_"Command {Name} executed!");
        }
    }
}
Interpolated strings can be converted to an IFormattable, so one can easily do some localization (number formatting, for instance) but not true translation. That makes this feature interesting beyond the small gain in shorter code.
But the killer feature this tentative addition to the compiler would give us is extraction of translatable texts by the compiler itself, as it already does for XML documentation, when the right command-line parameter is specified.
$_"Command {Name} executed!" would be extracted as "Command {0} executed!", automagically.
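The positional form mentioned here is the same composite format that the C# 6 compiler already builds for ordinary interpolated strings, and it can be observed through `System.FormattableString` (available from .NET Framework 4.6). The sketch below only shows that shape; the class and method names are made up for the example, and it says nothing about how the proposed `$_` extraction would actually be implemented.

```csharp
using System;

static class ExtractionShapeDemo
{
    // Returns the composite format string the compiler builds for an interpolated string.
    public static string TemplateOf(FormattableString text)
    {
        return text.Format;
    }

    static void Main()
    {
        string name = "alpha";

        // Prints: Command {0} executed!
        Console.WriteLine(TemplateOf($"Command {name} executed!"));
    }
}
```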
All well and good, but some may ask how this, which looks a lot like the way gettext does things, would work when extracting to a .resx file, where keys can't be arbitrary strings. For this scenario the compiler would generate SHA1 hashes as keys and insert the hashing when calling the TranslationService behind the scenes. TranslationService is a pluggable infrastructure that can have 'translators' sourcing their translations from resources, .mo files, hard-coded dictionaries, whatever...
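As a rough illustration of the hashed-key idea, a key could be derived from the extracted template as below. The `T_` prefix, the uppercase hex encoding, and the choice of UTF-8 are assumptions made for this sketch, not decisions taken from the experiment; the only point is that the same template always yields the same identifier-like key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class ResourceKeyDemo
{
    // Hypothetical key scheme: "T_" followed by the hex SHA1 of the UTF-8 encoded template.
    // The prefix guarantees the key starts with a letter.
    public static string KeyFor(string template)
    {
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(template));
            return "T_" + BitConverter.ToString(hash).Replace("-", "");
        }
    }

    static void Main()
    {
        // Prints a 42-character key: the prefix plus 40 hex digits.
        Console.WriteLine(KeyFor("Command {0} executed!"));
    }
}
```

Because the key depends only on the template text, the extraction step and the run-time lookup can compute it independently and still agree.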
My experiment will use managed-commons-core, of which I'm the core developer/maintainer, as the backend, but if this discussion finds real merit in the idea, the runtime team will surely have to come forward and implement something like it, or just borrow the logic from my implementation, which is MIT-licensed.

  1. Code
  2. Issue