We use i18next to handle our localization requirements. We have written in great detail about how we use the i18next and react-i18next libraries in our applications.
As our translations grew, we realized that instead of adding every combination of text as a separate entry in the translation file, we could reuse most of them by utilizing the i18next interpolation feature.
Interpolation is one of the most used functionalities in i18n. It allows integrating dynamic values into our translations.
{
  "key": "{{what}} is {{how}}"
}
i18next.t("key", { what: "i18next", how: "great" });
// -> "i18next is great"
As we used interpolation more and more, we started seeing a lot of text
with irregular casing. For instance, in one of our apps, we have an Add button
on a few pages.
{
  "addMember": "Add a member",
  "addWebsite": "Add a website"
}
Instead of adding each text as an entry in the translation file as shown above, we took a bit of a generic approach and started using interpolation. Now our translation files started to look like this.
{
  "add": "Add a {{entity}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}
This is great, but it has a slight problem. The final text formed looked like this.
Add a Member
We can see that Member is still capitalized. We needed it to be properly
sentence-cased, like this.
Add a member
We first thought we would just add .toLocaleLowerCase() to the dynamic value.
t("add", { entity: t("entities.member").toLocaleLowerCase() });
It worked fine, but developers would often forget to add .toLocaleLowerCase()
in a lot of places. It also started to clutter our code with too many
.toLocaleLowerCase() calls.
As always, we decided to extract this problem to our neeto-commons-frontend package.
At first, it seemed like a very simple problem. We thought we could just use the
post-processor feature: we would only need to sentence-case the entire text
during post-processing, like this.
const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: text => {
    // Sentence-case text.
    return (
      text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase()
    );
  },
};
i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
    },
    postProcess: [sentenceCaseProcessor.name],
  });
Voila! From now on, all the texts will be properly sentence-cased, and we no
longer need to add .toLocaleLowerCase(). Great? Not really.
We soon realized that not every text should be sentence-cased. There are a lot of cases where we need to preserve the original casing. Here are some examples.
Your file is larger than 2MB.
Disconnect Google integration?
No results found with your search query "Oliver".
Your Api Key: AJg3c4TcXXXXXXXXX
No internet, NeetoForm is offline.
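To see the problem concretely, here is a small sketch that runs the same sentence-casing logic as the post-processor above over a few of these strings. The naiveSentenceCase and samples names are ours, used only for illustration.

```javascript
// Same logic as the process function of the post-processor shown earlier:
// uppercase the first character and lowercase everything after it.
const naiveSentenceCase = text =>
  text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase();

const samples = [
  "Your file is larger than 2MB.",
  "Disconnect Google integration?",
  "No internet, NeetoForm is offline.",
];

for (const sample of samples) {
  console.log(naiveSentenceCase(sample));
}
// prints: Your file is larger than 2mb.
// prints: Disconnect google integration?
// prints: No internet, neetoform is offline.
```

Units, brand names, and product names all lose their casing, which is exactly what we want to avoid.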
These examples clearly show why it's not a simple problem. We require a more
targeted and nuanced solution. Upon revisiting the issue, we found that our
initial solution of adding .toLocaleLowerCase() does work, but it's a bit
verbose.
So we decided to try custom formatters.
Instead of adding .toLocaleLowerCase(), we created a custom formatter called
lowercase.
i18next.services.formatter.add("lowercase", (value, lng, options) => {
  return value.toLocaleLowerCase();
});
{
  "add": "Add a {{entity, lowercase}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}
This works perfectly, but it doesn't solve the verbosity problem. Instead of
adding .toLocaleLowerCase() in JavaScript files, we're now adding the lowercase
formatter in translation JSON files, essentially just moving the problem to a
different place.
We needed a better solution that required minimal effort.
The idea here is to lowercase all dynamic values by default and create a
formatter to handle exceptions. To achieve this, we combined our previous
post-processor and a new formatter. The new formatter, which we called anyCase,
can be used to flag any dynamic part of the text that needs to be excluded from
lowercasing. The post-processor will ignore these particular parts of the text
while sentence-casing.
const ANY_CASE_STR = "__ANY_CASE__";
i18next.services.formatter.add("anyCase", (value, lng, options) => {
  return ANY_CASE_STR + value + ANY_CASE_STR;
});
{
  "message": "Your file is larger than {{size, anyCase}}"
}
The post-processor we wrote attempted to identify the parts of the text marked
by the anyCase formatter using pattern matching and to retain their original
casing. However, this approach failed when the text contained identical words in
both the dynamic and static parts of the text. It ended up lowercasing both
words, which is not the output we needed.
Before we discuss the final solution, note that i18next recently changed how a formatter is added. The new style, which is what we have been using so far, looks like this.
i18next.services.formatter.add("underscore", (value, lng, options) => {
  return value.replace(/\s+/g, "_");
});
Before this, i18next had a different syntax, which it now calls legacy formatting. It looks like this.
i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    format: (value, format, lng, options) => {
      // All our formatters should go here.
    },
  },
});
Now back to our original problem.
We need to make sure that formatting is applied only to the dynamic parts. For
this, we found that the legacy version of formatting offers an option called
alwaysFormat: true, which runs the format function for every interpolated value,
even when the translation does not specify a format. One thing to remember here
is that if we choose to use this flag, the newer style of formatting does not
work. That means we need to move all our custom formatters into the legacy
format function.
i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    escapeValue: false,
    skipOnVariables: false,
    alwaysFormat: true,
    format: (value, format, lng, options) => {
      // All your formatters should go here.
    },
  },
});
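As a rough sketch of what moving several formatters into the legacy format function could look like, the snippet below dispatches on the format name through a lookup table. The formatters map, the legacyFormat name, and the uppercase entry are hypothetical and added only for illustration; underscore mirrors the example above. We are assuming that with alwaysFormat: true, values interpolated without a format name arrive with format set to undefined, so they fall through unchanged here.

```javascript
// Hypothetical registry of named formatters (illustrative names only).
const formatters = new Map([
  ["underscore", value => value.replace(/\s+/g, "_")],
  ["uppercase", value => value.toLocaleUpperCase()],
]);

// Matches the legacy signature: (value, format, lng, options).
const legacyFormat = (value, format, lng, options) => {
  // Leave non-string values (numbers, objects, undefined) untouched.
  if (typeof value !== "string") return value;

  // Unknown or missing format names fall through without changes.
  const formatter = formatters.get(format);
  return formatter ? formatter(value, lng, options) : value;
};

console.log(legacyFormat("hello world", "underscore")); // prints: hello_world
console.log(legacyFormat("Member", undefined)); // prints: Member
```

A function like this could then be passed as interpolation.format in the init call.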
Moving the formatters is not a problem for us, because we already maintain all
our custom formatters in one place (the neeto-commons-frontend package). Now the
formatter is applied to every dynamic value. This approach also overcame the
"identical words in the text" problem that we encountered with the previous
version of the formatter. Let's look at our updated formatter.
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";
const lowerCaseFormatter = (value, format) => {
  // Skip empty values, non-strings, and values flagged with anyCase.
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};
To elaborate on the code, the formatter lowercases all dynamic texts and
prefixes them with __LOWERCASED__. This prefixing is necessary because the
formatter lacks information about where this specific piece of text appears in
the complete text. With the prefix in place, if the lowercased text happens to
be the first part of the output, we can capitalize it again during the
post-processing stage. And that's precisely what we do in the post-processor.
const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED); // Check if first word is lowercased.
    value = value.replaceAll(LOWERCASED, ""); // Remove all __LOWERCASED__
    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};
Below is everything put together. If you're interested in a working example of the same, check out this gist.
import i18next from "i18next";
import LanguageDetector from "i18next-browser-languagedetector";
import { initReactI18next } from "react-i18next";

// `resources` is assumed to be the translation resources object defined elsewhere.
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

// Uppercase only the first character and leave the rest untouched.
const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

// Lowercase every dynamic value unless it is flagged with anyCase.
const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};

// Re-capitalize the text only when it starts with a lowercased dynamic value.
const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED);
    value = value.replaceAll(LOWERCASED, "");
    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};

i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
      alwaysFormat: true,
      format: (value, format, lng, options) => {
        // other formatters
        return lowerCaseFormatter(value, format);
      },
    },
    postProcess: [sentenceCaseProcessor.name],
    detection: {
      order: ["querystring", "cookie", "navigator", "path"],
      caches: ["cookie"],
      lookupQuerystring: "lang",
      lookupCookie: "lang",
    },
  });
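To get a feel for how the formatter and the post-processor cooperate, here is a self-contained sketch that can run without i18next. The render helper is a hypothetical stand-in that we wrote for this illustration: it only mimics the order of operations (format each placeholder, then post-process the whole string) and is not how i18next is implemented.

```javascript
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};

const postProcess = value => {
  const shouldSentenceCase = value.startsWith(LOWERCASED);
  value = value.replaceAll(LOWERCASED, "");
  return shouldSentenceCase ? sentenceCase(value) : value;
};

// Hypothetical stand-in for i18next: replaces {{name}} or {{name, format}}
// with the formatted value, then runs the post-processor on the result.
const render = (template, values) =>
  postProcess(
    template.replace(
      /\{\{\s*(\w+)\s*(?:,\s*(\w+)\s*)?\}\}/g,
      (_, name, format) => String(lowerCaseFormatter(values[name], format))
    )
  );

console.log(render("Add a {{entity}}", { entity: "Member" }));
// prints: Add a member
console.log(render("{{entity}} added successfully", { entity: "Member" }));
// prints: Member added successfully
console.log(render("Your file is larger than {{size, anyCase}}", { size: "2MB" }));
// prints: Your file is larger than 2MB
```

The second call shows why the prefix matters: the dynamic value is lowercased by the formatter, and the post-processor capitalizes it again only because it sits at the start of the text.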