---
title: "Automatically sentence-case i18next translations"
description: "Automatically sentence-case i18next translations"
canonical_url: "https://www.bigbinary.com/blog/lowercase-translations"
markdown_url: "https://www.bigbinary.com/blog/lowercase-translations.md"
---

# Automatically sentence-case i18next translations

- Author: Farhan CK
- Published: April 9, 2024
- Categories: JavaScript, ReactJS

We use [i18next](https://www.i18next.com/) to handle our localization
requirements. We have written in detail about how we use the
[i18next and react-i18next libraries](https://www.bigbinary.com/blog/react-localization)
in our applications.

As our translations grew, we realized that instead of adding every combination
of the texts as separate entries in the translation file, we can reuse most of
them by utilizing the i18next interpolation feature.

[Interpolation](https://www.i18next.com/translation-function/interpolation) is
one of the most used functionalities in i18n. It allows integrating dynamic
values into our translations.

```json
{
  "key": "{{what}} is {{how}}"
}
```

```js
i18next.t("key", { what: "i18next", how: "great" });
// -> "i18next is great"
```

### Problem

As we used interpolation more and more, we started seeing a lot of text with
irregular casing. For instance, in one of our apps, we have an `Add` button on
a few pages.

```json
{
  "addMember": "Add a member",
  "addWebsite": "Add a website"
}
```

Instead of adding each text as an entry in the translation file as shown above,
we took a more generic approach and started using interpolation. Our
translation files then looked like this.

```json
{
  "add": "Add a {{entity}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}
```

This is great, but it has a slight problem. The final text looked like this.

```plaintext
Add a Member
```

Here `Member` is still capitalized. We needed the text to be properly
sentence-cased, like this.

```plaintext
Add a member
```

We first thought we would just add `.toLocaleLowerCase()` to the dynamic value.

```js
t("add", { entity: t("entities.member").toLocaleLowerCase() });
```

It worked fine, but it had two drawbacks. First, developers would often forget
to add `.toLocaleLowerCase()`. Second, it started to pollute our code with too
many `.toLocaleLowerCase()` calls.

As always, we decided to extract this problem to our
[neeto-commons-frontend](https://www.bigbinary.com/blog/neeto-commons-frontend)
package.

### Solutions we looked at

At first, it seemed like a very simple problem. We thought we could just use the
[post-processor](https://www.i18next.com/misc/creating-own-plugins#post-processor)
feature and sentence-case the entire text in a post-processor, like this.

```js
const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: text => {
    // Sentence-case text.
    return (
      text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase()
    );
  },
};

i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
    },
    postProcess: [sentenceCaseProcessor.name],
  });
```

Voila! From now on, all the texts will be properly sentence-cased, and we no
longer need to add `.toLocaleLowerCase()`. Great? Not really.

We soon realized that not every text should be sentence-cased. There are a lot
of cases where we need to preserve the original casing. Here are some examples.

```plaintext
Your file is larger than 2MB.
Disconnect Google integration?
No results found with your search query "Oliver".
Your Api Key: AJg3c4TcXXXXXXXXX
No internet, NeetoForm is offline.
```
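
To make the failure concrete, the snippet below applies the same sentence-casing
logic as the post-processor to a few of these strings, outside i18next. The
`naiveSentenceCase` helper name is hypothetical and exists only for this
illustration.

```js
// Same logic as the `process` function of the post-processor above,
// extracted into a standalone helper (hypothetical name) for illustration.
const naiveSentenceCase = text =>
  text.charAt(0).toLocaleUpperCase() + text.slice(1).toLocaleLowerCase();

console.log(naiveSentenceCase("Your file is larger than 2MB."));
// -> "Your file is larger than 2mb."

console.log(naiveSentenceCase("Disconnect Google integration?"));
// -> "Disconnect google integration?"

console.log(naiveSentenceCase("No internet, NeetoForm is offline."));
// -> "No internet, neetoform is offline."
```

Units, brand names and product names all lose their casing, because the
post-processor cannot tell which parts of the final string are safe to
lowercase.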

These examples clearly show why it's not a simple problem. We require a more
targeted and nuanced solution. Upon revisiting the issue, we found that our
initial solution of adding `.toLocaleLowerCase()` does work, but it's a bit
verbose.

So we decided to try
[custom formatters](https://www.i18next.com/translation-function/formatting#adding-custom-format-function).
Instead of adding `.toLocaleLowerCase()`, we created a custom formatter called
`lowercase`.

```js
i18next.services.formatter.add("lowercase", (value, lng, options) => {
  return value.toLocaleLowerCase();
});
```

```json
{
  "add": "Add a {{entity, lowercase}}",
  "entities": {
    "member": "Member",
    "website": "Website"
  }
}
```

This works perfectly, but it doesn't solve the verbosity problem. Instead of
adding `.toLocaleLowerCase()` in JavaScript files, we're now adding it in
translation JSON files - essentially just moving the problem to a different
place.

We needed a better solution that required minimal effort.

The idea here is to lowercase all dynamic values by default and create a
formatter to handle exceptions. To achieve this, we combined our previous
post-processor with a new formatter. The new formatter, which we called
`anyCase`, can be used to flag any dynamic part of the text that needs to be
excluded from lowercasing. The post-processor then ignores these flagged parts
of the text while sentence-casing.

```js
const ANY_CASE_STR = "__ANY_CASE__";
i18next.services.formatter.add("anyCase", (value, lng, options) => {
  return ANY_CASE_STR + value + ANY_CASE_STR;
});
```

```json
{
  "message": "Your file is larger than {{size, anyCase}}"
}
```

The post-processor we wrote attempted to identify the parts of the text marked
by the `anyCase` formatter using pattern matching and to retain their original
casing. However, this approach failed when the same word appeared in both the
dynamic and static parts of the text. It ended up lowercasing both occurrences,
which is not the output we needed.

### Final solution

Before we discuss the final solution, a note on formatters. i18next recently
changed how a formatter is added. The newer syntax, which we have been using so
far, looks like this.

```js
i18next.services.formatter.add("underscore", (value, lng, options) => {
  return value.replace(/\s+/g, "_");
});
```

Before this change, i18next used a different syntax, which it now calls legacy
formatting. It looks like this.

```js
i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    format: (value, format, lng, options) => {
      // All our formatters should go here.
    },
  },
});
```

Now back to our original problem.

We need to make sure that formatting is applied only to the dynamic parts of
the text. For this, we found that the legacy version of formatting offers an
option called `alwaysFormat`. When it is set to `true`, the format function is
called for every interpolated value, even when the placeholder does not specify
a format. One thing to remember is that if we use this flag, the newer style of
formatting does not work. That means we need to move all our custom formatters
to the legacy format function.

```js
i18next.use(initReactI18next).init({
  resources: resources,
  fallbackLng: "en",
  interpolation: {
    escapeValue: false,
    skipOnVariables: false,
    alwaysFormat: true,
    format: (value, format, lng, options) => {
      // All your formatters should go here.
    },
  },
});
```
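
As a minimal sketch of what "all your formatters" could look like inside a
single legacy `format` function, the snippet below dispatches on the format
name. The `formatters` map, the `legacyFormat` name and the `uppercase` entry
are hypothetical. `underscore` mirrors the earlier example. The sketch assumes
that i18next passes the format name as the second argument and leaves it
`undefined` when the placeholder has no format.

```js
// Hypothetical registry of named formatters, keyed by the format name
// used in the translation string, e.g. "{{title, underscore}}".
const formatters = {
  underscore: value => String(value).replace(/\s+/g, "_"),
  uppercase: value => String(value).toLocaleUpperCase(),
};

// Single entry point with the legacy signature: (value, format, lng, options).
const legacyFormat = (value, format, lng, options) => {
  const hasFormatter = Object.prototype.hasOwnProperty.call(formatters, format);

  // Unknown or missing format names return the value untouched.
  return hasFormatter ? formatters[format](value, lng, options) : value;
};

console.log(legacyFormat("hello big world", "underscore")); // -> "hello_big_world"
console.log(legacyFormat("Member", undefined)); // -> "Member"
```

A function like `legacyFormat` would then be passed as `interpolation.format`
in the `init` call, so every custom formatter lives behind one entry point.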

This is not a problem for us, because we already maintain all our custom
formatters in one place (the `neeto-commons-frontend` package). Now the
formatter is applied to every dynamic text. This approach also overcame the
"identical words in the text" problem that we encountered with the previous
version of the formatter. Let's look at our updated formatter.

```js
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase"; // Format name that opts a value out of lowercasing.
const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};
```

To elaborate on the code, the formatter lowercases all dynamic texts and
prefixes them with `__LOWERCASED__`. This prefixing is necessary because the
formatter lacks information about where this specific piece of text originally
appeared in the complete text. By adding this prefix, if the lowercased text
happens to be the first part of the output, we can revert it during the
post-processing stage. And that's precisely what we accomplished in the
post-processor.

```js
const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED); // Check if first word is lowercased.
    value = value.replaceAll(LOWERCASED, ""); // Remove all __LOWERCASED__

    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};
```

Below is everything put together. If you're interested in a working example,
check out this
[gist](https://gist.github.com/neerajsingh0101/3c1413c28ec9115091b6644e3ceb9764).

```js
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};

const sentenceCaseProcessor = {
  type: "postProcessor",
  name: "sentenceCaseProcessor",
  process: value => {
    const shouldSentenceCase = value.startsWith(LOWERCASED);
    value = value.replaceAll(LOWERCASED, "");

    return shouldSentenceCase ? sentenceCase(value) : value;
  },
};

i18next
  .use(LanguageDetector)
  .use(initReactI18next)
  .use(sentenceCaseProcessor)
  .init({
    resources: resources,
    fallbackLng: "en",
    interpolation: {
      escapeValue: false,
      skipOnVariables: false,
      alwaysFormat: true,
      format: (value, format, lng, options) => {
        // other formatters
        return lowerCaseFormatter(value, format);
      },
    },
    postProcess: [sentenceCaseProcessor.name],
    detection: {
      order: ["querystring", "cookie", "navigator", "path"],
      caches: ["cookie"],
      lookupQuerystring: "lang",
      lookupCookie: "lang",
    },
  });
```
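
To see how the formatter and the post-processor cooperate, here is a
self-contained sketch that runs without i18next. The `translate` function and
the `PLACEHOLDER` pattern are hypothetical stand-ins for i18next interpolation
and handle only `{{name}}` and `{{name, format}}` placeholders. The
`processSentenceCase` function repeats the body of the post-processor's
`process` method. The `"{{entity}} added successfully"` string is an
illustrative example, not taken from our translation files.

```js
const LOWERCASED = "__LOWERCASED__";
const ANY_CASE = "anyCase";

const sentenceCase = value =>
  value.charAt(0).toLocaleUpperCase() + value.slice(1);

// Lowercases every dynamic value unless it is flagged with `anyCase`.
const lowerCaseFormatter = (value, format) => {
  if (!value || format === ANY_CASE || typeof value !== "string") {
    return value;
  }
  return LOWERCASED + value.toLocaleLowerCase();
};

// Same logic as the `process` method of `sentenceCaseProcessor`.
const processSentenceCase = value => {
  const shouldSentenceCase = value.startsWith(LOWERCASED);
  const cleaned = value.replaceAll(LOWERCASED, "");

  return shouldSentenceCase ? sentenceCase(cleaned) : cleaned;
};

// Hypothetical, simplified stand-in for i18next interpolation with
// `alwaysFormat: true`: every placeholder goes through the formatter.
const PLACEHOLDER = /\{\{\s*(\w+)\s*(?:,\s*(\w+)\s*)?\}\}/g;
const translate = (template, values) =>
  processSentenceCase(
    template.replace(PLACEHOLDER, (_match, name, format) =>
      lowerCaseFormatter(values[name], format)
    )
  );

console.log(translate("Add a {{entity}}", { entity: "Member" }));
// -> "Add a member"

console.log(translate("{{entity}} added successfully", { entity: "Member" }));
// -> "Member added successfully"

console.log(
  translate("Your file is larger than {{size, anyCase}}", { size: "2MB" })
);
// -> "Your file is larger than 2MB"
```

The second call shows why the `__LOWERCASED__` prefix matters: the dynamic
value is lowercased by the formatter, and the post-processor restores the
capital letter only because the marker sits at the very start of the string.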
