
Encoding

The text of the given (key) shall be encoded in the specified (encoding).

Parameter table

param | unit | required | description
--- | --- | --- | ---
key | | yes | The key whose value is checked against the specified (encoding).
encoding | | yes | The encoding to check the text against.

Correction

The correction is applied to the value of the provided (key): its text is converted to the specified (encoding).

There are 2 possible scenarios:

  • Characters that the specified encoding does not support are replaced with an approximate equivalent. For example, Š is replaced with S when encoding to ASCII.

  • Characters that the specified encoding does not support and that have no clear approximation are removed entirely. For example, 😋 is replaced with "".
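The two scenarios above can be sketched in Python. This is only an illustration, not the rule's actual implementation, which this page does not show: the function name `approximate_to_encoding` is hypothetical, and the approximation strategy (Unicode NFKD decomposition followed by dropping combining marks) is an assumption.

```python
import unicodedata


def approximate_to_encoding(text: str, encoding: str = "ascii") -> str:
    """Hypothetical helper: make `text` representable in `encoding`."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)  # already supported, keep as-is
            continue
        except UnicodeEncodeError:
            pass
        # Scenario 1 (assumed strategy): decompose the character and drop
        # combining marks, so "Š" becomes "S".
        decomposed = unicodedata.normalize("NFKD", ch)
        approx = "".join(c for c in decomposed if not unicodedata.combining(c))
        # Scenario 2: whatever still cannot be encoded is removed.
        out.append(approx.encode(encoding, errors="ignore").decode(encoding))
    return "".join(out)


print(approximate_to_encoding("Š"))           # approximated
print(repr(approximate_to_encoding("😋")))    # removed
```

A real correction may use a richer transliteration table than this sketch, so its output for a given character can differ.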

Note: The correction can make text harder to read in some languages, because stripping special characters can produce words that no longer make sense. Do not expect the corrected text to be perfectly readable in every case. For example:

  • French: all accented characters are replaced.
  • German: all umlauts are replaced.
  • Russian: all Cyrillic characters are replaced.

The same applies to any other language whose characters the target encoding does not support.

How to set up

The rule verifies whether the text associated with the specified (key) is encoded in the specified (encoding). If the text does not meet this criterion, the rule fails.

A character encoding is a system that maps characters to numbers, which are then stored as binary values. The most common character encoding is UTF-8, a variable-length encoding that can encode all Unicode characters. However, some older infrastructures do not support UTF-8. ASCII is the most basic encoding and is supported virtually everywhere, although it is far more limited. Other encodings include UTF-16, ISO-8859-1, Windows-1252, ISO-8859-5, and ISO-8859-6. For more information on character encodings, see here.
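To make the difference between these encodings concrete, the following minimal sketch prints how many bytes a character occupies in ASCII, ISO-8859-1, and UTF-8, using only Python's built-in codecs. The helper name `encoded_size` is hypothetical and chosen for illustration.

```python
from typing import Optional


def encoded_size(text: str, encoding: str) -> Optional[int]:
    """Return the byte length of `text` in `encoding`, or None if unsupported."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


# "A", "é", "€" and "😋", written as escapes to avoid ambiguity.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F60B"]:
    sizes = {enc: encoded_size(ch, enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
    print(repr(ch), sizes)
```

UTF-8 uses between one and four bytes per character here, while ASCII and ISO-8859-1 either use a single byte or cannot represent the character at all.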

Examples

value | valid | encoding | description
--- | --- | --- | ---
hello,مرحبًا | yes | UTF-8 | The text is encoded in the specified encoding.
hello,مرحبًا | no | ASCII | The text is not encoded in the specified encoding.
El niño, le garçon élu, das hübsche Mädchen | no | ISO-8859-5 | The text is not encoded in the specified encoding.
El niño, le garçon élu, das hübsche Mädchen | yes | ISO-8859-1 | The text is encoded in the specified encoding.
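The examples above can be reproduced with a small check. This is a sketch of the idea rather than the rule's own code: `is_encodable` is a hypothetical name, and normalizing the text to NFC before checking is an assumption made here so that accented letters stored as a base letter plus a combining mark are not rejected.

```python
import unicodedata


def is_encodable(text: str, encoding: str) -> bool:
    """Hypothetical check: can every character of `text` be stored in `encoding`?"""
    # Assumption: compose characters first (NFC) so "é" is one code point.
    normalized = unicodedata.normalize("NFC", text)
    try:
        normalized.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


rows = [
    ("hello,مرحبًا", "UTF-8"),
    ("hello,مرحبًا", "ASCII"),
    ("El niño, le garçon élu, das hübsche Mädchen", "ISO-8859-5"),
    ("El niño, le garçon élu, das hübsche Mädchen", "ISO-8859-1"),
]
for value, encoding in rows:
    print(encoding, is_encodable(value, encoding))
```

An unknown encoding name raises `LookupError` in this sketch instead of returning `False`, which keeps configuration mistakes visible.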