Non-Standard Characters

Learn how GL Strings supports non-standard characters

Non-Standard Characters

GL Strings supports characters beyond the basic English alphabet and common punctuation marks, which makes it possible to represent diverse languages. Any Unicode character can be entered into the GL Strings editor.

Considerations

While the GL Strings editor accepts any Unicode character, how the string ultimately appears to end users depends on several downstream factors:

  • Font Support:
    Not all fonts support every glyph required to visually represent a character. If the font does not include a particular glyph, the character may not display correctly and may instead appear as a placeholder, such as a square (□) or a question mark (?).

  • Encoding and Integration:
    When integrating localized strings into source code, design files, or content management systems, it’s essential to ensure proper character encoding (such as UTF-8). Inconsistent encoding or integration errors can lead to character corruption or rendering issues.

  • Environment Differences:
    Local development, staging, and production environments can produce different results due to differences in server configuration, caching, or deployment processes. The final production environment is the only true indicator of the end-user experience.

  • Font Availability = Glyph Rendering:
    Characters display correctly only if the necessary font, including the required glyph, is available and properly applied, whether in a design tool like Figma (font installed locally or accessed via Figma’s limited cloud library) or in a website or app (font included and referenced in code or a style sheet).
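
The encoding point above can be illustrated with a minimal Python sketch. It is not part of GL Strings; the sample string and the helper name decode_as are assumptions made for illustration. It shows how text encoded as UTF-8 survives a round trip when decoded as UTF-8, and how the same bytes turn into corrupted characters when a downstream system assumes a different encoding (here Latin-1).

```python
def decode_as(data: bytes, encoding: str) -> str:
    """Decode raw bytes using the given encoding (hypothetical helper)."""
    return data.decode(encoding)


# Illustrative sample string, not taken from GL Strings.
text = "Señor café"
utf8_bytes = text.encode("utf-8")

# Decoding with the encoding that was used to write the bytes restores the text.
print(decode_as(utf8_bytes, "utf-8"))    # Señor café

# Decoding the same bytes as Latin-1 corrupts every non-ASCII character.
print(decode_as(utf8_bytes, "latin-1"))  # SeÃ±or cafÃ©
```

If a localized string shows sequences such as "Ã±" or "Ã©" in the destination environment, a mismatch of this kind somewhere in the integration is a likely cause.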

Best Practice

Test localized strings in their final destination environment to ensure that they display correctly. This step is especially important for languages that use less common scripts or symbols.

Entry Methods

You can enter a Unicode character directly using a virtual keyboard or a keyboard shortcut, or copy and paste the character into the GL Strings editor.

  • Example use case:
    The source brand name is a registered trademark (®) in the source language/locale, but the target brand name is only trademarked (™) in the target language/locale.

Helpful external resources:

  • Symbl has a helpful search function
  • Compart makes it easy to identify each Unicode character by name within Unicode blocks (ranges)
  • Unicode Explorer explains how to type each Unicode character using keyboard shortcuts (Windows and Linux)
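
If you prefer to look up a character locally before pasting it into the editor, the following Python sketch shows one possible approach using the standard unicodedata module. The helper name describe is an assumption for illustration, and the two symbols correspond to the trademark use case above.

```python
import unicodedata


def describe(char: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(char):04X} {unicodedata.name(char, 'UNNAMED')}"


print(describe("®"))  # U+00AE REGISTERED SIGN
print(describe("™"))  # U+2122 TRADE MARK SIGN

# Reverse lookup: obtain a character from its Unicode name, ready to copy.
print(unicodedata.lookup("TRADE MARK SIGN"))  # ™
```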

Non-Printable Characters

Non-printable characters, also known as formatting marks or hidden characters, are used to control layout in text files or user interfaces and aren’t displayed when printed or in the rendered UI. Some of the most common ones are line breaks, spaces, non-breaking spaces, tabulators, and soft hyphens.

The editor view supports non-printable Unicode characters, but not all of them are visible by default. This article explains the different ways in which they are displayed in the editor view.

Please note that this functionality is available only in the Chrome web browser.

  • In the preview mode, the line break is displayed as a return key symbol (see screenshot below, marker A) and the tabulator is displayed with the tab symbol (marker B). Spaces and non-breaking spaces are rendered as regular spaces. Soft hyphens aren’t visible at all.

  • In the edit mode, all non-printable characters are displayed in the non-printable form:

  • By clicking on the eye icon below a string, users can enable the show hidden characters feature. In this mode, hidden characters are represented as follows:
    Hidden character    Representation        Notes
    Line break          pilcrow (¶)
    Space               blue dot
    No-break space      space                 This includes leading and trailing spaces
    Soft hyphen         purple pipe symbol
    Tabulator           left arrow
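
To check a string for these hidden characters outside the editor, a short Python sketch such as the one below may help. It is not how the editor itself works; the mapping HIDDEN and the function name find_hidden_characters are assumptions for illustration, covering only the five characters listed above.

```python
# Hidden characters covered by this sketch, keyed by the actual character.
HIDDEN = {
    "\n": "line break",
    " ": "space",
    "\u00a0": "no-break space",
    "\u00ad": "soft hyphen",
    "\t": "tabulator",
}


def find_hidden_characters(text: str) -> list[tuple[int, str]]:
    """Return (position, name) for every hidden character found in text."""
    return [(i, HIDDEN[ch]) for i, ch in enumerate(text) if ch in HIDDEN]


# Sample with a soft hyphen, a tabulator, a no-break space, and a line break.
sample = "co\u00adoperate\t10\u00a0km\n"
for position, name in find_hidden_characters(sample):
    print(position, name)
# 2 soft hyphen
# 10 tabulator
# 13 no-break space
# 16 line break
```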

Other Considerations

  • GL Strings integrations convert non-printable characters between formats, so we recommend entering the characters themselves in our editor rather than escaped sequences. Escaped characters may cause issues in some integrations.
  • In XLIFF (XML Localization Interchange File Format) files and other localization file formats, non-printable characters can be represented in their escaped form. In these formats, a line break is represented by \n and a tabulator by \t.
  • A soft hyphen, abbreviated SHY (U+00AD), is represented by \u00AD, and a non-breaking space (U+00A0) is represented by \u00A0 (or &nbsp; in XML-derived formats).
  • Currently, our editor’s search bar supports non-breaking spaces, regular spaces, and soft hyphens. To look them up, copy and paste the character into the search bar and press Enter.
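
Because escaped sequences are discouraged in the editor, it can be useful to scan strings for them before import. The following Python sketch is one possible check, not a GL Strings feature; the pattern ESCAPED and the function name find_escaped_sequences are assumptions, and the pattern only covers the escaped forms mentioned above.

```python
import re

# Literal escape sequences: \n, \t, four-digit hex escapes with an optional
# u/U prefix (for example \u00AD or \00A0), and the &nbsp entity.
ESCAPED = re.compile(r"\\[nt]|\\[uU]?[0-9A-Fa-f]{4}|&nbsp;?")


def find_escaped_sequences(text: str) -> list[str]:
    """Return every literal escape sequence found in text."""
    return ESCAPED.findall(text)


# A raw string keeps the backslash, so the escaped form is detected.
print(find_escaped_sequences(r"Line one\nLine two"))  # ['\\n']

# A real line break is not an escaped sequence, so nothing is reported.
print(find_escaped_sequences("Line one\nLine two"))   # []
```

The pattern is deliberately loose, so a backslash followed by four hexadecimal digits in ordinary text would also be flagged; treat any match as a prompt for manual review rather than a definite error.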
