SMS Unicode Character Encoding
If you want to keep SMS communication cost-effective without forcing users into uncomfortable input methods, the transliteration feature can be a useful tool: it keeps messages out of Unicode character encoding wherever possible.
Table of Contents
- What is transliteration?
- Switching off the transliteration feature
- What is a GSM character, and what other types of characters are there?
- GSM
- Unicode
- Best Practices for Cost-Effective SMS Communication
What is transliteration?
While you compose an SMS message, transliteration monitors the input field for characters that are not part of the GSM encoding standard and, where needed, replaces them with GSM characters. This avoids Unicode encoding, so the full message size of 160 characters (defined in the GSM standard) remains available for composing messages.
The replacements are chosen so that the message stays intelligible. Accordingly, the system replaces each non-GSM character in the first column of the chart below with the GSM character in the second column:
Input character | Replaced character |
---|---|
á | à |
í | ì |
ó | ò |
ú | ù |
ő | ö |
ű | ü |
Á | Å |
Í | I |
Ú | U |
Ó | O |
Ő | Ö |
Ű | Ü |
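The replacement chart can be expressed as a simple lookup. The following is a minimal sketch in Python, not the platform's actual implementation; the names `REPLACEMENTS` and `transliterate` are hypothetical, and the mapping is copied from the chart above.

```python
# Replacement chart copied from the table above (input -> replacement).
REPLACEMENTS = {
    "á": "à", "í": "ì", "ó": "ò", "ú": "ù", "ő": "ö", "ű": "ü",
    "Á": "Å", "Í": "I", "Ú": "U", "Ó": "O", "Ő": "Ö", "Ű": "Ü",
}

# str.translate expects a table keyed by code point; str.maketrans builds it.
_TABLE = str.maketrans(REPLACEMENTS)


def transliterate(text: str) -> str:
    """Replace every charted character and leave all others unchanged."""
    return text.translate(_TABLE)


print(transliterate("Jó étvágyat, Ágnes!"))
```

Characters that already belong to the GSM set, such as é, pass through untouched because they are not keys in the chart.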
Switching off the transliteration feature
You can turn this feature off. In that case, no replacement takes place and the message is sent with exactly the characters the user entered.
To switch off the SMS transliteration feature:
- Select a project from the project list, then select Channels > SMS.
- Press Add new SMS template.
- Select the Message tab.
- Select GSM/UTF-8 under Character encoding. On the right-hand side, the type shown in brackets changes to Unicode, and the character limit is now 70.
Note: Transliteration relies solely on the character replacement chart above. If the user enters a non-GSM character that is not included in the chart, and so cannot be replaced with a similar character, the message is automatically sent in the Unicode encoding standard. Examples include Chinese characters and Greek letters that fall outside the GSM character set.
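The rule in this note can be sketched as a small decision function. This is an illustration under stated assumptions rather than the platform's code: `detect_encoding`, `GSM_BASIC`, and `GSM_EXTENDED` are hypothetical names, and the character sets follow GSM 03.38 (the basic characters listed in this article plus ¡, space, line feed and carriage return, and the extension characters including the backslash).

```python
# Assumed GSM 03.38 basic character set (see the lead-in for what was added).
GSM_BASIC = set(
    "@£$¥èéùìòÇØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà\n\r"
)
# Extension characters; each one occupies two character spaces.
GSM_EXTENDED = set("|^€{}[~]\\")
GSM_ALL = GSM_BASIC | GSM_EXTENDED

# Replacement chart from this article.
REPLACEMENTS = str.maketrans({
    "á": "à", "í": "ì", "ó": "ò", "ú": "ù", "ő": "ö", "ű": "ü",
    "Á": "Å", "Í": "I", "Ú": "U", "Ó": "O", "Ő": "Ö", "Ű": "Ü",
})


def detect_encoding(text: str, transliteration: bool = True) -> str:
    """Return "GSM" if the message fits the GSM set, otherwise "Unicode"."""
    if transliteration:
        text = text.translate(REPLACEMENTS)
    # A single character outside the GSM set forces Unicode for the message.
    return "GSM" if all(ch in GSM_ALL for ch in text) else "Unicode"


print(detect_encoding("Jó napot!"))                         # charted accent is replaced
print(detect_encoding("Jó napot!", transliteration=False))  # accent kept as entered
print(detect_encoding("你好"))                               # not in the chart
```

With transliteration switched off, the same text that fits GSM after replacement is reported as Unicode, which mirrors the change from 160 to 70 characters described in the steps above.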
What is a GSM character, and what other types of characters are there?
When creating an SMS message, you can choose between two character encoding standards. The choice depends on cost-effectiveness and aesthetic preferences.
GSM
GSM 03.38 is an encoding standard that covers a basic set of characters, including the ones listed below:
@£$¥èéùìòÇØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ!"#¤%&'()*+,-./0123456789:;<=>? ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑܧ¿abcdefghijklmnopqrstuvwxyzäöñüà
- A standard SMS message encoded in GSM 03.38 can contain up to 160 characters.
- Concatenated SMS messages are supported, allowing longer messages to be split and sent across multiple SMS segments.
- However, concatenation slightly reduces the available character space, because each segment then carries 153 characters instead of 160:
  - 2 messages: 306 characters instead of 320.
  - 3 messages: 459 characters instead of 480.
- Certain special characters use two character spaces instead of one: | ^ € { } [ ~ ]
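The limits above translate into straightforward arithmetic. The sketch below assumes the text already consists of GSM characters only; `gsm_length` and `gsm_segments` are hypothetical names, the per-segment figure of 153 is derived from the totals above (306 / 2 and 459 / 3), and the backslash is added to the two-space set because GSM 03.38 also places it in the extension table.

```python
import math

# Characters that take two character spaces (list above, plus the backslash).
GSM_TWO_SPACE = set("|^€{}[~]\\")

SINGLE_LIMIT = 160  # one standalone GSM message
CONCAT_LIMIT = 153  # per segment once the message is concatenated


def gsm_length(text: str) -> int:
    """Count character spaces, charging two for each extension character."""
    return sum(2 if ch in GSM_TWO_SPACE else 1 for ch in text)


def gsm_segments(text: str) -> int:
    """Estimate the number of SMS segments needed for a GSM-encoded text.

    Simplification: a two-space character that would straddle a segment
    boundary may push the real count one higher than this estimate.
    """
    length = gsm_length(text)
    if length <= SINGLE_LIMIT:
        return 1
    return math.ceil(length / CONCAT_LIMIT)


print(gsm_length("€5 [ok]"))    # three of the seven characters count double
print(gsm_segments("a" * 161))  # one character over the single-message limit
```

A message of 161 plain characters already costs two segments, so trimming a single character can halve the cost of a borderline message.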
Unicode
Unicode encoding allows the use of a vast range of characters, including accented letters and special symbols. However, it significantly reduces the character limit:
- A standard Unicode SMS can contain only 70 characters.
- Concatenation also reduces the available character space in Unicode messages, to 67 characters per segment:
  - 2 messages: 134 characters instead of 140.
  - 3 messages: 201 characters instead of 210.
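The same calculation for Unicode messages can be sketched as follows. It rests on an assumption not stated in this article: that Unicode SMS length is measured in 16-bit UTF-16 code units, so a character outside the Basic Multilingual Plane (most emojis, for example) counts as two. The names `unicode_length` and `unicode_segments` are hypothetical, and the per-segment figure of 67 is derived from the totals above (134 / 2 and 201 / 3).

```python
import math

SINGLE_LIMIT = 70  # one standalone Unicode message
CONCAT_LIMIT = 67  # per segment once the message is concatenated


def unicode_length(text: str) -> int:
    """Count UTF-16 code units (assumed unit of measure for Unicode SMS)."""
    return len(text.encode("utf-16-be")) // 2


def unicode_segments(text: str) -> int:
    """Estimate the number of SMS segments needed for a Unicode text."""
    units = unicode_length(text)
    if units <= SINGLE_LIMIT:
        return 1
    return math.ceil(units / CONCAT_LIMIT)


print(unicode_length("你好"))          # two characters, one unit each
print(unicode_length("😀"))            # one emoji, two units under this assumption
print(unicode_segments("я" * 71))      # one character over the single-message limit
```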
Best Practices for Cost-Effective SMS Communication
To optimize SMS messaging for cost and efficiency, consider the following tips:
- Use GSM Characters Whenever Possible: Avoid special characters and symbols that require Unicode encoding.
- Enable Transliteration: This will replace non-GSM characters with GSM equivalents, preserving the 160-character limit per message.
- Be Mindful of Special Characters: Characters like |, ^, { and } consume two spaces, reducing the total available characters.
- Test Messages Before Sending: Some SMS platforms provide message length indicators that help prevent unintended Unicode encoding.
- Avoid Emojis in Text Messages: Emojis automatically switch the encoding to Unicode, reducing message length drastically.
By following these best practices, users can minimize SMS costs while ensuring messages remain clear and readable across different devices and networks.