utf8.width()

Type Function

Library utf8.*

Return value Number

Revision Release 2022.3683

Keywords utf8, UTF-8, Unicode, string, width

Overview

Calculates the width of the UTF-8 string s in character positions (columns), taking into account whether each character is double-width, a combining ("compose") character such as an accent, unprintable, or a regular single-width character.

Note that the result is an approximation. It is useful only where a monospace width estimate is appropriate.

The return value depends on the arguments:

If s is a code point rather than a string, returns the width of that code point.
If ambiIsDouble is specified and true, each ambiguous-width character counts as 2; otherwise it counts as 1. For comparison, full-width and double-width characters have a width of 2, while other characters have a width of 1.
If defaultWidth is specified, it is used as the width of each unprintable character.

Syntax

utf8.width( s [, ambiIsDouble [, defaultWidth]] )

s (required)

String. The string to examine. A single code point may be passed instead, in which case the width of that code point is returned.

ambiIsDouble (optional)

Boolean. If true, each ambiguous-width character is counted as 2 wide; otherwise it is counted as 1.

defaultWidth (optional)

Number. If specified, this value will be used as the width for unprintable characters.

Example

local utf8 = require( "plugin.utf8" )

local testStr = "♡ 你好，世界 ♡"

print( utf8.width( testStr ) )  --> 14
print( utf8.width( testStr, true ) )  --> 16
-- Bytes 5 to 7 of testStr hold "你", so its code point is passed instead of a string
print( utf8.width( utf8.codepoint( testStr, 5, 7 ), false ) )  --> 2
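
Because the result is a monospace column estimate, one possible use is aligning mixed-width text in console output. The following is a minimal sketch, not part of the plugin: padRight and the rows table are hypothetical names introduced for illustration, and the alignment only holds if the output font renders full-width characters at exactly two columns.

```lua
local utf8 = require( "plugin.utf8" )

-- Hypothetical helper (not part of the plugin): pads str with spaces on the
-- right until it occupies targetWidth columns as estimated by utf8.width().
local function padRight( str, targetWidth, ambiIsDouble )
    local padding = targetWidth - utf8.width( str, ambiIsDouble )
    if padding < 0 then
        padding = 0  -- str is already wider than the column; append nothing
    end
    return str .. string.rep( " ", padding )
end

-- Sample data for illustration only
local rows = {
    { "hello", "greeting" },
    { "你好", "greeting" },
    { "世界", "world" },
}

-- Pad the first column to 8 columns so the separators line up
for i = 1, #rows do
    print( padRight( rows[i][1], 8 ) .. "| " .. rows[i][2] )
end
```

Padding by string.len() or by character count would misalign the rows containing full-width characters, since each of those occupies two columns but is counted as three bytes or one character.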
