A locale with _ (underscore) as thousands separator
Motivation
Some programs output numbers that are so long that it's hard to see
their magnitude at a glance; e.g.:
[~:131680] perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
966803105 cycles # 4.000 GHz
1248192237 instructions # 1.29 insn per cycle
323092321 branches # 1336.721 M/sec
...
We can use a locale that uses thousands separators to get a more
readable result instead:
[~:131681] LC_NUMERIC=en_US perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
965,155,530 cycles # 4.000 GHz
1,248,186,198 instructions # 1.29 insn per cycle
323,091,110 branches # 1339.002 M/sec
...
Unfortunately, if we cut and paste such numbers into a programming
language implementation (such as python3 or gforth), it does not
understand these thousands separators. However, many programming
languages understand _ as a thousands separator. So in order to
work with those, it would be useful to have a locale that produces
such numbers. The following sections describe how to create and use it.
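Before building the locale, a minimal Python sketch can make the difference concrete. It assumes Python 3.6 or later, and the helper name parse_count is made up for this illustration. It shows that int() accepts underscore-grouped digits of the kind the new locale is meant to produce, but rejects the comma-grouped output of en_US:

```python
def parse_count(text):
    """Return text as an int, or None if Python cannot parse it.

    parse_count is a hypothetical helper for this illustration.
    """
    try:
        return int(text)
    except ValueError:
        return None

# The first two strings are pasted from the perf stat outputs above;
# the third is the same number with _ instead of , as separator.
plain = parse_count("1248192237")            # no separators: parses
commas = parse_count("1,248,186,198")        # en_US style: rejected, None
underscores = parse_count("1_248_186_198")   # _ separators: parses

print(plain, commas, underscores)
```

The same holds for integer literals typed or pasted into Python source code, which is what makes the underscore form convenient for cut and paste.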
Creating the locale
The following has been tried on Debian 11; run as root:
mkdir -p /usr/local/share/i18n/locales
cd /usr/local/share/i18n/locales
cp -p /usr/share/i18n/locales/en_US prog
wget -O - https://www.complang.tuwien.ac.at/anton/locale-prog/prog.diff | patch
localedef -i prog -f UTF-8 prog
Using the locale
Now you can use the locale prog. Given that it's based on en_US and
differs from it only in LC_NUMERIC, you typically specify it only for
LC_NUMERIC and use your regular LANG setting for the rest (note that
LC_ALL, if set, overrides LC_NUMERIC). You can use it for an
individual invocation:
LC_NUMERIC=prog perf stat ...
or set it permanently (e.g., in your shell startup file):
export LC_NUMERIC=prog
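To check from a script whether the new locale is installed and groups digits as intended, something like the following Python sketch may help. The function name group_with_locale and its default argument are assumptions of this sketch. It relies only on the standard locale module and returns None when the requested locale is not available:

```python
import locale

def group_with_locale(number, name="prog"):
    """Format number using the digit grouping of the LC_NUMERIC locale `name`.

    Returns None if that locale is not installed. The function name and
    the default "prog" are assumptions made for this sketch.
    """
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
    except locale.Error:
        return None  # locale not available on this system
    try:
        return locale.format_string("%d", number, grouping=True)
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)  # restore the old setting

print(group_with_locale(1248186198))
```

If the locale was created as described above, this should print the number with _ between the digit groups; if it prints None, the localedef step probably did not take effect.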
Result
For our motivating example, we get:
LC_NUMERIC=prog perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
950_828_362 cycles # 3.997 GHz
1_250_383_211 instructions # 1.32 insn per cycle
323_493_763 branches # 1359.854 M/sec
6_396_544 branch-misses # 1.98% of all branches
We can see at a glance what order of magnitude the numbers are, but we
can also compute the MPKI metric (here: branch mispredictions per 1000
instructions) for gzip on this Skylake processor as follows, cutting
and pasting the numbers from above:
python3 -c 'print(6_396_544*1000/1_250_383_211)'
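When more than one metric is of interest, the counter lines can also be pasted as a block and parsed. The following is a minimal sketch under the assumption that each line starts with the count followed by the event name, as in the output above. The names PERF_OUTPUT, parse_counters and mpki are invented for this example:

```python
# Counter lines pasted from the LC_NUMERIC=prog perf stat output above.
PERF_OUTPUT = """\
950_828_362 cycles # 3.997 GHz
1_250_383_211 instructions # 1.32 insn per cycle
323_493_763 branches # 1359.854 M/sec
6_396_544 branch-misses # 1.98% of all branches
"""

def parse_counters(text):
    """Map each event name to its count; int() accepts the _ separators."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            counters[fields[1]] = int(fields[0])
    return counters

def mpki(misses, instructions):
    """Misses per 1000 instructions."""
    return misses * 1000 / instructions

counters = parse_counters(PERF_OUTPUT)
print(mpki(counters["branch-misses"], counters["instructions"]))
print(counters["instructions"] / counters["cycles"])
```

The second printed value is the instructions-per-cycle ratio, which can be compared with the "insn per cycle" figure that perf reports as a sanity check on the parsing.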
Anton Ertl