A locale with _ (underscore) as thousands separator

Motivation

Some programs produce long numbers as output, so long that it's hard to judge their magnitude at a glance; e.g.:
[~:131680] perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
         966803105      cycles                    #    4.000 GHz
        1248192237      instructions              #    1.29  insn per cycle
         323092321      branches                  # 1336.721 M/sec
...
We can use a locale that uses thousands separators to get a more readable result instead:
[~:131681] LC_NUMERIC=en_US perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
       965,155,530      cycles                    #    4.000 GHz
     1,248,186,198      instructions              #    1.29  insn per cycle
       323,091,110      branches                  # 1339.002 M/sec
...
Unfortunately, if we cut and paste such numbers into a programming language implementation (such as python3 or gforth), it does not understand these thousands separators. Many programming languages do, however, accept _ as a separator between digit groups. So in order to work with those, it would be useful to have a locale that produces such numbers. In the following I describe how to create it:

Creating the locale

The following has been tried on Debian 11; run as root:
mkdir -p /usr/local/share/i18n/locales
cd /usr/local/share/i18n/locales
cp -p /usr/share/i18n/locales/en_US prog
wget -O - https://www.complang.tuwien.ac.at/anton/locale-prog/prog.diff | patch
localedef -i prog -f UTF-8 prog
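
To check that the new locale is actually installed, one can ask the C library which thousands separator it reports. The following is a minimal Python sketch, not part of the original recipe; the helper name thousands_sep_of is made up for illustration, and the expectation that prog reports "_" is an assumption about what prog.diff changes.

```python
import locale

def thousands_sep_of(name):
    """Return the LC_NUMERIC thousands separator of locale `name`,
    or None if that locale is not installed (hypothetical helper)."""
    saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
    try:
        locale.setlocale(locale.LC_NUMERIC, name)
        return locale.localeconv()["thousands_sep"]
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_NUMERIC, saved)  # restore the old setting

if __name__ == "__main__":
    # Assumption: after the steps above this prints '_';
    # it prints None if the locale could not be found.
    print(repr(thousands_sep_of("prog")))
```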

Using the locale

Now you can use the locale prog. Given that it's based on en_US and the only difference is for LC_NUMERIC, you typically specify it only for LC_NUMERIC and use your regular LANG setting for the rest. You can use it for an individual invocation:
LC_NUMERIC=prog perf stat ...
or set it for the rest of the shell session (add the line to your shell startup file to make it permanent):
export LC_NUMERIC=prog
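
When the measured program is started from a script rather than typed at the shell, the same per-invocation setting can be passed through the child's environment. The sketch below assumes Python; the helper name env_with_numeric_locale and the script name numloc.py are made up for illustration. It also removes LC_ALL, because LC_ALL, when set, takes precedence over LC_NUMERIC; note that dropping it also affects the other locale categories.

```python
import os
import subprocess
import sys

def env_with_numeric_locale(name, base=None):
    """Copy an environment and select `name` for LC_NUMERIC only
    (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    env["LC_NUMERIC"] = name
    env.pop("LC_ALL", None)  # LC_ALL would override LC_NUMERIC
    return env

if __name__ == "__main__":
    # Usage (assumed script name): python3 numloc.py perf stat ...
    if len(sys.argv) > 1:
        result = subprocess.run(sys.argv[1:], env=env_with_numeric_locale("prog"))
        sys.exit(result.returncode)
```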

Result

For our motivating example, we get:
LC_NUMERIC=prog perf stat gzip -9 -c /lib/x86_64-linux-gnu/libc.so.6 >/dev/null
...
       950_828_362      cycles                    #    3.997 GHz
     1_250_383_211      instructions              #    1.32  insn per cycle
       323_493_763      branches                  # 1359.854 M/sec
         6_396_544      branch-misses             #    1.98% of all branches
We can see at a glance what order of magnitude the numbers are, but we can also compute the MPKI metric (here: branch mispredictions per 1000 instructions) for gzip on this Skylake processor as follows, cutting and pasting the numbers from above:
python3 -c 'print(6_396_544*1000/1_250_383_211)'
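
If the whole output is pasted rather than single numbers, the same calculation can be scripted. The following is a minimal sketch, assuming counter lines laid out as in the output above; the names SAMPLE, parse_counters and mpki are made up for illustration. It relies on Python's int() accepting _ between digits.

```python
import re

# The counter lines from the perf stat output shown above.
SAMPLE = """\
       950_828_362      cycles                    #    3.997 GHz
     1_250_383_211      instructions              #    1.32  insn per cycle
       323_493_763      branches                  # 1359.854 M/sec
         6_396_544      branch-misses             #    1.98% of all branches
"""

# An integer with optional '_' separators, followed by the event name.
COUNTER_LINE = re.compile(r"^\s*(\d[\d_]*)\s+(\S+)")

def parse_counters(text):
    """Map event names to counts; lines that do not match are skipped."""
    counters = {}
    for line in text.splitlines():
        m = COUNTER_LINE.match(line)
        if m:
            counters[m.group(2)] = int(m.group(1))  # int() accepts '_'
    return counters

def mpki(counters):
    """Branch mispredictions per 1000 instructions."""
    return counters["branch-misses"] * 1000 / counters["instructions"]

if __name__ == "__main__":
    print(mpki(parse_counters(SAMPLE)))
```

The en_US output would not work here without extra processing, since int() rejects strings containing commas.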

Anton Ertl