2023-01-13
Inspecting Unicode strings in Perl
ord $str returns the unsigned integer value of the first character of $str.
The special-case split you're trying to remember is split //, $str (for disintegrating a string into a list of characters, right? Yep, called it).
use charnames (); charnames::viacode(ord $str) returns the "best" name (most recent Name_Alias if any, otherwise Name if any, otherwise any alias you defined, otherwise undef). charnames is ... less un-ergonomic for going the other direction, from names to ordinals (charnames::vianame) or strings (charnames::string_vianame).
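A minimal sketch of the three tricks above, plus the name-to-ordinal direction. The sample string is an assumption for illustration; it is written with an escape so the snippet does not depend on the source file's encoding.

```perl
use strict;
use warnings;
use open qw(:std :encoding(UTF-8));    # encode the output below as UTF-8
use feature qw(say);
use charnames ();

my $str = "\x{e9}!";                   # "é!" (illustrative sample)

# ord looks only at the first character.
say ord $str;                          # 233

# split with an empty pattern yields one element per character.
my @chars = split //, $str;
say scalar @chars;                     # 2

# Code point to name.
say charnames::viacode(ord $str);      # LATIN SMALL LETTER E WITH ACUTE

# Name to code point, and name to string.
say charnames::vianame('EXCLAMATION MARK');                        # 33
say charnames::string_vianame('LATIN SMALL LETTER E WITH ACUTE');  # é
```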
Caveat: Be sure you are handling Unicode strings, not byte strings.
- perl -C7 (UTF-8 on STDIN, STDOUT, and STDERR) when you write a one-liner
- use open qw(:std :encoding(UTF-8)); when you write a script, assuming you have been allowed to encode your Unicode string data the sane way
- use feature qw(unicode_strings); when you are confident you don't need the hinky "guess if it's 8-bit" (ASCII or EBCDIC) backwards compatibility behavior
- use utf8; if that script also has embedded non-ASCII (think about your __DATA__ section)
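Why the caveat matters, as a small sketch: the same data gives different answers depending on whether it has been decoded. The byte string here stands in for input read from a filehandle with no encoding layer; Encode::decode is used only to make the contrast visible in one script.

```perl
use strict;
use warnings;
use feature qw(say);
use Encode qw(decode);

# The UTF-8 encoding of U+00E9: two bytes, not yet decoded.
my $bytes = "\xC3\xA9";

# The same data as a one-character Unicode string.
my $chars = decode('UTF-8', $bytes);

say length $bytes;                      # 2
say length $chars;                      # 1
say sprintf('U+%04X', ord $bytes);      # U+00C3
say sprintf('U+%04X', ord $chars);      # U+00E9
```

On the byte string, ord and split // still run without complaint; they just describe the encoding's bytes rather than the characters you meant.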
Bonus code: having a confusing time with trailing whitespace? Make it explain itself.
    use strict;
    use warnings;
    use open qw':std :encoding(UTF-8)';
    use feature qw(unicode_strings say);
    use utf8;
    use charnames ();
    while (<>) {
        chomp;
        /(\s*)$/;
        say join "\n\t", $_, map {charnames::viacode(ord $_)} split //, $1;
    }
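To see what the report looks like without piping a file through the script, the same regex/split/viacode pipeline can be wrapped in a sub. The name explain_trailing_whitespace and the sample line are hypothetical, made up for this sketch.

```perl
use strict;
use warnings;
use charnames ();

# Hypothetical helper: returns the report as a string instead of printing it.
sub explain_trailing_whitespace {
    my ($line) = @_;
    my ($trailing) = $line =~ /(\s*)$/;    # trailing run of whitespace, possibly empty
    return join "\n\t", $line,
        map { charnames::viacode(ord $_) } split //, $trailing;
}

# A line ending in an ordinary space followed by U+3000.
my $report = explain_trailing_whitespace("foo \x{3000}");
# $report is "foo \x{3000}\n\tSPACE\n\tIDEOGRAPHIC SPACE"
```

A line with no trailing whitespace comes back unchanged, since split on an empty string yields an empty list.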
02:04
2023-01-10
Google Drive path in crouton chroots
/var/host/media/fuse/drivefs-[unique_number_tag]/root
21:52