Been looking for an excuse to play with AppleScript?!
Now you have a very good one!
To create audiobooks I use a text-to-speech engine. One problem is that the application dies on non-ASCII Unicode text. The documents that I encode are too long to correct by hand, so I want the correction automated. It isn’t as simple as stripping every non-ASCII character, though: where a character converts easily to ASCII, I don’t want to lose its meaning.
Continue reading “Best Way To Transliterate Unicode to ASCII? Python Help Needed With Solution.”
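Here is a rough sketch of one way to approach the transliteration in Python, using only the standard library. It is not the solution from the full post. The `FALLBACKS` table and the `to_ascii` name are made up for illustration, and the table would need to grow to cover whatever characters actually show up in your documents.

```python
import unicodedata

# Hypothetical hand-picked replacements for common punctuation.
# Extend this table to suit the documents being converted.
FALLBACKS = {
    "\u2018": "'", "\u2019": "'",   # curly single quotes
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2013": "-", "\u2014": "-",   # en dash, em dash
    "\u2026": "...",                # horizontal ellipsis
}

def to_ascii(text, placeholder=""):
    """Return an ASCII-only approximation of text."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            # Already ASCII: keep as-is.
            out.append(ch)
        elif ch in FALLBACKS:
            out.append(FALLBACKS[ch])
        else:
            # NFKD splits accented letters into base letter + combining marks
            # and expands ligatures; keep only the ASCII pieces.
            decomposed = unicodedata.normalize("NFKD", ch)
            ascii_part = "".join(c for c in decomposed if ord(c) < 128)
            out.append(ascii_part if ascii_part else placeholder)
    return "".join(out)

print(to_ascii("\u201cna\u00efve caf\u00e9\u201d\u2026"))
```

Characters with no ASCII decomposition (⌘, for example) are replaced by `placeholder`, which defaults to the empty string. Passing something like `placeholder="?"` makes the losses visible so they can be added to the table.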
Unicode homoglyphs at best make it easy to play jokes on other programmers and at worst make it easy to mislead users.
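To see how a homoglyph slips past the eye, here is a small Python sketch. The strings and the `non_ascii_report` helper are invented for this example. It only flags non-ASCII characters, which is a crude check and not a real confusables detector.

```python
import unicodedata

def non_ascii_report(s):
    """List (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(s)
        if ord(ch) > 127
    ]

latin = "paypal"
spoof = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A, not Latin "a"

print(latin == spoof)           # the two strings look alike but differ
print(non_ascii_report(spoof))  # shows where the impostor sits
```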
Emacs tells you everything that you need to know about it using describe-char:
position: 927 of 1056 (88%), column: 33
character: ⌘ (displayed as ⌘) (codepoint 8984, #o21430, #x2318)
preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x2318
script: symbol
syntax: . which means: punctuation
category: .:Base, j:Japanese
to input: type "C-x 8 RET HEX-CODEPOINT" or "C-x 8 RET NAME"
buffer code: #xE2 #x8C #x98
file code: #xE2 #x8C #x98 (encoded by coding system utf-8-unix)
display: by this font (glyph code)
mac-ct:-*-Lucida Grande-normal-normal-normal-*-17-*-*-*-p-0-iso10646-1 (#x3B4)
Character code properties: customize what to show
name: PLACE OF INTEREST SIGN
old-name: COMMAND KEY
general-category: So (Symbol, Other)
decomposition: (8984) ('⌘')
There are text properties here:
fontified t
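If you are not in Emacs, a good part of that report can be approximated with Python's `unicodedata` module. This is only a sketch: the `describe_char` function is my own name for it, and it covers the code point, name, category, decomposition and UTF-8 bytes, not the font or buffer details.

```python
import unicodedata

def describe_char(ch):
    """Return a dict with a few of the properties Emacs reports for ch."""
    cp = ord(ch)
    return {
        "codepoint": cp,
        "octal": format(cp, "o"),
        "hex": format(cp, "X"),
        "name": unicodedata.name(ch, "UNKNOWN"),
        "general-category": unicodedata.category(ch),
        # Empty string means the character has no decomposition mapping.
        "decomposition": unicodedata.decomposition(ch),
        "utf-8": " ".join(f"{b:02X}" for b in ch.encode("utf-8")),
    }

for key, value in describe_char("\u2318").items():
    print(f"{key}: {value}")
```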