#Unicode is one of those little things in life that I can't help but smile about.
Is it perfect? No, of course not. Is it better than the alternative? Yes, so much so that every time I'm confronted with a long list of character encodings I can choose from, I feel a sense of relief when I find #UTF8 among them.
I wouldn't have thought it possible to standardize a single character encoding for everyone, and yet, somehow, there is just such a standard.
I decided on codeberg I still hate UTF8.
(why couldn't there be a size prefix?)
https://codeberg.org/Loganer/Sauce/src/branch/Base/src/Sauce/Function/UTF8/ToPoint.c
#Programming #UTF8 that was kind of annoying to implement, but I can now store a UTF8 string and print it out.
...at least a subset of utf8, I don't have checks for all the possible utf8 characters yet.
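For anyone curious what the missing checks look like, here is a rough Python sketch of decoding a single code point from UTF-8 bytes, following the byte layout in RFC 3629. It is not the code from the linked repository; `decode_one` is a made-up name for illustration. The lead byte tells you how long the sequence is, and the extra checks reject overlong forms, surrogates, and values above U+10FFFF.

```python
def decode_one(buf: bytes, i: int = 0) -> tuple:
    """Decode one code point from buf starting at index i.

    Returns (code_point, bytes_consumed); raises ValueError on invalid input.
    decode_one is a hypothetical helper, not an existing library function.
    """
    b0 = buf[i]
    if b0 < 0x80:                      # 0xxxxxxx: ASCII, one byte
        return b0, 1
    if 0xC2 <= b0 <= 0xDF:             # 110xxxxx (0xC0 and 0xC1 are always overlong)
        n, cp, lowest = 2, b0 & 0x1F, 0x80
    elif 0xE0 <= b0 <= 0xEF:           # 1110xxxx
        n, cp, lowest = 3, b0 & 0x0F, 0x800
    elif 0xF0 <= b0 <= 0xF4:           # 11110xxx (0xF5 and up exceed U+10FFFF)
        n, cp, lowest = 4, b0 & 0x07, 0x10000
    else:
        raise ValueError(f"invalid lead byte 0x{b0:02X}")
    if i + n > len(buf):
        raise ValueError("truncated sequence")
    for b in buf[i + 1 : i + n]:
        if b & 0xC0 != 0x80:           # continuation bytes must look like 10xxxxxx
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)
    if cp < lowest:
        raise ValueError("overlong encoding")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code point")
    if cp > 0x10FFFF:
        raise ValueError("code point out of range")
    return cp, n
```

Calling it in a loop and advancing the index by the returned byte count walks a whole string.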
How quickly can you check that a string is valid unicode (UTF-8)? — https://lemire.me/blog/2018/05/09/how-quickly-can-you-check-that-a-string-is-valid-unicode-utf-8/ #utf8 #programming
I downloaded all my #Facebook posts, at least if #Meta is to be believed. I requested them in #JSON format, hoping it would go more smoothly. The JSON encoding caused some trouble: the strings are valid #UTF8, but JSON apparently assumes #UTF16, so a pseudo-conversion back and forth is needed; help was found on #StackOverflow. At least the timestamps were standard #POSIX.
I don't know how complete the "archive" is, but at least something can be saved when you pack up and leave. #some #atkjuttuja
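A sketch of the back-and-forth conversion in Python, assuming the export writes each UTF-8 byte as its own `\u00XX` escape (the pattern the commonly cited Stack Overflow answers describe). The sample JSON and the field names `title` and `timestamp` are invented for illustration, and `fix_mojibake` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone

def fix_mojibake(value):
    """Recursively repair strings whose characters are really UTF-8 bytes.

    Assumes every string is mojibake; a string that already contains
    characters above U+00FF would raise UnicodeEncodeError here.
    """
    if isinstance(value, str):
        # Each character stands for one byte, so Latin-1 recovers the bytes.
        return value.encode("latin-1").decode("utf-8")
    if isinstance(value, list):
        return [fix_mojibake(v) for v in value]
    if isinstance(value, dict):
        return {fix_mojibake(k): fix_mojibake(v) for k, v in value.items()}
    return value

# Invented sample: the Finnish word "Päivää" with each UTF-8 byte escaped.
raw = '{"title": "P\\u00c3\\u00a4iv\\u00c3\\u00a4\\u00c3\\u00a4", "timestamp": 1700000000}'
post = fix_mojibake(json.loads(raw))

# POSIX timestamps are seconds since 1970-01-01 UTC.
posted_at = datetime.fromtimestamp(post["timestamp"], tz=timezone.utc)
print(post["title"], posted_at.isoformat())
```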
Hey everyone. I must admit, I don't believe I have ever seen someone enter #utf8 #unicode characters on a #computer in a natural way. Which seems weird, because a bunch of languages use them.
I wrote a #commonLisp #asdf package that just looks up a list of symbols in a file that has every non-surrogate unicode codepoint in it, and an #emacs #elisp function that just calls the #lisp one.
https://codeberg.org/tfw/unicode-chars
Multilingual people, what can you tell me about doing this at all?
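For comparison, the same lookup idea can be sketched with Python's standard `unicodedata` module instead of a file of code points. This is not the Common Lisp package from the link; `find_chars` is a name made up here.

```python
import sys
import unicodedata

def find_chars(fragment: str) -> list:
    """Return (character, name) pairs whose Unicode name contains fragment."""
    needle = fragment.upper()
    hits = []
    for cp in range(sys.maxunicode + 1):
        # Unnamed code points (surrogates, unassigned) fall back to "".
        name = unicodedata.name(chr(cp), "")
        if needle in name:
            hits.append((chr(cp), name))
    return hits

for char, name in find_chars("snowman"):
    print(char, name)
```

When the exact name is known, `unicodedata.lookup("SNOWMAN")` returns the character directly.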
In #UTF8 there is the symbol ⍼.
It is called Angzarr, a kind of ligature of an L and a lightning bolt.
Nobody really knows why it is included in UTF8, but at the end of this document there is a letter from the AMS president!
https://www.unicode.org/wg2/docs/n2191.pdf
So what does it stand for??
Wrong answers only
#UserAgent based banning of #textmode browsers is sooooo lame.
$ lynx -useragent= https://[…]
Why does this PHP construct:
normalizer_normalize( $search_string, \Normalizer::FORM_D );
convert ÖÖÖ to OOO, but keep ÅÅÅ as ÅÅÅ ... WTF?!
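One way to narrow this down is to look at the code points after normalization. The sketch below uses Python's `unicodedata` rather than PHP's `Normalizer`, so it only shows what canonical decomposition (NFD) is defined to do, not what the PHP setup in the question actually produced. In NFD both letters split into a base letter plus a combining mark, so a plain "O" or an intact "Å" would have to come from a later step (stripping marks, rendering, or a comparison), which is a guess, not a diagnosis. `describe` and `strip_marks` are hypothetical helpers.

```python
import unicodedata

def describe(text: str) -> list:
    """List each code point in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

def strip_marks(text: str) -> str:
    """Decompose to NFD, then drop combining marks (a common 'unaccent' step)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

for letter in ("\u00d6", "\u00c5"):  # Ö and Å as single precomposed code points
    print(describe(unicodedata.normalize("NFD", letter)))

print(strip_marks("\u00d6\u00d6\u00d6"), strip_marks("\u00c5\u00c5\u00c5"))
```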
[Translation] Branchless UTF-8 encoding
Can UTF-8 be encoded without branches? Yes. The question: Nathan Goldbaum asked in the Recurse chat: I know how to decode UTF-8 using bit math and lookup tables (see https://github.com/skeeto/branchless-utf8 ), but if I want to convert a code point to UTF-8, can that be done without branches? To start with, is there some way to write this function in C, which returns the number of bytes needed to store the UTF-8 bytes of a code point, without branching? Or would that require a huge lookup table?
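To make the length question concrete, here are two small sketches of the arithmetic in Python. Neither is claimed to be the approach from the translated article. Python gives no control over the branches actually executed, so this only shows that the logic needs no `if`; in C, `bit_length` would correspond to a count-leading-zeros operation. The names `UTF8_LEN_BY_BITS`, `utf8_len_table` and `utf8_len_compare` are invented here.

```python
# Index: number of significant bits in the code point (0..21).
# Value: bytes needed in UTF-8. Only 22 entries, so no huge table is required.
UTF8_LEN_BY_BITS = bytes([1] * 8 + [2] * 4 + [3] * 5 + [4] * 5)

def utf8_len_table(cp: int) -> int:
    """Length via a tiny lookup table indexed by bit length."""
    return UTF8_LEN_BY_BITS[cp.bit_length()]

def utf8_len_compare(cp: int) -> int:
    """Length as a sum of comparisons; each comparison contributes 0 or 1."""
    return 1 + (cp > 0x7F) + (cp > 0x7FF) + (cp > 0xFFFF)

for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
    print(hex(cp), utf8_len_table(cp), utf8_len_compare(cp))
```

Both functions assume the input is a valid code point (0 to 0x10FFFF); surrogates are not rejected.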
#Linux #Bash #utf8
If you even want to pick the fonts from a menu for a
𝐁𝐎𝐋𝐃 𝐅𝐎𝐍𝐓
, you can find something like that in my GitHub Gists too.
https://gist.github.com/dewomser
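The trick behind this kind of "bold" text is plain code point arithmetic: Unicode has a block of Mathematical Bold letters and digits, so each ASCII character is shifted by a fixed offset. A minimal Python sketch of that mapping follows; it is not the Bash script from the gists, and `to_bold` is a made-up name.

```python
def to_bold(text: str) -> str:
    """Map ASCII letters and digits to their Mathematical Bold counterparts."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))  # bold capitals start at U+1D400
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))  # bold small letters start at U+1D41A
        elif "0" <= ch <= "9":
            out.append(chr(0x1D7CE + ord(ch) - ord("0")))  # bold digits start at U+1D7CE
        else:
            out.append(ch)  # everything else (spaces, umlauts, ...) is left alone
    return "".join(out)

print(to_bold("Hello 123"))
```

Every mapped character lies outside the Basic Multilingual Plane, so it takes four bytes in UTF-8.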
Surprising glyph of the day: ₣, the currency symbol of the French franc (!).
Rendered as "a capital F with a double bar, which was proposed by Édouard Balladur in 1988, an Fr ligature, or other variants.
According to Yannis Haralambous in 2004, this symbol was never used"
https://fr.wikipedia.org/wiki/%E2%82%A3
#typo #UTF8