#Unicode is one of those little things in life that I can't help but smile about.
Is it perfect? No, of course not. Is it better than the alternative? Yes, so much so that every time I'm confronted with a long list of character encodings I can choose from, I feel a sense of relief when I find #UTF8 among them.
I wouldn't have thought it possible to standardize a single character encoding for everyone, and yet, somehow, there is just such a standard.
I decided on codeberg I still hate UTF8.
(why couldn't there be a size prefix?)
#Programming #UTF8 that was kind of annoying to implement, but I can now store a UTF8 string and print it out.
...at least a subset of utf8, I don't have checks for all the possible utf8 characters yet.
How quickly can you check that a string is valid unicode (UTF-8)? — https://lemire.me/blog/2018/05/09/how-quickly-can-you-check-that-a-string-is-valid-unicode-utf-8/ #utf8 #programming
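For anyone wondering what "valid" involves beyond a subset: here is a minimal sketch of a byte-by-byte check, assuming the well-formed byte ranges from the Unicode Standard (no overlong forms, no surrogates, nothing above U+10FFFF). It is not the fast approach from the linked post, just a plain reference version; the function name `is_valid_utf8` is my own.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is a well-formed UTF-8 byte sequence."""
    i, n = 0, len(data)
    while i < n:
        b0 = data[i]
        if b0 < 0x80:                      # 1-byte sequence (ASCII)
            i += 1
            continue
        lo, hi = 0x80, 0xBF                # default range for the 2nd byte
        if 0xC2 <= b0 <= 0xDF:             # 2-byte; 0xC0/0xC1 would be overlong
            need = 1
        elif 0xE0 <= b0 <= 0xEF:           # 3-byte
            need = 2
            if b0 == 0xE0:
                lo = 0xA0                  # reject overlong forms
            elif b0 == 0xED:
                hi = 0x9F                  # reject surrogates U+D800..U+DFFF
        elif 0xF0 <= b0 <= 0xF4:           # 4-byte
            need = 3
            if b0 == 0xF0:
                lo = 0x90                  # reject overlong forms
            elif b0 == 0xF4:
                hi = 0x8F                  # reject code points above U+10FFFF
        else:
            return False                   # stray continuation or invalid lead byte
        if i + need >= n:
            return False                   # sequence is truncated
        if not lo <= data[i + 1] <= hi:
            return False
        for j in range(i + 2, i + need + 1):
            if not 0x80 <= data[j] <= 0xBF:
                return False
        i += need + 1
    return True
```

In Python itself, `data.decode("utf-8")` raising `UnicodeDecodeError` is the built-in way to get the same answer, which makes it a handy cross-check for a hand-written validator.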
I slurped down all my #Facebook posts, at least if #Meta is to be believed. I asked for #JSON format in the hope that it would go more smoothly. The JSON encoding caused a bit of trouble: the strings are valid #UTF8, but JSON apparently assumes #UTF16, so a kind of back-and-forth conversion is needed; help was found on #StackOverflow. At least the timestamps were standard #POSIX.
I don't know how complete the "archive" is, but at least something gets saved when you clear out. #some #atkjuttuja
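The back-and-forth conversion can be sketched roughly like this. This is a guess at the shape of the problem, assuming the export writes each UTF-8 byte as its own `\u00XX` escape, so that after JSON parsing every character stands for one byte. The sample string and the `post` key are made up for illustration, not taken from a real export.

```python
import json

def fix_mojibake(text: str) -> str:
    """Reinterpret a string whose characters are really UTF-8 bytes.

    Assumes every character is in U+0000..U+00FF; encode() raises
    UnicodeEncodeError otherwise.
    """
    return text.encode("latin-1").decode("utf-8")

# Hypothetical sample: "äiti" with its UTF-8 bytes C3 A4 escaped one by one.
raw = '{"post": "\\u00c3\\u00a4iti"}'
data = json.loads(raw)
fixed = fix_mojibake(data["post"])
print(fixed)  # äiti
```

Strings that were encoded correctly in the first place will either fail or come out wrong, so this should only be applied to fields known to be affected.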
Hey everyone. I must admit, I don't believe I have ever seen someone enter #utf8 #unicode characters on a #computer in a natural way. Which seems weird, because a bunch of languages use them.
I wrote a #commonLisp #asdf package that just looks up a list of symbols in a file that has every non-surrogate unicode codepoint in it, and an #emacs #elisp function that just calls the #lisp one.
Multilingual people, what can you tell me about doing this at all?
In #UTF8 there is the symbol ⍼.
It is called Angzarr, a kind of ligature of an L and a lightning bolt.
Nobody really knows why it is included in UTF8, but at the end of this doc there is a letter from the AMS president!
So what does it stand for??
Wrong answers only
#UserAgent based banning of #textmode browsers is sooooo lame.
$ lynx -useragent= https://[…]
Why does this PHP construct:
normalizer_normalize( $search_string, \Normalizer::FORM_D );
convert ÖÖÖ to OOO, but keep ÅÅÅ as ÅÅÅ ... WTF?!
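One way to narrow this down: form D on its own should not produce a bare O or leave Å precomposed. It splits both letters into a base letter plus a combining mark. A small sketch in Python, whose `unicodedata.normalize("NFD", ...)` applies the same Unicode normalization form. The `strip_marks` helper is my own illustration of a typical follow-up step, not something from the question.

```python
import unicodedata

# Use escapes so the input is definitely the precomposed form.
O_UMLAUT = "\u00d6"   # Ö
A_RING = "\u00c5"     # Å

for ch in (O_UMLAUT, A_RING):
    decomposed = unicodedata.normalize("NFD", ch)
    print([f"U+{ord(c):04X}" for c in decomposed])
# ['U+004F', 'U+0308']  O + COMBINING DIAERESIS
# ['U+0041', 'U+030A']  A + COMBINING RING ABOVE

def strip_marks(text: str) -> str:
    """Decompose, then drop every nonspacing mark (category Mn)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn")

print(strip_marks(O_UMLAUT * 3 + A_RING * 3))  # OOOAAA
```

So if Ö ends up as O while Å survives, my guess is that whatever runs after the normalization removes U+0308 but not U+030A, or recomposes the string afterwards. That is only a guess without seeing the rest of the code.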
[Translation] Branchless UTF-8 encoding
Can UTF-8 be encoded without branches? Yes. The question: Nathan Goldbaum asked in the Recurse chat: I know how to decode UTF-8 using bit math and lookup tables (see https://github.com/skeeto/branchless-utf8 ), but if I want to convert a code point to UTF-8, can that be done without branches? To start with, is there some way to write a C function that returns the number of bytes needed to store a code point's UTF-8 encoding, without using branching? Or would that require a huge lookup table?
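One possible shape for the length function, sketched in Python rather than C: add up the results of three comparisons instead of branching on them. This is not necessarily the answer the article arrives at, and the Python interpreter itself still branches internally. The point is only that the expression has no `if` and carries over to C almost unchanged. The name `utf8_len` is mine, and the input is assumed to be a valid code point (0 to 0x10FFFF, not a surrogate).

```python
def utf8_len(cp: int) -> int:
    """Number of bytes in the UTF-8 encoding of code point cp.

    Each comparison yields True/False, which count as 1/0 when added.
    """
    return 1 + (cp > 0x7F) + (cp > 0x7FF) + (cp > 0xFFFF)

# Cross-check against Python's own encoder for a few code points.
for cp in (0x41, 0xE9, 0x20A3, 0x237C, 0x1F600):
    assert utf8_len(cp) == len(chr(cp).encode("utf-8"))
```

No lookup table is needed for the length alone, because the three thresholds are fixed.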
Anyone who even wants to pick the fonts from a menu for a
, can find something like that in my GitHub Gists too.
Surprising glyph of the day: ₣, currency symbol of the French franc (!).
Rendered as "a double-barred capital F, which was proposed by Édouard Balladur in 1988, an Fr ligature, or other variants.
According to Yannis Haralambous in 2004, this symbol has never been used"
#typo #UTF8