free software resistance
the cost of computing freedom is eternal vigilance
### unicode-is-not-a-standard-its-a-hot-mess
*originally posted:* jan 2024
fundamentally broken standards like css and unicode all bear the same hallmarks. the gemini protocol, instead of replacing or reforming such a broken mess, offers an alternative.
sadly, while gemini fixes html, http and css (i think http is better in some ways, but overall i think gemini is inspired) it leaves unicode the same hot mess it always is.
this is defensible, i dont think it would help gemini to become as ambitious as fixing unicode. as far as i know, the western world has ebcdic (grotesque) and baudot (insufficient) and ascii (truly outmoded) and unicode (impossible) to choose from, and unicode is the lesser evil here. its unfortunate, but i dont really blame gemini for the fact that a better choice does not exist.
html and css could have been real standards, but industry is pretty intent on destroying such things. if big companies are overambitious in their efforts to add features we dont need and more importantly, cannot implement (css will NEVER be standard! its a false promise, a swindle, a hoax for profits sake) we will always have "standards" whose reality is closer to "a constant stream of wishful prototypes" whereas ASCII was a REAL standard: wherever you were promised ascii- you actually got ascii.
if some idiot did a phoned-in (an apt phrase for teletype users) implementation or wedged extra data into bit 7, thats nothing compared to the impossible mess we can thank unicode or css for. in the 90s we waged holy war against nagware and later browser popups, only to have css (THANKS!) become the ultimate bastion for pop-up simulation. css should be taken to the wall and shot.
yes, i do use it on this page. please disable css in your browser and do not reenable it. the world will manage.
the problem is not that standards evolve. every reasonable person knows that, whether or not english is always improving, forcing king james english on everyone would be ridiculous and tedious- not to mention that it would break the vast majority of printed pages in existence today. english is a wonderful example of communication managing just fine without real standardisation. when its broken, sometimes thats because we want it to be.
the real problem, as usual, is capitalism. without megacorporations, the unicode standard would surely be more tolerable. but that doesnt change the fact that its perpetually broken.
im completely in favour of documenting whats GOOD about unicode, because there are a number of steps forward. special, unprintable formatting characters for unicode are nothing new- for example, the non-breaking space predates unicode (it is not in 7-bit ascii itself, but 8-bit extensions like latin-1 have it at value 160), and it instructs the output device on not only what to do with the present character but the next one.
i am less fond of (we should avoid as much as possible) characters that affect the PREVIOUS character instead, but at least in the context of term windows and vts both ascii 8 and unicode 8 affect the previous character by deleting it. there is a precedent, but i think its better to be conservative with this because implementing it really brings back every problem we had with roman numerals.
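a small sketch of the precedent, in python. it assumes the simplified model described above, where value 8 (backspace) deletes the previous character; many real terminals only move the cursor left and leave the erasing to whatever is printed next. the function name is made up for illustration.

```python
import unicodedata

BACKSPACE = "\b"

# value 8 is the same control character in ascii and in unicode,
# and utf-8 encodes it as the same single byte
print(ord(BACKSPACE))               # 8
print(BACKSPACE.encode("ascii"))    # b'\x08'
print(BACKSPACE.encode("utf-8"))    # b'\x08'
print(unicodedata.category(BACKSPACE))  # Cc (control character)


def apply_backspaces(text: str) -> str:
    """treat each backspace as 'delete the previous character' (simplified model)."""
    out = []
    for ch in text:
        if ch == BACKSPACE:
            if out:
                out.pop()  # the character being affected came BEFORE this one
        else:
            out.append(ch)
    return "".join(out)


print(apply_backspaces("helo\blo"))  # hello
```

the point of the loop is that you cannot finish rendering a character until you have looked at what follows it, which is the cost of any value that acts backwards.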
but if we dont overdo it, the ability to do strikethrough for example, as a feature of the text stream itself, is arguably superior to the necessity to have it as part of html, or css or even the gemtext standard. gemini doesnt implement strikethrough, but makes it possible by requiring unicode support. and good on them for this.
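a minimal sketch of strikethrough living in the text stream itself. it assumes the mechanism is the combining long stroke overlay (U+0336), which is my guess at what is meant here, and the function names are made up for illustration. whether the result actually shows up struck through depends on the client and its font.

```python
# U+0336 COMBINING LONG STROKE OVERLAY modifies the character BEFORE it
STROKE = "\u0336"


def strike(text: str) -> str:
    """follow every character with the combining overlay."""
    return "".join(ch + STROKE for ch in text)


def unstrike(text: str) -> str:
    """remove the overlay again to recover the plain text."""
    return text.replace(STROKE, "")


plain = "gemini"
struck = strike(plain)

print(struck)
print(len(plain), len(struck))      # 6 12
print(unstrike(struck) == plain)    # True
```

note that the struck text is twice as long in code points as what the reader sees, which is exactly the kind of thing that needs to be used conservatively.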
i do strongly lean towards the idea that one value is equal to one result- this is neat and clean and above all, predictable. you dont have to sell me on the point that for some languages, it makes more sense to combine existing values. this is a compromise, and should be treated like one.
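to make the compromise concrete: unicode lets some characters be written either as one value or as a base letter plus a combining mark. a small sketch using the unicodedata module from pythons standard library (picking é as the example is my choice, not something from the standard):

```python
import unicodedata

one_value = "\u00e9"     # é as a single code point
two_values = "e\u0301"   # e followed by COMBINING ACUTE ACCENT

# they look the same on screen, but they are different sequences of values
print(one_value == two_values)           # False
print(len(one_value), len(two_values))   # 1 2

# normalisation converts between the two spellings
composed = unicodedata.normalize("NFC", two_values)
decomposed = unicodedata.normalize("NFD", one_value)
print(composed == one_value)     # True
print(decomposed == two_values)  # True
```

software that compares or searches text without normalising both sides first will treat these as different strings, so "one result" can arrive as more than one set of values.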
i say "values" because thats the point of the unicode standard: to reduce written information to numeric values in a standard and predictable fashion. its in the predictability that the standard not only fails, but will always fail. unicode is broken because it continually tries to do more than most can implement. it is not reliable- it will never be reliable. it just keeps marching forward and breaking text.
if you reform this the problem continues, simply showing up less of the time. if you provide a modest alternative, the way gemini does to html and css- call it unicode lite or if legal trademarks make that impossible, unicycle or something- you get something better than reform- you get predictable, even standard implementations back.
i dont think going back to ascii is the answer- im not against people making software that uses an outdated standard, what if you have an ancient device and you want a filter that turns unicode into the closest ascii characters possible, or simply drops everything but the first 128 characters? thats not a crime.
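both kinds of filter fit in a few lines. this is a rough sketch rather than a proper transliterator: the function names are made up, and "closest ascii" here only means "decompose with NFKD and throw away whatever is still not ascii".

```python
import unicodedata


def drop_non_ascii(text: str) -> str:
    """keep only the first 128 characters (values 0-127), drop the rest."""
    return "".join(ch for ch in text if ord(ch) < 128)


def closest_ascii(text: str) -> str:
    """split accented letters into base letter + mark, then drop the marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


sample = "caf\u00e9 na\u00efve"   # "café naïve"

print(drop_non_ascii(sample))   # caf nave
print(closest_ascii(sample))    # cafe naive
```

letters with no decomposition (ø or ß for example) simply vanish from the second filter too, so a filter for a real device would probably want a hand-written substitution table on top of this.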
but i prefer a more honest connection between what is promised and what is delivered, and even if we have to exaggerate slightly, things like unicode and css become a "standard" not by truly replacing limited and fragmented standards, but by glueing them together into a single giant, forever fragmented standard. theres never a real point in sight where this becomes reliable, and thats what makes it a big lie.
maybe im being an unreasonable perfectionist here, but im tired of seeing this stuff break year after year while fanboys hoot about how this fixes everything. it does just the opposite. and it leads to gaslighting, where im supposed to believe im the last person on earth who experiences these imposed artifacts of incompatibility on a regular basis. oh its not the standard! its just YOUR SOFTWARE.
there really is no standard, i routinely experience this with software recommended to me by fans of unicode, which still has these issues. this has happened for more than a decade. this is just a big collective lie.
but i cant help notice the similarity in tone and tack of other initiatives like it, such as systemd or wayland. these are worse, because the failures overlap and the promises overlap, but the necessity and value of something LIKE unicode far outweighs the necessity of a monstrosity like systemd. systemd really is a corporate wishlist, imposed on every gnu/linux distro like a squadron of tanks rolling across the border. each of these is overambitious, hopeless to make work, and every one of us is expected to take time out of our day to shoehorn it into our lives, before it actually benefits us.
i realise that i sound like an old codger here, thats the image youre supposed to have of anyone making an argument like this. but the only time i really miss ascii is when unicode craps all over something for the third time in the same week. i wish that did not happen almost weekly, and i resent the way that self-important developers always insist this is my fault.
the real heroes are people like the ones who develop alternatives like gemini, and people in respected positions who speak out against the corporate gaslighting- including theo de raadt and the member of the netbsd foundation board who have done this.
im not saying we should get back to the good old days. im saying i would like better days in the future, and more honesty and integrity from developers. unfortunately, developers channeling marketing scoundrels is also an industry standard. we should deprecate that one, even if it breaks capitalism permanently.
one-size-fits-all is the marching song of monopolies. if you can afford industrial bs at scale, you can tell people that something broken works for everyone.
theres plenty of evidence it works. theres plenty of evidence it doesnt. honesty will cite both of these, and industry will push whatever evidence works in favour of its goals.
personally, i think standards would be better if developed outside the hands of monopolies. whats that? a standard IS a monopoly? i think that depends entirely on the scope and implementation. im not salty that ascii subsumed ebcdic, ebcdic was trash- baudot was TOO modest. i think gopher is very cool, but i like gemtext better because its formatted with visible characters rather than tabs.
theres not much in the way of a hard rule that will make this work. but i think humanity is up to this, if we put gaslighting and marketing aside and have real conversations about whats better and what isnt and why. industry will never support this fully enough, until industry is in the hands of the people. thus for now we can probably count on industry to do one thing at least, which is turn our standards into hot messes that can never be properly implemented. my standard response: your "standard" is broken by design.
people frame this as being against standards. id simply prefer to replace broken ones with ones that are more practical for everyone. but as a prototype for a real standard, which actually could work in a world with an industry less tainted by dishonest goals and dishonest marketing, unicode is paralleled by nothing. it has so much potential, and its a shame that its own creators will never be able to fix it. my advice is simple enough: dont create impossible standards, and then they really can become what they promise.
license: 0-clause bsd
```
# 2018, 2019, 2020, 2021, 2022, 2023, 2024
#
# Permission to use, copy, modify, and/or distribute this software for any
# purpose with or without fee is hereby granted.
#
# THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
# WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
# MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
# ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
# WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
# ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
# OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
```
=> https://freesoftwareresistance.neocities.org