Making of the PETSCII typer (posted 2022-03-26)
I thought it might be interesting to talk a bit about how my PETSCII typer web tool came together.
It all started with the Drop + Matt3O MT3 retro keycap set. I ordered this set because I like the old-style key shape that's replicated here. And then as a bonus you get the PETSCII characters we know and love from the Commodore 8-bit computers printed on the front of the keycaps.
Not sure how I ended up there, but I found the Style64 C64 TrueType fonts, which replicate Commodore's take on the ASCII character set, usually referred to as PETSCII because the Commodore PET computer from 1977 introduced it.
So now I can have PETSCII graphical characters on my keyboard and PETSCII on my screen. The one missing thing: pressing a key and having the PETSCII character on that key show up on the screen. That's what the PETSCII typer does.
PETSCII, fonts and Unicode
PETSCII is actually two (four?) sets of 128 printable characters. One set (mostly) has the 95 printable ASCII characters and then about 30 graphical characters. The other set turns the lower case letters into upper case, and replaces the upper case letters with additional graphical characters. The two sets are also included in inverted (reverse video) form.
Style64's C64 Pro Mono font will display regular text in the Commodore 64's font, including Unicode characters that match PETSCII graphical characters. So when you change fonts, the characters will look different, but you still see the same characters. In addition to that, the four sets of 128 characters are also mapped into the Unicode private use space. Text that uses these Unicode code points will only show properly with the C64 font. If you change fonts, you'll just see a lot of "unknown character" boxes.
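To make the private use mapping a bit more concrete, here is a minimal sketch of how a character set number plus a character code could be turned into a private use code point. The base code point, the block size and the function names are assumptions for illustration only; the font's own documentation has the real layout.

```javascript
// Hypothetical layout: each character set gets its own block in the
// Unicode private use area (U+E000..U+F8FF). The real font may differ.
const PUA_BASE = 0xe000;  // assumption, not taken from the font
const BLOCK_SIZE = 0x100; // assumption, not taken from the font

// Map (set number, character code) to a private use character.
function toPrivateUse(setIndex, code) {
  if (code < 0 || code >= BLOCK_SIZE) {
    throw new RangeError("character code out of range: " + code);
  }
  return String.fromCodePoint(PUA_BASE + setIndex * BLOCK_SIZE + code);
}

// True if the first character of the string is in the private use area,
// meaning it will only display properly with a font that defines it.
function isPrivateUse(ch) {
  const cp = ch.codePointAt(0);
  return cp >= 0xe000 && cp <= 0xf8ff;
}
```

A check like isPrivateUse() is the kind of thing that's needed when switching to another font: those are the characters that have to be remapped or they turn into boxes.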
When using the C64 Pro Mono font, I try to keep most ASCII characters as ASCII and use the private use area mapping for the graphical characters. I also built in the ability to switch to a different font, which then requires additional character remapping. There are still some bugs there, but I decided to publish the tool now and do some more bug hunting at a later date.
To make things even more interesting, fairly recently additional "symbols for legacy computing" were added to Unicode, which makes it possible to map almost all non-inverted PETSCII characters to Unicode. However, my systems don't implement this version of Unicode (13 or 14) yet, requiring additional filtering/mapping options.
Javascript: mapping keyboards and Unicode
Javascript lets you run some code when the user presses a key. Your code then sees which physical key was pressed, which letter that key produced, and whether any modifier keys such as shift, control or alt were also pressed at the time.
The PETSCII typer keyboard event function then checks if there is a mapping from the key being pressed along with shift/control/alt to a PETSCII character. If so, it puts the mapped PETSCII character in the text area, and then uses event.preventDefault() to prevent the browser from performing the normal action associated with that keypress. Easy peasy. (If the key doesn't map, I just let the browser handle it the way it sees fit.)
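Roughly, that lookup could look like the sketch below. This is not the actual PETSCII typer code: the table entries, the key naming scheme and the function names are made up for the example. The event properties (code, altKey, ctrlKey, shiftKey) and preventDefault() are the standard ones.

```javascript
// Hypothetical mapping table: "modifiers+physical key" -> character.
// The entries are examples only, not the tool's real layout.
const KEY_MAP = new Map([
  ["alt+KeyA", "\u2660"],       // black spade suit
  ["alt+shift+KeyA", "\u2500"], // box drawing horizontal line
]);

// Build a lookup key from the modifiers and the physical key.
function lookupKey(event) {
  const parts = [];
  if (event.ctrlKey) parts.push("ctrl");
  if (event.altKey) parts.push("alt");
  if (event.shiftKey) parts.push("shift");
  parts.push(event.code);
  return KEY_MAP.get(parts.join("+"));
}

// Replace the selection [start, end) in text with ch.
function insertAt(text, start, end, ch) {
  return text.slice(0, start) + ch + text.slice(end);
}

// keydown handler: insert the mapped character, or do nothing and let
// the browser handle the key if there is no mapping.
function onKeyDown(event) {
  const ch = lookupKey(event);
  if (ch === undefined) return;
  const area = event.target;
  const start = area.selectionStart;
  area.value = insertAt(area.value, start, area.selectionEnd, ch);
  area.selectionStart = area.selectionEnd = start + ch.length;
  event.preventDefault();
}
```

In a page this would be hooked up with something like textarea.addEventListener("keydown", onKeyDown).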
But... I'm using the alt key for typing the PETSCII characters, so it's possible to type regular letters without alt. However, on the Mac the alt key (often called option there) can also be used to invoke dead keys for typing accented letters. And for some reason, in that case event.preventDefault() doesn't do its job, and an extra character appears. Turns out there is no clean way to get around that. Javascript actually has several events that let you catch dead key character composition, but no help there.
Eventually I did two things. First, I allow the user to use control rather than alt for typing PETSCII characters. Second, I catch the compositionupdate event and note the size of the text in the text area. Then, when a keydown event happens, if the text is now longer, I remove the extra characters. On Safari this solves the problem, but not on Chrome.
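The length bookkeeping for that workaround could be sketched as follows. The names are invented for the example, and it assumes the unwanted characters sit directly in front of the caret, which may not hold in every browser.

```javascript
// Remembers the text length at compositionupdate time and trims any
// characters that showed up since then when the next keydown arrives.
function createCompositionGuard() {
  let lengthBefore = null; // null: no composition seen yet

  return {
    onCompositionUpdate(area) {
      lengthBefore = area.value.length;
    },
    onKeyDown(area) {
      if (lengthBefore !== null && area.value.length > lengthBefore) {
        const extra = area.value.length - lengthBefore;
        const caret = area.selectionStart;
        // Assumption: the extra characters are right before the caret.
        const cutFrom = Math.max(0, caret - extra);
        area.value = area.value.slice(0, cutFrom) + area.value.slice(caret);
        area.selectionStart = area.selectionEnd = cutFrom;
      }
      lengthBefore = null;
    },
  };
}
```

The two methods would be called from the text area's compositionupdate and keydown event listeners, before the regular key mapping runs.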
And remember, Javascript handles characters as 16-bit values. So Unicode characters that don't fit into 16 bits (such as the new legacy characters, but not just those) are actually stored as two 16-bit code units (a surrogate pair) in Javascript. So I had to handle that myself in order to be able to map to/from those Unicode characters.
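A small sketch of what that looks like in practice, using only standard string functions. The remap() helper and its table are hypothetical; U+1FB70 is just one character from the "symbols for legacy computing" block.

```javascript
// One character from the "symbols for legacy computing" block.
const legacy = String.fromCodePoint(0x1fb70);

// legacy.length is 2: the string holds a surrogate pair, not one value.

// Split a string into whole code points rather than 16-bit units.
// Array.from() iterates by code point, so surrogate pairs stay together.
function toCodePoints(text) {
  return Array.from(text, (ch) => ch.codePointAt(0));
}

// Build a string back from a list of code points.
function fromCodePoints(codePoints) {
  return String.fromCodePoint(...codePoints);
}

// Hypothetical helper: remap code points through a Map, leaving
// anything that isn't in the table unchanged.
function remap(text, table) {
  return fromCodePoints(
    toCodePoints(text).map((cp) => (table.has(cp) ? table.get(cp) : cp))
  );
}
```

Indexing with text[i] or text.charCodeAt(i) would instead return the two halves of the pair separately, which is where mapping code goes wrong.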
Pasting rich text
The idea behind the PETSCII typer was that you can use it to type PETSCII and then copy and paste the results into another application, such as a text editor or word processor. To make that work as seamlessly as possible, I set things up such that when you copy text from the PETSCII typer text area (but not cut text), I catch that event and then put the plain text on the clipboard and also an HTML version of the same text with the C64 Pro Mono font specified. So in most other applications, the PETSCII will paste looking the way it should. Although often, it's necessary to tinker with the line height a bit to make sure everything lines up nicely.
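A copy handler along those lines might look like this. It's a sketch rather than the tool's actual code: the function names and the use of a pre element are my choices for the example, while clipboardData.setData() and preventDefault() are the standard clipboard event calls.

```javascript
// Escape the characters that have special meaning in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Wrap the text in HTML that asks for the C64 font.
function buildClipboardHtml(text) {
  return (
    "<pre style=\"font-family: 'C64 Pro Mono', monospace\">" +
    escapeHtml(text) +
    "</pre>"
  );
}

// copy handler: put both a plain text and an HTML version of the
// selected text on the clipboard instead of the browser's default.
function onCopy(event) {
  const area = event.target;
  const selected = area.value.slice(area.selectionStart, area.selectionEnd);
  event.clipboardData.setData("text/plain", selected);
  event.clipboardData.setData("text/html", buildClipboardHtml(selected));
  event.preventDefault();
}
```

Applications that understand rich text pick up the text/html version, and plain text editors fall back to text/plain.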
(By the way, designers of modern fonts have no idea how box drawing characters and other graphical characters are supposed to work. The amount of whitespace above and below is all over the place so PETSCII graphics converted to fonts like Courier look pretty bad, even if all the characters could be mapped to Unicode.)
Loading files
I thought it would be a nice addition to be able to load PETSCII text files, as I'm not aware of an obvious way to convert those otherwise. A file input element makes the browser ask the user to select a local file, and the Javascript FileReader API then makes the contents of that file available to the script. So even though it looks like you're uploading a file, it's not transmitted over the network, it stays within the browser.
Mapping the characters from a text file to PETSCII in the text area was of course simple enough. Then I thought: what about Commodore BASIC programs? Those use line numbers in binary and they map all the BASIC keywords to "tokens". Again some mapping, so that was fairly easy, too. Although my code doesn't handle control codes inside strings in BASIC.
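To give an idea of what that involves, here is a minimal sketch of listing a tokenized program. It assumes the usual layout of a BASIC program file: a two-byte load address, then for each line a two-byte pointer to the next line, a two-byte line number (both low byte first), the tokenized text and a zero byte, with a zero pointer marking the end. The token table only has a handful of entries, and bytes are passed through as ASCII instead of being mapped to PETSCII, so this is an illustration and not a complete converter.

```javascript
// Partial token table, just enough for the example.
const TOKENS = new Map([
  [0x80, "END"],
  [0x89, "GOTO"],
  [0x8f, "REM"],
  [0x99, "PRINT"],
]);

// Turn the bytes of a tokenized BASIC program into listing lines.
function listBasic(bytes) {
  const lines = [];
  let pos = 2; // skip the two-byte load address
  while (pos + 4 <= bytes.length) {
    const next = bytes[pos] | (bytes[pos + 1] << 8);
    if (next === 0) break; // zero pointer: end of program
    const lineNumber = bytes[pos + 2] | (bytes[pos + 3] << 8);
    pos += 4;
    let text = "";
    let inQuotes = false;
    while (pos < bytes.length && bytes[pos] !== 0) {
      const b = bytes[pos++];
      if (b === 0x22) inQuotes = !inQuotes; // tokens aren't expanded in strings
      if (!inQuotes && b >= 0x80) {
        text += TOKENS.get(b) ?? "{token " + b.toString(16) + "}";
      } else {
        text += String.fromCharCode(b); // simplification: treat as ASCII
      }
    }
    pos++; // skip the zero byte that ends the line
    lines.push(lineNumber + " " + text);
  }
  return lines;
}
```

Like the real thing, this sketch does nothing special with control codes inside strings.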
C64 colors
I couldn't resist the temptation to copy the C64 color scheme along with the PETSCII font. Turns out that there are many takes on how the C64 colors should look. I had a look at Secret colours of the Commodore 64 and Commodore VIC-II Color Analysis, but both delivered very dark and drab colors. (But both interesting reads anyway.) Especially the red is barely red. Eventually I used an HDMI capture dongle to capture the output of my THEC64 but then still increased the saturation a bit for several colors.
Of course today it seems weird that we can't agree on the objective color output specs of a computer. But the C64 and computers like it output NTSC or PAL directly, with no RGB values anywhere. Also, build variations between chip and board revisions and between individual computers made for significant differences. And we all set the saturation on our monitors or TVs to whatever we found most pleasing. But for sure, a Commodore 64 doesn't produce some drab brownish output when you ask for "red".
You can change the text color using control 1 - 8 and shift control 1 - 8, and the border and background colors by entering a number from 0 to 15 in the "poke" boxes. And you can enable reverse video (background color becomes text color, text color becomes background color) using control 9 and turn it back off with control 0.
But that adds a new complication: what if the text cursor is on some text in reverse video? So the text cursor ("caret" in Javascript speak) needs to contrast with both the foreground and background colors. So I needed to keep track of (roughly) the brightness of each color, and then either use a bright cursor when text and background are dim, a dark cursor when both are bright, or a middle gray when one is dark and the other is bright.
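The decision itself is simple once there is a brightness number for each color. A sketch, where the brightness formula (a common weighted sum of red, green and blue), the 0.5 threshold and the three cursor colors are all assumptions picked for the example:

```javascript
// Rough perceived brightness of a "#rrggbb" color, from 0 to 1.
function brightness(hex) {
  const n = parseInt(hex.slice(1), 16);
  const r = (n >> 16) & 0xff;
  const g = (n >> 8) & 0xff;
  const b = n & 0xff;
  return (0.299 * r + 0.587 * g + 0.114 * b) / 255;
}

// Pick a cursor color that contrasts with both text and background.
function caretColor(textColor, backgroundColor, threshold = 0.5) {
  const textBright = brightness(textColor) >= threshold;
  const backBright = brightness(backgroundColor) >= threshold;
  if (textBright && backBright) return "#000000";   // both bright: dark cursor
  if (!textBright && !backBright) return "#ffffff"; // both dim: bright cursor
  return "#808080";                                 // mixed: middle gray
}
```

The result can then be applied to the text area through the CSS caret-color property.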
It's always the small things that take an unexpected amount of extra time. Another example: when you click the "choose file" button, and select the last file you loaded, then no Javascript events trigger. So additional complexity is needed to be able to load the same file again.
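One common way to deal with the same-file problem is to clear the file input's value once the file has been picked up, so that selecting the same file again counts as a change. The sketch below shows that approach; it is not necessarily how the PETSCII typer does it, and makeChangeHandler and readFile are names invented for the example.

```javascript
// Returns a change handler that hands the chosen file to readFile and
// then clears the input, so picking the same file again fires "change".
function makeChangeHandler(readFile) {
  return function onChange(event) {
    const input = event.target;
    const file = input.files && input.files[0];
    if (!file) return; // the user cancelled the dialog
    readFile(file);
    input.value = "";
  };
}

// Browser-only wiring; skipped when there is no document.
if (typeof document !== "undefined") {
  const input = document.querySelector("input[type=file]");
  if (input) {
    input.addEventListener(
      "change",
      makeChangeHandler((file) => {
        const reader = new FileReader();
        reader.onload = () => {
          const bytes = new Uint8Array(reader.result);
          console.log(bytes.length + " bytes loaded");
        };
        reader.readAsArrayBuffer(file);
      })
    );
  }
}
```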