Dabeaz

Dave Beazley's mondo computer blog.

Sunday, August 22, 2010

 

Using Python to Encode Cassette Recordings for my Superboard II

See Part 2 for a discussion of decoding audio.

See Part 3 to see real-time audio encoding/decoding used in conjunction with telnet.

My family's first computer was an Ohio Scientific Superboard II--something that my father purchased around 1979. At the time, the Superboard II was about the most inexpensive computer you could get. In fact, it didn't even include a power supply or a case. If you wanted those features, you had to add them yourself. Here's a picture of our system with the top of the (homemade) case removed so that you can see inside.


To say that the Superboard II is minimal is certainly an understatement by today's standards. There were only 8192 bytes of memory in total and no real operating system to speak of. When you powered on the system you could either run the machine language monitor or Microsoft Basic Version 1.0. Here's a sample of what appeared on the screen (yes, that's maximum resolution):


Much to my amazement, our old Superboard II system stayed in the family. For about 20-25 years it sat in the basement of my mother's house surrounded by boxes. After that, it sat for a few years in a closet at my brother's condo. Occasionally, we had discussed the idea of powering it up to see if it still worked, but never got around to it--until now. About a week ago, my brother threw the old computer along with an old Amiga monitor in the back of his car and headed east to Chicago. After some discussion, we decided we'd just blow the dust out of it, power it on, and see what would happen.

Unbelievably, the machine immediately sprang to life. The above screenshot was taken just today. Since powering it up, I've written a few short programs to test the integrity of the memory and ROMs. Aside from a 1-bit memory error (bit 2 at location 0x861) it appears to be fully functional.

One problem with these old machines is that they had very little support for any kind of real I/O. Forget about USB, Firewire, or Ethernet. Heck, this machine didn't even have a serial or parallel port on it. In fact, the only external interface was a pair of audio ports for saving and loading programs on a cassette tape player--which was also the only way to save any of your work as there was no disk drive of any kind. Here is a picture of the back:


Since the old machine seemed to be working, I got to thinking about ways to program it. Working directly on the machine was certainly possible, but if you look at the keyboard, you'll notice that there aren't even any arrow keys (there is no cursor control anyways) and some of the characters are in unusual locations. Plus, some of the keys are starting to show their age. For example, pressing '+' tends to produce about 3 or 4 '+' characters due to some kind of key debouncing problem. So, like most Python programmers, I started to wonder if there was some way I could write a script that would let me program the machine in a more straightforward manner from my Mac.

Since the only input port available on the machine was a cassette audio port, the proposition seemed simple enough: could I write a Python script to convert a normal text file into a WAV audio file that when played, would upload the contents of the text file into the Superboard II? Obviously, the answer is yes, but let's look at the details.

Viewing Cassette Audio Output

On many old machines, cassette output is encoded using something called the Kansas City Standard. It's a pretty simple encoding. A 0 is encoded as 4 cycles of a 1200 Hz sine wave and a 1 is encoded as 8 cycles of a 2400 Hz sine wave. If no data is being transmitted, there is a constant 2400 Hz wave. Each byte of data is transmitted by first sending a 0 start bit followed by 8 bits of data (LSB first) followed by two stop bits (1s). Click here to hear a WAV file sample of actual data being saved by my Superboard II. I recorded this sample using Audacity on my Mac.
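To make the framing concrete, here is a minimal sketch of how a single byte turns into the 11 bits described above. The function name kcs_frame_bits is made up for this illustration and is not part of any library.

```python
def kcs_frame_bits(byteval):
    """Return the 11 bits sent for one byte:
    a 0 start bit, 8 data bits (LSB first), then two 1 stop bits."""
    data_bits = [(byteval >> i) & 1 for i in range(8)]   # least significant bit first
    return [0] + data_bits + [1, 1]

# 'A' is 0x41 = 0b01000001
print(kcs_frame_bits(0x41))    # [0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1]
```

Each 0 in that list would then be sent as 4 cycles of 1200 Hz and each 1 as 8 cycles of 2400 Hz.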

Python has a built-in module for reading WAV files. Combined with Matplotlib you can easily view the waveform. For example:

>>> import wave
>>> f = wave.open("osi_sample.wav")
>>> f.getnchannels()
2
>>> f.getsampwidth()
2
>>> f.getnframes()
1213851
>>> rawdata = bytearray(f.readframes(1000000))
>>> del rawdata[2::4]    # Delete the right stereo channel    
>>> del rawdata[2::3]
>>> wavedata = [a + (b << 8) for a,b in zip(rawdata[::2],rawdata[1::2])]
>>> import pylab
>>> pylab.plot(wavedata)
>>>

After some panning and zooming, you'll see a plot like this. You can observe the different frequencies used for representing 0s and 1s. Again, this plot was created from an actual sound recording of data saved by the system.
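One detail worth knowing: 16-bit WAV samples are signed, while the session above combines the two bytes as an unsigned value, so negative samples end up wrapped into the upper half of the range. That doesn't hide the frequencies, but if you want a plot centered on zero, here is a small sketch using the struct module. It assumes 16-bit little-endian stereo data (which matches the getsampwidth() and getnchannels() results above), and left_channel is just a name picked for this example.

```python
import struct

def left_channel(raw, nchannels=2):
    """Decode interleaved 16-bit little-endian PCM data and
    keep only the first (left) channel as signed integers."""
    nsamples = len(raw) // 2
    samples = struct.unpack("<%dh" % nsamples, bytes(raw[:nsamples * 2]))
    return list(samples[::nchannels])

# Two stereo frames: left samples are 1000 and -1000, right samples are 7
raw = struct.pack("<4h", 1000, 7, -1000, 7)
print(left_channel(raw))    # [1000, -1000]
```

The result of something like left_channel(f.readframes(1000000)) could be handed to pylab.plot() in the same way as wavedata.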


Converting Text into a KCS WAV File

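Using Python's wave module, it is relatively straightforward to go in the other direction--that is, take a text file and encode it into a WAV file suitable for playback. The general strategy is to build the waveforms for a 0 bit and a 1 bit once, turn each byte of the input into a start bit, 8 data bits, and two stop bits using those waveforms, and write the result to a WAV file with a stretch of carrier tone before and after the data.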

Here is a script kcs_encode.py that has one implementation.

#!/usr/bin/env python3
# kcs_encode.py
#
# Author : David Beazley (http://www.dabeaz.com)
# Copyright (C) 2010
#
# Requires Python 3.1.2 or newer

"""
Takes the contents of a text file and encodes it into a Kansas
City Standard WAV file, that when played will upload data via the
cassette tape input on various vintage home computers. See
http://en.wikipedia.org/wiki/Kansas_City_standard
"""

import wave

# A few global parameters related to the encoding

FRAMERATE = 9600       # Hz
ONES_FREQ = 2400       # Hz (per KCS)
ZERO_FREQ = 1200       # Hz (per KCS)
AMPLITUDE = 128        # Amplitude of generated square waves
CENTER    = 128        # Center point of generated waves

# Create a single square wave cycle of a given frequency 
def make_square_wave(freq,framerate):
    n = int(framerate/freq/2)
    return bytearray([CENTER-AMPLITUDE//2])*n + \
           bytearray([CENTER+AMPLITUDE//2])*n

# Create the wave patterns that encode 1s and 0s
one_pulse  = make_square_wave(ONES_FREQ,FRAMERATE)*8
zero_pulse = make_square_wave(ZERO_FREQ,FRAMERATE)*4

# Pause to insert after carriage returns (10 NULL bytes)
null_pulse = ((zero_pulse * 9) + (one_pulse * 2))*10

# Take a single byte value and turn it into a bytearray representing
# the associated waveform along with the required start and stop bits.
def kcs_encode_byte(byteval):
    bitmasks = [0x1,0x2,0x4,0x8,0x10,0x20,0x40,0x80]
    # The start bit (0)
    encoded = bytearray(zero_pulse)
    # 8 data bits
    for mask in bitmasks:
        encoded.extend(one_pulse if (byteval & mask) else zero_pulse)
    # Two stop bits (1)
    encoded.extend(one_pulse)
    encoded.extend(one_pulse)
    return encoded

# Write a WAV file with encoded data. leader and trailer specify the
# number of seconds of carrier signal to encode before and after the data
def kcs_write_wav(filename,data,leader,trailer):
    w = wave.open(filename,"wb")
    w.setnchannels(1)
    w.setsampwidth(1)
    w.setframerate(FRAMERATE)

    # Write the leader
    w.writeframes(one_pulse*(int(FRAMERATE/len(one_pulse))*leader))

    # Encode the actual data
    for byteval in data:
        w.writeframes(kcs_encode_byte(byteval))
        if byteval == 0x0d:
            # If CR, emit a short pause (10 NULL bytes)
            w.writeframes(null_pulse)
    
    # Write the trailer
    w.writeframes(one_pulse*(int(FRAMERATE/len(one_pulse))*trailer))
    w.close()

if __name__ == '__main__':
    import sys
    if len(sys.argv) != 3:
        print("Usage : %s infile outfile" % sys.argv[0],file=sys.stderr)
        raise SystemExit(1)

    in_filename = sys.argv[1]
    out_filename = sys.argv[2]
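    # Text mode already translates line endings to '\n'; the old "U" mode
    # flag was removed in Python 3.11
    with open(in_filename) as f:
        data = f.read()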
    data = data.replace('\n','\r\n')         # Fix line endings
    rawdata = bytearray(data.encode('latin-1'))
    kcs_write_wav(out_filename,rawdata,5,5)

You can study the implementation yourself for some of the finer details. However, most of the heavy work is carried out using operations on Python's bytearray object. For padding the audio, a constant 1 bit is emitted (a constant 2400 Hz wave). To match the line endings the old machine expects, each newline is converted into a carriage return followed by a line feed. Moreover, to account for the slow speed of the Superboard II, a pause of 10 NULL bytes (80 data bits, or 110 bits once the start and stop bits are counted) is inserted after each carriage return.
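If you want to convince yourself that these parameters really work out to 300 baud, the arithmetic is short enough to check by hand or with a few lines of Python. This is only a sketch that mirrors the constants in kcs_encode.py; the helper samples_per_cycle is a name invented for this example.

```python
FRAMERATE = 9600   # samples per second, as in kcs_encode.py
ONES_FREQ = 2400   # Hz
ZERO_FREQ = 1200   # Hz

def samples_per_cycle(freq, framerate=FRAMERATE):
    # Mirrors make_square_wave(): n low samples followed by n high samples
    return 2 * int(framerate / freq / 2)

one_bit = samples_per_cycle(ONES_FREQ) * 8     # 8 cycles at 2400 Hz
zero_bit = samples_per_cycle(ZERO_FREQ) * 4    # 4 cycles at 1200 Hz
baud = FRAMERATE / one_bit                     # bits per second
bits_per_byte = 1 + 8 + 2                      # start bit + data bits + stop bits
pause_seconds = 10 * bits_per_byte * one_bit / FRAMERATE   # 10 NULL bytes

print(one_bit, zero_bit, baud, round(pause_seconds, 3))    # 32 32 300.0 0.367
```

Both kinds of bit take 32 samples, so every bit lasts the same amount of time, and the pause after each carriage return comes to a bit more than a third of a second.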

To use this script, you now just need an old BASIC program to upload. Here's a really simple one (from the Superboard II manual):

10 PRINT "I WILL THINK OF A"
15 PRINT "NUMBER BETWEEN 1 AND 100"
20 PRINT "TRY TO GUESS WHAT IT IS"
25 N = 0
30 X = INT(RND(56)*99+1)
35 PRINT
40 PRINT "WHATS YOUR GUESS   ";
50 INPUT G
52 N = N + 1
55 PRINT
60 IF G = X THEN GOTO 110
70 IF G > X THEN GOTO 90
80 PRINT "TOO SMALL, TRY AGAIN ";
85 GOTO 50
90 PRINT "TOO LARGE, TRY AGAIN ";
100 GOTO 50
110 PRINT "YOU GOT IT IN";N;" TRIES"
113 IF N > 6 THEN GOTO 120
117 PRINT "VERY GOOD"
120 PRINT
130 PRINT
140 GOTO 10
150 END

Let's say this program is in a file guess.bas. Here's how to encode it using our script.

bash $ python3 kcs_encode.py guess.bas guess.wav
bash $ ls -l guess.wav
352652
bash $

Now, we have an audio file that's ready to go (note: it's rather impressive that a 476 byte input file has now expanded to a 350Kbyte audio file). You can listen to it here. Note that data doesn't start until about 5 seconds have passed.
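That file size can be sanity-checked with some arithmetic. The sketch below assumes that guess.bas is 476 bytes with 23 lines, each ending in a single newline, and that the WAV file carries the usual 44-byte PCM header; expected_wav_size is a name made up for this example.

```python
FRAMERATE = 9600         # samples per second
SAMPLES_PER_BIT = 32     # 9600 samples/s at 300 baud
BITS_PER_BYTE = 11       # start bit + 8 data bits + 2 stop bits
WAV_HEADER = 44          # assumption: canonical 44-byte PCM WAV header

def expected_wav_size(n_bytes, n_lines, leader=5, trailer=5):
    """Predict the WAV size for n_bytes of text containing n_lines newlines."""
    encoded_bytes = n_bytes + n_lines                  # each '\n' becomes '\r\n'
    data = encoded_bytes * BITS_PER_BYTE * SAMPLES_PER_BIT
    pauses = n_lines * 10 * BITS_PER_BYTE * SAMPLES_PER_BIT   # 10 NULL bytes per CR
    carrier = (leader + trailer) * FRAMERATE           # leader and trailer tone
    return WAV_HEADER + data + pauses + carrier        # one byte per sample

print(expected_wav_size(476, 23))    # 352652
```

Roughly a quarter of the file is the 10 seconds of leader and trailer tone, and another quarter is the pauses after each line.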

Now, the ultimate test. Does this audio file even work? To test it, we first hook up the audio input of the Superboard II to my Macbook.


Next, we go over to the Superboard II and type 'LOAD'


Next, we start playing the WAV file on the Mac. After a few seconds, you see data streaming in (at about 300 baud). Excellent!


Finally, the ultimate test. Let's play the game:


Awesome! Note for anyone under the age of 40: yes, this is the kind of stuff people did on these old machines--and we thought it was every bit as awesome as your shiny iPad. Maybe even more awesome. I digress.

(It occurs to me that fooling around on this machine might be the reason why I got an F in 7th grade math and had to attend summer school)

Just so you can get the full effect, here is a video of the upload in action. It's really hard to believe that systems were so slow back then. For big programs, it might take 5 minutes or more to load (even with the 8K limit):




Well, that's about it for now. The power of Python never ceases to amaze me--once again a problem that seems like it might be hard is solved with a short script using nothing more than a single built-in library module and some basic data manipulation. Next on the agenda: A Python script to decode WAV files back into text files.

By the way, if you take one of my classes, you can play with the Superboard II yourself (wink ;-).


Comments:
You, sir, are a geek's geek. This accounts for why we get on so well together!
 
Dave, This is amazing! Thanks!
(Yes, I am more than 40 years old and like this kind of stuff)
 
P-u-r-e geekiness, thanks for sharing!
 
10 PRINT "YOU, SIR, ARE A GEEK!"
20 PRINT
30 REM ** LET'S DO AN INFINITE LOOP
40 GOTO 10

RUN
 
my trs-80 cred has been trumped. You are the king !
 
Mike, Ah yes, the TRS-80. I don't ever remember using an original Model I, but there was a lab of TRS-80 Model IIIs at school that were used to teach programming--well, at least until they all disappeared in the middle of the night during a school break-in.
 
I'm under 40 and can tell this is amazing. Very nice work.
 
This is totally awesome!

I remember accidentally putting a program tape in my boom-box a few times and hearing the screeches.

Oh how I wish my parents had saved the Challenger 2P and my programs.
 
Since originally posting this I've been working on a decoder script to go in the other direction (WAV files back to text). Based on that, I've fixed a few minor glitches in my original script concerning the handling of newlines. Updated version should more accurately reflect the actual encoding used on the OSI.
 
I used my SWTPC 6800 to translate code into the Kansas City standard, and then recorded that with my Mac. Now I can just play it back using Audacity. However, I prefer to have my PC laptop talk with the SWTPC over 9600 baud serial, since that's a lot faster. I can't wait to see a program that can take the tape and turn it back into code though! And I'm under 40...haha!
 
We were on the model I level II (which means it had 16k instead of 4k), got it in 1980. Dad bought two model IIs and wrote some ad-hoc accounting software for them a few years later - the model II's were *very* unusual and they used 8" floppies. I think he rigged one to control some motors at some point after that. Funny how dad, after rigging up trs-80s, can barely handle checking his email these days like all the other dads....
 
Swooning with nostalgia!

I once met a guy who said that he'd actually spotted the design of an OSI Superboard II being reused in a later piece of embedded hardware. It was a printer buffer.

The outrage!
 
Wonderful stuff! Thanks for a great post (nostalgia and all; now I'll have to see if I can find the old rats-nest wired TI function "electronic" calculator from the late 60's my dad bought me .... still, not as cool as this).

- Yarko
 
Another nice article, Dave.

You state the Kansas City Standard as being based on a sine wave, yet the recording from the S. II looks square and you generate a square wave without explaining why that suffices. My understanding is that the cassette recorder's circuitry would often have the effect of smoothing the generated square wave somewhat, and since the computer would be counting the zero-crossings to decode the played tape, it doesn't matter whether it's sine or square; the KCS just said sine. That's accumulated hearsay so I'd be interested if you know more.

The null_pulse required to give the S. II time to process the just-entered line, does that appear on recordings from the S. II too?

And then some really nit-picky points... Newlines aren't being replaced with a carriage return, LF is being prefixed with a CR. AMPLITUDE being 64 isn't a peak-to-peak amplitude, which is 128. That confused me for a bit but it seems amplitude can also mean `peak amplitude' which is the absolute value of the swing about 0, in this case 64. And guess.wav is 352KB or 344KiB. ;-)
 
Ralph,

Although I don't have any definitive answers on the Sine-wave vs. Square wave, my understanding of KCS is that it's really just counting zero-crossings (in which case, the choice of waveform wouldn't seem to matter much). My audio recordings of the SB seem to indicate a simple square wave pulse. Likewise, using a square wave in my encoding seems to work fine.

The Null pulse is used by the SB when saving audio data so the encoding scheme I am using is meant to match that exactly. It's actually kind of weird--10 NULL bytes are inserted between the carriage return character and the newline character. Based on the observed behavior of my SB, I suspect that the cassette input is actually being polled (as opposed to being interrupt driven) and that an input time delay is needed to give the machine enough time to perform a screen scroll after each line. I would need to dig deeper to know for certain however.
 
Since posting this originally, I updated the encoding script slightly to target Python-3.1.2. Partly this is because I'm going to be doing more with it in the future.
 
Impressive! I had one of those, complete with a cabinet etc. Hacked the hardware to display reverse text, black letters on white, according to the MSB of the ascii code. Did some actual physics and math research with it, even some very crude audio signal processing with a homebuilt ADC. Now it seems so antique, weak, with no I/O but for the Kansas City format tape + whatever homebrew circuitry one concocted.
 
Awesome!

I'm very interested in this!

Currently, all the archiving software for the APF IM1 and most for the Bally/Astrocade are Windows programs. (I use Linux)

The 300 baud format for the Bally/Astrocade is Kansas City also. Bally's 1200 baud I'm not sure and then there is the IM.

Thanks for showing me that Python can be put to this task.
 
Wow, reading this I had a flashback to my 10th grade computer class. A room full of TRASH-80's (TRS-80) to play with.

Unfortunately, I was more dork than nerd.
 
Wow, that brought back memories. At the time I lived in Sweden and used to import the Ohio Scientific Superboard II's.
Not being happy with the Kansas City's 300 baud I stepped them up to reliably run at 1200 bd. Mine had a whopping 8MB RAM, if memory serves.
I also remember building a 5 1/4" floppy interface, where we had to manually turn the motor on and off (via code). At one point it was set up to detect the rotary signals from the phone and tell me the number dialed. But that might have been on another computer. Too long time ago...
Fun to see someone else who used them!
 
The 10 nulls between each line are to give the machine time to parse the line of BASIC into tokens and store it in memory.
 
The Kansas City Standard was created at a meeting that brought together the manufacturers of computers in the town of its name and was sponsored by Byte magazine. Part of the reason for the standard was the simplicity of the hardware needed to generate the tone and read it. Generating it was accomplished with switching between +5 and -5 V logic signal levels - thus the square wave seen in the output. Reading it was accomplished by a phase locked loop (available as a discrete circuit) cleaning up the input and then feeding it to an input pin on the microprocessor.
 
Awesome! I used to program in Basic on an Atari 800 when I was in elementary school. The initial cassette storage initially baffled me and also took forever to load anything. This really lays it out nicely and is a pretty fun way to revisit the technology.
 
You should set something up on your Macbook to receive strings of text and tweet them. So you are basically tweeting from the past.
 