Acoustic Modem Using Atmega


Data transmission over sound is used in many communication protocols, the most common being Dual-Tone Multi-Frequency signaling (DTMF). It is used to dial phone numbers and the frequency combinations chosen for the digits are very familiar to the general public. It was also used in early modems to request internet connections and its characteristic sound is nostalgic for many people.

However, to push the data rate higher, ultrasound (sound waves with frequencies above 20 kHz) is a better choice than audible sound, since it is relatively unaffected by noise from human speech and, at the same time, does not disturb the user.

Nowadays, some cellphone and computer applications use ultrasound to transmit data. However, the channel’s characteristics, such as echo, high noise in the audible range, and inadequate hardware for the ultrasonic range, are very difficult to handle and severely limit the data rate. Applications that use it to transmit data are usually academic. Distance and speed measurement and sonar devices are where ultrasound is most useful.

The device described in this report consists of a pair of microcontrollers assisted by analog circuitry, which transmit data to each other through sound.

When a key is pressed on the keyboard connected to the transmitter microcontroller, a corresponding sound wave is generated by the speaker. The microphone converts the sound wave to an analog electric signal, which is converted to a digital signal and interpreted by the receiver. The data is then shown on an LCD display.

The initial design would have used ultrasound to transmit the information so as not to disturb the user, and the data would have been converted back into the PS/2 keyboard protocol in order to mimic the keyboard itself. That way, a computer would experience no difference between a signal received from the device and a signal from an actual keyboard. Due to unforeseen challenges, we opted for a less sophisticated design, which uses audible sound.

The idea for this project came from a lab in ECE 4760, the microcontroller design class, involving DTMF. We noticed that sound was an interesting way to transmit information and decided to create a transmitter-receiver pair using this method.

High Level Design

Logical Structure

At a high level, our system consists of two main components: a transmitter unit, and a receiver unit.

High Level Block Diagram

The transmitter consists of a keyboard, a microcontroller (TX-MCU), and a set of speakers. Whenever a key on the keyboard is pressed, the keyboard drives two signals, CLK and DATA. The TX-MCU samples DATA on the falling edge of CLK to build a data packet representing the key that was pressed, then transmits the packet as a combination of two alternating on/off tones (one at 8 kHz, another at 5 kHz) over the speakers, where one tone represents CLK and the other represents DATA.

It is important to note that the keyboard CLK is driven at a very high speed (roughly 13 kHz). At these speeds it is impossible to represent the signal as on/off 8 kHz or 5 kHz tones, as even a single period would not fit within the clock pulse, let alone the number of periods per pulse necessary to be recognized by the receiver. As a result, a significantly slower transmission rate must be used by the TX-MCU.
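
The timing argument above can be checked with a little arithmetic. The figures below use only the approximate 13 kHz clock rate and the two tone frequencies already quoted; treating the clock as a square wave with a 50% duty cycle is an assumption made for illustration.

```latex
T_{\mathrm{CLK}} = \frac{1}{13\,\mathrm{kHz}} \approx 76.9\,\mu\mathrm{s},
\qquad
\frac{T_{\mathrm{CLK}}}{2} \approx 38.5\,\mu\mathrm{s}

T_{8\,\mathrm{kHz}} = \frac{1}{8\,\mathrm{kHz}} = 125\,\mu\mathrm{s},
\qquad
T_{5\,\mathrm{kHz}} = \frac{1}{5\,\mathrm{kHz}} = 200\,\mu\mathrm{s}
```

Both tone periods are longer than a full clock period, so not even one cycle of either tone fits inside a single clock pulse.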

The tones output by the speakers are then picked up with a microphone, and the resulting analog electric signal passes through three stages of high-pass filters and two amplifiers before being input into a tone-decoding circuit. As its name suggests, the tone-decoding circuit consists of two tone decoders that are tuned to detect the presence of 8 kHz and 5 kHz frequency signals, each driving a low output if its frequency is detected. In this way digital CLK and DATA signals can be reconstructed. However, these signals tend to oscillate slightly when transitioning between low and high, which is undesirable. To counter this, each signal is low-pass filtered and passed through a comparator with hysteresis. The resulting signals are then suitable for the receiver microcontroller (RX-MCU) to use as inputs. Once again, the RX-MCU samples DATA on the falling edge of CLK to build a data packet representing the key that was pressed. From this, the RX-MCU is able to determine the corresponding character and output it to the LCD.

The user interacts with the overall system using the keyboard. Assuming there are no errors in transmission, valid key-presses on the keyboard will result in the corresponding character being output to the LCD after the previously output character, as in a typical text editor. Going past the end of either of the two LCD lines will result in the text wrapping around to the beginning of the other line. Aside from the typical alphanumeric keys, there are additional keys with functions associated with them.

Key            Result
<Shift>        Output alternate character
<Backspace>    Erase last character, move cursor back by one
<Enter>        Clear current line, move cursor to beginning of current line
<Esc>          Clear both lines, move cursor to beginning of first line

Design Decisions

Tone Generation

We considered several different options for generating tones at a target frequency.

One of the first methods that came to mind was using Direct Digital Synthesis (DDS) as in Lab 2 of this class. The obvious benefits of this approach were the ease and familiarity that came with having implemented it previously. One huge restriction at the time was that DDS was limited by the microcontroller to a maximum frequency of roughly 8 kHz, but we decided that, with some handling, the harmonics could be used.

We also considered using a pure hardware implementation, such as using a 555 timer or phase-shift oscillators.

Lastly, we considered low-passing a square wave, which could be generated either by the MCU or the LM567 IC.

Even though it is not the most appropriate option, this design uses DDS. We had unexpected difficulties implementing some of the other options and opted for DDS, as we were more familiar and experienced with it.
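
The DDS approach mentioned above can be sketched with a phase accumulator. This is a minimal illustration, not the project's code: the 62.5 kHz sample rate (a 16 MHz clock divided by an 8-bit PWM period), the 32-bit accumulator width, and the function names `dds_increment` and `dds_step` are all assumptions.

```c
#include <stdint.h>

/* Assumed DDS sample rate: a 16 MHz clock with an 8-bit fast-PWM timer
   overflows at 16e6 / 256 = 62500 Hz. This figure is an assumption and
   is not taken from the report. */
#define DDS_SAMPLE_RATE_HZ 62500.0

/* Phase increment for a 32-bit accumulator:
   increment = f_out * 2^32 / f_sample, rounded to the nearest integer. */
uint32_t dds_increment(double f_out_hz)
{
    return (uint32_t)(f_out_hz * 4294967296.0 / DDS_SAMPLE_RATE_HZ + 0.5);
}

/* Advance the accumulator by one sample and return its top 8 bits,
   which would serve as the index into a 256-entry sine table. */
uint8_t dds_step(uint32_t *accumulator, uint32_t increment)
{
    *accumulator += increment;          /* wraps naturally at 2^32 */
    return (uint8_t)(*accumulator >> 24);
}
```

Under these assumptions an 8 kHz tone is built from fewer than eight samples per cycle, which gives a sense of why the usable DDS range tops out in that region.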

Tone Detection

As before, we considered several different options for detecting tones at a target frequency.

One approach was to use a combination of an Analog-to-Digital Converter (ADC) and Goertzel’s algorithm, a procedure that detects a single frequency component in a signal and is primarily used in embedded systems. If the frequency of the signal was too high for the internal ADC, an external one could be used, or the frequency could be brought down by using the AD633 IC to multiply the signal by a sine wave with a frequency slightly lower than the carrier.

Another approach was to use LM567 tone decoders, which involve a relatively simple hardware implementation and are capable of detecting frequencies up to 500 kHz.

Our first attempt used band-pass filters followed by rectification of the signal, but low-order analog filters proved insufficient for this application. We then decided to use LM567s.

Hardware

Hardware Overview

Our system consists of two main physical components: a transmitter circuit, and a receiver circuit. The transmitter circuit is relatively simple, consisting only of a microcontroller, a keyboard, a low-pass filter to filter the output of the PWM, and a set of speakers to output the corresponding tones. The receiver circuit is significantly more complex, and can be subdivided into a microcontroller, a microphone circuit to pick up, filter, and amplify the tones, a tone-decoding circuit to detect the presence of certain frequencies in a signal, and a low-pass hysteresis circuit to clean up the resulting digital signal.

Hardware Design: Transmitter

As mentioned previously, the transmitter circuit is relatively straightforward, as most of the work is done in software. A schematic for the overall circuit can be found in the appendix. PIN B0 and PIN B2 of the TX microcontroller are connected to the PS/2 keyboard DATA and CLK lines respectively, with 330Ω resistors in between for protection. The output of the PWM at PORT B3 is passed through a low-pass filter to drive a set of speakers, which has its own power supply. For the low-pass filter, we needed a resistor that was high enough so as not to load the port pin, yet low enough to be below the speaker input resistance of 30kΩ. We ultimately settled on a 10kΩ resistor. We then chose a capacitance of 2.2nF for a cut-off frequency of ~7.25 kHz. Though this is technically below 8 kHz, we were still able to reliably detect 8 kHz tones.
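
The cut-off figure can be reproduced from the component values above. The calculation below assumes an ideal first-order RC low-pass filter and ignores loading by the speaker input.

```latex
f_c = \frac{1}{2\pi R C}
    = \frac{1}{2\pi \cdot 10\,\mathrm{k\Omega} \cdot 2.2\,\mathrm{nF}}
    \approx 7.2\,\mathrm{kHz}

\left|H(f)\right| = \frac{1}{\sqrt{1 + (f/f_c)^2}},
\qquad
\left|H(8\,\mathrm{kHz})\right| \approx 0.67
```

Under that assumption an 8 kHz tone keeps roughly two thirds of its amplitude, which is consistent with the tone remaining detectable despite sitting slightly above the cut-off.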

Hardware Design: Receiver

The receiver circuit is where most of the complexity lies, and can be subdivided into microphone, tone decoder, and hysteresis sub-circuits, which we will examine in more depth in the following sections. A schematic for the combined receiver circuit can be found in the appendix.

Microphone Sub-Circuit

Receiver: Microphone Sub-Circuit

The microphone stage is shown above. This is a modified version of the design presented in class to adapt the signal for Analog-to-Digital Conversion.

The microphone is placed in a voltage divider with a 10kΩ resistor. This value gave the best results, since it sets the DC voltage on the microphone close to half of Vcc, which maximizes the ratio between the AC voltage and changes in the resistance of the microphone.

C1 isolates the DC voltage between the microphone and the second voltage divider, R2 and R3, which are chosen so that the voltage at the input of the amplifier is half of Vcc. This guarantees that the input of the amplifier will not be a negative voltage, which cannot be handled with the power sources available. The two amplifiers are set to gains of 50 and 10, respectively. C2 and C3 block the DC current of the amplifiers, effectively turning them into buffers for DC voltages. The amplifiers are implemented using the LF353 for its fairly high gain-bandwidth product.

Note this circuit has 3 high-pass filters: C1 and R2||R3, C2 and R5, and C3 and R7. Their values are set to cut-off frequencies of approximately 14, 7 and 14 kHz, respectively, to remove noise from human speech. Ideally, these values should have been adjusted to lower cut-off frequencies since the final design used 5 kHz and 8 kHz, but this could not be done due to time constraints.

Tone Decoder Sub-Circuit

Receiver: Tone Decoder Sub-Circuit

The tone decoder stage is shown above. The input is pin 3 and the output is pin 8.

There are two of these blocks, one for each frequency used. They are implemented with the LM567CN, in the configuration described in the data sheet. Rf, Cf and the center frequency f0 of the tone decoder are related by the following equation, as given in the data sheet:

$$f_0 \approx \frac{1}{1.1\, R_f C_f}$$

In this design, capacitors of 10nF and 5kΩ potentiometers were used to set the center frequency.

One of the problems encountered was that since C1 and C2 are electrolytic and maintain their charges for a long time, Rf had to be adjusted after every power cycle to achieve the proper center frequency.

Another problem encountered was that when transitioning between high and low states the tone decoder output would occasionally oscillate. This proved to be particularly troublesome, as the resulting falling edges would trick the receiver microcontroller into thinking that a new data bit had arrived.

Hysteresis Sub-Circuit

Receiver: Hysteresis Sub-Circuit

The solution to the oscillation problem mentioned above was to add an additional stage consisting of a low-pass filter and a comparator with hysteresis to smooth out the signal. The resulting circuit is shown above. This configuration is based on the one used in lab 4 of this class.
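
To illustrate why hysteresis suppresses these spurious edges, the comparator's behaviour can be modelled in software. This is only a behavioural sketch of a hardware stage: the type `schmitt_t`, the function `schmitt_update`, and any threshold values used with it are hypothetical and are not taken from the schematic.

```c
#include <stdbool.h>

/* Behavioural model of a comparator with hysteresis (Schmitt trigger). */
typedef struct {
    double low_threshold;   /* output goes low at or below this level   */
    double high_threshold;  /* output goes high at or above this level  */
    bool output;            /* current output state                     */
} schmitt_t;

/* Feed one input sample and return the (possibly unchanged) output.
   Inputs between the two thresholds never change the output, so small
   oscillations around a single switching point cannot produce edges. */
bool schmitt_update(schmitt_t *s, double input)
{
    if (input >= s->high_threshold) {
        s->output = true;
    } else if (input <= s->low_threshold) {
        s->output = false;
    }
    return s->output;
}
```

With thresholds of, say, 1.5 V and 3.5 V, an input wobbling around 2.5 V leaves the output untouched, whereas a plain comparator switching at 2.5 V would toggle on every wobble.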

Software

PS/2 Background

Before we begin discussing our code, a brief overview of the PS/2 protocol is in order. A typical PS/2 keyboard cable consists of four wires: Vcc, GND, CLK, and DATA. Vcc and GND can be connected to the microcontroller’s Vcc and ground. When a key is pressed, the keyboard begins to drive a clock signal on CLK. The frequency of the CLK signal can vary, but for the keyboard we are using the frequency is approximately 13 kHz. On the falling edge of CLK, the value of DATA reflects the current bit. By repeatedly checking the value of DATA on the falling edge of CLK, a data packet can be constructed, as seen below.

PS/2 Keyboard CLK and DATA Lines

PS/2 packets consist of 11 bits: a start bit of 0, the 8 bits of the scan code (more on this later), least significant bit first, an odd parity bit whose purpose is to ensure that the total number of 1’s in the scan code and the parity bit together is always odd, and a stop bit, which should always be 1.
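
The frame layout described above can be checked in code. The sketch below assumes the 11 bits have already been sampled into an array in arrival order, each stored as 0 or 1; the function name `ps2_decode_frame` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Validate an 11-bit PS/2 frame and extract its scan code.
   bits[0]    start bit (must be 0)
   bits[1..8] scan code, least significant bit first
   bits[9]    odd parity bit
   bits[10]   stop bit (must be 1)
   Returns true and writes *scan_code only if the frame is valid. */
bool ps2_decode_frame(const uint8_t bits[11], uint8_t *scan_code)
{
    uint8_t code = 0;
    uint8_t ones = 0;

    if (bits[0] != 0 || bits[10] != 1) {
        return false;                       /* bad start or stop bit */
    }
    for (int i = 0; i < 8; i++) {
        if (bits[1 + i]) {
            code |= (uint8_t)(1u << i);     /* LSB arrives first */
            ones++;
        }
    }
    if (bits[9]) {
        ones++;
    }
    if ((ones & 1u) == 0) {
        return false;                       /* odd parity violated */
    }
    *scan_code = code;
    return true;
}
```

For example, the scan code 0x1C contains three 1 bits, so a valid frame carrying it has a parity bit of 0.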

The aforementioned scan code is what represents the keystroke information. While a key is held, the keyboard repeatedly sends the scan code corresponding to the key being held down. Once the key is released, the keyboard sends two additional codes: first a ‘break’ code of 0xF0 to signify that a key has been released, and then the scan code of the key to signal which key was released. A table of keys and their corresponding scan codes can be found in the Reference section. Using this knowledge, we are able to implement code to effectively handle keystroke information coming from the keyboard. In addition, we are also able to emulate this protocol with our transmitter.
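
The press/release sequence described above lends itself to a small state machine. The following is a sketch under simplifying assumptions: it handles only the 0xF0 break prefix, not extended codes, and the names `ps2_parser_t`, `key_event_t` and `ps2_feed` are hypothetical rather than taken from the project code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PS2_BREAK_PREFIX 0xF0u

typedef enum { KEY_NONE, KEY_PRESSED, KEY_RELEASED } key_event_t;

typedef struct {
    bool break_pending;   /* true after 0xF0 until the next scan code */
} ps2_parser_t;

/* Feed one received scan code. Returns KEY_NONE for the break prefix
   itself; otherwise stores the key in *key and reports whether it was
   pressed (or auto-repeated) or released. */
key_event_t ps2_feed(ps2_parser_t *p, uint8_t code, uint8_t *key)
{
    if (code == PS2_BREAK_PREFIX) {
        p->break_pending = true;
        return KEY_NONE;
    }
    *key = code;
    if (p->break_pending) {
        p->break_pending = false;
        return KEY_RELEASED;
    }
    return KEY_PRESSED;
}
```

Tracking releases this way is what makes it possible to know whether a modifier such as shift is still held when the next key arrives.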

Software Overview

Like the hardware, the software for this project can be divided into two main components: a transmitter, and a receiver. By design, both are very similar in structure, and can be represented as a circular buffer with one reader and one writer. An INPUT module, which acts as the writer, constantly uses the PS/2 or a PS/2-like protocol to construct a key code corresponding to a certain character, and writes it to the circular buffer. At the same time, an OUTPUT module, which acts as the reader, constantly reads key codes out of the buffer and performs the appropriate operations. For the transmitter, the OUTPUT module transmits the key code via the speakers, and for the receiver, the OUTPUT module writes the character corresponding to the key code to the LCD. A diagram representing this structure can be seen below.

Transmitter/Receiver Generalized Structure
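
A single-writer, single-reader circular buffer of the kind described above might look like the sketch below. The buffer size, the 16-bit element type, and the names `key_buffer_t`, `key_buffer_put` and `key_buffer_get` are assumptions made for illustration; the project's own buffer may differ.

```c
#include <stdbool.h>
#include <stdint.h>

#define KEY_BUFFER_SIZE 16u   /* hypothetical capacity (one slot stays empty) */

typedef struct {
    uint16_t data[KEY_BUFFER_SIZE];
    volatile uint8_t head;    /* index of the next slot to write (INPUT)  */
    volatile uint8_t tail;    /* index of the next slot to read (OUTPUT)  */
} key_buffer_t;

/* Called by the INPUT module. Returns false if the buffer is full. */
bool key_buffer_put(key_buffer_t *b, uint16_t code)
{
    uint8_t next = (uint8_t)((b->head + 1u) % KEY_BUFFER_SIZE);
    if (next == b->tail) {
        return false;         /* full: the key code is not stored */
    }
    b->data[b->head] = code;
    b->head = next;
    return true;
}

/* Called by the OUTPUT module. Returns false if the buffer is empty. */
bool key_buffer_get(key_buffer_t *b, uint16_t *code)
{
    if (b->tail == b->head) {
        return false;         /* empty: nothing to output */
    }
    *code = b->data[b->tail];
    b->tail = (uint8_t)((b->tail + 1u) % KEY_BUFFER_SIZE);
    return true;
}
```

Because only the writer modifies `head` and only the reader modifies `tail`, the two modules can run at different rates without overwriting each other's data, which is how key codes can be queued while a slow output is still in progress.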

Though the structure of the two programs is very similar, there are some subtle differences that we explain in the following sections.

Software Design : Transmitter

Within our transmitter, the INPUT module uses the standard PS/2 protocol, as it is simply interfacing with the keyboard. It handles key scan codes, break codes, and extended codes, which are simply keys that are represented by two packets instead of one. The format of the PS/2 packets that it constructs is as follows.

Transmitter INPUT packet

It is important to note that the INPUT module saves the extracted key scan code as a 16-bit integer and uses the 9th bit position to indicate whether or not the shift key was pressed. The key scan code that we transmit effectively becomes 9 bits long, and the resulting packet becomes 12 bits long.
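
One way to picture the 12-bit packet is sketched below. The exact bit layout is an assumption modelled on the standard PS/2 frame: a start bit of 0, the 9 data bits least significant bit first (so the shift flag is sent last), an odd parity bit computed over all 9 data bits, and a stop bit of 1. The names `SHIFT_FLAG`, `make_key_code` and `build_packet` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SHIFT_FLAG 0x0100u    /* 9th bit of the 16-bit key code */

/* Combine an 8-bit scan code with the shift state into a 9-bit key code. */
uint16_t make_key_code(uint8_t scan_code, bool shift_held)
{
    return (uint16_t)(scan_code | (shift_held ? SHIFT_FLAG : 0u));
}

/* Expand a 9-bit key code into 12 packet bits in transmission order
   (assumed layout: start, 9 data bits LSB first, odd parity, stop). */
void build_packet(uint16_t key_code, uint8_t bits[12])
{
    uint8_t ones = 0;

    bits[0] = 0;                                   /* start bit */
    for (int i = 0; i < 9; i++) {
        bits[1 + i] = (uint8_t)((key_code >> i) & 1u);
        ones = (uint8_t)(ones + bits[1 + i]);
    }
    bits[10] = (uint8_t)((ones & 1u) ? 0 : 1);     /* make total of 1s odd */
    bits[11] = 1;                                  /* stop bit */
}
```

Each entry of `bits` would then be sent as one tone interval by the OUTPUT module.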

The OUTPUT module of the transmitter effectively emulates the PS/2 transmission protocol by alternately turning an 8 kHz tone on and off to simulate a clock signal, and by alternately turning a 5 kHz tone on and off to simulate a data signal. A more detailed explanation of the code can be found in the comments of the code itself, in the Appendix.

Software Design : Receiver

Within our receiver, the INPUT module uses a protocol very similar to PS/2 but with a 9-bit key scan code to account for the shift flag. The format of the modified PS/2 packets that it constructs is as follows.

Receiver INPUT packet

The OUTPUT module of the receiver extracts the 9-bit key scan code from the modified PS/2 packet and uses the lower 8 bits (the original key scan code) to index into one of two character arrays, determined by the shift flag. The returned character is then output to the LCD. A more detailed explanation of the code can be found in the comments of the code itself, in the Appendix.
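
The two-array lookup described above can be sketched as follows. Only a handful of entries are filled in, using scan codes from PS/2 scan code set 2; the table names, the function `key_code_to_char`, and the convention that 0 means "no printable character" are assumptions for illustration.

```c
#include <stdint.h>

#define SHIFT_FLAG 0x0100u    /* 9th bit of the key code marks shift */

/* Sparse lookup tables indexed by the 8-bit scan code.
   Entries that are not listed default to 0. */
static const char unshifted_chars[256] = {
    [0x1C] = 'a', [0x32] = 'b', [0x21] = 'c', [0x16] = '1'
};
static const char shifted_chars[256] = {
    [0x1C] = 'A', [0x32] = 'B', [0x21] = 'C', [0x16] = '!'
};

/* Translate a 9-bit key code into a character, or 0 if none is mapped. */
char key_code_to_char(uint16_t key_code)
{
    uint8_t scan_code = (uint8_t)(key_code & 0xFFu);
    return (key_code & SHIFT_FLAG) ? shifted_chars[scan_code]
                                   : unshifted_chars[scan_code];
}
```

On a memory-constrained MCU, tables of this size would typically be placed in flash rather than RAM.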

Results

Speed of Execution

Our receiver was extremely responsive. Since the code responsible for updating the LCD was typically able to finish executing by the time a new key was pressed, the primary limiting factor seemed only to be how often we were calling the lcd_display() task. After giving lcd_display() a release time of 50 mSec and testing with a PS/2 keyboard, we had some difficulty getting lcd_display() to lag behind. Only by repeatedly mashing multiple keys simultaneously were we able to get a noticeable delay, and even then the addition of a circular buffer ensured that no characters were lost, only output later.

Unfortunately, our transmitter was another story entirely. Since our transmitter worked by maintaining tones for a certain duration of time (say x mSec) to represent a bit, and since our packets were 12 bits long, our transmitter was guaranteed to take at least 12*x mSec to send a single packet. Add in a certain duration of silence between bits (say y mSec), and our transmitter was guaranteed to take at least 12*x + y mSec to send a single packet. Although the transmitter wasn’t completely unresponsive (new keys could still be queued), the 12*x + y mSec/packet execution time was something we could not do away with. Instead, we had to find a way to minimize this transmission time as much as possible.

In the interest of finding a tolerable bit-rate without sacrificing accuracy, we decided to try gradually decrementing the tone and silence durations and running a series of trials to see how often errors occurred with these durations. Our results are summarized in the table below.

Tone Duration    Silence Duration    Transmission Time    Transmission Rate
100 mSec         250 mSec            1450 mSec/packet     ~0.690 packets/sec
75 mSec          125 mSec            1025 mSec/packet     ~0.976 packets/sec
50 mSec          125 mSec            725 mSec/packet      ~1.379 packets/sec
25 mSec          60 mSec             360 mSec/packet      ~2.778 packets/sec
20 mSec          60 mSec             300 mSec/packet      ~3.333 packets/sec
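
The figures in the table follow directly from the 12*x + y formula given earlier. The short sketch below reproduces them; the function names `packet_time_ms` and `packets_per_second` are hypothetical.

```c
#include <stdint.h>

#define PACKET_BITS 12u   /* bits per transmitted packet */

/* Minimum time to send one packet: 12 tone intervals plus the silence. */
uint32_t packet_time_ms(uint32_t tone_ms, uint32_t silence_ms)
{
    return PACKET_BITS * tone_ms + silence_ms;
}

/* Corresponding upper bound on the packet rate. */
double packets_per_second(uint32_t tone_ms, uint32_t silence_ms)
{
    return 1000.0 / (double)packet_time_ms(tone_ms, silence_ms);
}
```

For instance, `packet_time_ms(25, 60)` gives 360 mSec, matching the fourth row of the table.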

At 20 mSec tone durations we began to see a noticeable increase in error rate (accuracy is discussed in the following section), so we decided to use a tone duration of 25 mSec with a silence duration of 60 mSec. With these durations we get a transmission time of around 360 mSec per packet, which, while still not nearly as good as the receiver, made for a fairly responsive transmitter that could keep up reasonably well with the user.

Accuracy

As mentioned in the previous section, we gradually decremented tone and silence durations while running a series of trials to examine the effect on error rate. We ran three trials for each tone duration, where each trial consisted of queuing up the 26 letters of the alphabet and seeing which characters were dropped or incorrectly printed. The results, expressed as the percentage of characters in error, are summarized below.

Tone Duration    Trial 1 Error Rate    Trial 2 Error Rate    Trial 3 Error Rate    Average Error Rate
100 mSec         11.5%                 7.7%                  12.8%                 12.8%
75 mSec          11.5%                 11.5%                 3.8%                  9.0%
50 mSec          7.7%                  11.5%                 3.8%                  7.7%
25 mSec          11.5%                 7.7%                  15.4%                 11.5%
20 mSec          15.4%                 38.5%                 26.9%                 26.9%
