Acoustic Modem Using Atmega

Data transmission over sound is used in many communication protocols, the most common being Dual-Tone Multi-Frequency signaling (DTMF). It is used to dial phone numbers and the frequency combinations chosen for the digits are very familiar to the general public. It was also used in early modems to request internet connections and its characteristic sound is nostalgic for many people.

However, to push the data rate higher, ultrasound (sound at frequencies above 20 kHz) is a better choice than audible sound, since it is relatively unaffected by the noise of human speech and, at the same time, does not disturb the user.

Nowadays, some cellphone and computer applications use ultrasound to transmit data. However, the channel’s characteristics, such as echo, high noise in the audible range, and hardware that is poorly suited to the ultrasonic range, are very difficult to handle and severely limit the data rate. Applications that use it to transmit data are usually academic; ultrasound is most useful in distance measurement, speed measurement, and sonar devices.

The device described in this report consists of a pair of microcontrollers, assisted by analog circuitry, that transmit data to each other through sound.

When a key is pressed on the keyboard connected to the transmitter microcontroller, a corresponding sound wave is generated by the speaker. The microphone converts the sound wave to an analog electric signal, which is converted to a digital signal and interpreted by the receiver. The data is then shown on an LCD display.

The initial design would have used ultrasound to transmit the information so as not to disturb the user, and the received data would have been converted back into the PS/2 keyboard protocol in order to mimic the keyboard itself. That way, a computer would see no difference between a signal received from the device and a signal from an actual keyboard. Due to unforeseen challenges, we opted for a less sophisticated design, which uses audible sound.

The idea for this project came from a lab in ECE 4760, the microcontroller design class, involving DTMF. We noticed that sound was an interesting way to transmit information and decided to create a transmitter-receiver pair using this method.

High Level Design

Logical Structure

At a high level, our system consists of two main components: a transmitter unit, and a receiver unit.

The transmitter consists of a keyboard, a microcontroller (TX-MCU), and a set of speakers. Whenever a key on the keyboard is pressed, the keyboard drives two signals, CLK and DATA. The TX-MCU samples DATA on the falling edge of CLK to build a data packet representing the key that was pressed, then transmits the packet as a combination of two alternating on/off tones (one at 8 kHz, another at 5 kHz) over the speakers, where one tone represents CLK and the other represents DATA.

It is important to note that the keyboard CLK is driven at a very high speed (roughly 13 kHz). At these speeds it is impossible to represent the signal as on/off 8 kHz or 5 kHz tones, as even a single period would not fit within the clock pulse, let alone the number of periods per pulse necessary to be recognized by the receiver. As a result, a significantly slower transmission rate must be used by the TX-MCU.
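The timing argument above can be checked with a few lines of arithmetic. The following Python sketch is only an illustration run on a PC, not part of the project firmware; the function name and the assumption of a 50% clock duty cycle are ours.

```python
# Compare one PS/2 clock pulse with one period of each audio tone.
PS2_CLK_HZ = 13_000        # approximate keyboard clock quoted in the text
TONES_HZ = (8_000, 5_000)  # CLK tone and DATA tone quoted in the text


def periods_per_pulse(tone_hz: float, pulse_s: float) -> float:
    """Number of tone periods that fit inside a pulse of the given length."""
    return tone_hz * pulse_s


clock_period_s = 1.0 / PS2_CLK_HZ   # about 77 microseconds
low_pulse_s = clock_period_s / 2    # ASSUMPTION: 50% duty cycle

for tone in TONES_HZ:
    n = periods_per_pulse(tone, low_pulse_s)
    print(f"{tone} Hz tone: {n:.2f} periods per keyboard clock pulse")
```

Both tones fit well under one period inside a keyboard clock pulse, which is why the pulse has to be stretched. For comparison, `periods_per_pulse(8_000, 0.025)` gives about 200 periods for the 25 mSec tone duration reported in the Results section.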

The tones output by the speakers are then picked up with a microphone, and the resulting analog electric signal passes through three stages of high-pass filters and two amplifiers before being input into a tone-decoding circuit. As its name suggests, the tone-decoding circuit consists of two tone decoders that are tuned to detect the presence of 8 kHz and 5 kHz signals, each driving its output low when its frequency is detected. In this way, digital CLK and DATA signals can be reconstructed. However, these signals tend to oscillate slightly when transitioning between low and high, which is undesirable. To counter this, each signal is low-pass filtered and passed through a comparator with hysteresis. The resulting signals are then suitable for the receiver microcontroller (RX-MCU) to use as inputs. Once again, the RX-MCU samples DATA on the falling edge of CLK to build a data packet representing the key that was pressed. From this, the RX-MCU is able to determine the corresponding character and output it to the LCD.

The user interacts with the overall system using the keyboard. Assuming there are no errors in transmission, valid key-presses on the keyboard will result in the corresponding character being output to the LCD after the previously output character, as in a typical text editor. Going past the end of either of the two LCD lines will result in the text wrapping around to the beginning of the other line. Aside from the typical alphanumeric keys, there are additional keys with functions associated with them.

Key	Result
<Shift>	Output alternate character
<Backspace>	Erase last character, move cursor back by one
<Enter>	Clear current line, move cursor to beginning of current line
<Esc>	Clear both lines, move cursor to beginning of first line

Design Decisions

Tone Generation

We considered several different options for generating tones at a target frequency.

One of the first methods that came to mind was using Direct Digital Synthesis (DDS) as in Lab 2 of this class. The obvious benefits of this approach were the ease and familiarity that came with having implemented it previously. One huge restriction at the time was that DDS was limited by the microcontroller to a maximum frequency of roughly 8 kHz, but we decided that, with some handling, the harmonics could be used.

We also considered using a pure hardware implementation, such as using a 555 timer or phase-shift oscillators.

Lastly, we considered low-passing a square wave, which could be generated either by the MCU or by the LM567 IC.

Even though it is not the most appropriate option, this design uses DDS. We had unexpected difficulties implementing some of the other options and opted for DDS, as we were more familiar and experienced with it.

Tone Detection

As before, we considered several different options for detecting tones at a target frequency.

One approach was to use a combination of an Analog-to-Digital Converter (ADC) and Goertzel’s algorithm, a procedure that detects the frequency component of a single tone in a signal and is primarily used in embedded systems. If the frequency of the signal was too high for the internal ADC, an external one could be used, or the frequency could be brought down by using the AD633 IC to multiply the signal by a sine wave with a frequency a little lower than the carrier.
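For readers unfamiliar with Goertzel’s algorithm, a minimal Python sketch follows. It is an illustration of the general technique only; the final design did not use it. The 40 kHz sample rate, the 200-sample block length, and all names are assumptions chosen so that both tones land exactly on a DFT bin.

```python
import math


def goertzel_power(samples, sample_rate_hz, target_hz):
    """Squared magnitude of the DFT bin closest to target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate_hz)  # index of the nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        # Second-order recurrence: one multiply and two adds per sample.
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def make_tone(freq_hz, sample_rate_hz, count):
    """Unit-amplitude sine wave used as a test signal."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz)
            for i in range(count)]


SAMPLE_RATE_HZ = 40_000  # ASSUMPTION: not a value taken from the report
BLOCK = 200              # ASSUMPTION: 5 mSec of samples per decision

tone_8k = make_tone(8_000, SAMPLE_RATE_HZ, BLOCK)
power_at_8k = goertzel_power(tone_8k, SAMPLE_RATE_HZ, 8_000)
power_at_5k = goertzel_power(tone_8k, SAMPLE_RATE_HZ, 5_000)
```

With an 8 kHz input, `power_at_8k` is large while `power_at_5k` is close to zero, so comparing each power against a threshold would yield the two on/off signals. Because only one bin is computed per tone, the cost is far lower than a full FFT, which is what makes the method attractive on a small MCU.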

Another approach was to use LM567 tone decoders, which involve a relatively simple hardware implementation and are capable of detecting frequencies up to 500 kHz.

Our first attempt used band-pass filters followed by rectification of the signal, but low-order analog filters proved insufficient for this application. We then decided to use LM567s.

Hardware

Hardware Overview

Our system consists of two main physical components: a transmitter circuit and a receiver circuit. The transmitter circuit is relatively simple, consisting only of a microcontroller, a keyboard, a low-pass filter to filter the output of the PWM, and a set of speakers to output the corresponding tones. The receiver circuit is significantly more complex, and can be subdivided into a microcontroller; a microphone circuit to pick up, filter, and amplify the tones; a tone-decoding circuit to detect the presence of certain frequencies in a signal; and a low-pass hysteresis circuit to clean up the resulting digital signal.

Hardware Design: Transmitter

As mentioned previously, the transmitter circuit is relatively straightforward, as most of the work is done in software. A schematic for the overall circuit can be found in the appendix. PIN B0 and PIN B2 of the TX microcontroller are connected to the PS/2 keyboard DATA and CLK lines respectively, with 330Ω resistors in between for protection. The PWM output at PORT B3 is passed through a low-pass filter to drive a set of speakers, which has its own power supply. For the low-pass filter, we needed a resistance high enough not to load the port pin, yet lower than the speaker input resistance of 30kΩ. We ultimately settled on a 10kΩ resistor. We then chose a capacitance of 2.2nF for a cut-off frequency of ~7.23 kHz. Though this is technically below 8 kHz, we were still able to reliably detect 8 kHz tones.
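The cut-off frequency follows from the standard first-order RC formula, f_c = 1 / (2πRC). The short Python sketch below reproduces the calculation and estimates how much an ideal RC low-pass filter attenuates each tone. It assumes an ideal, unloaded filter, and the function names are ours.

```python
import math


def rc_cutoff_hz(r_ohm: float, c_farad: float) -> float:
    """-3 dB frequency of a first-order RC filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def rc_lowpass_gain(f_hz: float, fc_hz: float) -> float:
    """Magnitude response of an ideal first-order low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (f_hz / fc_hz) ** 2)


fc = rc_cutoff_hz(10e3, 2.2e-9)  # component values quoted in the text
print(f"cut-off: {fc:.0f} Hz")
for tone in (5_000, 8_000):
    print(f"{tone} Hz tone passes at {rc_lowpass_gain(tone, fc):.2f} of its amplitude")
```

Under these assumptions the cut-off comes out at roughly 7.2 kHz, and the 8 kHz tone still passes at about two thirds of its amplitude, which is consistent with the tone remaining detectable.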

Hardware Design: Receiver

The receiver circuit is where most of the complexity lies, and can be subdivided into microphone, tone decoder, and hysteresis sub-circuits, which we will examine in more depth in the following sections. A schematic for the combined receiver circuit can be found in the appendix.

Microphone Sub-Circuit

The microphone stage is shown above. This is a modified version of the design presented in class to adapt the signal for Analog-to-Digital Conversion.

The microphone is set into a voltage divider with a 10kΩ resistor. This value provided the best results, since it sets the DC voltage on the microphone close to half of Vcc, which maximizes the AC voltage produced by a given change in the microphone’s resistance.

C1 blocks the DC voltage between the microphone and the second voltage divider, R2 and R3, which are chosen so that the voltage at the input of the amplifier is half of Vcc. This guarantees that the amplifier input never goes negative, which could not be handled with the power sources available. The two amplifiers are set to gains of 50 and 10, respectively. C2 and C3 block DC current in the amplifier stages, effectively turning the amplifiers into buffers for DC voltages. The amplifiers are implemented using the LF353 for its fairly high gain-bandwidth product.

Note this circuit has 3 high-pass filters: C1 and R2||R3, C2 and R5, and C3 and R7. Their values are set to cut-off frequencies of approximately 14, 7 and 14 kHz, respectively, to remove noise from human speech. Ideally, these values should have been adjusted to lower cut-off frequencies since the final design used 5 kHz and 8 kHz, but this could not be done due to time constraints.
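To give a sense of how much signal these cut-off frequencies cost at the tones actually used, the Python sketch below multiplies the magnitude responses of three ideal first-order high-pass sections. It assumes the sections do not load one another and ignores the amplifiers’ own frequency response; the names are ours.

```python
import math

CUTOFFS_HZ = (14_000, 7_000, 14_000)  # approximate values quoted in the text


def highpass_gain(f_hz: float, fc_hz: float) -> float:
    """Magnitude response of an ideal first-order high-pass filter."""
    ratio = f_hz / fc_hz
    return ratio / math.sqrt(1.0 + ratio * ratio)


def cascade_gain_db(f_hz: float, cutoffs_hz) -> float:
    """Combined gain in dB of independent first-order high-pass sections."""
    gain = 1.0
    for fc in cutoffs_hz:
        gain *= highpass_gain(f_hz, fc)
    return 20.0 * math.log10(gain)


for tone in (5_000, 8_000):
    print(f"{tone} Hz: {cascade_gain_db(tone, CUTOFFS_HZ):.1f} dB")
```

Under these assumptions the filters remove roughly 24 dB at 5 kHz and 15 dB at 8 kHz. The combined amplifier gain of 50 × 10 = 500 (about 54 dB) more than covers the loss, but lower cut-off frequencies would have left more margin.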

Tone Decoder Sub-Circuit

The tone decoder stage is shown above. The input is pin 3 and the output is pin 8.

There are two of these blocks, one for each frequency used. They are implemented with the LM567CN, in the configuration described in the data sheet. Rf, Cf and the center frequency f0 of the tone decoder are related by the following equation from the data sheet:

f0 ≈ 1 / (1.1 × Rf × Cf)

In this design, capacitors of 10nF and 5kΩ potentiometers were used to set the center frequency.

One of the problems encountered was that since C1 and C2 are electrolytic and maintain their charges for a long time, Rf had to be adjusted after every power cycle to achieve the proper center frequency.

Another problem encountered was that when transitioning between high and low states the tone decoder output would occasionally oscillate. This proved to be particularly troublesome, as the resulting falling edges would trick the receiver microcontroller into thinking that a new data bit had arrived.

Hysteresis Sub-Circuit

The solution to the oscillation problem mentioned above was to add an additional stage consisting of a low-pass filter and a comparator with hysteresis to smooth out the signal. The resulting circuit is shown above. This configuration is based on the one used in Lab 4 of this class.
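The effect of hysteresis can be illustrated in software. The Python sketch below compares a single-threshold comparator with a two-threshold (Schmitt trigger) comparator on a made-up falling transition that chatters around mid-rail. The voltages and thresholds are invented for illustration and are not measurements or component values from our circuit.

```python
def single_threshold(samples, threshold):
    """Plain comparator: output is high whenever the input exceeds threshold."""
    return [v > threshold for v in samples]


def schmitt_trigger(samples, low_threshold, high_threshold, initial_state=True):
    """Comparator with hysteresis: the state only changes when the input
    crosses the threshold on the far side of the current state."""
    state = initial_state
    out = []
    for v in samples:
        if state and v < low_threshold:
            state = False
        elif not state and v > high_threshold:
            state = True
        out.append(state)
    return out


def count_falling_edges(bits):
    """Number of high-to-low transitions, i.e. what the MCU counts as bits."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a and not b)


# HYPOTHETICAL decoder output (volts) falling from high to low with chatter.
NOISY = [5.0, 4.8, 2.4, 2.6, 2.3, 2.7, 2.2, 1.0, 0.2, 0.1]

naive_edges = count_falling_edges(single_threshold(NOISY, 2.5))
clean_edges = count_falling_edges(schmitt_trigger(NOISY, 1.5, 3.5))
```

On this example the plain comparator reports three falling edges for what is really one transition, while the comparator with hysteresis reports exactly one.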

Software

PS/2 Background

Before we begin discussing our code, a brief overview of the PS/2 protocol is in order. A typical PS/2 keyboard cable consists of four wires: Vcc, GND, CLK, and DATA. Vcc and GND can be connected to the microcontroller’s Vcc and ground. When a key is pressed, the keyboard begins to drive a clock signal on CLK. The frequency of the CLK signal can vary, but for the keyboard we are using it is approximately 13 kHz. On the falling edge of CLK, the value of DATA reflects the current bit. By repeatedly checking the value of DATA on the falling edge of CLK, a data packet can be constructed, as seen below.

PS/2 packets consist of 11 bits: a start bit of 0; the 8 bits of the scan code (more on this later), least significant bit first; an odd parity bit, whose purpose is to ensure that the number of 1s in the scan code and the parity bit together is always odd; and a stop bit, which should always be 1.
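The packet layout can be made concrete with a short Python sketch that builds and checks an 11-bit frame. This is an illustration written for this report, not the firmware itself, and the function names are ours.

```python
from typing import List, Optional


def ps2_frame_bits(scan_code: int) -> List[int]:
    """Build the 11-bit frame: start, 8 data bits (LSB first), parity, stop."""
    data = [(scan_code >> i) & 1 for i in range(8)]
    parity = 1 - (sum(data) % 2)  # odd parity: data bits plus parity have an odd number of 1s
    return [0] + data + [parity] + [1]


def ps2_decode(bits: List[int]) -> Optional[int]:
    """Return the scan code, or None if framing or parity is wrong."""
    if len(bits) != 11 or bits[0] != 0 or bits[10] != 1:
        return None
    data, parity = bits[1:9], bits[9]
    if (sum(data) + parity) % 2 != 1:
        return None
    return sum(bit << i for i, bit in enumerate(data))


frame = ps2_frame_bits(0x1C)  # 0x1C is the scan code for the A key
print(frame)
```

For 0x1C (binary 00011100) the data bits already contain three 1s, so the parity bit is 0. Flipping any single data bit in the frame makes `ps2_decode` return `None`.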

The aforementioned scan code is what represents the keystroke information. While a key is held, the keyboard repeatedly sends the scan code corresponding to the key being held down. Once the key is released, the keyboard sends two more scan codes: first a ‘break’ code of 0xF0 to signify that a key has been released, and then the scan code of the key to signal which key was released. A table of keys and their corresponding scan codes can be found in the Reference section. Using this knowledge, we are able to implement code to effectively handle keystroke information coming from the keyboard. In addition, we are also able to emulate this protocol with our transmitter.
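As a sketch of how the make/break sequence can be interpreted, the Python function below turns a stream of scan codes into press and release events. It is a simplified illustration with names of our choosing; extended codes are deliberately left out.

```python
BREAK_CODE = 0xF0  # sent by the keyboard just before the code of a released key


def key_events(scan_codes):
    """Translate a stream of scan codes into ('press' | 'release', code) pairs."""
    events = []
    releasing = False
    for code in scan_codes:
        if code == BREAK_CODE:
            releasing = True  # the next code names the key that was released
        else:
            events.append(("release" if releasing else "press", code))
            releasing = False
    return events


# Holding the A key (0x1C) long enough to repeat once, then letting go:
print(key_events([0x1C, 0x1C, 0xF0, 0x1C]))
```

A held key shows up as repeated press events, and the release is only recognized once both the break code and the following scan code have arrived.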

Software Overview

Like the hardware, the software for this project can be divided into two main components: a transmitter, and a receiver. By design, both are very similar in structure, and can be represented as a circular buffer with one reader and one writer. An INPUT module, which acts as the writer, constantly uses the PS/2 or a PS/2-like protocol to construct a key code corresponding to a certain character, and writes it to the circular buffer. At the same time, an OUTPUT module, which acts as the reader, constantly reads key codes out of the buffer and performs the appropriate operations. For the transmitter, the OUTPUT module transmits the key code via the speakers, and for the receiver, the OUTPUT module writes the character corresponding to the key code to the LCD. A diagram representing this structure can be seen below.
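The single-reader, single-writer circular buffer described above can be sketched as follows. This Python version only illustrates the data structure; the capacity, class name, and method names are assumptions and do not come from our code.

```python
class RingBuffer:
    """Fixed-size FIFO with one writer (INPUT) and one reader (OUTPUT)."""

    def __init__(self, capacity: int):
        self._slots = [0] * capacity
        self._capacity = capacity
        self._head = 0   # index of the next slot to read
        self._tail = 0   # index of the next slot to write
        self._count = 0  # number of key codes currently stored

    def put(self, key_code: int) -> bool:
        """Store a key code; return False (dropping it) if the buffer is full."""
        if self._count == self._capacity:
            return False
        self._slots[self._tail] = key_code
        self._tail = (self._tail + 1) % self._capacity
        self._count += 1
        return True

    def get(self):
        """Return the oldest key code, or None if the buffer is empty."""
        if self._count == 0:
            return None
        key_code = self._slots[self._head]
        self._head = (self._head + 1) % self._capacity
        self._count -= 1
        return key_code


buf = RingBuffer(4)  # HYPOTHETICAL capacity
for code in (0x1C, 0x32, 0x21):
    buf.put(code)
print(buf.get(), buf.get())  # oldest entries come out first
```

Because the writer only advances the tail and the reader only advances the head, key codes queued while the OUTPUT module is busy are delayed rather than lost, up to the capacity of the buffer.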

Transmitter/Receiver Generalized Structure

Though the two programs are very similar in structure, there are some subtle differences, which we explain in the following sections.

Software Design : Transmitter

Within our transmitter, the INPUT module uses the standard PS/2 protocol, as it is simply interfacing with the keyboard. It handles key scan codes, break codes, and extended codes, which are simply keys that are represented by two packets instead of one. The format of the PS/2 packets that it constructs is as follows.

It is important to note that the INPUT module saves the extracted key scan code as a 16-bit integer and uses the 9th bit position as a flag, which is set to 1 when the shift key is pressed. The key scan code that we transmit effectively becomes 9 bits long, and the resulting packet becomes 12 bits long.
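A minimal Python sketch of the 9-bit key code and the 12-bit packet is given below. The bit ordering and the choice to compute odd parity over all nine data bits are assumptions made for illustration, since the text above only fixes the lengths; the names are ours as well.

```python
SHIFT_FLAG = 1 << 8  # 9th bit position of the 16-bit key code


def make_key_code(scan_code: int, shift_held: bool) -> int:
    """Combine an 8-bit scan code with the shift flag."""
    return (scan_code & 0xFF) | (SHIFT_FLAG if shift_held else 0)


def modem_packet_bits(key_code: int):
    """Start bit, 9 data bits (LSB first), parity bit, stop bit: 12 bits."""
    data = [(key_code >> i) & 1 for i in range(9)]
    parity = 1 - (sum(data) % 2)  # ASSUMPTION: odd parity over all 9 data bits
    return [0] + data + [parity] + [1]


shifted_a = make_key_code(0x1C, shift_held=True)
packet = modem_packet_bits(shifted_a)
print(hex(shifted_a), len(packet))
```

Keeping the shift state inside the key code means the receiver does not have to track shift press and release events across separate packets.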

The OUTPUT module of the transmitter effectively emulates the PS/2 transmission protocol by alternately turning an 8 kHz tone on and off to simulate a clock signal, and by alternately turning a 5 kHz tone on and off to simulate a data signal. A more detailed explanation of the code can be found in the comments of the code itself, in the Appendix.

Software Design : Receiver

Within our receiver, the INPUT module uses a protocol very similar to PS/2 but with a 9-bit key scan code to account for the shift flag. The format of the modified PS/2 packets that it constructs is as follows.

The OUTPUT module of the receiver extracts the 9-bit key scan code from the modified PS/2 packet and uses the lower 8 bits (the original key scan code) to index into one of two character arrays, selected by the shift flag. The returned character is then output to the LCD. A more detailed explanation of the code can be found in the comments of the code itself, in the Appendix.
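The lookup step can be sketched in Python as shown below. Only a handful of keys are included, and dictionaries stand in for the two character arrays for brevity; the table names and function name are illustrative, not taken from our code.

```python
# Scan codes for a few keys (A, B, C, and 1), without and with shift.
UNSHIFTED = {0x1C: "a", 0x32: "b", 0x21: "c", 0x16: "1"}
SHIFTED = {0x1C: "A", 0x32: "B", 0x21: "C", 0x16: "!"}

SHIFT_FLAG = 0x100  # 9th bit of the received key code


def key_code_to_char(key_code: int):
    """Map a 9-bit key code to a character, or None if the key is unknown."""
    scan_code = key_code & 0xFF               # lower 8 bits: original scan code
    table = SHIFTED if key_code & SHIFT_FLAG else UNSHIFTED
    return table.get(scan_code)


print(key_code_to_char(0x01C), key_code_to_char(0x11C))
```

Returning `None` for unknown codes gives the caller a simple way to ignore keys that have no printable character.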

Results

Speed of Execution

Our receiver was extremely responsive. Since the code responsible for updating the LCD was typically able to finish executing by the time a new key was pressed, the primary limiting factor seemed only to be how often we were calling the lcd_display() task. After giving lcd_display() a release time of 50 mSec and testing with a PS/2 keyboard, we had some difficulty getting lcd_display() to lag behind. Only by repeatedly mashing multiple keys simultaneously were we able to get a noticeable delay, and even then the addition of a circular buffer ensured that no characters were lost, only output later.

Unfortunately, our transmitter was another story entirely. Since our transmitter worked by maintaining tones for a certain duration (say x mSec) to represent a bit, and since our packets were 12 bits long, our transmitter was guaranteed to take at least 12*x mSec to send a single packet. Add in a certain duration of silence separating consecutive packets (say y mSec), and our transmitter was guaranteed to take at least 12*x + y mSec to send a single packet. Although the transmitter wasn’t completely unresponsive (new keys could still be queued), the 12*x + y mSec/packet execution time was something we could not do away with. Instead, we had to find a way to minimize this transmission time as much as possible.

In the interest of finding a tolerable bit-rate without sacrificing accuracy, we decided to try gradually decrementing the tone and silence durations and running a series of trials to see how often errors occurred with these durations. Our results are summarized in the table below.

Tone Duration	Silence Duration	Transmission Time	Transmission Rate
100 mSec	250 mSec	1450 mSec/packet	~0.690 packets/sec
75 mSec	125 mSec	1025 mSec/packet	~0.976 packets/sec
50 mSec	125 mSec	725 mSec/packet	~1.379 packets/sec
25 mSec	60 mSec	360 mSec/packet	~2.778 packets/sec
20 mSec	60 mSec	300 mSec/packet	~3.333 packets/sec

At 20 mSec tone durations we begin to see a noticeable increase in error rate (accuracy will be discussed in the following section), so we decided to use a tone duration of 25 mSec with a silence duration of 60 mSec. With these durations we are able to get a transmission time of around 360 mSec per packet, which, while still not nearly as fast as the receiver, makes for a fairly responsive transmitter that can keep up reasonably well with the user.
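The transmission times and rates in the table follow directly from the 12*x + y expression. The Python sketch below recomputes them; the function names are ours, and the settings list simply repeats the tone and silence durations from the table.

```python
PACKET_BITS = 12  # length of the modified packet


def packet_time_ms(tone_ms: float, silence_ms: float, bits: int = PACKET_BITS) -> float:
    """Minimum time to send one packet: bits * tone duration + silence."""
    return bits * tone_ms + silence_ms


def packets_per_second(tone_ms: float, silence_ms: float) -> float:
    """Packet rate implied by the minimum packet time."""
    return 1000.0 / packet_time_ms(tone_ms, silence_ms)


SETTINGS = [(100, 250), (75, 125), (50, 125), (25, 60), (20, 60)]

for tone, silence in SETTINGS:
    print(f"{tone} mSec tone, {silence} mSec silence: "
          f"{packet_time_ms(tone, silence)} mSec/packet, "
          f"{packets_per_second(tone, silence):.3f} packets/sec")
```

Since each packet carries one key code, the chosen setting of 25 mSec and 60 mSec corresponds to a little under three key codes per second.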

Accuracy

As mentioned in the previous section, we gradually decremented tone and silence durations while running a series of trials to examine the effect on error rate. We ran three trials for each tone duration, where each trial consisted of queuing up the 26 letters of the alphabet and seeing which characters were dropped or incorrectly printed. The resulting error rates are summarized below.

Tone Duration	Trial 1 Error Rate	Trial 2 Error Rate	Trial 3 Error Rate	Average Error Rate
100 mSec	11.5%	7.7%	12.8%	12.8%
75 mSec	11.5%	11.5%	3.8%	9.0%
50 mSec	7.7%	11.5%	3.8%	7.7%
25 mSec	11.5%	7.7%	15.4%	11.5%
20 mSec	15.4%	38.5%	26.9%	26.9%

As we were originally pretty worried about the resolution we could get from the tone decoders, these results came as a pleasant surprise. We were able to decrease tone duration down to 25 mSec with seemingly little effect on error rate. Below 25 mSec, however, accuracy rapidly begins to deteriorate. By 15 mSec we were no longer able to accurately receive any packets.

These results show that while tone duration can still be an issue when set really low, errors in transmission seem to be related more to the volume of the speakers, as well as the speakers’ position in relation to the microphone (perhaps due to interference from reflections or noise). Unfortunately, we found these factors to be difficult to test (though a tradeoff naturally exists between the two). We did find, though, that even at moderately loud volumes the speakers typically needed to be held within 2-3 inches of the microphone for reliable transmission.

Safety

Our circuit can be generally considered to be safe. Though admittedly there are several exposed connections and tiny components that may present choking hazards, for the most part there are very few risks associated with a normal use case. Since the user only interacts with the keyboard, there is little reason for them to come into contact with an exposed pin let alone wind up with a component in their mouth. Even if they were to accidentally touch an exposed pin, all of the hardware in our system runs at less than 6V at low current, which should not pose a risk to the user. That being said, given more time we would have liked to reduce the size of our circuit and put it in an enclosure to further reduce the risk of someone accidentally touching the circuit.

Interference with Other Designs

Our system transmits packets through shrill, repetitious chirping noises, and for reliable transmission a reasonably loud volume is required. We find it reasonable that such a system would interfere with some of the other student designs, particularly the ones that attempted to do sound localization on a clapping or snapping sound. To our knowledge there were two such projects being developed, though since we never tested our design at the same time as these groups, we can’t know for certain if there actually would have been interference.

However, we DO know that the chirping noises are HIGHLY irritating, and while they might not have had an effect on the students’ projects, they definitely had an effect on the students themselves, ourselves included, given the volume and number of repetitions necessary to properly test the circuit. There were times it felt like our group was single-handedly responsible for at least half, if not all, of the noise pollution in the lab. A significant number of students’ demo videos are probably doomed to forever have the tell-tale chirping playing in the background. To all the students and staff who were stuck in the lab while we were testing, we are sorry.

Usability

The use case for our project is a very simple one: the user presses a key on the keyboard, and the corresponding character is output to the LCD. However, while understanding how to use it is fairly easy, actually getting it to work reliably takes some skill (and patience). While significantly improved, transmission times can still lead to delays between a key being pressed and the corresponding packet being transmitted. In addition, for transmission to work reliably, the user usually has to set the speakers to a reasonably loud volume and hold them relatively close to the microphone, which, as we established in the previous section, is not always pleasant. For these reasons we have to say that our project is easy to understand, but relatively hard to master, and probably should not be used for extended periods of time.

Conclusions

Summary

Sound is the main form of communication of most living creatures, and even though the high levels of noise and the complex channel characteristics make it less efficient for data transmission than electromagnetic waves, its use for data transmission in electronic devices is a valid option in some circumstances.

Interference from human speech and other noises must be taken into account and the position of the speakers and the microphone is critical.

In this design, sound was generated using Direct Digital Synthesis (DDS), which proved to be inappropriate given the limitations of the MCU. A more efficient choice would be simply low-pass filtering a PWM signal generated by the MCU.

Also worth mentioning are the general debugging guidelines we learned along the way. Before replanning the logistics of your design, it is important to make sure all prerequisites are met. For instance, if the hardware does not work and all connections seem to be correct, check that power is connected, check that the ICs are not inserted backwards, and look for short circuits. If debugging is taking too long, the problem is very likely something simple you are taking for granted.

Although we did not set goals for how high we wanted the data rate to be, we are certain it can be pushed much further. The current design uses a very simple protocol called On-Off Keying on two tones. Upgrading the code to use a more sophisticated protocol would increase the data rate. The hardware also imposed limitations since both the speakers and the microphone are designed to operate at audible frequencies.

Standards

We use the PS/2 protocol to interface our TX microcontroller with the keyboard. A very similar protocol is used by the TX microcontroller to transmit packets and by the receiver microcontroller to reconstruct these packets, though the packet structure is very slightly changed.

Intellectual Property

To implement the PS/2 protocol, we referenced code found on the Nerdkits website, expanded it to support additional characters and more robust error checking, and added a circular buffer to avoid dropping characters. The author of the code can be contacted here, and the page in question can be found in the References section.