VoIPey Valentine’s Day: Demonstrating V.61 Analog Simultaneous Voice and Data

The way we communicate has changed substantially over the past few decades. What was once the most important staple of communication at home, namely the POTS land-line, is slowly disappearing altogether in favour of VoIP and mobile based services.

Along with the loss of the landline comes the loss of voice-band technologies such as fax and dial-up modems, which face a difficult transition path to IP owing to their signal integrity needs. While the original FTTP NBN in Australia had two UNI-V ports for POTS emulation and could offer a landline, many providers were not willing to guarantee operation of these voice-band devices. The latest Coalition NBN with mixed technology does not offer a voice port at all, so users are forced back to VoIP ATAs with all their quirks: contention, latency, echo and jitter buffering. It is expected that these technologies will also fade once replacement services get a foothold (e.g. PDF scans via e-mail).

A week or so ago, you might have wondered why I was busy resurrecting two Netcomm Roadster II 56k USB modems. In a stroke of inspiration, I decided that not having anyone to connect with on Valentine’s Day shouldn’t stop my two modems from hooking up ;).

Rarely Used: ASVD and DSVD

Flash back to a time when computers were slow, and mobile phones were neither cheap nor ubiquitous. Home phones were king, and most people could only afford one line at home. Modems existed, but data rates were only slowly increasing; think around 14.4kbit/s TCM. The internet wasn’t yet the “sure way” to get information as it is now, and people made direct modem-to-modem calls to get onto BBSes to download files, or even to connect directly to a friend’s computer to transfer a few files or play a network game or two.

Once the modems were dialled and a data link was established, the phone line was essentially tied up. From then on, everything was “digital”. Digital signal processing was not as advanced as it is today, and computers weren’t guaranteed to have sound cards, nor even a CPU that could do real-time compression/decompression of digital audio. The best that most could do was a keyboard-to-keyboard chat. If you wanted to talk, most of the time, you really had to hang up and dial again.

Wouldn’t it be nice if somehow, we could use the line to talk while the data was being transmitted?

This is where the concept of simultaneous voice and data (SVD) comes in. The most significant SVD modes were the ITU-standardized V.61 analog SVD (ASVD), and V.70 digital SVD (DSVD).

In ASVD, the voice received by the modem is encoded by offsetting the data constellation from its ideal position. As a result, the receiver decodes both line noise and the intended voice signal as the output audio.


The advantage of this type of modulation is that it is relatively simple to implement, although there are a few intricacies in regards to control and synchronization that the ITU has thrown in. Computationally, it really just involves taking the digitized audio from an ADC and using that to generate your symbol offset, and doesn’t involve any complicated compression at all.


As a result, ASVD is limited in its throughput and efficiency. The base mode negotiates a symmetrical 4800bps connection with voice, with optional revert-to-data-only 14.4kbit/s mode when no voice is detected. Rockwell/Conexant chipsets also have ML144/ML288 modes which appear to offer higher speeds (up to 9600bps data in ML144, and 14400bps data in ML288 with simultaneous voice) and better audio quality at the same bit-rate.
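
The constellation-offset idea can be illustrated with a toy sketch. This is not the V.61 algorithm: the four-point constellation, the `depth` scaling factor and the function names are all assumptions made purely for illustration, and there is no channel noise, control signalling or synchronization.

```python
# Toy illustration only: NOT the V.61 algorithm.
# Four hypothetical constellation points (QPSK-like), one per 2-bit symbol.
POINTS = {0: 1 + 1j, 1: -1 + 1j, 2: -1 - 1j, 3: 1 - 1j}


def modulate(symbols, audio, depth=0.3):
    """Offset each ideal point by a scaled pair of audio samples (I and Q)."""
    return [POINTS[s] + depth * complex(a_i, a_q)
            for s, (a_i, a_q) in zip(symbols, audio)]


def demodulate(received, depth=0.3):
    """Nearest ideal point is the data; the leftover offset is the audio."""
    symbols, audio = [], []
    for r in received:
        s = min(POINTS, key=lambda k: abs(r - POINTS[k]))
        residual = (r - POINTS[s]) / depth
        symbols.append(s)
        audio.append((residual.real, residual.imag))
    return symbols, audio


data = [0, 3, 1, 2]
voice = [(0.5, -0.2), (0.0, 0.9), (-1.0, 0.4), (0.3, 0.3)]  # samples in [-1, 1]
rx_data, rx_voice = demodulate(modulate(data, voice))
```

On a real line, any noise would land in the same residual as the voice, which is why the receiver hears both. The larger the offset allowed for audio, the less margin is left before a symbol is mistaken for its neighbour, which fits with the modest data rates of ASVD.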

DSVD was a big upgrade on this, using a single physical data connection whose bitrate is shared between voice and data depending on whether the voice channel is in use. The voice is compressed by an audio codec into an 8kbit/s G.729a stream, with voice-activity detection (VAD) suppressing the voice stream when no speech is detected. This stream is multiplexed with the data circuit over the one connection. This was more computationally expensive.
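
As a rough sketch of the sharing arithmetic, assuming the voice stream simply subtracts its 8kbit/s from the link and ignoring any framing or multiplexing overhead (which V.70 does have), the data throughput might be estimated as below. The 28800bps link rate is a hypothetical example, not a figure from a real DSVD call.

```python
VOICE_RATE_BPS = 8000  # G.729a stream rate quoted in the text


def data_rate_bps(link_rate_bps, voice_active):
    """Data throughput left on a shared link, ignoring framing overhead."""
    if not voice_active:
        return link_rate_bps
    return max(link_rate_bps - VOICE_RATE_BPS, 0)


link = 28800  # hypothetical link rate in bits per second
talking = data_rate_bps(link, voice_active=True)
silent = data_rate_bps(link, voice_active=False)
```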

Other rarer methods of “SVD” were also developed, such as VoiceView which alternated between data and voice signals but did not achieve traction in the same way that ITU-standardized V.61 ASVD (Voicespan/Audiospan) and V.70 DSVD did. Despite this, it was rare to actually have an SVD connection in use – I’ve never dialled into a service that supported it and I’ve never actually heard an SVD handshake before.

Partner Needed: For SVD Connection

Both ASVD and DSVD technologies were not necessarily standard features of modems. In fact, many earlier modems were data-only modems which could not handle fax or voice, and some later modems were data-fax only. Voice modems possess an internal voice codec which allows for recording and playback from phone lines and were more likely, but not required, to offer SVD capabilities.

Thanks to the proliferation of Rockwell/Conexant chipsets in external modems, many of the later voice modems did have ASVD capabilities, because ASVD came standard on the RCV144 and newer series chipsets, where it was marketed as Audiospan. Audiospan was based on V.61, the same technology AT&T Paradyne marketed as Voicespan, with their modems capable of using a regular desktop phone connected to the modem during data calls.

DSVD, however, remained an option on the Rockwell/Conexant chipsets and required an external co-processor to function. DSVD was also hampered by the fact that there were numerous incompatible voice codecs used. While the ITU recommended G.729a, Rockwell championed DigiTalk instead and USR used Truespeech. Others had their own VQ/CELP algorithms as well. The added cost and incompatibility made it rare to find DSVD capable modems.

As I had a relatively extensive set of modems, I was semi-confident that I could find something that worked. In the end, my search for a DSVD capable pair was hampered, as even when the hardware seemed capable, the firmware seemed to have restrictions that prevented its operation.

To find out whether a modem is capable of SVD, you need to issue AT-SMS=?


In this case, the Netcomm Roadster II 56k USB modem reports support for data, DSVD, Audiospan and Automatic modes. However, for some reason, the maximum and minimum bitrates are fixed at 4800bps. The final parameter, symbol rate, must remain at 0.


Sadly, it seems this is the default response for a Rockwell/Conexant chipset with no DSVD co-processor: it rejects DSVD, and any Audiospan selection needing the ML144/ML288 modes, with an error. At least the modem can still do ASVD.

Get a Room: Hooking Up

Given that I’ve been without a land-line for over five years now, getting the two modems to hook up is a little less than straightforward. The POTS network relies on having loop current to pass signals from device to device, and it also generates the necessary call-progress tones, ring signals, etc to make the devices operate as intended.

Local Loop Current Generator

A long time ago, before I even had regular access to the internet, I had an idea that I would like to turn my fax machine into a printer. Some commercial phone-line emulators retailed for AU$250 and upwards, and that really wasn’t an economical solution.

After some experimentation, I found (rather by accident) that if I wired a 9V battery in series with a phone cable, I could speak into one phone and the voice would come out of the other.

Plug 1                     + 9V -                          Plug 2
Tip  ----------------------|    |------------------------- Tip
Ring ----------------------------------------------------- Ring

Now that I am a little older, I better understand why this was the case. As it turns out, each POTS device presents a DC impedance of about 200 ohms, and with this loop, the equivalent resistance is about 400 ohms, so a loop current of 22.5mA or so should flow (normal loop current is 12-80mA).
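
The loop current figure is just Ohm’s law applied to the series loop. A minimal sketch, using the 9V battery and the approximate 200 ohm per-device figure from the text (the function name is my own):

```python
def loop_current_ma(battery_volts, device_resistances_ohms):
    """Ohm's law for a series loop: I = V / sum(R), in milliamps."""
    return 1000.0 * battery_volts / sum(device_resistances_ohms)


current = loop_current_ma(9.0, [200.0, 200.0])  # two devices in series
in_range = 12.0 <= current <= 80.0              # range quoted in the text
```

The battery’s own internal resistance and the cable resistance are ignored here, so the real current would be a little lower.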

While this was perfectly acceptable for just communication, it didn’t generate any dial-tone, ringing, busy signal, etc. This required configuring the modems for blind dialling (X3) and answering manually amongst other issues.

Analog Telephone Adapter (ATA)

Thanks to the proliferation of VoIP technology, ATAs came onto the market and often emulated one, two or four phone lines for use with VoIP service providers. In a pinch, you could probably sign them all up to register with a free VoIP service (e.g. IdeaSIP, IPKall, Linphone, etc) so as to be able to dial from port to port, which would be fine for regular voice calls especially if you didn’t worry about data usage in and out of your network over the internet, or the security of your calls which are “in the clear”.


Of course, an ATA is not just an ATA. Different ATAs behave differently and have different features. My first ATA was a Netcomm V100 which I promptly returned because someone else had tampered with it before I received it, and it was really restricted as to how it could be configured.


The second ATA I managed to get was an unwanted ZyXEL Prestige 2302R. This one was bundled by overseas VSPs and is both a router and a two-line VoIP ATA. Earlier ATAs often integrated a router so that the ATA could manage QoS and ensure better voice quality. This only added to the configuration confusion for users who didn’t want the routing feature (i.e. hook WAN to your LAN), and it caused frustration because the routers were often very second-rate and could easily leave you with a NAT-behind-NAT issue.

The reason this ATA was almost given away to me became clear when I started using it. While it has ample configuration for many things, the VoIP side was not very configurable. Only G.711a/u and G.729a were available for codecs, with no jitter buffer configuration, or even line impedance configuration. The available audio gain settings (+1/0/-1) often resulted in anything from terrible echo to full-on feedback when both lines were used; even -1 was still too loud, and the echo return loss was shocking. There were also quirks with the password configuration for it, with occasional authentication issues.

The whole unit is based around the Infineon (ADMtek) ADM5120P Network Processor SoC married to a Texas Instruments TNETV2402PGE VoIP DSP.

My favourite ATA was the Linksys/Cisco PAP2T. I actually have three units myself, and they have been serving me well over the past few years.


The main reason I favoured the PAP2T was its configurability and wide codec support compared to its competition. You had very fine configurability of the generated tones, so your line can sound Australian, and you could configure the line impedance to match the Australian 220+820||120nF complex standard and reduce the echo. In fact, I’ve measured echo-return loss figures greater than 25dB in some cases. You also had very fine-grained gain settings for both input and output, and the ability to disable all echo cancellers which helps improve echo, signal to noise and ensures modems are happy. Additional hacks included being able to set the RTP Packetization interval to 10ms to ensure the lowest latency possible, and being able to set the jitter buffer to a fixed size to prevent adjustment during a call causing dropped or repeated samples. You could even do direct IP calling which most other boxes could not do.
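
To see why the impedance setting matters, here is a minimal sketch that evaluates the complex network at a given frequency and the return loss against some other termination. It assumes the network is read as 220 ohms in series with (820 ohms in parallel with 120nF), and it uses the standard reflection-coefficient formula. It only loosely relates to the echo-return loss I measured, which also depends on the hybrid and the rest of the signal path.

```python
import math


def complex_impedance(freq_hz, r_series=220.0, r_parallel=820.0, c_farads=120e-9):
    """Series resistor plus a resistor in parallel with a capacitor."""
    omega = 2 * math.pi * freq_hz
    admittance = 1 / r_parallel + 1j * omega * c_farads
    return r_series + 1 / admittance


def return_loss_db(z_line, z_termination):
    """Return loss from the reflection coefficient; infinite when matched."""
    gamma = abs((z_line - z_termination) / (z_line + z_termination))
    return math.inf if gamma == 0 else -20 * math.log10(gamma)


z_1k = complex_impedance(1000.0)           # network impedance at 1kHz
mismatch_db = return_loss_db(z_1k, 600.0)  # against a plain 600 ohm setting
```

A plain resistive setting cannot track the frequency-dependent reactive part of the network, so the mismatch (and with it the echo) varies across the voice band.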

Genuine PCB (left) and counterfeit PCB (right)

The PAP2T itself isn’t a perfect ATA, as it has some T.38 quirks (unreliable fax detection), an old-fashioned and sub-optimal Realtek 10Mbit/s half-duplex Ethernet interface, and clunky configuration. It was so popular that the PAP2T and its “relatives” (SPA3102, SPA3100) were widely counterfeited, and I was hit with one without knowing until much later, as my case labels, serial numbers, MAC addresses and OUI all matched. It was the power supply and case stand that gave it away in the end, but the counterfeit functioned flawlessly and was entirely firmware compatible. Others complained of a short lifetime.

However, it too has an interesting story. The PAP2T was built around what would really appear to be scrap parts from a different time. The main SoC is an ESS Visba 3 ES3890F, and has absolutely nothing to do with VoIP. In fact, its main role was to be the SoC powering VCD players (i.e. the nasty MPEG1 sort). The Realtek RTL8019AS was an ISA 10Mbit/s Ethernet controller, and the Samsung KM416C1204AJ-6 is an EDO memory chip which really pushes back in time for a product that was still selling in 2007. But despite its cobbled-together origins, it definitely works well for my intended purposes.

Letting the ATAs Speak

When it comes to emulating the PSTN, the ATAs mostly take care of the physical interface just fine. By configuring the PAP2T, dial-tones can all be regionalized, the gain settings sorted, echo cancellers disabled, line impedance regionalized, codecs fixed to G.711a and latency minimized by fixing the jitter buffer and reducing the RTP packetization interval.
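
The trade-off behind the packetization interval can be worked through numerically. G.711 produces 8000 one-byte samples per second, and each RTP packet carries a fixed 40 bytes of IPv4, UDP and RTP headers (assuming no IP options or RTP extensions). The sketch below ignores Ethernet framing, and the function name is my own.

```python
SAMPLE_RATE_HZ = 8000       # G.711: 8000 samples per second, one byte each
HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP headers, no options/extensions


def g711_packet_stats(ptime_ms):
    """Return (payload bytes, packets per second, IP-level bits per second)."""
    payload = SAMPLE_RATE_HZ * ptime_ms // 1000
    packets_per_second = 1000 / ptime_ms
    ip_bitrate = (payload + HEADER_BYTES) * 8 * packets_per_second
    return payload, packets_per_second, ip_bitrate


stats_10ms = g711_packet_stats(10)
stats_20ms = g711_packet_stats(20)
```

Halving the interval from 20ms to 10ms halves the time spent filling each packet, at the cost of twice as many packets and a higher overall bitrate. On a local network that cost is negligible.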

However, we do need to register them with a SIP server of some sort so that each line can contact another. Online SIP services are starting to dry up (formerly Voxalot and FreeWorldDialup were my favourites) and are hardly optimal as the RTP media streams sometimes will end up traversing the internet adding to latency. My router also has some sort of anti-tromboning set-up where if an internal device requests to connect to my public IP, they will not receive the data, meaning that if the media isn’t relayed on the services, I would get no audio.

As a result, the best solution was to run a SIP server at home. Things should be simpler, but sadly, they aren’t. The simplest solution that worked was MiniSipServer, which is a paid-for product, and it was straightforward to set up. I didn’t really like having to use evaluation software and paid-for ones especially, so I explored OfficeSIP. Sadly, that was very problematic and I couldn’t get it working at all in the end, so I tried 3CX Phone System (Free Edition). Configuration with 3CX was much more difficult, but it did work. The problem was when media relaying was enabled, 3CX performed poorly resulting in lost packets and a noticeable buffering of the stream adding latency. I couldn’t accept this.

As a result, I ended up going the whole hog and installing AsteriskNOW in a virtual machine. It took a lot of configuration, but it was able to perform as I had expected with the modification of a few .conf files, and allowed me to explore more complex multiple-trunk routing options and also get Digium Free Fax for Asterisk to have a virtual fax extension. It’s nice finally having an 11-extension home set-up with an incoming and outgoing trunk shared amongst them all. With the set-up, even V.90 connections at 46667bps via the internet outbound trunk are possible, and regular V.34+ 33600bps symmetrical connections within the intranet, analog-to-analog modem, are achieved.

Voicespan/Audiospan in Action

You might be wondering why I even need media relay (or comedia). The reason is simple – I want to be able to record the audio of the call by sniffing the network with Wireshark. As a result, I have my AsteriskNOW server on a VM talking to a physical internal network interface connected to a switch with my ATAs. Sniffing on that network allows me to reconstruct the calls.
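
Wireshark can do the reconstruction itself, but as an illustration of what is inside the sniffed packets, here is a minimal sketch that parses the fixed 12-byte RTP header (RFC 3550) and stitches the payloads back together in sequence order. It assumes no CSRC entries or header extensions, ignores sequence wrap-around and packet loss, and leaves the G.711 decoding of the payload to another tool. The function names are my own.

```python
import struct


def parse_rtp(packet):
    """Split one RTP packet (fixed header only) into its main fields."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP header")
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if b0 >> 6 != 2:
        raise ValueError("not RTP version 2")
    return {
        "payload_type": b1 & 0x7F,  # 8 = PCMA (G.711 A-law), 0 = PCMU
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[12:],
    }


def reassemble(packets):
    """Concatenate payloads in sequence order (no wrap-around or loss handling)."""
    parsed = sorted((parse_rtp(p) for p in packets), key=lambda f: f["sequence"])
    return b"".join(f["payload"] for f in parsed)
```

Each direction of a call has its own SSRC, so in practice the packets would first be grouped by `ssrc` before reassembly to get one stream per modem.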

Actually making the calls happen was an interesting experience in what cost reduction has done to the modems. According to the manual, it is possible to:

>  AT#VLS=0 ; Select phone as audio source
>  AT-SMS=2 ; Select Audiospan modulation
>  ATDT102  ; Call other extension
<< RING     ; Receive RING from other modem
>> AT#VLS=0 ; Match the configuration on answer modem
>> AT-SMS=2 ; Match the configuration on answer modem
>> ATA      ; Cause other modem to answer
; Wait as the modems negotiate
<  CONNECT 4800  ; Modems connected
<< CONNECT 4800 ; Modems connected
; Pick up attached desk phone connected to PHONE port
; on either modem and start talking down the line (?)

But instead, when you pick up the phone, you hear the data carrier and the modems drop out immediately. It seems that cost reduction means that a second relay isn’t used so the phone socket is not being routed to the audio processor on the modem and actually using ASVD with a desk phone is not possible. However, using AT#VLS=6 speakerphone mode seems to work, provided a headset is used.
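
The set-up commands could be scripted rather than typed by hand. The sketch below is a hedged example only: `port` is assumed to be any object with `write()` and `readline()` methods (for instance a serial port object opened with a read timeout), the helper names are hypothetical, and the modem is assumed to answer each command with a plain `OK` line. Only the AT commands themselves come from the listing above.

```python
def send_at(port, command):
    """Send one AT command (CR-terminated) and return the first reply line."""
    port.write((command + "\r").encode("ascii"))
    while True:
        raw = port.readline()
        if not raw:  # empty read: assume the port timed out
            raise TimeoutError(f"no reply to {command}")
        line = raw.decode("ascii", errors="replace").strip()
        if line and line != command:  # skip blank lines and command echo
            return line


def prepare_asvd(port, voice_source=6):
    """Select the audio source and Audiospan mode; raise if the modem objects."""
    for command in (f"AT#VLS={voice_source}", "AT-SMS=2"):
        reply = send_at(port, command)
        if reply != "OK":
            raise RuntimeError(f"{command} -> {reply}")
```

After running `prepare_asvd` on both modems, the originating side would still need `ATDT102` and the answering side `ATA`, as in the listing. The default `voice_source=6` reflects the speakerphone mode that worked for me.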

An ASVD call is being set-up

In this audio file, the right channel is the originating modem, and the left channel is the answer modem. The call is negotiated with V.8 and sounds a little different from a regular modem call. The data and audio phase is cut short, and then the modem is instructed to hang up. Speaking loudly into the channel does not result in intelligible audio if listening to the modem channel directly (as I had assumed might be possible).

Audio transacted over ASVD link

This was recorded from the modem’s headset jack, with AT#VLS=6. The call is placed, the speaker is muted as the initial negotiation is performed, then the voice link comes up. The audio sounds a little muffled compared to a regular voice-only link.


After some hard work, two 17-year-old Netcomm Roadster II 56k USB modems successfully went on a date, spoke to each other, and exchanged numbers over a virtual POTS line served out of a FreePBX/Asterisk server and a Linksys/Cisco PAP2T ATA. In doing so, we’ve managed to catch two members of a rare species (voiceband modem) performing courtship (handshaking) in a rarely observed manner (ASVD).

Maybe next year, on Valentine’s Day, I’ll be … hooking up more than just modems?

About lui_gough

I'm a bit of a nut for electronics, computing, photography, radio, satellite and other technical hobbies.