Investigating: Vodafone & Kogan Mobile Data Connection Issues

Access to the internet is so important nowadays that we rarely stop and think of the consequences of not having access to the internet until it stops working. Because of my recent relocation, getting access to the internet was an issue. At this premises, there hasn’t been a phone line in years and the distance to the exchange means that ADSL2+ would be at a crawl (expecting about 3Mbit/s at best). Cable is available, but it’s fairly pricey and the install fees wouldn’t make sense since NBN HFC is apparently coming in mid-next-year.

As a result, the only logical conclusion was to opt for a wireless connection to the internet. For any self-respecting nerd, the idea of being on a wireless connection feels a little “wrong”, as we know from experience that such systems are often variable in performance due to shared-medium congestion, interference, sometimes poor signal levels, etc. In fact, as this area is adjacent to a number of new property developments which are served by neither copper nor NBN (yet), congestion had reached such a high point in the 3G-era that Optus 3G was slower than dial-up. But compared to not having a connection, a wireless connection is definitely preferable and the advent of LTE has provided a much-needed boost to available bandwidth.

Because of a number of rather attractive deals, I’ve spent the past few months with my primary data connection routed through Vodafone Pre-Paid and, most recently, through Kogan Mobile, which is carried on the Vodafone network (and is practically the same stuff). This experience has been rather interesting, to say the least.

Connection Problems and Strangeness

One unexpected side effect of being on Vodafone or Kogan Mobile is that the connections are routed through a carrier-grade NAT (CGNAT).

As a result, your device is issued an IP address within the 10.x.x.x private network segment, meaning that it is not directly exposed to the internet. Unfortunately, this breaks the end-to-end connectivity model of the internet, but it seems to be an increasingly common approach as IPv4 address exhaustion continues to be an issue and IPv6 migration is still not as far advanced as many would like.
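As a quick way of checking which side of a NAT an address sits on, the sketch below classifies an address using Python's standard ipaddress module. The function name classify is purely illustrative. The 100.64.0.0/10 block is the shared address space set aside for carrier-grade NAT in RFC 6598; in my case Vodafone handed out 10.x.x.x addresses instead, which fall under the ordinary private ranges.

```python
import ipaddress

# RFC 6598 shared address space, reserved for carrier-grade NAT deployments.
CGNAT_SHARED = ipaddress.ip_network("100.64.0.0/10")

def classify(addr: str) -> str:
    """Roughly classify an address (illustrative helper, not a standard API)."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip in CGNAT_SHARED:
        return "cgnat-shared"
    if ip.is_private:
        return "private"
    return "public"

# 10.20.30.40 is a made-up example of a carrier-assigned 10.x.x.x address.
for example in ("10.20.30.40", "100.64.1.1", "8.8.8.8"):
    print(example, classify(example))
```

If the address the phone reports is "private" or "cgnat-shared" while a what-is-my-IP service shows something else, there is a NAT in between that you do not control.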

The outside-facing address is shown to be part of a NAT pool which, surprisingly, terminated in VIC in my case despite my being in NSW. Depending on the data session, it can jump around between different pools from time to time. Unfortunately, as the devices are behind CGNAT, direct port-forwarding for running servers is not possible and other technologies are a mixed bag as to whether they work or not. While I had no issues with SIP (surprisingly), I did run into trouble with FTPS, which meant that I had to either revert to plain (insecure) FTP or go over my own VPN to my other fixed-line connection. This may just be down to the NAT’s ALG interfering with the transaction.

Oddly enough, my first data plan was with Lycamobile which also seemed to have a CGNAT of its own, but FTPS worked just fine even with the same internal networking structure which adds two NATs into the path (i.e. mobile phone tethering NAT + household main router NAT). FTPS also seems to work through my other mobile broadband connections, so this was a bit inconvenient.

But the bigger annoyance was that from time to time, very important “big” sites would just stop working.

On my mobile phone, Facebook would complain that it couldn’t connect, and even visiting mail.google.com in Chrome produced a connection refused error. At first, I dismissed this as transient, but it seemed to recur without any noticeable pattern. Sometimes rebooting the equipment worked and other times it didn’t.

Initially, I was lazy and just waited it out – it seemed to fix itself after a few hours. But after not being able to check my e-mail in the middle of job application season, I decided to work around the issue by routing out of an alternate connection. The alternative via Telstra always seemed to work just fine – it just cost a lot more in comparison, so I didn’t want to rely on it more than necessary.

On the desktop, things were even more dire: the browser warned me that something might be tampering with my secure connections and refused to connect because the pinned certificate did not match. This was both curious and rather cryptic.

Early on, I had a feeling that the DNS was part of the problem and I was being redirected elsewhere. As a result, I flushed all the DNS resolver caches on my PC, in my router and set the router to ignore Vodafone’s own DNS and instead use the Google Public DNS servers at 8.8.8.8 and 8.8.4.4.

This seemed to solve the problem for a few days, but then today, the same thing happened again. I got tired of switching connections, so it was time to understand the problem.

Playing the Detective

The scenario is as follows – I couldn’t load my e-mail, as the connection to mail.google.com failed with an HSTS error. The connection goes through Ethernet via a Mikrotik router (NAT), via USB 2.0 to my Xiaomi Redmi Note 4X (NAT) and into Vodafone’s infrastructure (Kogan Mobile). The router is configured to ignore peer DNS and use Google Public DNS only. Caches on the router and locally have been cleared to no effect.

At first, I thought it could be malware – so I tried using a different device on the same network. The result was the same, meaning it was unlikely to be device specific.

The first thing is to see why the HSTS error is occurring – so where are we being sent to when we attempt to load mail.google.com? The answer to this should come from the DNS. Bringing out nslookup reveals something strange:

The left side shows the queries for mail.google.com being sent to the router first, then to Google Public DNS (primary) and Google Public DNS (secondary) before sending it to Cloudflare DNS. The right side shows the responses for mail.google.com when routing out of my secondary Telstra connection. In all cases, the one on the right shows valid responses, whereas Vodafone somehow seems to be returning the loopback address as where mail.google.com resides.
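The same comparison can be approximated without nslookup. The sketch below is not what produced the screenshots; it hand-builds a minimal A-record query in the RFC 1035 wire format, sends it over UDP to a resolver of your choosing, and flags loopback answers. All function names are illustrative, and for brevity it does not check the reply ID, truncation, or response codes.

```python
import ipaddress
import socket
import struct

def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past the (possibly compressed) name at `pos`."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer is two bytes long
            return pos + 2
        pos += 1 + length

def parse_a_records(msg: bytes) -> list:
    """Extract IPv4 addresses from the answer section of a DNS reply."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    pos = 12
    for _ in range(qdcount):
        pos = skip_name(msg, pos) + 4  # also skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        pos = skip_name(msg, pos)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        if rtype == 1 and rdlength == 4:  # A record carrying an IPv4 address
            addresses.append(socket.inet_ntoa(msg[pos:pos + 4]))
        pos += rdlength
    return addresses

def query(name: str, resolver: str, timeout: float = 3.0) -> list:
    """Ask one specific resolver directly, bypassing the system resolver."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (resolver, 53))
        reply, _ = sock.recvfrom(4096)
    return parse_a_records(reply)

def looks_hijacked(addresses) -> bool:
    """A public site should never resolve to a loopback address."""
    return any(ipaddress.ip_address(a).is_loopback for a in addresses)
```

Calling query("mail.google.com", "8.8.8.8") and then the same against 8.8.4.4 and 1.1.1.1, once per connection, gives the left and right columns described above; passing each result to looks_hijacked picks out the bad ones.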

This also explains why I get the HSTS error on my desktop, as I am running VMware Workstation which binds locally to the HTTPS port 443, thus my browser’s request for mail.google.com is being sent to VMware which replies with its self-signed certificate triggering the HSTS error (as it should).
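Whether a loopback answer ends in "connection refused" or a certificate error depends on whether anything is listening locally on port 443. A minimal check, with local_listener being an illustrative name rather than anything standard:

```python
import socket

def local_listener(port: int, host: str = "127.0.0.1", timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    print("something is listening on 127.0.0.1:443:", local_listener(443))
```

On a machine with no local HTTPS listener (such as the phone), the browser simply fails to connect; on one with a listener (such as my desktop), the browser reaches it and rejects the certificate instead.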

I didn’t take any screenshots, but using Google Public DNS to resolve other domain names didn’t have any issues – it seemed specific to mail.google.com and also instagram.com as of that moment. This seems highly suggestive that something is tampering with the DNS queries, but where is it and what is it?

I tried to solve this question by using traceroute, but it didn’t seem to show anything anomalous. In the case of Vodafone, it goes through their network and straight to Google. In the case of Telstra, it’s pretty much the same. This makes sense, because they are probably peering with Google as one of the major destinations of the internet – but this means that whatever the culprit is wasn’t caught by a traceroute. This makes me suspect there is some deep packet inspection “catching” only DNS queries regardless of destination and spoofing replies.

As I have used nslookup to directly target 8.8.8.8/8.8.4.4/1.1.1.1, it seems highly suggestive that whatever is tampering with the replies is probably not local to my network. To be sure, I decided to investigate the router’s DNS cache. After flushing it and querying mail.google.com from my desktop through the router, I found that the router’s cache contained the bad response again, meaning the bad answer had been freshly fetched from upstream rather than being a stale entry.

This tells me that the bad DNS replies are not the fault of the router, so might it be because of the phone I am using to tether with? The phone feeds the router, so logically I rebooted the phone, reconnected it and saw no change in behaviour.

In fact, I installed Termux on the phone and used it to make a direct DNS query, and got exactly the same result. Alas, it seems that Vodafone is indeed giving us a reply containing a loopback address – how strange.

Digging further using the Fing app, it seemed that the phone could correctly resolve the address for mail.google.com, but how?

The answer seems to be that the Android apps are using Vodafone’s own DNS servers in their own network to resolve addresses.
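The difference between the two paths is easy to reproduce on any machine: ordinary applications ask the operating system's resolver, which uses whatever DNS servers the connection handed out, whereas nslookup pointed at 8.8.8.8 bypasses it. A small sketch of the first path, with illustrative function names:

```python
import ipaddress
import socket

def system_resolve(name: str) -> list:
    """Resolve `name` through the operating system's configured resolver."""
    infos = socket.getaddrinfo(name, None, type=socket.SOCK_STREAM)
    # Each entry's sockaddr starts with the address string; de-duplicate, keep order.
    return list(dict.fromkeys(info[4][0] for info in infos))

def has_loopback(addresses) -> bool:
    """True if any of the returned addresses points back at the local machine."""
    return any(ipaddress.ip_address(a).is_loopback for a in addresses)

if __name__ == "__main__":
    answers = system_resolve("mail.google.com")
    print(answers, "loopback!" if has_loopback(answers) else "looks sane")
```

If this path returns sane addresses while a direct query to a public resolver returns 127.0.0.1 at the same moment, the two are clearly not seeing the same DNS data.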

By the time I came this far in my investigation, the problems with resolving mail.google.com were slowly subsiding. At first, the replies to queries to 8.8.8.8 began producing a valid IPv4 reply (but not IPv6). This is enough to get me back into my mailbox. Vodafone’s own DNS is making correct replies.

So you might ask – why don’t I just use Vodafone’s DNS servers? The problem was that in the past, the exact same issue occurred using Vodafone’s DNS servers. The responses indicated various sites were at 127.0.0.1 resulting in a failure to connect/load/HSTS errors. I swapped to Google Public DNS to try and evade these issues, and while it seemed to work for a while, today even that method seemed to fail. So while I could go back to using “peer DNS” and prioritise responses from Vodafone’s own DNS, I would probably run into the same trouble sooner or later.

Why is this happening? I have absolutely no idea. I expect my data to pass through the internet “verbatim” without being tampered with – the evidence seems to suggest one of several possibilities:

  • maybe there’s a misconfiguration or a bug somewhere in Vodafone’s NAT equipment that is corrupting DNS requests from time to time or caching incorrect responses.
  • maybe Vodafone’s NAT or routing equipment is configured to intercept and respond to DNS by “proxy” as a means to ensure the court-ordered content blocks are less easily circumvented and this equipment is problematic.
  • maybe there’s an active attempt to poison any caches along the way resulting in the propagation of incorrect responses back down the chain.
  • maybe there’s something wrong with Google and Cloudflare’s DNS servers at the exact moment (extremely unlikely).
  • maybe there’s something wrong with my phone that I use to tether (even though it works fine with other carriers, a prospect I feel is unlikely).

Ironically, as I was writing this, the problem came back for a few minutes and my cache got polluted again. To get around it for now, I’ve added a static DNS binding for mail.google.com to just one of their server IPs. While it won’t help with their load balancing, it should ensure I can continue to reach the site.
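The same workaround can be applied on a single machine with a HOSTS file entry. The sketch below picks the first non-loopback address from a list of candidate answers and formats it as a HOSTS-style line. The helper name is illustrative, and the addresses in the example are documentation placeholders from 192.0.2.0/24, not real Google servers.

```python
import ipaddress

def hosts_entry(name: str, candidates) -> str:
    """Return a HOSTS-style line pinning `name` to the first usable address."""
    for addr in candidates:
        ip = ipaddress.ip_address(addr)
        if not (ip.is_loopback or ip.is_unspecified):
            return f"{ip}\t{name}"
    raise ValueError(f"no usable address for {name}")

# Placeholder answers: one poisoned reply followed by one plausible reply.
print(hosts_entry("mail.example.com", ["127.0.0.1", "192.0.2.10"]))
```

The catch is the same as with the router binding: the pinned address goes stale if the site moves, so it is best treated as a temporary measure.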

At least I’m happy that I understand where the problem appears to be and it’s not something that I can do much about. These are still all issues which I generally didn’t face on such a regular basis while on a fixed line connection (maybe once or twice a year rather than every day for a few hours).

Speed and Bandwidth Quotas

Coming from a household that had only 9/1 on ADSL2+, I thought we had it bad. Being on “Lightning-Fast Vodafone” was supposed to be a treat, although a treat that would only last so long as the bandwidth quota allowed. Living life as a digital citizen with a bandwidth quota really does limit the amount of indulgence you can have (if you don’t want to break the bank), but the need for such quotas is understandable especially in a shared medium context.

I haven’t used Vodafone in many years – they just weren’t competitive and the VodaFail era really did deflate their image quite significantly. After a period of heavy network investment, they came back “all-guns-blazing” with such offers as $7 starter SIMs offering 18GB over 35 days (double-data) and occasionally, even Kogan managed a $0.99 SIM with 23GB over 30 days. At these prices, it seems inevitable that Vodafone would at least see some curious people give them another go.

As a result, I employ my Xiaomi Redmi Note 4X (MTK edition) as my means of accessing the Vodafone network. Featuring an LTE Cat6 (300/50Mbit/s) modem with LTE-A capability and support for all bands that Vodafone operate on, it’s not a bad match. Using USB 2.0 tethering (to avoid Wi-Fi slow-downs) to a Mikrotik hAP cabled to the PC and placed up high for a full five bar signal with LTE+ indication (i.e. carrier aggregation), I did my absolute best to optimize my internet.
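To put the quotas in perspective against a modem like this, a little arithmetic helps. The sketch below assumes decimal gigabytes (1 GB = 10^9 bytes), reads the plan quotas as gigabytes, and uses the theoretical Cat6 peak rate, which no real connection will sustain; the function names are illustrative.

```python
def seconds_to_exhaust(quota_gb: float, rate_mbit_s: float) -> float:
    """Seconds of continuous transfer at `rate_mbit_s` needed to use `quota_gb`."""
    quota_bits = quota_gb * 1e9 * 8          # assumes decimal gigabytes
    return quota_bits / (rate_mbit_s * 1e6)

def daily_allowance_gb(quota_gb: float, days: int) -> float:
    """Average amount of data per day that makes the quota last the full period."""
    return quota_gb / days

# 18 GB starter SIM at the theoretical 300 Mbit/s Cat6 downlink rate.
print(seconds_to_exhaust(18, 300) / 60, "minutes")
print(round(daily_allowance_gb(18, 35), 2), "GB/day on the 35-day plan")
print(round(daily_allowance_gb(23, 30), 2), "GB/day on the 30-day plan")
```

In other words, the whole 18 GB could in theory disappear in about eight minutes at full rate, while making it last the full 35 days means budgeting roughly half a gigabyte a day.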

Despite this, while browsing, the internet still felt somewhat slow compared to the old fixed line, especially in the evenings. At first, I thought it was my imagination, but I wanted to check so while reviewing the Fingbox, I also used a bit of my excess quota and the Internet Speed feature to record some data to analyze.

In all, speed tests occurred during the period of 9th June through to 21st June (Vodafone) and 26th June to 1st July (Kogan Mobile). Speed was tested at 12am, 9am, 12pm, 3pm, 6pm and 9pm (at a random time within the hour) but not on all days. Results for each time are aggregated to show the trend – a total of 16, 17, 16, 16, 15, 16 samples respectively.

This is not intended to be a highly scientific appraisal of the performance of the Vodafone network – the performance will vary depending on location, signal strength, signal quality, interference, equipment used, number of carriers in use, local congestion, speed test server congestion and transit-link bandwidth just to name a few variables. However, this is what I seem to experience in my location.

A look at the mean (dark line) and median (grey line) shows there is a clear downward trend in speeds as the day progresses. Around midnight, the traffic loading should be relatively low and speeds peaking around 42Mbit/s have been observed, averaging about 24Mbit/s. However, at 9pm, we see that the speeds top out at around 14Mbit/s and average around 8.5Mbit/s. On top of that, we can see there are a number of samples that read below the 10Mbit/s line and a few under 5Mbit/s as well, meaning the service isn’t always as fast as even a base-level NBN connection would be. On the upside, I think this means there might be quite a few Vodafone customers around here.
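For anyone wanting to repeat this kind of analysis, the aggregation amounts to grouping samples by time slot and summarising each group. The sketch below shows one way of doing it; the readings in it are hypothetical placeholders, not the measurements behind the plots, and hours are written in 24-hour form (0 for 12am, 21 for 9pm).

```python
from collections import defaultdict
from statistics import mean, median

def summarise(samples, threshold=10.0):
    """Group (hour, mbit_s) samples by hour and summarise each group."""
    by_hour = defaultdict(list)
    for hour, speed in samples:
        by_hour[hour].append(speed)
    return {
        hour: {
            "n": len(speeds),
            "mean": mean(speeds),
            "median": median(speeds),
            "max": max(speeds),
            # Count of samples falling under the chosen threshold in Mbit/s.
            "below_threshold": sum(s < threshold for s in speeds),
        }
        for hour, speeds in sorted(by_hour.items())
    }

# Hypothetical readings, NOT the measured data.
samples = [(0, 30.0), (0, 20.0), (0, 40.0), (21, 8.0), (21, 12.0), (21, 4.0)]
result = summarise(samples)
for hour, stats in result.items():
    print(hour, stats)
```

Reporting both mean and median per slot matters with only 15 to 17 samples each, as a single unusually good or bad test can move the mean noticeably.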

To some degree, a limited speed is expected due to the nature of the shared medium and Vodafone’s spectrum availability, however, it does pale in comparison to my previous analysis of Telstra during “We’re Sorry” day where even my older Cat4 LTE device supporting only 1800MHz was able to deliver 100+Mbit/s in the evening and stay above 25Mbit/s (mostly) during the day, albeit at a different location. As a result, for the proponents who think that wireless can be an adequate replacement for cabled technologies, I think this is highly dependent on what level of quality you expect from the service.

The upload speed is more even and mainly hovers around 8-9Mbit/s during the day with peaks of 16Mbit/s. The lower rate is expected as the radio is only capable of 50Mbit/s upstream (ideally), but this suggests that upload bandwidth is not as contended as download bandwidth. Nothing unexpected there.

Unlike 3G-era technologies, LTE offers much improved and relatively stable ping times – the median ping time is about 34-41ms throughout the whole day, although the mean ping time is skewed significantly by a few outliers which corresponded with low-points on the download speed test. These events may be symptomatic of base-station outages, interference, extremely high loading, etc.
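The skew is easy to demonstrate, and the same idea can be used to pick out the outlier samples for comparison against the download results. The ping values below are hypothetical, chosen only to resemble the situation described, and the three-times-median cut-off is an arbitrary assumption rather than any standard rule.

```python
from statistics import mean, median

def outliers(values, factor=3.0):
    """Return values exceeding `factor` times the median (arbitrary cut-off)."""
    cutoff = factor * median(values)
    return [v for v in values if v > cutoff]

# Hypothetical ping times in ms, NOT measured data: five typical, one bad.
pings = [34, 36, 38, 40, 41, 900]

print("mean:", mean(pings))
print("median:", median(pings))
print("outliers:", outliers(pings))
```

A single 900ms reading drags the mean to well over four times the median, while the median barely notices it, which is why the median is the better guide to typical latency here.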

Conclusion

Most of the time, the internet just works and we don’t think about it. But not having the internet really puts a downer on productivity and enjoyment – it’s become vital to modern day living. As a technology enthusiast, the internet is probably even more important to me, so when it does go wrong, I try to find out why.

It seems that Vodafone uses a CGNAT of sorts and this breaks my FTPS. Oh well, not much I can do about that. At least SIP works fine … and I still have a fixed line elsewhere I can tunnel my data through.

It seems that Vodafone might be sending incorrect DNS replies breaking access to some sites from time to time. I tried changing to Google Public DNS, but that broke today along with Cloudflare DNS suggesting Vodafone or an intermediary may be somehow tampering with DNS requests to other servers “in flight” (caching? proxying?) or (less likely) something is wrong with both DNS servers at the same time when reached from Vodafone or my particular phone which works fine with other carriers.

Maybe there is an active DNS poisoning attack resulting in bad replies being cached, but if I’m querying Google Public DNS, I expect an answer from them (not a proxy) and I would expect Google wouldn’t let their own subdomain records be poisoned. At least I know how to clear my DNS caches and can bind a static DNS entry for the most affected sites to continue along (or edit some HOSTS files).

Vodafone and their resellers, including Kogan Mobile, have offered relatively irresistible plans and that has kept me online for the most part, so I am thankful. But sometimes there are issues with such deals, and in this case the speed doesn’t seem to be quite as fast as their rivals. Can’t have it all …

Update – 5/7/18

It seems that today, while posting an article, DNS poisoning happened again, resulting in requests to Facebook and Yahoo being misdirected to the 127.0.0.1 loopback address. For now, I’ve removed the Google Public DNS and only rely on Vodafone’s internal DNS to see if it helps, but it’s not a configuration I prefer and I remain unconvinced that Google Public DNS is to blame.

About lui_gough

I'm a bit of a nut for electronics, computing, photography, radio, satellite and other technical hobbies.
