I have been having performance problems with my DSL link for some time. Since I work at home, I am quick to notice them. The issue is the RTT across the DSL link itself: when I ping the router at the Intrex end of my link from my home LAN, RTTs jump to 200-500ms for minutes at a time. This happens when the link is idle (I am neither uploading nor downloading data, and the modem light on my TP-LINK shows the DSL modem as idle; I am quite certain of this).

My methodology is simple: I send out 30 "pings", each separated by 10 seconds, and average the results to produce one sample per 5-minute interval. Thus, a value of 200ms does not reflect a single instantaneous spike (which one might expect occasionally), but sustained delay over the entire 5-minute interval.
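
For reference, the sampling loop amounts to something like the sketch below. This is illustrative rather than my actual measurement script; the gateway address is simply the Intrex-side router that appears in the traces later on this page, and it assumes a Unix-style ping.

      import re
      import subprocess
      import time

      GATEWAY = "209.42.213.129"   # Intrex-side router (the address in the traces below)
      PINGS_PER_SAMPLE = 30        # 30 pings, 10 s apart = one sample per 5 minutes
      INTERVAL_SECONDS = 10

      def one_ping_rtt(host):
          """Send a single ping and return its RTT in ms, or None if the reply was lost."""
          out = subprocess.run(["ping", "-c", "1", host],
                               capture_output=True, text=True).stdout
          m = re.search(r"time[=<]([\d.]+) ms", out)
          return float(m.group(1)) if m else None

      while True:
          rtts = []
          for _ in range(PINGS_PER_SAMPLE):
              rtt = one_ping_rtt(GATEWAY)
              if rtt is not None:
                  rtts.append(rtt)
              time.sleep(INTERVAL_SECONDS)
          if rtts:
              avg = sum(rtts) / len(rtts)
              print(f"{time.strftime('%Y-%m-%d %H:%M')}  avg RTT {avg:.1f} ms "
                    f"({len(rtts)}/{PINGS_PER_SAMPLE} replies)")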

When there are no issues, I regularly see RTTs in the 30-50ms range. But when the RTTs are above 200ms, the link is essentially unusable (i.e., web pages take forever to load, and interactive telnet/ssh sessions are unbearable). Configuration information:

I've been having problems for months. I finally called Intrex on June 18, and was told that I had an old modem (on a Frame Relay link) and that I would first have to upgrade to an ATM modem, which I did. This did not fix the problem, but I went on travel shortly thereafter and did not follow up.

On Thursday, Jul 17, I called again. This was after I had experienced an hour of RTTs in the 500ms range. See the chart below at around 12PM.

[Chart: Thursday, Jul 17 RTT data]

Intrex told me they would have to go check with Verizon. But they then called back and said that while Verizon could investigate "current" problems, there was not much that could be done if the problem wasn't present when they ran their tests (I hope that isn't true!!). This is frustrating, since the problems are intermittent. I have been unable to identify a pattern to suggest when or why the RTT goes up. The following charts show the data from Friday, Jul 18 through Sunday, Jul 20. Look at the Sunday data (last chart) in particular. It shows a sustained RTT of above 1/2 second for more than 15 consecutive hours! (Again, I am not uploading or downloading data during this time, so I am NOT the cause of this.)

The Friday data varies from OK to horrible, and the numbers fluctuate substantially from one sample to the next. I would not say that service was acceptable during this time.

[Chart: Friday RTT data]

Saturday's data again varies a lot, but shows plenty of RTTs above 1/2 second!

[Chart: Saturday RTT data]

Sunday's data is just amazing. The RTT was above 1/2 second for something like 15 hours straight!

[Chart: Sunday RTT data]

Update of Wednesday, Jul 23: Performance has continued to be uneven. Not as bad as Sunday, but certainly not acceptable. For some time, I have been trying to look for patterns that might explain what is going on.

I have been told that the problem is likely on the "shared circuit". That is, even though I have DSL, the DSLAM presumably tunnels my traffic, along with that of other DSL users, to Intrex over a shared (ATM) circuit. Thus, my DSL link is not actually a dedicated circuit to the router at Intrex; I'm sharing a link with others, and their traffic may be the source of the problem. (So much for one of the claimed advantages of DSL over cable modems!)
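
As a back-of-envelope check on whether shared-circuit congestion could plausibly explain delays of this size: the extra delay from a congested link is roughly the amount of queued data divided by the link rate. The 1.5 Mbps rate below is purely an assumption for illustration; I do not know the actual capacity of the shared circuit.

      # Rough queuing arithmetic: added delay = queued bytes * 8 / link rate.
      # The link rate here is an assumed value, for illustration only.
      LINK_RATE_BPS = 1.5e6

      for delay_ms in (50, 200, 500):
          queued_kbytes = (delay_ms / 1000.0) * LINK_RATE_BPS / 8 / 1024
          print(f"{delay_ms:4d} ms of added delay ~ {queued_kbytes:5.1f} KB queued on the shared link")

In other words, under that assumption it would only take other users keeping a few tens of kilobytes in flight to push my RTTs into the range I am seeing.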

One interesting observation: if I run pings (every second) when the link is performing well, a funny thing happens. The RTTs will be low (e.g., 40ms) for a while, then one or two samples will jump -- substantially -- and then the RTTs settle right back down. The shortness and intensity of the spikes suggest to me something other than normal traffic variation. Could there be some sort of routing loop on the circuit, with certain packets getting sucked into the loop and bouncing back and forth until their TTLs expire? It would be interesting to see if packets are being dropped (on the circuit) due to TTLs expiring. (For me, even when the performance is really poor, I almost never see lost packets.) The small script after the trace below flags these spikes automatically, along with any sequence numbers that never get a reply.

      64 bytes from 209.42.213.129: icmp_seq=198 ttl=254 time=40.2 ms
      64 bytes from 209.42.213.129: icmp_seq=199 ttl=254 time=54.1 ms
      64 bytes from 209.42.213.129: icmp_seq=200 ttl=254 time=38.0 ms
      64 bytes from 209.42.213.129: icmp_seq=201 ttl=254 time=39.1 ms
      64 bytes from 209.42.213.129: icmp_seq=202 ttl=254 time=40.2 ms
      64 bytes from 209.42.213.129: icmp_seq=203 ttl=254 time=51.0 ms
      64 bytes from 209.42.213.129: icmp_seq=204 ttl=254 time=42.2 ms
      64 bytes from 209.42.213.129: icmp_seq=205 ttl=254 time=49.1 ms
      64 bytes from 209.42.213.129: icmp_seq=206 ttl=254 time=44.3 ms
      64 bytes from 209.42.213.129: icmp_seq=207 ttl=254 time=39.3 ms
==>   64 bytes from 209.42.213.129: icmp_seq=208 ttl=254 time=1048 ms
      64 bytes from 209.42.213.129: icmp_seq=210 ttl=254 time=35.3 ms
      64 bytes from 209.42.213.129: icmp_seq=211 ttl=254 time=34.3 ms
      64 bytes from 209.42.213.129: icmp_seq=212 ttl=254 time=53.3 ms
      64 bytes from 209.42.213.129: icmp_seq=213 ttl=254 time=60.3 ms
      64 bytes from 209.42.213.129: icmp_seq=214 ttl=254 time=37.4 ms
      64 bytes from 209.42.213.129: icmp_seq=215 ttl=254 time=36.2 ms
      64 bytes from 209.42.213.129: icmp_seq=216 ttl=254 time=37.2 ms
      64 bytes from 209.42.213.129: icmp_seq=217 ttl=254 time=44.4 ms
      64 bytes from 209.42.213.129: icmp_seq=218 ttl=254 time=39.1 ms
      64 bytes from 209.42.213.129: icmp_seq=219 ttl=254 time=48.4 ms
      64 bytes from 209.42.213.129: icmp_seq=220 ttl=254 time=45.2 ms
      64 bytes from 209.42.213.129: icmp_seq=221 ttl=254 time=54.4 ms
      64 bytes from 209.42.213.129: icmp_seq=222 ttl=254 time=63.3 ms
      64 bytes from 209.42.213.129: icmp_seq=223 ttl=254 time=38.4 ms
      64 bytes from 209.42.213.129: icmp_seq=224 ttl=254 time=39.5 ms
      64 bytes from 209.42.213.129: icmp_seq=225 ttl=254 time=36.3 ms
      64 bytes from 209.42.213.129: icmp_seq=226 ttl=254 time=67.3 ms
==>   64 bytes from 209.42.213.129: icmp_seq=227 ttl=254 time=1058 ms
      64 bytes from 209.42.213.129: icmp_seq=229 ttl=254 time=39.4 ms
      64 bytes from 209.42.213.129: icmp_seq=230 ttl=254 time=50.3 ms
      64 bytes from 209.42.213.129: icmp_seq=231 ttl=254 time=37.4 ms
      64 bytes from 209.42.213.129: icmp_seq=232 ttl=254 time=36.2 ms
      64 bytes from 209.42.213.129: icmp_seq=233 ttl=254 time=43.5 ms
      64 bytes from 209.42.213.129: icmp_seq=234 ttl=254 time=46.4 ms
      64 bytes from 209.42.213.129: icmp_seq=235 ttl=254 time=37.4 ms
      64 bytes from 209.42.213.129: icmp_seq=236 ttl=254 time=38.4 ms
      64 bytes from 209.42.213.129: icmp_seq=237 ttl=254 time=35.4 ms
      64 bytes from 209.42.213.129: icmp_seq=238 ttl=254 time=42.4 ms
      64 bytes from 209.42.213.129: icmp_seq=239 ttl=254 time=37.5 ms
      64 bytes from 209.42.213.129: icmp_seq=240 ttl=254 time=58.4 ms
      64 bytes from 209.42.213.129: icmp_seq=241 ttl=254 time=103 ms
      64 bytes from 209.42.213.129: icmp_seq=242 ttl=254 time=54.3 ms
      64 bytes from 209.42.213.129: icmp_seq=243 ttl=254 time=35.4 ms
      64 bytes from 209.42.213.129: icmp_seq=244 ttl=254 time=46.7 ms
      64 bytes from 209.42.213.129: icmp_seq=245 ttl=254 time=45.3 ms
      64 bytes from 209.42.213.129: icmp_seq=246 ttl=254 time=36.5 ms
      64 bytes from 209.42.213.129: icmp_seq=247 ttl=254 time=63.4 ms
      64 bytes from 209.42.213.129: icmp_seq=248 ttl=254 time=96.4 ms
      64 bytes from 209.42.213.129: icmp_seq=249 ttl=254 time=45.5 ms
      64 bytes from 209.42.213.129: icmp_seq=250 ttl=254 time=38.4 ms
      64 bytes from 209.42.213.129: icmp_seq=251 ttl=254 time=39.3 ms
      64 bytes from 209.42.213.129: icmp_seq=252 ttl=254 time=46.6 ms
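
Since I have been watching for those spikes by eye, here is a small filter that can be run over a saved ping log to flag them, and to report any sequence numbers with no reply at all (which is what a TTL-expired drop would look like from my end). The log filename and the 500ms cutoff are arbitrary choices.

      import re
      import sys

      THRESHOLD_MS = 500          # arbitrary cutoff for calling a sample a "spike"
      LINE = re.compile(r"icmp_seq=(\d+) ttl=(\d+) time=([\d.]+) ms")

      prev_seq = None
      for line in open(sys.argv[1] if len(sys.argv) > 1 else "ping.log"):
          m = LINE.search(line)
          if not m:
              continue
          seq, ttl, rtt = int(m.group(1)), int(m.group(2)), float(m.group(3))
          if prev_seq is not None and seq != prev_seq + 1:
              print(f"seq {prev_seq + 1}..{seq - 1}: no reply seen")
          if rtt > THRESHOLD_MS:
              print(f"seq {seq}: spike of {rtt:.0f} ms (ttl={ttl})")
          prev_seq = seq

Run against the excerpt above, it would flag the two 1-second spikes at seq 208 and 227, and note that seq 209 and 228 never appear in the output at all.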
      

In August, I was finally able to get the attention of someone higher up within the support team (Camil). As a result of that conversation, he contacted Verizon, who said they first wanted to "rebuild the circuit." We both doubted that would have any impact, since the circuit had already been rebuilt when I moved from Frame Relay to ATM in July, but it seemed like a necessary step in Verizon's process. The circuit was reportedly rebuilt on Thursday, September 4. I can say that the rebuild did not fix my problem, though performance does seem a bit different than before (at least on some days). On the whole, however, my problems still exist.

Here is the data for Thursday, Sept 4, the day the circuit was rebuilt. I see no improvement here at all.

[Chart: Thursday, Sept 4 RTT data]

The data for Friday through Sunday are also no different:

[Chart: Friday RTT data]

[Chart: Saturday RTT data]

[Chart: Sunday RTT data]

That said, the data for Monday, Sept 8 looks very different. On the whole, the RTTs are consistently low, with a few exceptions. I can't recall seeing RTTs this good across an entire day in a long time. Note that there are extended periods when the RTTs do stay low.

[Chart: Monday, Sept 8 RTT data]

The data for Tuesday, Sept 9, is mixed. There are periods where the RTTs are low, but there are also times when they are quite high.

[Chart: Tuesday, Sept 9 RTT data]