Stardate 20030806.1627 (Engineering log): I am having severe networking problems right now with RoadRunner, which is interfering with my ability to get out and with your ability to reach my web server. Sometimes the link works great, sometimes it sucks, and sometimes it's completely dead either briefly or for extended intervals.
I have a crude script running on the server which wakes about once a minute and tries to ping a test server at the RR office. Each time it logs how many pings out of ten were answered. According to that log, the connection was out from about 0630 to 0745 PDT this morning, out again from about 1043 to 1146, and yet again from about 1300 to about 1420. In between those there were other brief outages, and periods where communication was unreliable. Here's what it captured since midnight, smoothed with a running filter:
The lower the curve, the worse things were. When it's railed on the bottom, communications were out completely.
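For anyone curious what that kind of smoothing amounts to, here is a minimal sketch in Python. It is not the script I'm actually running; the window size, the function names and the sample log are all made up for illustration. It assumes the log is simply a list of "replies out of ten" counts, one per minute, applies a trailing running mean, and picks out the stretches where nothing came back at all.

```python
from collections import deque


def running_mean(samples, window=5):
    """Smooth samples with a trailing mean over at most `window` values."""
    recent = deque(maxlen=window)  # oldest value drops out automatically
    smoothed = []
    for value in samples:
        recent.append(value)
        smoothed.append(sum(recent) / len(recent))
    return smoothed


def outage_spans(samples):
    """Return inclusive (start, end) index pairs where every sample is 0."""
    spans = []
    start = None
    for i, value in enumerate(samples):
        if value == 0 and start is None:
            start = i  # an outage begins here
        elif value != 0 and start is not None:
            spans.append((start, i - 1))  # the outage ended one sample ago
            start = None
    if start is not None:
        spans.append((start, len(samples) - 1))  # still out at end of log
    return spans


# Hypothetical log: replies received out of ten, one entry per minute.
log = [10, 10, 7, 0, 0, 0, 4, 10, 10, 0]
smoothed = running_mean(log, window=3)
outages = outage_spans(log)
```

A trailing window only looks backward, so the smoothed curve lags the raw data slightly; a wider window gives a calmer curve at the cost of blurring short outages.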
It's obvious that I needed more information than my crude script could collect. So about noon I started running a program called Ping Plotter (which despite the name actually does repetitive tracerts). Ordinarily my ping time to the test server should be less than 20 milliseconds. Here's what the last three hours have actually looked like:
The large red sections indicate that the test server was unreachable. The horizontal green section is 0-200 milliseconds ping time, yellow is 200-500, and pink is even worse.
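The color bands can be written down as a small classification rule. This is only a sketch of the bands as described above, not anything taken from Ping Plotter itself; the function name is hypothetical, and which band the exact boundary values (200 and 500 milliseconds) fall into is my assumption.

```python
def classify_sample(rtt_ms):
    """Map one round-trip time in milliseconds to a color band.

    rtt_ms is None when the test server did not answer at all.
    Boundary handling (200 -> green, 500 -> yellow) is an assumption.
    """
    if rtt_ms is None:
        return "red"      # unreachable
    if rtt_ms <= 200:
        return "green"    # 0-200 ms
    if rtt_ms <= 500:
        return "yellow"   # 200-500 ms
    return "pink"         # worse than 500 ms


# Hypothetical samples: a healthy reply, two slow ones, and a timeout.
bands = [classify_sample(rtt) for rtt in (18, 350, 900, None)]
```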
The server has two ethernet ports. One connects directly to the cable modem, and the other connects to the ethernet hub at the center of my LAN. Each of the ports has a single activity LED which shows traffic but doesn't differentiate between incoming and outgoing traffic.
Sometimes when the link is out there's no traffic at all. But during most of the extended outages today when all my pings and tracerts failed utterly, and when the server logs showed no page loads going on, there was a lot of traffic on the ethernet port anyway. It's incoming, because it doesn't change when I halt the processor on the server. I conjecture that RR isn't routing traffic correctly (in my modem or somewhere else), and that the traffic which is intended for me is sometimes going elsewhere, while someone else's traffic is being routed to me and is properly being ignored by my server.
So far I don't think it's a denial-of-service attack. If it is one, it's not a very good one, because it isn't saturating my pipe. In any case, a DoS attack wouldn't cause all the kinds of things I'm seeing.
This appears to be a continuation of the problem I had Sunday evening. During one of those outages, a RR phone support person tried to do a tracert to me and ended up somewhere else instead, which is why I think there's some sort of routing problem.
I'm sure that it isn't a problem in my server or any of my local wiring. Even when things fail completely, I can always access the server through my LAN, and when I tracert to other destinations, my server and the modem always respond as the first two steps, with negligible round-trip times. The problem could be in the networking part of the modem (it may even be an RF problem), or it could be elsewhere. But I can't really diagnose it further and I cannot fix it. (And I would like to request that my mailbox not be filled with suggestions for things to try.)
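The reasoning about the first two hops can be sketched in a few lines of Python. This is an illustration of the logic only, under stated assumptions: a trace is represented as a list of per-hop round-trip times with None for a hop that never answered, the first two hops are taken to be the server and the modem, and the function names are invented for this example. Note that "upstream" here still includes the cable side of the modem, which a reply from the modem's ethernet side does not rule out.

```python
def first_silent_hop(hop_rtts_ms):
    """Return the index of the first hop that did not reply, or None."""
    for index, rtt in enumerate(hop_rtts_ms):
        if rtt is None:
            return index
    return None


def fault_side(hop_rtts_ms, local_hops=2):
    """Guess whether a failed trace broke locally or beyond the modem.

    local_hops is the number of leading hops under my own control
    (assumed here: the server, then the modem).
    """
    silent = first_silent_hop(hop_rtts_ms)
    if silent is None:
        return "no fault seen"
    return "local" if silent < local_hops else "upstream"


# Hypothetical trace: server and modem answer quickly, then silence.
verdict = fault_side([1, 2, None, None])
```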
I don't know how long it's going to take to resolve this. I'm going to let Ping Plotter run overnight, and I'll call RR tech support tomorrow. The main thing I was trying to do today was to prove to myself that it wasn't my problem. I think I've done that, and I'm ready to call RR now. Unfortunately, while we "business class" users get better service than normal home users, business class tech support is only available during business hours and it's too late today to call them.