DO 4 cycling On-Off
1. The DR Server is not responding to every request from GRIDlink.
This could be due to excessive load or there may be maintenance being performed.
This is a 24-hour trend of the Quality of Service (in blue) and parses (in red). Parses are binary, represented by either 1 (successful) or 0 (fail). Here there were 9 individual fails throughout the day, plus a cluster of 5 at 22:00. Reviewing the logs of all GRIDlinks in this service area shows the same cluster, so it is clear that maintenance is being performed each night at 22:00. The other 9 fails are random. This is to be expected with Internet communications, and since there are approximately 1440 parses each day, this represents a failure rate of ~0.6% and would not affect participation in any Event.
2. The local network is not passing the request and poll response within the timeout period.
This could be as simple as a bad Ethernet cable, or the network may be experiencing excessive load. Check with IT to see if the company is using a larger than normal amount of bandwidth, as happens when performing large data dumps to corporate HQ.
3. Cell Connection
If connecting with a cell modem, it is not uncommon for the local cell tower to be experiencing excessive load, undergoing maintenance that temporarily reduces bandwidth, or even suffering from interference.
4. Wireless Network
If the local network connection relies on Wi-Fi or Spread Spectrum radios, it is very common to see missed parses due to latency and retries.
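The failure-rate arithmetic described for the DR Server case (item 1) can be checked with a short script. This is only a minimal sketch in Python: it assumes the parse log has already been exported as a list of 1s and 0s with one entry per minute of the day (consistent with roughly 1440 parses per day), and the names `fail_clusters` and `isolated_failure_rate` are hypothetical, not part of GRIDlink or GRIDview. The sample data reproduces the example above: 9 scattered fails plus a cluster of 5 starting at 22:00.

```python
def fail_clusters(parses, min_len=2):
    """Return (start_index, length) for each run of consecutive fails
    (0 values) that is at least min_len long."""
    clusters = []
    run_start = None
    for i, parse in enumerate(parses):
        if parse == 0:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_len:
                clusters.append((run_start, i - run_start))
            run_start = None
    # Close a run that reaches the end of the log.
    if run_start is not None and len(parses) - run_start >= min_len:
        clusters.append((run_start, len(parses) - run_start))
    return clusters


def isolated_failure_rate(parses, min_len=2):
    """Fraction of parses that failed outside of any cluster."""
    if not parses:
        raise ValueError("parse log is empty")
    clustered = sum(length for _, length in fail_clusters(parses, min_len))
    isolated = parses.count(0) - clustered
    return isolated / len(parses)


# Assumed sample data: one parse per minute, index = minute of the day.
day = [1] * 1440
for minute in (37, 190, 305, 488, 612, 770, 901, 1050, 1201):
    day[minute] = 0              # 9 random, isolated fails
for minute in range(1320, 1325):
    day[minute] = 0              # cluster of 5 starting at 22:00

for start, length in fail_clusters(day):
    print(f"cluster of {length} fails starting at {start // 60:02d}:{start % 60:02d}")
print(f"isolated failure rate: {isolated_failure_rate(day):.2%}")
```

A cluster that shows up at the same time of day on every GRIDlink in the service area points to the DR Server rather than to the local site, while the isolated rate is the figure to compare against the ~0.6% level described above.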
One way to detect a network problem is to check the load average on the Status page in GRIDview.
This is what the load should be:
Here is an example of a network issue that requires more than the usual number of retries. This increases the processor load.
This load will increase the number of failed parses, but not enough to miss an Event. When the load exceeds 4.5 for long periods, the problem should be addressed.
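The "exceeds 4.5 for long periods" rule can be expressed as a simple check over a series of load readings. The following Python sketch assumes the load averages have been noted from the Status page at regular intervals and collected into a list. The function name `sustained_overload` and the `min_consecutive` parameter are hypothetical, and the default of 30 consecutive readings is only a placeholder for "long periods", which this guide does not define.

```python
def sustained_overload(load_samples, threshold=4.5, min_consecutive=30):
    """Return True if the load stays above threshold for at least
    min_consecutive readings in a row.

    threshold=4.5 comes from the guidance above; min_consecutive is an
    assumed stand-in for "long periods" and should be tuned per site.
    """
    run = 0
    for load in load_samples:
        # Extend the current run while above threshold, otherwise reset it.
        run = run + 1 if load > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# A brief spike during retries does not count as a sustained problem.
brief_spike = [1.2] * 20 + [5.1] * 5 + [1.3] * 20
# A load that stays high for 30 readings in a row does.
long_high = [1.2] * 10 + [4.8] * 30

print(sustained_overload(brief_spike))  # False
print(sustained_overload(long_high))    # True
```

Counting consecutive readings rather than averaging the whole series keeps short bursts of retries from triggering the check, which matches the point that a temporary load increase is not enough to miss an Event.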