Missed Packets/Transmissions

I have (3) industrial 4-20 receivers, all within line-of-sight to (1) MQTT gateway. RSSI for all devices is 100, and they are all configured the same, transmitting every 4 hours or so. One of them has not dropped a packet since it was put in the field. The other 2 are dropping packets at a very consistent interval. For example, they both follow this same sequence:

  • None dropped for a few days
  • One packet dropped
  • Next packet received
  • Next packet dropped
  • None dropped for a few days
    …This cycle repeats consistently. Here is a graph, where each spike indicates a missed packet.

Any ideas what might be causing this? I assume since one of the devices is not missing any, that it is not an issue with the gateway or cellular connection.

Does this appear to happen at or around the same time of day every time? Hard to tell for certain from the chart.

Travis, no, there doesn’t seem to be a correlation with the time of day. It just seems that it occurs on the 20th packet. Take a look at this data; it is from a different date range but shows the exact same behavior. A “2” in the “X” column indicates a dropped packet.

That’s very odd. Especially odd that it’s happening on 2 devices but not on a 3rd device.

Are you tracking the MQTT Client connection from the Gateway? Do you ever see any dropped connections? It should be maintaining a connection to the broker and sending keep alive.

What is this environment like? Could there be something like heavy equipment moving around blocking the signal? Even if that were the case the consistency here is suspicious.

I agree. No, I am not aware of a way to remotely monitor that connection. Plus, the 3rd device, which has not dropped a single packet, has transmission times very close to when one of the other 2 dropped a packet. I would assume that if the connection were the issue, we would eventually see it on all 3 devices.
Environment is rural, no traffic, but some machinery at times. Given the consistency though, I find it unlikely to be a cause.
I am going to update the transmission rates on all 3 soon to sample much more frequently, maybe we will see a change in behavior.

Let us know what you find after increasing the report frequency of the sensors.

Travis,
Update on this issue. We now have 5 devices communicating to the same mqtt gateway. I did decrease the transmission delay on the original 3 as planned. Here are the devices:

  1. 4-20 Receiver (type 48) Sampling 1/hr
  2. 4-20 Receiver (type 48) Sampling 1/hr
  3. 4-20 Receiver (type 48) Sampling 1/hr
  4. Counter (type 35) Sampling minimum of 6/hr
  5. Counter (type 35) Sampling minimum of 6/hr
  • Device 2 has not missed a single packet since deployed.
  • Devices 1, 3, 4, and 5 exhibit the behavior described above (19 packets received, followed by a missed packet, followed by two packets received, followed by a missed packet, followed by 19 packets received…). Again, this behavior is consistent.

Please let me know any ideas that you might have to rectify this issue. Would adjusting the transmission retries have any effect on this? They are all currently set at the default of 10.

@Bhaskar Do you have any thoughts on what could be causing this very consistent cycle of dropped packets? I’m not sure it’s the MQTT Gateway as device 2 seems to function properly at all times.

@spscogg could you try moving device 1, 3, 4, or 5 closer to the MQTT gateway? Perhaps about 10ft away just for diagnostic purposes to see if this alleviates the dropped packets?

Travis,
We can try that and let you know the results. I will add however that device 2 is not the closest to the gateway (roughly 250 ft away). Device 4 is <100 ft away with no obstructions yet it still has this issue. The gateway and all devices are mounted at the top of 10’ poles, and as mentioned before all devices are reporting an rssi of 100.
Do you have an email I can contact you at with regards to our mqtt broker possibly being a cause?

@TravisE_NCD_Technica @Bhaskar - Today I moved device 1 to within 10 feet of the gateway for approx. 2 hours and set the transmission delay to 30 seconds. Within that timeframe, 18 packets were missed, and they followed the exact same pattern; see the sample below. This rules out distance as the issue. Looking for any help or recommendations on this.

Timestamp         Transmission Count   Missed
2/20/2023 14:36   219
2/20/2023 14:36   218
2/20/2023 14:35   217                  X
2/20/2023 14:34   215
2/20/2023 14:33   214
2/20/2023 14:33   213                  X
2/20/2023 14:31   211
2/20/2023 14:31   210
2/20/2023 14:30   209
2/20/2023 14:29   208
2/20/2023 14:29   207
2/20/2023 14:28   206
2/20/2023 14:27   205
2/20/2023 14:27   204
2/20/2023 14:26   203
2/20/2023 14:25   202
2/20/2023 14:25   201
2/20/2023 14:24   200
2/20/2023 14:23   199
2/20/2023 14:23   198
2/20/2023 14:22   197
2/20/2023 14:22   196
2/20/2023 14:21   195
2/20/2023 14:20   194
2/20/2023 14:20   193
2/20/2023 14:19   192                  X
2/20/2023 14:18   190
2/20/2023 14:17   189
2/20/2023 14:16   188                  X
2/20/2023 14:15   186
2/20/2023 14:14   185
2/20/2023 14:14   184
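
For anyone wanting to reproduce this kind of check from an exported log, here is a minimal Python sketch that finds missed packets by looking for gaps in the transmission counter. It assumes the counter increments by exactly 1 per transmission and does not wrap within the sample, and that an "X" row marks the first packet received after a gap (which is how the counts above read). The function name `find_missed` is just an illustrative choice, not part of any NCD tooling.

```python
def find_missed(counters):
    """Return the counter values that never arrived.

    `counters` is the list of received transmission counts, in any order.
    Assumption: the counter goes up by 1 per transmission and does not
    wrap around within the sample.
    """
    received = sorted(set(counters))
    missed = []
    for prev, cur in zip(received, received[1:]):
        # Any value strictly between two consecutive received counts is a gap.
        missed.extend(range(prev + 1, cur))
    return missed


# Received counts transcribed from the 2/20/2023 sample above.
received = (
    [184, 185, 186, 188, 189, 190]
    + list(range(192, 212))  # 192 through 211
    + [213, 214, 215, 217, 218, 219]
)

print(find_missed(received))  # [187, 191, 212, 216]
```

Working from counter gaps rather than timestamps avoids confusion when two packets share the same minute in the log.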

@spscogg I believe we have exhausted all the troubleshooting we can do remotely. I would like to ask that you return these 2 units to us for further evaluation. I have discussed this with other engineers here, and no one has ever seen anything like it, nor can they think of anything that could cause a 100% repeatable failure such as this. Please fill out the RMA form here and return the devices for evaluation: NCD Login - ncd.io

Thank you,
Travis

Travis, unfortunately these devices are deployed for a customer and removing them without having replacements on hand is not possible. Could this issue be on the Ubidots side? Would increasing the transmission retries higher than 10 have an effect?

The device itself does not have any inbuilt RTC which could cause such a pattern. I have never seen anything like this and am not sure what could be causing this.

@Bhaskar understood. The pattern does not correlate with any particular time of day, and it occurs whether the transmission delay is as low as 30 seconds or as high as 1 hour. The pattern correlates with the number of transmissions.

Just to re-emphasize, this is occurring on multiple devices, yet we have 1 device (the one farthest from the gateway) that never misses any transmissions.
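
One way to test the "correlates with transmission count, not time" observation on a longer log is to look at the spacing between missed counter values. The sketch below is only illustrative: the four missing counts (187, 191, 212, 216) are inferred from the gaps in the 2/20/2023 sample, and `miss_spacing` is a hypothetical helper name.

```python
def miss_spacing(missed):
    """Return the difference in counter value between consecutive misses."""
    ordered = sorted(missed)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


# Missing counter values inferred from the 2/20/2023 sample (an assumption,
# based on gaps in the received counts).
missed = [187, 191, 212, 216]

print(miss_spacing(missed))  # [4, 21, 4]
```

If the same short and long spacings keep alternating across logs taken at different transmission delays, that supports a count-based cause over a time-based one. Four misses are too few to establish the cycle on their own, so a longer export would be needed to confirm it.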

Increasing the retries on the XBee above 10 is not possible. 10 is the max.

I cannot speak for Ubidots, to be honest. You could try connecting the MQTT Gateway to a local MQTT Broker for diagnostics, or just to a different broker such as beebotte.com, which I commonly use for testing/diagnostics. I have an article that references BeeBotte here:
https://ncd.io/using-the-mqtt-gateway-with-beebotte-and-displaying-info-on-mqtt-dash/

If you still have the same issue with a different MQTT Broker please let me know.

Travis, I am still having the same issues with beebotte. I have also enabled remote monitoring on our cellular router to monitor the network connection as well as the wifi connection of the gateway. Both are solid, and I have not seen any issues with dropped cellular or wifi. Could it be an issue with the gateway?

@spscogg,

I would say it’s nearly impossible to rule anything out at this point as we have not eliminated all possible variables which could cause the problem.

However, we have sold thousands of these devices to various users, and @Bhaskar and I have never seen this before.

If I were in your shoes I would begin isolating possible causes. The first thing I would try to eliminate, given my experience, would be the cellular connection. Cellular is a very funny thing. If possible, I would recommend connecting the MQTT Gateway to a normal WiFi network. Generally I take this as far as bringing the device to my home network, which I know from extensive experience to be reliable. If the problem persists, then the cellular internet connectivity can be eliminated as a possible cause. I would say that is step 1.

Travis, thanks for the recommendation. We will be ordering a new set of devices including another mqtt gateway and I will do some testing on another network prior to deploying.

Sounds good. Let me know what you find.