Posted: Mon Apr 30, 2018 3:30 am
by berkinet
I am not sure when this happened, but I just noticed that apt-get on all my SBC3s was broken. The only host listed in multistrap-debian.list was commented out. At the same time, I see that the only process writing to /var/log/syslog is cron. Even on boot, nothing is being logged.

Is something broken?

Posted: Mon Apr 30, 2018 6:37 am
by berkinet
After uncommenting the hosts in multistrap-debian.list I was able to do an apt-get update, and then an upgrade and things seem to be working (more-or-less) normally.
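
For anyone who wants to check quickly whether a sources list still has any active entries, here is a minimal sketch. The directory in SOURCES is an assumption (the thread only gives the file name multistrap-debian.list), and active_entries is a hypothetical helper name, so adjust both to match your system.

```python
from pathlib import Path

# Assumed location; the thread only mentions the file name.
SOURCES = Path("/etc/apt/sources.list.d/multistrap-debian.list")


def active_entries(text):
    """Return the uncommented 'deb' / 'deb-src' lines in a sources list."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()
        # Commented-out entries start with '#', so they never match here.
        if stripped.startswith(("deb ", "deb-src ")):
            entries.append(stripped)
    return entries


if __name__ == "__main__":
    if not SOURCES.exists():
        print(f"{SOURCES} not found")
    else:
        entries = active_entries(SOURCES.read_text())
        if entries:
            print("\n".join(entries))
        else:
            print(f"{SOURCES}: no active entries (all commented out or empty)")
```

If the script reports no active entries, apt-get update has nothing to fetch from that file.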

One of the problems I was experiencing was the phidgetwebservice21 dropping out at roughly 10-15 minute intervals. I was in the process of debugging that issue when I ran into the other problems noted above. After the upgrade, the issues with the web service seem to have vanished. Odd?

Posted: Mon Apr 30, 2018 6:45 am
by berkinet
Oops, I wrote too soon. Still having the problem with the phidgetwebservice21 dropping out and then restarting around 2 minutes later. I don't actually think this is a web service issue, since the entire SBC3 becomes unresponsive during the period when the web service is unavailable.

Posted: Tue May 01, 2018 6:32 am
by berkinet
For anyone else with the same apt-get problem: it was simple cockpit error. The issue was the "Include full Debian Package Repository" option on the Packages web management page. I had forgotten about that and had unchecked it.

However, the phidgetwebservice21 problem is ongoing, on just one SBC3. From what I can discern:
  • The computer does not reboot during these outages.
  • Nothing is logged in any of the /var/log logs
  • No processes consume unusual cpu during the outage (I captured the top ten processes every 4 seconds into a file)
  • Pinging the machine shows an odd, but consistent pattern:
    • ping times go from <1ms to between 100 and 800ms
    • then, a period of 10-15 seconds of no response
    • followed by more slow responses
    • followed by 30 seconds of no response
    • and, finally back to < 1ms - and the machine is again accessible and the webservice reconnects
At this point I strongly suspect a network hardware problem. But ethtool doesn't show anything.
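
A ping pattern like the one described above can be captured with timestamps so that the outages can be lined up against other logs. The following is only a sketch: it assumes a Linux machine with the iputils ping (-c for count, -W for timeout in seconds), and the 1 ms threshold and the names parse_rtt, classify, probe and watch are illustrative choices, not anything from the Phidgets tooling.

```python
import re
import subprocess
import sys
import time

SLOW_MS = 1.0  # assumed threshold: replies at or above this count as "slow"

# Matches "time=0.412 ms" and the older "time<1 ms" form.
RTT_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_rtt(output):
    """Return the round-trip time in ms from ping output, or None if no reply."""
    match = RTT_RE.search(output)
    return float(match.group(1)) if match else None


def classify(rtt_ms):
    """Map a round-trip time to 'fast', 'slow' or 'lost'."""
    if rtt_ms is None:
        return "lost"
    return "fast" if rtt_ms < SLOW_MS else "slow"


def probe(host):
    """Send one echo request and return its round-trip time, or None."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        capture_output=True, text=True,
    )
    return parse_rtt(result.stdout)


def watch(host, interval=1.0):
    """Print a line each time the state changes, with how long it lasted."""
    state, since = None, time.time()
    while True:
        new_state = classify(probe(host))
        now = time.time()
        if new_state != state:
            if state is not None:
                stamp = time.strftime("%H:%M:%S")
                print(f"{stamp} {state} for {now - since:.0f}s -> {new_state}")
            state, since = new_state, now
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    watch(sys.argv[1])  # pass the address of the affected SBC3
```

Run it from another machine on the same switch with the SBC3's address as the only argument, and redirect the output to a file if you want to leave it running overnight.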

Posted: Wed May 02, 2018 11:08 am
by Patrick
Have you ruled out issues with other network hardware? Maybe try a different switch?

Do the link/activity LEDs on the SBC do anything during the outages?
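
If the SBC is hard to reach physically, the kernel's view of the link can be logged on the board itself as a rough stand-in for watching the LEDs. This is a sketch, assuming a Linux system that exposes /sys/class/net; the interface name (for example eth0) is an assumption, and snapshot, changes and watch are hypothetical helper names.

```python
import sys
import time
from pathlib import Path

COUNTERS = ("rx_errors", "tx_errors", "rx_dropped", "tx_dropped")


def read_text(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def snapshot(iface, root="/sys/class/net"):
    """Collect carrier, operstate and error counters for one interface."""
    base = Path(root) / iface
    snap = {
        "carrier": read_text(base / "carrier"),
        "operstate": read_text(base / "operstate"),
    }
    for name in COUNTERS:
        snap[name] = read_text(base / "statistics" / name)
    return snap


def changes(old, new):
    """Return {field: (old_value, new_value)} for every field that differs."""
    return {k: (old.get(k), new[k]) for k in new if old.get(k) != new[k]}


def watch(iface, interval=1.0):
    """Print the initial state, then a timestamped line on every change."""
    last = snapshot(iface)
    print(time.strftime("%H:%M:%S"), last)
    while True:
        time.sleep(interval)
        current = snapshot(iface)
        diff = changes(last, current)
        if diff:
            print(time.strftime("%H:%M:%S"), diff)
        last = current


if __name__ == "__main__" and len(sys.argv) > 1:
    watch(sys.argv[1])  # interface name, e.g. eth0 (assumed)
```

Since the board is unreachable during an outage, the output needs to go to a local file and be read afterwards. A carrier value that flips between 1 and 0, or error counters that climb during the outage, would point toward the cable, port or PHY rather than the web service.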


Posted: Wed May 02, 2018 11:13 am
by berkinet
All the SBCs are on the same switch, but only the one is having trouble. I haven’t gone down to look at the lights on the SBC; I guess I should do that. Unfortunately it’s not nearby, it’s in the chicken coop :-)