Skip to content

Incident 4th November 2021: Power cut before student night causing touchscreen system failure.

James Muscat edited this page Nov 9, 2021 · 2 revisions

Summary

Quick sketchy write up from memory by @rjmunro. I'll try to fill in more detail later:

There was reportedly a power cut in church today. When power was restored, the touch screens could not connect to the server because they couldn't resolve the network address of the server. This is of course misleading because it's using pyro name resolution, not DNS.

I checked the syslog on the server, and it didn't start because it couldn't find a usb device, something like /dev/usb/3.1-0..... (TODO: check actual path name)

I search for that device in /etc/avx.config (TODO: check actual path name) and it was one of the blind relays, so I made a backup of the config and removed all the blinds. This allowed the server to start and it is mostly working except:

  • I didn't try the blinds page, but I expect them not to work (they should still be controllable with the physical switches).
  • Camera 1 doesn't seem to move, probably because it is on a serial connection, not an IP connection like the other 2. It still sends a video signal to the ATEM, and it should be possible to move it to a good position using the remote, but it will have to stay in the same position for a whole service.

Notes:

Initially when Lauren arrived, the ATEM and screens were powered but the touch screens could not connect. Angus and her tried rebooting the server and this caused all power to be turned off. I guess the fact that the server is a laptop that has some degree of battery backup confused things a bit, meaning the power was able to stay on despite the power cut.

Investigation

[jrem] The relevant section of the logs is as follows:

Nov 04 18:10:29 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:29,393 INFO: Creating device Camera 1
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,304 ERROR: Could not open serial device /dev/usb-ports/3-1.5:1.0 for Camera 1
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: Traceback (most recent call last):
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/devices/serial/SerialDevice.py", line 23, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.port = Serial(serialDevice, baud, timeout=2)
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialutil.py", line 244, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.open()
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialposix.py", line 325, in open
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: SerialException: [Errno 2] could not open port /dev/usb-ports/3-1.5:1.0: [Errno 2] No such file or directory: '/dev/usb-ports/3-1.5:1.0'
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,660 INFO: Creating device Camera 2
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,662 INFO: Creating device Camera 3
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,664 INFO: Creating device Power
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,683 INFO: Creating device Blinds A
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,684 ERROR: Could not open serial device /dev/usb-ports/3-1.1.4:1.0 for Blinds A
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: Traceback (most recent call last):
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/devices/serial/SerialDevice.py", line 23, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.port = Serial(serialDevice, baud, timeout=2)
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialutil.py", line 244, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.open()
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialposix.py", line 325, in open
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: SerialException: [Errno 2] could not open port /dev/usb-ports/3-1.1.4:1.0: [Errno 2] No such file or directory: '/dev/usb-ports/3-1.1.4:1.0'
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,685 INFO: Creating device Blinds B
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,686 ERROR: Could not open serial device /dev/usb-ports/3-1.1.3:1.0 for Blinds B
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: Traceback (most recent call last):
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/devices/serial/SerialDevice.py", line 23, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.port = Serial(serialDevice, baud, timeout=2)
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialutil.py", line 244, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.open()
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialposix.py", line 325, in open
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: SerialException: [Errno 2] could not open port /dev/usb-ports/3-1.1.3:1.0: [Errno 2] No such file or directory: '/dev/usb-ports/3-1.1.3:1.0'
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,687 INFO: Creating device Blinds C
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: 2021-11-04 18:10:30,688 ERROR: Could not open serial device /dev/usb-ports/3-1.1.2:1.0 for Blinds C
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: Traceback (most recent call last):
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/devices/serial/SerialDevice.py", line 23, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.port = Serial(serialDevice, baud, timeout=2)
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialutil.py", line 244, in __init__
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     self.open()
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/serial/serialposix.py", line 325, in open
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]:     raise SerialException(msg.errno, "could not open port {}: {}".format(self._port, msg))
Nov 04 18:10:30 HP-Pavilion-dm1 avx-controller[1008]: SerialException: [Errno 2] could not open port /dev/usb-ports/3-1.1.2:1.0: [Errno 2] No such file or directory: '/dev/usb-ports/3-1.1.2:1.0'
< snip >
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]: Traceback (most recent call last):
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/bin/avx-controller", line 8, in <module>
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     sys.exit(main())
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/controller/Controller.py", line 295, in main
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     controller.initialise()
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avx/controller/Controller.py", line 150, in initialise
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     device.initialise()
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/avista_avx/__init__.py", line 21, in initialise
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     "procedure": "{}.power_on".format(self.avista_power_device)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/requests/api.py", line 117, in post
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     return request('post', url, data=data, json=json, **kwargs)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/requests/api.py", line 61, in request
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     return session.request(method=method, url=url, **kwargs)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/requests/sessions.py", line 542, in request
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     resp = self.send(prep, **send_kwargs)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/requests/sessions.py", line 655, in send
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     r = adapter.send(request, **kwargs)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:   File "/home/sound/avx/local/lib/python2.7/site-packages/requests/adapters.py", line 516, in send
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]:     raise ConnectionError(e, request=request)
Nov 04 18:10:37 HP-Pavilion-dm1 avx-controller[1008]: requests.exceptions.ConnectionError: HTTPConnectionPool(host='192.168.1.133', port=8080): Max retries exceeded with url: /call (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f815cb4d7d
Nov 04 18:10:38 HP-Pavilion-dm1 systemd[1]: avx-controller.service: Main process exited, code=exited, status=1/FAILURE
Nov 04 18:10:38 HP-Pavilion-dm1 systemd[1]: avx-controller.service: Failed with result 'exit-code'.

There are two things at play here, and the critical one is not related to the blinds.

  • For some reason, the laptop has renumbered its USB controllers; rather than being connected to 3-1... the serial ports now show as:

    $ ls /dev/usb-ports/
    1-1.1.2:1.0  1-1.1.3:1.0  1-1.1.4:1.0  1-1.5:1.0  1-1.6:1.0  4-1:1.0
    

    This has actually happened a number of times before; I had attributed it to Ubuntu's LivePatch (or whatever they're calling ksplice these days) causing things in the kernel to jump somehow, but a reboot always returned the devices to their original paths. I'd love it if there were a better way of uniquely identifying the USB-to-serial adapters, but there isn't (at least, not that I've found in 6+ years of looking).

    This is not the cause of the crash though; note that AVX tries and fails to set up each serial device in turn, but continues with the startup process. The killer was the second one:

  • On controller startup and system power-on, avx sends a message via the avista-avx bridge device to power on the Avista control system. (This is a transitional arrangement; avista will eventually replace avx, as a more robust and scalable approach than Pyro4's RPC calls; in avx for now it's the mechanism that a "power on" request is sent to the sequencer in the stage rack.) That request failed on the original startup; I'm assuming because the avista master router process hadn't started accepting requests yet. When the controller was restarted, it was able to make the request successfully - the removal of the blinds relays is a red herring (and it still fails to find camera 1's serial port).

Suggested improvement

If a USB device can't be found AVX should be able to issue a warning and carry on without it, not

  • avista-avx should trap and log exceptions, not allow them to propagate to the main process
  • Perhaps make the dependency of avx-controller on avista explicit in systemd?