How do you get users to do the right thing in an emergency?
This is part of a series of articles about redundancy. The articles should be read in order. If you have not read the previous ones, start with the first article, Redundancy to solve all problems.
In the previous article I used the case of a TV station where things went very wrong because of the human factor: they couldn’t broadcast a current affairs program even though they had lots of redundant equipment. After the worst panic had died down, I spoke to one of the members of their video support team who was involved in finding the errors. I said “If this was a ship...” and he interrupted me and said “then it would have sunk”. He was probably right, or perhaps not, because all passenger ships must do regular rescue drills, where every member of the crew has a part to play. If this TV station had done similar rescue drills, then this would never have gone wrong.
Passenger ships are obliged by law to do these rescue drills (at least in Denmark they are), and TV stations are not. That is not because it is more difficult to control a ship than a TV station. It’s because the consequences of a failure on a ship are much worse. But this shouldn’t prevent broadcasters from doing the same thing.
We can spend a lot of money on redundancy, but if no time (and time is money) is spent on rescue drills, then most of that money will be wasted.
If the users must do something differently from what they are used to, then it must be as intuitive as possible. Nobody will be surprised that if you need to play video from another playout server, then it’s another button on the video mixer and another fader on the audio mixer. This is actually what people would expect. So the red button that seemed like such a good idea actually ends up making things less clear and less flexible, because then you can’t use both Main and Backup at the same time, and it introduces a new single point of failure.
It’s not possible to make everything so intuitive that nobody has to know anything, so we still need the rescue drills. But the number of things users need to know that are only used in emergency situations should be kept to a minimum.
Radio and TV stations with several studios or control rooms usually end up using only some of them for broadcasting, even when the smaller control rooms are built to do the same as the larger ones. The smaller control rooms are then used only for smaller productions. The management usually imagines that the smaller control rooms will be used for the main shows if there is a big problem with one of the large control rooms. But that usually doesn’t happen, and if it does, then the risk of a failure is quite high. First, because the staff probably lack training in switching control rooms. Second, because control rooms that aren’t used very much tend to degrade. This degradation could be hardware that stops working, equipment that loses its calibration or configuration, or human errors where something is set wrong. Since the facility is not used, nobody notices the problems.
The solution is to make sure that all studios and control rooms are used regularly and to switch between them frequently. So if you have a station that makes news and there are two control rooms that can be used for this, then make sure that both control rooms are used for it. This will create more errors in the beginning; there is no doubt about that. But these will be smaller errors, and it’s important to remember that the goal is to avoid the disaster, where a news program can’t go on air at all. It will also take extra time, because the staff need to switch various signals and check that everything is still working. But once everybody is used to doing it, it can be done fast, so that if there is a big problem one day, the staff will be so well trained in switching control rooms that they can do it in no time.
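As a rough illustration, a rotation like this could be planned with a small script. This is only a sketch under assumptions: the room names, the function names `rotation_schedule` and `stale_rooms`, the one-room-per-day rule and the idle limit are all invented for the example and do not come from a real station.

```python
from datetime import date, timedelta
from itertools import cycle


def rotation_schedule(rooms, start, days):
    """Assign one control room per day in round-robin order (hypothetical rule)."""
    if not rooms:
        raise ValueError("at least one control room is required")
    room_cycle = cycle(rooms)
    return [(start + timedelta(days=i), next(room_cycle)) for i in range(days)]


def stale_rooms(schedule, rooms, today, max_idle_days):
    """Return the rooms that have not been on air within max_idle_days of today."""
    last_used = {}
    for day, room in schedule:
        if day <= today:
            # Remember the most recent day each room was used.
            last_used[room] = max(day, last_used.get(room, day))
    return [
        room
        for room in rooms
        if room not in last_used or (today - last_used[room]).days > max_idle_days
    ]


if __name__ == "__main__":
    # Two news control rooms take turns; a third room is never scheduled.
    plan = rotation_schedule(["CR1", "CR2"], date(2024, 1, 1), 4)
    for day, room in plan:
        print(day.isoformat(), room)
    print("Not used recently:", stale_rooms(plan, ["CR1", "CR2", "CR3"], date(2024, 1, 4), 2))
```

The second function is the more useful one in practice: a control room that shows up in the list is one that may have degraded without anybody noticing.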