SPOF

Never put all your eggs in one basket  — Don Quixote

Diversify your investments -- any investment advisor

A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working. SPOFs are undesirable in any system with a goal of high availability or reliability, be it a business practice, software application, or other industrial system. -- Wikipedia

Or a boat. So then, why do marine engineers invent a network that, when it fails, will take down the boat's entire system of electronics? Mind you, this is not a modern network that can be engineered with redundant segments and backup routers and switches so that there is no SPOF. No, this is a network based on RS422, a prehistoric wire protocol that was never intended for networking. It is insane. To be fair, Raymarine's version of it (Seatalk) has been tightened up so that, on a good day, the thing actually works. But it is the principle of it that gets me. It is such bad engineering that anyone can understand why you shouldn't base all your electronics on it.

Recently, it seems that good days only happen about two days out of three. On the third day the net doesn't work. It happened again the other day, just after I gybed in fairly heavy weather in Storm Bay, which meant I was suddenly without an autopilot. When single-handed, as I was, the autopilot is not just nice to have, it's essential. You cannot steer the boat AND change sails at the same time. So I had a tedious trip back to Hobart, stuck behind the helm. When I got back into calmer water I was able to balance the boat well enough to leave the wheel, get the sails down and motor home, but it was not an enjoyable outing.

This is an intermittent problem — it comes and goes. This time though it seemed to be stuck in 'comes'. Good, at least I would have some chance of finding it.

I suspected the wind instruments. Why? Because I'm an idiot and because they are furthest away, at the top of the mast and quite exposed. So I ferreted around in the bilges trying to find where the wind instruments connect to the network. Finally, I discovered a tiny little black box (yep) at the foot of the mast with three wires going into it, one of which might be the wind data. I disconnected it and switched on the network. No difference, so it wasn't the wind instruments. It had taken about an hour to trace the wiring and deduce which instrument belonged to which wire. I now had no choice but to repeat the process for every transducer and display connected to the network -- about a dozen items all up. I was down to the last three items on the list when my brain snapped into gear and I realized what the problem was. No prizes for guessing that it was the grungy piece of crap depicted in the photo above.

This is an old GPS that I no longer use and that I had stupidly overlooked simply because it's no longer an active part of the system. But it was still connected!

Jubilation at having found the problem was tempered by the fact that I had missed the obvious culprit, and wasted most of a day disassembling the wrong end of the network. Never mind, I now knew where everything connected. There is no doubt that the network will fail again some day and this knowledge might then help me find the problem faster.


4 Comments

  • Ah yes, you learn a lot more by looking at all the wrong things first. My usual modus operandi!

  • If this came out of my kiln I could sell it as a piece of art! SPOF-art!
    Anjea will be a Gem when you have finished with her.

  • Dave, with your documentation background I'm sure you are clearly documenting where all these transducers and black boxes connect to each other for future reference.

    If you were a golfer you would know that the golf ball from your last crap shot will eventually be found in the last place you or your playing partners look. ;-) A lot like tracking down elusive bugs in code actually.

  • Nev, this might surprise you but yes I certainly have documented it!
