There are multiple things that can and do go wrong with your Internet connection. Let's have a look:
The first major issue is power. Power failures are common and can occur along any part of the Internet path as well as at the end points. Data centres have sophisticated power backup systems that include generators and online uninterruptible power supplies; these are pretty solid. On the home or business side, that may not be the case. Along the path, a service provider will have points of presence with varying degrees of power protection. Often there is power backup, but it's poorly managed. As an example, in our area the ISP only realizes the tank on its diesel generator is empty when customers phone to complain about their links failing. They do not use any type of Internet of Things sensor to proactively manage these failures. Ironically, you'll find the significant majority of ISPs do not!
The first thing to consider is buying an inverter if your requirements are greater than 300W. Geewhiz has a great unit based on Mecer. If you only need to power the ONT, Edge and access point, then one of the micro-UPS units from Ultralan will do the trick.
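The sizing decision above boils down to a simple power budget: add up the wattage of everything you need to keep alive. A minimal sketch, with illustrative device wattages assumed for the example (check the labels on your own equipment):

```python
# Rough power budget to decide between a micro-UPS and an inverter.
# The device names and wattages below are illustrative assumptions,
# not measured figures.
devices = {
    "ONT": 10,           # optical network terminal
    "SD-WAN edge": 15,
    "access point": 12,
}

total_watts = sum(devices.values())

# Threshold from the article: above ~300W, look at an inverter.
if total_watts > 300:
    print(f"{total_watts}W load: consider an inverter")
else:
    print(f"{total_watts}W load: a micro-UPS should suffice")
```

With only the ONT, edge device and access point on the list, the load comes in well under 300W, which is why a micro-UPS is enough for that case.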
The next issue is the last mile link itself. Many people and businesses have only one link. Often businesses attempt to improve the quality and uptime of that single last mile link by implementing a so-called business service level agreement. These cost significantly more and typically have two attributes:
- north and south access to the client premises;
- a commitment to prioritize response and repair.
The problem with north and south client access is that it solves the wrong problem. The access path to the point of presence typically becomes common within a city block of the client premises, and 80% of breaks happen closer to the point of presence, not the client premises. This north and south access is often not rebated or discounted, meaning you pay double. The result is that you pay 100% more to solve 20% of the problem. It's not a good return on investment.
The commitment to prioritize response and repair is also an empty promise. Firstly, the repair process and equipment for business and non-business customers are exactly the same. Secondly, the response commitment is not rational. Assume you have 1 business customer for every 500 consumer customers. When there is an outage, will you prioritize the 1 business customer before you fix the 500 consumer customers? No, especially if there is a common cause to the problem or outage. In practice, consumer customers are fixed just as fast as business customers, if not faster. There is no technical or service difference between a business and a consumer fibre product.
The real contributing factor to fibre pricing is the build, and most of that lies in the civil work. That's it, but operators often average out the build cost into a standard billable price. That average applies regardless of whether the customer is a business or not.
Multiple operators and service providers
The better way to achieve last mile resilience is to have two separate operators. An example is a premises that has both Vumatel and Openserve; a link from each of those provides resilience. Another alternative is a fixed wireless solution (not mobile like LTE). LTE is really a last resort for any type of business solution. How often have you seen a shop or food trader standing on a chair, waving their speedpoint around, trying to get a signal to make a payment? It sucks. Fixed wireless is more reliable and has predictable service delivery. It is perfect to use as a backup for fibre, as the installation typically uses better antennas, radios and networking kit.
The best solution for links is to have one in the ground, one in the air and one holding each of those hands.
Outages and brownouts
The issue with having two links is that you need hardware and software capable of managing the links in a manner that improves stability. Here a hub and spoke solution is key, as a mesh solution is slow and unpredictable. Examples of the latter are the clumsy load balancing solutions deployed by the firewall and mainstream router vendors. They provide a really poor user experience when handling brownouts and are slow to detect outages. Brownouts are often missed entirely.
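The reason brownouts get missed is that a basic up/down check sees a degraded link as "up". A sketch of the idea, assuming each link is probed regularly and scored on latency and loss (the thresholds here are illustrative assumptions, not figures from any vendor):

```python
# Sketch: classify a link as up, brownout, or down from recent probes.
# A brownout is degraded latency/loss on a link that still answers,
# which a simple up/down check would report as healthy.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Probe:
    latency_ms: Optional[float]  # None means the probe timed out (loss)

def link_state(probes,
               loss_threshold=0.2,
               latency_threshold_ms=150.0):
    """Return 'up', 'brownout', or 'down' for a window of probes."""
    lost = sum(1 for p in probes if p.latency_ms is None)
    loss_ratio = lost / len(probes)
    if loss_ratio == 1.0:
        return "down"            # nothing answered at all
    answered = [p.latency_ms for p in probes if p.latency_ms is not None]
    avg_latency = sum(answered) / len(answered)
    if loss_ratio > loss_threshold or avg_latency > latency_threshold_ms:
        return "brownout"        # degraded but not dead: steer traffic away
    return "up"

# Example window: 30% loss and latency spikes, but the link still answers.
probes = [Probe(40), Probe(None), Probe(300), Probe(None), Probe(45),
          Probe(50), Probe(None), Probe(42), Probe(48), Probe(400)]
print(link_state(probes))  # brownout
```

The design point is that the classifier has three states, not two: traffic can be steered away from a browning-out link before it fails completely, which is what separates a proper SD-WAN from a simple failover pair.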
A good SD-WAN implementation, such as the one provided by Fusion, ticks all the boxes to improve the link experience and solve the problems highlighted above.
However, that is not all. The ISP's broader operations are also important. These include peering. It is crucial that multiple peering points and upstream transit providers are available. As an example, a degraded peering session from the ISP to a cloud provider will adversely impact the experience. ISPs only monitor their own networks and nothing outside of them. Just as with the lack of Internet of Things sensors mentioned above, ISPs have no ability to review the traffic analytics that provide insight into poor and adverse peering sessions. They often only do a "moment in time" investigation, which is meaningless: unless the problem is happening while the investigation is conducted, they come up empty handed. A significant number of user experience problems are micro-bursts that can only be detected by trending over multiple hours, days and weeks. Within most ISPs the latter is lacking.
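A toy illustration of why trending catches what a spot check misses, assuming per-second throughput samples are retained over a long window (the sample values and the 3x-average threshold are made up for the example):

```python
# Sketch: micro-bursts only show up when samples are compared against
# the long-run average. A single "moment in time" check is overwhelmingly
# likely to land between bursts and see nothing wrong.

def find_microbursts(samples_mbps, factor=3.0):
    """Return indices of samples that spike well above the average."""
    avg = sum(samples_mbps) / len(samples_mbps)
    return [i for i, s in enumerate(samples_mbps) if s > factor * avg]

# Illustrative window: mostly quiet traffic with two one-second bursts.
samples = [20] * 100
samples[30] = 900   # micro-burst at t=30s
samples[70] = 850   # micro-burst at t=70s

print(find_microbursts(samples))  # [30, 70]
```

In this toy window a spot check has a 2-in-100 chance of landing on a burst, which is why investigations conducted after the fact come up empty handed unless trended data has been kept.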
Finally, the elephant in the room is "fat fingers." In aviation terms, it's known as pilot error: the human responsible for a network task makes a mistake and causes an outage. The number of these outages is significant. With SD-WAN, although the risk is not fully mitigated, the probability is lower because tasks are automated. Fewer "fat fingers" poking the pie.
SD-WAN gives the client control of their destiny and provides the ability to keep the ISP honest. It's transparent and provides a workable mechanism for improving uptime.
** This article was originally published over on LinkedIn: The things that go wrong with your Internet connection