The things that go wrong with your Internet connection

 

Several things can and do go wrong with your Internet connection. Let's have a look:

Power

The first major issue is power. Power failures are common and can occur along any part of the Internet path as well as at the end point. Data centres have sophisticated power backup systems that include generators and online uninterruptible power systems. These are pretty solid, but on the home or business side that may not be the case. Along the path, a service provider will have points of presence with varying degrees of power protection. Often there is power backup but it's poorly managed. As an example, in our area the ISP only realizes the tank on their diesel generator is empty when customers phone to complain about their links failing. They do not use any type of Internet of Things sensor to proactively manage these failures. Ironically, you'll find that the significant majority of ISPs do not!

The first thing to consider is buying an inverter if your requirements are above 300W. Geewhiz has a great unit based on Mecer. If you only need to power the ONT, Edge and access point, then one of the micro-UPS units from Ultralan will do the trick.

Last mile

The next issue is the last mile link itself. Many people and businesses have only one link. Businesses often attempt to improve the quality and uptime of that single last mile link by taking out a so-called business service level agreement. These cost significantly more and often have two attributes:

  • north and south access to the client premises;
  • a commitment to prioritize response and repair.

The problem with north and south client access is that it practically solves the wrong problem. The access paths to the point of presence typically become common within a city block of the client premises. 80% of breaks happen closer to the point of presence than to the client premises. This north and south access is often not rebated or discounted, meaning you pay double. The result is that you pay 100% more to solve 20% of the problem. It's not a good return on investment.
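The arithmetic behind that conclusion can be sketched in a few lines. The 100% price uplift and the 20% share of breaks come from the paragraph above; the function name is a hypothetical helper for illustration only.

```python
def uplift_per_point_of_cover(price_uplift_pct: float, breaks_covered_pct: float) -> float:
    """Extra cost (in % of the base price) paid per 1% of breaks protected against."""
    if breaks_covered_pct <= 0:
        raise ValueError("breaks_covered_pct must be positive")
    return price_uplift_pct / breaks_covered_pct


# Figures from the text: double the price (100% uplift) to cover the 20% of
# breaks that happen close to the client premises.
ratio = uplift_per_point_of_cover(price_uplift_pct=100, breaks_covered_pct=20)
print(f"{ratio:.1f}% extra cost for every 1% of breaks covered")
```

Put differently, each percentage point of protection costs five percentage points of extra spend, while the remaining 80% of breaks are not covered at all.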

SLA

The commitment to prioritize response and repair is also an empty promise. Firstly, the repair for business and non-business customers uses exactly the same process and equipment. Secondly, the response commitment is not rational. Assume you have 1 business customer for every 500 consumer customers. When there is an outage, will you prioritize the 1 business customer before you fix the 500 consumer customers? No, especially if the outage has a common cause. Practically, the consumer customers are fixed just as fast as the business customers, if not faster. There is no technical or service difference between a business and a consumer fibre product.

The real contributing factor to fibre pricing is the build, and most of that lies in the civil work. That's it, but operators often try to average out the build cost to a standard billable price. That average applies irrespective of whether it's a business or not.

Multiple operators and service providers

The better way to achieve last mile resilience is to have two separate operators. An example is a premises that has both Vumatel and Openserve. A link from each of those will provide resilience. Another alternative is to use a fixed wireless solution (not mobile like LTE). LTE is really a last resort for any type of business solution. How often have you seen a shop or food trader standing on a chair and waving their speedpoint around, trying to get a signal to make a payment? It sucks. Fixed wireless is more reliable and has predictable service delivery. It is perfect as a backup for fibre, as the installation typically uses better antennas, radios and networking kit.
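Why two separate links help can be illustrated with the standard formula for parallel availability. This is a rough sketch: the 99% figures are hypothetical and not measurements of any operator, and the formula assumes the links fail independently, which is exactly why they should not share a path or a point of presence.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(*link_availabilities: float) -> float:
    """Probability that at least one link is up, assuming independent failures."""
    all_down = 1.0
    for availability in link_availabilities:
        all_down *= 1.0 - availability
    return 1.0 - all_down


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours per year with no working link."""
    return (1.0 - availability) * HOURS_PER_YEAR


single = 0.99                              # hypothetical fibre link
dual = combined_availability(0.99, 0.99)   # hypothetical fibre plus fixed wireless

print(f"single link: {downtime_hours_per_year(single):.1f} hours down per year")
print(f"two links:   {downtime_hours_per_year(dual):.1f} hours down per year")
```

Under those assumptions the expected downtime drops from about 87.6 hours a year to under an hour. If both links ride the same trench or terminate at the same point of presence, the independence assumption fails and the real gain is far smaller.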

The best solution for links is to have one in the ground, one in the air and one holding each of those hands.

Outages and brown outs

The issue with having two links is that you need hardware and software capable of managing the links in a manner that improves stability. Here a hub and spoke solution is key, as a mesh solution is slow and unpredictable. Examples of the latter are the clumsy load balancing solutions deployed by the firewall and mainstream router vendors. They provide a really poor user experience when handling brown outs and are slow in detecting outages. Brown outs are often missed.
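The difference between an outage and a brown out can be sketched with a simple classifier over a window of probe results. This is an illustration only and not how any particular SD-WAN product works; the function name and both thresholds are assumptions chosen for the example.

```python
from typing import Optional, Sequence


def classify_link(
    samples: Sequence[Optional[float]],
    loss_brownout: float = 0.02,     # assumed threshold: more than 2% probe loss
    rtt_brownout_ms: float = 150.0,  # assumed threshold: average RTT above 150 ms
) -> str:
    """Return 'up', 'brownout' or 'outage' for a window of probe results.

    Each sample is a round-trip time in milliseconds, or None if the probe was lost.
    """
    if not samples:
        raise ValueError("need at least one probe sample")

    replies = [s for s in samples if s is not None]
    if not replies:
        return "outage"  # nothing came back at all

    loss = 1.0 - len(replies) / len(samples)
    avg_rtt = sum(replies) / len(replies)

    # The link still answers, but badly: this is the case that gets missed.
    if loss > loss_brownout or avg_rtt > rtt_brownout_ms:
        return "brownout"
    return "up"


print(classify_link([20.0] * 10))                # healthy window
print(classify_link([20.0] * 8 + [None, None]))  # 20% loss
print(classify_link([None] * 10))                # no replies
```

A check that only asks "is the link up?" would treat the second window as healthy, which is why brown outs slip through simple failover logic.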

A good SD-WAN implementation, such as the one provided by Fusion, ticks all the boxes to improve the link experience and solve the problems highlighted above.

ISP operations

However, that is not all. The ISP's wider operations are also important. These include peering. It is crucial that good peering points and upstream transit providers are available. As an example, a degraded peering session from the ISP to a cloud provider will adversely impact the experience. ISPs typically monitor only their own network and nothing outside it. Just as with the lack of Internet of Things sensors mentioned above, ISPs have no ability to review traffic analytics that provide insight into poor peering sessions. They often only do a "moment in time" investigation, which is meaningless: unless the problem is happening when the investigation is conducted, they come up empty-handed. A significant number of user experience problems are micro-bursts that can only be detected by trending over multiple hours, days and weeks. Within most ISPs, that trending is lacking.
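A toy example shows why a "moment in time" check misses what trending catches. The latency series below is made up for illustration, and the function name and the three-times-median threshold are assumptions, not an established detection method.

```python
from statistics import median


def find_bursts(latency_ms, factor=3.0):
    """Return the indexes of samples that exceed `factor` times the series median."""
    baseline = median(latency_ms)
    return [i for i, value in enumerate(latency_ms) if value > factor * baseline]


# Hypothetical per-minute latency samples for one hour, with two short spikes.
series = [20.0] * 60
series[17] = 250.0
series[43] = 310.0

# "Moment in time" check: only the latest sample is inspected.
print("spot check sees:", series[-1], "ms")

# Trending over the whole hour reveals the spikes the spot check missed.
print("bursts at minutes:", find_bursts(series))
```

The spot check reports a healthy link because nothing is wrong at the instant it runs, while the trended view over the same hour flags both spikes.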

Fat fingers

Finally, the elephant in the room is "fat fingers." In aviation terms, it's known as pilot error. The human responsible for a network task makes a mistake and causes an outage. The quantity of these types of outages is significant. With SD-WAN, although the risk is not fully mitigated, the probability is lower as tasks are automated. Fewer "fat fingers" poking the pie.

Uptime

SD-WAN gives the client control of their destiny and provides the ability to keep the ISP honest. It's transparent and provides a workable mechanism for improving uptime.

* Ronald works connecting Internet inhabiting things at Fusion Broadband

** This article was originally published over on LinkedIn: The things that go wrong with your Internet connection
