
Checklist for network architecture documentation

Michael Morris blogged about a great checklist for network architecture documentation here. The checklist follows:

  • Introduction and Business Drivers - this describes why you are writing the network architecture and what the main business drivers for the network are. Availability and consistent performance are a given, but what else drives your network? Incredibly high availability for JIT manufacturing? Or maybe flexibility for fast growth?
  • Global Network Design - a high-level description, with diagrams, of how the network goes together. People need to understand the big picture to see how all the little parts go together.
  • Site Tiers - This is a key part of the network architecture. How do you categorize your sites? Maybe into five Tiers (1-5)? Maybe just three (Big, Medium, Small)? Either way, you need a way to classify each of your sites so that an appropriate Network Design Template can be used. You should also include Site Decision Matrices to help people decide which tier a site belongs to, perhaps based on number of employees, types of applications used, or business criticality.
  • Network Design Templates - diagrams for each Site Tier ("1-5" or "Big, Medium, Small", etc.) that can be used repeatedly to build homogeneous networks at each site.
  • Bill of Materials - the actual BOM that is used to order the equipment for each Site Tier. It is very important that this BOM matches your Network Design Templates exactly, since one missing piece of hardware can make a world of difference during installation. Your VAR or Cisco Sales can provide a detailed BOM for you based on your Network Design Templates and standards. Also include any other hardware used in the network, like UPSs and cabling.
  • WAN Options - the WAN technologies that the network architecture will support, for example, MPLS, private lines, and IPsec tunnels. This list should include the WAN technologies that will likely be needed. Even if your entire network is using MPLS, you should have a few other technologies supported, since needs can arise that a single WAN technology cannot solve.
  • Site Tracking List - once you have chosen a Site Tier, applied a Template, and picked appropriate WAN technologies for the site, you need to track all these decisions. This way you know how the NYC site is built - maybe as a Tier-2 site with dual MPLS circuits.
  • Internet Access - how do people get outbound Internet access? What inbound access to company resources is provided from the Internet?
  • Routing - since routing is arguably the key technology in a network, you need a section explaining how routing works. Sections should include normal and backup traffic flows, global routing, IGPs, BGP, static routes, and default routing.
  • Configuration Standards - now that you have all the standards defined, you need actual device configuration standards. You should have exact configs for SNMP, booting, DHCP, HSRP, BGP, OSPF, ACLs, interfaces, VLANs, LAN switching, and so on. In short, if it's a command on your equipment, have a written configuration standard for it.
  • IP Addressing - how your global IP addressing and summarization are done, the policies for subnet allocation, and site-specific IP subnetting standards. For example, if you assign a /22 to a Tier-3 site, how is that /22 divided into subnets? This is key to ensuring IP addressing is the same at all sites.
  • QoS - this can be a huge section that describes all aspects of QoS in your network: LAN QoS, IPT, WAN QoS, QoS with MPLS carriers, and where the actual applications fit into the QoS model. Take your time on this section, since QoS is a key business enabler but can also cause huge problems if implemented incorrectly.
  • Wireless LAN - how does the WLAN work? (NOTE: this may require its own architecture document, like IPT and SANs, if your WLAN is not simple.)
  • Naming Conventions - these are more important than you may originally think. Spend some time to ensure your naming conventions are correct.
  • Software Standards - levels, versions, and feature sets of software used in the network, from IOS, to VPN concentrators, to firewalls. Each piece of equipment in use should have a software standard. You will also want to include a process for software maintenance and changes.
  • Best Practices - a list of best practices for your network; things you have learned over the years that are unique to your environment.
  • Network Architecture Review Board and Architecture Revision Process - these are the rules and guidelines on how you will make changes to all the stuff written above.
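
To make the IP Addressing item concrete, here is a minimal Python sketch of how a written standard for dividing a Tier-3 site's /22 might be expressed so that every site is carved up identically. The plan itself (the subnet purposes and their sizes), the name `TIER3_PLAN`, the function `carve_site_block`, and the 10.20.0.0/22 block are all assumptions made for illustration; none of them come from the checklist. The sketch handles IPv4 only.

```python
import ipaddress

# Hypothetical subnet plan for one site tier: (purpose, prefix length).
# The purposes and sizes are illustrative assumptions, not from the checklist.
TIER3_PLAN = [
    ("users", 24),
    ("voice", 24),
    ("servers", 25),
    ("wireless", 25),
    ("management", 26),
]


def carve_site_block(site_block, plan):
    """Allocate IPv4 subnets from site_block in a fixed order, largest first."""
    block = ipaddress.ip_network(site_block)
    cursor = int(block.network_address)
    allocations = {}
    # sorted() is stable, so equal-sized subnets keep their order in the plan.
    for purpose, prefixlen in sorted(plan, key=lambda item: item[1]):
        size = 2 ** (32 - prefixlen)
        # Round the cursor up to the next boundary that is valid for this size.
        cursor = (cursor + size - 1) // size * size
        subnet = ipaddress.ip_network((cursor, prefixlen))
        if not subnet.subnet_of(block):
            raise ValueError(f"{purpose} /{prefixlen} does not fit in {block}")
        allocations[purpose] = subnet
        cursor += size
    return allocations


if __name__ == "__main__":
    for purpose, subnet in carve_site_block("10.20.0.0/22", TIER3_PLAN).items():
        print(f"{purpose:<12}{subnet}")
```

Because the same plan is applied to every site block, the third octet offset of each subnet is the same everywhere. With this example plan the management subnet lands at 10.20.3.0/26, which leaves the rest of that /24 unallocated for growth.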

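The Site Tiers and Site Tracking List items can be sketched in the same way. The example below is only an illustration: the thresholds in `classify_site`, the three-tier scheme, the `SiteRecord` fields, and the template naming are hypothetical, and a real Site Decision Matrix would use whatever criteria your business cares about.

```python
from dataclasses import dataclass, field
from typing import List


def classify_site(employees: int, business_critical: bool) -> int:
    """Toy Site Decision Matrix; the thresholds are assumptions for illustration."""
    if business_critical or employees >= 500:
        return 1
    if employees >= 50:
        return 2
    return 3


@dataclass
class SiteRecord:
    """One row of a Site Tracking List (the fields are hypothetical)."""
    name: str
    tier: int
    template: str
    wan_links: List[str] = field(default_factory=list)


def build_record(name, employees, business_critical, wan_links):
    """Classify a site and record the decisions made for it."""
    tier = classify_site(employees, business_critical)
    return SiteRecord(
        name=name,
        tier=tier,
        template=f"tier-{tier}-template",
        wan_links=list(wan_links),
    )


if __name__ == "__main__":
    # Mirrors the NYC example: a Tier-2 site with dual MPLS circuits.
    print(build_record("NYC", employees=200, business_critical=False,
                       wan_links=["MPLS", "MPLS"]))
```

Keeping the decision rule and the tracking record next to each other makes it easy to see why a site was built the way it was, which is the point of the tracking list.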