On Site in Scotland: The Open Network
When I signed up four years ago to help support Straight Up Technologies at The Open in Muirfield, North Berwick, I didn't really know what to expect. I'd been building networks for over 20 years and had worked with hundreds of companies, helping them use Cisco technology to keep their businesses running, but never at a golf course during a live event.
I've now worked four consecutive Opens and I can tell you that the demands placed on this particular event network fly in the face of most network design best practices across all levels of the OSI stack. Allow me to elaborate.
The first thing you have to deal with is the elements: earth, wind, rain, and (sometimes) fire. When you're deploying any kind of electronics on a golf course, especially in Scotland by the sea (all Opens are played on links courses), you have to deal with lots of weather changes. The team at Straight Up has come up with weather-proof containers that give the equipment a decent chance of staying dry and cool. Most access points in enterprises don't need to worry about this kind of thing. See the picture below for an example of an AP under a grandstand.
The second thing to consider is that most corporate IT departments have some level of control over what is attached to the network, and generally understand the requirements well in advance. At The Open, due to the scale and dynamic nature of real-time productions, there isn't a lot of control. The network just has to be wherever the people need it: whether it's on a remote fairway where there is no power, or 100 feet in the air for a camera shot. We employ features like BPDU guard, port security, and DHCP snooping to keep things pretty safe, but we've seen people plug in their own switches, routers, DHCP servers, wireless gear, and firewalls because their requirements changed between their services order and the time of the event. There were almost 2,000 wired clients this year, along with close to 6,000 simultaneous wireless clients, with spikes in some years of 12,000 users in one day.
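To give a flavor of what that edge protection looks like, here's a minimal IOS-style sketch of the access-port features mentioned above. The interface range, VLAN list, and port-security limit are illustrative assumptions, not the event's actual configuration.

```
! Illustrative edge hardening on a Catalyst access switch (values assumed)
ip dhcp snooping
ip dhcp snooping vlan 10-29
!
interface range GigabitEthernet1/0/1 - 24
 switchport mode access
 switchport port-security
 switchport port-security maximum 2
 spanning-tree portfast
 spanning-tree bpduguard enable
```

With BPDU guard enabled, a rogue switch that sends a BPDU errdisables the port instead of destabilizing spanning tree, and DHCP snooping drops DHCP offers from any port not marked trusted.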
The third thing to think about is SCALE, both physical and logical. The network has grown every year I've done it, both in size and in the services it provides.
We deployed 220 switches, 535 access points, 260 IP phones, 345 Exterity IP multicast media streamers, 170 scoring terminals, and 144 point-of-sale terminals.
Remember, this is for a temporary network that gets shipped from the UK and US, deployed, and then repackaged and shipped back. Also consider the cabling plant. The fiber is run specifically for the event with just a portion of it left onsite for the next time The Open comes back to the course. There are thousands of twisted pair runs custom crimped and then torn out and recycled once the event is concluded. It's incredible. The network design below shows the scale of the event.
The fourth thing to understand is the sheer variability of everything. Power is generally provided by generators, and sometimes when power isn't available you have to resort to solar-charging a car battery connected to an AP. The picture below shows a tiny area of multiple generators powering the equipment on the monstrous TV compound. That's just power.
What were once empty fields are transformed over a short period of time into an entire “back of house” compound where the production is made real. Roads are created using miles of interlocked metal plates, and large buildings are created from plastic, and Portacabins and Bunkabins are brought onsite where the work gets done. Furthermore, tons of tents and buildings are created out of thin air for spectators to enjoy a cold drink and food while walking the course.
Power, wireless and wired networks have to be extended to those areas. Try spotting our antennas in this wide shot.
Here’s a close up of a small example of what has to be done in some cases. All the gear is shipped in from both the US and UK and you have what you have. When you are asked to add capacity at the last minute, you improvise, and it works.
When it starts to rain and the wind is blowing 30 miles per hour, things get flooded and all kinds of power anomalies happen. Our job is to react very quickly and get replacement equipment where it is needed. Our team has backups at the ready and people deployed in two centers around the course for quick recovery.
People have a tendency to just "plug stuff in," like APs and home routers. We've also seen people replicate our SSIDs, and when spectators attach to those imposters, their traffic goes nowhere. Our NOC gets the call, and our team has to figure out what's happening very quickly to ensure our network is providing the level of service our customers expect. We see a ton of alarms on our Cisco Prime Infrastructure console regarding rogue APs and interference, but there is really nothing we can do about it. There are just too many to shut them all down. We do what we can to verify the equipment and ensure it's being used for legitimate purposes.
Another thing that's quite variable are the user requests. A majority of the requirements are known before we get on site, however, we invariably get last minute requests that require custom cable runs or configuration. If someone wants an ice cream cart by the 14th hole, new cable has to be run from the closest switch which might be in a box under the grandstand like the one shown below.
This all brings us to the real technical stuff. Anyone building a modern network would never bridge 20 VLANs across 250 switches, but that's exactly what we have to do. We have constituents on our network who statically configure their gear, and the same subnet has to appear everywhere they physically have a presence. This can cause major issues when topology changes occur because of power issues, cable breaks, dirty fiber, flash floods, or wireless bridges getting misdirected by strong winds and buildings sinking into mud after heavy rain. You have to configure IGMP snooping in such a way that it doesn't flood when a topology change notification (TCN) occurs. If you don't, you get packet loss, which manifests itself in both picture breakup and device management issues. We've talked about trying to use MPLS to break up the L2, but that introduces another level of complexity and hardware/software requirements, combined with the complexity of on-the-fly end-station placement and VLAN assignment.
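On Catalyst switches, that TCN behavior can be tuned roughly as follows; this is a sketch, and the interface name and counts are assumptions rather than the production configuration.

```
! Illustrative IGMP snooping TCN tuning (values assumed)
! Ask snooping hosts to re-report membership quickly after a topology change
ip igmp snooping tcn query solicit
! Flood multicast for only one general query interval after a TCN
ip igmp snooping tcn flood query count 1
!
interface GigabitEthernet1/0/10
 ! Port facing an IP video endpoint: never flood to it on a TCN
 no ip igmp snooping tcn flood
```

The trade-off is that suppressing TCN flooding too aggressively can briefly black-hole receivers that haven't re-reported yet, so the query solicit is what keeps the re-learn window short.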
You have to make sure you prune your VLANs appropriately so your small edge switches aren't learning the MAC addresses of 6,000-15,000 wireless clients. That's why we use Cisco equipment with a central controller. Some "high-density" vendors put controllers into every block of 4, 8, or 16 APs, which means every VLAN a wireless user needs has to be trunked to every switch; with a network the size of The Open, that's a non-starter in terms of switch costs and the physical space we have. As the network has quadrupled in size over time, we've had to expand our DHCP scopes because of the large bridge domains, and we have to make reservations on the fly for things like network printers people bring onsite. You also can't set a very long DHCP lease, in case you have to change things quickly.
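As a sketch, trunk pruning and a short-lease scope might look like the following in IOS; the VLAN IDs, pool name, and addressing are assumed for illustration only.

```
! Illustrative trunk pruning: carry only the VLANs this edge switch serves
interface GigabitEthernet1/0/48
 switchport mode trunk
 switchport trunk allowed vlan 10,20,30
!
! Illustrative DHCP scope with a short lease for fast renumbering
ip dhcp pool WIRELESS-GUESTS
 network 10.20.0.0 255.255.240.0
 default-router 10.20.0.1
 lease 0 4
```

Here `lease 0 4` means 0 days, 4 hours, so a scope or gateway change propagates within hours instead of days.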
On the security side, we have to tune down our NAT xlate idle timers on our ASAs because of the number of devices people bring with them, especially when an unexpected number of people need "dirty IP" access and you only have one or two IPs from which to form a PAT pool. Nothing is static and many things have to be done on the fly, so it's imperative to design things in a flexible manner. We also keep a close eye on our top users to spot anyone misusing the bandwidth we have. In fact, last year at St Andrews, half of the Internet traffic going through London experienced an outage, and we had to react very quickly to reassure everyone the problem was further upstream. Using Cisco SourceFire software on our ASAs, we are able to identify individuals all the way down to username and MAC address and shut them down or limit them if they are misusing bandwidth.
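The xlate timer tuning itself is a one-liner on the ASA. The value and the PAT object below are illustrative assumptions, with documentation-range addresses rather than real event IPs.

```
! Illustrative ASA tuning: drop idle translations after 30 minutes
! (the default is 3 hours), freeing PAT ports for new devices sooner
timeout xlate 0:30:00
!
! Illustrative PAT for "dirty IP" clients behind the outside interface IP
object network DIRTY-INSIDE
 subnet 192.168.100.0 255.255.255.0
 nat (inside,outside) dynamic interface
```

With one or two public IPs, roughly 64,000 PAT ports per IP is the hard ceiling, which is why reclaiming idle translations quickly matters when thousands of personal devices show up.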
Hopefully this article provides a small glimpse into the magnitude of a production like The Open. The one constant through all this is the amount of teamwork and communication required to ensure a positive experience for everyone. The coordination of all the groups involved and the amount of time put into planning and execution is very difficult to measure. The stress can be overwhelming and the hours long, but in the end, our network passes THE ONE TRUE TEST.