Monday, August 24, 2009

NOC needs plug & go Ethernet

Everybody’s doing it: Ethernet is getting deployed on a large scale everywhere. I’ve had the chance to meet with NOC staff at several service providers recently, ranging from regional operators, to utilities, MSOs and multinational carriers. Whether for business services, wholesale Ethernet or wireless backhaul there’s a common focus: move from regional and one-off offerings to large-scale, full-footprint Ethernet deployments. We’re talking hundreds of endpoints instead of just a few, and it’s starting to take its toll on operations.

Invariably, the pain is the same for operators large and small – they have moved far beyond testing and trusting the technology itself, and the ability to rapidly scale Ethernet service offerings without excessive manual effort is now front and center. Caution: what I’ve heard might make you choke on your coffee. We’re talking a 40% success rate in service commissioning, misconfigured switches that merge management traffic with customer data, and full-fledged security breaches caused by mismatched VLANs. Oh, and the time Ethernet OAM went wild on an aggregation node and took down hundreds of cell sites. And the city-wide New York outage for a major operator, simply because standard operating procedures were overlooked.
I was sensing a trend (or maybe it was really hard to miss), so to get a bigger sampling I set up a survey on the EtherNEWS blog, and operators were quick to speak up.
Nearly 90% of respondents say Ethernet deployment automation is important or very important. Service providers are scrambling for a way to simplify the mechanics of getting E-Line and E-LAN services up and running in a reliable, repeatable way. Over half say ensuring error-free deployment is their biggest concern, followed closely by the need to configure QoS and validate that service performance is up to SLA specs. Interestingly, the cost and time required, and finding and training staff, rank as background issues. How can that be? I imagine it’s because if you get automation working, you can do much more with less staff, and training, cost and time drop out of the equation.
So quality and consistency are driving the need for a Plug & Play equivalent for Ethernet services – more accurately Plug & Go, or Plug & Run, since everyone’s tired of playing around with their Ethernet gear late into the overtime hours.
Are there any efforts emerging to standardize a quick, easy way to get Ethernet up? The closest parallel is probably the CableLabs DOCSIS cable modem self-registration standard, a key reason why cable operators were able to deploy home phone service and high-speed internet at the expense, largely with staff that had little experience with either. So is the MEF, the IETF or the IEEE up to something? Haven’t heard a whisper – but you can be sure that if the NOC folk have their say, they’ll be making a lot of noise very soon – just as soon as the fires are out and they see the light of day again.
Accedian Networks’ Plug & Go instant provisioning feature was inspired by these needs in the NOC. Learn all about this amazing technology by watching this short video.

Monday, August 17, 2009

The 3D, 4G mesh

Just in time to join the big summer sci-fi blockbusters is a bigger-than-life techno-drama for mobile operators: the 3D, 4G Mesh. Unfortunately it’s not entertainment, not even mildly entertaining: tackling sticky QoS issues is a serious dilemma for providers rolling out WiMAX & LTE backhaul. In a previous post I outlined how the move to intelligent, self-organizing networks (SONs) has created unprecedented performance challenges for 4G mobile backhaul. Towers communicating directly with each other to coordinate roaming hand-offs and to deliver and optimize user traffic have created an adaptive mesh-based network where the intelligence has been delegated to “empowered towers”.

However operators choose to connect their cell sites together, whether through a direct mesh or traditional hub-and-spoke design, it’s the tower-to-tower latency, jitter, packet-loss and prioritization that counts as users roam between cells while watching District 9. From the user-experience perspective, the network is a mesh regardless of how the data gets moved around. And this is where the mind-bending fun begins.
Enter the 3rd Dimension
The word exponential is not common in backhaul networking. We’re much more comfortable thinking about tidy point-to-point circuits, or even 2D “clouds” with data in, data out. But packet-based applications have gone beyond this to the third dimension: quality of service tiers (service classes) stack up on the network. Priority traffic associated with real-time applications like VoIP and video is latency and jitter sensitive, and needs special handling so calls don’t go robotic. And control-plane traffic is just as critical as we roam on the highway and our conversations jump tower-to-tower within milliseconds. Stack up to 8 classes of service on the mesh interconnectivity of 4G backhaul and you’ve got a really interesting mess – in fact an exponential mesh mess.
To illustrate, this simple diagram shows only 4 towers and a Mobile Switching Center, connected through an Evolved Packet Core (EPC) to PSTN and Internet gateways. The most basic configuration would be 3 classes of service between each site (control plane, real-time applications & best effort). The result? 54 unique service flows to maintain (27 flows in each direction). Now take a more realistic scenario: 100 towers talking to each other while homing to an MSC, and 5 classes of service. The damage? 49,510 unique flows (I’ll let you verify the math)!
In these 49,510 flows, at least 40% (19,804) will be high-priority streams that are particularly QoS sensitive. They’ll need to be monitored for latency and jitter, packet loss, throughput and availability in real-time. Not monitoring is not an option: if something went wrong, how would you even know where to start troubleshooting when you’ve got almost 20,000 flows to sift through? And the other 30,000 or so? They also need to be monitored, at the very least for packet loss and continuity – because you want to know if the whole pipe went down or just one service.
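For anyone who does want to check the scale, here is a minimal Python sketch of the flow arithmetic. It assumes one particular topology: every tower meshed with every other tower, plus one link from each tower to the MSC, with one flow per class of service in each direction. The function names and that link count are assumptions for illustration only; the exact totals depend on which links the diagram counts, so they may differ somewhat from the figures quoted above.

```python
def mesh_links(towers: int) -> int:
    """Links under the assumed topology: full tower-to-tower mesh
    plus one tower-to-MSC link per tower."""
    tower_pairs = towers * (towers - 1) // 2
    msc_links = towers
    return tower_pairs + msc_links


def unique_flows(towers: int, classes_of_service: int) -> int:
    """One flow per link, per class of service, per direction."""
    return mesh_links(towers) * classes_of_service * 2


def high_priority_flows(towers: int, priority_classes: int) -> int:
    """Flows belonging to the QoS-sensitive classes only."""
    return mesh_links(towers) * priority_classes * 2


if __name__ == "__main__":
    for towers, cos in [(4, 3), (100, 5)]:
        print(f"{towers} towers, {cos} classes: {unique_flows(towers, cos):,} flows")
    # Assumption: 2 of the 5 classes are high priority (the 40% share).
    print(f"High priority at 100 towers: {high_priority_flows(100, 2):,} flows")
```

Whatever the exact link count, the tower-to-tower term grows with the square of the number of towers, which is why the totals climb so quickly.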
So you’re the operations guy (who definitely is watching a different kind of widescreen content in the NOC). Where do you start? The approach most operators are using clones the mesh itself with a service assurance overlay. Network Interface Devices (NIDs) capable of monitoring up to 100 flows each in a full-mesh setup are installed at each cell site and the MSC. Automation gets them all talking and watching each flow, and a centralized monitoring system crunches mountains of per-second data, boiling it off into a dashboard view that makes sense of this 3D, 4G world.
Sometimes it’s interesting to know what’s happening behind the scenes: the making of one of the most amazing networking stories of our time.

Tuesday, August 4, 2009

LTE backhaul: Think twice

When you start digging into LTE, you find it’s a pretty amazing technology – not just the speeds and feeds, but the way it was thought out from the bottom up. With technology migration always a painful problem for operators, LTE was designed to simplify deployment and maintenance and to reduce operating costs with the concept of Self-Organizing Networks (SONs) running over a flat IP infrastructure. Base stations are much more sophisticated than in 3G and other wireless models: they are responsible for managing their radios, optimizing service quality, discovering neighboring cells, and connecting themselves to the backhaul network.

But perhaps the most important change in LTE base stations (or “evolved Node Bs” as they are known) is their responsibility for managing the service itself. Where 2G and 3G networks rely on centralized radio network and base station controllers (RNC/BSC), LTE goes without: each tower communicates with its nearest peers to hand off users as they roam from cell to cell. Both control plane (roaming and call control) and user data traffic pass directly between towers, connected in a mesh-style backhaul network. This distributed networking and intelligence can reduce latency and free core capacity by sending data directly to its destination without passing through a centralized aggregation point.
Sounds wonderful – a clear advance in mobile mechanics that takes full advantage of the advanced routing capabilities of today’s MPLS infrastructure – but like so many things, great ideas quickly run into roadblocks where the rubber hits the road. Here’s the tricky part: with LTE’s rates planned to ramp to 150 Mbps per user (!), the backhaul network has to be future-proofed from day one. This means a lot of fiber to towers that are mainly fed today by a bundle of T1s over copper. In 3G this problem opened a whole new market: alternative access vendors (AAVs) such as cable MSOs, fiber-rich CLECs, pure-play backhaul providers and even utilities stepped up to fill the gap. Wholesaling backhaul is the name of the game in 3G, where the fastest deployments are ramping on the networks of others.
But this scheme doesn’t fit so well into LTE’s full-mesh architecture. Backhaul is traditionally provided over point-to-point links: AAVs deliver Ethernet in, Ethernet out, logically connecting each tower to a centralized switching center. The concept of tower-to-tower communication is beyond their domain, and their control. I’ve never heard of wholesale MPLS backhaul. Imagine the complexity of getting everything talking. If it sounds like a major headache, it is, and no amount of “self-organization” will help.
So operators rolling out LTE have a difficult choice: go it alone with their own MPLS network (if they have one), or lease backhaul service based on point-to-point Ethernet. Towers can still talk to each other, but all the traffic that would just hop to the next cell now has to loop through the switching center just like in the good old days.
Twice the Trouble
Problem is, this has a serious impact on latency as the data path stretches out over a much longer distance. This added delay, combined with decentralized roaming control managed by the base stations, spells out dropped and choppy calls… unless the AAVs deliver super-low latency. It’s hard enough delivering Ethernet backhaul with the tight performance demands of 3G, where tower to switching center latency needs to be in the single-digit milliseconds. Bad news for AAVs: SLAs for LTE will cut this spec in half. Since control plane traffic has to pass from one cell to another, it effectively doubles the path length of what used to be centralized commands sent directly to the towers. So packets have to get there twice as fast. This isn’t a bandwidth issue – increasing capacity won’t do much for latency. This is more like a speed of light, switching performance and network optimization issue.
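To put rough numbers on the speed-of-light part of the argument, here is a minimal Python sketch comparing a direct tower-to-tower hop with the same traffic looped through the switching center. It assumes light travels through fiber at roughly 200 km per millisecond and ignores switching and queuing delay entirely; the distances and function names are hypothetical, chosen only for illustration.

```python
# Assumption: propagation speed in fiber is about two thirds of the
# speed of light in a vacuum, i.e. roughly 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def propagation_ms(distance_km: float) -> float:
    """One-way propagation delay over a fiber path of the given length."""
    return distance_km / FIBER_KM_PER_MS


def direct_path_ms(tower_to_tower_km: float) -> float:
    """Delay when neighboring towers are connected directly."""
    return propagation_ms(tower_to_tower_km)


def hairpin_path_ms(tower_a_to_center_km: float, center_to_tower_b_km: float) -> float:
    """Delay when tower-to-tower traffic loops through the switching center."""
    return propagation_ms(tower_a_to_center_km + center_to_tower_b_km)


def per_leg_budget_ms(tower_to_tower_budget_ms: float) -> float:
    """With two tower-to-center legs in the path, each leg gets half the budget."""
    return tower_to_tower_budget_ms / 2


if __name__ == "__main__":
    # Hypothetical layout: towers 5 km apart, each 50 km from the center.
    print(f"direct:  {direct_path_ms(5):.3f} ms")
    print(f"hairpin: {hairpin_path_ms(50, 50):.3f} ms")
    print(f"per-leg budget for a 5 ms target: {per_leg_budget_ms(5):.1f} ms")
```

Propagation is only one component of the total; switching and queuing delay add to both paths, and they are incurred on each leg of the looped route.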
So what’s a cellco to do? Early LTE deployments precisely echo the backhaul dilemma: only the largest operators with a significant MPLS footprint are in the game, and outsourced backhaul will only come into play where their own network can’t reach… and when it does, it’s a sure bet it’ll come with some of the tightest SLAs telecom has ever seen. The AAVs that can rise to the challenge are sure to win big, because there won’t be too many stepping up to the plate.