For 2008 IEEE CQR
www.huawei.com
Design for New Reliability Challenges in Telecom
Gong Xuewen, Meng Liming
Huawei N.American MKT CTO office
Page 2
Reliability: No. 1 concern for telecom carriers
According to statistics from seven large telecom operators (including BT, AT&T, and NTT),
the top three factors affecting network operations are reliability, availability, and fault
recovery. These are the main focus areas in network reliability.
Source: Telemark, 2003
Page 3
Network reliability: more about service and E2E reliability and availability
How to measure E2E and service reliability and availability?
• Service reliability and availability
• Network reliability and availability
• Equipment reliability and availability
Page 4
The network enters the IP era: how to achieve high reliability and availability?
A reliable IP network ensures the delivery of user services:
• Fault detection and recovery are the two main issues in reducing OPEX.
• Fault isolation (localization) is a key issue in improving network reliability.
A complete reliability and availability analysis should consider the equipment
level, the network level, and the network service level, so that it correctly describes
network reliability for different types of services.
[Figure: MPLS network with CE, PE1 to PE4, and P nodes; caption: E2E QoS/QoE Planning and Deployment]
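As a rough illustration of how equipment-level figures roll up into an end-to-end figure, the sketch below multiplies the availabilities of the elements a service traverses in series (for example, a CE-PE-P-P-PE-CE path in an MPLS network). The per-node availability values, node names, and function name are assumptions chosen for illustration, not measurements from this deck, and the model assumes independent failures.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60  # assuming a 365-day year


def series_availability(availabilities):
    """Availability of elements in series: all must be up (independent failures assumed)."""
    return prod(availabilities)


# Hypothetical per-node availabilities for a CE-PE-P-P-PE-CE path (illustrative only).
path = {
    "CE-a": 0.9999,
    "PE1": 0.99999,
    "P-1": 0.99999,
    "P-2": 0.99999,
    "PE3": 0.99999,
    "CE-b": 0.9999,
}

e2e = series_availability(path.values())
downtime_min = (1 - e2e) * MINUTES_PER_YEAR

print(f"E2E availability: {e2e:.6f}")
print(f"Expected downtime: {downtime_min:.1f} minutes/year")
```

Under these assumed numbers the end-to-end figure is lower than that of any single node on the path, which is one reason the analysis has to be carried out per level rather than per box.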
Page 5
Another reliability issue for the IP era: how to optimize network reliability and availability?
[Figure: IP router over SONET/SDH rings over DWDM, compared with IP over DWDM, where the SONET/SDH layer is eliminated]
• How to ensure that IP over DWDM is as reliable as legacy SDH/SONET?
• How to coordinate the reliability mechanisms among different layers:
integrated reliability design in all layers instead of just one layer? (1+1<2)
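One possible reading of "1+1<2" is that protection mechanisms do not simply add up when they depend on a common resource. The sketch below compares two redundant paths assumed fully independent with the same two paths sharing one element (for example, a common fiber duct). All values and names are hypothetical, and this is an interpretation for illustration rather than the authors' model.

```python
def parallel(a, b):
    """Two redundant elements: down only if both are down (independent failures assumed)."""
    return 1 - (1 - a) * (1 - b)


def series(a, b):
    """Two elements that must both be up (independent failures assumed)."""
    return a * b


# Hypothetical availabilities, for illustration only.
working_path = 0.999
protection_path = 0.999
shared_element = 0.9995  # something both paths depend on

independent = parallel(working_path, protection_path)
with_shared_risk = series(shared_element, parallel(working_path, protection_path))

print(f"Paths fully independent:  {independent:.6f}")
print(f"Paths sharing an element: {with_shared_risk:.6f}")
```

In this toy model the shared element caps the result: no amount of redundancy above it can lift availability past the availability of the shared element itself.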
Page 6
Equipment and networks are changing...
How to keep meeting five nines?
[Figure: more commercial chipsets are replaced by ASICs; component labels: FPGA, NP, ASIC, DSP; chart label: Internet 2x every 12 months]
• Hardware becomes more complex
• Software overtakes hardware
• How to make ASICs as reliable as commercial chips?
• ASIC-level reliability design to make the system more reliable.
• New systematic reliability architecture...
• ...
• Simple means reliable, so how to keep five nines for such complex systems?
• Fault-tolerant design
• Software reliability design
• ...
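For reference, the arithmetic behind "five nines" can be sketched as follows: an availability of 99.999% leaves roughly five minutes of downtime per year. The helper name is an assumption for illustration, and a 365-day year is assumed.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a 365-day year


def downtime_minutes_per_year(nines):
    """Allowed downtime per year for an availability of 1 - 10**-nines."""
    unavailability = 10.0 ** -nines
    return unavailability * MINUTES_PER_YEAR


for n in range(2, 7):
    availability_pct = 100 * (1 - 10.0 ** -n)
    print(f"{n} nines ({availability_pct:.4f}%): "
          f"{downtime_minutes_per_year(n):9.2f} minutes/year")
```

The budget shrinks by a factor of ten with every additional nine, so each fault detection and recovery event consumes a noticeable share of the yearly allowance at the five-nines level.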
Page 7
Flatter networks cause new challenges for access nodes
[Figure: local loop and sub-loop. Source: Vodafone]
Higher speed → smaller and more sites
Smaller and more numerous access nodes, outdoor and pole-mounted:
• Reliability in harsh environments: more reliable components, physical reliability design
• Maintainability is the key to lower OPEX: zero-touch maintenance, self-diagnosis
• Low-cost design while ensuring reliability
Page 8
Traditional reliability issues also face new challenges
[Figure labels: energy saving; lead-free]
What is the trade-off between energy saving and reliability?
How to ensure reliability when designing for green?
Source: TI Intelec 2007
Page 9
Network Reliability Framework for Telecom
• Equipment-level reliability: NE reliability, considering R&A in hardened environments, new optical devices, new integrated ASICs, lead-free, ...
• Network-level reliability: end-to-end network focused; reliability at different levels and along different paths is considered
• Service-level reliability: focused on user experience and quality of service rather than on 99.999% alone; UCD design to avoid human error; zero-touch OAM
The reliability requirement in new networks cannot be described by the single
phrase "five nines"; it should include both network reliability and
service quality.
Page 10
Some Actions for the Challenges
• We will focus on a two-dimensional analysis framework:
  • Network layer and service layer
    • Network-layer reliability analysis
    • Service-layer reliability analysis (from the operator's service view)
  • E2E network reliability analysis
    • Access networks
    • Core networks
    • Transport networks
    • Application and content networks
• We will study and develop new reliability analysis methodology:
  • New service reliability requirements for new network infrastructures
  • New service and network reliability analysis methodology
  • Conduct new service reliability analysis to guide our high-quality network infrastructures and solutions in meeting customer service quality requirements.
Page 11
Case study: China Mobile IPTN, the world's largest, advanced NGN bearer network
[Figure: IPTN topology with paired core nodes BJ1/BJ2, SH1/SH2, GZ1/GZ2, XA1/XA2, CD1/CD2, WH1/WH2, NJ1/NJ2, and SY1/SY2; NE5000E routers; BR and AR layers; attached SoftX3000, UMG8900, MGW, GGSN, and SGSN nodes]
• The CMCC IPTN core network is upgraded with NE5000E and extended to the provincial/city level across China with a layered CR/BR/AR architecture;
• The network continues to carry the long-distance softswitch, GPRS, and signaling services as in Phase I, and also acts as the IP bearer platform for the 3G CS/PS domains and future IMS services;
• It covers 31* provinces in China and serves over 200 million* subscribers.
VPN FRR + MPLS TE FRR + IGP/LDP FC
……
Page 12
Case study: field HA test and operation results of China Mobile IPTN
1. On the evening of 8 August 2006, IPTN link/node failures were emulated to test the online
HA capability and the impact on the network.
Test Item | Test Place | Test Result
xxxxxx | Chengdu, Xi'an, Beijing, Shanghai | Switchover time is 20 ms; maximum packet loss is only 1 packet
xxxxxx | Chengdu, Xi'an, Beijing, Shanghai | Switchover time is 20 ms; maximum packet loss is only 1 packet
xxxxxx | Whole network nodes | Main processor CPU utilization is less than 15%; no packet loss on the non-flapping routes
Results: the core node HA switchover time is less than 50 ms, and the whole
network convergence time is less than 1 second.
2. Since its deployment in 2004, the network has worked well in commercial operation:
• It has encountered several intermittent transmission bit-error bursts, but NO service was affected.
• On the eve of Chinese New Year 2006, the network handled 471,000 erlangs of traffic within
one hour (19:00-20:00), while more than 40 million subscribers made long-distance toll
calls via it.
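To put the Chinese New Year figure in perspective: one erlang corresponds to one circuit occupied continuously for the measurement period, so 471,000 erlangs over one hour is 471,000 call-hours. The sketch below converts that into average talk time per subscriber. It assumes that all of the "more than 40 million" subscribers called within that hour and treats the count as exactly 40 million, so the result is an upper bound under those assumptions rather than a figure from the slide.

```python
# Figures quoted in the slide.
traffic_erlangs = 471_000    # carried traffic during the busy hour
period_hours = 1.0           # 19:00 to 20:00
subscribers = 40_000_000     # "more than 40 million"; taken as exactly 40 million here

# One erlang sustained for one hour equals one call-hour of occupancy.
call_minutes = traffic_erlangs * period_hours * 60
avg_seconds_per_subscriber = call_minutes * 60 / subscribers

print(f"Total occupancy: {call_minutes:,.0f} call-minutes")
print(f"Average per subscriber: {avg_seconds_per_subscriber:.1f} seconds")
```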
Thank You
www.huawei.com