BGP Community Settings In

BGP Configuration In The Digital Island Network

I. General configuration for BGP in DI routers

The following commands should be applied to the general configuration section of all DI routers.

ip subnet-zero: With the advent of CIDR, subnet-zero subnets are possible. This command allows packets destined for these subnets to be routed to them.

ip classless: This command allows the router to forward packets that have no directly corresponding subnet route to the next best supernet.

ip bgp-community new-format: This command forces display and input of community strings to match the new format specified in RFC 1997.

The following commands should be applied to the BGP configuration section of all DI routers.

no synchronization: Digital Island should never synchronize its IGP with BGP.

bgp log-neighbor-changes: Peer state changes should be written to the router log.

II. Methods for announcing DI CIDR blocks

There are three methods by which a router may advertise networks to another BGP peer. The first is by using the network command. The second is by using the aggregate-address command. In the third method the router re-advertises routes previously learned from another peer.

Digital Island will use the aggregate-address command to source any announcements of classless address space. This command will advertise the network associated with it as long as there exists a corresponding subnet route in the BGP table of the router. Take for example the following configuration:

router bgp 6553
 aggregate-address 167.216.128.0 255.255.128.0

If any route that is a subnet of 167.216.128.0/17 exists in the BGP table, this router announces the aggregate 167.216.128.0/17 along with every more specific route in the BGP table. If no more specific subnet route exists, the aggregate is not announced. To advertise only the aggregate, append the summary-only keyword to the aggregate-address command. Even with summary-only, a corresponding subnet route must still exist in the BGP table for the aggregate to be announced. To announce only a subset of the more specific routes, apply a suppress-map to the aggregate; the routes it matches are suppressed.

With the aggregate-address command, the atomic aggregate attribute is set unless you specify the as-set keyword. The as-set keyword announces the aggregate with a set of the AS numbers that contributed to it. The drawback is that every time a contributing route is withdrawn from the BGP table, the as-set has to be recalculated and the corresponding aggregate may need to be re-announced. This can cause route flapping if you are aggregating routes from more than one AS and one of those ASes is withdrawing routes. An advertise-map or an attribute-map may also be applied to the aggregate-address command, but those are not discussed here.
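The announcement rules described above can be sketched in a few lines of Python. This is only a rough model of the behavior as this document describes it, not of the IOS implementation; the function name advertised and its parameters are hypothetical, and the table contents are illustrative.

```python
import ipaddress

def advertised(bgp_table, aggregate, summary_only=False, suppressed=()):
    """Prefixes announced for one aggregate-address statement (simplified model)."""
    agg = ipaddress.ip_network(aggregate)
    table = [ipaddress.ip_network(p) for p in bgp_table]
    hidden = {ipaddress.ip_network(p) for p in suppressed}
    # More specific routes: prefixes inside the aggregate, excluding the aggregate itself.
    specifics = [p for p in table if p != agg and p.subnet_of(agg)]
    if not specifics:
        return []                      # no contributing route, so no aggregate
    if summary_only:
        return [str(agg)]              # aggregate only
    # Aggregate plus every more specific route not matched by the suppress list.
    return [str(agg)] + [str(p) for p in specifics if p not in hidden]

table = ["167.216.157.232/32", "167.216.240.0/23"]
print(advertised(table, "167.216.128.0/17"))
print(advertised(table, "167.216.128.0/17", summary_only=True))
print(advertised(table, "167.216.128.0/17", suppressed=["167.216.157.232/32"]))
```

The third call mirrors a suppress-map that matches only the host route: the aggregate and the /23 are still announced, while the host route is withheld.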

Digital Island will use the network command to inject a route into BGP if that route will never have a more specific (greater-mask) subnet route in the IGP or BGP. For example, the WCDC production LAN is 167.216.240.0/23. This LAN will never have a subnet route in the routing table, since the entire /23 subnet is applied to a single set of interfaces. For the network command to inject the route into BGP, a route with exactly the same prefix and mask must exist in the IGP. The following configuration is acceptable on the WCDC customer aggregation routers, since there is a connected route in the IGP routing table for this subnet.

router bgp 6553
 network 167.216.240.0 mask 255.255.254.0
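Because the network command only takes effect when its mask exactly matches the route in the routing table, it can help to double-check the dotted-decimal mask against the intended prefix length. A minimal sketch using Python's standard ipaddress module follows; the helper names mask_for and prefix_len_for are hypothetical.

```python
import ipaddress

def mask_for(prefix_len):
    """Dotted-decimal netmask for an IPv4 prefix length."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_len}").netmask)

def prefix_len_for(mask):
    """Prefix length for a dotted-decimal IPv4 netmask."""
    return ipaddress.ip_network(f"0.0.0.0/{mask}").prefixlen

print(mask_for(23))                     # mask for the /23 production LAN
print(mask_for(17))                     # mask for the /17 aggregate
print(prefix_len_for("255.255.254.0"))
```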

The WCDC border routers should have their outbound CIDR route-maps updated to allow the WCDC specifics so they may re-advertise those networks to their peers.

It is also acceptable, under some circumstances, to use the network command to inject classful routes in classical address space. Please do not do this for 167.216.0.0/16 without consulting the Network Engineering group, as it has negative side effects, as explained below.

The final way for a router to advertise a network to a BGP peer is to pass on a route that it has learned from another peer. This method is used by the LCMs to announce the DI aggregate address space. This way, if the link from an LCM to the core becomes disconnected, the LCM will stop advertising the aggregate that it learned from the core and will not blackhole DI from the LCM country.

The following configuration is an example of all these methods put to use in the WCDC border routers.

router bgp 6553
 no synchronization
 bgp log-neighbor-changes
 network 167.216.157.232 mask 255.255.255.255
 aggregate-address 167.216.0.0 255.255.0.0 suppress-map SUPPRESS
 aggregate-address 167.216.0.0 255.255.128.0 suppress-map SUPPRESS
 aggregate-address 167.216.128.0 255.255.128.0 suppress-map SUPPRESS
!
ip route 167.216.157.232 255.255.255.255 Null0

The first command applied to the router is the static host route for 167.216.157.232 to Null0, shown last in the listing above. This ensures that there is always a subnet route of 167.216/xx in the IGP. Because the route is nailed to Null0, the router will not flap any announcements associated with this host route. This host route is the default used throughout the entire network. Next, the IGP host route is injected into BGP by the network 167.216.157.232 mask 255.255.255.255 command. Since there is now a greater-length mask subnet route in BGP, namely the host route 167.216.157.232, the aggregate commands will advertise the 167.216.0.0/16, 167.216.0.0/17, and 167.216.128.0/17 supernets. Note that the host route lies inside 167.216.0.0/16 and 167.216.128.0/17 but not inside 167.216.0.0/17, so that aggregate depends on some other more specific route being present in the BGP table. The aggregates will not flap, since the corresponding IGP route is attached to Null0 and will never be removed. Since the summary-only keyword was not appended to the aggregate commands, the router will announce all subnets of the aggregate that are in the BGP tables, including the host route. To stop the host route from being advertised, a suppress map is appended to each aggregate command.

route-map SUPPRESS permit 10
 match ip address 5
!
access-list 5 permit 167.216.157.232
access-list 5 deny any

Note that, since this is a suppress map, the corresponding ACL permits the routes that need to be suppressed.

In addition to the aggregates, WCDC and ECDC are presently sourcing the more specific routes for their respective Production LAN and Unsecured LAN by injecting them into the BGP tables using the network command. This will effectively force traffic destined for the east coast to enter through ECDC and traffic for the west coast to enter through WCDC. Special community settings have been applied to these routes to ensure they are only announced domestically. More specifics should never be announced from DI address space without safeguards to limit the scope of their distribution and a full understanding of what impact the route will have on global routing.

III. Internal routing as it relates to BGP

Digital Island is running a default-free core. If the core does not have a route to a destination then it will respond to the source with a hung route. This includes any route in DI CIDR space that does not have a corresponding route in the routing tables. LCM routers should have their default-gateway statements set to the closest major data center’s Production LAN since it is possible for an LCM not to have a complete set of routes. This will force all unresolved packets into the core where they will either be routed to the proper destination or hung. Since the LCM routers do not have the entire DI routing table it is vital that they be defaulted back into the core. If this does not happen the potential exists to blackhole parts of the DI address space from international peers.

The historical method for announcing DI network blocks to BGP peers was to inject them into the BGP tables using the network command. This required a corresponding route to be in the IGP, which was accomplished by inserting a static route for each supernet. The drawback to this method was that, since ip classless is enabled, any classless supernet with a corresponding static route, such as 167.216.128.0/17, would prevent static routes with greater-length subnet masks from being removed if their next hop became unreachable. For example, consider the following scenario, where the purpose of the first route is to allow the BGP network command to inject the supernet into the BGP tables and the remaining routes are statics for BGP loopback load balancing to a customer.

ip route 167.216.128.0 255.255.128.0 null0
ip route 167.216.157.23 255.255.255.255 167.216.157.2
ip route 167.216.157.23 255.255.255.255 167.216.157.6

The third route corresponds to a subnet that is directly connected to serial 1/0/0 on a customer aggregation router. If that serial interface goes down, the static route will not be removed, since there is always a route with a lesser-length subnet mask in the routing tables, namely the 167.216.128.0/17 route. ip classless forces the next hop address in the static route to resolve through the supernet. Therefore the static route never becomes invalid and is never removed, since there is always a route to the next hop. In the best case this causes the BGP session between peers to flap; in the worst case it blackholes traffic between peers.
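The next-hop problem above can be illustrated with a simple longest-prefix-match lookup. This is a sketch only: the connected subnet 167.216.157.4/30 for the serial link is an assumed value chosen for illustration, and the function longest_match is hypothetical rather than a model of the router's actual lookup code.

```python
import ipaddress

def longest_match(routes, address):
    """Most specific route (as a string) that covers address, or None."""
    addr = ipaddress.ip_address(address)
    nets = [ipaddress.ip_network(r) for r in routes]
    matches = [n for n in nets if addr in n]
    if not matches:
        return None
    return str(max(matches, key=lambda n: n.prefixlen))

supernet = "167.216.128.0/17"     # static route to null0 from the example
link = "167.216.157.4/30"         # assumed connected subnet on serial 1/0/0
next_hop = "167.216.157.6"        # next hop of the third static route

# Interface up: the next hop resolves through the connected subnet.
print(longest_match([supernet, link], next_hop))
# Interface down: the connected route is gone, but the supernet still matches,
# so the static route stays valid.
print(longest_match([supernet], next_hop))
# Without the supernet static, the next hop would become unresolvable.
print(longest_match([], next_hop))
```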

Digital Island’s main IP address range, 167.216.128.0/17, does not fall on a natural classful boundary. With the ip classless command implemented, any packets destined for a subnet that does not have a corresponding route in the routing table will be routed to the next greatest supernet that exists in the routing table. Therefore, classless supernet routes should not be added to the DI network unless they correspond directly to an interface or a set of interfaces, as connected routes do.

IV. BGP formatting, documentation, and naming schemes

As the DI network grows, proper documentation and naming conventions are vital to its success and to the Network Engineer’s sanity.

When configuring a BGP peer, the description field should be used. For customers, it should include customer name, then location. For domestic peers, the description field should include ISP name, then city of termination. For international peers, it should include ISP name, then country of origin.

There are several files for documentation related to the functions of BGP. Please update these sources if modifications are made to the routers. These files are all located in the neteng home directory.

asn.shtml: Tracks private AS numbers assigned to customers.

as-path.shtml: Tracks as-path allocation and what route-maps they are associated with.

communities.shtml: Tracks community-list allocation.

/peers: Directory with .txt files with information specific to each of our peers.

countries.shtml: Tracks country codes for community usage.

Route-maps should have a generalized naming scheme. For international peers, the format is ISPNAME-COUNTRY-IN/OUT. For example, our peering session with Pipex in the UK should have the following inbound route-map: PIPEX-UK-IN. Domestic peers use the same format, with the country replaced by the city of termination. Customer route-maps should be named customer name-city of termination.

International peers should not have a blanket permit statement on the end of their route-maps by default.

V. Standard route-maps

Certain route-maps are standard across all routers.

We need to discuss as a group and agree to use them. A good example of where we have more than one route-map doing the exact same thing is the NULL route-map.

VI. Community Attributes

General Notes on the Community Attribute:

The community attribute was defined in RFC 1997 as a set of four-octet values. It is an optional transitive attribute. There are three well-known communities.

NO_EXPORT (0xFFFFFF01): Routes carrying this community must not be advertised outside of the confederation boundary or AS.

NO_ADVERTISE (0xFFFFFF02): Routes carrying this community must not be advertised to other BGP peers.

NO_EXPORT_SUBCONFED (0xFFFFFF03): Routes carrying this community must not be advertised to external BGP peers, including confederation peers.

The Cisco default format for community strings is a single four-byte integer. Logically, it is represented in the format NNAA, where NN is a two-byte integer for customer use and AA is a two-byte integer for the AS number. The new format splits the single four-byte integer into two two-byte integers in the format AA:NN. When you apply the ip bgp-community new-format command to the router, it shows all communities in the new format. Because the most significant bit is changing between the two formats, a string entered at the command line in the new format will not produce the same byte stream as the same string entered in the old format. However, on a binary level you are still dealing with four octets in the routing announcement.

If you see permit 999:64167 you are actually seeing:

new-format   AA                  NN
decimal      999                 64167
binary       00000011 11100111   11111010 10100111

old-format   NNAA
binary       00000011 11100111 11111010 10100111
decimal      65534631
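The arithmetic behind this table can be checked with a short Python sketch. The function names are hypothetical; the only assumption is that the first half of the new-format string occupies the high-order two octets, which is what the binary rows above show.

```python
def to_old_format(community):
    """Convert a new-format 'AA:NN' string to the single 32-bit decimal value."""
    high, low = (int(part) for part in community.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (high << 16) | low

def to_new_format(value):
    """Convert a 32-bit community value to the 'AA:NN' display form."""
    return f"{value >> 16}:{value & 0xFFFF}"

def to_bits(value):
    """Render a 32-bit value as four space-separated octets."""
    raw = format(value, "032b")
    return " ".join(raw[i:i + 8] for i in range(0, 32, 8))

value = to_old_format("999:64167")
print(value)                      # 65534631
print(to_bits(value))             # 00000011 11100111 11111010 10100111
print(to_new_format(0xFFFFFF01))  # NO_EXPORT shown in new format: 65535:65281
```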

Digital Island applies the new-format by default. All communities should be typed into the command line in the new format.

Finally, RFC 1997 implies that any community attribute that leaves your network should be tagged with your AS number in the AA position.

Community Tagging for DI international Peers:

Digital Island heavily utilizes communities within its network. Each international peer should be tagged with at least two communities. The first is a combination of the DI AS followed by the peer’s AS. The second community string is a combination of the private AS 65300 followed by a country code. This country code can be determined by consulting country code documentation on the neteng web page. The set community command for Italy Telecom would look like the following.

set community 6553:3313 65300:52
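As a small illustration of the tagging rule, the two communities for a peer can be generated from the peer's AS number and its country code. This sketch assumes the DI AS is 6553 (as in the router bgp statements in this document); the function names are hypothetical.

```python
DI_AS = 6553         # Digital Island AS, taken from the router bgp statements
COUNTRY_AS = 65300   # private AS used as the prefix for country codes

def peer_communities(peer_as, country_code):
    """Return the two communities an international peer should be tagged with."""
    for value in (peer_as, country_code):
        if not 0 <= value <= 0xFFFF:
            raise ValueError("community halves must fit in 16 bits")
    return [f"{DI_AS}:{peer_as}", f"{COUNTRY_AS}:{country_code}"]

def set_community_line(peer_as, country_code):
    """Format the communities as a route-map set community statement."""
    return "set community " + " ".join(peer_communities(peer_as, country_code))

# Peer AS 3313 with country code 52 (Italy) reproduces the line above.
print(set_community_line(3313, 52))
```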

Using community settings for multi-homed customers:

The BGP decision process compares MEDs only between routes received from the same neighboring AS, even when the router is peered to more than one AS. Therefore, metrics that DI sends to its clients will not be compared with metrics sent to those clients by other providers unless the client applies bgp always-compare-med. MEDs are compared at the seventh step in the BGP decision process. If MEDs are used in a multi-provider environment, nothing guarantees that the metrics will provide useful information, since different providers use different metric settings. So that DI can ensure its clients use Digital Island routes for international traffic, the company asks clients that are multi-homed to more than one provider to set a higher local preference on the routes DI sends them.

However, some customers are multi-homed to Digital Island in addition to having links to other providers. In this situation, by default, Digital Island would be providing equal routes over two peering sessions. The local preferences would be equal, so the BGP decision process would fall through to a lower-priority attribute, normally lowest router ID. This means one link would never be used for outbound traffic from the client to DI.

The ideal situation would be to give customers a way to prefer routes to a destination country over the link that provides the shortest path to that country. The same link would act as a backup for all other routes as well. This is achieved by implementing community settings that can be directly translated into local preference settings in the client’s network. DI defines, using community lists, which countries’ routes are best served by each of DI’s major data centers. This normally means that the data center is the physical termination path into the DI network for that country’s peers or for that country’s LCM. WCDC, ECDC, HIDC, and UKDC therefore each have their own community list; the lists currently look like the following.

Last updated November 4, 1999

! WCDC country community list
no ip community-list 15
! Mexico
ip community-list 15 permit 65300:26
! Hong Kong
ip community-list 15 permit 65300:80
! Taiwan
ip community-list 15 permit 65300:76

! ECDC country community list
no ip community-list 16
! Canada
ip community-list 16 permit 65300:2
! Brazil
ip community-list 16 permit 65300:27

! HIDC country community list
no ip community-list 17
! China
ip community-list 17 permit 65300:77
! Korea
ip community-list 17 permit 65300:78
! Australia
ip community-list 17 permit 65300:79
! Singapore
ip community-list 17 permit 65300:81
! Japan
ip community-list 17 permit 65300:82

! UKDC country community list
no ip community-list 18
! UK
ip community-list 18 permit 65300:51
! Italy
ip community-list 18 permit 65300:52
! Spain
ip community-list 18 permit 65300:53
! Russia
ip community-list 18 permit 65300:54
! Israel
ip community-list 18 permit 65300:55
! Switzerland
ip community-list 18 permit 65300:56
! Netherlands
ip community-list 18 permit 65300:57
! Sweden
ip community-list 18 permit 65300:58
! Germany
ip community-list 18 permit 65300:59
! France
ip community-list 18 permit 65300:60
! South Africa
ip community-list 18 permit 65300:61

Anytime a new country is added to DI’s list of peers the corresponding community list will need to be updated on all customer access routers. If the peer is in a country that DI already peers with then no modification will be necessary.
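To see which data center list a given country community falls into, the lists above can be expressed as a small lookup table. The sketch below copies the country codes from community lists 15 through 18 as shown in this document; the function name data_center_for is hypothetical, and the table would need the same updates as the router lists whenever a country is added.

```python
# Country codes copied from community lists 15 to 18 in this document.
COMMUNITY_LISTS = {
    15: ("WCDC", {26, 80, 76}),
    16: ("ECDC", {2, 27}),
    17: ("HIDC", {77, 78, 79, 81, 82}),
    18: ("UKDC", set(range(51, 62))),   # codes 51 through 61
}

def data_center_for(community):
    """Return (list number, data center) for a '65300:NN' country community."""
    asn, code = (int(part) for part in community.split(":"))
    if asn != 65300:
        return None                     # not a country-code community
    for list_number, (name, codes) in COMMUNITY_LISTS.items():
        if code in codes:
            return list_number, name
    return None                         # country not yet in any list

print(data_center_for("65300:52"))      # Italy
print(data_center_for("65300:80"))      # Hong Kong
print(data_center_for("65300:99"))      # unknown country code
```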

As an example, consider a Digital Island customer connected to the East and West Coast data centers. DI routers use community lists 15 and 17 to send routes to the West Coast peer with a higher local preference. They use community lists 16 and 18 to send routes to the East Coast peer with the same local preference. The scheme is then reversed with a lower metric for each peer to qualify each link as a backup route for the other. Most customers are dual homed to the East and West Coast. However, by creating a community list for each data center, DI has better control over the routing in the event that a customer decides to dual home to two data centers other than the East and West Coast, or decides to triple home. Standard route maps have been defined for the East Coast/West Coast customer. Notice that metrics are still set as an added precaution in case the customer does not properly implement the incoming route map.