A SRv6 Traffic Engineering Application for AI Network
draft-cheng-spring-srv6-for-ai-network-00
SPRING Working Group W. Cheng
Internet Draft China Mobile
Intended status: Informational C. Lin
Expires: 05 January 2026 New H3C Technologies
03 July 2025
A SRv6 Traffic Engineering Application for AI Network
draft-cheng-spring-srv6-for-ai-network-00
Abstract
AI applications require fast processing and low response times.
RoCEv2 traffic offers little entropy for ECMP hashing, while AI
elephant flows are largely predictable. Traffic engineering is
therefore a candidate solution for AI backend networks. Because
SRv6 TE can be initiated at the host, SRv6 source routing and path
control from the host side is one available option.
This document presents an AI network Traffic Engineering (TE)
application scenario for handling link faults and traffic congestion
in data centers, based on Segment Routing over IPv6 (SRv6) and
Compressed Segment Identifiers (CSIDs). The scenario uses SRv6 CSID
network programming to install all forwarding paths directly on the
head-end device. When a data center experiences a link fault or
traffic congestion, the head-end device switches to another path
that avoids the faulty or congested location, so that AI data flows
continue to be forwarded over the best available path.
Status of this Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
This Internet-Draft will expire on 05 January 2026.
Cheng, et al. Expires 05 January 2026 [Page 1]
Internet-Draft A Scalable Method to TE with SRv6 July 2025
Copyright Notice
Copyright (c) 2025 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(https://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Revised BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Revised BSD License.
Table of Contents
1. Introduction
1.1. Requirements Language
2. Application Scenario
3. Illustration
4. Operational Considerations
5. IANA Considerations
6. Security Considerations
7. References
7.1. Normative References
7.2. Informative References
Authors' Addresses
1. Introduction
Segment Routing over IPv6 (SRv6) [RFC8402] is the instantiation of
Segment Routing (SR) on the IPv6 data plane. Traditional SRv6
Traffic Engineering (TE) uses complete 128-bit Segment Identifiers
(SIDs) [RFC9602] to define an ordered segment list that forces
packets along a designated path. This approach is highly flexible
and scalable, but its packet overhead grows when a path requires a
long segment list.
As AI resources and services multiply, AI networks need to be
large-scale, high-bandwidth, and highly reliable. AI traffic has
low entropy and consists mainly of elephant flows, so links saturate
quickly when many nodes transmit at the same time. When AI networks
use traditional load balancing such as ECMP, even a well-distributed
hash algorithm does not prevent problems: uneven distribution of
low-entropy traffic, or a link fault, can still overload certain
links and cause congestion. In addition, when a link fails, whether
locally or remotely, convergence must complete as quickly as
possible to minimize the impact on network communication. Data
center AI networks therefore require a reliable, flexible, and
efficient way to mitigate the impact of traffic congestion and link
faults on communication quality.
From the perspective of source routing, SRv6 TE enables the source
node to directly participate in the path selection and planning
process. The service traffic within AI networks is highly diverse,
with different services having varying requirements for latency,
bandwidth, and quality of service (QoS). Leveraging the source
routing mechanism, the source node can flexibly determine the
forwarding path of packets based on specific service needs,
bypassing potentially congested or underperforming links, thereby
ensuring efficient transmission for critical services.
AI applications require fast processing and low response times.
RoCEv2 traffic offers little entropy for ECMP hashing, while AI
elephant flows are largely predictable. Traffic engineering is
therefore a candidate solution for AI backend networks. Because
SRv6 TE can be initiated at the host, SRv6 source routing and path
control from the host side is one available option.
This document presents an SRv6 TE application scenario based on
Compressed Segment Identifiers (CSIDs) [RFC9800] to address traffic
congestion and link faults affecting AI traffic in data centers. The
key idea is to build the segment lists of the forwarding paths from
CSIDs using SRv6 network programming [RFC8986]. When congestion or a
link fault occurs, the head (source) node can then switch
dynamically to the best remaining path toward the end node, routing
traffic around the congested or faulty link to alleviate its impact.
Because the scenario uses CSIDs, it keeps the flexibility and
scalability of SRv6 TE while lowering packet overhead.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.
2. Application Scenario
This section introduces an SRv6 TE application scenario based on the
NEXT-CSID flavor [RFC9800] that allows AI networks to mitigate the
impact of traffic congestion and link faults on communication
quality.
This application scenario builds on traditional SRv6 TE methods and
uses CSIDs for packet encapsulation and forwarding. A CSID
compresses the 128-bit SID [RFC9602] into a shorter identifier, for
example 16 or 32 bits. Multiple CSIDs can be concatenated into a
compact list and carried in the space that follows the Locator-Block
within a single IPv6 address. With a 32-bit Locator-Block and 16-bit
CSIDs, a single IPv6 address has room for (128 - 32) / 16 = 6 CSIDs,
so it can encode a deterministic path of up to 6 hops.
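As a non-normative illustration of the capacity arithmetic above, the following minimal Python sketch computes how many CSIDs fit into one IPv6 address for a given Locator-Block length and CSID length. The function name max_csids_per_container is hypothetical and is not defined by [RFC9800].

```python
def max_csids_per_container(block_bits: int, csid_bits: int,
                            address_bits: int = 128) -> int:
    """Return how many CSIDs fit after the Locator-Block in one IPv6 address.

    Hypothetical helper for illustration only; it counts whole CSIDs and
    ignores any bits left over at the end of the address.
    """
    if block_bits <= 0 or csid_bits <= 0 or block_bits >= address_bits:
        raise ValueError("invalid Locator-Block or CSID length")
    return (address_bits - block_bits) // csid_bits


# 32-bit Locator-Block with 16-bit CSIDs, as in the text above.
print(max_csids_per_container(32, 16))
# Same Locator-Block with 32-bit CSIDs.
print(max_csids_per_container(32, 32))
```

With the lengths used in this document (32-bit Locator-Block, 16-bit CSIDs) the sketch yields 6, which matches the path depth stated above.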
The topology of the application scenario is shown in Figure 1. It
includes a controller and multiple network nodes. The controller
collects the state of the entire data center network, such as
topology, bandwidth, and latency, and calculates the optimal SRv6
CSID path using, for example, Dijkstra's algorithm or a Path
Computation Element (PCE) [RFC4655]. The controller issues SRv6 CSID
policies to the head node, or modifies the CSID list, through
control and management protocols (such as NETCONF [RFC6241] or
BGP-LS [RFC9514]) to adjust the packet forwarding path dynamically.
The SRv6 Policy, which includes multiple feasible paths, is
installed on the head node. Each network node is assigned a CSID,
recognizes its own CSID in a received packet, and performs the
shift-and-forward operation.
            +----------+
            |Controller|
            +----------+
           /            \
          /              \
+--------+----------------+--------+
|           Data Center            |
|                                  |
|  (SPINE1)-----+  +-----(SPINE2)  |
|     |         \  /         |     |
|     |          \/          |     |
|     |          /\          X     |
|     |         /  \         |     |
|  (LEAF1)------+  +------(LEAF2)  |
|     |                      |     |
|     |                      |     |
|  (HOST1)                (HOST2)  |
+----------------------------------+
Figure 1: Typical Topology
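The shift that each network node applies to the IPv6 destination address can be sketched as follows. This is a simplified, non-normative Python illustration of the NEXT-CSID shift described in [RFC9800]: it keeps the Locator-Block, drops the active CSID, moves the remaining bits left by one CSID length, and pads the tail with zeros. The function name next_csid_shift and the example address are assumptions made for illustration, and the sketch omits the check that the active CSID belongs to the local node as well as the handling of an exhausted CSID list.

```python
import ipaddress


def next_csid_shift(dst: str, block_bits: int = 32, csid_bits: int = 16) -> str:
    """Return the destination address after one simplified NEXT-CSID shift.

    Hypothetical helper: the Locator-Block is preserved, the active CSID
    (the first one after the block) is removed, and zeros fill the tail.
    """
    addr = int(ipaddress.IPv6Address(dst))
    rest_bits = 128 - block_bits
    rest_mask = (1 << rest_bits) - 1
    block = addr >> rest_bits                   # Locator-Block bits
    rest = addr & rest_mask                     # CSID list and padding
    shifted = (rest << csid_bits) & rest_mask   # drop the active CSID
    return str(ipaddress.IPv6Address((block << rest_bits) | shifted))


# Illustrative address under the documentation prefix 2001:db8::/32.
da = "2001:db8:d101:d002:d102::"
da = next_csid_shift(da)   # after the node owning CSID 0xd101
print(da)
da = next_csid_shift(da)   # after the node owning CSID 0xd002
print(da)
```

Each call models one node along the path: the next CSID moves into the position directly after the Locator-Block, so the following node can match it with an ordinary longest-prefix lookup.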
When congestion or a fault occurs, for example on the link between
SPINE2 and LEAF2, the procedure is as follows:
* The node that experiences the congestion or fault (LEAF2) reports
it to the controller, which recalculates the optimal CSID path and
issues it to the head node (HOST1). Alternatively, the head node
(HOST1) detects the congestion or fault itself, through probe
packets or returned ACKs, and reselects the optimal CSID path.
* Based on the SRv6 Policies that contain the CSID paths, the head
node (HOST1) steers traffic onto the optimal remaining forwarding
path, avoiding the congested or faulty location.
As a prerequisite for this solution, there MUST be multiple feasible
paths from the head node (HOST1) to the end node (HOST2).
This document assumes that the head node or the controller can
detect congestion and failures. How they are detected is beyond the
scope of this document.
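The head-node behavior in the procedure above can be sketched in Python as a selection among pre-installed candidate paths. This is a minimal sketch under stated assumptions: the names path_links and select_path, the ordering of candidates by preference, and the representation of a bad link as an unordered node pair are all hypothetical and are not specified by this document.

```python
from typing import FrozenSet, Iterable, Optional, Sequence, Set, Tuple

Path = Tuple[str, ...]


def path_links(path: Path) -> Set[FrozenSet[str]]:
    """Return the undirected links between consecutive nodes of a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def select_path(candidates: Sequence[Path],
                bad_links: Iterable[FrozenSet[str]]) -> Optional[Path]:
    """Return the first candidate, in preference order, that avoids every
    congested or faulty link, or None if no candidate qualifies."""
    bad = set(bad_links)
    for path in candidates:
        if not (path_links(path) & bad):
            return path
    return None


# Candidate paths assumed to be installed on HOST1 by the SRv6 Policy.
candidates = [
    ("LEAF1", "SPINE2", "LEAF2"),
    ("LEAF1", "SPINE1", "LEAF2"),
]
print(select_path(candidates, []))
print(select_path(candidates, [frozenset({"SPINE2", "LEAF2"})]))
```

Because every feasible path is already installed on the head node, the switchover in this sketch is a local lookup and needs no additional signaling once the fault is known.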
3. Illustration
This section provides an illustration of the SRv6 TE application
scenario based on the NEXT-CSID flavor. The example topology is
depicted in Figure 2.
All network nodes in this topology share a global 32-bit
Locator-Block, 2001:db8::/32, and use 16-bit CSIDs. The CSID of
LEAF1 is 0xd101, the CSID of LEAF2 is 0xd102, the CSID of SPINE1 is
0xd001, and the CSID of SPINE2 is 0xd002.
Based on global topology information, the controller calculates the
optimal path from HOST1 to HOST2 as LEAF1->SPINE2->LEAF2 and issues
it to the head node HOST1. Because this path passes through only
three nodes, its three 16-bit CSIDs (48 bits in total) fit into a
single IPv6 address, and HOST1 only needs to set the IPv6
destination address of its packets to 2001:db8:d101:d002:d102::.
                   +----------+
                   |Controller|
                   +----------+
                  /            \
                 /              \
+---------------+----------------+----------------+
|                   Data Center                   |
|                                                 |
|       CSID: 0xd001            CSID: 0xd002      |
|         (SPINE1)-----+    +-----(SPINE2)        |
|            |@@@@@@@@@@\  /##########|           |
|            |@         @\/#         #|           |
|            |@          /\          #X           |
|CSID: 0xd101|@         /#@\         #|CSID:0xd102|
|         (LEAF1)------+#  @+------(LEAF2)        |
|            |   #######    @@@@@@@   |           |
|            |                        |           |
|         (HOST1)                  (HOST2)        |
+-------------------------------------------------+
'X': the location of the traffic congestion or link fault
'#': the optimal path before traffic congestion or link fault occurs
'@': the optimal path after traffic congestion or link fault occurs
Figure 2: Example Topology
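The destination address for the initial path LEAF1->SPINE2->LEAF2 can be derived from the Locator-Block and the per-node CSIDs listed above. The following Python sketch is a non-normative illustration; the function name encode_csid_container is hypothetical, and the sketch handles only paths that fit into a single IPv6 address.

```python
import ipaddress
from typing import Sequence


def encode_csid_container(block: str, csids: Sequence[int],
                          csid_bits: int = 16) -> str:
    """Pack CSIDs, in path order, directly after the Locator-Block.

    Hypothetical helper for illustration; unused trailing bits stay zero.
    """
    net = ipaddress.IPv6Network(block)
    free_bits = 128 - net.prefixlen
    if len(csids) * csid_bits > free_bits:
        raise ValueError("CSID list does not fit into one IPv6 address")
    value = int(net.network_address)
    shift = free_bits
    for csid in csids:
        if not 0 < csid < (1 << csid_bits):
            raise ValueError("CSID out of range for the given length")
        shift -= csid_bits
        value |= csid << shift   # place the next CSID after the previous one
    return str(ipaddress.IPv6Address(value))


# CSIDs from this section: LEAF1 = 0xd101, SPINE2 = 0xd002, LEAF2 = 0xd102.
print(encode_csid_container("2001:db8::/32", [0xd101, 0xd002, 0xd102]))
```

The result places the CSIDs of LEAF1, SPINE2, and LEAF2 immediately after 2001:db8, in the order the packet visits them.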
When traffic congestion or a link fault occurs between SPINE2 and
LEAF2, LEAF2 reports it to the controller, and the controller
recalculates the optimal path from HOST1 to HOST2 as
LEAF1->SPINE1->LEAF2. The controller then issues an update to the
head node HOST1, instructing it to set the IPv6 destination address
of transmitted packets to 2001:db8:d101:d001:d102::, thereby
bypassing the location of the traffic congestion or link fault.
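The controller-side recalculation can be illustrated with a small shortest-path computation over the topology of Figure 2. This Python sketch uses Dijkstra's algorithm as one possible choice; the link costs are hypothetical values picked so that the SPINE2 path is preferred before the fault, and the helper names shortest_path and without_link are not defined by this document.

```python
import heapq
from typing import Dict, List, Optional

Graph = Dict[str, Dict[str, float]]


def shortest_path(graph: Graph, src: str, dst: str) -> Optional[List[str]]:
    """Dijkstra's algorithm; returns the node list from src to dst or None."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def without_link(graph: Graph, a: str, b: str) -> Graph:
    """Return a copy of the graph with the undirected link a-b removed."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}


# Hypothetical link costs for the Figure 2 topology (assumed values).
links = [("HOST1", "LEAF1", 1), ("LEAF1", "SPINE1", 2), ("LEAF1", "SPINE2", 1),
         ("SPINE1", "LEAF2", 2), ("SPINE2", "LEAF2", 1), ("LEAF2", "HOST2", 1)]
graph: Graph = {}
for a, b, cost in links:
    graph.setdefault(a, {})[b] = cost
    graph.setdefault(b, {})[a] = cost

print(shortest_path(graph, "HOST1", "HOST2"))
print(shortest_path(without_link(graph, "SPINE2", "LEAF2"), "HOST1", "HOST2"))
```

Under these assumed costs, the first result traverses SPINE2 and the second, computed after the SPINE2-LEAF2 link is removed, traverses SPINE1, mirroring the '#' and '@' paths in Figure 2.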
4. Operational Considerations
The operation of this application scenario is consistent with
[RFC8986] and [RFC9800]. All network nodes involved in this
application scenario MUST support CSID and perform the CSID
shift-and-forward operation.
5. IANA Considerations
This document has no IANA actions.
6. Security Considerations
This document does not introduce additional security considerations.
7. References
7.1. Normative References
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119,
March 1997, <https://www.rfc-editor.org/info/rfc2119>.
[RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed.,
and A. Bierman, Ed., "Network Configuration Protocol
(NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011,
<https://www.rfc-editor.org/info/rfc6241>.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
May 2017, <https://www.rfc-editor.org/info/rfc8174>.
[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L.,
Decraene, B., Litkowski, S., and R. Shakir, "Segment
Routing Architecture", RFC 8402, DOI 10.17487/RFC8402,
July 2018, <https://www.rfc-editor.org/info/rfc8402>.
[RFC8986] Filsfils, C., Ed., Camarillo, P., Ed., Leddy, J., Voyer,
D., Matsushima, S., and Z. Li, "Segment Routing over IPv6
(SRv6) Network Programming", RFC 8986, DOI 10.17487/RFC8986,
February 2021, <https://www.rfc-editor.org/info/rfc8986>.
[RFC9800] Cheng, W., Filsfils, C., Li, Z., Decraene, B., and F.
Clad, "Compressed SRv6 Segment List Encoding (CSID)", RFC 9800,
DOI 10.17487/RFC9800, July 2025,
<https://www.rfc-editor.org/info/rfc9800>.
7.2. Informative References
[RFC4655] Farrel, A., Vasseur, J.-P., and J. Ash, "A Path
Computation Element (PCE)-Based Architecture", RFC 4655,
DOI 10.17487/RFC4655, August 2006,
<https://www.rfc-editor.org/info/rfc4655>.
[RFC9514] Dawra, G., Filsfils, C., Talaulikar, K., Ed., Chen, M.,
Bernier, D., and B. Decraene, "Border Gateway Protocol -
Link State (BGP-LS) Extensions for Segment Routing over
IPv6 (SRv6)", RFC 9514, DOI 10.17487/RFC9514, December
2023, <https://www.rfc-editor.org/info/rfc9514>.
[RFC9602] Krishnan, S., "Segment Routing over IPv6 (SRv6) Segment
Identifiers in the IPv6 Addressing Architecture", RFC
9602, DOI 10.17487/RFC9602, October 2024,
<https://www.rfc-editor.org/info/rfc9602>.
Authors' Addresses
Weiqiang Cheng
China Mobile
Beijing
China
Email: chengweiqiang@chinamobile.com
Changwang Lin
New H3C Technologies
Beijing
China
Email: linchangwang.04414@h3c.com