Skip to content

bgpd: BGP-LS: add Prefix SID (TLV 1158)#21076

Open
hedrok wants to merge 1 commit intoFRRouting:masterfrom
hedrok:bgp_ls_ad_prefix_sid_tlv_1158
Open

bgpd: BGP-LS: add Prefix SID (TLV 1158)#21076
hedrok wants to merge 1 commit intoFRRouting:masterfrom
hedrok:bgp_ls_ad_prefix_sid_tlv_1158

Conversation

@hedrok
Copy link
Contributor

@hedrok hedrok commented Mar 10, 2026

ADD TLV 1158 Prefix SID to BGP-LS: RFC 9805 Section 2.3.1

@cscarpitta Please review.

I've got commits for TLVs:

  • SR Capabilities (TLV 1034)
  • Adjacency SID (TLV 1099)
  • IS-IS Area Identifier (TLV 1027)
  • Node MSD (TLV 266)

And plan to create separate PR for each. If it is more convenient for you to review them all together, I can add them to this PR.

Thanks.

@greptile-apps
Copy link

greptile-apps bot commented Mar 10, 2026

Greptile Summary

This PR implements BGP-LS Prefix SID (TLV 1158) as defined in RFC 9085 Section 2.3.1, adding parsing, encoding, hashing, comparison, and VTY display support across bgp_ls_nlri.c/.h and bgp_ls_ted.c. The changes integrate cleanly into the existing TLV framework and follow established patterns for other BGP-LS attribute TLVs.

Key changes:

  • bgp_ls_nlri.h: Adds struct prefix_sid to bgp_ls_attr, defines TLV enum value 1158 and bit-position constant 36, and introduces BGP_LS_PREFIX_SID_FLAG_VALUE (0x08) for V-flag detection.
  • bgp_ls_nlri.c: Adds parse_prefix_sid() for wire decoding (supporting both 3-byte label and 4-byte index SID formats), extends bgp_ls_encode_attr() for wire encoding, and adds entries in bgp_ls_attr_cmp(), bgp_ls_attr_hash_key(), and bgp_ls_attr_display().
  • bgp_ls_ted.c: Adds conversion from the link-state TED's ls_prefix->sr fields into the new prefix_sid BGP-LS attribute, using the V-flag to determine whether the SID is a 3-byte label or 4-byte index.

Minor issues found: the parse function comment in bgp_ls_nlri.c references RFC 9805 (a likely digit-transposition of RFC 9085 used correctly in the header), and bgp_ls_ted.c uses open-coded bit manipulation instead of the BGP_LS_TLV_SET macro and has a misleading comment about V/L-flag bit positions.

Confidence Score: 4/5

  • This PR is safe to merge with minor documentation fixes recommended before landing.
  • The implementation correctly follows existing patterns for other TLV handlers throughout the file. Logic for parsing, encoding, hashing, and comparison is sound. The only issues are a likely digit-transposition in an RFC citation (9805 vs 9085) and two minor style inconsistencies in bgp_ls_ted.c. No functional bugs were found.
  • bgpd/bgp_ls_nlri.c (RFC citation typo in parse function comment), bgpd/bgp_ls_ted.c (misleading V/L-flag comment and inconsistent present_tlvs setter)

Important Files Changed

Filename Overview
bgpd/bgp_ls_nlri.c Adds parsing, encoding, comparison, hashing, and display support for Prefix SID TLV 1158. Implementation follows existing TLV patterns correctly. Minor issue: parse function comment references RFC 9805 (should be RFC 9085).
bgpd/bgp_ls_nlri.h Adds prefix_sid struct to bgp_ls_attr, defines BGP_LS_ATTR_PREFIX_SID TLV enum value (1158), BGP_LS_ATTR_PREFIX_SID_BIT (36), and BGP_LS_PREFIX_SID_FLAG_VALUE (0x08). RFC reference here correctly cites RFC 9085.
bgpd/bgp_ls_ted.c Adds conversion from ls_prefix->sr fields to BGP-LS prefix_sid attribute. Has a misleading comment about V-Flag/L-Flag positions and uses raw bit-shift instead of the BGP_LS_TLV_SET macro for consistency.

Sequence Diagram

sequenceDiagram
    participant IGP as IGP (IS-IS/OSPF)
    participant TED as bgp_ls_ted.c
    participant ATTR as bgp_ls_attr
    participant ENC as bgp_ls_nlri.c (encode)
    participant PEER as BGP Peer

    IGP->>TED: ls_prefix (sr.sid, sr.sid_flag, sr.algo)
    TED->>ATTR: populate prefix_sid {sid, sid_flag, algo, sid_len}
    TED->>ATTR: set BGP_LS_ATTR_PREFIX_SID_BIT in present_tlvs

    PEER->>ENC: BGP UPDATE with TLV 1158
    ENC->>ATTR: parse_prefix_sid(stream, length=7or8)
    ATTR-->>ENC: prefix_sid populated
    ENC->>PEER: bgp_ls_encode_attr → stream TLV 1158
    Note over ENC,PEER: flags(1) + algo(1) + reserved(2) + SID(3or4)
Loading

Last reviewed commit: 9eac2ad

Comment on lines +268 to +272
/* IS-IS (RFC 8667, 2.1.1.1); OSPF (RFC 8665, 5); OSPFv3 (RFC 8666, 6) */
/* All protocols have same logic: V-Flag and L-Flag are at same positions,
* both can be 1 or 0, when 1 - length is 3, when 0 - length is 4
*/
if (CHECK_FLAG((*attr)->prefix_sid.sid_flag, BGP_LS_PREFIX_SID_FLAG_VALUE))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misleading comment about V-Flag and L-Flag positions

The comment "V-Flag and L-Flag are at same positions, both can be 1 or 0, when 1 – length is 3, when 0 – length is 4" is misleading in two ways:

  1. The V-flag and L-flag are not at the same bit position across all protocols. In IS-IS (RFC 8667), V=0x08 and L=0x04; in OSPFv2 (as implemented in FRR's ospfd/ospf_sr.h), the flag bits happen to shift. The assertion "same positions" can confuse future maintainers.
  2. The code only checks BGP_LS_PREFIX_SID_FLAG_VALUE (the V-flag, 0x08), not both V and L together, which doesn't match the "both can be 1 or 0" phrasing.

Consider clarifying the comment to reflect that only the V-flag (BGP_LS_PREFIX_SID_FLAG_VALUE) is checked, and that a set V-flag means the SID is an MPLS label (3 bytes) rather than an index (4 bytes), consistent with RFC 8667 and RFC 8665.

Also, for consistency with the rest of the file, consider using the BGP_LS_TLV_SET macro on line 276 instead of the open-coded bit manipulation:

Suggested change
/* IS-IS (RFC 8667, 2.1.1.1); OSPF (RFC 8665, 5); OSPFv3 (RFC 8666, 6) */
/* All protocols have same logic: V-Flag and L-Flag are at same positions,
* both can be 1 or 0, when 1 - length is 3, when 0 - length is 4
*/
if (CHECK_FLAG((*attr)->prefix_sid.sid_flag, BGP_LS_PREFIX_SID_FLAG_VALUE))
/* IS-IS (RFC 8667, 2.1.1.1); OSPF (RFC 8665, 5); OSPFv3 (RFC 8666, 6)
* When the V-Flag (Value flag) is set, the SID carries an absolute MPLS
* label value (20-bit, encoded in 3 bytes). When clear, it is an index (4 bytes).
*/
if (CHECK_FLAG((*attr)->prefix_sid.sid_flag, BGP_LS_PREFIX_SID_FLAG_VALUE))
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_ls_ted.c
Line: 268-272

Comment:
**Misleading comment about V-Flag and L-Flag positions**

The comment "V-Flag and L-Flag are at same positions, both can be 1 or 0, when 1 – length is 3, when 0 – length is 4" is misleading in two ways:

1. The V-flag and L-flag are **not at the same bit position** across all protocols. In IS-IS (RFC 8667), V=`0x08` and L=`0x04`; in OSPFv2 (as implemented in FRR's `ospfd/ospf_sr.h`), the flag bits happen to shift. The assertion "same positions" can confuse future maintainers.
2. The code only checks `BGP_LS_PREFIX_SID_FLAG_VALUE` (the V-flag, `0x08`), not both V and L together, which doesn't match the "both can be 1 or 0" phrasing.

Consider clarifying the comment to reflect that only the V-flag (`BGP_LS_PREFIX_SID_FLAG_VALUE`) is checked, and that a set V-flag means the SID is an MPLS label (3 bytes) rather than an index (4 bytes), consistent with RFC 8667 and RFC 8665.

Also, for consistency with the rest of the file, consider using the `BGP_LS_TLV_SET` macro on line 276 instead of the open-coded bit manipulation:
```suggestion
		/* IS-IS (RFC 8667, 2.1.1.1); OSPF (RFC 8665, 5); OSPFv3 (RFC 8666, 6)
		 * When the V-Flag (Value flag) is set, the SID carries an absolute MPLS
		 * label value (20-bit, encoded in 3 bytes). When clear, it is an index (4 bytes).
		 */
		if (CHECK_FLAG((*attr)->prefix_sid.sid_flag, BGP_LS_PREFIX_SID_FLAG_VALUE))
```

How can I resolve this? If you propose a fix, please make it concise.

(*attr)->prefix_sid.sid_len = 3;
else
(*attr)->prefix_sid.sid_len = 4;
(*attr)->present_tlvs |= (1ULL << BGP_LS_ATTR_PREFIX_SID_BIT);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inconsistent use of present_tlvs setter

All other callers in bgp_ls_ted.c set bits in present_tlvs using the BGP_LS_TLV_SET macro (e.g. lines 248, 252, 259), but here raw bit-shift arithmetic is used. Using the macro keeps the code consistent and avoids the risk of accidentally missing the ULL suffix on larger bit numbers.

Suggested change
(*attr)->present_tlvs |= (1ULL << BGP_LS_ATTR_PREFIX_SID_BIT);
BGP_LS_TLV_SET((*attr)->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT);
Prompt To Fix With AI
This is a comment left during a code review.
Path: bgpd/bgp_ls_ted.c
Line: 276

Comment:
**Inconsistent use of `present_tlvs` setter**

All other callers in `bgp_ls_ted.c` set bits in `present_tlvs` using the `BGP_LS_TLV_SET` macro (e.g. lines 248, 252, 259), but here raw bit-shift arithmetic is used. Using the macro keeps the code consistent and avoids the risk of accidentally missing the `ULL` suffix on larger bit numbers.

```suggestion
		BGP_LS_TLV_SET((*attr)->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT);
```

How can I resolve this? If you propose a fix, please make it concise.

ADD TLV 1158 Prefix SID to BGP-LS: RFC 9085 Section 2.3.1

Signed-off-by: Kyrylo Yatsenko <hedrok@gmail.com>
@hedrok hedrok force-pushed the bgp_ls_ad_prefix_sid_tlv_1158 branch from 9eac2ad to 38d38cd Compare March 10, 2026 19:41
@donaldsharp
Copy link
Member

we need a topotest

@donaldsharp
Copy link
Member

This commit adds parsing, encoding, comparison, hashing, and display of BGP-LS Prefix SID (TLV 1158) across three files. I identified the following issues, ranked from most to least severe:

1. Integer Overflow in bgp_ls_attr_cmp — Incorrect Comparison Result (Medium-High)

bgp_ls_nlri.c
Lines 425-432
	if (BGP_LS_TLV_CHECK(attr1->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT)) {
		if (attr1->prefix_sid.sid != attr2->prefix_sid.sid)
			return attr1->prefix_sid.sid - attr2->prefix_sid.sid;
		if (attr1->prefix_sid.sid_flag != attr2->prefix_sid.sid_flag)
			return attr1->prefix_sid.sid_flag - attr2->prefix_sid.sid_flag;
		if (attr1->prefix_sid.algo != attr2->prefix_sid.algo)
			return attr1->prefix_sid.algo - attr2->prefix_sid.algo;
	}
sid is uint32_t. The expression attr1->prefix_sid.sid - attr2->prefix_sid.sid performs unsigned subtraction, then the result is implicitly converted to int (the function's return type). When the unsigned difference exceeds INT_MAX, the sign flips:

sid1 = 0xFFFF_FFFF, sid2 = 0 → unsigned difference is 0xFFFF_FFFF → cast to int yields -1, falsely indicating sid1 < sid2.
This violates the comparator contract (transitivity, antisymmetry). A malicious peer can craft SID values that cause incorrect ordering in hash table lookups or sorted structures, leading to:

Duplicate entries for what should be the same attribute (memory exhaustion / DoS)
Failure to find an existing entry, causing state inconsistency
Incorrect route selection if attributes that should compare unequal are treated as equal (or vice-versa)
Note: This is a pre-existing pattern in this function (lines 353, 370, 378, 398, 407, 413 all have the same uint32_t subtraction bug), but the new code replicates it rather than fixing it.

Fix: Use a proper three-way comparison:

if (attr1->prefix_sid.sid > attr2->prefix_sid.sid) return 1;
if (attr1->prefix_sid.sid < attr2->prefix_sid.sid) return -1;
2. Partial Write on Encode Failure — Stream Corruption (Medium)

bgp_ls_nlri.c
Lines 2289-2309
	if (BGP_LS_TLV_CHECK(attr->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT)) {
		if (attr->prefix_sid.sid_len == 3 || attr->prefix_sid.sid_len == 4) {
			if (stream_put_tlv_hdr(s, BGP_LS_ATTR_PREFIX_SID,
					       4 + attr->prefix_sid.sid_len) < 0)
				return -1;
			if (STREAM_WRITEABLE(s) < (size_t)4 + attr->prefix_sid.sid_len)
				return -1;
			stream_putc(s, attr->prefix_sid.sid_flag);
			// ...
stream_put_tlv_hdr unconditionally writes 4 bytes (type + length) without checking STREAM_WRITEABLE first:


bgp_ls_nlri.c
Lines 1471-1476
static inline int stream_put_tlv_hdr(struct stream *s, uint16_t type, uint16_t length)
{
	stream_putw(s, type);
	stream_putw(s, length);
	return BGP_LS_TLV_HDR_SIZE;
}
It always returns BGP_LS_TLV_HDR_SIZE (never -1), so the < 0 check on line 2292 is dead code — it will never trigger. If the stream is nearly full, the header writes succeed (FRR's stream_putw silently truncates or asserts on overflow depending on build), but then the subsequent STREAM_WRITEABLE check for the body fails and returns -1. The stream now contains a partial TLV — a header promising 7-8 bytes of payload with no corresponding body. Any peer parsing this re-encoded message will read past the TLV boundary, interpreting subsequent TLVs or garbage as SID data.

Impact: A carefully sized BGP-LS attribute set could trigger this on re-advertisement, causing the re-encoded UPDATE to contain malformed data that crashes or confuses downstream parsers.

Fix: Check STREAM_WRITEABLE for the total TLV size (header + body) before writing anything, or save/restore the stream position on failure.

3. One-Sided present_tlvs Check in Comparison — Logic Error (Medium)

bgp_ls_nlri.c
Lines 425-425
	if (BGP_LS_TLV_CHECK(attr1->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT)) {
Only attr1->present_tlvs is checked. If attr1 has the Prefix SID bit set but attr2 does not, the code compares attr1->prefix_sid fields against attr2's zero-initialized (or stale) memory. This means:

An attribute with prefix_sid = {sid=0, flag=0, algo=0} will compare equal to an attribute without any Prefix SID at all.
An attribute with Prefix SID will be considered "less than" one without if attr1->prefix_sid.sid happens to be smaller than attr2's uninitialized or zero-initialized value.
A malicious peer can exploit this to create hash collisions between attributes that should be distinct, or cause routing decisions to treat different attributes as identical.

Fix: Compare present_tlvs bits for both attributes first:

bool a1_has = BGP_LS_TLV_CHECK(attr1->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT);
bool a2_has = BGP_LS_TLV_CHECK(attr2->present_tlvs, BGP_LS_ATTR_PREFIX_SID_BIT);
if (a1_has != a2_has)
    return a1_has ? 1 : -1;
if (a1_has) {
    /* compare fields */
}
(Again, this is a pre-existing pattern throughout the function, but the new code perpetuates it.)

4. Inconsistent sid_len Derivation Between Parse and Populate Paths (Low-Medium)
In the parse path, sid_len is derived from the wire length field (trustworthy — it reflects what was actually sent):


bgp_ls_nlri.c
Lines 3710-3710
	attr->prefix_sid.sid_len = length - 4;
In the TED populate path (bgp_ls_ted.c), sid_len is derived from the flags:


bgp_ls_ted.c
Lines 269-276
		if (CHECK_FLAG((*attr)->prefix_sid.sid_flag, BGP_LS_PREFIX_SID_FLAG_VALUE))
			(*attr)->prefix_sid.sid_len = 3;
		else
			(*attr)->prefix_sid.sid_len = 4;
BGP_LS_PREFIX_SID_FLAG_VALUE is 0x08 (the V-flag). Per RFC 8667/8665/8666, the V-flag and L-flag should be set together, but this code only checks V. A malicious peer could send a Prefix SID with inconsistent flags (e.g., V=0 but only 3 bytes of SID data). The parse path handles this correctly (3 bytes read), but when the data is re-populated from the TED and re-encoded, the flags would cause sid_len = 4, and the encoder would write 4 bytes of SID — reading one extra byte from the sid field (which was only populated with 3 bytes of data, leaving the high byte as zero). While not directly exploitable for code execution, it alters the semantic meaning of the SID on re-advertisement (a 20-bit MPLS label becomes a 32-bit SID index), potentially causing traffic black-holing or misdirection in downstream SR-MPLS networks.

5. No Duplicate TLV Guard (Low)
parse_prefix_sid does not check whether BGP_LS_ATTR_PREFIX_SID_BIT is already set before parsing. If a malicious peer includes multiple TLV 1158 instances in a single BGP-LS attribute, each subsequent one silently overwrites the previous. While there are no dynamic allocations to leak, this violates RFC 9085 (which specifies at most one Prefix SID per prefix) and could be used to bypass policy filters — a filter that inspects the first Prefix SID TLV would see a different value than what the router ultimately stores (the last one).

Summary Table
#	Issue	Severity	Exploitability	Impact
1	uint32_t subtraction overflow in comparator	Medium-High	Remote, unauthenticated	Hash corruption, DoS, incorrect routing
2	Partial TLV write on stream overflow	Medium	Remote, requires specific attr size	Malformed re-advertisement, downstream parser crash
3	One-sided present_tlvs check	Medium	Remote, unauthenticated	Hash collisions, incorrect attr dedup
4	Inconsistent sid_len derivation	Low-Medium	Remote, requires SR-MPLS	Traffic misdirection on re-advertisement
5	No duplicate TLV guard	Low	Remote, unauthenticated	Policy bypass, last-writer-wins
Issues 1 and 3 are pre-existing patterns throughout the bgp_ls_attr_cmp function; this commit replicates them rather than fixing them. Issue 2 is also partially pre-existing (the stream_put_tlv_hdr function never returns an error), but the new code adds a new instance where the TLV header + body write is not atomic.```

@donaldsharp
Copy link
Member

Just to let you know. We've gotten a rash of people using AI to find buffer exploits/underflow/overflow situations recently. I just ran it on your commit and found issues. From your comments above this is the first of many new nlri parsing changes. My expectation is that you are going to go through and fix any problems before the rest are submitted.

@hedrok
Copy link
Contributor Author

hedrok commented Mar 10, 2026

I will.

My plan for each item:

  1. Understand it, see if it is valid (e.g. AI greptile above misleadingly states that I've added "misleading V/L-flag comment" though it is not - I've rechecked it...)

  2. If it is invalid, write comment here, skip.

  3. If valid and concerns only my PR: update PR.

  4. If valid and concerns surrounding code in file (e.g. for item 1 in the report "Note: This is a pre-existing pattern in this function (lines 353, 370, 378, 398, 407, 413 all have the same uint32_t subtraction bug), but the new code replicates it rather than fixing it") create separate PR fixing the issue in whole file.

After all that work I'll add topotest coverage for this TLV.

But I'll do it all tomorrow (or at least will start tomorrow).

🫡

@hedrok
Copy link
Contributor Author

hedrok commented Mar 11, 2026

Summary:

  1. Wrong.
  2. Fixed for other TLVs in bgpd: Fix a couple of issues in BGP-LS NLRI encoding/decoding #21092, will rebase when merged.
  3. Wrong.
  4. Will be thankful for advise.
  5. Fixed for other TLVs in bgpd: Fix a couple of issues in BGP-LS NLRI encoding/decoding #21092, will rebase when merged.

@cscarpitta I would really appreciate advises.

While waiting for advises I'll add checks in test for this TLV...

1.

Integer Overflow in bgp_ls_attr_cmp — Incorrect Comparison Result (Medium-High)

It is just wrong in many aspects.

sid1 = 0xFFFF_FFFF, sid2 = 0 → unsigned difference is 0xFFFF_FFFF → cast to int yields -1, falsely indicating sid1 < sid2. This violates the comparator contract (transitivity, antisymmetry).

Well, it will give -1 for this sid1 and sid2, but it doesn't mean there is violation of transitivity or antisymmetry.

Same can be written about a - b for ints: for a = -2147483648 and b = 3 it will give b < a. This is a bit of whataboutism, if needed, I can try to properly prove that (a - b) == -1 is transitive relation for both int and uint32_t.

Apart from that the function is used only as bgp_ls_attr_cmp(...) == 0 and for DECLARE_HASH, so it CANNOT be used for anything malicious even if the result is not transitive.

2.

stream_put_tlv_hdr unconditionally writes 4 bytes (type + length) without checking STREAM_WRITEABLE first:

Well, stream_put_tlv_hdr unconditionally returns BGP_LS_TLV_HDR_SIZE - that is true.

Should it be fixed?.. I have very shallow understanding of how streams work here, but as I understand after each stream_put_tlv_hdr there is a check whether there is enough stream to write TLV body... Would really appreciate advise:

Options:
a. Do nothing.
b. Remove extra checks of return value from stream_put_tlv_hdr.
c. Check each of stream_putw in stream_put_tlv_hdr, return -1 on error.

Update: in #21092 c is selected.

3.

Only attr1->present_tlvs is checked. If attr1 has the Prefix SID bit set but attr2 does not, the code compares attr1->prefix_sid fields against attr2's zero-initialized (or stale) memory. This means:

Totally wrong, there is:

	if (attr1->present_tlvs != attr2->present_tlvs)
		return attr1->present_tlvs - attr2->present_tlvs;

at the beginning of function, so it is impossible.

4.

At last something specific to this PR. Well... Probably there is some merit in this, but I'm not sure. A lot of other parts of FRR just check V-flag. I expect that state in lib/link_state is valid, so didn't check anything in bgp_ls_populate_prefix_attr. What should I do?

a. Ignore.
b. Strict checking: check everywhere V==L, check length is consistent with flags in NLRI/TED.

5.

I don't see where RFC 9085 specifies at most one Prefix SID per prefix and what to do if it is not. I would expect it is. If there is policy I don't see why it can't be confused by last TLV instead of first one if FRR uses first one instead... I'll appreciate advise:

a. Should I ignore this?..
b. Is it better to use first one and ignore rest with warning?..
c. Should all of them be ignored if there is not single TLV?..

Update: in #21092 even more strict option is chosen: ignore whole NLRI.

@cscarpitta cscarpitta self-requested a review March 11, 2026 08:25
@hedrok
Copy link
Contributor Author

hedrok commented Mar 11, 2026

About topotest for this, @cscarpitta I see as proper way to add json to bgp_ls_attr_display and check TLVs on rr router with show bgp link-state link-state json. Currently I see no way to check attributes in json output... And I think parsing text output is not nice.

Should I add json output as separate PR? Or do you have such plans/other suggestion how to test this?..

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants