tests: harden bgp_conditional_advertisement_track_peer convergence waits
bgp_conditional_advertisement_track_peer.test_bgp_conditional_advertisement_track_peer
fails intermittently on loaded CI hosts (e.g. AddressSanitizer Debian 12 amd64):
AssertionError: R1 SHOULD receive 172.16.255.2/32 from R2
After enabling the R2-R3 session, R1 must learn 172.16.255.2/32 only once R2
receives the exist-map prefix 172.16.255.3/32 from R3 and the conditional
advertisement sc...
Merge pull request #22021 from opensourcerouting/fix/bgp_move_otc_attribute_to_extra
bgpd: Move OTC and IPv6 extended community attributes to attr_extra
bgpd: Use stream_new_expandable() for BMP code to avoid overflow
Also validate packets later and drop any that exceed 65k.
Reported-by: Qifan Zhang, Palo Alto Networks
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
tests: fix flaky IGMP source checks in pim_boundary_acl
test_pim_asm_igmp_join_acl intermittently failed at its opening check
with "expected has key 'r1-eth0' which is not present in output". The
test intent was correct (verify no IGMP source for the ASM/SSM group
before sending joins), but the assertion did not match how FRR reports
IGMP sources.
"show ip igmp sources json" only emits interface keys when that interface
has at least one source entr...
tests: fix multicast_pim_sm_topo2 TC_15/TC_7 mroute flakiness
TC_15 was no-shutting the wrong interface after shut/no-shut of the
upstream links, leaving l1-r2-eth4 and f1-r2-eth3 down and breaking
(S,G) verification in the following test. Reset PIM state at TC_15
teardown and restart traffic in TC_7 after all receivers join.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
tests: add DM->SM transition coverage to pim_dense topotest
Add helpers and an end-to-end test that verifies an existing dense (S,G)
transitions to sparse mode when an RP is added for its group range.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
pimd: move dense (S,G) to sparse mode when an RP is added
Re-evaluate existing dense upstreams when RP mappings change, clear
stale DM OIF state before syncing the kernel MFC, and treat groups as
sparse once an RP is configured on sparse-dense interfaces.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
bgpd: Move ipv6_community attribute from attr to attr_extra
IPv6 extended communities are not in common use yet(?); only something
like extended link-bandwidth is used. Hence move it to extra, saving an
extra 8 bytes and one cacheline (because the last one was 4 bytes).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
ospfd: prevent stale LSA from corrupting local OSPF DB after reboot
Ensure local LSAs have the highest sequence number and neighbors
are refreshed in the event a stale LSA is detected.
Current behavior assuming we have two ospf routers: R1 <–> R2
- R1 and R2 are ospf neighbors
- R1 has a summary route being advertised to R2
This summary route has some LSA sequence number that is higher than 1
At this point everything is working fine. But then:
- R1 reboo...
Revert "bgpd: Move attr->srte_color to attr->extra->srte_color"
This reverts commit 036032c3b59da4ca389d9936e2a4034db4c07d1d.
It is fundamentally wrong for SRTE color to live in attr at all...
It's just an extended community, and not a separate BGP attribute.
Let's revert this and move it later to bgp_path_info or somewhere else...
tests: verify SSM delivery to h3 with collect_receiver_sources
Add test_ssm_r1_to_h3_multicast_traffic: r1 sends (192.168.1.1,
230.0.0.100) on the shared LAN, r3's static join-group on eth0 pulls the
(S,G) to r3, and h3 receives on r3-eth1 after joining the same source on
h3-eth0. Assert per-source RX counts via mcast-tester --report-sources
JSON instead of only checking MFC state.
Extend McastTesterHelper.run_join() with an optional source= argument
for ...
tests: verify SSM mroute split horizon in multicast_ssm_topo1
Add test_ssm_mroute_no_iif_oif_loop to ensure (192.168.1.1, 230.0.0.100)
does not install a kernel MFC that lists the incoming interface as an OIF
when the source and r3's join-group are both on the shared LAN (rX-eth0).
The test sends traffic from r1-eth0, waits for an installed mroute on r3,
then checks show ip mroute json on r1–r3 so outboundInterface never equals
iif. This guards against t...
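
The split-horizon check described in this entry can be sketched roughly as below. This is not the test's actual code: the helper name `find_iif_oif_loops` is hypothetical, and the JSON layout (group, then source, then an entry with `iif` and an `oil` map whose values carry `outboundInterface`) is an assumption about what `show ip mroute json` returns.

```python
def find_iif_oif_loops(mroutes: dict) -> list[tuple[str, str, str]]:
    """Return (source, group, interface) for every OIF that equals the IIF.

    Assumed layout (hypothetical, modelled on the field names in the
    commit message): {group: {source: {"iif": str, "oil": {ifname: {...}}}}}
    """
    loops = []
    for group, sources in mroutes.items():
        for source, entry in sources.items():
            if not isinstance(entry, dict):
                continue  # skip any non-entry values at the source level
            iif = entry.get("iif")
            for oif in entry.get("oil", {}).values():
                # A looped entry forwards traffic back out the arrival interface.
                if oif.get("outboundInterface") == iif:
                    loops.append((source, group, iif))
    return loops
```

A test would assert that the returned list is empty on each router once the mroute is installed.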
pceplib: Validate lengths during object decoding
Sanity-check embedded object header lengths before continuing
to decode message objects.
Signed-off-by: Mark Stapp <mjs@cisco.com>
Reported-by: Luke Geier <seabreeze11971220@gmail.com>
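
As an illustration of the kind of check this entry describes, here is a minimal Python sketch (pceplib itself is C, and this is not its code). It assumes the common object header from RFC 5440: one byte of class, one byte of type and flags, and a two-byte length that covers the header itself. The function name `split_objects` is hypothetical.

```python
import struct

PCEP_OBJ_HEADER_LEN = 4  # class (1) + type/flags (1) + length (2)


def split_objects(payload: bytes) -> list[tuple[int, bytes]]:
    """Split a message body into (object_class, object_body) pairs.

    Raises ValueError as soon as an embedded length is implausible,
    before any object body is touched.
    """
    objects = []
    offset = 0
    while offset < len(payload):
        remaining = len(payload) - offset
        if remaining < PCEP_OBJ_HEADER_LEN:
            raise ValueError("truncated object header")
        obj_class, _type_flags, obj_len = struct.unpack_from("!BBH", payload, offset)
        # The length includes the header, so it can be neither shorter than
        # the header nor longer than what is left in the buffer.
        if obj_len < PCEP_OBJ_HEADER_LEN or obj_len > remaining:
            raise ValueError(f"bad object length {obj_len}")
        body = payload[offset + PCEP_OBJ_HEADER_LEN:offset + obj_len]
        objects.append((obj_class, body))
        offset += obj_len
    return objects
```

Rejecting a length below the header size also rules out a zero length, which would otherwise keep the loop from advancing.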
lib: warn once when process fd limit is very large
Each event_master_create() logged the same fd limit warning (e.g. zebra
main plus dplane pthreads).
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
tests: Remove key-0 from acceptable on rt2
The test is this: rt1 ---- rt2
Both rt1 and rt2 have a key 0 at first,
then the test removes key 0 and adds
key40 on rt1 and checks that the session
is down. Then on rt2 the code is adding
key40 but leaving key0. So rt2 continues
to transmit with key 0 and the session does
not come up. This is because there is no
test of the lifecycle part of key start/end
times. Modify the test to remove ...
*: Fix keychain acceptance of any key
In bfd if you have this keychain configured on 2 routers, r1 and r2:
keychain a
key 0
cryptographic-algorithm hmac-sha-1
key-string mysecret123
end
And you have bfdd configured to use keychains between the two.
Then if you do this on rt1:
keychain a
no key 0
key 40
cryptographic-algorithm hmac-sha-1
key-string mysecret123
end
Notice that the key-string is the same for key 0 ...
tests: Use `show module` to get bgp's pid
The topotest is using `pidof bgpd` which is ok
when you run a test by itself, but when you
are running the topotests in parallel, this
is a bit of a problem. Fix.
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
bgpd: Move attr->srte_color to attr->extra->srte_color
This saves another 4 bytes of memory.
And we eliminate one CPU cacheline by moving this (because the last cacheline
was 4 bytes).
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
bgpd: Move attr->otc attribute to attr->extra->otc
This saves at least 4 bytes if OTC (RFC 9234) is not used.
Signed-off-by: Donatas Abraitis <donatas@opensourcerouting.org>
tests: tune multicast_ssm_topo1 for shared-LAN SSM debugging
r1: Add a static route for 224.0.0.0/4 via r1-eth0 so multicast traffic
from the sender is steered onto the shared transit segment (192.168.1.0/24)
rather than another interface.
r3: Add a second SSM join-group at source 192.168.1.1.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
pimd: clarify TIB IGMP loop protection vs split-horizon enforcement
Update comments in tib_sg_oil_setup() to describe the division of
responsibility: non-DR routers still skip creating channel_oil when the
RPF nexthop VIF equals the IGMP interface, while DR routers may create
channel_oil but rely on pim_channel_add_oif() to avoid installing a
looped OIF=IIF MFC entry.
No functional change in this file; documentation only.
Signed-off-by: Jafar Al-Gharaibeh <ja...
pimd: reject adding an OIF that matches the MFC incoming interface
Add an early check in pim_channel_add_oif() for SSM (S,G) groups so
traffic is not forwarded back out the same VIF it arrived on. ASM is
excluded because the receiver interface may temporarily equal IIF during
RPT-to-SPT before the true RPF IIF is installed.
This is the primary entry point for IGMP/MLD-driven OIF adds
(tib_sg_gm_join) and complements pim_mroute_copy(), which already omits
IIF ...
pimd: enforce split horizon when installing (S,G) MFC entries
Remove the long-standing exception in pim_mroute_allow_iif_in_oil() that
permitted listing the incoming VIF on the OIL when the OIF was added by
IGMP/MLD (PIM_OIF_FLAG_PROTO_GM) and the router considered itself DR on
that interface.
That exception was meant to let the DR build upstream state when the
source and a local receiver share an interface (TODO T22). In practice
it installed kernel MFC...
tests: add multicast receiver source-reporting helper
Extend mcast-tester with a bounded RX reporting mode that collects
per-source packet counts and emits JSON, then expose it through
McastTesterHelper.collect_receiver_sources() for topotests. This gives
tests a deterministic way to assert multicast source visibility without
shell parsing of external capture tools.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
ci: fail topotest step when parallel run lacks JUnit failures
When the parallel pytest run exits non-zero but analyze.py finds no
failures in topotests.xml, fail the step instead of treating it as a pass.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
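
The decision rule in this entry could look roughly like the sketch below. It is not the analyze.py logic: the function names are hypothetical, and the only assumption about the report is the usual JUnit layout of `<testcase>` elements with `<failure>` or `<error>` children.

```python
import xml.etree.ElementTree as ET


def count_junit_failures(xml_text: str) -> int:
    """Count test cases that carry a <failure> or <error> child."""
    root = ET.fromstring(xml_text)
    return sum(
        1
        for case in root.iter("testcase")
        if case.find("failure") is not None or case.find("error") is not None
    )


def step_should_fail(pytest_exit_code: int, xml_text: str) -> bool:
    """Decide whether the CI step fails.

    Recorded failures always fail the step. A non-zero exit code with a
    clean report also fails it, because something went wrong outside any
    recorded test case (for example a crashed worker).
    """
    if count_junit_failures(xml_text) > 0:
        return True
    return pytest_exit_code != 0
```

The second branch is the case the commit targets: without it, a run that exits non-zero but leaves no failures in the XML would be reported as a pass.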
tests: verify SSM (S,G) join state
Add test_ssm_join_state to check that an SSM (S,G) appears in IGMP and
PIM on all routers on the shared LAN. Use a configured join-group on r3
only to inject local membership.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
tests: expand multicast_ssm_topo1 for SSM debugging
Add a three-router shared LAN with per-router hosts, OSPFv2 on
inter-router and host-facing interfaces, and passive PIM/IGMP on all
interfaces. Run group-type checks on r1, r2, and r3.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
topotests: avoid hang opening ExaBGP peer FIFOs
Blocking open() on per-peer FIFOs waits for exa_readpipe.py, which only
starts after ExaBGP finishes slow hostname lookups under parallel runs.
Use non-blocking open with retries and add peer names to /etc/hosts in
the Docker entrypoint so Rocky/container runs do not stall indefinitely.
Signed-off-by: Jafar Al-Gharaibeh <jafar@atcorp.com>
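
A non-blocking open with retries, as described above, might be sketched like this. The function name, retry count, and delay are hypothetical, and the sketch assumes the topotest side is the FIFO writer (with exa_readpipe.py as the reader). On Linux, opening a FIFO write-only with `O_NONBLOCK` fails with `ENXIO` while no reader has it open.

```python
import errno
import os
import time


def open_fifo_for_write(path: str, attempts: int = 50, delay: float = 0.1) -> int:
    """Open a FIFO for writing without blocking; return the file descriptor.

    Retries while no reader is present, and gives up with TimeoutError
    instead of hanging forever.
    """
    for _ in range(attempts):
        try:
            return os.open(path, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            if exc.errno != errno.ENXIO:
                raise  # a real error, not just "no reader yet"
            time.sleep(delay)
    raise TimeoutError(f"no reader on {path} after {attempts} attempts")
```

The bounded retry turns an indefinite stall into a visible failure after roughly `attempts * delay` seconds.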
topotests: reap mutini children during munet and xdist teardown
Parallel pytest-xdist runs could hang at session end when workers left
mutini namespace processes as unreaped zombies. cleanup_pid() sent
SIGKILL without waitpid(), and session cleanup only ran on the controller.
Reap PIDs after SIGKILL, sweep zombies after async_cleanup_proc(), run
cleanup_current() on every worker, and waitpid in stop_topology().
Signed-off-by: Jafar Al-Gharaibeh <jafar@atc...
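
The kill-then-reap pattern this entry refers to can be sketched as follows. It is not the munet or topotest cleanup code, and the name `kill_and_reap` is hypothetical. It only shows why `waitpid()` must follow `SIGKILL`: a killed child stays a zombie until its parent collects its exit status.

```python
import os
import signal


def kill_and_reap(pid: int) -> bool:
    """SIGKILL a child process and collect its exit status.

    Returns True if the child was reaped by this call, False if it was
    not our child or had already been reaped.
    """
    try:
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # no such process; a waitpid attempt is still harmless
    try:
        os.waitpid(pid, 0)  # blocks only until the killed child exits
        return True
    except ChildProcessError:
        return False
```

Calling it twice on the same PID is safe: the second call finds nothing left to reap and returns False.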
bgpd: fix attr comparison when using attr_intern_reuse cache
When the attr_intern_reuse cache is activated during NLRI
processing, a special case in bgp_attr_intern() attempts to
avoid a costly hash key computation by caching an attr
and using just the attrhash_cmp() function. But the logic
that populates the cached entry was comparing the input attr
after using it in a call to hash_get() -> bgp_attr_hash_alloc().
That alloc function has a side-effect - ...