Hi Guys,
Thanks for your earlier responses.
Unfortunately, we do not have access to the switch, so we’re unable to check for errors, drops, or misconfigurations on the switchport side.
On the firewall side:
Only the Firewall blade is enabled; QoS is not.
We do not observe any drops in "fw ctl zdebug + drop" or via SmartConsole logs.
However, during the issue, we captured the following in "fw ctl kdebug" logs:
@;39117251.1778855408; 5Jun2025 16:55:13.085051;[cpu_2];[fw4_5];fwconn_ent_early_expiration: [now=1749113713] conn <dir 0, 172.30.94.88:55510 -> 172.30.79.5:4 IPP 6> reached early expiration;
@;39117251.1778855409; 5Jun2025 16:55:13.085052;[cpu_2];[fw4_5];fwconn_ent_early_expiration: return expire (timeout=25, aggr_timeout=5, new_ttl=20);
@;39117251.1778855410; 5Jun2025 16:55:13.085053;[cpu_2];[fw4_5];fwconn_ent_eligible_for_del : conn <dir 0, 172.30.94.88:55510 -> 172.30.79.5:4 IPP 6> is eligible for deletion;
@;39117251.1778855411; 5Jun2025 16:55:13.085054;[cpu_2];[fw4_5];fwconn_ent_early_expiration: [now=1749113713] conn <dir 0, 172.30.94.88:55504 -> 172.30.79.5:4 IPP 6> reached early expiration;
@;39117251.1778855412; 5Jun2025 16:55:13.085054;[cpu_2];[fw4_5];fwconn_ent_early_expiration: return expire (timeout=25, aggr_timeout=5, new_ttl=20);
@;39117251.1778855413; 5Jun2025 16:55:13.085055;[cpu_2];[fw4_5];fwconn_ent_eligible_for_del : conn <dir 0, 172.30.94.88:55504 -> 172.30.79.5:4 IPP 6> is eligible for deletion;
We tried disabling aggressive aging and increasing the timeout values for port 4 in SmartConsole, but these messages still appear.
This is also not the only traffic that reaches early expiration; kdebug shows plenty of other connections hitting the same messages.
This raises a few questions:
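To get a feel for how widespread this is, the kdebug output could be tallied per destination with something like the sketch below. It is only a rough helper, not a Check Point tool: the regular expressions are modelled solely on the sample lines pasted above, the function name summarize_early_expiration is made up for illustration, and other versions may format these messages differently. In the sample lines new_ttl equals timeout minus aggr_timeout (25 - 5 = 20), so the sketch also counts each (timeout, aggr_timeout, new_ttl) combination to show whether that holds across the whole capture.

```python
import re
import sys
from collections import Counter

# Patterns modelled only on the sample kdebug lines in this post (assumption:
# the full capture uses the same wording and field order).
CONN_RE = re.compile(
    r"fwconn_ent_early_expiration: \[now=(\d+)\] conn <dir (\d+), "
    r"(\S+?):(\d+) -> (\S+?):(\d+) IPP (\d+)> reached early expiration"
)
TTL_RE = re.compile(
    r"return expire \(timeout=(\d+), aggr_timeout=(\d+), new_ttl=(\d+)\)"
)


def summarize_early_expiration(lines):
    """Count early-expiration events per (dst, dport, proto) and per TTL triple."""
    by_dest = Counter()
    ttl_values = Counter()
    for line in lines:
        conn = CONN_RE.search(line)
        if conn:
            _now, _dir, _src, _sport, dst, dport, proto = conn.groups()
            by_dest[(dst, int(dport), int(proto))] += 1
            continue
        ttl = TTL_RE.search(line)
        if ttl:
            ttl_values[tuple(int(v) for v in ttl.groups())] += 1
    return by_dest, ttl_values


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize.py <saved kdebug output file>
    with open(sys.argv[1], errors="replace") as fh:
        dests, ttls = summarize_early_expiration(fh)
    for key, count in dests.most_common(20):
        print(count, key)
    for key, count in ttls.most_common():
        print(count, "timeout/aggr_timeout/new_ttl =", key)
```

If one destination or one TTL combination dominates the output, that would at least narrow down which service or timeout setting is involved.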
Questions
Does the presence of "fwconn_ent_early_expiration" and "eligible_for_del" in kdebug necessarily indicate that aggressive aging is being triggered? When I check the firewall, there is still plenty of memory and CPU headroom.
Could this be related to some other timeout mechanism, or maybe a TCP/IP stack or connection tracking issue?
Could it point to some underlying issue at L2/L1, even if no drops show up in SmartConsole or zdebug?
AFAIK, the peer switch is a Cisco device. Is it possible that microbursts are occurring? I'm not entirely sure how microbursts manifest on Quantum Force, but could the firewall at times be sending traffic faster than the switch can handle?
Any guidance, especially from those who have dealt with early expiration or similar kdebug patterns, would be greatly appreciated.
Thanks in advance.