[Oisf-users] TCP reassembly gaps

Chris Wakelin c.d.wakelin at reading.ac.uk
Sat Apr 21 10:19:05 UTC 2012

On 21/04/2012 10:08, Victor Julien wrote:
> On 04/19/2012 11:33 AM, Chris Wakelin wrote:
>> On 19/04/12 06:33, Victor Julien wrote:
>>>> The tcp.reassembly_gap count (added over all 6 threads) is increasing at
>>>> about 40/sec.
>>>> This is using an Intel 10Gb (ixgbe) card, and PF_RING reckons there are
>>>> no lost packets. The network load is about 300-400Mb/s rising to nearly
>>>> 1Gb/s when the students are all here. (One oddity about this port mirror
>>>> is that the packets are VLAN-tagged in only one direction. Extreme
>>>> Networks say this is by design :-$; I've modified the PF_RING packet
>>>> hash to ignore VLAN tags)
>>>> On the main campus network, 1Gb port mirror (VLAN-tagged properly) there
>>>> are no gaps, even though it's frequently losing packets (e.g. when the
>>>> traffic goes over 1Gb/s).
>>>> Any idea how to debug this? Could it be an ethernet driver issue?
>>> This is strange indeed.
>>> One way to debug is to enable this rule:
>>> # Sequence gap: missing data in the reassembly engine. Usually due to
>>> packet loss. Will be very noisy on an overloaded link / sensor.
>>> alert tcp any any ->  any any (msg:"SURICATA STREAM reassembly sequence
>>> GAP -- missing packet(s)"; stream-event:reassembly_seq_gap; sid:2210048;
>>> rev:1;)
>> Ah, I forgot we could sig these now!
>>> You may want to threshold it some.
>>> Then look at the streams that fire...
>> I think it must be a PF_RING/ixgbe issue. I've got IRQ-pinning on and
>> RSS enabled so it might be worth trying with RSS turned off.
>> I created a pcap (~200MB/250K packets) with PF_RING-enabled tcpdump for
>> a minute or so, filtering on a Google/Youtube /24 network with "net
>> or (vlan and net" to get varied sources
>> and destinations, and Wireshark agrees it's missing packets. PF_RING
>> stats suggest no dropped packets though.
>> The above sig hits 45 times and gave me some src/dst pairs to check in
>> Wireshark :)
> So you're saying you're losing packets after all? I guess that could
> make sense. PF_RING can't account for packets that are lost before
> reaching the NIC. The gap counter is an indication of packet loss.

My packet dump is missing packets according to Wireshark.
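
For anyone wanting to repeat this kind of check outside Wireshark, here is a
minimal sketch of the idea, not what Wireshark or Suricata actually do
internally. It assumes the (relative sequence number, payload length) pairs
for one direction of one TCP stream have already been extracted from the
pcap; find_gaps and the sample values are hypothetical, and 32-bit sequence
wrap-around is ignored.

```python
def find_gaps(segments):
    """Return (start, end) byte ranges missing from one direction of a stream.

    segments: iterable of (seq, payload_len) using relative sequence numbers.
    Assumption: sequence numbers do not wrap within the capture.
    """
    gaps = []
    next_expected = None
    # Sorting by sequence number means reordered packets are not
    # mistaken for missing ones.
    for seq, length in sorted(segments):
        if length == 0:
            continue  # pure ACKs carry no payload
        if next_expected is None:
            next_expected = seq  # first data segment seen in the capture
        if seq > next_expected:
            gaps.append((next_expected, seq))  # bytes never captured
        # max() keeps retransmissions and overlaps from moving us backwards
        next_expected = max(next_expected, seq + length)
    return gaps


# Made-up example: the segment covering bytes 2897..4344 is absent.
segments = [(1, 1448), (1449, 1448), (4345, 1448), (5793, 1448)]
print(find_gaps(segments))  # [(2897, 4345)]
```

A gap reported this way only says the capture never saw those bytes; it
cannot tell whether they were lost at the mirror port, on the cable, or in
the capture stack.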

PF_RING thinks it isn't dropping any (it does report drops very occasionally,
so it's not that it always reports 0). Our network switch expert is
adamant the switch won't be dropping any either, but it could be faulty
cabling or something. Strangely, the loss doesn't seem to depend much on the
traffic load: it drops roughly the same at 100Mb/s as at 500Mb/s.

I tried the default Ubuntu ixgbe driver and tcpdump (i.e. no PF_RING), 
but that reckoned the kernel dropped packets (so I probably need PF_RING 
or similar). I've also tried disabling RSS (multiqueues).

I did a packet capture using PF_RING + DNA (the fastest flavour of
PF_RING), which isn't missing packets in the streams it got, but I'm still
not sure it captured everything: the capture seemed a bit small, and
I'm not sure how DNA integrates with RSS, so it could conceivably have
filtered out some whole streams. "pfcount" was giving me 1/4 of the
expected rate.

The other odd thing of course is that the switch is VLAN-tagging packets 
in one direction only, which might be confusing things.
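
To illustrate why one-way tagging could matter, here is a rough sketch. It is
not PF_RING's real hash, just a stand-in built on CRC32, and the addresses,
ports and VLAN ID 100 are made up. If the flow key includes the VLAN ID, the
tagged and untagged directions of the same connection get different keys and
may be handed to different threads, so neither thread sees the whole stream.
Leaving the VLAN ID out of the key keeps both directions together.

```python
import zlib


def flow_key(src_ip, src_port, dst_ip, dst_port, vlan_id=0, use_vlan=True):
    """Build a direction-independent key for a flow (hypothetical helper)."""
    # Sort the two endpoints so A->B and B->A produce the same pair.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return (low, high, vlan_id if use_vlan else 0)


def thread_for(key, n_threads=6):
    """Map a flow key to a worker thread index (stand-in for the real hash)."""
    return zlib.crc32(repr(key).encode()) % n_threads


# Client -> server arrives tagged (VLAN 100); server -> client arrives untagged.
out_pkt = ("192.0.2.10", 51000, "198.51.100.20", 80, 100)
in_pkt = ("198.51.100.20", 80, "192.0.2.10", 51000, 0)

# With the VLAN ID in the key, the two directions look like different flows.
print(flow_key(*out_pkt) == flow_key(*in_pkt))  # False

# Ignoring the VLAN ID, both directions share a key and therefore a thread.
k_out = flow_key(*out_pkt, use_vlan=False)
k_in = flow_key(*in_pkt, use_vlan=False)
print(k_out == k_in, thread_for(k_out) == thread_for(k_in))  # True True
```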

I'll know more when we start 10Gb port-mirroring on the campus network, 
but that isn't likely until the end of term in July.

Best Wishes,

Christopher Wakelin,                           c.d.wakelin at reading.ac.uk
IT Services Centre, The University of Reading,  Tel: +44 (0)118 378 2908
Whiteknights, Reading, RG6 2AF, UK              Fax: +44 (0)118 975 3094
