[Oisf-users] File Extraction Woes

Jason Batchelor jxbatchelor at gmail.com
Fri May 30 19:53:49 UTC 2014


Thanks Victor/Cooper:

> If you have 48G of mem, I think you can use a lot more here. Like 16G or
> something.

I used a large number like this initially; however, things eventually got
chaotic and the application ran out of memory. This led me to believe that
the cap was being applied per PF_RING thread rather than as an aggregate
(though I could be totally wrong :) )

Important to note that with the 1.5gb cap I was at 55.8% mem utilization
earlier over a ~12 hr period.

> As stream depth is 5mb, setting 12mb here doesn't really affect anything
> I think. The stream depth cuts stream tracking at 5mb regardless of the
> setting here.

Noted. For some reason I thought reassembly was handled differently for
that traffic if this setting was invoked.

> Also, if you have multiple vlans on the network, it may be worth trying
> to disable.

We do, and I did an earlier test run with this set to false. To no avail,
however; the issues continued.

> In 2.0.1 the stream engine should use less memory and clear memory
> quicker. Could you try 2.0.1?

Just installed it and am working with it now. Unfortunately,
tcp.reassembly_memuse still seems to slowly increase. Below is some data
taken from a recent 35-minute test at the same data rates (now on 2.0.1).
The reassembly memcap is set to 3gb for this run. Stats.log output is
every 10 seconds, so I picked out every 3rd record on a random thread.

grep RxPFRp4p210 stats.log | grep reassembly_memuse | awk '!(NR%3)'
tcp.reassembly_memuse     | RxPFRp4p210               | 154471336
tcp.reassembly_memuse     | RxPFRp4p210               | 243540732
tcp.reassembly_memuse     | RxPFRp4p210               | 301571602
tcp.reassembly_memuse     | RxPFRp4p210               | 314913802
tcp.reassembly_memuse     | RxPFRp4p210               | 285059970
tcp.reassembly_memuse     | RxPFRp4p210               | 272284065
tcp.reassembly_memuse     | RxPFRp4p210               | 254263605
tcp.reassembly_memuse     | RxPFRp4p210               | 253255657
tcp.reassembly_memuse     | RxPFRp4p210               | 272511595
tcp.reassembly_memuse     | RxPFRp4p210               | 296499090
tcp.reassembly_memuse     | RxPFRp4p210               | 311359535
tcp.reassembly_memuse     | RxPFRp4p210               | 283427074
tcp.reassembly_memuse     | RxPFRp4p210               | 255654055
tcp.reassembly_memuse     | RxPFRp4p210               | 270931311
tcp.reassembly_memuse     | RxPFRp4p210               | 299100487
tcp.reassembly_memuse     | RxPFRp4p210               | 291949643
tcp.reassembly_memuse     | RxPFRp4p210               | 293440732
tcp.reassembly_memuse     | RxPFRp4p210               | 290335664
tcp.reassembly_memuse     | RxPFRp4p210               | 269864084
tcp.reassembly_memuse     | RxPFRp4p210               | 267911989
tcp.reassembly_memuse     | RxPFRp4p210               | 276945873
tcp.reassembly_memuse     | RxPFRp4p210               | 283443241
tcp.reassembly_memuse     | RxPFRp4p210               | 345384994
tcp.reassembly_memuse     | RxPFRp4p210               | 347958227
tcp.reassembly_memuse     | RxPFRp4p210               | 330984680
tcp.reassembly_memuse     | RxPFRp4p210               | 305309240
tcp.reassembly_memuse     | RxPFRp4p210               | 274026280
tcp.reassembly_memuse     | RxPFRp4p210               | 328726738
tcp.reassembly_memuse     | RxPFRp4p210               | 369907025
tcp.reassembly_memuse     | RxPFRp4p210               | 352353094
tcp.reassembly_memuse     | RxPFRp4p210               | 333345579
tcp.reassembly_memuse     | RxPFRp4p210               | 306374595
tcp.reassembly_memuse     | RxPFRp4p210               | 277904734
tcp.reassembly_memuse     | RxPFRp4p210               | 298247168
tcp.reassembly_memuse     | RxPFRp4p210               | 301306632
tcp.reassembly_memuse     | RxPFRp4p210               | 266160428
tcp.reassembly_memuse     | RxPFRp4p210               | 307500851
tcp.reassembly_memuse     | RxPFRp4p210               | 307852499
tcp.reassembly_memuse     | RxPFRp4p210               | 290149127
tcp.reassembly_memuse     | RxPFRp4p210               | 306533111
tcp.reassembly_memuse     | RxPFRp4p210               | 291460952
tcp.reassembly_memuse     | RxPFRp4p210               | 288518673
tcp.reassembly_memuse     | RxPFRp4p210               | 324852777
tcp.reassembly_memuse     | RxPFRp4p210               | 302582777
tcp.reassembly_memuse     | RxPFRp4p210               | 293625604
tcp.reassembly_memuse     | RxPFRp4p210               | 307370770
tcp.reassembly_memuse     | RxPFRp4p210               | 311457210
tcp.reassembly_memuse     | RxPFRp4p210               | 311108835
tcp.reassembly_memuse     | RxPFRp4p210               | 320206099
tcp.reassembly_memuse     | RxPFRp4p210               | 307685170
tcp.reassembly_memuse     | RxPFRp4p210               | 282411341
tcp.reassembly_memuse     | RxPFRp4p210               | 288700701
tcp.reassembly_memuse     | RxPFRp4p210               | 306848861
tcp.reassembly_memuse     | RxPFRp4p210               | 281831829
tcp.reassembly_memuse     | RxPFRp4p210               | 292346555
tcp.reassembly_memuse     | RxPFRp4p210               | 314350370
tcp.reassembly_memuse     | RxPFRp4p210               | 286442727
tcp.reassembly_memuse     | RxPFRp4p210               | 283934879
tcp.reassembly_memuse     | RxPFRp4p210               | 270730695
tcp.reassembly_memuse     | RxPFRp4p210               | 269700704
tcp.reassembly_memuse     | RxPFRp4p210               | 327235776
tcp.reassembly_memuse     | RxPFRp4p210               | 326038648
tcp.reassembly_memuse     | RxPFRp4p210               | 359062072
tcp.reassembly_memuse     | RxPFRp4p210               | 359215420
tcp.reassembly_memuse     | RxPFRp4p210               | 371124232
tcp.reassembly_memuse     | RxPFRp4p210               | 379226360
tcp.reassembly_memuse     | RxPFRp4p210               | 411893212
tcp.reassembly_memuse     | RxPFRp4p210               | 429465136
tcp.reassembly_memuse     | RxPFRp4p210               | 452973020
tcp.reassembly_memuse     | RxPFRp4p210               | 464045120
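For what it's worth, a quick way to eyeball that growth is to convert the counters to MB and print the delta between samples. This is just a sketch against sample data in the same pipe-delimited format as stats.log (the sample file name is made up):

```shell
# Hypothetical sketch: summarize tcp.reassembly_memuse growth per sample.
# The here-doc stands in for the real stats.log (assumption: same layout).
cat > /tmp/stats_sample.log <<'EOF'
tcp.reassembly_memuse     | RxPFRp4p210               | 154471336
tcp.reassembly_memuse     | RxPFRp4p210               | 243540732
tcp.reassembly_memuse     | RxPFRp4p210               | 464045120
EOF
# Print each sample in MB plus the change from the previous sample.
awk -F'|' '/reassembly_memuse/ {
    mb = $3 / 1048576
    printf "%.1f MB (%+.1f MB)\n", mb, (NR > 1 ? mb - prev : 0)
    prev = mb
}' /tmp/stats_sample.log
```

Against the full listing above, a persistently positive delta would confirm the slow upward drift.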

Since the reassembly memcap is not reached (yet) and pf_ring can't keep up,
the number of free slots for the pf_ring threads sits continuously at 0,
which amplifies packet loss.

grep capture.kernel_ stats.log | tail -n 10
capture.kernel_packets    | RxPFRp4p212               | 57951223
capture.kernel_drops      | RxPFRp4p212               | 5797512
capture.kernel_packets    | RxPFRp4p213               | 70292808
capture.kernel_drops      | RxPFRp4p213               | 7525246
capture.kernel_packets    | RxPFRp4p214               | 55188744
capture.kernel_drops      | RxPFRp4p214               | 3877128
capture.kernel_packets    | RxPFRp4p215               | 57682274
capture.kernel_drops      | RxPFRp4p215               | 5057610
capture.kernel_packets    | RxPFRp4p216               | 55113863
capture.kernel_drops      | RxPFRp4p216               | 4016017
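To put those counters in perspective, here is a small sketch that pairs packets with drops per thread and prints the drop percentage. Again, sample data stands in for the real stats.log, and the awk field handling assumes the pipe-delimited layout shown above:

```shell
# Hypothetical sketch: compute per-thread kernel drop percentage.
# The here-doc stands in for the real stats.log (assumption: same layout).
cat > /tmp/kernel_sample.log <<'EOF'
capture.kernel_packets    | RxPFRp4p212               | 57951223
capture.kernel_drops      | RxPFRp4p212               | 5797512
capture.kernel_packets    | RxPFRp4p213               | 70292808
capture.kernel_drops      | RxPFRp4p213               | 7525246
EOF
awk -F'|' '
    # Remember the packet count for each thread name (spaces stripped).
    /kernel_packets/ { gsub(/ /, "", $2); pkts[$2] = $3 }
    # When the matching drop line arrives, print drops as a percentage.
    /kernel_drops/   { gsub(/ /, "", $2); printf "%s: %.1f%% dropped\n", $2, 100 * $3 / pkts[$2] }
' /tmp/kernel_sample.log
```

On the numbers above this works out to roughly 7-10% kernel drops per thread, which lines up with the truncated-file symptom.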

> Are you stuck with PF_RING for some reason?

I am for the reasons Victor cited :)

Thanks all for the input so far. I am really hoping to get this working! If
there is any more information I can give to help, please let me know.

Thanks,
Jason


On Fri, May 30, 2014 at 12:17 PM, Victor Julien <lists at inliniac.net> wrote:

> On 05/30/2014 06:31 PM, Jason Batchelor wrote:
> > Hello,
> >
> > I am having some issues with file extraction in Suricata, and after
> > attempting to do many optimizations and review of others' experiences I
> > am still finding myself out of luck. Below is some verbose output of my
> > current configuration and some sample data after ~12 hours of running. I
> > have also included a smaller time frame with a few subtle changes for
> > consideration.
> >
> > Under this configuration the rule I have, which is an IP based rule
> below...
> >
> > alert http any any -> $MY_LOCAL_IP any (msg:"FILE PDF"; filemagic:"PDF
> > document"; filestore; sid:1; rev:1;)
> >
> > Does not trigger at all when the reassembly mem cap is reached. Even
> > when it does (when reassembly memcap is below the threshold), I get
> > truncated PDFs. I have tried adjusting things like the reassembly
> > memcap, however, when I do, I very quickly run into a large amount of
> > packet loss because the number of free slots PF_RING can issue is not
> > able to keep up (details below). Additionally, reassembly mem cap seems
> > to slowly increase over time, eventually reaching its peak before the
> > number of free ring slots can finally keep up (presumably due to
> > segments being dropped).
> >
> > I have struggled playing with time out values as well really to no avail
> > (details below).
> >
> > When I turn http logging on, I do see the website that I go to being
> > properly logged fwiw.
> >
> > I feel like I must be doing something wrong, or I am not seeing
> > something obvious. After reviewing many blogs and howtos, it seems folks
> > are able to do what I am trying to accomplish with the same (sometimes
> > more) data rates and much less hardware.
> >
> > I have tried the following:
> > - Increased min_num_slots to 65534 for PF_RING
> > - Tinkered with TCP timeout settings
> > - Adjusted reassembly memcap
> >
> > Kindly take a look at the details I have listed below and let me know if
> > there is anything you can suggest. I am curious if I am just plain at
> > the limit of my hardware and need to consider upgrading and/or getting
> > PF_RING with DNA. Or, perhaps there are a few more items I should
> > consider within the application itself.
> >
> > One final thing to consider, would tcp sequence randomization
> > significantly impact things? I would need to get in touch with the folks
> > responsible to see if we have this on but thought I would ask here as
> well!
> >
> > Many thanks in advance for your time looking at this!
> >
> > == Profile ==
> >
> > CentOS 6.5 Linux
> > Kernel 2.6.32-431.11.2.el6.x86_64
> >
> > Installed Suricata 2.0 and PF_RING 6.0.1 from source.
> >
> > Machine sees ~400MB/s at peak load.
> >
> > == Tuning ==
> >
> > I've tuned the ixgbe NIC with the following settings...
> >
> > ethtool -K p4p2 tso off
> > ethtool -K p4p2 gro off
> > ethtool -K p4p2 lro off
> > ethtool -K p4p2 gso off
> > ethtool -K p4p2 rx off
> > ethtool -K p4p2 tx off
> > ethtool -K p4p2 sg off
> > ethtool -K p4p2 rxvlan off
> > ethtool -K p4p2 txvlan off
> > ethtool -N p4p2 rx-flow-hash udp4 sdfn
> > ethtool -N p4p2 rx-flow-hash udp6 sdfn
> > ethtool -n p4p2 rx-flow-hash udp6
> > ethtool -n p4p2 rx-flow-hash udp4
> > ethtool -C p4p2 rx-usecs 1000
> > ethtool -C p4p2 adaptive-rx off
> >
> > It is also using the latest driver available. I have also tried to
> > optimize things in the sysctl.conf
> >
> > # -- 10gbe tuning from Intel ixgb driver README -- #
> >
> > # turn off selective ACK and timestamps
> > net.ipv4.tcp_sack = 0
> > net.ipv4.tcp_timestamps = 0
> >
> > # memory allocation min/pressure/max.
> > # read buffer, write buffer, and buffer space
> > net.ipv4.tcp_rmem = 10000000 10000000 10000000
> > net.ipv4.tcp_wmem = 10000000 10000000 10000000
> > net.ipv4.tcp_mem = 10000000 10000000 10000000
> >
> > net.core.rmem_max = 524287
> > net.core.wmem_max = 524287
> > net.core.rmem_default = 524287
> > net.core.wmem_default = 524287
> > net.core.optmem_max = 524287
> > net.core.netdev_max_backlog = 300000
> >
> > == Hardware Specs ==
> > CPU: Intel Xeon CPU @ 2.40Ghz x 32
> > RAM: 48G
> > NIC:
> >   *-network:1
> >        description: Ethernet interface
> >        product: Ethernet 10G 2P X520 Adapter
> >        vendor: Intel Corporation
> >        physical id: 0.1
> >        bus info: pci at 0000:42:00.1
> >        logical name: p4p2
> >        version: 01
> >        serial: a0:36:9f:07:ec:02
> >        capacity: 1GB/s
> >        width: 64 bits
> >        clock: 33MHz
> >        capabilities: pm msi msix pciexpress vpd bus_master cap_list rom
> > ethernet physical fibre 1000bt-fd autonegotiation
> >        configuration: autonegotiation=on broadcast=yes driver=ixgbe
> > driverversion=3.21.2 duplex=full firmware=0x8000030d latency=0 link=yes
> > multicast=yes port=fibre promiscuous=yes
> >        resources: irq:76 memory:d0f00000-d0f7ffff ioport:7ce0(size=32)
> > memory:d0ffc000-d0ffffff memory:d1100000-d117ffff(prefetchable)
> > memory:d1380000-d147ffff(prefetchable)
> > memory:d1480000-d157ffff(prefetchable)
> >
> > == Suricata Config ==
> > Below are some details that may be relevant...
> >
> > runmode: workers
> >
> > host-mode: sniffer-only
> >
> > default-packet-size: 9000
> >
> > - file-store:
> >     enabled: yes       # set to yes to enable
> >     log-dir: files    # directory to store the files
> >     force-magic: yes   # force logging magic on all stored files
> >     force-md5: yes     # force logging of md5 checksums
> >     waldo: file.waldo # waldo file to store the file_id across runs
> >
> > defrag:
> >   memcap: 512mb
> >   hash-size: 65536
> >   trackers: 65535  # number of defragmented flows to follow
> >   max-frags: 65535 # number of fragments to keep (higher than trackers)
> >   prealloc: yes
> >   timeout: 30
> >
> > flow:
> >   memcap: 1gb
> >   hash-size: 1048576
> >   prealloc: 1048576
> >   emergency-recovery: 30
> >
> > flow-timeouts:
> >   default:
> >     new: 1
> >     established: 5
> >     closed: 0
> >     emergency-new: 1
> >     emergency-established: 1
> >     emergency-closed: 0
> >   tcp:
> >     new: 15
> >     established: 100
> >     closed: 5
> >     emergency-new: 1
> >     emergency-established: 1
> >     emergency-closed: 0
> >   udp:
> >     new: 5
> >     established: 10
> >     emergency-new: 1
> >     emergency-established: 1
> >   icmp:
> >     new: 1
> >     established: 5
> >     emergency-new: 1
> >     emergency-established: 1
> >
> > stream:
> >   memcap: 10gb
>
> This is excessive, although it won't hurt.
>
> >   checksum-validation: no        # reject wrong csums
> >   prealloc-sessions: 500000
> >   midstream: false
> >   async-oneside: false
> >   inline: no                     # auto will use inline mode in IPS
> > mode, yes or no set it statically
> >   reassembly:
> >     memcap: 1.5gb
>
> If you have 48G of mem, I think you can use a lot more here. Like 16G or
> something.
>
> >     depth: 5mb
> >     toserver-chunk-size: 2560
> >     toclient-chunk-size: 2560
> >     randomize-chunk-size: yes
> >
> > host:
> >   hash-size: 4096
> >   prealloc: 1000
> >   memcap: 16777216
> >
> >
> > pfring:
> >   - interface: p4p2
> >     threads: 16
> >     cluster-id: 99
> >     cluster-type: cluster_flow
> >     checksum-checks: no
> >   - interface: default
> >
> > http:
> >    enabled: yes
> >    libhtp:
> >       default-config:
> >         personality: IDS
> >
> >         # Can be specified in kb, mb, gb.  Just a number indicates
> >         # it's in bytes.
> >         request-body-limit: 12mb
> >         response-body-limit: 12mb
>
> As stream depth is 5mb, setting 12mb here doesn't really affect anything
> I think. The stream depth cuts stream tracking at 5mb regardless of the
> setting here.
>
> >
> > == ~12 hours (above config) =
> >
> > top - 14:58:59 up 18:23,  3 users,  load average: 6.44, 4.83, 4.32
> > Tasks: 664 total,   1 running, 663 sleeping,   0 stopped,   0 zombie
> > Cpu(s): 17.9%us,  0.1%sy,  0.0%ni, 80.3%id,  0.0%wa,  0.0%hi,  1.7%si,
> > 0.0%st
> > Mem:  49376004k total, 29289768k used, 20086236k free,    68340k buffers
> > Swap:  2621432k total,        0k used,  2621432k free,   820172k cached
> >
> >   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> > 17616 root      20   0 27.0g  26g  16g S 621.4 55.8   3532:51
> Suricata-Main
> >
> > watch 'cat /proc/net/pf_ring/*p4p2* | egrep "Num Free Slots|Tot
> > Packets|Tot Pkt Lost"'
> > ; First three threads...
> > Tot Packets        : 627370957
> > Tot Pkt Lost       : 3582014
> > Num Free Slots     : 118705
> > Tot Packets        : 676753767
> > Tot Pkt Lost       : 5292092
> > Num Free Slots     : 118745
> > Tot Packets        : 665348839
> > Tot Pkt Lost       : 3841911
> > Num Free Slots     : 118677
> > ...
> >
> > watch -n 10 'cat stats.log | egrep
> > "reassembly_memuse|segment_memcap_drop" | tail -n 32'
> > ; First three threads...
> > tcp.segment_memcap_drop   | RxPFRp4p21                | 25782329
> > tcp.reassembly_memuse     | RxPFRp4p21                | 1610612705
> > tcp.segment_memcap_drop   | RxPFRp4p22                | 26161478
> > tcp.reassembly_memuse     | RxPFRp4p22                | 1610612705
> > tcp.segment_memcap_drop   | RxPFRp4p23                | 25813867
> > tcp.reassembly_memuse     | RxPFRp4p23                | 1610612705
> >
> > grep 'reassembly_gap' stats.log | tail -n 10
> > tcp.reassembly_gap        | RxPFRp4p27                | 777366
> > tcp.reassembly_gap        | RxPFRp4p28                | 774896
> > tcp.reassembly_gap        | RxPFRp4p29                | 781761
> > tcp.reassembly_gap        | RxPFRp4p210               | 776427
> > tcp.reassembly_gap        | RxPFRp4p211               | 778734
> > tcp.reassembly_gap        | RxPFRp4p212               | 773203
> > tcp.reassembly_gap        | RxPFRp4p213               | 781125
> > tcp.reassembly_gap        | RxPFRp4p214               | 776043
> > tcp.reassembly_gap        | RxPFRp4p215               | 781790
> > tcp.reassembly_gap        | RxPFRp4p216               | 783368
> >
> > == PF RING ==
> >
> > PF_RING Version          : 6.0.1 ($Revision: exported$)
> > Total rings              : 16
> >
> > Standard (non DNA) Options
> > Ring slots               : 65534
> > Slot version             : 15
> > Capture TX               : No [RX only]
> > IP Defragment            : No
> > Socket Mode              : Standard
> > Transparent mode         : Yes [mode 0]
> > Total plugins            : 0
> > Cluster Fragment Queue   : 9175
> > Cluster Fragment Discard : 597999
> >
> > == ~30 min (with changes) ==
> >
> > FWIW, when I increase the reassembly memcap and timeouts to the following...
> >
> > flow-timeouts:
> >   default:
> >     new: 5
> >     established: 50
> >     closed: 0
> >     emergency-new: 1
> >     emergency-established: 1
> >     emergency-closed: 0
> >   tcp:
> >     new: 15
> >     established: 100
> >     closed: 10
> >     emergency-new: 1
> >     emergency-established: 1
> >     emergency-closed: 0
> >   udp:
> >     new: 5
> >     established: 50
> >     emergency-new: 1
> >     emergency-established: 1
> >   icmp:
> >     new: 1
> >     established: 5
> >     emergency-new: 1
> >     emergency-established: 1
> >
> > reassembly:
> >   memcap: 3gb
> >   depth: 5mb
> >
> > These are the results, note how there are no more free slots for
> > PF_RING. I believe this results in increased packet loss... which is
> > likely resulting in my truncated files that I receive when I pull a PDF.
> >
> > watch 'cat /proc/net/pf_ring/*p4p2* | egrep "Num Free Slots|Tot
> > Packets|Tot Pkt Lost"'
> > ; First three threads...
> > Tot Packets        : 80281541
> > Tot Pkt Lost       : 44290194
> > Num Free Slots     : 0
> > Tot Packets        : 81926241
> > Tot Pkt Lost       : 17412402
> > Num Free Slots     : 0
> > Tot Packets        : 80108557
> > Tot Pkt Lost       : 14667061
> > Num Free Slots     : 0
> >
> > watch -n 10 'cat stats.log | egrep
> > "reassembly_memuse|segment_memcap_drop" | tail -n 32'
> > ; First three threads...
> > tcp.segment_memcap_drop   | RxPFRp4p21                | 0
> > tcp.reassembly_memuse     | RxPFRp4p21                | 1681598708
> > tcp.segment_memcap_drop   | RxPFRp4p22                | 0
> > tcp.reassembly_memuse     | RxPFRp4p22                | 1681626644
> > tcp.segment_memcap_drop   | RxPFRp4p23                | 0
> > tcp.reassembly_memuse     | RxPFRp4p23                | 1681597556
> > tcp.segment_memcap_drop   | RxPFRp4p24                | 0
> > *** Important to note here: the reassembly memuse seems to steadily
> > increase over time. After a few minutes of putting this in, it has risen
> > to 2022140776 across threads. Makes me think things are not being freed
> > quickly... (timeout/depth issue?)
> >
> > grep 'reassembly_gap' stats.log | tail -n 10
> > tcp.reassembly_gap        | RxPFRp4p27                | 27603
> > tcp.reassembly_gap        | RxPFRp4p28                | 26677
> > tcp.reassembly_gap        | RxPFRp4p29                | 26869
> > tcp.reassembly_gap        | RxPFRp4p210               | 25031
> > tcp.reassembly_gap        | RxPFRp4p211               | 23988
> > tcp.reassembly_gap        | RxPFRp4p212               | 23809
> > tcp.reassembly_gap        | RxPFRp4p213               | 26420
> > tcp.reassembly_gap        | RxPFRp4p214               | 25271
> > tcp.reassembly_gap        | RxPFRp4p215               | 26285
> > tcp.reassembly_gap        | RxPFRp4p216               | 26848
>
> In 2.0.1 the stream engine should use less memory and clear memory
> quicker. Could you try 2.0.1?
>
> Also, if you have multiple vlans on the network, it may be worth trying
> to disable:
>
> vlan:
>   use-for-tracking: true
>
> I think you've probably checked all or most things on them, but perhaps
> these diagrams here can be of some help here:
>
> https://redmine.openinfosecfoundation.org/projects/suricata/wiki/Self_Help_Diagrams
>
> --
> ---------------------------------------------
> Victor Julien
> http://www.inliniac.net/
> PGP: http://www.inliniac.net/victorjulien.asc
> ---------------------------------------------
>
> _______________________________________________
> Suricata IDS Users mailing list: oisf-users at openinfosecfoundation.org
> Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
> OISF: http://www.openinfosecfoundation.org/
>