[Oisf-users] Suricata 2Gbit/s traffic drops on AWS

Peter Manev petermanev at gmail.com
Sat Aug 24 11:08:43 UTC 2019



> -- 
> Regards,
> Peter Manev 


> On 24 Aug 2019, at 06:54, Tiago Faria <tiago.faria.backups at gmail.com> wrote:
> 
> Rollover can help with packet loss by sending packets to a new socket when the current one is full. As per the documentation, this can help with packet loss on single intensive flows, even though there are downsides to using it (and it might even result in you missing alerts, since Suricata might not be able to analyze all the traffic).

Just for info, “rollover” should not be used:

https://github.com/OISF/suricata/commit/5d76f0897cc862a7096749355d261e2b3d130e0d
It causes tracking issues.
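
To check whether it is still set in a config, here is a minimal sketch. The helper name find_rollover is made up for illustration, and the pattern only looks for an uncommented "rollover: yes" style line, so treat it as a rough check rather than a full YAML parse:

```shell
# find_rollover: hypothetical helper, not part of Suricata.
# Prints every uncommented "rollover: yes" line (with its line number)
# found in the given config file; returns non-zero when there is none.
find_rollover() {
    grep -nE '^[[:space:]]*(-[[:space:]]*)?rollover:[[:space:]]*yes' "$1"
}

# Usage (config path as used earlier in this thread):
#   find_rollover /etc/suricata/suricata.yaml
```

Any line it prints is a candidate to remove or comment out before restarting Suricata.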

> 
> You don't _have to_ do the placement group, especially if you're not planning on doing 10/15Gbps+.
> 
>> On Sat, Aug 24, 2019 at 11:38 AM Shell_Xu <xuh881026 at gmail.com> wrote:
>> New question: I added 'rollover:yes' to the configuration file and found that the packet loss rate dropped. The 5.0dev version does not have this setting by default. Is 'rollover:yes' obsolete?
>> In the test, the packet loss rate dropped significantly, but it was not stable. Why is this? What does this setting do?
>> This result comes from the configuration change alone; I did not add the EC2 instances to a Placement Group.
>> 
>> The verification results are as follows
>> <image.png>
>> 
>> On Sat, Aug 24, 2019 at 3:49 PM Tiago Faria <tiago.faria.backups at gmail.com> wrote:
>>> You can set up mirror sessions however you want, including between AWS accounts. To get the best performance, however, placing them in the same placement group will help substantially. 
>>> 
>>> I’d first check if this helps with the problem you’re having though. 
>>> 
>>>> On Sat, 24 Aug 2019 at 01:46, Shell_Xu <xuh881026 at gmail.com> wrote:
>>>> Hi,
>>>>     Thank you for your help!
>>>>     'What I recommend is the creation of a Placement Group of type Cluster and deploy the EC2 instances inside that Placement Group. '
>>>>     Does this mean that all the servers I monitor need to be deployed in the Placement Group?
>>>>     e.g.:
>>>>         Suricata, Web Server, DB Server, Redis Cluster...
>>>> 
>>>> On Sat, Aug 24, 2019 at 1:38 AM Tiago Faria <tiago.faria.backups at gmail.com> wrote:
>>>>> Hi,
>>>>> 
>>>>> It can be fixed, yes, but it requires deployment of the EC2 instances (or re-deployment). What I recommend is the creation of a Placement Group of type Cluster and deploy the EC2 instances inside that Placement Group. 
>>>>> 
>>>>>> On Fri, Aug 23, 2019 at 5:48 PM Shell_Xu <xuh881026 at gmail.com> wrote:
>>>>>> I am not sure whether I am using Placement Groups. If I am not, can this problem still be solved?
>>>>>> 
>>>>>> On Fri, Aug 23, 2019 at 11:06 PM Tiago Faria <tiago.faria.backups at gmail.com> wrote:
>>>>>>> Are you using EC2 Placement Groups? Ideally you would use Cluster as much as possible exactly to prevent underlying hardware performance issues. 
>>>>>>> 
>>>>>>> It is also the recommended configuration for HPC applications, and Suricata would greatly benefit from that. 
>>>>>>> 
>>>>>>>> On Fri, 23 Aug 2019 at 15:54, 徐慧 <xuh881026 at gmail.com> wrote:
>>>>>>>> hi, again:
>>>>>>>>     Yes, I am using Elastic Network Adapter (ENA)
>>>>>>>>     Since the EC2 instance runs on shared underlying hardware, many network interface hardware settings are not available.
>>>>>>>>     I don't know how to optimize Suricata on EC2, can you help me?
>>>>>>>> 
>>>>>>>>      $ modinfo ena
>>>>>>>> 
>>>>>>>>     filename:       /lib/modules/4.15.0-1044-aws/kernel/drivers/net/ethernet/amazon/ena/ena.ko
>>>>>>>>     version:        2.0.3K
>>>>>>>>     license:        GPL
>>>>>>>>     description:    Elastic Network Adapter (ENA)
>>>>>>>>     author:         Amazon.com, Inc. or its affiliates
>>>>>>>>     srcversion:     1980993534E135DFC7933C4
>>>>>>>>     alias:          pci:v00001D0Fd0000EC21sv*sd*bc*sc*i*
>>>>>>>>     alias:          pci:v00001D0Fd0000EC20sv*sd*bc*sc*i*
>>>>>>>>     alias:          pci:v00001D0Fd00001EC2sv*sd*bc*sc*i*
>>>>>>>>     alias:          pci:v00001D0Fd00000EC2sv*sd*bc*sc*i*
>>>>>>>>     depends:
>>>>>>>>     retpoline:      Y
>>>>>>>>     intree:         Y
>>>>>>>>     name:           ena
>>>>>>>>     vermagic:       4.15.0-1044-aws SMP mod_unload
>>>>>>>>     signat:         PKCS#7
>>>>>>>>     signer:
>>>>>>>>     sig_key:
>>>>>>>>     sig_hashalgo:   md4
>>>>>>>>     parm:           debug:Debug level (0=none,...,16=all) (int)
>>>>>>>> 
>>>>>>>> On Fri, Aug 23, 2019 at 6:51 PM Tiago Faria <tiago.faria.backups at gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> Based on the instance type and interface name, you're most likely using enhanced networking, but, to be on the safe side, can you confirm?
>>>>>>>>> 
>>>>>>>>> $ modinfo ena
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Fri, Aug 23, 2019 at 3:07 AM 徐慧 <xuh881026 at gmail.com> wrote:
>>>>>>>>>> hi, team:
>>>>>>>>>>      Since AWS traffic mirroring uses a VXLAN tunnel, I have to use the 5.0dev version. I deployed Suricata on AWS, but recently noticed that 'capture.kernel_drops' appears in stats.log when traffic reaches 2Gbit/s. When I rsync a large file, 'capture.kernel_drops' also appears in stats.log. The default ET rules are loaded.
>>>>>>>>>>      I hope someone can help me, any advice is welcome! I need your help very much.
>>>>>>>>>>     
>>>>>>>>>>     # Client rsync files
>>>>>>>>>>     $ rsync -trovpgP xxx.tgz /usr/local/data/xxx.tgz
>>>>>>>>>>     sending incremental file list
>>>>>>>>>>     xxx.tgz
>>>>>>>>>>     3,361,243,136  51%  114.14MB/s    0:00:27
>>>>>>>>>> 
>>>>>>>>>>     # Suricata Server:
>>>>>>>>>>     $ suricata --af-packet -c /etc/suricata/suricata.yaml
>>>>>>>>>>     [24073] 23/8/2019 -- 01:51:19 - (tm-threads.c:2145) <Notice> (TmThreadWaitOnThreadInit) -- all 14 packet processing threads, 4 management threads initialized, engine started.
>>>>>>>>>>     [24073] 23/8/2019 -- 01:53:58 - (suricata.c:2851) <Notice> (SuricataMainLoop) -- Signal Received.  Stopping engine.
>>>>>>>>>>     [24073] 23/8/2019 -- 01:54:01 - (util-device.c:317) <Notice> (LiveDeviceListClean) -- Stats for 'ens5':  pkts: 11270384, drop: 2046365 (18.16%), invalid chksum: 0
>>>>>>>>>> 
>>>>>>>>>>     According to the official documentation, I made some optimizations.
>>>>>>>>>>     https://suricata.readthedocs.io/en/latest/performance/packet-capture.html#rss
>>>>>>>>>>     But I can't set RSS queues to 1
>>>>>>>>>>     ethtool -L ens5 combined 1
>>>>>>>>>>     Cannot set device channel parameters: Operation not supported
>>>>>>>>>> 
>>>>>>>>>>     Amazon EC2 C5
>>>>>>>>>>     EC2 Hardware:
>>>>>>>>>>     RAM: 32G
>>>>>>>>>>     CPU(single): 16 Core (Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz)
>>>>>>>>>>     NIC: 
>>>>>>>>>>         ethtool -l ens5
>>>>>>>>>>         Channel parameters for ens5:
>>>>>>>>>>         Pre-set maximums:
>>>>>>>>>>         RX:	8
>>>>>>>>>>         TX:	8
>>>>>>>>>>         Other:	0
>>>>>>>>>>         Combined:	0
>>>>>>>>>>         Current hardware settings:
>>>>>>>>>>         RX:	8
>>>>>>>>>>         TX:	8
>>>>>>>>>>         Other:	0
>>>>>>>>>>         Combined:	0
>>>>>>>>>> 
>>>>>>>>>>         ethtool -i ens5
>>>>>>>>>>         driver: ena
>>>>>>>>>>         version: 2.0.3K
>>>>>>>>>>         firmware-version:
>>>>>>>>>>         expansion-rom-version:
>>>>>>>>>>         bus-info: 0000:00:05.0
>>>>>>>>>>         supports-statistics: yes
>>>>>>>>>>         supports-test: no
>>>>>>>>>>         supports-eeprom-access: no
>>>>>>>>>>         supports-register-dump: no
>>>>>>>>>>         supports-priv-flags: no
>>>>>>>>>> 
>>>>>>>>>>     Suricata Version: 5.0.0-dev (3a912446a 2019-07-22)
>>>>>>>>>>     Suricata Config:
>>>>>>>>>>         af-packet:
>>>>>>>>>>           - interface: ens5
>>>>>>>>>>             threads: 14
>>>>>>>>>>             cluster-id: 99
>>>>>>>>>>             # Default AF_PACKET cluster type. AF_PACKET can load balance per flow or per hash.
>>>>>>>>>>             cluster-type: cluster_flow
>>>>>>>>>>             defrag: yes
>>>>>>>>>>             use-mmap: yes
>>>>>>>>>>             mmap-locked: yes
>>>>>>>>>>             tpacket-v3: yes
>>>>>>>>>>             ring-size: 400000
>>>>>>>>>>             block-size: 393216
>>>>>>>>>>             #block-timeout: 10
>>>>>>>>>>             #use-emergency-flush: yes
>>>>>>>>>>             # buffer-size: 32768
>>>>>>>>>>             # disable-promisc: no
>>>>>>>>>>             #checksum-checks: kernel
>>>>>>>>>>             #bpf-filter: port 80 or udp
>>>>>>>>>>             #copy-mode: ips
>>>>>>>>>>             #copy-iface: eth1
>>>>>>>>>> 
>>>>>>>>>>           - interface: default
>>>>>>>>>>             threads: auto
>>>>>>>>>>             use-mmap: yes
>>>>>>>>>>             tpacket-v3: yes
>>>>>>>>>> 
>>>>>>>>>>         max-pending-packets: 1024
>>>>>>>>>>         runmode: workers
>>>>>>>>>>         default-packet-size: 1522
>>>>>>>>>> 
>>>>>>>>>>         defrag:
>>>>>>>>>>             memcap: 4gb
>>>>>>>>>>             hash-size: 65536
>>>>>>>>>>             trackers: 65535 # number of defragmented flows to follow
>>>>>>>>>>             max-frags: 65535 # number of fragments to keep (higher than trackers)
>>>>>>>>>>             prealloc: yes
>>>>>>>>>>             timeout: 60
>>>>>>>>>> 
>>>>>>>>>>         flow:
>>>>>>>>>>             memcap: 4gb
>>>>>>>>>>             hash-size: 1048576
>>>>>>>>>>             prealloc: 1048576
>>>>>>>>>>             emergency-recovery: 30
>>>>>>>>>> 
>>>>>>>>>>         stream:
>>>>>>>>>>             memcap: 4gb
>>>>>>>>>>             checksum-validation: no
>>>>>>>>>>             inline: no
>>>>>>>>>>             bypass: yes
>>>>>>>>>>             reassembly:
>>>>>>>>>>                 memcap: 8gb
>>>>>>>>>>                 depth: 1mb
>>>>>>>>>>                 toserver-chunk-size: 2560
>>>>>>>>>>                 toclient-chunk-size: 2560
>>>>>>>>>>                 randomize-chunk-size: yes
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>         detect:
>>>>>>>>>>             profile: custom
>>>>>>>>>>             custom-values:
>>>>>>>>>>                 toclient-groups: 200
>>>>>>>>>>                 toserver-groups: 200
>>>>>>>>>>             sgh-mpm-context: auto
>>>>>>>>>>             inspection-recursion-limit: 3000
>>>>>>>>>> 
>>>>>>>>>>         mpm-algo: hs
>>>>>>>>>>         spm-algo: hs
>>>>>>>>>> 
>>>>>>>>>>         threading:
>>>>>>>>>>             set-cpu-affinity: yes
>>>>>>>>>>             cpu-affinity:
>>>>>>>>>>                 - management-cpu-set:
>>>>>>>>>>                     cpu: [ "0-1" ]
>>>>>>>>>>                     mode: "balanced"
>>>>>>>>>>                     prio:
>>>>>>>>>>                         default: "medium"
>>>>>>>>>>                 - worker-cpu-set:
>>>>>>>>>>                     cpu: [ "2-15" ]
>>>>>>>>>>                     mode: "exclusive"
>>>>>>>>>>                     prio:
>>>>>>>>>>                         default: "high"
>>>>>>>>>> _______________________________________________
>>>>>>>>>> Suricata IDS Users mailing list: oisf-users at openinfosecfoundation.org
>>>>>>>>>> Site: http://suricata-ids.org | Support: http://suricata-ids.org/support/
>>>>>>>>>> List: https://lists.openinfosecfoundation.org/mailman/listinfo/oisf-users
>>>>>>>>>> 
>>>>>>>>>> Conference: https://suricon.net
>>>>>>>>>> Trainings: https://suricata-ids.org/training/