[Oisf-users] Hunting down performance bugs - ioctls

Mark Ashley mark at ibiblio.org
Tue Dec 17 13:42:55 UTC 2013


OS: Solaris 11 x86
Box: X4600 with 16 AMD8220 CPUs, 128GB RAM
Suricata: master copy as at 2013-12-02

The issue is there is a whole lot of dropped packets on one specific
interface - the busy one.

The four threads on it:
capture.kernel_packets    | RxPcapnet121              | 31752019
capture.kernel_drops      | RxPcapnet121              | 22262851
capture.kernel_ifdrops    | RxPcapnet121              | 0
capture.kernel_packets    | RxPcapnet122              | 31751876
capture.kernel_drops      | RxPcapnet122              | 21957313
capture.kernel_ifdrops    | RxPcapnet122              | 0
capture.kernel_packets    | RxPcapnet123              | 31751474
capture.kernel_drops      | RxPcapnet123              | 22119310
capture.kernel_ifdrops    | RxPcapnet123              | 0
capture.kernel_packets    | RxPcapnet124              | 31751149
capture.kernel_drops      | RxPcapnet124              | 21967330
capture.kernel_ifdrops    | RxPcapnet124              | 0

If I use dtrace to profile the running sniffer, it's doing these syscalls
(about 90 seconds or less of monitoring):

  lwp_mutex_timedlock                                               1
  lwp_mutex_wakeup                                                  1
  yield                                                            40
  brk                                                              56
  write                                                          3730
  read                                                           3860
  nanosleep                                                      9923
  ioctl                                                       1337265
  lwp_park                                                    2462619

with some lively time spend doing the thread parking:

  time in ioctl
           value  ------------- Distribution ------------- count
               0 |                                         0
               1 |@                                        40335
               2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@   1120796
               4 |                                         7852
               8 |                                         2026
              16 |                                         1212
              32 |                                         30
              64 |                                         13
             128 |                                         1
             256 |                                         1
             512 |                                         0

  time in lwp_park
           value  ------------- Distribution ------------- count
               0 |                                         0
               1 |                                         1145
               2 |                                         2463
               4 |@@@@@@@@@@@@@@@@@@@@                     1102965
               8 |@@@@@@@                                  402356
              16 |@@@@                                     248113
              32 |@@                                       130450
              64 |@@                                       87946
             128 |@@                                       114933
             256 |@@                                       108908
             512 |                                         24145
            1024 |                                         6694
            2048 |                                         4080
            4096 |                                         1058
            8192 |                                         266
           16384 |                                         11
           32768 |                                         0
           65536 |                                         0
          131072 |                                         0
          262144 |                                         0
          524288 |                                         94
         1048576 |                                         0
         2097152 |                                         30
         4194304 |                                         0
         8388608 |                                         8
        16777216 |                                         0
        33554432 |                                         0
        67108864 |                                         0
       134217728 |                                         0
       268435456 |                                         0
       536870912 |                                         0
      1073741824 |                                         0
      2147483648 |                                         0
      4294967296 |                                         0
      8589934592 |                                         0
     17179869184 |                                         0
     34359738368 |                                         0
     68719476736 |                                         0
    137438953472 |                                         0
    274877906944 |                                         3
    549755813888 |                                         0

If I look at the ustack counts and summarise, here's the most busiest
workflows that the binary is doing. You read it bottom to top, the one at
the bottom had that workflow 5060 times during the 30 second scan. The
stack at the top ran 2843 times.

              suricata`PrintList+0x4e4
              suricata`StreamTcpReassembleAppLayer+0x301
              suricata`StreamTcpReassembleHandleSegmentUpdateACK+0x17b
              suricata`StreamTcpReassembleHandleSegment+0x1a0
              suricata`HandleEstablishedPacketToServer+0x1888
              suricata`StreamTcpPacketStateEstablished+0x20c6
              suricata`StreamTcpPacket+0x82a
              suricata`StreamTcp+0x461
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             2843

              suricata`PrintList+0x128
              suricata`StreamTcpReassembleAppLayer+0x301
              suricata`StreamTcpReassembleHandleSegmentUpdateACK+0x17b
              suricata`StreamTcpReassembleHandleSegment+0x1a0
              suricata`HandleEstablishedPacketToServer+0x1888
              suricata`StreamTcpPacketStateEstablished+0x20c6
              suricata`StreamTcpPacket+0x82a
              suricata`StreamTcp+0x461
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             2857

              suricata`PrintList+0x132
              suricata`StreamTcpReassembleInsertSegment+0xe33
              suricata`StreamTcpReassembleHandleSegmentHandleData+0x845
              suricata`StreamTcpReassembleHandleSegment+0x417
              suricata`HandleEstablishedPacketToClient+0x11bc
              suricata`StreamTcpPacketStateEstablished+0x227f
              suricata`StreamTcpPacket+0x82a
              suricata`StreamTcp+0x461
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             2888

              suricata`PrintList+0x4e4
              suricata`StreamTcpReassembleInsertSegment+0xe33
              suricata`StreamTcpReassembleHandleSegmentHandleData+0x845
              suricata`StreamTcpReassembleHandleSegment+0x417
              suricata`HandleEstablishedPacketToClient+0x11bc
              suricata`StreamTcpPacketStateEstablished+0x227f
              suricata`StreamTcpPacket+0x82a
              suricata`StreamTcp+0x461
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             2908

              suricata`PrintList+0x4f0
              suricata`StreamTcpReassembleInsertSegment+0xe33
              suricata`StreamTcpReassembleHandleSegmentHandleData+0x845
              suricata`StreamTcpReassembleHandleSegment+0x417
              suricata`HandleEstablishedPacketToClient+0x11bc
              suricata`StreamTcpPacketStateEstablished+0x227f
              suricata`StreamTcpPacket+0x82a
              suricata`StreamTcp+0x461
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             2976

              libc.so.1`__lwp_unpark+0xb
              suricata`StreamTcp+0x478
              suricata`TmThreadsSlotVarRun+0xada
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             3092

              libc.so.1`mutex_trylock_adaptive+0x128
              libc.so.1`mutex_lock_impl+0x14a
              libc.so.1`mutex_lock+0x16
              suricata`SCProfilingAddPacketLocks+0x45
              suricata`TmThreadsSlotVarRun+0xdbe
              suricata`TmThreadsSlotProcessPkt+0x61
              suricata`PcapCallbackLoop+0xd0b
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             3236

              libc.so.1`ioctl+0x7
              suricata`PcapCallbackLoop+0xd4f
              libpcap.so.1.5.2`pcap_read_bpf+0x13d
             5060

If I look at the PcapCallbackLoop process in source-pcap.c, it's not
apparent which ioctl is being done.

Here's the ioctl arguments (there's thousands of these):

ioctl:entry ioctl arg0: 13   arg1: 1082147439   arg2: fd9ff4b0
ioctl:entry ioctl arg0: 13   arg1: 1082147439   arg2: fd9ff4b0
ioctl:entry ioctl arg0: 13   arg1: 1082147439   arg2: fd9ff4b0
ioctl:entry ioctl arg0: 13   arg1: 1082147439   arg2: fd9ff4b0
ioctl:entry ioctl arg0: 13   arg1: 1082147439   arg2: fd9ff4b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0
ioctl:entry ioctl arg0: 21   arg1: 1082147439   arg2: fd2074b0

The file descriptors are easy, from /proc/pid/path:

lrwxrwxrwx   1 root     root           0 Dec 17 21:34 13 ->
/devices/pseudo/bpf at 0:bpf
lrwxrwxrwx   1 root     root           0 Dec 17 21:34 21 ->
/devices/pseudo/bpf at 0:bpf


4080426F is the hex value for 1082147439.

The BPF ioctls for Solaris are:

#define BIOCGBLEN        _IOR('B', 102, uint_t)
#define BIOCSBLEN       _IOWR('B', 102, uint_t)
#define BIOCSETF         _IOW('B', 103, struct bpf_program)
#define BIOCFLUSH         _IO('B', 104)
#define BIOCPROMISC       _IO('B', 105)
#define BIOCGDLT         _IOR('B', 106, uint_t)
#define BIOCGETIF        _IOR('B', 107, struct ifreq)
#define BIOCGETLIF       _IOR('B', 107, struct lifreq)
#define BIOCSETIF        _IOW('B', 108, struct ifreq)
#define BIOCSETLIF       _IOW('B', 108, struct lifreq)
#define BIOCGSTATS       _IOR('B', 111, struct bpf_stat)
#define BIOCGSTATSOLD    _IOR('B', 111, struct bpf_stat_old)
#define BIOCIMMEDIATE    _IOW('B', 112, uint_t)
#define BIOCVERSION      _IOR('B', 113, struct bpf_version)
#define BIOCSTCPF        _IOW('B', 114, struct bpf_program)
#define BIOCSUDPF        _IOW('B', 115, struct bpf_program)
#define BIOCGHDRCMPLT    _IOR('B', 116, uint_t)
#define BIOCSHDRCMPLT    _IOW('B', 117, uint_t)
#define BIOCSDLT         _IOW('B', 118, uint_t)
#define BIOCGDLTLIST    _IOWR('B', 119, struct bpf_dltlist)
#define BIOCGSEESENT     _IOR('B', 120, uint_t)
#define BIOCSSEESENT     _IOW('B', 121, uint_t)
#define BIOCSRTIMEOUT    _IOW('B', 122, struct timeval)
#define BIOCGRTIMEOUT    _IOR('B', 123, struct timeval)

#define BIOCSETF32       _IOW('B', 103, struct bpf_program32)
#define BIOCGDLTLIST32  _IOWR('B', 119, struct bpf_dltlist32)
#define BIOCSRTIMEOUT32  _IOW('B', 122, struct timeval32)
#define BIOCGRTIMEOUT32  _IOR('B', 123, struct timeval32)

I'll assume it's a read-only operation so I'll print out the _IOR ones:

#include <net/bpf.h>
#include <net/if.h>
#include <stdio.h>

main() {

    printf("BIOCGBLEN is %lx\n", BIOCGBLEN);
    printf("BIOCGDLT is %lx\n", BIOCGDLT);
    printf("BIOCGETIF is %lx\n", BIOCGETIF);
    printf("BIOCGETLIF is %lx\n", BIOCGETLIF);
    printf("BIOCGSTATS is %lx\n", BIOCGSTATS);
    printf("BIOCGSTATSOLD is %lx\n", BIOCGSTATSOLD);
    printf("BIOCVERSION is %lx\n", BIOCVERSION);
    printf("BIOCGHDRCMPLT is %lx\n", BIOCGHDRCMPLT);
    printf("BIOCGSEESENT is %lx\n", BIOCGSEESENT);
    printf("BIOCGRTIMEOUT is %lx\n", BIOCGRTIMEOUT);
    printf("BIOCGRTIMEOUT32 is %lx\n", BIOCGRTIMEOUT32);
}

This gives:

# ./bpf_decode
BIOCGBLEN is 40044266
BIOCGDLT is 4004426a
BIOCGETIF is 4020426b
BIOCGETLIF is 4078426b
BIOCGSTATS is 4080426f
BIOCGSTATSOLD is 4008426f
BIOCVERSION is 40044271
BIOCGHDRCMPLT is 40044274
BIOCGSEESENT is 40044278
BIOCGRTIMEOUT is 4008427b
BIOCGRTIMEOUT32 is 4008427b

and that tells us BIOCGSTATS is being called repeatedly. That's in
libpcap-1.5.2:pcap-bpf.c:811:pcap_stats_bpf() routine. It's called via the
libpcap-1.5.2:pcap.c:1595:pcap_stats() function and that gets called from
suricata:source-pcap.c:151:PcapDumpCounters().

This code in source-pcap.c:PcapCallbackLoop() is doing it:

    278     /* Trigger one dump of stats every second */
    279     TimeGet(&current_time);
    280     if (current_time.tv_sec != last_dump) {
    281         PcapDumpCounters(ptv);
    282         last_dump = current_time.tv_sec;
    283     }

I'm thinking last_dump should be a global long-lived variable, that
survives _outside_ of the PcapCallbackLoop() function.

--- ../../suricata-master.orig/src/source-pcap.c        Sun Dec  1 21:37:52
2013
+++ source-pcap.c       Wed Dec 18 00:01:05 2013
@@ -224,12 +224,13 @@

 #endif

+time_t last_dump = 0;
+
 void PcapCallbackLoop(char *user, struct pcap_pkthdr *h, u_char *pkt) {
     SCEnter();

     PcapThreadVars *ptv = (PcapThreadVars *)user;
     Packet *p = PacketGetFromQueueOrAlloc();
-    time_t last_dump = 0;
     struct timeval current_time;

     if (unlikely(p == NULL)) {


It hasn't noticably affected my packet dropping rate (I need to code
something to analyse the % per second) but the bpf stats are actually
happening once a second now, and not 10000's of times a second, each time a
packet was scanned. Down from over a million ioctls to just a few hundred.

Bug https://redmine.openinfosecfoundation.org/issues/1071 has been logged.

ta,
Mark.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openinfosecfoundation.org/pipermail/oisf-users/attachments/20131218/b3708d86/attachment-0001.html>


More information about the Oisf-users mailing list