[Oisf-devel] CUDA Issues

Anoop Saldanha poonaatsoc at gmail.com
Sat Oct 16 05:43:42 UTC 2010


On Sat, Oct 16, 2010 at 12:46 AM, Denis T Vollmer <Denis.Vollmer at inl.gov> wrote:

> I built Suricata 1.0.2 on my Ubuntu 10.04 box, which has two CUDA-enabled
> processors. I am running the latest NVIDIA drivers (installed today) and
> the latest 3.2 dev kit. All the CUDA example programs run fine. Suricata
> runs fine when not configured with CUDA, but with CUDA enabled it fails. I
> am guessing that something in the latest CUDA development release
> (September 2010) may have broken Suricata's CUDA implementation. Before I
> go digging too much into the code base, I would appreciate some thoughts
> from anybody who has experience in this area of the code. I have debug
> enabled and provide some of the relevant output below. Note that I have two
> GPUs on the machine and tried running on each of them, one at a time of
> course.
>
> Regards,
> Todd
> [547] 15/10/2010 -- 12:58:02 - (suricata.c:423) <Info> (main) -- This is
> Suricata version 1.0.2
> [547] 15/10/2010 -- 12:58:02 - (util-cpu.c:167) <Info>
> (UtilCpuPrintSummary) -- CPUs Summary:
> [547] 15/10/2010 -- 12:58:02 - (util-cpu.c:169) <Info>
> (UtilCpuPrintSummary) -- CPUs online: 4
> [547] 15/10/2010 -- 12:58:02 - (util-cpu.c:171) <Info>
> (UtilCpuPrintSummary) -- CPUs configured 4
> [547] 15/10/2010 -- 12:58:02 - (util-cuda.c:3988) <Info>
> (SCCudaPrintBasicDeviceInfo) -- GPU Device 1: Tesla C1060, 30
> Multiprocessors, 1296MHz, CUDA Compute Capability 1.3
> [547] 15/10/2010 -- 12:58:02 - (util-cuda.c:3988) <Info>
> (SCCudaPrintBasicDeviceInfo) -- GPU Device 2: Quadro FX 1700, 4
> Multiprocessors, 920MHz, CUDA Compute Capability 1.1
> ...
> [547] 15/10/2010 -- 12:58:03 - (suricata.c:1021) <Info> (main) --
> preallocated 4000 packets. Total memory 317280000
> [547] 15/10/2010 -- 12:58:03 - (flow.c:746) <Info> (FlowInitConfig) --
> initializing flow engine...
> [547] 15/10/2010 -- 12:58:03 - (flow.c:833) <Info> (FlowInitConfig) --
> allocated 524288 bytes of memory for the flow hash... 65536 buckets of size
> 8
> [547] 15/10/2010 -- 12:58:03 - (flow.c:852) <Info> (FlowInitConfig) --
> preallocated 10000 flows of size 192
> [547] 15/10/2010 -- 12:58:03 - (flow.c:854) <Info> (FlowInitConfig) -- flow
> memory usage: 2444288 bytes, maximum: 33554432
> ...
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2068) <Info>
> (SigAddressPrepareStage2) -- 149 total signatures:
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2089) <Info>
> (SigAddressPrepareStage2) -- TCP Source address blocks:     any:    0,
> ipv4:    0, ipv6:    0.
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2109) <Info>
> (SigAddressPrepareStage2) -- UDP Source address blocks:     any:    0,
> ipv4:    0, ipv6:    0.
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2129) <Info>
> (SigAddressPrepareStage2) -- ICMP Source address blocks:    any:    2,
> ipv4:   18, ipv6:    2.
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2133) <Info>
> (SigAddressPrepareStage2) -- building signature grouping structure, stage 2:
> building source address list... done
> [547] 15/10/2010 -- 12:58:03 - (util-cuda.c:302) <Error>
> (SCCudaHandleRetValue) -- [ERRCODE: SC_ERR_CUDA_ERROR(132)] -
> cuCtxPushCurrent failed.  Returned errocode - CUDA_ERROR_INVALID_VALUE
> [547] 15/10/2010 -- 12:58:03 - (detect.c:3262) <Info> (SigGroupBuild) --
> Total Memory available in the CUDA context used for mpm with b2g: 511.69 MB
> [547] 15/10/2010 -- 12:58:03 - (detect.c:3265) <Info> (SigGroupBuild) --
> Free Memory available in the CUDA context used for b2g mpm before any
> allocation is made on the GPU for the context: 286.31 MB
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2712) <Info>
> (SigAddressPrepareStage3) -- building signature grouping structure, stage 3:
> building destination address lists...
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2795) <Info>
> (SigAddressPrepareStage3) -- MPM memory 238134 (dynamic 237826, ctxs 308,
> avg per ctx 23782)
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2797) <Info>
> (SigAddressPrepareStage3) -- max sig id 149, array size 19
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2798) <Info>
> (SigAddressPrepareStage3) -- signature group heads: unique 6, copies 254.
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2800) <Info>
> (SigAddressPrepareStage3) -- MPM instances: 10 unique, copies 2 (none 0).
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2802) <Info>
> (SigAddressPrepareStage3) -- MPM (URI) instances: 1 unique, copies 5 (none
> 0).
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2803) <Info>
> (SigAddressPrepareStage3) -- MPM max patcnt 44, avg 14
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2806) <Info>
> (SigAddressPrepareStage3) -- port maxgroups: 0, avg 0, tot 0
> [547] 15/10/2010 -- 12:58:03 - (detect.c:2807) <Info>
> (SigAddressPrepareStage3) -- building signature grouping structure, stage 3:
> building destination address lists... done
> [547] 15/10/2010 -- 12:58:03 - (detect.c:3290) <Info> (SigGroupBuild) --
> Free Memory available in the CUDA context used for b2g mpm after allocation
> is made on the GPU for the context: 285.31 MB
> [547] 15/10/2010 -- 12:58:03 - (detect.c:3293) <Info> (SigGroupBuild) --
> Total memory consumed by the CUDA context for the b2g mpm: 1.00 MB
> [547] 15/10/2010 -- 12:58:03 - (util-profiling.c:311) <Info>
> (SCProfilingInitRuleCounters) -- Registered 149 rule profiling counters.
> ...
> [562] 15/10/2010 -- 12:58:03 - (source-pcap-file.c:228) <Info>
> (ReceivePcapFileThreadInit) -- reading pcap file icmp_rand_good.pcap
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:370) <Info>
> (StreamTcpInitConfig) -- stream "max_sessions": 262144
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:382) <Info>
> (StreamTcpInitConfig) -- stream "prealloc_sessions": 32768
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:392) <Info>
> (StreamTcpInitConfig) -- stream "memcap": 33554432
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:399) <Info>
> (StreamTcpInitConfig) -- stream "midstream" session pickups: disabled
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:407) <Info>
> (StreamTcpInitConfig) -- stream "async_oneside": disabled
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:416) <Info>
> (StreamTcpInitConfig) -- stream.reassembly "memcap": 67108864
> [547] 15/10/2010 -- 12:58:03 - (stream-tcp.c:436) <Info>
> (StreamTcpInitConfig) -- stream.reassembly "depth": 1048576
> [547] 15/10/2010 -- 12:58:03 - (tm-threads.c:1429) <Info>
> (TmThreadWaitOnThreadInit) -- all 12 packet processing threads, 3 management
> threads initialized, engine started.
> [562] 15/10/2010 -- 12:58:03 - (source-pcap-file.c:190) <Info>
> (ReceivePcapFile) -- pcap file end of file reached (pcap err code 0)
> [562] 15/10/2010 -- 12:58:03 - (source-pcap-file.c:294) <Info>
> (ReceivePcapFileThreadExitStats) --  - (ReceivePcapFile) Packets 33794,
> bytes 22850454.
> [547] 15/10/2010 -- 12:58:03 - (suricata.c:1146) <Info> (main) -- signal
> received
> [547] 15/10/2010 -- 12:58:03 - (suricata.c:1149) <Info> (main) --
> EngineStop received
> packet is NULL for TM: CUDA_PB
> [547] 15/10/2010 -- 12:58:04 - (suricata.c:1169) <Info> (main) -- all
> packets processed by threads, stopping engine
> [547] 15/10/2010 -- 12:58:04 - (suricata.c:1176) <Info> (main) -- time
> elapsed 1s
> packet is NULL for TM: CUDA_PB
> [565] 15/10/2010 -- 12:58:04 - (stream-tcp.c:2874) <Info>
> (StreamTcpExitPrintStats) -- (Stream1) Packets 0
> [572] 15/10/2010 -- 12:58:04 - (alert-fastlog.c:303) <Info>
> (AlertFastLogExitPrintStats) -- (Outputs) Alerts 2
> [572] 15/10/2010 -- 12:58:04 - (alert-unified2-alert.c:603) <Info>
> (Unified2AlertThreadDeinit) -- Alert unified2 module wrote 2 alerts
> [572] 15/10/2010 -- 12:58:04 - (log-httplog.c:396) <Info>
> (LogHttpLogExitPrintStats) -- (Outputs) HTTP requests 0
> [573] 15/10/2010 -- 12:58:04 - (util-cuda.c:302) <Error>
> (SCCudaHandleRetValue) -- [ERRCODE: SC_ERR_CUDA_ERROR(132)] -
> cuCtxPushCurrent failed.  Returned errocode - CUDA_ERROR_INVALID_VALUE
> [574] 15/10/2010 -- 12:58:04 - (flow.c:1107) <Info> (FlowManagerThread) --
> 0 new flows, 0 established flows were timed out, 0 flows in closed state
> [547] 15/10/2010 -- 12:58:04 - (stream-tcp-reassemble.c:288) <Info>
> (StreamTcpReassembleFree) -- Max memuse of the stream reassembly engine
> 11220864 (in use 0)
> [547] 15/10/2010 -- 12:58:05 - (stream-tcp.c:484) <Info>
> (StreamTcpFreeConfig) -- Max memuse of stream engine 4063232 (in use 0)
> [547] 15/10/2010 -- 12:58:05 - (detect.c:2819) <Info>
> (SigAddressCleanupStage1) -- cleaning up signature grouping structure...
> [547] 15/10/2010 -- 12:58:05 - (util-cuda.c:302) <Error>
> (SCCudaHandleRetValue) -- [ERRCODE: SC_ERR_CUDA_ERROR(132)] -
> cuCtxPushCurrent failed.  Returned errocode - CUDA_ERROR_INVALID_VALUE
> [547] 15/10/2010 -- 12:58:05 - (detect.c:2834) <Info>
> (SigAddressCleanupStage1) -- cleaning up signature grouping structure...
> done
> [547] 15/10/2010 -- 12:58:05 - (util-cuda.c:302) <Error>
> (SCCudaHandleRetValue) -- [ERRCODE: SC_ERR_CUDA_ERROR(132)] -
> cuCtxPopCurrent failed.  Returned errocode - CUDA_ERROR_INVALID_CONTEXT
> [547] 15/10/2010 -- 12:58:05 - (suricata.c:1230) <Error> (main) --
> [ERRCODE: SC_ERR_CUDA_HANDLER_ERROR(133)] - Call to SCCudaCtxPopCurrent()
> during the shutdown phase just before the call to SigGroupCleanup()
>
Looks like it's running fine, unless you have terminated the engine?

Those CUDA log entries are a bit misleading. They are not actually errors in
this context. Since we are multithreaded, we end up trying to push CUDA
contexts on different threads, hence all those cuCtxPushCurrent and
cuCtxPopCurrent errors. We need to clean that up somehow.
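
To illustrate why pushing a context from more than one thread produces the
messages in the log, here is a minimal toy model in plain C. It assumes the
CUDA 3.x driver rule that cuCtxPushCurrent only accepts a "floating" context
(one not attached to any thread) and that cuCtxPopCurrent needs a current
context on the calling thread. Every name below (ToyCtx, toy_push, toy_pop,
the TOY_* codes) is hypothetical and is neither the CUDA API nor Suricata
code, and each thread gets a single slot rather than a real context stack.

```c
#include <stddef.h>

/* Toy model only: hypothetical names, not the CUDA driver API. */
typedef enum {
    TOY_SUCCESS = 0,
    TOY_ERROR_INVALID_VALUE,   /* push of a context that is not floating */
    TOY_ERROR_INVALID_CONTEXT  /* pop on a thread with no current context */
} ToyResult;

#define TOY_MAX_THREADS 4

typedef struct {
    int owner;  /* thread index the context is attached to, -1 if floating */
} ToyCtx;

/* Simplification: one "current context" slot per thread, not a stack. */
static ToyCtx *toy_current[TOY_MAX_THREADS];

/* Attach ctx to thread tid; fails unless ctx is floating and tid is free. */
static ToyResult toy_push(int tid, ToyCtx *ctx)
{
    if (ctx == NULL || ctx->owner != -1 || toy_current[tid] != NULL)
        return TOY_ERROR_INVALID_VALUE;
    ctx->owner = tid;
    toy_current[tid] = ctx;
    return TOY_SUCCESS;
}

/* Detach the current context of thread tid, making it floating again. */
static ToyResult toy_pop(int tid)
{
    if (toy_current[tid] == NULL)
        return TOY_ERROR_INVALID_CONTEXT;
    toy_current[tid]->owner = -1;
    toy_current[tid] = NULL;
    return TOY_SUCCESS;
}
```

In this model, if thread 0 pushes a context and thread 1 then pushes the same
one, the second push returns TOY_ERROR_INVALID_VALUE, and a following pop on
thread 1 returns TOY_ERROR_INVALID_CONTEXT because nothing was attached there.
That is the same pairing of error codes that shows up in the log, while the
thread that actually holds the context keeps working.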

-- 
Regards,
Anoop Saldanha