[Oisf-users] Detecting Unicode/UTF html

Mon Feb 20 13:45:21 UTC 2017

I am having an issue with detecting Unicode/UTF-8 characters in
HTML-formatted email. For example, let's say I want to detect "This is
awesome" in Traditional Chinese ("這太棒了"). The signature would basically
be written with content:"|E9 80 99 E5 A4 AA E6 A3 92 E4 BA 86|". As far as
I know, I can't supply a content match in Unicode/UTF directly. Instead I
have to convert those characters into hex bytes so that Suricata can
understand what I am looking for.
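
To illustrate the conversion described above, here is a minimal Python
sketch that turns a string into the pipe-delimited hex form used in a
content match. The helper name to_suricata_content is hypothetical (it is
not part of Suricata or any library), and it assumes the mail body is
UTF-8 encoded.

```python
def to_suricata_content(text: str, encoding: str = "utf-8") -> str:
    """Return the bytes of `text` as a pipe-delimited hex string.

    Hypothetical helper for illustration; assumes the traffic carries
    the text in the given encoding (UTF-8 by default).
    """
    raw = text.encode(encoding)
    # Two uppercase hex digits per byte, separated by spaces.
    return "|" + " ".join(f"{b:02X}" for b in raw) + "|"


pattern = to_suricata_content("這太棒了")
print(f'content:"{pattern}";')
# prints: content:"|E9 80 99 E5 A4 AA E6 A3 92 E4 BA 86|";
```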

If the email is in HTML format, the hex bytes have = between them
(i.e. "E9=80=99=E5=A4=AA=E6=A3=92=E4=BA=86="), which causes the signature
not to alert in Suricata. In Snort, however, if you supply the file_data
modifier in the signature, it drops the = characters and triggers the
alert correctly because the content then matches. This might also be the
case for HTML-formatted web pages, but I haven't confirmed that; I assume
it is probably the same.
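
The = separators look like quoted-printable transfer encoding, which is my
assumption based on the example above. A small Python sketch using the
standard quopri module shows why a raw byte match would miss the encoded
body, and that the original bytes come back once it is decoded. This only
reproduces the encoding on the mail side; it says nothing about what
Suricata or Snort do internally.

```python
import quopri

# UTF-8 bytes of the phrase the signature is looking for.
needle = "這太棒了".encode("utf-8")

# What the bytes look like in the message body if it is
# quoted-printable encoded (assumption: that is the encoding in use).
on_the_wire = quopri.encodestring(needle)
print(on_the_wire)

# The raw bytes are not present in the encoded body, so a plain
# byte-for-byte content match has nothing to hit.
print(needle in on_the_wire)

# After decoding, the original bytes are recovered and would match.
decoded = quopri.decodestring(on_the_wire)
print(needle in decoded)
```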

Any thoughts on whether there is a solution for this in Suricata?