[Oisf-users] Detecting Unicode/UTF html

Mon Feb 20 21:07:07 UTC 2017

On 20/02/17 at 08:45, Clark Kent wrote:
> I am having an issue detecting Unicode/UTF characters in HTML-formatted
> email. For example, let’s say I want to detect “This is awesome” in
> Traditional Chinese (“這太棒了”). The signature would basically be written
> with content:"|E98099E5A4AAE6A392E4BA86|". As far as I know, I can’t
> supply a content match in Unicode/UTF directly. Instead, I have to convert
> those characters into hex so that Suricata can understand what I am
> looking for.
> 
> If the email is in HTML format, the hex bytes will have = between them
> (i.e. “E9=80=99=E5=A4=AA=E6=A3=92=E4=BA=86=”). This causes the signature
> not to alert in Suricata. In Snort, however, if you supply the file_data
> modifier in the signature, it drops the = and triggers the alert
> correctly because the content then matches. This might also be the case
> for HTML-formatted web pages, but I haven’t confirmed it; I assume it
> behaves the same way.
> 
> Any thoughts on whether there is a solution in Suricata?

You could include the hex value for "=" as well?
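
As a rough sketch of what that could look like: the "=" separated form in
the question looks like quoted-printable transfer encoding, so the snippet
below (Python, standard library only) encodes the phrase both ways and
prints each as a |..| hex string of the kind used in a content match. The
helper name suricata_hex is made up for illustration, and whether the
resulting pattern is the best fit for a given rule is an assumption to
verify against real traffic.

```python
import quopri

# Phrase from the question: "This is awesome" in Traditional Chinese.
phrase = "這太棒了"


def suricata_hex(data: bytes) -> str:
    """Hypothetical helper: format bytes as a |..| hex content string."""
    return "|" + " ".join(f"{b:02X}" for b in data) + "|"


# Raw UTF-8 bytes, as they would appear in an unencoded body.
raw = phrase.encode("utf-8")

# Quoted-printable form: each non-ASCII byte becomes "=XX" in ASCII.
qp = quopri.encodestring(raw)

raw_content = suricata_hex(raw)
qp_content = suricata_hex(qp)

print(raw_content)          # pattern for the raw UTF-8 bytes
print(qp.decode("ascii"))   # the text as it appears in the encoded mail
print(qp_content)           # pattern including 3D for each "="
```

One caveat with matching the encoded form: quoted-printable may wrap long
lines with a soft line break (a trailing "="), so a phrase that straddles
a wrap point would presumably not match a single fixed pattern.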

-- 
Andreas Herz