<div dir="ltr"><font color="#000000" face="Times New Roman" size="3">
</font><p style="margin:0in 0in 8pt"><font size="3"><font color="#000000"><font face="Calibri">I am having an issue detecting Unicode (UTF-8) characters in
HTML-formatted email. For example, let's say I want to detect “This is awesome”
in Traditional Chinese (</font><span style="font-family:'pmingliu',serif">“這太棒了”</span></font></font><font color="#000000" face="Calibri" size="3">). The signature would basically be written
with content:"|E98099E5A4AAE6A392E4BA86|". As far as I know, I can’t
supply a content match directly in Unicode/UTF-8. Instead, I have to convert those
characters into hex bytes so that Suricata can understand what I am looking for. </font></p><font color="#000000" face="Times New Roman" size="3">
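<p style="margin:0in 0in 8pt">As a rough illustration of the conversion described above, here is a minimal Python sketch that turns a string into its UTF-8 bytes in the |hex| notation used inside a content match. The helper name to_hex_content is hypothetical and only for illustration; the bytes are separated by spaces purely for readability.</p>

```python
def to_hex_content(text, encoding="utf-8"):
    """Build a content keyword from the encoded bytes of `text` (hypothetical helper)."""
    # Encode the string, then format every byte as two uppercase hex digits.
    hex_bytes = " ".join(f"{b:02X}" for b in text.encode(encoding))
    return f'content:"|{hex_bytes}|";'


print(to_hex_content("這太棒了"))
# prints: content:"|E9 80 99 E5 A4 AA E6 A3 92 E4 BA 86|";
```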
</font><p style="margin:0in 0in 8pt"><font size="3"><font color="#000000"><font face="Calibri">If the email is in HTML format, the hex bytes have = between
them (i.e. “E9=80=99=E5=A4=AA=E6=A3=92=E4=BA=86=”), which looks like quoted-printable encoding. This causes the
signature to not alert in Suricata. In Snort, however, if you supply the
file_data modifier in the signature, it drops the = characters and triggers the alert
correctly because the decoded data matches the signature. This might also be the case for HTML-formatted
web pages, but I haven’t confirmed it; I assume it is probably the same
there. <span> </span></font></font></font></p><font color="#000000" face="Times New Roman" size="3">
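<p style="margin:0in 0in 8pt">To show why the raw bytes on the wire do not match, here is a small Python sketch using the standard-library quopri module. It assumes the body is standard quoted-printable, where every encoded byte is written as = followed by two hex digits; the sample bytes below are constructed for illustration and are not taken from a real capture.</p>

```python
import quopri

# Quoted-printable form of the phrase, as it would appear in the raw message
# body (assumed encoding, sample data built by hand for illustration).
encoded = b"=E9=80=99=E5=A4=AA=E6=A3=92=E4=BA=86"

# The byte pattern the signature is looking for.
pattern = bytes.fromhex("E98099E5A4AAE6A392E4BA86")

# Decoding removes the =XX escapes and restores the original UTF-8 bytes.
decoded = quopri.decodestring(encoded)

print(pattern in encoded)  # False: the encoded body never contains the raw bytes
print(pattern in decoded)  # True: the pattern matches only after decoding
```

<p style="margin:0in 0in 8pt">In other words, a content match on the hex bytes can only hit if the inspection happens on the decoded body rather than on the encoded one.</p>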
</font><p style="margin:0in 0in 8pt"><font size="3"><font color="#000000"><font face="Calibri">Any thoughts on whether there is a solution for this in Suricata? <span> </span></font></font></font></p><font color="#000000" face="Times New Roman" size="3">
</font></div>