[Users] How to filter utf8 messages
Pierre Fortin
pf at pfortin.com
Thu Jul 27 11:22:16 UTC 2023
On Thu, 27 Jul 2023 08:38:18 +0000 Colin Leroy-Mira via Users wrote:
>July 27, 2023 at 9:50 AM, "Slavko" <linux at slavino.sk> wrote:
>
>
>> And another question: I use this regex to score Chinese etc. chars
>> (scripts) in subject in rspamd (perhaps can be useful for OP):
>> [\p{Han}\p{Hiragana}\p{Katakana}\p{Hangul}\p{Arabic}]+
>>
>> Will that work in CM filter regexes?
>
>I'm unsure; you can test regexps in the QuickSearch in extended mode:
>subject regexpcase "..."
>
Wow! I've never seen that regex syntax, and my ancient O'Reilly (7/1997)
"Mastering Regular Expressions" book does not cover it. Therein,
{min,max} is the "interval" quantifier, and there is no mention of "\p".
I tried the suggested regex in QuickSearch, but got no hits. The messages
I'm trying to filter are mostly in Chinese and Japanese; are these covered
by the above suggestion?
Thanks,
Pierre
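
For reference: \p{...} is a Unicode property escape, understood by
engines such as Perl, PCRE and Java. In an engine that supports it,
\p{Han} matches Chinese characters (including Japanese kanji), and
\p{Hiragana} and \p{Katakana} match the Japanese kana, so Chinese and
Japanese text would be covered. Whether the regex engine behind the
filter accepts \p{...} is not confirmed here; "no hits" may only mean
the escape is unsupported. Below is a minimal sketch of a fallback that
uses explicit code point ranges instead, written for Python's standard
re module, which as far as I know does not accept \p{...}. The ranges
are approximations based on Unicode blocks, not exact script
definitions, and the names SCRIPT_RANGES and has_target_script are
made up for illustration.

```python
import re

# Approximate code point ranges for the main Unicode blocks of each script.
# Assumption: only the common blocks are listed; rarer extension blocks
# (and the exact Unicode "Script" property boundaries) are not covered.
SCRIPT_RANGES = {
    "Han": "\u4e00-\u9fff\u3400-\u4dbf",      # CJK Unified Ideographs + Ext. A
    "Hiragana": "\u3040-\u309f",
    "Katakana": "\u30a0-\u30ff",
    "Hangul": "\uac00-\ud7af\u1100-\u11ff",   # Hangul Syllables + Jamo
    "Arabic": "\u0600-\u06ff",
}

# One character class built from all ranges; "+" matches a run of such chars.
TARGET_SCRIPTS = re.compile("[" + "".join(SCRIPT_RANGES.values()) + "]+")


def has_target_script(subject: str) -> bool:
    """Return True if the subject contains at least one character
    from any of the listed ranges."""
    return TARGET_SCRIPTS.search(subject) is not None
```

The same bracketed ranges could be tried in any regex dialect that
accepts literal UTF-8 characters inside a character class, though that
too would need testing in QuickSearch first.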