[Users] How to filter utf8 messages
Slavko
linux at slavino.sk
Thu Jul 27 11:36:37 UTC 2023
Hi,
On Thu, 27 Jul 2023 07:22:16 -0400 Pierre Fortin <pf at pfortin.com>
wrote:
> Wow! I've never seen that regex syntax and my ancient O'Reilly
> (7/1997) "Mastering Regular Expressions" book does not cover it.
> Therein, {min:max} is the "interval" quantifier, and no mention of
> "\p". I tried the suggested regex in QuickSearch; but no hits. The
> messages I'm trying to filter are mostly in Chinese & Japanese; are
> these covered by the above suggestion?
I am neither a regex nor a CJK guru, but I am afraid that in 1997 nobody
cared about Unicode ;-) From my notes:
Japanese:
[\u3040-\u30ff]
[\p{Han}\p{Hiragana}\p{Katakana}]
Chinese:
[\u4e00-\u9FFF]
\p{Han}
Korean:
[\uac00-\ud7a3]
\p{Hangul}
Arabic:
[\u0621-\u064A]
\p{Arabic}
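
A minimal sketch of how the \u ranges from the notes above could be
tried out, assuming Python and its standard "re" module (not the regex
engine of any particular mail client). The dictionary keys and the
function name detect_scripts are made up for illustration. Python's
standard "re" module does not understand \p{...}, so only the explicit
code point ranges are used here; the third-party "regex" package would
be needed for the \p{Han} style.

```python
import re

# Code point ranges copied from the notes above.
# The keys are illustrative labels, not official script names.
SCRIPT_RANGES = {
    "japanese_kana": re.compile(r"[\u3040-\u30ff]"),   # Hiragana and Katakana
    "cjk_ideographs": re.compile(r"[\u4e00-\u9fff]"),  # CJK Unified Ideographs
    "korean_hangul": re.compile(r"[\uac00-\ud7a3]"),   # Hangul syllables
    "arabic": re.compile(r"[\u0621-\u064a]"),          # basic Arabic letters
}


def detect_scripts(text):
    """Return the sorted labels of every range matching at least one character."""
    return sorted(
        name for name, pattern in SCRIPT_RANGES.items() if pattern.search(text)
    )


if __name__ == "__main__":
    for sample in ("Hello", "你好", "こんにちは", "한국어", "مرحبا"):
        print(repr(sample), detect_scripts(sample))
```

Note that Japanese text normally mixes kana with ideographs, so a
Japanese subject line will usually match both the kana range and the
ideograph range.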
Try this: https://www.regular-expressions.info/unicode.html
regards
--
Slavko
https://www.slavino.sk
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://lists.claws-mail.org/pipermail/users/attachments/20230727/1b013b5b/attachment.sig>