[Users] [Bug 4428] New: punctuation stripping from URLs in text emails strips $ signs from end of URL
noreply at thewildbeast.co.uk
Thu Dec 31 15:59:21 CET 2020
https://www.thewildbeast.co.uk/claws-mail/bugzilla/show_bug.cgi?id=4428
Bug ID: 4428
Summary: punctuation stripping from URLs in text emails strips
$ signs from end of URL
Product: Claws Mail
Version: GIT
Hardware: PC
OS: All
Status: NEW
Severity: enhancement
Priority: P3
Component: UI/Message View
Assignee: users at lists.claws-mail.org
Reporter: rhaas at illinois.edu
When encountering a URL like:
https://urldefense.com/v3/__https://eff.org/r.o9g6__;!!DZ3fjg!srHW_CI4QzWk_Et7SAcZwTL_6C2bVOKE-ZLz9eesJB6afpP_kdt4-QeDMY9WyOMX$
in an email, the punctuation-stripping code in get_uri_part removes the
trailing "$" sign, since it is considered real punctuation by the
IS_REAL_PUNCT macro defined in that function (src/common/utils.c):
#define IS_REAL_PUNCT(ch) (g_ascii_ispunct(ch) && !strchr("/?=-_~)", ch))
Unfortunately, urldefense, which my institution uses, has recently started to
construct its redirection URLs so that they all end in "$". I therefore have
to copy each URL to a browser address bar manually and append the "$", which
renders the URL detection useless.
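
The effect can be reproduced outside Claws Mail with a minimal sketch. It
assumes that get_uri_part drops every trailing character matched by
IS_REAL_PUNCT. ispunct() from <ctype.h> stands in for g_ascii_ispunct() so
that the example builds without GLib, and stripped_uri_len is a hypothetical
helper, not a function from utils.c.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same character list as the macro quoted above. ispunct() is a stand-in
 * for g_ascii_ispunct() (assumption: default "C" locale, ASCII input). */
#define IS_REAL_PUNCT(ch) \
	(ispunct((unsigned char)(ch)) && !strchr("/?=-_~)", (ch)))

/* Hypothetical helper: length of uri once all trailing characters that
 * IS_REAL_PUNCT classifies as punctuation have been dropped. */
static size_t stripped_uri_len(const char *uri)
{
	size_t len = strlen(uri);

	while (len > 0 && IS_REAL_PUNCT(uri[len - 1]))
		len--;
	return len;
}
```

Under these assumptions, a URL ending in "$" or "." comes back one character
shorter, while a URL ending in "/" keeps its full length because "/" is in the
exception list.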
A simple fix would be to extend the list of characters in the strchr call in
the macro to include "$". A better one might be to use a list of punctuation
characters instead and make that list user-configurable, for non-English
languages where the Claws Mail authors might not know what is likely to
be a punctuation character (e.g. « and » in French).
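
The second idea could look roughly like the sketch below. All names here
(DEFAULT_STRIP_CHARS, uri_len_after_strip) are hypothetical, the default list
is only a guess at sensible English punctuation, and nothing like this exists
in Claws Mail as far as I know. The comparison is byte-wise, so multi-byte
UTF-8 characters such as « and » would need extra handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical default; a real implementation would read the list from a
 * user preference instead. */
#define DEFAULT_STRIP_CHARS ".,;:!"

/* Hypothetical helper: length of uri once all trailing bytes that appear
 * in strip_chars have been dropped. Anything not listed is kept. */
static size_t uri_len_after_strip(const char *uri, const char *strip_chars)
{
	size_t len = strlen(uri);

	/* uri[len - 1] is never '\0' here, so strchr() cannot match the
	 * terminator of strip_chars by accident. */
	while (len > 0 && strchr(strip_chars, uri[len - 1]) != NULL)
		len--;
	return len;
}
```

With DEFAULT_STRIP_CHARS a trailing "$" survives, and a user who does want it
stripped could pass a list such as ".,;:!$" instead.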
Note that this is apparently a known(ish) issue, based on the
comment just above the macro:
/* FIXME: this stripping of trailing punctuations may bite with other URIs.
* should pass some URI type to this function and decide on that whether
* to perform punctuation stripping */
Given that punctuation stripping seems to be based on a heuristic of what
is likely to end a URL in an email, I do not, of course, know how
likely it is to find emails where "$" really should not be considered part of
the link, e.g.:
$$$Earn money now https://example.com/earn$$$
would likely indicate that one would want to treat "$" as marking the end
of the URL here (in particular if bad HTML-to-text conversion was at work on
the sender's side).
--
You are receiving this mail because:
You are the assignee for the bug.