[Users] CM segfaults after file moves and recursive search

Philippe Gramoullé philippe at gramoulle.com
Sat Sep 14 22:37:32 UTC 2024


Re,

On Sat, 14 Sep 2024 22:09:32 +0200
Philippe Gramoullé <philippe at gramoulle.com> wrote:

> Hi,
> 
> I'm posting it here since Bugzilla is in maintenance mode.
> 
> HW & OS: Raspberry Pi 4, Debian Buster, armv7l, 5.10 kernel, single account, IMAP mailbox (Dovecot)
> 
> $ /usr/local/sylpheed-claws-cvs-gtk3/bin/claws-mail -V
> 
> Claws Mail version 4.3.0-32-g4f9e29
> runtime GTK 3.24.5 / GLib 2.58.3
> buildtime GTK 3.24.5 / GLib 2.58.3
> Compiled-in features:
>  compface
>  Enchant
>  GnuTLS
>  iconv
>  libetpan 1.9
>  libSM
>  librSVG 2.44.10
> 
> How the segfault happened: I was doing some mail cleanup, copying and moving a large number of files around (mainly spam).
> Then I did a recursive search for a header value in the search bar (extended mode with the recursive and sticky options checked): "h X-My-Header"
> As it took too long, I hit the "Clear" button and changed folder.
> 
> A little later i got the following segfault message:
> ...
> (claws-mail:22390): Gtk-WARNING **: 20:35:31.605: Negative content width -1 (allocation 17, extents 9x9) while allocating gadget (node button, owner GtkButton)
> (claws-mail:22390): Gtk-WARNING **: 20:35:31.605: Negative content width -1 (allocation 17, extents 9x9) while allocating gadget (node button, owner GtkButton)
> ** (claws-mail:22390): WARNING **: 20:41:11.808: [2024-09-14 20:41:11] IMAP error on internal.example.com: UID SEARCH error
> 
> gtkcmctree.c:3865 Condition node != NULL failed
> traceback:
> 
> ** (claws-mail:22390): WARNING **: 20:41:12.496: [2024-09-14 20:41:12] IMAP error on internal.example.com: FETCH error
> (claws-mail:22390): Claws-Mail-WARNING **: 20:41:12.497: can't fetch message 1
> [1]+  Segmentation fault      /usr/local/sylpheed-claws-cvs-gtk3/bin/claws-mail
> 
> I have obvious errors in the Dovecot logs (like "Broken physical size in mailbox" or "Cached message size larger than expected"),
> but I would expect CM not to crash. I'm still investigating the Dovecot errors, but they look related to the files that were moved.
> 
> I was able to reproduce the bug under GDB. Backtrace below:
> 
> ...
> (claws-mail:18450): Gtk-WARNING **: 21:02:07.197: Negative content width -1 (allocation 17, extents 9x9) while allocating gadget (node button, owner GtkButton)
> (claws-mail:18450): Gtk-WARNING **: 21:02:07.197: Negative content width -1 (allocation 17, extents 9x9) while allocating gadget (node button, owner GtkButton)
> /home/user/.claws-mail/imapcache/internal.example.com/user/INBOX/829983: fread: Resource temporarily unavailable
> /home/user/.claws-mail/imapcache/internal.example.com/user/INBOX/829983: fread: Resource temporarily unavailable
> ** (claws-mail:18450): WARNING **: 21:06:55.852: [2024-09-14 21:06:55] IMAP error on internal.example.com: UID SEARCH error
> 
> 
> Thread 1 "claws-mail" received signal SIGSEGV, Segmentation fault.
> 0x000c7414 in matcherlist_match ()
> (gdb) 
> (gdb) 
> (gdb) thread apply all bt full
> 
> Thread 6 (Thread 0xb0273f00 (LWP 18562)):
> #0  0xb67501a0 in futex_wait_cancelable (private=0, expected=0, futex_word=0x8d86c4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
>         _a1 = 9275076
>         _nr = 240
>         _a3tmp = 0
>         _a1tmp = 9275076
>         _a3 = 0
>         _a4tmp = 0
>         _a2tmp = 128
>         _a2 = 128
>         _a4 = 0
>         __ret = <optimized out>
>         oldtype = 0
>         err = <optimized out>
>         spin = 0
>         buffer = {__routine = 0xb674fe28 <__condvar_cleanup_waiting>, __arg = 0xb0273890, __canceltype = 0, __prev = 0x0}
>         cbuffer = {wseq = 31991, cond = 0x8d8698, mutex = 0x8d8680, private = 0}
>         err = <optimized out>
>         g = 1
>         flags = <optimized out>
>         g1_start = <optimized out>
>         signals = <optimized out>
>         result = 0
>         wseq = 137403503193700
>         seq = 13146046173158572032
>         private = 0
> #1  0xb67501a0 in __pthread_cond_wait_common (abstime=0x0, mutex=0x0, cond=0x8d8698) at pthread_cond_wait.c:502
>         spin = 0
>         buffer = {__routine = 0xb674fe28 <__condvar_cleanup_waiting>, __arg = 0xb0273890, __canceltype = 0, __prev = 0x0}
>         cbuffer = {wseq = 31991, cond = 0x8d8698, mutex = 0x8d8680, private = 0}
>         err = <optimized out>
>         g = 1
>         flags = <optimized out>
>         g1_start = <optimized out>
>         signals = <optimized out>
>         result = 0
>         wseq = 137403503193700
>         seq = 13146046173158572032
>         private = 0
> #2  0xb67501a0 in __pthread_cond_wait (cond=0x8d8698, mutex=0x0) at pthread_cond_wait.c:655
> #3  0xb668274c in mailsem_internal_wait () at /usr/lib/arm-linux-gnueabihf/libetpan.so.20
> #4  0x001e1bb4 in  ()
> 
> Thread 3 (Thread 0xb17fef00 (LWP 18483)):
> #0  0xb58ac9e0 in __GI___poll (timeout=-1, nfds=2, fds=0x44ef00) at ../sysdeps/unix/sysv/linux/poll.c:29
>         _a1 = 4517632
>         _nr = 168
>         _a3tmp = -1
>         _a1tmp = 4517632
>         _a3 = -1
>         _a2tmp = 2
>         _a2 = 2
>         _sys_result = <optimized out>
>         sc_cancel_oldtype = 0
>         sc_ret = <optimized out>
> #1  0xb58ac9e0 in __GI___poll (fds=0x44ef00, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:26
> #2  0xb5af4f04 in  () at /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0
> 
> Thread 2 (Thread 0xb2192f00 (LWP 18482)):
> #0  0xb58ac9e0 in __GI___poll (timeout=-1, nfds=1, fds=0x464e48) at ../sysdeps/unix/sysv/linux/poll.c:29
>         _a1 = 4607560
>         _nr = 168
>         _a3tmp = -1
>         _a1tmp = 4607560
>         _a3 = -1
>         _a2tmp = 1
>         _a2 = 1
>         _sys_result = <optimized out>
>         sc_cancel_oldtype = 0
>         sc_ret = <optimized out>
> #1  0xb58ac9e0 in __GI___poll (fds=0x464e48, nfds=1, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:26
> #2  0xb5af4f04 in  () at /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0
> 
> Thread 1 (Thread 0xb2458040 (LWP 18450)):
> #0  0x000c7414 in matcherlist_match ()
> #1  0x000931dc in folder_item_search_msgs_local ()
> #2  0x000932e0 in folder_item_search_msgs ()
> #3  0x00060b80 in  ()
> (gdb) 
> 
> 
> Thanks,
> 
> Philippe
> _______________________________________________
> Users mailing list
> Users at lists.claws-mail.org
> https://lists.claws-mail.org/cgi-bin/mailman/listinfo/users

After fixing the wrongly reported file sizes in the filenames (using the maildir-size-fix.pl [1] script) and rebuilding the whole
folder tree in CM, I could no longer reproduce the segfault.

For reference, the differences between the size on disk and the size reported in the filename only affected files I had moved shortly before,
so it looks like the corruption was caused by moving the files around in the first place.
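
For anyone who wants to check their own Maildir before running a fix, here is a minimal sketch that lists files whose ",S=<bytes>" value in the filename does not match the size on disk. It is not the maildir-size-fix.pl script and changes nothing; the function name find_size_mismatches is made up for this example, and it assumes the mail files are stored uncompressed (with a compression plugin the on-disk size is expected to differ).

```python
import os
import re
import sys

# Maildir filenames may carry the message size as ",S=<bytes>".
SIZE_RE = re.compile(r",S=(\d+)")


def find_size_mismatches(maildir):
    """Yield (path, size_in_name, size_on_disk) for files whose S= value is wrong.

    Hypothetical helper for illustration; assumes uncompressed mail files.
    """
    for sub in ("cur", "new"):
        directory = os.path.join(maildir, sub)
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            match = SIZE_RE.search(name)
            if match is None:
                continue  # no size recorded in this filename
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue
            claimed = int(match.group(1))
            actual = os.path.getsize(path)
            if claimed != actual:
                yield path, claimed, actual


if __name__ == "__main__" and len(sys.argv) == 2:
    # Read-only report: one line per mismatching file.
    for path, claimed, actual in find_size_mismatches(sys.argv[1]):
        print(f"{path}: name says {claimed}, disk has {actual} "
              f"(diff {claimed - actual})")
```

Saved under any name, it takes one folder as its argument, for example /home/user/Mail/.Spam.Archives, and only reads file sizes.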

Also, regarding the Dovecot cache corruption issue: the size on disk and the size in the filename always differed by 7 bytes, as in the entry below:

Sep 14 21:05:07 server dovecot[32238]: imap(user)<18573><J9yG1RgiONrAqAAr>: Error: Mailbox Spam.Archives: UID=69919: read(/home/user/Mail/.Spam.Archives/cur/1726336778.M652633P22502.server,S=8107,W=8277:2,S) failed: Cached message size larger than expected (8107 > 8100, box=Spam.Archives, UID=69919) (read reason=search)
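
To see whether every affected message shows the same offset, the two sizes can be pulled out of such log lines. The sketch below is only an illustration: the regular expression is written against the single entry quoted above, so other Dovecot versions or error variants may need adjusting, and parse_size_error is a made-up name.

```python
import re

# Pattern modelled on the one log entry quoted above (an assumption, not a
# documented Dovecot log format).
LOG_RE = re.compile(
    r"Cached message size larger than expected "
    r"\((?P<cached>\d+) > (?P<actual>\d+), "
    r"box=(?P<box>[^,]+), UID=(?P<uid>\d+)\)"
)


def parse_size_error(line):
    """Return box, UID, both sizes and their difference, or None if no match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    cached = int(match.group("cached"))
    actual = int(match.group("actual"))
    return {
        "box": match.group("box"),
        "uid": int(match.group("uid")),
        "cached": cached,
        "actual": actual,
        "diff": cached - actual,
    }
```

For the entry above this gives cached=8107, actual=8100 and diff=7, the same 7 bytes mentioned earlier.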

Searching around suggests the Dovecot setting "maildir_broken_filename_sizes=yes" to prevent this kind of problem.
For now, I'll stick with the default (maildir_broken_filename_sizes=no) and see if the problem happens again.

Thanks,

Philippe

[1] : https://github.com/dovecot/tools/blob/main/maildir-size-fix.pl
