[Users] Questions regarding character encoding of text/plain attachments
Michael Gmelin
freebsd at grem.de
Tue Oct 2 14:47:44 CEST 2012
On Tue, 2 Oct 2012 14:28:25 +0200
Ricardo Mones <ricardo at mones.org> wrote:
> On Mon, Oct 01, 2012 at 12:25:23AM +0200, Michael Gmelin wrote:
> > Hi,
> >
> > I noticed the following issue:
> >
> > When sending a text file attachment, Claws uses content-type
> > text/plain, even if it is encoded in UTF-8 and ends up base64-encoded.
> >
> > So the mime header of the attachment looks like this:
> >
> > Content-Type: text/plain
> > Content-Transfer-Encoding: base64
> > Content-Disposition: attachment; filename=china.txt
> >
> > In some cases it would be preferable to have a header like this:
> >
> > Content-Type: text/plain; charset=UTF-8
> > Content-Transfer-Encoding: base64
> > Content-Disposition: attachment; filename=china.txt
>
> Maybe it should be set to UTF-8 always for text/plain in the cases
> where the current code does not add a charset. ASCII only attachments
> would work anyway as that's a subset of UTF-8.
>
I just realized that this is more like a follow-up to (or duplicate of)
my earlier request, sorry for that.
> > Questions:
> > 1. Is there a reasonable way to auto-detect and set the encoding?
>
> http://code.google.com/p/uchardet/
>
> > 2. If not, is there a way to make this happen on user request (like,
> > selecting the encoding)?
>
> Not currently, but a patch is welcome :)
As I learned by checking the Properties dialog (which I never did
before on an attachment, thanks for that), you can actually just add
the encoding after the mime-type, like "text/plain; charset=UTF-8". Not
nice, but it actually works.
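
For comparison, here is a minimal sketch of how the desired attachment
header can be produced programmatically. It uses Python's standard email
package, not anything from Claws Mail; the helper name build_message and
the sample text are made up for illustration.

```python
from email.message import EmailMessage


def build_message(text: str, filename: str) -> EmailMessage:
    """Build a mail with one text/plain attachment that declares its charset.

    Hypothetical helper for illustration only; not Claws Mail code.
    """
    msg = EmailMessage()
    msg["Subject"] = "attachment charset demo"
    msg.set_content("See attached file.")
    # Passing a str selects the text content handler, which adds the
    # charset parameter to Content-Type; cte forces base64 encoding.
    msg.add_attachment(
        text,
        subtype="plain",
        charset="utf-8",
        cte="base64",
        filename=filename,
    )
    return msg


if __name__ == "__main__":
    attachment = next(build_message("你好, world\n", "china.txt").iter_attachments())
    # Print only the MIME headers of the attachment part.
    for name, value in attachment.items():
        print(f"{name}: {value}")
```

The attachment part then carries the charset in Content-Type alongside
the base64 Content-Transfer-Encoding and the attachment disposition.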
I would be willing to contribute a patch though, that allows:
- Selecting the encoding of an attachment
- Specifying a default based on various conditions
Not sure about a timeline though :)
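
One possible shape for "a default based on various conditions", as a
rough sketch only: label pure-ASCII data as US-ASCII, label data that
decodes cleanly as UTF-8 accordingly, and otherwise fall back to a
configured charset. The function name guess_charset and the ISO-8859-1
fallback are assumptions for illustration, not existing Claws Mail
behaviour, and this is a validity check rather than real detection.

```python
def guess_charset(data: bytes, fallback: str = "ISO-8859-1") -> str:
    """Pick a charset label for a text attachment (hypothetical heuristic)."""
    # Pure 7-bit data is valid in ASCII and in UTF-8 alike.
    if data.isascii():
        return "US-ASCII"
    try:
        # Non-ASCII data that decodes strictly as UTF-8 is very likely UTF-8.
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        # Anything else needs a user-selected or locale-based default.
        return fallback


if __name__ == "__main__":
    for sample in (b"plain text", "你好".encode("utf-8"), b"\xe9t\xe9"):
        print(sample, "->", guess_charset(sample))
```

The fallback branch is where a user-selectable encoding would plug in,
since legacy 8-bit charsets cannot be told apart reliably.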
>
> > 3. If not, what is the rationale for not doing this. I could imagine
> > something like "the receiving system should assume UTF-8 in the
> > absence of a character encoding specification" or "the receiving
> > system should handle encodings transparently". In this case it
> > would be good to get some reference supporting one or both of these
> > arguments (RFC anyone?)
>
> My response to your other mail has several RFC references, and UTF-8
> is not the default assumption for text/plain without charset. Anyway,
> since automatic charset detection is equally flawed on any side, I
> don't think such transparent handling is possible.
My argument would be: If it's ASCII, don't try to change it - this way
UTF-8 can pass through transparently. But that can be hard on legacy
systems. That's definitely not Claws' responsibility though.
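
To illustrate what a receiver sees, here is a small sketch using
Python's standard email package (not Claws Mail code). Without a
charset parameter the part declares nothing, and RFC 2046 makes
US-ASCII the default for text/plain. The helper name declared_charset
and the sample parts are made up for illustration.

```python
from email import message_from_string

# RFC 2046, section 4.1.2: text/plain without a charset means US-ASCII.
RFC2046_DEFAULT = "us-ascii"


def declared_charset(raw_part: str) -> str:
    """Return the charset a MIME part declares, or the RFC 2046 default."""
    part = message_from_string(raw_part)
    # get_content_charset() returns None when no charset parameter is present.
    return part.get_content_charset() or RFC2046_DEFAULT


WITHOUT_CHARSET = (
    "Content-Type: text/plain\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Disposition: attachment; filename=china.txt\n"
    "\n"
    "5L2g5aW9\n"
)
WITH_CHARSET = WITHOUT_CHARSET.replace(
    "Content-Type: text/plain\n",
    "Content-Type: text/plain; charset=UTF-8\n",
)

if __name__ == "__main__":
    print(declared_charset(WITHOUT_CHARSET))
    print(declared_charset(WITH_CHARSET))
```

A receiver that honours the declared default would treat the first part
as US-ASCII even though its payload is really UTF-8.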
> At most all the
> MUAs can do would be a) suggest some encoding for sending, and b)
> allow changing it, and c) suggest some encoding for reading, and d)
> allow changing it.
>
> Claws Mail currently lacks b), and a) probably can be improved.
> AFAIK c) and d) are fully covered.
AFAIK b) is there as well, but could be improved.
>
> regards,
Cheers,
--
Michael Gmelin