Opera 49 creates invalid mhtml file

  • Opera 49 creates invalid MHTML files. The header may look like this:

    From: <Saved by Blink>
    X-Snapshot-Version: 1.0
    X-Snapshot-Title: =?utf-8?Q?=D0=AF=D0=BD=D0=B4=D0=B5=D0=BA=D1=81.=D0=9D=D0=BE=D0=B2=D0=BE=D1=81=D1=82=
    =D0=B8: =D0=93=D0=BB=D0=B0=D0=B2=D0=BD=D1=8B=D0=B5 =D0=BD=D0=BE=D0=B2=D0=BE=
    =D1=81=D1=82=D0=B8 =D1=81=D0=B5=D0=B3=D0=BE=D0=B4=D0=BD=D1=8F, =D1=81=D0=B0=
    =D0=BC=D1=8B=D0=B5 =D1=81=D0=B2=D0=B5=D0=B6=D0=B8=D0=B5 =D0=B8 =D0=BF=D0=BE=
    =D1=81=D0=BB=D0=B5=D0=B4=D0=BD=D0=B8=D0=B5 =D0=BD=D0=BE=D0=B2=D0=BE=D1=81=
    =D1=82=D0=B8 =D0=A0=D0=BE=D1=81=D1=81=D0=B8=D0=B8 =D0=BE=D0=BD=D0=BB=D0=B0=
    =D0=B9=D0=BD?=
    X-Snapshot-Content-Location: https://news.yandex.ru/

    As you can see, the content of X-Snapshot-Title is encoded per RFC 2047, but the continuation lines of the encoded string do not start with a leading space.

    RFC 2047 says: An 'encoded-word' may not be more than 75 characters long, including 'charset', 'encoding', 'encoded-text', and delimiters. If it is desirable to encode more text than will fit in an 'encoded-word' of 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may be used.

    As a result, existing MIME parsers cannot parse such files correctly. Is it possible to fix the MHTML file creation algorithm?
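
For anyone who needs to read such files in the meantime, here is a minimal Python sketch of a workaround. It assumes the title value is a single Q-encoded word wrapped with quoted-printable "=" soft breaks, as in the header above. The function name decode_blink_title is made up for this example and is not part of any library.

```python
import re


def decode_blink_title(value_lines):
    """Decode an X-Snapshot-Title value wrapped with "=" soft line breaks.

    value_lines: the physical lines of the header value, without the
    "X-Snapshot-Title: " prefix. Assumption: the whole value is one
    Q-encoded word, as in the sample header from this thread.
    """
    parts = []
    for line in value_lines:
        line = line.rstrip("\r\n")
        # A trailing "=" that is not part of the closing "?=" is treated
        # as a quoted-printable soft line break and dropped when joining.
        if line.endswith("=") and not line.endswith("?="):
            line = line[:-1]
        parts.append(line)
    joined = "".join(parts)

    match = re.fullmatch(r"=\?([^?]+)\?[Qq]\?(.*)\?=", joined)
    if match is None:
        raise ValueError("value is not a single Q-encoded word")
    charset, payload = match.groups()

    # Q encoding: "_" stands for a space, "=XX" for the byte 0xXX.
    raw = re.sub(
        rb"=([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        payload.replace("_", " ").encode("ascii"),
    )
    return raw.decode(charset)


if __name__ == "__main__":
    sample = [
        "=?utf-8?Q?=D0=AF=D0=BD=D0=B4=",
        "=D0=B5=D0=BA=D1=81: a_b?=",
    ]
    print(decode_blink_title(sample))
```

Note that the caller still has to work out which physical lines belong to the title. A generic header parser cannot do that reliably, because a continuation line such as "=D0=B8: ..." looks like the start of a new header field.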

  • MHTML Generation and Loading. As implemented in Chrome.
    Authors: jianli@chromium.org
    Last Updated: Nov 21, 2017

    Quoted:
    "Note: the long title line can be split into multiple lines using soft line break “CRLF+SPACE/TAB” per RFC 2047. The soft line break “=CRLF” used to break long line in message body as defined in RFC 2045 should NOT be used."
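
To illustrate the folding that the note describes, here is a rough Python sketch that splits a title into several encoded-words of at most 75 characters and joins them with CRLF plus a space. This is not Chrome's code. The function name encode_header_value is hypothetical, and base64 ("B") encoding with a stateless charset such as UTF-8 is an assumption made for simplicity.

```python
import base64


def encode_header_value(text, charset="utf-8", max_word_len=75):
    """Encode text as RFC 2047 encoded-words folded with CRLF + SPACE."""
    prefix, suffix = f"=?{charset}?B?", "?="
    # Characters left for the base64 payload inside one encoded-word.
    budget = max_word_len - len(prefix) - len(suffix)
    # base64 emits 4 output characters for every 3 input bytes.
    max_bytes = budget // 4 * 3

    words = []
    chunk = b""
    for ch in text:
        encoded = ch.encode(charset)
        # Start a new encoded-word before the current one gets too long.
        # Whole characters are kept together, so each encoded-word can be
        # decoded on its own.
        if chunk and len(chunk) + len(encoded) > max_bytes:
            words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
            chunk = b""
        chunk += encoded
    if chunk:
        words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)

    # Continuation lines begin with a space, never with a bare "=".
    return "\r\n ".join(words)


if __name__ == "__main__":
    title = "Яндекс.Новости: Главные новости сегодня"
    print("X-Snapshot-Title: " + encode_header_value(title))
```

The first encoded-word shares its line with the header name, so a real writer would probably subtract the length of "X-Snapshot-Title: " from the budget for that line. The sketch skips this to stay short.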

  • Confirmed. It's missing the spaces. Happens in Chrome too, so this needs to be filed at https://bugs.chromium.org/p/chromium/issues/list.

    And, just for comparison, here's what a few mail clients do with the title on news.yandex.ru:

    Opera Mail:
    
    Subject: =?utf-8?B?U3ViamVjdDog0K/QvdC00LXQutGBLtCd0L7QstC+0YHRgtC4OiDQk9C70LA=?=
     =?utf-8?B?0LLQvdGL0LUg0L3QvtCy0L7RgdGC0Lgg0YHQtdCz0L7QtNC90Y8sINGB0LA=?=
     =?utf-8?B?0LzRi9C1INGB0LLQtdC20LjQtSDQuCDQv9C+0YHQu9C10LTQvdC40LUg0L0=?=
     =?utf-8?B?0L7QstC+0YHRgtC4INCg0L7RgdGB0LjQuCDQvtC90LvQsNC50L0=?=
    
    Thunderbird:
    
    Subject: =?UTF-8?B?0K/QvdC00LXQutGBLtCd0L7QstC+0YHRgtC4OiDQk9C70LDQstC90Ys=?=
     =?UTF-8?B?0LUg0L3QvtCy0L7RgdGC0Lgg0YHQtdCz0L7QtNC90Y8sINGB0LDQvNGL0LUg0YE=?=
     =?UTF-8?B?0LLQtdC20LjQtSDQuCDQv9C+0YHQu9C10LTQvdC40LUg0L3QvtCy0L7RgdGC0Lgg?=
     =?UTF-8?B?0KDQvtGB0YHQuNC4INC+0L3Qu9Cw0LnQvQ==?=
    
    mail.google.com:
    
    Subject: =?UTF-8?B?0K/QvdC00LXQutGBLtCd0L7QstC+0YHRgtC4OiDQk9C70LDQstC90YvQtSDQvdC+0LLQvg==?=
        =?UTF-8?B?0YHRgtC4INGB0LXQs9C+0LTQvdGPLCDRgdCw0LzRi9C1INGB0LLQtdC20LjQtSDQuCDQv9C+0YHQu9C1?=
        =?UTF-8?B?0LTQvdC40LUg0L3QvtCy0L7RgdGC0Lgg0KDQvtGB0YHQuNC4INC+0L3Qu9Cw0LnQvQ==?=
    

    They use base64 rather than quoted-printable, but that doesn't matter. The point is that they all start continuation lines with a space (or, in the case of mail.google.com, a tab).
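
For comparison, correctly folded headers like the Subject lines above can be decoded with Python's standard email.header module. A small sketch follows. The wrapper name decode_folded_header is made up here, while decode_header and make_header are the actual standard-library functions.

```python
from email.header import decode_header, make_header


def decode_folded_header(value):
    """Decode a header value made of RFC 2047 encoded-words.

    value: the raw header value, possibly folded over several lines whose
    continuation lines start with a space or a tab.
    """
    # decode_header() ignores the whitespace between adjacent
    # encoded-words, and make_header() joins the decoded pieces back
    # into a single string.
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    folded = "=?UTF-8?B?0K/QvdC00LXQutGB?=\r\n =?UTF-8?B?Lg==?="
    print(decode_folded_header(folded))
```

If the continuation lines start at column 0, as in the file Opera saves, a parser has no way to tell that they belong to the previous header, which is the problem reported here.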
