    Save as MHTML adding suspect content to the file.

    Opera for Windows
    • A Former User
      A Former User @sgunhouse last edited by A Former User

      Dear @sgunhouse,

      thanks for answering.

      My web host is adding nothing. It is my own web host, and when I open the web page with Chrome I get a file as it should be, without all that rubbish "<Saved by Blink>"

      Did you open a web page and save it to .mhtml before pointing to my host as the culprit?

      Let's try with a different web server not related to me... Wikipedia, for instance?
      This is what I get when saving the Wikipedia web page as an .mhtml file:

      From: <Saved by Blink>
      Snapshot-Content-Location: https://www.wikipedia.org/
      Subject: Wikipedia
      Date: Thu, 21 Feb 2018 18:06:55 -0000
      MIME-Version: 1.0
      Content-Type: multipart/related;
      type="text/html";
      boundary="----MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6----"

      ------MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6----
      Content-Type: text/html
      Content-ID: frame-160-95d37f33-a3d4-47fe-853c-dcd4a6b454df@mhtml.blink
      Content-Transfer-Encoding: quoted-printable
      Content-Location: https://www.wikipedia.org/

      <!DOCTYPE html><html lang=3D"es" class=3D"js-enabled" dir=3D"ltr"><head><me=
      ta http-equiv=3D"Content-Type" content=3D"text/html; charset=3DUTF-8">

      <title>Wikipedia</title>
      <meta name=3D"description" content=3D"Wikipedia is a free online encycloped=
      ia, created and edited by volunteers around the world and hosted by the Wik=
      imedia Foundation.">
      <!--[if gt IE 7]-->

      <!--[endif]-->
      <!--[if lt IE 7]><meta http-equiv=3D"imagetoolbar" content=3D"no"><![endif]=
      -->
      <meta name=3D"viewport" content=3D"initial-scale=3D1,user-scalable=3Dyes">
      <link rel=3D"apple-touch-icon" href=3D"https://www.wikipedia.org/static/app=
      le-touch/wikipedia.png">
      <link rel=3D"shortcut icon" href=3D"https://www.wikipedia.org/static/favico=
      n/wikipedia.ico">
      <link rel=3D"license" href=3D"https://creativecommons.org/licenses/by-sa/3.=
      0/">
      <style>
      REMOVED TO AVOID OPERA SERVER SIZE LIMIT
      </style>
      <link rel=3D"preconnect" href=3D"https://upload.wikimedia.org/">

      RUBBISH ADDED BY OPERA
      <style type=3D"text/css">:root a[href^=3D"http://ad-apac.doubleclick.net/"]=
      , :root .GKJYXHBF2 > .GKJYXHBE2 > .GKJYXHBH5, :root a[href^=3D"http://www.a=
      mazon.co.uk/exec/obidos/external-search?"], :root #\5f _mom_ad_2, :root #rh=
      s_block .mod > .luhb-div > div[data-async-type=3D"updateHotelBookingModule"=
      ], :root #\5f _admvnlb_modal_container, :root #MAIN.ShowTopic > .ad, :root =
      a[href^=3D"http://pubads.g.doubleclick.net/"], :root .GB3L-QEDGY .GB3L-QEDF=
      margin:0 4px;padding:1px 5px;background:#fff7ed"], :root #center_col > #mai=
      n > .dfrd > .mnr-c > .c._oc._zs, :root a[href^=3D"http://ul.to/ref/"], :roo=
      t #\5f _nq__hh[style=3D"display:block!important"], :root a[href^=3D"http://=
      cdn.adstract.com/"], :root a[href^=3D"//tracking.content-recommendation.net=
      /"][href*=3D"/sponsored/click.html?"], :root #\5f _mom_ad_12, :root a[href^=
      =3D"http://lp.ncdownloader.com/"], :root .inlineNewsletterSubscription + .i=
      nlineNewsletterSubscription div[class$=3D"item"], :root div[id^=3D"google=

      REMOVED TO AVOID OPERA SERVER SIZE LIMIT

      oleft, :root adsenseintermedioright, :root div[class^=3D"publicHoriz"], :ro=
      ot div[id^=3D"M"][id*=3D"Composite"] { display: none !important; }</style><=
      link rel=3D"preconnect" href=3D"https://es.wikipedia.org/"></head>
      <body id=3D"www-wikipedia-org" class=3D" jsl10n-visible">
      <h1 class=3D"central-textlogo" style=3D"font-size: 1em;" title=3D"Wikipedia=
      ">
      <img class=3D"central-featured-logo" src=3D"https://www.wikipedia.org/porta=
      l/wikipedia.org/assets/img/Wikipedia-logo-v2.png" width=3D"200" height=3D"1=
      83" alt=3D"Wikipedia">
      <div class=3D"central-textlogo-wrapper">
      <div class=3D"central-textlogo__image sprite svg-Wikipedia_wordmark">
      Wikipedia
      </div>
      <strong class=3D"jsl10n localized-slogan" data-jsl10n=3D"slogan">La enciclo=
      pedia libre</strong>
      </div>
      </h1>
      <!-- container div for the central logo and the links to the most viewed la=
      nguage editions -->
      <div class=3D"central-featured" data-el-section=3D"primary links">
      <
      REMOVED TO AVOID OPERA SERVER SIZE LIMIT
      padding: 0 0.5em;
      }
      </style>
      <![endif]-->

      </body></html>
      ------MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6----
      Content-Type: image/svg+xml
      Content-Transfer-Encoding: quoted-printable
      Content-Location: https://www.wikipedia.org/portal/wikipedia.org/assets/img/sprite-6e35f464.svg

      <?xml version=3D"1.0" encoding=3D"UTF-8"?><svg width=3D"176" height=3D"675"=
      REMOVED TO AVOID OPERA SERVER SIZE LIMIT
      1 2.75-.2q.35 0 .35.62a.94.94 0 0 1-.19.57.53.53 0 0 1-.43.25z"/></svg></sv=
      g>
      ------MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6----
      Content-Type: image/png
      Content-Transfer-Encoding: base64
      Content-Location: https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikinews-logo_sister.png

      iVBORw0KGgoAAAANSUhEUgAAAC8AAAAZCAMAAACW2ocuAAAABGdBTUEAALGPC/xhBQAAAAFzUkdC
      REMOVED TO AVOID OPERA SERVER SIZE LIMIT
      Vw0AAAAASUVORK5CYII=
      ------MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6----
      Content-Type: image/png
      Content-Transfer-Encoding: base64
      Content-Location: https://www.wikipedia.org/portal/wikipedia.org/assets/img/Wikipedia-logo-v2.png

      iVBORw0KGgoAAAANSUhEUgAAAMgAAAC3CAMAAABg8uG4AAACNFBMVEVMaXGHh4jc3N6bm5yNjo6V
      REMOVED TO AVOID OPERA SERVER SIZE LIMIT
      vvjqJ598AOzVV1/8v/f8/xb7/wCY4U51gcleEgAAAABJRU5ErkJggg==
      ------MultipartBoundary--7YBlNuygxaM3zPYmATZBRJCKLk1xk8MuTF7UMNzUn6------

      I don't believe Wikipedia is suspected of delivering that kind of content. Do you?
      Are you getting similar results?
      Please, let me know...
      Thanks.

      • leocg
        leocg Moderator Volunteer @Guest last edited by leocg

        @disappointeduser Next time please store those long text dumps somewhere else and just post the link to it here.
        A post with probably more than 100 lines may be kinda annoying, especially for those on mobile.

        • leocg
          leocg Moderator Volunteer @Guest last edited by

          @disappointeduser Do you get the same results with the ad-blocker turned off?

          • A Former User
            A Former User @leocg last edited by

            @leocg said in Save as MHTML adding suspect content to the file.:

            @disappointeduser Next time please store those long text dumps somewhere else and just post the link to it here.
            A post with probably more than 100 lines may be kinda annoying, especially for those on mobile.

            I am sorry for the inconvenience, but I found no way to attach a file.

            I'll do that next time.

            • A Former User
              A Former User @leocg last edited by A Former User

              @leocg said in Save as MHTML adding suspect content to the file.:

              @disappointeduser Do you get the same results with the ad-blocker turned off?

              Dear @leocg,

              I have just checked it, and your suspicions were right.
              With the ad-blocker turned off, all those links created by Opera (I insist on that) and inserted in the <head> as a <style> of the web page saved as .mhtml are not stored in the .mhtml file.
              GOOD POINT... thanks.

              I suppose that a discussion will now arise: can this be considered a bug?
              From my point of view, YES, as the saved file is not an image of the web page.

              • leocg
                leocg Moderator Volunteer @Guest last edited by

                @disappointeduser You don't attach it, you upload it to a host and then post the link to it here.
                For long text, you may try https://pastebin.com/

                • leocg
                  leocg Moderator Volunteer @Guest last edited by

                  @disappointeduser It was discussed here already, I just remembered that and asked about the ad-blocker.

                  • gmiazga
                    gmiazga Opera last edited by

                    @disappointeduser we think that it's something we shouldn't remove when generating .mhtml, so as not to confuse users with different content than they expected in case some ads were blocked. But to avoid any confusion or concerns, we will try to add some comments to this code to indicate that it comes from Opera Adblocker.

                    • donq
                      donq @gmiazga last edited by donq

                      @mgeffro said in Save as MHTML adding suspect content to the file.:

                      @disappointeduser we think that it's something we shouldn't remove when generating .mhtml, so as not to confuse users with different content than they expected in case some ads were blocked. But to avoid any confusion or concerns, we will try to add some comments to this code to indicate that it comes from Opera Adblocker.

                      Please add an option to not save the adblock rules. This could be an advanced option, no problem with that, but I greatly prefer saving the original page, not the adjusted one. Maybe I even need to look at the saved page in another browser with all ads flashing? 🙂
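
Until such an option exists, the injected block could in principle be stripped after the fact. The following is only a sketch: the pattern is a heuristic guessed from the dump earlier in this thread (a `<style type="text/css">` element whose selectors start with `:root` and whose only declaration is `display: none !important;`), not anything Opera documents, and `strip_adblock_style` is a hypothetical helper name. It expects HTML that has already been decoded from quoted-printable.

```python
import re

# Heuristic, assumed from the dump in this thread (not documented by Opera):
# one <style type="text/css"> element, selectors beginning with ":root",
# and a single rule body "display: none !important;".
INJECTED_STYLE = re.compile(
    r'<style type="text/css">\s*:root\b[^{}]*'
    r'\{\s*display:\s*none\s*!important;\s*\}\s*</style>',
    re.IGNORECASE,
)


def strip_adblock_style(html: str) -> str:
    """Return the decoded HTML without style elements matching the heuristic."""
    return INJECTED_STYLE.sub("", html)


page = (
    '<head><style>body{margin:0}</style>'
    '<style type="text/css">:root a[href^="http://ad.example/"], '
    ':root #ad_box { display: none !important; }</style>'
    '<link rel="preconnect" href="https://es.wikipedia.org/"></head>'
)
cleaned = strip_adblock_style(page)
print(cleaned)
```

The page's own `<style>` element is left alone because it does not match the `:root ... display: none` shape. A real page could legitimately contain a style element of that shape, so this should be treated as a rough filter rather than a safe one.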

                      • A Former User
                        A Former User @leocg last edited by A Former User

                        @leocg said in Save as MHTML adding suspect content to the file.:

                        @disappointeduser It was discussed here already, I just remembered that and asked about the ad-blocker.

                        Dear @leocg,

                        did you get any answer to your questions from the development team?

                        I have been analyzing the stored .mhtml file and it looks like, as you said, all those links included in the file are related to Opera's integrated ad-blocker.
                        Maybe it is the list of URLs that the ad-blocker uses to recognize and ban a link in a web page.
                        If this is true, I am not sure this list of URLs should be disclosed to the public, as advertisers could use it to circumvent the ad-blocker.

                        Besides this, the links are included between the HEAD tags as a STYLE element, just after the TITLE tags. But while the HTML standard says the style tags should have this format:

                        <style type="text/css"></style>

                        the Opera browser is saving the added STYLE tags in this way:

                        <style type=3D"text/css">
                        </style><=

                        and no, those trailing "<=" characters are not a typing error on my part.

                        As I said, in between those style tags is where the list of URLs is included in this kind of format:
                        :root a[href^=3D"http://ad-apac.doubleclick.net/"]=, :root .GKJYXHBF2 > .GKJYXHBE2 > .GKJYXHBH5,
                        :root a[href^=3D"http://www.amazon.co.uk/exec/obidos/external-search?"],
                        :root #\5f _mom_ad_2, :root #rhs_block .mod > .luhb-div > div[data-async-type=3D"updateHotelBookingModule"=
                        ],

                        etc...etc.....

                        From my point of view, all of this is something that shouldn't happen. It makes no sense to store that ad-blocker information in the .mhtml file.

                        Should I open a bug report, or will a moderator create one from this post?

                        Regards.....

                        • A Former User
                          A Former User @gmiazga last edited by A Former User

                          Dear @mgeffro,

                          the question is: what are all those links?
                          I could understand Opera saving the specific links that it has blocked, if any, but saving the whole list even when the visited and saved page has no ads makes little sense.

                          Firefox doesn't save any extra information when the ad-blocker plugin is activated.

                          IMHO, when Opera saves a page with the ad-blocker activated, it would be enough to simply remove the suspected advertisement links from the page according to the ad-blocker rules; there is no need to save the whole list.

                          What effect will those .mhtml files have on antivirus software? Could they trigger an alert because of some suspicious link?

                          What do you think?

                          Regards..

                          • sgunhouse
                            sgunhouse Moderator Volunteer last edited by

                            MHTML is similar to email. Since you are viewing the actual saved file, it is encoded like email would be (an encoding known as "quoted-printable"). Characters that are not safe in email are converted to an equals sign followed by a hex code, and since the = is reserved for this use, any original = must be converted to =3D (3D being the hex code for =).
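
For anyone who wants to see that encoding in action, here is a minimal sketch using Python's standard `quopri` module, which implements quoted-printable. The sample markup is shortened from the dump earlier in this thread; nothing here is Opera's own code.

```python
import quopri

# A fragment of ordinary HTML as the page author wrote it.
original = b'<html lang="es" class="js-enabled" dir="ltr">'

# Printable ASCII passes through unchanged, but "=" itself has to be escaped
# because it introduces a hex escape. 0x3D is the ASCII code for "=".
encoded = quopri.encodestring(original)
print(encoded)

# Decoding reverses the escapes and gives back the original bytes.
decoded = quopri.decodestring(encoded)
print(decoded == original)
```

The encoded form contains `lang=3D"es"` and so on, which is exactly what shows up when the .mhtml file is opened in a text editor instead of a browser.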

                            The developers replied to us that removing the code would change the appearance of the page (if the page was supposed to contain any of the content blocked by Opera), and their idea was that it should look the same as it was displayed. If you don't want Opera to add that code you can (temporarily) disable the ad-blocker. They haven't clearly said they wouldn't change it - I mean, my argument to them would be that Opera will do all that when I view the saved file anyway - but of course if you (or someone you send it to) view the file in another browser you'd have all the ads back.

                            • burnout426
                              burnout426 Volunteer @Guest last edited by

                              @disappointeduser said in Save as MHTML adding suspect content to the file.:

                              the Opera browser is saving the added STYLE tags in this way..:

                              The markup is encoded as Quoted-Printable. That's all you're seeing there. mhtml files are basically just eml files where the main page is an attachment in the source.
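
Because of that, a generic email parser can usually read the file. The sketch below uses Python's standard `email` package on a tiny hand-written sample; the sample and its boundary string are made up for illustration and are far shorter than a real save.

```python
from email import message_from_string

# Hypothetical, heavily shortened MHTML-style sample (not a real saved page).
SAMPLE = """\
From: <Saved by Blink>
Snapshot-Content-Location: https://www.wikipedia.org/
Subject: Wikipedia
MIME-Version: 1.0
Content-Type: multipart/related;
\ttype="text/html";
\tboundary="----Boundary1234----"

------Boundary1234----
Content-Type: text/html
Content-Transfer-Encoding: quoted-printable
Content-Location: https://www.wikipedia.org/

<html lang=3D"es"><head><title>Wikipedia</title></head></html>
------Boundary1234------
"""

msg = message_from_string(SAMPLE)

parts = []
for part in msg.walk():
    if part.is_multipart():
        continue  # skip the multipart/related container itself
    # decode=True undoes the Content-Transfer-Encoding (quoted-printable here).
    body = part.get_payload(decode=True)
    parts.append((part.get_content_type(), part["Content-Location"], body))

for content_type, location, body in parts:
    print(content_type, location, body)
```

Each resource of the page (HTML, images, SVG) appears as one such part, identified by its `Content-Location` header.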

                              • A Former User
                                A Former User @sgunhouse last edited by A Former User

                                Dear @sgunhouse,

                                I understand the URL encoding and that =3D stuff; what caught my attention is that "<=" after the "</style>" tag.

                                I understand what they said about the appearance of the page, but what I expect is to save what I see in the browser.
                                If the ad-blocker is activated, what it shows me is in some way not the original page, as the ads have been removed. So why is it so important to respect the original spaces coming from the ads when the page is saved to a file, while the browser removes those same spaces when the page is originally displayed?
                                Do you get my point of view? If the argument is valid for saving, it should also be valid for browsing.

                                • leocg
                                  leocg Moderator Volunteer @Guest last edited by

                                  @disappointeduser What questions?

                                  As said, it's already been addressed so there's no need to report it.

                                  • A Former User
                                    A Former User @leocg last edited by A Former User

                                    @leocg .....

                                    Ahhhh, got it.....

                                    I didn't understand you correctly the first time...

                                    But it depends on whether this is considered a bug or a feature, as @sgunhouse suggested.

                                    Thanks.

                                    • leocg
                                      leocg Moderator Volunteer @Guest last edited by

                                      @disappointeduser But what you see on the browser is generated by that code from the ad-blocker.

                                      • burnout426
                                        burnout426 Volunteer @Guest last edited by

                                        @disappointeduser said in Save as MHTML adding suspect content to the file.:

                                        my attention is that "<=" after the "</style>" tag.

                                        The '<' is the starting bracket for the LINK element on the next line. The '=' and the newline after it is just breaking up things into another line to meet line-length limits.
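
A quick way to check this is to decode the two lines in question. This is a minimal sketch using Python's standard `quopri` module, with the lines copied from the dump earlier in the thread and a CRLF line ending assumed.

```python
import quopri

# First line ends with "<" followed by a soft line break ("=" plus newline);
# the second line continues the same tag.
saved = (
    b'display: none !important; }</style><=\r\n'
    b'link rel=3D"preconnect" href=3D"https://es.wikipedia.org/"></head>'
)

decoded = quopri.decodestring(saved)
print(decoded.decode("ascii"))
```

After decoding, the "=" and the line break are gone and the "<" sits directly in front of `link`, giving `</style><link rel="preconnect" ...>`.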

                                        • A Former User
                                          A Former User @leocg last edited by

                                          @leocg Exactly! And that appearance is altered with respect to the original page with ads, so why not save the altered page as it is being displayed?

                                          When a section is removed from a page by the ad-blocker, its space is not reserved in the displayed page, so the page appearance is altered. This new page layout is what should be saved, exactly what is displayed, with the ad links removed outright and the ad space eliminated or reduced as displayed.

                                          Maybe there is something I am missing, because this sounds so obvious to me... sorry.

                                          • A Former User
                                            A Former User @burnout426 last edited by

                                            @burnout426 said in Save as MHTML adding suspect content to the file.:

                                            @disappointeduser said in Save as MHTML adding suspect content to the file.:

                                            my attention is that "<=" after the "</style>" tag.

                                            The '<' is the starting bracket for the LINK element on the next line. The '=' and the newline after it is just breaking up things into another line to meet line-length limits.

                                            It is not... it looks like it belongs to the </head> tag.
                                            So the same question still stands: is it compliant with the standard to break the head tag by inserting a "=" sign plus CRLF just before "/head>"? The "=" sign is supposed to replace a non-printable character and be followed by its hex code, so if anything it should be something like:

                                            <=0D=0A
                                            /head>

                                            Am I wrong?
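
The two forms can be compared directly. In quoted-printable (RFC 2045), a line ending in a lone "=" is a soft line break that the decoder removes, while =0D=0A stands for a literal CR LF that stays in the decoded text. A small sketch with Python's standard `quopri` module, using a shortened stand-in for the lines in question:

```python
import quopri

soft_break = b"<=\r\n/head>"    # "=" at end of line: soft line break
hex_escaped = b"<=0D=0A/head>"  # the hex-escaped form suggested above

# The soft break disappears, so the tag is rejoined.
joined = quopri.decodestring(soft_break)
print(joined)

# The hex escapes decode to a real CR LF inside the tag.
split = quopri.decodestring(hex_escaped)
print(split)
```

So the soft break is the form that leaves the markup unchanged after decoding; the hex-escaped form would insert a real line break into the tag.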
