If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

API - implement persistence, printing functionality for embedding

VERIFIED FIXED in mozilla0.8

Status

()

Core
Layout
P1
normal
VERIFIED FIXED
17 years ago
17 years ago

People

(Reporter: dougt, Assigned: Adam Lock)

Tracking

({embed})

Trunk
mozilla0.8
x86
Windows NT
embed
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nsbeta3-])

Attachments

(9 attachments)

(Reporter)

Description

17 years ago
DocumentViewerImpl::Save() is not implemented.  However print is.  We should be 
able to use the same nsIContentViewerFile interface for both printing and 
saving.
(Reporter)

Updated

17 years ago
Keywords: embed, nsbeta3

Comment 1

17 years ago
I don't know why this is assigned to Clayton. Reassigning back to Doug.
Assignee: clayton → dougt

Comment 2

17 years ago
adam, don't you have saving working?
Whiteboard: [nsbeta3+]

Updated

17 years ago
Target Milestone: --- → M18
(Reporter)

Comment 3

17 years ago
.
Assignee: dougt → adamlock
(Assignee)

Comment 4

17 years ago
Perhaps this method should be deleted? It's not the content viewer's job to save 
the document.

The DOM document has persistence methods which can be used like the code snippet 
below:

// Get an nsIDiskDocument interface to the DOM document
nsCOMPtr<nsIDiskDocument> diskDoc = do_QueryInterface(domDocument);
if (!diskDoc)
{
  return E_NOINTERFACE;
}

// Create an nsFilelSpec from the selected file path.
nsFileSpec fileSpec("/my/path/is/here.html", PR_FALSE);

// Save the file.
nsAutoString mimeType; 
mimeType.AssignWithConversion("text/html");
nsAutoString useDocCharset;
hr = diskDoc->SaveFile(&fileSpec, PR_TRUE, PR_TRUE, mimeType, useDocCharset, 0);
(Reporter)

Comment 5

17 years ago
so, this should be done when the contentviewerfile interface is wacked?

Comment 6

17 years ago
if by whacked you mean "migrated from," yes. API notes are suggesting a new
interface (nsIPersistance or something).
(Assignee)

Comment 7

17 years ago
Changing summary.
Summary: DocumentViewerImpl::Save() not implemented → API - implement persistence, printing functionality for embedding

Updated

17 years ago
Priority: P3 → P1

Comment 8

17 years ago
per email with Jud, changing nsbeta3+ to nsbeta3- on all "embed" keyword bugs
since embedding changes will not be made in the mn6 branch. If you feel this bug
fix needs to go into mn6 branch, please list the reasons/user impact/risk and
nominate for rtm. Thanks.
Whiteboard: [nsbeta3+] → [nsbeta3-]
(Assignee)

Updated

17 years ago
Depends on: 57996
(Assignee)

Comment 9

17 years ago
Created attachment 18324 [details]
First working cut - first attachment of many
(Assignee)

Comment 10

17 years ago
Created attachment 18325 [details]
First working cut
(Assignee)

Comment 11

17 years ago
Created attachment 18326 [details]
First working cut
(Assignee)

Comment 12

17 years ago
Created attachment 18327 [details] [diff] [review]
First working cut
(Assignee)

Comment 13

17 years ago
Created attachment 18328 [details] [diff] [review]
First working cut
(Assignee)

Comment 14

17 years ago
Created attachment 18329 [details] [diff] [review]
First working cut - diffs to existing files
(Assignee)

Comment 15

17 years ago
I've attached the first working version of the new persistence stuff.

Attachments described:

nsIWebBrowserPersist.idl
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18324

nsIWebBrowserPersist is the new interface for persisting documents. It has 3 
methods, saveURI, saveCurrentURI & saveDocument. The first two methods save any 
data referenced by a URI to local store. The third method saves an existing DOM 
and all linked files to local store.

nsDOMWalker.h & cpp
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18325
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18326

This is a helper class for walking through a DOM, calling a callback method for 
each node in the DOM.

nsWebBrowserPersist.h & cpp
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18327
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18328

This is the implementation of nsIWebBrowserPersist and does the job of saving 
files & observing their progress. It also has all the code for fixing up a DOM 
to use locally referenced files. Currently it fixes up images, css, js but could 
be extended to frames, iframes. It's reasonably generic and couldn't care if it 
were in the embedding module or somewhere else.  The internal implementation of 
SaveDocument calls SaveURI so all the methods get exercised.

Diffs to other files
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=18329

Changes to other files to support the new stuff. I've used the ActiveX control 
has the test harness so there's some sample code in there for how SaveDocument 
is called. nsWebBrowser also implements nsIWebBrowserPersist but creates an 
nsWebBrowserPersist object to do the work.

Comments please!
Status: NEW → ASSIGNED
(Assignee)

Comment 16

17 years ago
Note that I've had to hack around the non-implementation of cloneNode in HTML 
documents for the moment (see bug 57996). I may try and fix that myself. There's 
also plenty of TODOs and error checking that could be added.
(Assignee)

Comment 17

17 years ago
Created attachment 18878 [details] [diff] [review]
Second cut of persistence. Has progress listener callbacks, works on Win32, Linux
(Assignee)

Comment 18

17 years ago
Created attachment 19588 [details] [diff] [review]
Persistence code ready for checkin pending review
I can't actually apply this patch because patch says that it is mangled so I
just looked at it by hand.

Should this be released?

+     nsresult rv = persist->SaveDocument(doc, aFileName, aDataPath);
+ //    persist->Release(); // TODO memory leak

Other than that, looks fine. sr=blizzard
(Assignee)

Comment 20

17 years ago
Thanks Chris, the fixes are checked in. I'm leaving the bug open until I can 
check the status of that memory leak TODO. I've opened a new bug 61672 to cover 
the embedding printing api changes.

Comment 21

17 years ago
Adam,

Looks like you are not saving images that are inside an input tag. If the tag is
input and type is "image", the src attribute contains the url to the image. See
www.msn.com, the circle (go) button is an image.

Is there a way to save a page+css+images+etc w/o having to load the page? I
realize you are using the nsIDOMDocument to do the fixups? Can a page be loaded
and parsed into an nsIDOMDocument without viewing and then save and destroy the
DOMDocument?

Finally, images that are embedded in a document sometimes get redirected by the
server, however the original URL is kept in the DOM. This of course breaks any
attempt to save the image through the use of the original URL. Can an interface
be added to get the final redirected URL for that image? A typical example is:
aolsearch.aol.com.. the ads work in this way.

Comment 22

17 years ago
one more:

all href links have to be fixed-up in order to work from the saved page as
intended. so rather than relative, they should be fully-qualified from the base
url of the page at the time of saving....
(Assignee)

Comment 23

17 years ago
Gus, in response to your issues:

1. Yes, the INPUT tag is ignored at the moment. I'm working on a simple patch.
2. You must have a DOM document to save the linked files. I don't know how clean 
the code is for parsing HTML into a DOM document, but it should be possible to 
lash the parser, streams & DOM together without rendering the document. There 
could well be a test harness that already demonstrates how to do this.
3. The redirect problem is tricky. To solve this would require each DOM element 
to remember the redirected URL and for there to be a way to ask for that from 
the persister. I don't think DOM elements do know so the persister doesn't 
either. Since it only affects advertising banners, perhaps it's not so bad?
4. Yes, there's an issue with the BASE element in the page. Basically, if it 
exists it needs to be fixed up, if it doesn't it needs to be inserted. There's a 
TODO against that in the code and I should do a patch for that too.
(Assignee)

Comment 24

17 years ago
Created attachment 20883 [details] [diff] [review]
Patch fixes up INPUT tag, fixes memory leak in webbrowser, adds/fixes up BASE element
(Assignee)

Comment 25

17 years ago
Info on the latest patch:

1. The persistence object is now released by the webBrowser.
2. INPUT tags are fixed up
3. The BASE tag is inserted or fixed up so relative links work correctly.
4. A duff argument test has been removed so passing a nsnull datapath into 
SaveDocument works like it's meant to.

Gus, let me know if this patch fixes some of your problems.

Updated

17 years ago
Target Milestone: M18 → mozilla0.8
Keywords: donttest

Comment 26

17 years ago
Patch works great

In looking at my list...
1. INPUT tags works nicely
2. I'm satisfied with they way it works now. If a user wants to save an HTML 
document for later use, he/she will need to at least view it... IE works like 
this
3. Like 2, I'm satisfied. no big deal...
4. This works great!

New problem:

The form tag is written in the file like this: <form ....></form> and therefore 
all input elements are left floating outside of the form and do nothing. Must 
be a bug in there since the actual HTML was not like this... Isn't this a known 
bug? If, so know it is painfully obvious...

IE issues the following tags in the file it saves:
at the top of the file
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0022)http://www.yahoo.com/ -->
and in the header:
<META http-equiv=Content-Type content="text/html; charset=windows-1252">

Are they needed, for completeness?

Great job!
(Assignee)

Comment 27

17 years ago
sr=blizzard. Fixes are checked in.

Gus, can you open a bug on the form problem? I just tried a few different 
forms and they seem to work fine. It could be that the original HTML is 
malformed in some way (e.g. bad nesting) so that the parser is building the DOM 
incorrectly and consequently saving it out wrong.

Regarding DOCTYPEs, I don't think we know whether the original document is HTML 
compliant or not unless they say so. If they say <!DOCTYPE blah> at the top 
then the persister should certainly save that information out. Is that not 
happening? If they *don't* say <!DOCTYPE blah> then we should assume they're not 
compliant to any HTML spec and not attempt to append our own DOCTYPE saying that 
they are.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Updated

17 years ago
Blocks: 64833

Comment 28

17 years ago
Marking verified per last comments.

Status: RESOLVED → VERIFIED

Updated

16 years ago
No longer blocks: 64833
You need to log in before you can comment on or make changes to this bug.