177886 - (rdf-bookmarks) Store bookmarks in a RDF format

Reporter

Description

•

23 years ago

As bug xbel was misused for discussion of changing our default bookmarks format, here's a separate bug for that. Copying some comments from that bug:

------- Additional Comment #8 From Robert Kaiser 2000-10-10 15:12 -------
And why not simply save it to an RDF file? It is already an RDF datasource internally... Mozilla could provide an export feature to this in "Manage Bookmarks..." which writes Netscape style or XBEL or even a standards compliant (X)HTML file...

------- Additional Comment #9 From Ben Bucksch 2000-10-10 16:01 -------
KaiRo has a point. We don't need to use an exchange format internally, just for ex-/imports. Changing SUMMARY from "Mozilla should use XBEL" to "Mozilla should support XBEL".

------- Additional Comment #12 From Ben Bucksch 2000-10-11 13:08 -------
> XML is the logical, standards compliant language to use for storing bookmarks.
Wrong. It is not the logical format. The RDF format we use internally, serialized as XML, is the logical format. If we use RDF internally during runtime, why not store it internally as RDF? For "standards-compliant" (XBEL is no standard) data exchange, you can offer options to export and import to and from XBEL and Netscape's HTML bookmark format. In the UI, this means *no* difference to your proposal.

------- Additional Comment #16 From Ben Bucksch 2000-10-11 15:45 -------
I think we're all on the same track now. I don't think this bug is hard to fix.
1. Remove the RDF->Netscape-Bookmark-HTML converter and just serialize the RDF into XML and save that, and the other way around (load) (you might find examples for that in the localstore.rdf writer/loader).
2. Write XSLT "stylesheets" to transform Mozilla-Bookmark-RDF-XML<->XBEL and Mozilla-Bookmark-RDF-XML<->Netscape-Bookmark-HTML (and possibly Mozilla-Bookmark-RDF-XML<->some-sane-HTML :) ).
3. Hook up the XSLT-based converters in the UI. We will probably need XSLT enabled in Mozilla for that.

------- Additional Comment #22 From Fabian Guisset 2001-02-01 05:11 -------
Since we already use RDF for the bookmarks management (i.e. in the chrome, we use RDF trees, etc), I see no reason why the bookmarks.html file shouldn't become a bookmarks.rdf file. We would not need to change the chrome that way. It would also speed up the bookmarks a little since we wouldn't have to convert from html to rdf. Then we can support XBEL as import/export language, as Timeless suggested. This is just my opinion.

------- Additional Comment #25 From Ben Goodger 2001-02-14 17:51 -------
My plan is to convert the bookmarks service to read and produce serialized RDF, and limit use of the HTML->RDF parser to import and export activities. This should allow for quicker development of new features, and provide a richer API for third parties manufacturing bookmark management utilities. I propose something like a XUL template in an XHTML document to produce a live view of the user's bookmarks that can be loaded into a browser content area.
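To make the "just serialize the RDF into XML" idea concrete, here is a minimal sketch in Python rather than Mozilla code. It assumes a flat list of (name, url) pairs; the function names `bookmarks_to_rdf_xml` and `rdf_xml_to_bookmarks` are hypothetical, and the `NC:Name` / `NC:URL` property names are illustrative guesses at a bookmark vocabulary, not a confirmed description of what Mozilla would write.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Assumption: a Netscape-style vocabulary namespace, used here for illustration only.
NC_NS = "http://home.netscape.com/NC-rdf#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("NC", NC_NS)


def bookmarks_to_rdf_xml(bookmarks):
    """Serialize a flat list of (name, url) pairs as RDF/XML text."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for name, url in bookmarks:
        # One rdf:Description per bookmark, identified by its URL.
        desc = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": url}
        )
        ET.SubElement(desc, f"{{{NC_NS}}}Name").text = name
        ET.SubElement(desc, f"{{{NC_NS}}}URL").text = url
    return ET.tostring(root, encoding="unicode")


def rdf_xml_to_bookmarks(text):
    """Load the (name, url) pairs back from the RDF/XML text."""
    root = ET.fromstring(text)
    return [
        (desc.findtext(f"{{{NC_NS}}}Name"), desc.findtext(f"{{{NC_NS}}}URL"))
        for desc in root.findall(f"{{{RDF_NS}}}Description")
    ]
```

A real bookmark store would also need folders, ordering and dates; the sketch only shows that saving and loading can be the same data model written out and read back, with no conversion to a different structure in between.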

Robert Kaiser

Reporter

Comment 1

•

23 years ago

It seems Bugzilla doesn't like my use of the bug's alias in the "bug xxxx" test. OK, it's bug 55057 - hope it links it now :) BTW, all comments in that bug were in favor of that change, nobody said anything against this...

Richard Jones

Comment 2

•

23 years ago

Please, please, please store as XBEL, and allow setting of where the XBEL file is. If mozilla supported using the KDE bookmarks file, I'd use it much more often. As it is though, I have to put up with an XSLT over my bookmarks. I don't actually get to use the bookmarks from the bookmarks menu though. KDE has used XBEL for quite a while, and it's well-integrated into the system (meaning the bookmarks are not just used by Konqueror). Please let me use my KDE bookmarks from Mozilla! Please let me bookmark locations in Mozilla and have them available from KDE! (sorry, this will come across really strongly, but I believe that since you're still in the planning stage, an impassioned plea might actually sway you to what I believe is the right course of action :)

Richard Jones

Comment 3

•

23 years ago

Gah, sorry about the wrapping there... dunno what happened...

Robert Kaiser

Reporter

Updated

•

23 years ago

Alias: rdf-bookmarks

Robert Kaiser

Reporter

Comment 4

•

23 years ago

[about dumping HTML as storage format:] There is one big reason that should concern the standards-aware Mozilla community: the bookmarks HTML format doesn't comply with any HTML standard, IIRC. And that's a main reason to drop it. That argument would lead us to storing bookmarks as XBEL, of course, as this is a standard. Why I (and others) propose RDF is that it may improve performance just to dump/serialize the RDF datasource into an RDF file, and esp. to read it in from there. Of course, we need an import/export function (and that doesn't really depend on how we actually store bookmarks), for supporting multiple formats such as XBEL, Netscape-HTML, and even IE Favorites (yes, it would be a solution for all those IE Favorites import problems).

Mike Lee

Comment 5

•

23 years ago

Discussion on newsgroup... proposing wontfix http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&safe=off&threadm=aptb6g%241156%241%40news.net.uni-c.dk&rnum=2&prev=/groups%3Fq%3Dhow%2Bdoes%2Bmozilla%2Bhandle%2Bbookmarks%2Bhtml%26ie%3DUTF-8%26oe%3DUTF-8%26hl%3Den%26btnG%3DGoogle%2BSearch

Robert Kaiser

Reporter

Comment 6

•

23 years ago

Mike: Just to state it correctly here, you're the _only_ one so far who proposed this to be WONTFIX, also in the thread you mention. And for storing data (and not web content), we shouldn't have ever been turning to using HTML. That's NO data storing language! So, whatever it turns out, it should be a data format, and most people agree that it should be XML. If it ends up being RDF or XBEL or whatever, it can't be HTML, if you really think about it (and, no, using the wrong format for years is no argument here - why should I ever change away from any format that's being used for ages? why should I turn from IE to any other browser when I've been using it for years?)

Mike Lee

Comment 7

•

23 years ago

No I'm the only one who happens to know enough about RDF AND XML formats technically and is _bothered_ to give a reason defending the use of HTML in that thread. No one has given a good enough reason to store the thing in RDF, if you're going to propose the use of RDF or some other attempt to 'XML'fy the bookmarks why don't you defend it in the thread? Why is html the "wrong" format? Quite frankly there will be a lot more pissed users when they discover that their bookmarks have been turned into some human-unreadable machine language that they can't open than there will ever be defending the use of RDF. It's a storage format, no one is supposed to parse the thing except mozilla and when it does it's presented in RDF. If you had proposed this a few years or months back I would have agreed with you, but learning enough about RDF and 'XML format' over the years told me otherwise. Get over it...

Sören 'Chucker' Kuklau (gone)

Comment 8

•

23 years ago

Re: Comment #7 From Mike Lee 2002-11-04 15:59

> No I'm the only one who happens to know enough about RDF AND XML formats
> technically and is _bothered_ to give a reason defending the use of HTML in
> that thread.

Well it appears that you're wrong ;-)

> No one has given a good enough reason to store the thing in RDF,

AFAIK, there's one very good reason, performance. As we handle it as RDF internally anyway (AFAIK), there's no point in converting it back and forth all the time.

> if you're going to propose the use of RDF or some other attempt to 'XML'fy the
> bookmarks why don't you defend it in the thread? Why is html the "wrong"
> format?

- HTML was never intended for this kind of data storage. It was always intended for creating links ("hyperlinks") between semantically similar pages. Netscape happened to have thought that one could see a bookmarks file as an unordered list of such links, but that doesn't really justify usage of HTML today any more.
- There is no version of HTML that has the elements used in Mozilla's bookmark files. Have you actually seen the source of those files? It doesn't quite feel like HTML! Ex.: <H3 ADD_DATE="961102203" ID="NC:BookmarksRoot#$b742f58">Developer Information</H3>
- The advantage of using HTML for the bookmarks file is that I can open it in any HTML reader and it will display just fine. Neat thing, but I can as well export my bookmarks file to real HTML and it'll work as well.

> Quite frankly there will be a lot more pissed users when they discover that
> their bookmarks have been turned into some human-unreadable machine language
> that they can't open than there will ever be defending the use of RDF.

I've never seen "I can human-read my bookmarks file in Mozilla" as an argument for Mozilla and against Opera, Internet Explorer, etc. Do you really think that many users will actually go to their profile directory (most people don't even know where it is, just lurk in #mozilla for a week and watch people ask) to be able to read their bookmarks?

> It's a storage format, no one is supposed to parse the thing except mozilla

Exactly!
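To show what is packed into a line like the `<H3 ADD_DATE=... ID=...>` example above, here is a small sketch using Python's standard `html.parser`. The class and function names (`FolderHeadingParser`, `parse_folder_headings`) are hypothetical, and the sketch only handles folder headings, not a whole bookmarks file. Note that `HTMLParser` lowercases tag and attribute names, so `ADD_DATE` arrives as `add_date`.

```python
from html.parser import HTMLParser


class FolderHeadingParser(HTMLParser):
    """Collect <H3> folder headings and their non-standard attributes."""

    def __init__(self):
        super().__init__()
        self.folders = []
        self._current = None  # heading currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._current = {"attrs": dict(attrs), "title": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "h3" and self._current is not None:
            self.folders.append(self._current)
            self._current = None


def parse_folder_headings(text):
    """Return a list of {'attrs': {...}, 'title': str} for each <H3> in text."""
    parser = FolderHeadingParser()
    parser.feed(text)
    parser.close()
    return parser.folders
```

Feeding it the example line yields the title "Developer Information" plus the `add_date` and `id` attributes, neither of which is defined for `<H3>` in any HTML version, which is the point made in the comment.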

Mike Lee

Comment 9

•

23 years ago

I must have been sleeping when I was taught data structures. Can you tell me how the hell the performance will benefit from parsing a larger, more complex file into memory? Not being sarcastic or anything, I think I really miss that point. It's been bugging me since someone first mentioned it. About your html criticism, RDF was never designed as a bookmark data storage either. Have you ever seen innerHTML in a html spec? You can export RDF if you want too. If no one is ever going to touch the file why convert it to RDF? Waste of engineering effort with little to no return. Seriously, unless there is a real 'performance' benefit to using RDF there is no reason to save it as an RDF file that gives no benefit at all. At the very least people can open the html version using any browser as well as the many bookmark tools available.

Robert Kaiser

Reporter

Comment 10

•

23 years ago

Mike: Your only argument seems to be that the file is larger and that can't help performance; others think that it's wrong (and some of those others are really Mozilla and data storage experts, look into the pasted paragraphs of comment #0). But your argument doesn't count much as long as you can't prove it. So I propose you make up a patch to store RDF, profile it, and give us some real numbers about loading/saving times and file sizes, else I don't think we can much believe you. And I simply don't believe people that are declaring themselves experts and can't prove it other than repeating their arguments. I may not be an expert as well, but there are lots of other people who make the same arguments as me, some of them being real experts who can prove that fact.

Mike Lee

Comment 11

•

23 years ago

I quite respect Sören and I would like to see him provide an explanation of why the RDF file will perform better. I was looking at the source code of the bookmarks after reading his comment and I can't see how the performance would increase. I'd be interested to hear Ben Goodger's opinion (on performance, which wasn't talked about in the comment you quoted) and Chris Waterson's (sorry if I shouldn't have cc'ed you). Heh, you want me to make a patch to store it in RDF and prove it? I think it should be the other way around, I'm AGAINST changing code. If you want to change the code YOU should be the one to come up with the numbers. Not to mention I never declared myself an expert, I just happen to know some programming principles. By the way, I provided a lot of arguments. I even provided a solution of using another RDF file if you want to be more flexible. See, we're on a thin line here, the only argument that RDF has going for it is this "performance" which you never backed up in the newsgroup. I have no intention of making enemies; outside of the mozilla project itself, mozblog is probably the biggest consumer of RDF out of all the addons. There is no reason why I would be against further development of it unless I see something wrong with it. In this case, I think it's wrong. If RDF does in fact give better performance, then I rest my case, you can hack away with this new RDF bookmark format.

Robert Kaiser

Reporter

Comment 12

•

23 years ago

Mike: My real argument for this isn't performance - though I think it would be nice if it was a performance improvement as well. My real arguments are as follows:
1) Despite what you said above, HTML _is_ the wrong format, because:
a) a "standards-compliant browser" should _never_ write out a document that looks to be some official standard but doesn't validate against any of the versions of this very standard. And I didn't find any official HTML standards paper that makes the Mozilla bookmarks file a valid HTML document.
b) HTML is a web document format, "the publishing language of the World Wide Web" (see http://www.w3.org/TR/1998/REC-html40-19980424/), used to "represent a hypertext document for transmission over the network" (from the first ever HTML draft, http://www.w3.org/History/19921103-hypertext/hypertext/WWW/MarkUp/MarkUp.html), not a format to store a bookmarks tree along with bookmarks data, such as last visited date, add date, icons etc.
c) Our current HTML format is very hard to extend, and therefore it's unnecessarily complicated to develop new features. See Ben Goodger's words from 2001-02-14 in comment #0.
d) It's not necessary to convert data to a completely different model just to store it for internal use. It's good if we can convert that, but that's import/export functionality, not storing functionality.
2) Many people would love to see some XML format because that's very very easy to use/convert by other applications. You can even just use XSLT and create a different XML format, even [X]HTML, from it (see some comments in bug 55057).
3) XBEL might have some restrictions when it comes to the argument in 1)c) - I don't know it well enough, though, to know how extensible the standard is there, and we still might need a conversion of the logic of our internally stored data for it.
4) We already store the data in an RDF model in memory, so we need no converting of the logic to just serialize that RDF. Additionally, we can extend the RDF as we want and we'd have a well-defined API, which would make it much easier to work with for 3rd party developers.

If file size is a concern, we might eventually think about compressing the file using an internal gzip or something (as we already have at least the un-gzip part in Mozilla).

The argument that 3rd party tools might rely on the current HTML format is no argument for me, as we'd still provide exporting to that format, and new 3rd party tools that use our XML-based, eventually RDF, format would evolve quite fast. Additionally, it would be really easy to convert to the Netscape-HTML format as well as compliant HTML, XBEL or anything else with our built-in XSLT engine...
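As a rough illustration of the compression idea mentioned above, the following sketch gzips a serialized bookmarks string and reads it back using Python's standard `gzip` module. The helper names `compress_bookmarks` and `decompress_bookmarks` are hypothetical, and nothing here reflects how Mozilla's own gzip code is structured.

```python
import gzip


def compress_bookmarks(text):
    """Encode the serialized bookmarks as UTF-8 and gzip the bytes."""
    return gzip.compress(text.encode("utf-8"))


def decompress_bookmarks(blob):
    """Inverse of compress_bookmarks: gunzip and decode back to text."""
    return gzip.decompress(blob).decode("utf-8")
```

Markup with many repeated element names tends to compress well, so the on-disk size penalty of a verbose XML serialization can shrink considerably, at the cost of the file no longer being readable in a plain text editor.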

Mike Lee

Comment 13

•

23 years ago

I already answered most of these questions, I'll do it here again for clarity.
1a. Define standard, there is simply no standard for bookmark storage.
1b. Look at the bookmark file doctype, it certainly doesn't declare itself as an html format.
1c. Extension could be made through the use of RDF resource subject as I described in newsgroup.
1d. Do you mean we need to convert the history database into RDF as well since we use RDF model?
2. I thought we're not supposed to act on the document directly and should go through mozilla. So why does it matter what format it is 'stored' in?
3. Not sure why you're talking about XBEL in a bug about RDF.
4. What logic? It's writing the entries out one by one just as you walk the tree. We already have a well-defined api that is used by 3rd party developers like me.
5. So size is definitely a negative point right now.

We're talking about the storage of bookmarks here. Anything to do with using them and format conversion doesn't apply here, because that happens when the bookmarks are read into memory and exposed through an RDF api. We can do all the conversion we want there. That leaves us with a few things: flexibility, "standard", and performance.

Flexibility: I already mentioned in the newsgroup how to extend the bookmarks _without_ bloating them. Can you imagine the size of the bookmarks file if people keep adding additional data fields to it? My solution allows the extension to be independent of the bookmarks yet still allows you to access it as if it's one datasource.

Standard: how I love this topic. This is something people, including myself, keep getting caught on. The fact is the netscape bookmark file format is the closest thing to a standard bookmark format you can possibly get. It is certainly the most widely understood bookmark format. It doesn't declare itself as valid html or even just an html format. Quote "<!DOCTYPE NETSCAPE-Bookmark-file-1>", it's a netscape bookmark file. I must admit I only realised that myself just now, but it sure is a good one. It uses html tags, but I don't recall that using the brackets means it must be valid xml or html (is the angle bracket trademarked?). Just as Mozilla stores mail in mbox format, history in mork format, prefs in javascript format, calendar in vcalendar format, cookies in tab delimited format, mozilla stores bookmarks in netscape bookmark format. Note that half of this stuff is not a 'standard' way of storing the respective data, because there is NO standard, just like for bookmarks. Quite frankly the Netscape bookmark format is the de facto standard.

This leaves performance as the only advantage of using RDF. Performance is pretty much still up in the air until someone much more knowledgeable about the bookmark backend comes and makes a comment.

Robert Kaiser

Reporter

Comment 14

•

23 years ago

Mike, I'll ignore all your further comments (and I think you can ignore mine as well in the meantime), until I hear some words from people that have real knowledge of Mozilla's bookmarks system and/or can provide real data about what's going on. We're just repeating arguments, no new findings, other than that you're the only one supporting your arguments, and I can state that some others think the same or similarly. Just some last comments:

> 1a. Define standard, there is simply no standard for bookmark storage.

I just said we write "NETSCAPE-Bookmark-file-1" into a .html file (that's believed to be HTML - as in "a W3C standards compliant file" - by almost anyone I speak to), and "NETSCAPE-Bookmark-file-1" is a proprietary format, no open standard. An Open Source browser should not write such a format. We're NOT Netscape.

> 1d. Do you mean we need to convert the history database into RDF as well since
> we use RDF model?

It doesn't have to be RDF, it should be an open, preferably XML format. The current history format is bad as well, it just doesn't matter that much because I don't think people would want to synchronize history with other clients or such things. Many more people want to have access to bookmarks than to history.

> 2. I thought we're not supposed to act on the document directly and should go
> through mozilla. So why does it matter what format it is 'stored' in?

I never supposed 3rd party applications should always contact a running Mozilla (it might quite often be not running) when wanting to access bookmarks.

> 4. What logic?

At least for me, there's a big logic difference between RDF triples and those "stuffed" Netscape-"HTML"-format-lines in some <DL> "tree"(!?!) structure and strange <H3> tags.

But anyway, too much said. We're repeating arguments, as I said. And no bug report or newsgroup is meant for endlessly repeating arguments.

Summary: Store bookmarks in RDF format → Store bookmarks in a RDF format

Robert John Churchill

Comment 15

•

23 years ago

I can speak knowledgeably about both bookmarks as well as RDF in Mozilla. Using the (historical) HTML format has various advantages for the browser over using RDF. Instead of focusing on "why HTML?" I will go over "why not RDF?" Basically, RDF files are significantly larger on-disk and take longer to parse.

As an example, a *small* "bookmarks.html" file (containing twenty-nine bookmarks and one folder) is 6,829 bytes. Serialized to a "bookmarks.rdf", it uses 22,221 bytes. That's roughly a 225% increase in on-disk size (put the other way, the HTML file is 69.3% smaller).

In terms of parsing, Mozilla can parse the HTML file in 18,802 microseconds. To parse the equivalent RDF file takes 37,352 microseconds. That's roughly a 99% increase in parsing time (put the other way, the HTML parse takes 49.7% less time). The numbers speak for themselves.

Hardware used:
Machine: Power Macintosh, dual-CPU'ed 800 MHz PowerPC G4
RAM: 512 MB of RAM
OS: Mac OS X 10.2.1
Hard drive: 80 GB Seagate Barracuda ATA IV, 7200 RPM, 9.5 ms seek

Could the RDF parser be improved? Sure, why not. Could the HTML parser be similarly improved? Sure, why not. IMHO, Mozilla *should* support at least exporting bookmarks as RDF for those individuals inclined to want to externally process their bookmarks via an RDF format. (Indeed, I've written exactly that code to help generate the numbers above. Perhaps, with good fortune, it will make its way into the tree.)
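The arithmetic behind these measurements can be checked with a short sketch. It uses only the four numbers quoted in this comment; the helper names are hypothetical. A figure such as 69.3% is the reduction measured against the larger RDF file, while the increase measured against the HTML file is considerably bigger.

```python
def percent_increase(old, new):
    """Growth from old to new, relative to old."""
    return (new - old) / old * 100.0


def percent_reduction(larger, smaller):
    """Saving of smaller versus larger, relative to larger."""
    return (larger - smaller) / larger * 100.0


# On-disk size in bytes: bookmarks.html versus bookmarks.rdf
size_increase = percent_increase(6829, 22221)     # about 225%
size_reduction = percent_reduction(22221, 6829)   # about 69.3%

# Parse time in microseconds: HTML versus RDF
parse_increase = percent_increase(18802, 37352)   # about 99%
parse_reduction = percent_reduction(37352, 18802) # about 49.7%
```

In other words, for this sample the RDF file is a bit more than three times the size of the HTML file and takes almost exactly twice as long to parse.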

Mike Lee

Comment 16

•

23 years ago

How about it Robert? I think Sören is a bit busy to reply my question and Waterson turned off bugzilla mails, doh. Proposing wontfix and open a new bug on exporting RDF instead. I am very passionate about RDF, but storing everything in RDF just don't make sense. It's a format to exchange data, not storing them.

Sören 'Chucker' Kuklau (gone)

Comment 17

•

23 years ago

Mike, *exporting* to RDF doesn't make sense. Have you met a browser that imports RDF? Most likely not.

Robert John Churchill

Comment 18

•

23 years ago

Sören: while it is true that no browser currently supports importing RDF, various people have asked over time to be able to export their bookmarks to RDF for external processing (server apps, etc).

Mike Lee

Comment 19

•

23 years ago

Sören, I said export because one of the arguments people have for storing bookmarks in RDF is so other apps could use them. I guess I added that comment because I don't want this bug to 'morph' into an RDF exporting bug. By the way, thanks for the measurements Robert. (just realised there are two Roberts, my previous comment was referring to Robert Kaiser, sorry)

Dan Mosedale (:dmosedale, :dmose)

Updated

•

23 years ago

Hardware: PC → All

Robert John Churchill

Comment 20

•

23 years ago

I've opened bug 177886 for tracking exporting of bookmarks to RDF.

Mike Lee

Comment 21

•

23 years ago

Robert, do you really mean bug 177886 (that's this bug...)? I couldn't find the export to rdf bug.

Robert John Churchill

Comment 22

•

23 years ago

Oops, I meant bug 180423

lord_ion

Comment 23

•

23 years ago

I think everyone has missed the point of the original bug (55057). The user wanted XBEL because people hope to make that a cross-platform, cross-browser standard for bookmark storage, not for performance gains. Compare to ACAP (http://asg.web.cmu.edu/acap/index.html). What users want is ONE place to store bookmarks that ALL browsers use. I use konqueror, mozilla, IE (sometimes), and phoenix. I therefore have four different sets of bookmarks. Import/Export is fine, but I hate that I can't just bookmark something and then have it be there when I open a different browser. Thus, cross-platform/browser standards (which I realise XBEL is not, but standards are made by being supported). Maybe you will tell me to just pick one browser, but sometimes I need to use windows (to play a game) and I like konqi for when I'm upgrading/tweaking phoenix and I've nuked my install -- still need a browser, but phoenix isn't there anymore. I don't think I'm alone in using several browsers. Yes, sharing across machines would require leaving the file on a server and having the client connect to it, but on my machine where I do 99% of my work, having a single bookmark file would be a godsend.

lord_ion

Comment 24

•

23 years ago

Ok, really sorry about the lack of wrapping. Should I use hard returns?

Mark Stier

Comment 25

•

21 years ago

(In reply to comment #23) I fully agree. Maybe that's a task for freedesktop.org? :-)

Mark Stier

Comment 26

•

21 years ago

Maybe, we should make this bug dependent on bug #11050? http://bugzilla.mozilla.org/show_bug.cgi?id=11050#c15 Storing Mozilla's user data in a standardized way into a local SQL database should provide a MUCH better way to share bookmarks data and other data. Even the disk cache could be shared between KDE, Mozilla, ...
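A minimal sketch of what a shared local SQL bookmark store could look like, using Python's standard `sqlite3` module. The table layout and the function names (`open_bookmark_store`, `add_bookmark`, `list_bookmarks`) are assumptions made up for illustration; they are not the schema proposed in bug 11050 or used by any browser.

```python
import sqlite3


def open_bookmark_store(path=":memory:"):
    """Open (or create) a bookmark database with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks ("
        " id INTEGER PRIMARY KEY,"
        " title TEXT NOT NULL,"
        " url TEXT NOT NULL UNIQUE,"
        " added INTEGER NOT NULL)"  # Unix timestamp of when it was bookmarked
    )
    return conn


def add_bookmark(conn, title, url, added):
    """Insert a bookmark; a URL that is already stored is left untouched."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO bookmarks (title, url, added) VALUES (?, ?, ?)",
            (title, url, added),
        )


def list_bookmarks(conn):
    """Return (title, url) pairs in insertion order."""
    return conn.execute("SELECT title, url FROM bookmarks ORDER BY id").fetchall()
```

Any application that can open the same database file could then read or add bookmarks through ordinary SQL, which is the sharing scenario described in this comment.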

Myk Melez [:myk] [@mykmelez]

Updated

•

21 years ago

Product: Browser → Seamonkey

Reed Loden [:reed]

Updated

•

18 years ago

Assignee: bugs → nobody

QA Contact: claudius → bookmarks

Benjamin Smedberg

Comment 27

•

18 years ago

places (therefore sqlite) is the new storage medium

Status: NEW → RESOLVED

Closed: 18 years ago

Resolution: --- → WONTFIX