Closed
Bug 73992
(dublinCore)
Opened 24 years ago
Closed 16 years ago
Page Info dialog should support Dublin Core metadata
Categories
(SeaMonkey :: Page Info, enhancement)
SeaMonkey
Page Info
Tracking
(Not tracked)
RESOLVED
WORKSFORME
Future
People
(Reporter: karl, Assigned: db48x)
References
()
Details
(Keywords: helpwanted)
The View Page Info dialog should support Dublin Core metadata <URL:
http://dublincore.org/documents/1999/07/02/dces/ >. Dublin Core defines 15
elements (these are not elements in the SGML/XML sense):
Title: A name given to the resource.
Creator: An entity primarily responsible for making the content of the resource.
Subject: The topic of the content of the resource.
Description: An account of the content of the resource.
Publisher: An entity responsible for making the resource available.
Contributor: An entity responsible for making contributions to the content of
the resource.
Date: A date associated with an event in the life cycle of the resource.
Type: The nature or genre of the content of the resource.
Format: The physical or digital manifestation of the resource.
Identifier: An unambiguous reference to the resource within a given context.
Source: A Reference to a resource from which the present resource is derived.
Language: A language of the intellectual content of the resource.
Relation: A reference to a related resource.
Coverage: The extent or scope of the content of the resource.
Rights: Information about rights held in and over the resource.
This metadata is included in the HTML like this as defined in <URL:
http://www.ietf.org/rfc/rfc2731.txt >:
<meta name = "DC.Creator"
content = "Engels, F.">
<meta name = "DC.Title"
content = "Capital">
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.1/">
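A minimal sketch of how these name/content pairs could be pulled out of a page, using Python's standard html.parser rather than Mozilla's own DOM code. The names MetaCollector and collect_meta are made up for illustration.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collects (name, content) pairs from <meta> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        name, content = attr.get("name"), attr.get("content")
        if name is not None and content is not None:
            self.pairs.append((name, content))


def collect_meta(html_text):
    """Return every (name, content) pair found in html_text, in document order."""
    parser = MetaCollector()
    parser.feed(html_text)
    return parser.pairs


SAMPLE = '''
<meta name = "DC.Creator"
      content = "Engels, F.">
<meta name = "DC.Title"
      content = "Capital">
'''

print(collect_meta(SAMPLE))
```

This only gathers the raw pairs; deciding which of them count as Dublin Core is a separate step.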
Dublin Core also has a set of qualifiers <URL:
http://dublincore.org/documents/dcmes-qualifiers/ >, which "narrow" the meaning
of different elements. Example:
<meta name = "DC.Date.Created"
content = "1998-05-14">
<meta name = "DC.Date.Available"
content = "1998-05-21">
<meta name = "DC.Date.Valid"
content = "1998-05-28">
More examples can also be found in <URL:
http://dublincore.org/documents/2000/07/16/usageguide/qualified-html.shtml >
(not normative). Note that an element can be repeated several times (e.g., when
there are several authors). All elements should be displayed in the UI, possibly
using several tabs/categories.
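As a rough sketch (not a proposed implementation), a name such as 'DC.Date.Created' can be split into prefix, element and qualifier, and repeated elements can be grouped so that every value is kept. The function names split_name and group_values are hypothetical.

```python
from collections import defaultdict


def split_name(name):
    """Split 'DC.Date.Created' into ('DC', 'Date', 'Created').

    Missing parts are returned as None.
    """
    parts = name.split(".", 2)
    prefix = parts[0]
    element = parts[1] if len(parts) > 1 else None
    qualifier = parts[2] if len(parts) > 2 else None
    return prefix, element, qualifier


def group_values(pairs):
    """Group (name, content) pairs by (element, qualifier), keeping repeats."""
    grouped = defaultdict(list)
    for name, content in pairs:
        _prefix, element, qualifier = split_name(name)
        grouped[(element, qualifier)].append(content)
    return dict(grouped)
```

Keeping a list per (element, qualifier) key means two 'DC.Creator' entries both survive and can be shown as separate rows.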
All of these should be supported in the Page Info dialog. Here's a
*complete* "walk-through" of what we can expect to find in HTML documents,
including all qualifiers (we should support these, and no others) and schemes:
TITLE:
<meta name = "DC.Title"
content = "Hamlet in Iceland; being the Icelandic romantic Ambales
saga">
<meta name = "DC.Title.Alternative"
content = "Ambales saga">
<meta name = "DC.Title"
lang = "nn"
content = "Hamlet på Island – Ambales saga">
Note the 'Alternative' qualifier. This should be marked as such in the UI.
The language of the first two titles is defined by the document, i.e.:
<html xml:lang="en"> (or <head ...> or another parent)
<html lang="en">
HTTP header 'Content-Language'
<meta http-equiv="Content-Language" content="en">
xml:lang overrides lang, which overrides the HTTP header, which overrides meta
http-equiv. (This is the normal way of getting the language of an
element/attribute: inheritance. I don't know if this information is available in
Mozilla, but it *should* be, as CSS 2 requires it.)
Language can be explicitly defined on each 'meta' element or implicitly, by
inheritance from the parent or HTTP header.
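The precedence described above can be sketched as a simple first-match lookup. The function name resolve_language and its parameter names are assumptions for illustration; how each value is actually obtained inside Mozilla is not covered here.

```python
def resolve_language(meta_lang=None, xml_lang=None, html_lang=None,
                     http_header=None, meta_http_equiv=None):
    """Return the language for one 'meta' element, or None if unknown.

    Order: explicit lang on the 'meta' element, then inherited xml:lang,
    then inherited lang, then the Content-Language HTTP header, then
    <meta http-equiv="Content-Language">.
    """
    for candidate in (meta_lang, xml_lang, html_lang, http_header, meta_http_equiv):
        if candidate:
            return candidate
    return None
```

For the third title in the example, meta_lang would be "nn" and wins over whatever the document declares.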
CREATOR:
<meta name = "DC.Creator"
content = "Hufthammer, Karl Ove">
The creator name is usually written in the form 'Last name, First Name', but not
always, e.g.:
<meta name = "DC.Creator"
content = "Mao Tse Tung">
They should *always* be displayed as 'First Name Last Name', e.g. 'Hufthammer,
Karl Ove' should be displayed as 'Karl Ove Hufthammer'.
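A small sketch of that display rule, assuming that a comma is the only signal for the 'Last name, First Name' form and that names without a comma are shown unchanged. The function name display_name is hypothetical.

```python
def display_name(creator):
    """Turn 'Last, First' into 'First Last'; leave other forms untouched."""
    if "," not in creator:
        return creator.strip()
    # Split on the first comma only, so later commas stay in the first-name part.
    last, first = creator.split(",", 1)
    last, first = last.strip(), first.strip()
    if not first:
        return last
    return f"{first} {last}"


print(display_name("Hufthammer, Karl Ove"))  # Karl Ove Hufthammer
print(display_name("Mao Tse Tung"))          # Mao Tse Tung
```

Corporate names that happen to contain a comma would be reordered too, so this heuristic is not safe for every value.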
SUBJECT:
<meta name = "DC.Subject"
content = "heart attack">
<meta name = "DC.Subject"
scheme = "MeSH"
content = "Myocardial Infarction; Pericardial Effusion">
<meta name = "DC.Subject"
content = "Vietnam War">
<meta name = "DC.Subject"
scheme = "LCSH"
content = "Vietnamese Conflict, 1961-1975">
<meta name = "DC.Subject"
content = "Friendship">
Note the 'scheme' attribute. This can take one of the values:
LCSH
MeSH
DDC
LCC
UDC
When presented in the UI, the name of the scheme should also be shown, but
expanded to the following (not including the text in []):
Library of Congress Subject Headings
Medical Subject Headings [See <URL: http://www.nlm.nih.gov/mesh/meshhome.html >]
Dewey Decimal Classification [See <URL: http://www.oclc.org/dewey/index.htm >]
Library of Congress Classification [See <URL:
http://lcweb.loc.gov/catdir/cpso/lcco/lcco.html >]
Universal Decimal Classification
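The expansion of scheme abbreviations could be a plain lookup table, sketched below. The table contents come from the list above; the function name subject_label, the exact-case lookup, and the "value (scheme)" layout are assumptions.

```python
SUBJECT_SCHEMES = {
    "LCSH": "Library of Congress Subject Headings",
    "MeSH": "Medical Subject Headings",
    "DDC": "Dewey Decimal Classification",
    "LCC": "Library of Congress Classification",
    "UDC": "Universal Decimal Classification",
}


def subject_label(content, scheme=None):
    """Return the text to show for one DC.Subject value."""
    if scheme is None:
        return content
    # Unknown schemes fall back to the raw attribute value (assumption).
    full_name = SUBJECT_SCHEMES.get(scheme, scheme)
    return f"{content} ({full_name})"
```

In a real UI the expanded names would need to be localizable rather than hard-coded.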
DESCRIPTION:
<meta name = "DC.Description"
content = "A tutorial and reference manual for Java.">
<meta name = "DC.Description.TableofContents"
lang = "en"
content = "The Author gives some Account of Himself and Family
-- His First Inducements to Travel -- He is
Shipwrecked, and Swims for his Life -- Gets safe on
Shore in the Country of Lilliput -- Is made a
Prisoner, and carried up the Country">
<meta name = "DC.Description.Abstract"
content = "The kinematics of the jaws and hyolingual apparatus in
Caiman crocodilus were examined by cineradiography and
electromyography. After catching, caimans position their
prey between the teeth by a series of inertial bites and
then kill and crush it by a forceful bite.">
Note the 'TableofContents' and 'Abstract'. These should be marked as such in
the UI.
PUBLISHER:
<meta name = "DC.Publisher"
content = "O'Reilly">
<meta name = "DC.Publisher"
content = "Digital Equipment Corporation">
This is pretty straightforward. There could be more than one publisher of a
document.
CONTRIBUTOR:
<meta name = "DC.Contributor"
content = "Curie, Marie">
Again, pretty straightforward.
DATE:
<meta name = "DC.Date"
scheme = "W3CDTF"
content = "1998-05-14">
<meta name = "DC.Date.Created"
scheme = "W3CDTF"
content = "1998-05-14">
<meta name = "DC.Date.Available"
content = "1998-05-21">
<meta name = "DC.Date.Valid"
scheme = "W3CDTF"
content = "1998">
<meta name = "DC.Date.Valid"
scheme = "W3CDTF"
content = "1999-09-25T14:20+10:00/">
<meta name = "DC.Date.Issued"
scheme = "W3CDTF"
content = "1998-05-29">
<meta name = "DC.Date.Modified"
scheme = "W3CDTF"
content = "1998-05-29">
Note the qualifiers 'Created', 'Available', 'Valid', 'Issued' and 'Modified'.
If the values of 'Created' and 'Modified' aren't available, they can be taken from
the HTTP headers.
The W3CDTF scheme is basically ISO 8601, and is specified in <URL:
http://www.w3.org/TR/NOTE-datetime >. This is the default if no scheme is
specified.
There is also a scheme="Period" defined in <URL:
http://dublincore.org/documents/dcmi-period/ >, though this can't be used as an
attribute value (as far as I can see).
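For checking whether a value really is in W3CDTF form, a regular expression covering the granularities of the W3C note (year, year-month, full date, and date with time plus time zone designator) might look like the sketch below. The name is_w3cdtf is hypothetical, and the pattern checks shape only, not whether the month or day is in range.

```python
import re

W3CDTF = re.compile(
    r"\d{4}"                                   # YYYY
    r"(?:-\d{2}"                               # -MM
    r"(?:-\d{2}"                               # -DD
    r"(?:T\d{2}:\d{2}(?::\d{2}(?:\.\d+)?)?"    # Thh:mm[:ss[.s]]
    r"(?:Z|[+-]\d{2}:\d{2}))?"                 # time zone designator
    r")?)?"
)


def is_w3cdtf(value):
    """Return True if value has the shape of a W3CDTF date or date-time."""
    return W3CDTF.fullmatch(value) is not None
```

Note that the open-ended value '1999-09-25T14:20+10:00/' from the example above does not match this pattern because of the trailing slash; it would need separate handling.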
TYPE:
<meta name = "DC.Type"
scheme = "DCMIType"
content = "Software">
<meta name = "DC.Type"
scheme = "DCMIType"
content = "Dataset">
<meta name = "DC.Type"
scheme = "DCMIType"
content = "Event">
<meta name = "DC.Type"
scheme = "DCMIType"
content = "Service">
The DCMIType scheme is defined in <URL:
http://dublincore.org/documents/dcmi-type-vocabulary/ >. There are nine different DCMI types. There can be a button
to get a description of a type, or this can be presented as a tooltip
(localizable of course). For 'Service':
A service is a system that provides one or more functions of value to the end-
user. Examples include: a photocopying service, a banking service, an
authentication service, interlibrary loans, a Z39.50 or Web server.
FORMAT:
<meta name = "DC.Format.Medium"
scheme = "IMT"
content = "text/xml">
<meta name = "DC.Format.Extent"
content = "14 minutes">
<meta name = "DC.Format"
content = "A text file with mono-spaced tables and diagrams.">
<meta name = "DC.Format"
content = "video/mpeg; 14 minutes">
The IMT scheme is defined in <URL:
http://www.isi.edu/in-notes/iana/assignments/media-types/media-types >.
IDENTIFIER:
<meta name = "DC.Identifier"
scheme = "URI"
content = "http://catalog.loc.gov/67-26020">
The URI scheme is defined in <URL: http://www.ietf.org/rfc/rfc2396.txt >. All
URIs should be clickable.
(An identifier is *not* a URI, and should not be treated as one, unless the 'URI'
scheme is used, even though it has the form of a valid URI. We should always
honor the scheme and never assume a particular scheme is used if it isn't
explicitly defined in the 'meta' element (an exception is 'DC.Date' which is
W3C Date/Time if no scheme is chosen).)
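The "honor the scheme, never guess" rule can be sketched as follows. The names effective_scheme and render, and the dictionary layout, are made up for illustration.

```python
def effective_scheme(element, scheme=None):
    """Return the scheme to use; only DC.Date gets a default (W3CDTF)."""
    if scheme:
        return scheme
    return "W3CDTF" if element == "Date" else None


def render(element, content, scheme=None):
    """Describe how one value should be shown.

    'href' is set only when the 'URI' scheme was given explicitly, even if
    the content happens to look like a URI.
    """
    resolved = effective_scheme(element, scheme)
    href = content if resolved == "URI" else None
    return {"text": content, "scheme": resolved, "href": href}
```

With this, an identifier of "http://catalog.loc.gov/67-26020" without scheme="URI" is shown as plain text and is not clickable.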
SOURCE:
<meta name = "DC.Source"
content = "Shakespeare's Romeo and Juliet">
<meta name = "DC.Source"
scheme = "URI"
content = "http://a.b.org/manon/">
With the 'URI' scheme, the content is a URI. The default is plain text.
LANGUAGE:
<meta name = "DC.Language"
scheme = "rfc1766"
content = "en">
<meta name = "DC.Language"
scheme = "ISO639-2"
content = "eng">
<meta name = "DC.Language"
scheme = "rfc1766"
content = "en-US">
ISO639-2: <URL: http://lcweb.loc.gov/standards/iso639-2/langhome.html >.
RFC 1766: <URL: http://www.ietf.org/rfc/rfc1766.txt >.
The name of the language, not its language code, should be displayed in the UI.
Mozilla already has a list of language code/language name pairs
(see 'Preferences' | 'Language') built-in.
Also, for backwards compatibility, 'ISO639-1' should be treated as a synonym
for 'rfc1766'. (The Nordic Metadata Template uses this.)
When no language is specified, the language should be taken from the HTTP
header, an http-equiv meta element or lang="xx" or xml:lang="xx" on the 'html'
element (it should not be taken from any other elements -- only language
specified on the top-level element defines the document language).
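A rough sketch of the scheme handling for DC.Language, with 'ISO639-1' folded into 'rfc1766'. The tiny LANGUAGE_NAMES table is illustrative only and stands in for Mozilla's built-in list; the function names and the case-insensitive matching are assumptions.

```python
# Illustrative entries only; a real implementation would use the browser's
# existing language code/name list.
LANGUAGE_NAMES = {
    ("rfc1766", "en"): "English",
    ("rfc1766", "en-us"): "English (United States)",
    ("iso639-2", "eng"): "English",
}


def normalize_scheme(scheme):
    """Lowercase the scheme and treat 'ISO639-1' as a synonym for 'rfc1766'."""
    lowered = scheme.lower()
    return "rfc1766" if lowered == "iso639-1" else lowered


def language_name(code, scheme):
    """Return the display name for a language code, or the code if unknown."""
    key = (normalize_scheme(scheme), code.lower())
    return LANGUAGE_NAMES.get(key, code)
```

Falling back to the raw code keeps unknown languages visible instead of hiding them.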
RELATION:
<meta name = "DC.Relation.IsVersionOf"
scheme = "URI"
content = "http://foo.bar.org/draft9.4.4.2">
<meta name = "DC.Relation.HasVersion"
scheme = "URI"
content = "http://foo.bar.org/draft9.4.4.2">
<meta name = "DC.Relation.IsReplacedBy"
scheme = "URI"
content = "http://foo.bar.org/draft9.4.4.2">
<meta name = "DC.Relation.Replaces"
scheme = "URI"
content = "http://foo.bar.org/draft9.4.4.2">
<meta name = "DC.Relation.IsRequiredBy"
scheme = "URI"
content = "http://foo.bar.org/draft9.4.4.2">
<meta name = "DC.Relation.Requires"
content = "LWP::UserAgent; HTML::Parse; URI::URL;
Net::DNS; Tk::Pixmap; Tk::Bitmap; Tk::Photo">
<meta name = "DC.Relation.IsPartOf"
scheme = "URI"
content = "http://foo.bar.org/abc/proceedings/1998/">
<meta name = "DC.Relation.HasPart"
scheme = "URI"
content = "http://foo.bar.org/abc/proceedings/1998/">
<meta name = "DC.Relation.IsFormatOf"
scheme = "URI"
content = "http://foo.bar.org/cd145.sgml">
<meta name = "DC.Relation.IsReferencedBy"
scheme = "URI"
content = "http://foo.bar.org/cd145.sgml">
<meta name = "DC.Relation.References"
content = "urn:isbn:1-56592-149-6">
<meta name = "DC.Relation.IsFormatOf"
content = "Shakespeare's Romeo and Juliet">
<meta name = "DC.Relation.HasFormat"
scheme = "URI"
content = "Shakespeare's Romeo and Juliet">
With the 'URI' scheme, the content is a URI. The default is plain text.
I *think* I remembered all qualifiers. Descriptions of them can be found at <URL:
http://dublincore.org/documents/dcmes-qualifiers/#relation >.
COVERAGE:
<meta name = "DC.Coverage.Temporal"
content = "US civil war era; 1861-1865">
<meta name = "DC.Coverage.Temporal"
scheme = "W3CDTF"
content = "1998">
<meta name = "DC.Coverage.Spatial"
content = "Columbus, Ohio, USA; Lat: 39 57 N Long: 082 59 W">
<meta name = "DC.Coverage.Spatial"
scheme = "TGN"
content = "Columbus (C,V)">
Note to author: Coverage is about the spatial or temporal features of the intellectual
content. A document about the Eiffel Tower, written in English, by a Norwegian,
living in Turkey, stored on a server in Brazil should have a coverage
of 'Paris' or 'France' or the equivalent geographical coordinates.
This has the qualifiers 'Temporal' and 'Spatial'. There are tons of schemes for
these. See <URL: http://dublincore.org/documents/dcmes-qualifiers/#coverage >.
RIGHTS:
<meta name = "DC.Rights"
lang = "en"
content = "Copyright Acme 1999 - All rights reserved.">
<meta name = "DC.Rights"
scheme = "URI"
content = "http://foo.bar.org/cgi-bin/terms">
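Since only the qualifiers in this walk-through should be supported, the whole set can be written down as one table. This is a sketch: the qualifier names are copied from the examples above, while the name is_supported and the case-sensitive matching are assumptions.

```python
SUPPORTED_QUALIFIERS = {
    "Title": {"Alternative"},
    "Creator": set(),
    "Subject": set(),
    "Description": {"TableofContents", "Abstract"},
    "Publisher": set(),
    "Contributor": set(),
    "Date": {"Created", "Available", "Valid", "Issued", "Modified"},
    "Type": set(),
    "Format": {"Medium", "Extent"},
    "Identifier": set(),
    "Source": set(),
    "Language": set(),
    "Relation": {
        "IsVersionOf", "HasVersion", "IsReplacedBy", "Replaces",
        "IsRequiredBy", "Requires", "IsPartOf", "HasPart",
        "IsFormatOf", "HasFormat", "IsReferencedBy", "References",
    },
    "Coverage": {"Temporal", "Spatial"},
    "Rights": set(),
}


def is_supported(element, qualifier=None):
    """True if the element is one of the 15 and the qualifier (if any) is listed."""
    if element not in SUPPORTED_QUALIFIERS:
        return False
    return qualifier is None or qualifier in SUPPORTED_QUALIFIERS[element]
```

Anything that fails this check could still be listed as an ordinary meta tag, just without Dublin Core treatment.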
*** IMPORTANT ***
The 'DC' "elements" should *only* be recognized if one of the following lines
is included in the HTML document:
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.1/">
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/">
<link rel = "schema.DC"
href = "http://purl.org/metadata/dublin_core_elements">
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.0/#fragment-identifier">
(You can compare these to 'namespaces' in XML.)
The 'http://purl.org/metadata/dublin_core_elements' should only be supported
for backwards compatibility and its use is discouraged. *All* URLs can contain
fragment identifiers, e.g.:
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.1/#date">
Here, only the 'date' element should be supported.
I'm not completely sure of this, but I *think* using
<link rel = "schema.TEST"
href = "http://purl.org/DC/elements/1.1/">
should enable:
<meta name = "TEST.Description"
content = "A tutorial and reference manual for Java.">
to work (i.e., the prefix has no meaning in itself, only when connected to
a "namespace").
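The recognition rule can be sketched like this, assuming the link elements have already been collected as (rel, href) pairs. The names dc_prefixes and is_recognized are hypothetical, and comparing the fragment to the element name case-insensitively is a guess based on the '#date' example.

```python
KNOWN_DC_NAMESPACES = {
    "http://purl.org/DC/elements/1.1/",
    "http://purl.org/DC/elements/1.0/",
    "http://purl.org/metadata/dublin_core_elements",  # backwards compatibility
}


def dc_prefixes(links):
    """Map each declared prefix to its fragment restriction (or None).

    links is an iterable of (rel, href) pairs from <link> elements.
    """
    prefixes = {}
    for rel, href in links:
        if not rel.startswith("schema."):
            continue
        base, _, fragment = href.partition("#")
        if base in KNOWN_DC_NAMESPACES:
            prefixes[rel[len("schema."):]] = fragment or None
    return prefixes


def is_recognized(meta_name, prefixes):
    """True if meta_name uses a declared prefix and passes any fragment limit."""
    prefix, _, rest = meta_name.partition(".")
    if prefix not in prefixes or not rest:
        return False
    restricted_to = prefixes[prefix]
    element = rest.split(".", 1)[0]
    return restricted_to is None or element.lower() == restricted_to.lower()
```

With schema.TEST pointing at the 1.1 namespace, 'TEST.Description' is recognized and 'DC.Description' is not, which matches the reading that the prefix itself carries no meaning.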
Comment 1•24 years ago
Karl: db48x is rewriting Page Info, as far as I know. You may want to contact
him. The word infobot in #mozillazine has his email address.
Gerv
What if someone has:
<link rel = "schema.DC"
href = "http://purl.org/DC/elements/1.2/">
[and that url exists]
Comment 3•24 years ago (Reporter)
> What if someone has:
> <link rel = "schema.DC"
> href = "http://purl.org/DC/elements/1.2/">
Hmm, you have a point. OK, I think we can safely assume that all schemas
beginning with "http://purl.org/DC/elements/" are part of the DC. (The DC is
pretty stable, and I doubt it will change much.)
> [and that url exists]
404?
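The relaxed rule from this comment amounts to a prefix test on the namespace URL, with no fetch of the URL involved. A sketch, where the name is_dc_namespace is hypothetical and keeping the older dublin_core_elements URL is an assumption carried over from the original description:

```python
DC_NAMESPACE_ROOT = "http://purl.org/DC/elements/"
LEGACY_NAMESPACE = "http://purl.org/metadata/dublin_core_elements"


def is_dc_namespace(href):
    """True if href (ignoring any fragment) names a Dublin Core element set."""
    base = href.partition("#")[0]
    return base.startswith(DC_NAMESPACE_ROOT) or base == LEGACY_NAMESPACE
```

Because nothing is fetched, a version such as 1.2 is accepted whether or not the URL resolves.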
well i picked 1.2 because you didn't list it, but i'm assuming it doesn't exist
yet (or you would have). if someone references 1.2 and it returns 404 do we
still honor it? [I was more concerned w/ honoring a present file that matched
the naming convention but not your list -- thanks for your revised answer,
it's much more acceptable]
Keywords: helpwanted
Comment 5•24 years ago (Assignee)
This information will already be picked up by the new page info stuff when it
lists the contents of all meta tags. I'm just displaying the contents of the
name and content/http-equiv attributes, so it won't pretty print anything. Will
this be enough for the shipping mozilla? Anything further is easily included as
an extension with an overlay. That overlay could modify the contents of the tree
I'm showing meta tags in, or add a completely separate tab for displaying the DC
metadata. I'd recommend the latter.
db48x
Now if I could just get mozilla to stop crashing on form submission...
Comment 6•24 years ago (Reporter)
> I'm just displaying the contents of the
> name and content/http-equiv attributes, so it won't
> pretty print anything. Will
> this be enough for the shipping mozilla?
Well, it will be better than nothing, but it won't be enough for marking this
bug as 'fixed' (any more than displaying a DOM tree of an HTML document can be
seen as *supporting* HTML).
Comment 7•24 years ago
Dublin Core metadata should also be accessible through the W3C's RDF
recommendation, like so:
<html><head>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/metadata/dublin_core#">
<rdf:Description about="http://www.dlib.org">
<dc:Title>D-Lib Program - Research in Digital Libraries</dc:Title>
<dc:Description>The D-Lib program supports the community of people
with research interests in digital libraries and electronic
publishing.</dc:Description>
<dc:Publisher>Corporation For National Research Initiatives</dc:Publisher>
<dc:Date>1995-01-07</dc:Date>
<dc:Subject>
<rdf:Bag>
<rdf:li>Research; statistical methods</rdf:li>
<rdf:li>Education, research, related topics</rdf:li>
<rdf:li>Library use Studies</rdf:li>
</rdf:Bag>
</dc:Subject>
<dc:Type>World Wide Web Home Page</dc:Type>
<dc:Format>text/html</dc:Format>
<dc:Language>en</dc:Language>
</rdf:Description>
</rdf:RDF>
</head></html>
or the RDF abbreviated syntax:
<html><head>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/metadata/dublin_core#">
<rdf:Description about="http://www.dlib.org"
dc:Title="D-Lib Program - Research in Digital Libraries"
dc:Description="The D-Lib program supports the community of people
with research interests in digital libraries and electronic
publishing."
dc:Publisher="Corporation For National Research Initiatives"
dc:Date="1995-01-07"/>
</rdf:RDF>
</head></html>
or through links to external RDF files:
<link rel="meta" href="mydocMetadata.DC.RDF">
...These examples were taken from the W3C recommendation at
http://www.w3.org/TR/REC-rdf-syntax/ -- see that document for more details.
I know that Mozilla has support for RDF datasources, but I don't know how different
this usage of RDF is from the current implementation. As support for RDF as a
vehicle for metadata and the "Semantic Web" grows, Mozilla needs to be able to
put it to use.
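To give an idea of what reading such RDF involves, here is a sketch using Python's xml.etree.ElementTree on a standalone rdf:RDF fragment. It has nothing to do with Mozilla's RDF datasource code; the name dc_from_rdf is hypothetical, and it only handles the two forms shown above (child elements, optionally holding an rdf:Bag, and the abbreviated attribute syntax).

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/metadata/dublin_core#"  # namespace used in the examples above


def dc_from_rdf(rdf_xml):
    """Return {element name: [values]} for DC properties in an rdf:RDF string."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # Abbreviated syntax: DC properties written as attributes.
        for attr, value in desc.attrib.items():
            if attr.startswith(f"{{{DC_NS}}}"):
                result.setdefault(attr.split("}", 1)[1], []).append(value)
        # Full syntax: DC properties written as child elements.
        for child in desc:
            if not child.tag.startswith(f"{{{DC_NS}}}"):
                continue
            name = child.tag.split("}", 1)[1]
            # Values inside an rdf:Bag are collected one per rdf:li.
            items = [li.text.strip() for li in child.iter(f"{{{RDF_NS}}}li") if li.text]
            if items:
                result.setdefault(name, []).extend(items)
            elif child.text and child.text.strip():
                result.setdefault(name, []).append(child.text.strip())
    return result
```

Extracting the rdf:RDF block from an HTML page, and following <link rel="meta"> to an external file, are left out of this sketch.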
Updated•23 years ago
Status: NEW → ASSIGNED
Target Milestone: --- → Future
Comment 8•23 years ago
mass moving open bugs pertaining to page info to pmac@netscape.com as qa contact.
to find all bugspam pertaining to this, set your search string to
"BigBlueDestinyIsHere".
QA Contact: sairuh → pmac
Updated•22 years ago (Assignee)
Alias: dublinCore
Updated•20 years ago
Product: Browser → Seamonkey
Bug 268343 is about Live Bookmarks better supporting Dublin Core metadata, related?
Comment 10•16 years ago
Test cases at http://www.codestyle.org/test/DCTestCases.shtml
WFM with the current pageinfo implementation in Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.1b4pre) Gecko/20090422 SeaMonkey/2.0b1pre
All the attributes show up in the General->Meta list.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WORKSFORME
Comment 11•10 years ago
This is a TEST!!!