How to use title, subject and keywords

Many document formats allow you to add metadata such as title, subject and keywords to a document. Widespread formats that do so are HTML, MS Word and JPEG. Although metadata is an effective way to organize the content on your computer, it is rarely used.
One reason is that even these three attributes mean different things in HTML, Word and JPEG. Try searching Google for examples of how to fill them in and you will not find much. Let's try to work out a usable convention.

The well-known Dublin Core (DC) standard is a good starting point because many metadata schemas build on it. Among others, it defines these attributes:

  • title: The name given to the resource. Typically, a Title will be a name by which the resource is formally known, e.g. “My dog has fun”
  • subject: The topic of the content of the resource. Typically, a Subject will be expressed as keywords, key phrases or classification codes that describe the topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme. Separate the keywords with semicolons, e.g. “animal; dog; water”
  • description: Description may include but is not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content
  • source: A reference to a resource from which the present resource is derived, e.g. “http://dublincore.org/documents/usageguide/elements.shtml”
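
To make the semicolon convention concrete, here is a minimal Python sketch of such a record. The class name DublinCoreRecord and the method subject_string are made up for illustration and are not part of Dublin Core or of any library; the cleanup rules (trimming blanks, dropping duplicates) are also just an assumption about what is practical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DublinCoreRecord:
    """Hypothetical container for the four attributes listed above."""
    title: str
    subject: List[str] = field(default_factory=list)  # one entry per keyword
    description: str = ""
    source: str = ""

    def subject_string(self) -> str:
        # Join keywords with "; ", skipping blanks and duplicates
        # while keeping the original order.
        seen = []
        for keyword in self.subject:
            keyword = keyword.strip()
            if keyword and keyword not in seen:
                seen.append(keyword)
        return "; ".join(seen)


record = DublinCoreRecord(
    title="My dog has fun",
    subject=["animal", "dog", "water"],
    source="http://dublincore.org/documents/usageguide/elements.shtml",
)
print(record.subject_string())
```

Keeping the keywords as a list and joining them only when writing them out avoids stray separators when the same record is written to several formats.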

How to use it in the different formats

  • HTML stores metadata in the head section. Originally it offered only title, keywords and description, but it also allows you to embed DC attributes, e.g. <meta name="DC.title" content="SELFHTML: Meta-Angaben">. Always use the DC attributes.
  • JPEG allows you to store metadata in an EXIF or XMP container inside the JPEG file. XMP is the future-proof standard and the default for most professional applications (e.g. Adobe Photoshop, Apple Aperture). XMP natively supports the Dublin Core attributes.
  • MS Word is, as always, different. By default it provides the metadata fields title, subject, keywords and description, and its old formats do not support DC. The new document format OpenXML stores DC metadata but also keeps the keywords attribute. Best practice is to ignore the keywords attribute and put your keywords into subject instead.
  • OpenOffice also mixes up subject and keywords, like its predecessor, and does so even in the new OpenDocument Format.
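
The HTML and Word cases can be sketched in a few lines of Python using only the standard library. The function names dc_meta_tags and read_core_properties and the sample XML are invented for this example. The sketch assumes that an OpenXML package keeps its core properties in docProps/core.xml with the element names shown; to stay self-contained it parses a hand-written string instead of opening a real .docx file.

```python
from html import escape
import xml.etree.ElementTree as ET

# Namespaces assumed for docProps/core.xml inside an OpenXML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def dc_meta_tags(metadata: dict) -> str:
    """Render one <meta name="DC.*"> line per non-empty attribute."""
    lines = []
    for name, value in metadata.items():
        if value:
            lines.append(
                f'<meta name="DC.{escape(name)}" content="{escape(value)}">'
            )
    return "\n".join(lines)


def read_core_properties(core_xml: str) -> dict:
    """Extract title, subject, description and keywords from core.xml text."""
    root = ET.fromstring(core_xml)
    wanted = {
        "title": "dc:title",
        "subject": "dc:subject",
        "description": "dc:description",
        "keywords": "cp:keywords",  # the extra attribute Word keeps
    }
    result = {}
    for key, path in wanted.items():
        element = root.find(path, NS)
        # Missing or empty elements both become an empty string.
        result[key] = (element.text or "") if element is not None else ""
    return result


# Hand-written sample, not taken from a real document.
SAMPLE_CORE_XML = """<cp:coreProperties
    xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>My dog has fun</dc:title>
  <dc:subject>animal; dog; water</dc:subject>
  <cp:keywords></cp:keywords>
</cp:coreProperties>"""

props = read_core_properties(SAMPLE_CORE_XML)
print(dc_meta_tags({"title": props["title"], "subject": props["subject"]}))
```

In the sample the keywords element is deliberately left empty and the keywords live in subject, which is the convention recommended above; the same subject string can then be reused unchanged for the HTML meta tags.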