The difference between a structured and unstructured wiki

Before you choose a wiki application, you must decide whether you are going to use an unstructured wiki like Wikipedia or a structured wiki like DekiWiki or Confluence.

The difference is simple but has big consequences:
  • Unstructured wikis organize wiki pages by their page name. They have a flat space of wiki pages whose page names are unique across the whole wiki.
  • Structured wikis organize their wiki pages like a file system with directories. Every wiki page represents a directory which may contain sub pages. The page name has to be unique only within one directory.

Disambiguation

Unstructured wikis need a process to resolve the conflicts that occur when articles about two or more different topics could have the same "natural" page title. The page name "ARM" could mean the financial term "Adjustable rate mortgage" or "Advanced RISC Machine". See Disambiguation in Wikipedia for more details. This process is ideal for open communities where many people organize and categorize pages.

Structured wikis avoid this problem through their page structure, e.g. /Finance/ARM or /Computer/ARM. As a consequence, you need a common understanding of the root directory (page) structure and should not change it frequently. An enterprise naturally has this kind of structure because of the way it is organized (Finance, Marketing, Department, Development, etc.).
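The naming difference can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the class names FlatWiki and StructuredWiki and their methods are hypothetical and do not correspond to the API of any real wiki engine.

```python
class FlatWiki:
    """Unstructured wiki: one flat namespace, page names unique wiki-wide."""

    def __init__(self):
        self.pages = {}

    def add(self, name, text):
        if name in self.pages:
            # This is the conflict that a disambiguation process has to resolve.
            raise ValueError(f"page name already taken: {name}")
        self.pages[name] = text


class StructuredWiki:
    """Structured wiki: pages are keyed by their full path, like files."""

    def __init__(self):
        self.pages = {}

    @staticmethod
    def _key(path):
        # "/Finance/ARM" -> ("Finance", "ARM")
        return tuple(part for part in path.split("/") if part)

    def add(self, path, text):
        key = self._key(path)
        if key in self.pages:
            raise ValueError(f"page already exists: {path}")
        self.pages[key] = text

    def children(self, path):
        """Names of the direct sub pages of the given page."""
        parent = self._key(path)
        depth = len(parent) + 1
        return sorted(k[-1] for k in self.pages
                      if len(k) == depth and k[:len(parent)] == parent)


wiki = StructuredWiki()
wiki.add("/Finance/ARM", "Adjustable rate mortgage")
wiki.add("/Computer/ARM", "Advanced RISC Machine")  # no clash: different parent
```

In the flat variant a second page called "ARM" is rejected, while the structured variant accepts both because only the full path has to be unique.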

Authorization

There is no simple way for unstructured wikis to enable advanced authorization for wiki pages.

Most structured wikis provide an advanced authorization mechanism, e.g. one where you can apply authorization rules to a whole subtree at once.
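How a rule on one page can cover a whole subtree can be sketched as a lookup that walks up the page path until it finds a rule. This is a minimal sketch, not how any particular wiki implements it; the function name, the rule table and the group names are all assumptions made for illustration.

```python
def effective_rule(rules, path):
    """Return the rule set on the page itself or on its nearest ancestor.

    `rules` maps page paths to a rule (here simply a group name).
    Returns None if neither the page nor any ancestor carries a rule.
    """
    parts = [part for part in path.split("/") if part]
    while True:
        key = "/" + "/".join(parts)
        if key in rules:
            return rules[key]
        if not parts:
            return None  # reached the root without finding a rule
        parts.pop()      # move one level up the tree


# Hypothetical rule table: one entry protects everything below /Finance.
rules = {"/": "everyone", "/Finance": "finance-team"}
print(effective_rule(rules, "/Finance/ARM"))
print(effective_rule(rules, "/Computer/ARM"))
```

A flat page space has no ancestors to inherit from, so every page would need its own entry.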

WebDAV

If you want to access the files attached to wiki pages not only through a web browser but also through Windows Explorer, WebDAV offers an elegant solution.
Unstructured wikis don’t provide enough navigation information for the WebDAV user.
Structured wikis can map their wiki page structure to a WebDAV directory structure, which is an ideal approach for collaborating on files.
 

Structured Wikis and WebDAV

Wiki        WebDAV  Comments
Confluence  Yes
JSPWiki     Yes
TWiki       Yes*    Apache 2 servers are not supported; hierarchical webs are not supported
XWiki       Yes     Only supports basic authentication

My digital archive

Recently I realized that the oldest documents I still have in my archive are from 1994. Since then they have seen several migrations of computer hardware, operating systems and applications. I think that is a very good basis for reviewing which archiving practices proved themselves over that long a time and which didn’t.

First, my archive is always stored on my computer’s hard disk and has followed all my migration steps over time. I refuse to use backup tools which move the files to external media or compress them into zip or proprietary file formats. The files in the archive are organized by year, which means the root of the archive has just one folder for every year. This eases navigation through the history, and I don’t need a special application to manage the content. The backup of my archive is just a mirror on an external hard drive or, more recently, a NAS. In the past 14 years I needed that backup only 3 times to recover fragmented or deleted files.
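The one-folder-per-year layout is simple enough to sketch in Python. The text does not say how a file’s year is chosen, so this sketch assumes the file’s modification time decides; the function name and the archive root are hypothetical.

```python
import datetime
import os
from pathlib import Path


def year_folder(archive_root, file_path):
    """Return the per-year folder a file would be filed under.

    Assumption: the year is taken from the file's modification time.
    """
    mtime = os.path.getmtime(file_path)
    year = datetime.datetime.fromtimestamp(mtime).year
    return Path(archive_root) / str(year)
```

Because the result is a plain directory path, mirroring the archive to another disk needs nothing beyond an ordinary file copy.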

The main lesson I learned is to save content in files that use mainstream document standards like MS Word and Excel, HTML, PDF, AVI, MP3, MPEG, JPEG, TIFF, SVG and plain ASCII. Today I would suggest open document standards, but they were not available 10 years ago. I know there were standards like LaTeX, but they were mostly for techies and are still not in widespread use today.

Furthermore, I didn’t lose files to hard disk problems, where your files die a slow death (just one old AVI video). I suffered most from applications which use proprietary document formats to store their data.

One negative example is CorelDraw, once my favorite vector graphics program. The version I licensed could not run on Windows XP, which turned my Corel files into meaningless binary files. I spent some time converting the most important ones to mainstream formats, but in the end I lost a lot of my drawings. You can continue this negative hit list with everyday applications for processing and managing images, word documents, emails, notes, calendars, passwords and contact lists, all of which will give you a headache with the next migration or patch of your operating system.

In the end it was always the same: I lost the data only because the viewer was no longer available on the operating system I was using at the time. I have used many operating systems over time (MS DOS 6.X, Windows 3.11, Windows 2000, OS/2, Linux, Windows XP and Mac OS X) and many applications like MS Office, beginning with MS Word 2.0 for MS DOS 🙂, Word Perfect and OpenOffice.

What is the consequence for me now?

  1. Use open standards to save your files, or at least mainstream standards that will still be readable in the next decade
  2. The application-centric approach of the last 20 years is a dead end for long-term archiving and an impressive confirmation of the old Unix rule that everything is a file
  3. Embed all information needed for organizing in the file itself (e.g. keywords for a JPEG file)
  4. Take care which applications you use and check how they store their information
  5. Don’t rely on the features of your operating system and its applications. I promise you will be using something different in 5 years 🙂

How to use title, subject and keywords

Many document formats allow you to add metadata such as title, subject and keywords to a document. Widespread formats that do so are HTML, MS Word and JPEG. Although this is an effective way to organize your computer content, it’s rarely used.
One reason is that even these three attributes have different meanings in HTML, Word and JPEG. Try to find examples on Google of how to fill them in. Let’s try to work one out.

The well-known Dublin Core standard is a good starting point because many metadata schemas use it. Among others, it defines these attributes:

  • title: The name given to the resource. Typically, a Title will be a name by which the resource is formally known, e.g. “My dog has fun”
  • subject: The topic of the content of the resource. Typically, a Subject will be expressed as keywords or key phrases or classification codes that describe the topic of the resource. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme. Separate the keywords with semicolons, e.g. “animal; dog; water”
  • description: Description may include but is not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content
  • source: A Reference to a resource from which the present resource is derived, e.g. “http://dublincore.org/documents/usageguide/elements.shtml”

How to use it in

  • HTML can store metadata in the header section. Initially it supported only title, keywords and description, but it also allows DC attributes to be embedded, e.g. <meta name="DC.title" content="SELFHTML: Meta-Angaben">. Always use the DC attributes.
  • JPEG allows metadata to be stored in an EXIF or XMP container inside the JPEG file. The future-proof standard is XMP, which is the default for most professional applications (e.g. Adobe Photoshop, Apple Aperture). XMP natively supports the Dublin Core attributes.
  • MS Word is, as always, different. By default it provides the metadata fields title, subject, keywords and description, and its old formats don’t support DC. The new document format OpenXML stores DC metadata but also keeps the keywords attribute. Best practice is to ignore the keywords attribute and use subject for keywords instead.
  • OpenOffice also mixes up subject and keywords like its precursor, even in the new OpenDocument Format.
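For the HTML case, the DC attributes can be generated with a few lines of Python. This is a hedged sketch: the function name and parameters are hypothetical, and it covers only the four Dublin Core attributes listed above, using the "DC." prefix from the HTML example.

```python
from html import escape


def dc_meta_tags(title, keywords, description=None, source=None):
    """Build HTML <meta> tags for title, subject, description and source."""
    # Subject holds the keywords, separated by semicolons as recommended above.
    fields = [("DC.title", title), ("DC.subject", "; ".join(keywords))]
    if description:
        fields.append(("DC.description", description))
    if source:
        fields.append(("DC.source", source))
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in fields
    )


print(dc_meta_tags("My dog has fun", ["animal", "dog", "water"]))
```

The output is meant to be pasted into the header section of the page; escaping the values keeps quotes in a title from breaking the tag.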

History of documents

Following my recent post, I think future software applications should embed the context from which they received a document before saving it to the local hard disk.

For example, downloading images always means keeping track of where you got them, in case you need more images from the same source or for legal reasons. I’m a big fan of icons and collect them for private use when they are something special. The simplest way is to collect them in one folder per website. It would be better to store the information in the metadata of the file itself.

When saving attachments from an email to the local hard disk, the email client should also embed the sender’s email address in the local file. A lot of document types allow this kind of information to be stored. I think relying on an application to keep track of which document originates from which email is the wrong approach.

The reason why:

  • when you download images from e.g. stock photo sites, they often don’t contain the original website. If you want to find more photos of the same kind later, it’s easier with embedded data
  • when you send the document to a friend, the document on your hard disk would record from whom you got the file (email address) and where it was downloaded
  • it fits modern concepts of organizing the content on your hard disk with desktop search engines like Google Desktop, MS Desktop Search, Spotlight or Beagle. All of them extract file metadata to put files into some context so you can find them easily.

Hierarchy and Context of objects


There are some arguments for why humans structure their daily life in objects which are organized in hierarchies and related to a certain context.

One example is the hierarchy: country, (federal) state, town, district, street, house number and person. It’s currently hip to measure your geographical position in longitude and latitude, which is fine for geotagging images but tells you little more than the distance between two people.

The two geotags “48° 13′ North; 16° 22′ East” and “48° 09′ North; 17° 07′ East” just tell you that the two people are about 56 km apart in a straight line.
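The straight-line distance between two geotags can be checked with the haversine formula. The sketch below assumes a spherical Earth with a mean radius of 6371 km; the function name is hypothetical, and the coordinates are the two geotags from the text converted to decimal degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# 48° 13' N, 16° 22' E and 48° 09' N, 17° 07' E in decimal degrees
distance = haversine_km(48 + 13 / 60, 16 + 22 / 60, 48 + 9 / 60, 17 + 7 / 60)
print(f"{distance:.1f} km")
```

The number is all the coordinates give you; everything in the following paragraphs comes from knowing the hierarchy and context behind them.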

Expressed in human words, you get more information. One person lives in Vienna, the other in Bratislava. You can derive a lot more from that:
– Hierarchy: Europe>Austria>Vienna and Europe>Slovak Republic>Bratislava
– Context: it’s very likely that they don’t speak the same language (German and Slovak), or that in the current financial crisis the Austrian will have less trouble because his currency is the Euro and not the Slovak koruna.

I think we can learn a lot from this kind of example about how to organize and access the content on our computers. The question is how we will work on a future desktop.

Cuboid living rooms and buildings

Yesterday a friend told me an interesting observation about the rooms and buildings people live in. They always have flat surfaces and are mostly cuboid. This observation is independent of culture, type of usage, the material they are built of, or time in history (counting from when we started to “build” houses, not earth holes :-).

The interesting part is that nothing in nature is built like that (except maybe some crystals). A surface outside in nature is always uneven, winding, interrupted by gaps and irregular in its geometry, like a wall made of trees or bushes. A cave is the only natural equivalent to a room, and its geometry and surface are also very irregular.

As to why this happens, he suggested that it eases orientation for people inside the room. I think another reason could be that our rooms and buildings are a projection of how our mind is organized, and therefore we always design them in the same way even though there is nothing comparable in nature. Very geometrical and organized.

Sure, there are arguments from efficiency or stability: they need less material, are more stable, make it easier to place furniture, etc. But these arguments are relatively new.

I think this is an interesting observation because it also has an impact on how human interfaces for computers should be built. An operating system and its programs still follow the same old concept that the tool comes first. But in real life the paper document is more important than your pencil, especially when you try to find it again after a year. When I add something to the document, I use whichever pencil or ballpoint pen is at hand.

Do you know of any studies on this topic that would deepen our understanding?