CMIS Domain Model

Posted on: Sun, 2009-11-22 19:00

Repository

CMIS is defined around the interactions between a client application and a single repository.  The repository is a container of objects (documents, folders, etc.).  To be CMIS compliant, a repository must support certain mandatory functionality and may support some optional capabilities.

In order to get started, the client will need to know the starting URI needed to access the Repository via the desired binding (SOAP or REST).  Armed with that URI, the client is then in a position to:

  • Discover the capabilities of the repository.
  • Discover and navigate through the objects in the repository.
  • Discover the types of object that are in the repository.
  • Discover the vendor and CMIS version of the repository.

Repository Capabilities

There are 14 optional capabilities that a CMIS compliant repository may support.  The reported level of support describes only what is available to a CMIS client.  A capability that is listed as unavailable may still be available via proprietary interfaces to the repository.  Below is a list of the optional capabilities:

  • Navigation Capabilities
    • Get Descendants
      Supports the get descendants method which allows the client to retrieve objects of a folder and its subfolders in one call. The valid values for this capability are True or False.
    • Get Folder Tree
      Supports the get folder tree method which allows the client to retrieve subfolders of a folder and its subfolders in one call. The valid values for this capability are True or False.
  • Object Capabilities
    • Content Stream Updatability
      Support for updating of the content stream (the actual content or file) of a document object.  The valid values are
      • None: The content stream cannot be updated. This does not prevent updating content by checking in new versions of a document object.
      • Pwc Only: The content stream can only be updated on a private working copy when a document is checked out.
      • Anytime: The content stream can be updated at any time.
    • Change Log
      Support for a change log which keeps track of all changes to objects in the repository.  The valid values are shown below:
      • None: No change log.
      • Object IDs only: The change log contains the Object ID of the changed object and the type of change.
      • Properties: The change log contains the object ID and the properties for the changed object.
      • All: The change log contains more than just the object IDs and properties.
    • Renditions
      If the repository supports renditions, it may allow clients to read renditions.  The valid values are None if the repository does not allow clients to access its renditions and Read if it does allow the clients to read the renditions.
  • Filing Capabilities
    • Multi Filing
      Supports the ability to file the same document object in multiple folders. The valid values for this capability are True or False.
    • Un-Filing
      Supports the ability to have a document object that is not filed in any folder. The valid values for this capability are True or False.
    • Version Specific Filing
      Supports the ability to have different versions of the same document filed in different folders. The valid values for this capability are True or False.
  • Versioning Capabilities
    • PWC Updateable
      Indicates whether or not the Private Working Copy (checked out version) of a document is updatable. The valid values for this capability are True or False. If the private working copy is not updatable, then the object can only be updated on check in.
    • PWC Searchable
      Indicates whether or not the private working copy of a document can be included in a query's search results.  The valid values for this capability are True or False.
    • All versions searchable
      Indicates whether different versions of the document can be included in a query's search results.  The valid values for this capability are True or False.  If this is False, then only the current version of any document can be included in the search results.
  • Query Capabilities
    • Basic Query
      What level of support the repository has for CMIS Queries. The valid values are:
      • None: No support for CMIS queries
      • Metadata only: Only metadata queries are supported.
      • Fulltext only: Only full text queries supported.
      • Both Separate: Both are supported but you cannot query for both in the same query.
      • Both Together: You can issue a query that searches both metadata and text at the same time.
    • Join Capabilities
      What level of join support exists.  The valid values (None, Inner Only, and Inner and Outer) are self-explanatory.
  • ACL Capabilities
    • ACL
      What level of support the repository provides for a CMIS client regarding ACLs.  The valid values are:
      • None: No support for ACLs via CMIS
      • Discover: CMIS clients may discover and read ACLs but not modify them.
      • Manage: CMIS clients can manage ACLs
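
The capability values above lend themselves to a small client-side model. The sketch below is a hypothetical Python illustration, not the API of any CMIS client library: the class names, field names and helper functions are all assumptions, and the enumeration values simply reuse the labels from the list above. It shows how a client might consult the Content Stream Updatability and Basic Query capabilities before attempting an operation.

```python
from dataclasses import dataclass
from enum import Enum


class ContentStreamUpdatability(Enum):
    NONE = "None"
    PWC_ONLY = "Pwc Only"
    ANYTIME = "Anytime"


class QuerySupport(Enum):
    NONE = "None"
    METADATA_ONLY = "Metadata only"
    FULLTEXT_ONLY = "Fulltext only"
    BOTH_SEPARATE = "Both Separate"
    BOTH_TOGETHER = "Both Together"


@dataclass(frozen=True)
class RepositoryCapabilities:
    """Hypothetical holder for three of the 14 optional capabilities."""
    get_descendants: bool
    content_stream_updatability: ContentStreamUpdatability
    query: QuerySupport


def can_update_content(caps, is_private_working_copy):
    """True if the content stream may be overwritten in the given state."""
    mode = caps.content_stream_updatability
    if mode is ContentStreamUpdatability.ANYTIME:
        return True
    if mode is ContentStreamUpdatability.PWC_ONLY:
        return is_private_working_copy
    return False  # NONE: content changes only by checking in a new version


def can_combine_metadata_and_fulltext(caps):
    """True if one query may search metadata and full text together."""
    return caps.query is QuerySupport.BOTH_TOGETHER
```

A client holding such a structure can branch on it up front instead of issuing a call and handling the failure afterwards.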

Objects

Objects represent the entities that are in the repository.  Each object has a type.  There are 4 base types defined by CMIS: Documents, Folders, Relationships and Policies.  Every object in the repository will be derived from one of these types.  An object will be identified by an Object ID and will have a set of properties associated with it.  The properties that an object has are defined by its Object Type. The CMIS specification does not specify how new object types are created.

In addition to the metadata properties that define object types, there are some additional attributes that govern the behavior of objects within the repository.  They are listed below:

  • Versionable: Can be versioned (only document objects are versionable).
  • Fileable: Can be filed in a folder (folder objects must be fileable, relationship objects are not fileable).
  • Queryable: Can be queried (is mapped to a virtual table). Only folder and document objects can be queryable.
  • Controllable-Policy: Can have a policy applied to it. Policies are not controllable.
  • Controllable-ACL: Can have an ACL applied to it. Policies are not controllable.
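
The base-type rules attached to these attributes can be written down as a simple check. The following is a minimal sketch in Python under stated assumptions: `TypeAttributes` and `violations` are hypothetical names invented for this example, and the rules encoded are only the ones stated in this article, not a complete reading of the specification.

```python
from dataclasses import dataclass

BASE_TYPES = ("document", "folder", "relationship", "policy")


@dataclass(frozen=True)
class TypeAttributes:
    """Hypothetical record of the behavioral attributes of an object type."""
    base_type: str
    versionable: bool = False
    fileable: bool = False
    queryable: bool = False
    controllable_policy: bool = False
    controllable_acl: bool = False


def violations(t):
    """Return the rules from the text that this attribute combination breaks."""
    problems = []
    if t.base_type not in BASE_TYPES:
        problems.append("unknown base type")
    if t.versionable and t.base_type != "document":
        problems.append("only document objects can be versionable")
    if t.base_type == "folder" and not t.fileable:
        problems.append("folder objects must be fileable")
    if t.base_type == "relationship" and t.fileable:
        problems.append("relationship objects are not fileable")
    if t.queryable and t.base_type not in ("document", "folder"):
        problems.append("only folder and document objects can be queryable")
    if t.base_type == "policy" and (t.controllable_policy or t.controllable_acl):
        problems.append("policies are not controllable")
    return problems
```

For example, `violations(TypeAttributes("folder"))` reports that folder objects must be fileable, while a versionable, fileable, queryable document type passes with an empty list.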

Properties

Properties are named values that are associated with each object type.  Properties are of a specific type (date, integer, text, etc.).  Properties can be single valued or multi valued, required or optional. Some properties may be read only or only updatable at certain times. One point to note is that a property can have several different names associated with it: its display name, ID and query name may all be different.
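
To make the distinction between the three names and the cardinality rules concrete, here is a small hypothetical sketch in Python. `PropertyDefinition`, `is_valid` and the `acme:` identifiers are assumptions made up for illustration; a real repository exposes property definitions through its type definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyDefinition:
    """Hypothetical property definition with its three distinct names."""
    property_id: str    # stable identifier
    display_name: str   # shown to users
    query_name: str     # used in queries
    data_type: type     # Python stand-in for the CMIS property type
    multi_valued: bool = False
    required: bool = False


def is_valid(defn, value):
    """Check a candidate value against cardinality, requiredness and type."""
    if value is None:
        return not defn.required
    if defn.multi_valued:
        return isinstance(value, list) and all(
            isinstance(v, defn.data_type) for v in value
        )
    return isinstance(value, defn.data_type)


invoice_number = PropertyDefinition(
    property_id="acme:invoiceNumber",
    display_name="Invoice Number",
    query_name="invoice_number",
    data_type=int,
    required=True,
)
```

With this definition, `is_valid(invoice_number, 42)` holds, while `None` and the string `"42"` are both rejected.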

Document Objects

Document objects represent the entities that we really come to the repository for: the content. Document objects (and only document objects) may have Content Streams (the actual file associated with the document).  In some cases it makes sense to have document objects without content streams. Content streams exist only as part of a containing document object.  The content stream will have a mimetype associated with it. In addition to a content stream, a document object may contain one or more renditions (alternate views of the content).

Document objects are also the only objects that are versionable, or for which versions can be exposed via CMIS. Each version of a document object will have its own object ID.  All versions of a document make up a Version Series and will share a Version Series ID.
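
The relationship between object IDs and the Version Series ID can be illustrated with a short sketch. The Python below is a hypothetical model (the `DocumentVersion` class, the `group_by_series` helper and the ID strings are assumptions): every version carries its own object ID, and versions are grouped by the Version Series ID they share.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentVersion:
    """Hypothetical view of one version of a document object."""
    object_id: str          # unique per version
    version_series_id: str  # shared by all versions of the same document
    label: str


def group_by_series(versions):
    """Map each Version Series ID to the object IDs of its versions."""
    series = defaultdict(list)
    for version in versions:
        series[version.version_series_id].append(version.object_id)
    return dict(series)


versions = [
    DocumentVersion("doc-1", "series-A", "1.0"),
    DocumentVersion("doc-2", "series-A", "2.0"),
    DocumentVersion("doc-3", "series-B", "1.0"),
]
```

Here `group_by_series(versions)` places `doc-1` and `doc-2` under `series-A` and `doc-3` under `series-B`.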

Folder Objects

Folder objects are containers used to organize the document objects within the repository.  With the obvious exception of the root folder, folder objects must have one and only one parent folder.  A folder has a folder path that is automatically generated, representing its place in the repository's hierarchy. A folder object may be defined in a way that limits which object types it can contain (for example, an accounting related folder could be defined to only contain document objects of type invoice). A folder object may have renditions (for example a folder may have a thumbnail as a rendition representing what is in the folder).
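
A rough sketch of these folder rules follows. It is a hypothetical Python model, not repository code: the `Folder` class, its field names and the type names `invoice` and `memo` are assumptions. The path is derived from the single parent chain, and an optional set restricts which object types a folder accepts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Folder:
    """Hypothetical folder with exactly one parent (None only for the root)."""
    name: str
    parent: Optional["Folder"] = None
    allowed_child_types: Optional[frozenset] = None  # None means unrestricted

    def path(self):
        """Build the path by walking up the parent chain to the root."""
        if self.parent is None:
            return "/"
        parent_path = self.parent.path()
        if parent_path == "/":
            return "/" + self.name
        return parent_path + "/" + self.name

    def accepts(self, object_type):
        """True if an object of this type may be filed in the folder."""
        return (
            self.allowed_child_types is None
            or object_type in self.allowed_child_types
        )


root = Folder("")
accounting = Folder("accounting", root, frozenset({"invoice"}))
year = Folder("2009", accounting)
```

In this example `year.path()` is `/accounting/2009`, and `accounting.accepts("memo")` is false because that folder is restricted to invoices.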

Relationship Objects

Relationship objects define non-invasive two-way relationships between two objects (source and target) in the repository. Manipulating a relationship should not change either the source or the target object.  Relationship objects are optional for CMIS compliant repositories.

Policy Objects

Policy objects are optional repository specific objects that can be applied to controllable objects. The behavior of policies is not modeled by the CMIS specification.  A single policy object may be applied to multiple controllable objects and a single controllable object may have multiple policies applied to it.  In order to preserve referential integrity, a policy object cannot be deleted if it is applied to one or more controllable objects.
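
The many-to-many association and the deletion rule can be sketched as follows. This is a hypothetical in-memory model in Python; `PolicyRegistry`, `PolicyInUseError` and the method names are assumptions for illustration and do not correspond to CMIS service names.

```python
class PolicyInUseError(Exception):
    """Raised when deleting a policy that is still applied to an object."""


class PolicyRegistry:
    """Hypothetical tracker of which policies are applied to which objects."""

    def __init__(self):
        self._applied = {}  # policy ID -> set of controllable object IDs

    def create(self, policy_id):
        self._applied.setdefault(policy_id, set())

    def apply(self, policy_id, object_id):
        self._applied[policy_id].add(object_id)

    def remove(self, policy_id, object_id):
        self._applied[policy_id].discard(object_id)

    def delete(self, policy_id):
        # Referential integrity: refuse while any object still uses the policy.
        if self._applied[policy_id]:
            raise PolicyInUseError(policy_id)
        del self._applied[policy_id]

    def policies_for(self, object_id):
        """Return the sorted IDs of all policies applied to one object."""
        return sorted(
            policy_id
            for policy_id, objects in self._applied.items()
            if object_id in objects
        )
```

A policy applied to two documents must be removed from both before `delete` succeeds; until then the call raises `PolicyInUseError`.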

Renditions

Renditions are alternate views of the content stream such as previews, PDF renditions and thumbnails. It is also possible for objects without content streams (e.g. folders) to have a thumbnail rendition. Rendition attributes must include a Stream ID and a mimetype. Additional common attributes for a rendition are length, title and kind.  The only kind of rendition that the CMIS specification defines is a thumbnail.  Thumbnail renditions should only include height and width as attributes.  The repository may define its own rendition types in addition to thumbnails.

Renditions cannot be queried unless they have a Rendition Document ID, which allows them to be exposed as documents.

Access Control

Access control is used to specify who can do what with an object in the repository.  If the repository supports access control then access control lists are applied to each object within the repository.  Access control lists specify what types of access or permissions (read, write, etc.) to an object are given to groups or users (known collectively as principals).  CMIS defines three permissions: cmis:read, cmis:write and cmis:all.  When setting an ACL, cmis:user can be used to represent the current authenticated user.
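
A simplified permission check might look like the sketch below. It is a hypothetical Python illustration with several assumptions: an ACL is modeled as a list of (principal, permission) pairs, cmis:all is treated as granting every permission, cmis:write is not assumed to imply cmis:read, and the function names are invented for this example. Real repositories may map these permissions differently.

```python
READ, WRITE, ALL = "cmis:read", "cmis:write", "cmis:all"
CURRENT_USER = "cmis:user"


def resolve(acl, current_user):
    """Replace the cmis:user placeholder with the authenticated user's name."""
    return [
        (current_user if principal == CURRENT_USER else principal, permission)
        for principal, permission in acl
    ]


def is_allowed(acl, principals, permission):
    """Check a permission for a user.

    `principals` is the set containing the user's own name plus the names
    of any groups the user belongs to.
    """
    return any(
        principal in principals and granted in (permission, ALL)
        for principal, granted in acl
    )


# alice sets an ACL using the placeholder; the accounting group may read.
acl = resolve([(CURRENT_USER, ALL), ("accounting", READ)], "alice")
```

With this ACL, alice can write because she holds cmis:all, while a member of the accounting group can read but not write.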

Change Log

The repository may have an optional change log that contains an entry for each change made to content in the repository.  Each entry has a Change Log Token.  The repository must expose the latest change log token if it supports change logs. Change log entries include the object ID and the change type (created, updated, deleted or security). Armed with a change log token, a client can retrieve the list of objects that have changed since the change identified by that token was made.

A change log need not contain every change for the life of the repository, but it must contain every change made since the earliest change in the log.
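
The token mechanics can be sketched with a small in-memory log. This Python model is hypothetical: tokens are modeled as increasing integers purely for convenience (a real repository's tokens should be treated as opaque), the class and method names are assumptions, and "since" is interpreted here as strictly after the given token.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeEntry:
    token: int
    object_id: str
    change_type: str  # "created", "updated", "deleted" or "security"


class ChangeLog:
    """Hypothetical change log with integer tokens."""

    def __init__(self):
        self._entries = []
        self._next_token = 1

    def record(self, object_id, change_type):
        """Append an entry and return its change log token."""
        entry = ChangeEntry(self._next_token, object_id, change_type)
        self._entries.append(entry)
        self._next_token += 1
        return entry.token

    def latest_token(self):
        """Token of the newest entry, or None if the log is empty."""
        return self._entries[-1].token if self._entries else None

    def truncate_before(self, token):
        """Drop older history; every change from `token` onward is kept."""
        self._entries = [e for e in self._entries if e.token >= token]

    def changed_objects_since(self, token):
        """Object IDs of all retained changes made after the given token."""
        return [e.object_id for e in self._entries if e.token > token]
```

Truncation only removes entries from the old end of the log, so the retained portion still contains every change made since its earliest entry.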