Building a CMIS REST Client Application using PHP

Now that the CMIS 1.0 Draft Specification is available for public review, many developers are starting to ask "How can I build applications that leverage CMIS?" If you are inclined to use SOAP, then you can probably use your favorite SOAP tools to build a client application. If you are looking to leverage the REST binding, then you may need to do a bit more work.

There are a number of efforts under way to build REST-based client libraries for CMIS, but it will take time. This cuts to the core differences between REST and SOAP. The standards around SOAP and WSDL ensure that a lot of the scaffolding can be put in place pretty quickly for development. The CMIS SOAP binding document is about 15 pages long, and all but 1 or 2 of those pages are boilerplate administrative information. REST, on the other hand, provides a lot more flexibility, and hence the CMIS REST binding document is over 100 pages long. Any API built to the CMIS spec will have to express those 100+ pages of content as source code. In other words, it will take time.

Understanding that, I thought that it would be helpful to give an example of how one would go about developing a REST-based CMIS API. I have put together a very rudimentary PHP toolkit that can be used to access a CMIS repository. It is far from being a proper API, but understanding the approach that I have taken should help other developers build applications that leverage CMIS-compliant repositories even if there is no available API.

The code that I am using can be retrieved from my Google Code project, which is called CMIS PHP Client Programming. I am making it available under GPL Artistic license. Feel free to use it as you wish. There are three files in the repository.

The PHP class for accessing the repository provides the following functionality:

It provides a connection to the repository, handling basic authentication and HTTP requests.
It allows for the population of URI or XML templates by combining a template with placeholders with a hash map containing the values to be inserted in the placeholders.
It provides utilities for converting some of the AtomPub responses into data structures that are more convenient to work with from a programming perspective.

I think that the 3 items above are key to building any CMIS/REST client app, whether you are building an API or building an application without an API. As we walk through the sample code, this will become apparent.
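To make the second item concrete, here is a minimal sketch of template population. The function name fill_uri_template and the shortened example URL are my own placeholders for illustration, not the wrapper's actual code. The sketch inserts values verbatim, so the caller is expected to URL encode them first, and it replaces any placeholder that has no value with an empty string.

```php
<?php
// Hypothetical helper (not taken from the wrapper class): replace each
// {name} placeholder in a template with the matching value from a hash map.
function fill_uri_template($template, array $values)
{
    return preg_replace_callback(
        '/\{([A-Za-z0-9_]+)\}/',
        function ($match) use ($values) {
            // Placeholders without a supplied value become empty strings.
            return isset($values[$match[1]]) ? $values[$match[1]] : '';
        },
        $template
    );
}

// Shortened, made-up template in the style of a CMIS objectbypath template.
$template = 'http://localhost:8080/cmis/p?path={path}&filter={filter}';
echo fill_uri_template($template, array('path' => '/Sites')), "\n";
```

Dropping unfilled placeholders leaves optional query parameters empty, so the caller only has to supply the values it cares about.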

Building the Client App

The CMIS REST binding relies on AtomPub documents including:

Atom Feeds
Atom Services
Atom Entries

Additionally, it relies on some CMIS-specific documents such as:

CMIS Queries
CMIS ACLs
CMIS Allowable Actions
Atom Feeds with CMIS Hierarchy Extensions

The XML documents that come back generally contain the following buckets of information:

Identifying information (object IDs, paths, names, etc.)
Property information (ACLs, Size, Ownership, Last Update time and custom properties)
Instructions on how to interact with the object that is represented (links to check in, download, update, get more information, find parents or children).

The key ingredient to being able to successfully navigate through the repository is understanding how to leverage the links that are presented. The REST binding maps the various methods to the links that are presented for each object. The folder listing application that I wrote leverages that aspect of the REST binding. Below we will walk through the source code.

Connecting to the repository

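require_once('cmis_repository_wrapper.php');
// Command-line arguments: repository URL, username, password, folder path and an optional debug level
if (count($_SERVER["argv"]) < 5) {
        print "Usage: php -f cmis_ls.php <url> <username> <password> <folder> [debug]\n";
        exit(1);
}
$repo_url = $_SERVER["argv"][1];
$repo_username = $_SERVER["argv"][2];
$repo_password = $_SERVER["argv"][3];
$repo_folder = $_SERVER["argv"][4];
// The debug level is optional and defaults to 0 (no debugging)
$repo_debug = isset($_SERVER["argv"][5]) ? $_SERVER["argv"][5] : 0;

$client=new CMISRepositoryWrapper($repo_url,$repo_username,$repo_password);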

The $client object that is returned does the following:

Connects to the repository using the URL, username and password given.
Executes getRepositoryInfo to retrieve all of the information about the repository including:
- The links needed to interact with the repository (collections, links and uritemplates). These URLs are stored in hash tables.
- The capabilities of the repository
- Identifying information about the repository (vendor, version, repository Id).
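As a rough illustration of the connection step, the sketch below issues an authenticated GET using HTTP basic authentication. It is an assumption-laden stand-in, not the wrapper's code: it uses PHP's built-in stream wrappers instead of curl purely to keep the example dependency-free, and the function names basic_auth_context_options and http_get are hypothetical.

```php
<?php
// Hypothetical helper: build stream context options for a GET request
// that carries an HTTP basic authentication header.
function basic_auth_context_options($username, $password)
{
    return array(
        'http' => array(
            'method' => 'GET',
            'header' => 'Authorization: Basic ' . base64_encode($username . ':' . $password),
            // Return the response body even for 4xx/5xx so errors can be inspected.
            'ignore_errors' => true,
        ),
    );
}

// Hypothetical helper: fetch a URL and return the response body (false on failure).
function http_get($url, $username, $password)
{
    $context = stream_context_create(basic_auth_context_options($username, $password));
    return file_get_contents($url, false, $context);
}
```

A call such as http_get($repo_url, $repo_username, $repo_password) would then return the AtomPub service document as a string, ready to be parsed.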

One of the uritemplates provided allows the client to retrieve an object by path. An example XML fragment is shown below; note that the type is objectbypath.

    <cmisra:uritemplate>
        <cmisra:template>http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/arg/p?path={path}&amp;filter={filter}&amp;includeAllowableActions={includeAllowableActions}&amp;includePolicyIds={includePolicyIds}&amp;includeRelationships={includeRelationships}&amp;includeACL={includeACL}&amp;renditionFilter={renditionFilter}</cmisra:template>
        <cmisra:type>objectbypath</cmisra:type>
        <cmisra:mediatype>application/atom+xml;type=entry</cmisra:mediatype>
    </cmisra:uritemplate>

We map all of the repository information into an object called workspace, which remains part of the $client data structure.
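One way the uritemplates could be collected into a hash table is sketched below using DOMDocument and DOMXPath. This is my own illustration rather than the wrapper's implementation; the function name extract_uri_templates and the trimmed-down sample document are assumptions, while the cmisra namespace URI is the one defined for the CMIS 1.0 REST binding.

```php
<?php
// Hypothetical helper: map each cmisra:uritemplate's type to its template URL.
function extract_uri_templates($xml)
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('cmisra', 'http://docs.oasis-open.org/ns/cmis/restatom/200908/');

    $templates = array();
    foreach ($xpath->query('//cmisra:uritemplate') as $node) {
        $type = $xpath->evaluate('string(cmisra:type)', $node);
        $templates[$type] = $xpath->evaluate('string(cmisra:template)', $node);
    }
    return $templates;
}

// Trimmed-down, made-up service document fragment for illustration only.
$xml = <<<'XML'
<app:workspace xmlns:app="http://www.w3.org/2007/app"
               xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
  <cmisra:uritemplate>
    <cmisra:template>http://localhost:8080/cmis/p?path={path}&amp;filter={filter}</cmisra:template>
    <cmisra:type>objectbypath</cmisra:type>
    <cmisra:mediatype>application/atom+xml;type=entry</cmisra:mediatype>
  </cmisra:uritemplate>
</app:workspace>
XML;

$templates = extract_uri_templates($xml);
echo $templates['objectbypath'], "\n";
```

Note that the XML parser decodes the escaped ampersands, so the stored template contains plain "&" characters and can be used directly as a URL once the placeholders are filled in.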

Getting an object by its path

Since we are looking to list a specific folder, and we pass the name of the folder into the program ($repo_folder), we need to be able to find that folder so that we can eventually get a list of its children. The code to retrieve the object is shown below:

// Use the URL template, but fill in the "path" -- this gives you the URL needed to retrieve the folder object
$folder_url =  $client->processTemplate($client->workspace->uritemplates['objectbypath'],array("path" => $repo_folder));
// Retrieves the atompub feed from the repository  -- it is a feed with just one entry but still it is a feed.
$ret = $client->doGet($folder_url);
// Convert the feed into an array of data structures
$objs=$client->extractObjectFeed($ret->body);
// Get the first (and only object) in the array
$my_folder=$objs->objectList[0];

Getting the Child Objects

The last step is to get the child objects and display some information about them to the screen.

Before we show the source code, the XML fragment below shows the list of links that were returned for the folder object:

<link rel="self" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730"/>
<link rel="edit" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730"/>
<link rel="http://docs.oasis-open.org/ns/cmis/link/200908/allowableactions" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730/allowableactions"/>
<link rel="http://docs.oasis-open.org/ns/cmis/link/200908/relationships" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730/rels"/>
<link rel="down" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730/children" type="application/atom+xml;type=feed"/>
<link rel="down" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730/descendants" type="application/cmistree+xml"/>
<link rel="http://docs.oasis-open.org/ns/cmis/link/200908/foldertree" href="http://localhost:8080/alfresco/service/cmis/s/workspace:SpacesStore/i/68363b49-68fa-46c0-a269-3695fa54b730/tree" type="application/cmistree+xml"/>
<link rel="describedby" href="http://localhost:8080/alfresco/service/cmis/type/cmis:folder"/>
<link rel="service" href="http://localhost:8080/alfresco/service/cmis"/>

You will note that there are two "down" links: one is for the children and the other is for the descendants. As far as I know, that is the only instance where there are two links with the same "rel" attribute but different "type" attributes. For our implementation, rather than coming up with some sort of two-dimensional hash map, when we create the hash map the descendants link is indexed under "down-tree". The code to get and display the children is shown below.

// get the URL that we will use to get all of the children
$children_url = $my_folder->links['down'];
// Get AtomPub feed representing the child objects
$ret=$client->doGet($children_url);
// Convert the feed into a list of data structures
$objs=$client->extractObjectFeed($ret->body);
// Iterate through the list -- displaying the type (folder/document) and name of each object
foreach ($objs->objectList as $obj) {
        if ($obj->properties['cmis:baseTypeId'] == "cmis:document") {
                print "Document: " . $obj->properties['cmis:name'] . "\n";
        } elseif ($obj->properties['cmis:baseTypeId'] == "cmis:folder") {
                print "Folder: " . $obj->properties['cmis:name'] . "\n";
        } else {
                print "Unknown Object Type: " . $obj->properties['cmis:name'] . "\n";
        }
}
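The "down-tree" indexing described earlier could look something like the sketch below. It is a simplified illustration under my own assumptions (the function name index_links and the shortened URLs are made up), not a copy of the wrapper's code.

```php
<?php
// Hypothetical helper: index an Atom entry's links by their rel attribute.
// The descendants link (rel="down" with the cmistree media type) is stored
// under "down-tree" so it does not overwrite the children link.
function index_links(DOMElement $entry)
{
    $links = array();
    foreach ($entry->getElementsByTagNameNS('http://www.w3.org/2005/Atom', 'link') as $link) {
        $rel = $link->getAttribute('rel');
        if ($rel === 'down' && $link->getAttribute('type') === 'application/cmistree+xml') {
            $rel = 'down-tree';
        }
        $links[$rel] = $link->getAttribute('href');
    }
    return $links;
}

// Made-up entry with shortened URLs, for illustration only.
$entry_xml = <<<'XML'
<entry xmlns="http://www.w3.org/2005/Atom">
  <link rel="self" href="http://localhost:8080/cmis/i/123"/>
  <link rel="down" href="http://localhost:8080/cmis/i/123/children" type="application/atom+xml;type=feed"/>
  <link rel="down" href="http://localhost:8080/cmis/i/123/descendants" type="application/cmistree+xml"/>
</entry>
XML;

$doc = new DOMDocument();
$doc->loadXML($entry_xml);
$links = index_links($doc->documentElement);
echo $links['down'], "\n";
echo $links['down-tree'], "\n";
```

Keeping a flat map means the common case, following the children link, stays a single lookup on 'down'.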

Running the example

Go to the project site and download cmis_ls.php and cmis_repository_wrapper.php.

Execute one of the following commands:

php -f cmis_ls.php   http://localhost:8080/alfresco/service/api/cmis  admin admin / 2 #run against the company home folder of your local instance with full debugging
php -f cmis_ls.php   http://localhost:8080/alfresco/service/api/cmis  admin admin /Sites 1 #run against the Share Sites folder of your local instance with partial debugging
php -f cmis_ls.php   http://localhost:8080/alfresco/service/api/cmis  admin admin /Data+Dictionary #run against the Data Dictionary folder of your local instance with no debugging
php -f cmis_ls.php   http://cmis.alfresco.com/service/api/cmis  admin admin / 2 #run against the company home folder of alfresco's test instance with full debugging
php -f cmis_ls.php   http://cmis.alfresco.com/service/api/cmis  admin admin /Sites 1 #run against the Share Sites folder of alfresco's test instance with partial debugging
php -f cmis_ls.php   http://cmis.alfresco.com/service/api/cmis  admin admin /Data+Dictionary #run against the Data Dictionary folder of alfresco's test instance with no debugging

Limitations of the Wrapper Class

If the folder path has spaces in it, you must URL encode the folder path, i.e. /Data Dictionary --> /Data+Dictionary
This will not work on pre-CMIS 1.0 repositories.
There is virtually no error checking.
Your version of PHP must support DOMDocument and curl.
I have only tested this with GET requests.
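Until the wrapper handles encoding itself, a small helper along the following lines could prepare the folder path before it is passed in. The function name encode_cmis_path is hypothetical; the sketch encodes each path segment separately so that the slashes are preserved.

```php
<?php
// Hypothetical helper: URL encode each segment of a folder path while
// leaving the "/" separators intact.
function encode_cmis_path($path)
{
    return implode('/', array_map('urlencode', explode('/', $path)));
}

echo encode_cmis_path('/Data Dictionary'), "\n";
```

Encoding the whole path in one call would also escape the slashes, which is why the path is split into segments first.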

Other Things To Do

Feel free to write other test utilities with the CMIS Repository Wrapper. I will be following up later with more information about the wrapper itself, but in the meantime feel free to look through the source code. Let me know what you think about it.

The client should use domain language

Submitted by Anonymous on Tue, 2009-12-01 18:36.

Richard,

Nice PHP example. I know it is just a starting point, but just a big picture question: Don't you think the client should be a little more high level than what you are showing here? I'm thinking it should use domain language like "getRepositoryInfo", "getFolder", and "getChildren". I don't think developers using the client library should have to know about "workspaces", "templates", "gets", or, really, even "feeds".

If all client libraries that use the AtomPub Bindings try to stick to the services listed in the services section of part 1 of the spec (like the Web Services binding does) then it will make it a lot easier for people to jump between different client library implementations.

Thoughts?

Jeff Potts

http://ecmarchitect.com

Absolutely!

Submitted by richmck on Tue, 2009-12-01 23:00.

Hey Jeff,

Very good observation. This is not a full blown, proper API. I wanted this to show some of what goes on under the covers regarding the REST binding. I have seen some developers leveraging the CMIS REST Services of their favorite ECM Product by "figuring out" which URLs are being used and hard coding them into their applications. This will help those early adopters that inevitably will jump the gun and dive in before the APIs catch up. By taking this approach, their application will be able to work properly across all CMIS compliant repositories.

I am also not totally convinced that all development efforts will want to use a full blown API. In some cases, an existing framework (like Drupal or Moodle) may want to add the ability to have a CMIS repository as a back end store. In those cases, the project might opt to build their interface to the CMIS repo using the lower level components. In these cases, they will probably expose the repository using their own patterns (the ones that their development community are accustomed to) rather than exposing the repository using the domain language interfaces.