RFC ::: Open University Repository System Proposal

by Thanh Le -


I'm currently working on various Open University (UK) Virtual Learning Environment (VLE) projects, principally developing an e-Portfolio system and a Content/Document Management system (CMS/DMS), which will be the base technology on which our e-Portfolio system depends.  Both will be built as pluggable components for Moodle.

The Open University development will provide a generic "Repository System" that will offer both CMS and DMS functionality.

We would like to offer this solution as the potential answer to the "Repository API" itemised in the Moodle development roadmap for Moodle Version 1.7.

I've followed the discussion in this forum (especially the discussion on the DMS development) and also in the "CMS extensions and integrations" forum (http://moodle.org/mod/forum/view.php?f=811).

I think what I'm proposing is not too far from people's thinking to date... but if I'm wrong, I'm certain you will all tell me.  There will also probably be debate about harmonising this proposal with past and current development.

Anyhow, the remaining post will present a rough sketch of what the Open University is proposing and I welcome your feedback and analysis of the proposal.


Repository System Proposal
--------------------------

1.0 Our Objectives:
---------------
a) to provide a "Repository Component" that can be plugged into Moodle for usage by Moodle core code or Moodle modules;

b) the "Repository Component" will offer support for CMS and DMS requirements;

c) the "Repository Component" will be provided as optional functionality; i.e. developers can choose to use it or not, and we're agnostic as to whether the component becomes a Moodle core library;

d) the "Repository Component" will leverage international trends in the specification and standardisation of ECM/CMS/DMS solutions (ECM=Enterprise Content Management);

e) the "Repository Component" will offer an abstract interface that allows Moodle developers who use it to connect with any number of ECM/CMS/DMS implementations (open source or commercial solutions) without knowing anything about those implementations, if that is desired;

Hopefully it's self-evident that our objective is to provide a "strategic repository" component that will allow the Open University to connect to any number of CMS/DMS, whether open source or commercial.


2.0 Some definitions:
-----------------
DMS: the Open University considers a DMS as a system that offers features and functions to manage documents; features include metadata support, version control, auditing, content hierarchy organisation, content manipulation (edit, copy, move, etc.), etc...

CMS: the Open University considers a CMS as a system that offers features and functions to manage content; features include metadata support, version control, auditing, content hierarchy organisation, content manipulation (edit, copy, move, etc.), etc...

When one manipulates documents (Word, text, spreadsheets, etc.) as the lowest "content object", one is working with a DMS.

When one is interested in manipulating the structure within documents (e.g. reading template styles in MS Word, reading the XML structure within an XML document, etc.), one is working with a CMS, which offers features and functions to navigate from a document into the structure of that document.  Note: you can imagine XML plays a big part in the CMS world.

A Web CMS is a specialised application of a CMS for web site management.

An Enterprise Content Management (ECM) system is a glorified umbrella name that includes CMS, DMS, Workflow systems, Web CMS, Portal technology, etc.  So for this proposal, I restrict the discussion to CMS and DMS.


3.0 The Proposed Solution
---------------------
The Open University is developing a "Repository Component" that is principally a CMS system that can be used as if it was a DMS system.

The "Repository Component" has three parts:

1) A Repository Object Data Model;

2) A Repository API for processing the Repository Data Model;

3) A Development Framework that describes how one can develop plug-in CMS/DMS implementations that can be accessed seamlessly through the "Repository Component";


4.0 Solution Implementation
-----------------------
The Open University is proposing the adoption of the JSR-170 specification (http://www.jcp.org/en/jsr/detail?id=170) for both the Repository Object Data Model and API.

Note: I won't delve into the details of the specification in this posting, but I'd point readers to the PDF specification; sections 4 and 5 are the only ones you need to read.  I'm happy to explain the spec in another posting if people are interested.


JSR-170 offers a powerful data model that provides a seamless bridging between CMS and DMS and has a comprehensive set of API interfaces for manipulating the data model.

The specification was created by a consortium of people who have developed the major commercial CMS/DMS systems and has great support from the open source community (e.g. Apache with their Jackrabbit implementation).

In taking this approach, the Open University satisfies its objectives B and D above.


One criticism is that JSR-170 is commonly seen as a Java specification.  But with web services technology one can easily abstract the Java interfaces into web services calls to be activated in PHP.

In considering this criticism, the Open University is proposing to implement a PHP interface equivalent to JSR-170 such that PHP/Moodle developers can make use of JSR-170 without knowing Java.  How the PHP JSR-170 interface talks to a CMS/DMS is up to each implementation of that interface.  So one implementation may be purely PHP.  Another implementation may communicate with a Java CMS via web services calls in PHP, with everything mediated by the PHP JSR-170 interface.

The PHP JSR-170 interface approach satisfies the Open University's objective E.
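
To make the idea concrete, here is a minimal sketch of one PHP interface with two interchangeable implementations behind it.  All names here (RepositoryBackend, InMemoryBackend, RemoteBackend, saveAndReload) are hypothetical and are not taken from JSR-170, phpCR or Moodle; the "remote" backend simply forwards calls to a callable, which stands in for whatever web services client a real adapter would use.  It is written for PHP 7.4 or later.

```php
<?php
// Hypothetical names throughout; this is not the phpCR or Open University API.

interface RepositoryBackend
{
    /** Return the content stored at an absolute path, or null if absent. */
    public function read(string $path): ?string;

    /** Store content at an absolute path. */
    public function write(string $path, string $content): void;
}

/** Pure-PHP backend: an array stands in for the file system and database. */
final class InMemoryBackend implements RepositoryBackend
{
    /** @var array<string, string> */
    private array $items = [];

    public function read(string $path): ?string
    {
        return $this->items[$path] ?? null;
    }

    public function write(string $path, string $content): void
    {
        $this->items[$path] = $content;
    }
}

/** Remote backend: forwards every call through a transport callable. */
final class RemoteBackend implements RepositoryBackend
{
    /** @var callable Receives an operation name and an argument list. */
    private $transport;

    public function __construct(callable $transport)
    {
        $this->transport = $transport;
    }

    public function read(string $path): ?string
    {
        return ($this->transport)('read', [$path]);
    }

    public function write(string $path, string $content): void
    {
        ($this->transport)('write', [$path, $content]);
    }
}

/** Calling code depends only on the interface, never on a concrete backend. */
function saveAndReload(RepositoryBackend $backend, string $path, string $content): ?string
{
    $backend->write($path, $content);
    return $backend->read($path);
}
```

The point of the sketch is that saveAndReload() never learns which backend it was given, which is the property objective E asks for.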


Our choice of actual coding and implementation of this solution will hopefully satisfy the Open University's objective A and C.

In making the "Repository Component" a light-weight and completely optional plug-in for Moodle, we hope to ease the discussion as to whether this component can be the basis for the Repository API proposed for Moodle version 1.7.

5.0 Architectural View
------------------

The following diagram depicts the architectural view of what is being proposed....

[Diagram: Repository Architecture]

Box 2 is the PHP JSR-170 interface that will offer the Repository Data Model, API and Development Framework that all Moodle core code or Moodle modules will communicate with.  This is pure PHP.

Box 2 is only an interface and requires an implementation to connect the interface to an actual CMS/DMS implementation.

Box 3 is the proposal for a "pure" PHP implementation that interprets PHP JSR-170 calls and maps them to an implementation of a CMS/DMS that talks to a server file system and database.

Box 4 is the proposal for a JSR-170 implementation that connects an implementation of a JSR-170 Repository to the PHP JSR-170 component (e.g. Apache Jackrabbit, Alfresco CMS, etc.).

Box 5 is the proposal for an OKI Repository (OSID) implementation that connects an implementation of an OKI Repository to the PHP JSR-170 component.

Box 6 is the proposal for an IMS Digital Repository implementation that connects an implementation of an IMS Digital Repository to the PHP JSR-170 component.

Box 8 suggests that if the CMS/DMS implementation is not PHP, then the means for connecting PHP to another platform technology (e.g. Java) is PHP web services/SOAP.

The diagram below demonstrates the JSR-170 data model:

[Diagram: JSR 170 Data Model]

Note:

A workspace is equivalent to a drive letter in MS Windows file management; alternatively, a repository can be partitioned into a personal workspace, group workspace, enterprise workspace, and so on.  Anything is possible; the concept is simply that a workspace equals the "Root Node" in a repository node hierarchy.

A Node can be a document or it can be (for example) the root of an XML document.  Child elements of an XML document are child nodes.

Property is the metadata of a "content object" applicable to any content granularity.
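
As a rough illustration of the workspace / node / property relationship described in these notes, here is a small self-contained sketch.  The class and method names are my own shorthand for this example and are not the JSR-170 interfaces (the real specification has far more to it, such as typed properties and node types); it assumes PHP 7.4 or later.

```php
<?php
// Illustrative only: hypothetical classes, not the JSR-170 or phpCR API.

final class Node
{
    private string $name;
    private ?Node $parent;
    /** @var array<string, Node> child nodes keyed by name */
    private array $children = [];
    /** @var array<string, string> metadata keyed by property name */
    private array $properties = [];

    public function __construct(string $name, ?Node $parent = null)
    {
        $this->name = $name;
        $this->parent = $parent;
    }

    public function addChild(string $name): Node
    {
        $child = new Node($name, $this);
        $this->children[$name] = $child;
        return $child;
    }

    public function setProperty(string $key, string $value): void
    {
        $this->properties[$key] = $value;
    }

    public function getProperty(string $key): ?string
    {
        return $this->properties[$key] ?? null;
    }

    /** Absolute path of this node; the root node is "/". */
    public function getPath(): string
    {
        if ($this->parent === null) {
            return '/';
        }
        $parentPath = $this->parent->getPath();
        return ($parentPath === '/' ? '' : $parentPath) . '/' . $this->name;
    }
}

/** A workspace is little more than a named root node. */
final class Workspace
{
    private string $name;
    private Node $root;

    public function __construct(string $name)
    {
        $this->name = $name;
        $this->root = new Node('');
    }

    public function getName(): string
    {
        return $this->name;
    }

    public function getRoot(): Node
    {
        return $this->root;
    }
}

$workspace = new Workspace('personal');
$document = $workspace->getRoot()->addChild('report.xml'); // a document (DMS view)
$document->setProperty('author', 'Example Author');        // metadata on the document
$chapter = $document->addChild('chapter');                 // an element inside it (CMS view)
$chapter->setProperty('title', 'Introduction');            // metadata at finer granularity
echo $chapter->getPath(), "\n"; // prints /report.xml/chapter
```

The same Node type serves both the document and the element inside it, which is what lets one model cover DMS and CMS usage.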


6.0 Proof-of-concept
----------------

We have a working proof-of-concept codebase that demonstrates this proposal and the architectural thinking.  This prototype connects Moodle/PHP to the Alfresco CMS.  I'm happy to share this if people are interested.

The proof-of-concept makes use of a PHP interface translation of JSR-170 called phpCR which can be found at http://www.phpcr.org/.

Taking this PHP JSR-170 interface, the proof-of-concept code has mapped the PHP JSR-170 API calls to those provided by Alfresco web services implementation (http://dev.alfresco.com/downloads/).

The bridging of PHP JSR-170 to Alfresco (which is Java) is through the PECL SOAP module as the base web services communication layer.

I've explored alternative PHP/Java bridging, but I suggest web services is probably the best communication layer.


7.0 Closing notes
-------------

This has been a long posting, but I hope I seeded enough info for people to explore the proposal and to ask me directed questions to focus in more detail about this proposal.

I've been mindful of not overloading the posting with too much detail, so depending on interest, I will be happy to expand on issues.


Thanks,

Thanh Le.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Helen Foster -
Hi Thanh Le,

Thanks for your proposal. Just thinking that you may be interested in the discussion in the General developer forum, "RFC - Remote object repositories -- consolidating implementations".
In reply to Helen Foster

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -

Hi Helen,

Thanks for the pointer... I've read the post you referred to on a previous occasion and concur that what is debated there has many connections to my proposal here.  I think I had earlier discussions with Martin L. which triggered his input in that discussion.

The key aspects of my proposal to consider in light of the other discussion are that:

a) I do not wish to reinvent the wheel (i.e. use standards and follow international trends, e.g. JSR-170, web services, etc.);

b) I do not wish to invent "my own" API and add to the proliferation that generally arises from lack of general consensus;

c) I do wish to propose a solution that offers a data model, an API and a development framework.  This three-part structure is key to any substantial proposal for a "repository";

d) I do want to offer a solution that can easily connect to any number of repositories: commercial, open source, Java, PHP, etc.;

e) I do want a light-weight repository component that can be slotted into PHP/Moodle and be used optionally; not a requisite element of Moodle core unless developers choose to make it one;

Hope people agree with these points of emphasis.

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Matt Oquist -
Mmmmmm.  Standards-based.

For the record, and not based on a careful reading of the above or much thought, I'm totally on-board with the idea of heading toward standards-compliance in the way you're proposing.

And for the record, if it would be helpful (and a good idea) when the time comes that an architecture and API are defined, I'm willing to help salvage anything worth keeping out of the repository I implemented.  It seems quite possible that my "Moodle Native plugin" could be adapted to work with a similar architecture such as the one you're proposing.

Again -- my work might not be worth bringing forward.  But if it is, I'll do my best to help.

Back to philosophy for now...
In reply to Matt Oquist

Re: RFC ::: Open University Repository System Proposal

by Martin Dougiamas -
Thanks very much Matt. I would like to see as much of your work as possible in this, and yes, it could be the "native adapter".
In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Martin Dougiamas -
Thanks, Thanh!

In general this is very much in keeping with the architecture I was imagining, with the addition of JSR-170 which I agree would be an absolutely important thing. Thanks for the good explanation and diagrams, which help a lot.

I have four requirements I would like to see satisfied within this model:
  1. I very much want to see the API built into Moodle and ALL file access in Moodle (with the possible exception of a few system files like cache etc.) converted to use the API. Box 3 should also be included within Moodle to provide at least the same filesystem functionality as we have at present (but hopefully much better). All of this has to be entirely GPL-compatible, obviously.
  2. The API for Moodle developers in "module-space" should be as simple as possible. While the whole JSR-170 API could still be available when needed, we definitely don't want developers worrying about workspaces and nodes, etc. There should be intermediate functions to make the most common operations (save a file, get that file back) really really easy. This is much like the model of cvs:/moodle/lib/datalib.php which simplifies the AdoDB library.
  3. It should be at least possible to write adapters for arbitrary repositories that DO NOT use web services or standards, or even support writing (eg MERLOT). ie much like Box 3 but talking via other means to something. Looking at the architecture I don't think this is a problem but please tell us if it is!
  4. Tight integration of user accounts and privileges/roles should be possible with the external repositories. Have you given any thought to this? How does JSR-170 handle that?
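
For requirement 3, one way an adapter for a read-only source might look is sketched below.  Everything in it is hypothetical (the RepositoryAdapter interface, the exception class, and the array-backed catalogue standing in for a remote service); the only idea it shows is that an adapter can declare itself non-writable and refuse writes while still fitting the same interface.  It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: interface, exception and class names are illustrative only.

interface RepositoryAdapter
{
    /** Callers can check this before offering any "save" option. */
    public function isWritable(): bool;

    public function fetch(string $id): ?string;

    public function store(string $id, string $content): void;
}

final class ReadOnlyRepositoryException extends RuntimeException
{
}

/** Adapter for a source that can only be read; a fixed array stands in for it. */
final class ReadOnlyCatalogueAdapter implements RepositoryAdapter
{
    /** @var array<string, string> */
    private array $catalogue;

    public function __construct(array $catalogue)
    {
        $this->catalogue = $catalogue;
    }

    public function isWritable(): bool
    {
        return false;
    }

    public function fetch(string $id): ?string
    {
        return $this->catalogue[$id] ?? null;
    }

    public function store(string $id, string $content): void
    {
        // Writing is part of the interface, but this source cannot honour it.
        throw new ReadOnlyRepositoryException('This repository does not support writing.');
    }
}
```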


Lastly, in your diagram you have Moodle code bypassing the API and talking directly to repositories via SOAP (upper part of box 8). Shouldn't everything be via the API and adapters?
In reply to Martin Dougiamas

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -
Hi Martin,

With reference to your points:

1) The Repository Proposal will be built in PHP and will be installed within Moodle and usable by all Moodle code.  I left it open whether developers would want it to become Moodle core.  This is a choice.  Martin, you may choose to make it a Moodle core contribution; but it still remains a choice.  This approach also means anybody using PHP can use the Repository Proposal, even if they don't use Moodle.

2) There is no reason why we cannot overlay a simpler API above box 2 in my diagram which hides all the workspace and nodes and simply offer folders and files (and maybe even no versioning).  This API overlay does nothing but map code and hide the data model.  This is why I see this solution bridging the world of CMS to DMS; a Node is a folder or a file.  This is my thinking so your requirement should be met.

3) I agree that web services is not a mandatory requirement.  I suggested that currently this is the best way to bridge PHP to Java or .NET.  If you are connecting to PHP CMS, then the web services is not even required.  As part of the framework we might want to suggest and encourage certain communication/piping layer; but the actual choice is down to the developer.

4) Currently JSR170 offers only a data model for storing a user credential.  Actual mapping of permissions and access to content is delegated to the actual CMS implementation.  So you would call the JSR170 API and ask for this content using this session/credential.  Whether you get the content depends on the CMS roles and permissions, and these would be administered using the CMS admin interface.  This is an OK approach if you take the federated approach.

If you take a Moodle-centric view, then I would say the proposed Moodle roles and permissions development (made by the Open University) will take responsibility for this, and JSR170 just provides access assuming access is given by the Moodle permission model.

There's more work to be fleshed out on this, but there are dependencies on other projects, as you can see.

5) Last point, about bypassing the API.  It's merely being pragmatic and acknowledging that some developers may choose to bypass all of this and get straight to the guts of the CMS implementation and use all the richness of the commercial or open source CMS.  We may not sanction it and we may not encourage it; so the framework should acknowledge the possibility and then say "don't do it".  Hence it's in the diagram.
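
Going back to point 2, the overlay could be as thin as the sketch below: module code calls saveFile() and getFile() and never sees a node or a workspace.  All names are hypothetical (NodeStore is just a stand-in for the node-level API, not JSR-170 or phpCR, and SimpleFiles is not an existing Moodle library), and file content is stored as a plain "content" property purely for illustration.  It assumes PHP 7.4 or later.

```php
<?php
// Hypothetical sketch: none of these names come from Moodle, phpCR or JSR-170.

/** Stand-in for the node-level API: maps absolute node paths to property arrays. */
final class NodeStore
{
    /** @var array<string, array<string, string>> */
    private array $nodes = ['/' => []];

    public function addNode(string $path): void
    {
        if (!isset($this->nodes[$path])) {
            $this->nodes[$path] = [];
        }
    }

    public function hasNode(string $path): bool
    {
        return isset($this->nodes[$path]);
    }

    public function setProperty(string $path, string $name, string $value): void
    {
        $this->nodes[$path][$name] = $value;
    }

    public function getProperty(string $path, string $name): ?string
    {
        return $this->nodes[$path][$name] ?? null;
    }
}

/** The overlay: callers think in folders and files only. */
final class SimpleFiles
{
    private NodeStore $store;

    public function __construct(NodeStore $store)
    {
        $this->store = $store;
    }

    public function saveFile(string $folder, string $filename, string $content): void
    {
        // Folders are just nodes; create any that are missing along the way.
        $path = '';
        foreach ($this->segments($folder) as $segment) {
            $path .= '/' . $segment;
            $this->store->addNode($path);
        }
        $filePath = $path . '/' . $filename;
        $this->store->addNode($filePath);
        $this->store->setProperty($filePath, 'content', $content);
    }

    public function getFile(string $folder, string $filename): ?string
    {
        $parts = array_merge($this->segments($folder), [$filename]);
        return $this->store->getProperty('/' . implode('/', $parts), 'content');
    }

    /** Split "a/b/" or "/a//b" into clean segments: ['a', 'b']. */
    private function segments(string $folder): array
    {
        $nonEmpty = fn (string $part): bool => $part !== '';
        return array_values(array_filter(explode('/', $folder), $nonEmpty));
    }
}
```

Versioning and metadata would still live in the underlying nodes; the overlay simply does not expose them.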

Finally, thanks for the support.  I'm hoping to get more feedback from the community and also maybe interested developers wanting to contribute to the effort.

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -
p.s.

I forgot to mention that I see box 3 in my diagram as the means of:

1) initially.... merging all the existing CMS/DMS Moodle modules and mapping those to the Repository API.

2) longer term...  harmonising all the existing CMS/DMS/File Moodle modules into one coherent implementation that maps to the proposed Repository API.

So if you take a Moodle-centric view, then in Moodle core (eventually, maybe) will be boxes 1, 2, 3 and the file system/database; all written in PHP.

All other boxes in my architecture diagram are optional "adapters" to connect with other CMS implementations.

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Martin Dougiamas -
I think you should avoid using the acronym CMS, because it's been pretty much hijacked worldwide by the Mambo/PHP Nuke/Xoops crowd etc. (and this is what this forum is all about too). Also, technically Moodle is a Course Management System.

Let's just stick to "Repository" or "DMS" to avoid confusion.

On that note, how many Repositories are out there that actually support JSR-170 today? And are there any in PHP yet?
In reply to Martin Dougiamas

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -
If we follow JSR-170 and think in terms of system architecture, then the more general term is "Content repository", with Document Management, Web Content Management, etc. sitting above it.

Thanh.
In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Just H -
Hi Thanh

From a totally non-developer point of view, this sounds really exciting and is something I've been hoping would happen sooner rather than later.

2) ... (and maybe even no versioning).

Please, please, please, if this goes ahead, include versioning. Not being from a traditional learning environment, I'm not sure how important this is in academic circles, but in my situation versioning is very important, whether for audit purposes to maintain our RTO status or as evidence in a legal enquiry.

Will be watching with great interest how this pans out.

Regards
H



In reply to Just H

Re: RFC ::: Open University Repository System Proposal

by Dan Stowell -
Harry - Don't worry! Thanh was merely suggesting that in simpler cases, a simpler API could hide away some of the gory details, but they'll still exist in the underlying system.
In reply to Dan Stowell

Re: RFC ::: Open University Repository System Proposal

by Just H -
Thanks for the clarification Dan, couldn't see it not having versioning but was a tad worried there!
In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Dirk Herr-Hoyman -
Hi Thanh - You've got my attention now. Back in 2004-2005, I was the lead for the Sakai Content Working Group. Looking at JSR-170 was one of the things this group did, and it was thought to cover all the requirements that everyone saw. If you look at http://atsosxdev.doit.wisc.edu/~dherrhoyman/cm-req/ you will see the Use Cases and Requirements.

Since that time, I've been watching the CM (content management; sorry Martin D, but this term has stuck in the larger industry) space, wondering where things might go. I see you've got Alfresco as one of the packages you are looking to enable. This is good; Alfresco is my #1 hot product in this whole space, being both open source and having real venture capital behind it (similar to MySQL AB, JBoss or Red Hat Linux).

Whether or not one believes that JSR-170 is a Good Thing (tm), being tied to Java as its heritage, I can say that it's had a lot of really good thinking behind it. The lead is Roy Fielding, whom some of you might recall is the person behind the HTTP specification (he did HTTP 1.1, more specifically).

I also note that you've found a way to weave in OKI APIs. This whole area is one where I'm seeing OKI being used in several projects, not merely Sakai. You might want to drop a line to Jeff Merriman @ MIT; I'm sure he'd be interested to hear and see more about this direction.

So, I support this general direction. Of course it would be nice to weave in the work that Matt Oquist has done, which is also good work.
In reply to Dirk Herr-Hoyman

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -
Hi Dirk,

Thanks for the encouraging feedback.

Given the group of people involved in the development of JSR-170 and also the numerous open source CMS that support JSR-170, I have high hopes and expectations that JSR-170 will become the de facto interface to many open source and commercial CMS/DMS.

Its applicability will grow as soon as we remove the Java platform constraint, which the proposal above is aiming to do anyway.

I agree that Alfresco is a hot product, being both open source and having real venture capital behind it. In my proof-of-concept work, however, I found that its current release's support for web services and PHP web services is not fully JSR-170 compliant, although its Java implementation is much better (as one should expect given it's a Java system).

So I'm going to explore the Apache Jackrabbit implementation to see how it compares to Alfresco.

Finally, thanks for the lead and info.... I will follow them up.

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -

Thanks to all who have commented on this proposal.

My project team is at the stage where we will begin building the repository next week.

My current thinking is that our primary targets will be a PHP implementation and an implementation that demonstrates an adapter to Alfresco.

This is a note to:

1) Call for anyone interested in working with us during this development cycle or in participating in future/on-going development; please send me a note if you can contribute your time and effort or would like input into the design and coding of the proposal above;

2) Make a final check on whether people have any concerns or issues that have not been raised; otherwise we will proceed as I have planned out in the above proposal.

I hope to publicise the build as it develops and people can query and look at our work in progress.

Thanks,

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Pablo López -
Hi Thanh,

I'd like to contribute in what you're doing. I sent you an email yesterday.

For the moment I'm studying your RFC.

Cheers,

Pablo.
In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Thanh Le -

Hi,

It's been a while since I updated this thread, but work has progressed actively on the quiet.

I've been working on various implementations to prove that the PHP JSR170 proposal in this thread is a viable option.  This work consists of the following implementations:
1) PHP JSR170 implementation; written fully in PHP using the filesystem for content storage;
2) PHP JSR170 implementation connecting to Day Software's JSR170 implementation;
3) PHP JSR170 implementation connecting to Alfresco CMS JSR170 implementation;
4) PHP JSR170 implementation connecting to Apache Jackrabbit's JSR170 implementation;

All of these efforts are at various stages of development and none is complete.  But the range shows the flexibility and options that the PHP JSR170 solution potentially offers.

I'm taking the view that it is good to let the "software code" speak for itself.  Therefore, find below some instructions for getting a PHP JSR170 instance up and running that uses Day Software's JSR170 implementation and which can be used in PHP (and therefore within Moodle).

Given that JSR170 is a back-end system/infrastructure, only play with this if you are seriously interested in JSR170 or wish to deploy a generic repository service for Moodle.  Otherwise, given that the learning curve is a little steep, you might waste your time trying to get this demo up and running.

For a cursory glance, simply look at the code files, as they are not too difficult to understand.  The instructions for the demo below are just to let you play for real.

 

In non-install mode, download this zip:

http://oufcnt1.open.ac.uk/~datthanh_le/app.zip

The important folder in the ZIP is the "repository" folder which contains:
"doc-jsr170" : this contains the JSR170 specification document
"interfaces" : this contains the PHP interface class files that map to JSR170 API
"crx" : this contains the PHP class files that implement the PHP JSR170 interfaces

The code uses standard PHP5 object syntax, so you will be up and running in your PHP code with these few lines:

1) $repository = new RepositoryImpl();
2) $credential = new SimpleCredentials("admin", "admin");
3) $session = $repository->login($credential, NULL);
4) $node = $session->getRootNode();
5) $node = $node->getItem("/test");


Line 1: creates the Repository object, which starts everything
Line 2: creates a credential object to hold the username/password
Line 3: logs you into the repository using the credential and gives you back a session object to talk to the repository
Line 4: gets the root object in the repository tree using the session object
Line 5: gets the item /test relative to the node

That's it.  With these few API calls, you now are navigating the repository.  Setting and getting data is all done against the objects (such as the node object in the above example).

The folder "unittest" under the "crx" folder contains more unit test files that demonstrate the API.

To attempt to install a working demo, here are the instructions:

As background, Day Software is the technical lead for the JSR170 specification work and has developed a JSR170 reference implementation to go along with the specification, which can be downloaded for free to test and play with.  Day Software also has a commercial product called CRX, which can be downloaded for a 30-day trial.  For the demo described here, you need to download the CRX 30-day trial (standard edition is OK) and follow the provided instructions to install it.  See:

http://www.day.com/content/site/en/index/products/content-centric_infrastructure/content_repository.html


1) Ideally, you should get the CRX software installed in a Java TOMCAT server.  The CRX software is provided as a Java WAR file which you can drop into a TOMCAT "webapps" folder.  Once it is installed on TOMCAT, test that you can access the CRX software via TOMCAT using this URL:

http://localhost:8080/crx/    [ change host name as appropriate ]

 

2) Download this zip file:

http://oufcnt1.open.ac.uk/~datthanh_le/app.zip

Unzip it to your Moodle root directory.  It should NOT overwrite any files.  You should then have an "app" folder in the root of your moodle directory.


3) Inside the "app" folder there is a TOMCAT folder.  Copy the content of that folder into your TOMCAT directory.

NOTE: this assumes the CRX software is installed under TOMCAT's WEBAPPS folder as a folder called "crx".

 

4) If you have TOMCAT+CRX Software running and Moodle up and running, then you are all set.

There is a unit test script which can be found in the install and accessed using this URL:

http://localhost/moodle-16-20070704/app/repository/crx/unittest/TestSuite.php

I suggest not attempting to run the test script immediately; instead, study it and modify it as appropriate.  The test script is a series of tests which you can pick and choose from, each one demonstrating and testing some of the JSR170 API calls.

 

Closing remarks:

I welcome comments and feedback on this work.

Also happy to offer support if someone is trying to install the demo and finding difficulties.

The demo and the provided JSR170 implementation are not 100% complete.  They are work in progress, provided to demonstrate how one can use JSR170.

Thanks,

Thanh.

In reply to Thanh Le

Re: RFC ::: Open University Repository System Proposal

by Pablo López -
Any advances or achievements?

I've been trying to contact Thanh for a long time but I've received no response.

Any ideas/clues about this development?