Documentum: A Huge Toolset for Content Management and More
An Evaluation of the Documentum ECM System:
A Huge Toolset for Content Management and More
By
Ms.G
(Ms.G[at]NoitacudE.com)
For
Don Turnbull’s
Knowledge Management Systems
(INF 385Q)
Spring, 2005
Introduction
Documentum is a brand name, more than a specific piece of software. The company Documentum, Inc.[1] was recently acquired by EMC Corporation[2]. The larger company now hopes to become “the ultimate information lifecycle management company” and bills itself as “the world leader in products, services, and solutions for information storage and content management.[3]” Let us recognize at this point that these sound like huge and nearly impossible goals, but we want to believe that they can be achieved. Certainly, the company intends to serve those who would call themselves information managers.
It is impossible within this paper, and perhaps within my lifetime, to explain the range of products that EMC and Documentum offer, much less how they are supposed to work together. I doubt that there is any organization in existence that is using the full suite of EMC/Documentum products. This paper will analyze Documentum’s most fundamental and “flagship” product offering: a content management system.
Note: Because the Documentum product line is geared to large corporations, I was unable to try out any of the software. The information in this paper is gathered from the company’s website, marketing and training materials, and independent reviews or publications found on the Internet (listed in the References section). Any information without a cited source is based on my personal experience with IT management in a variety of contexts over the past 10 years.
The Basics: A Content Management System
The basic toolset that Documentum, Inc. offers—and the one that most people recognize and understand—is a content management system. Essentially, a content management system (CMS) is a sophisticated way to manage electronic files of many sorts, for many users, in many places, with complex and detailed controls on who has what kind of access to which files. The four fundamental components of a CMS are content, a content repository, a user interface, and a database management system. Content is any kind of file that someone can create on a computer, typically things like MS Office documents, PDF files, and image files, but extending to sophisticated interrelated sets of files such as parts of a website or XML filesets (more on this later). A content repository is basically a file server, or servers, where all of these files are stored. A user interface is a window-like environment through which a user can get to and do things with the files. A database management system (DBMS) is a sophisticated “hidden” system that keeps track of large sets of related information—in this case, it keeps track of the location of all the content (files) and allows the users (through the user interface) to “find” and “see” specific files.
Terminology Note
Content management (CM) is a more recent term that encompasses activities and systems that are—or used to be—more narrowly focused, including: document management (DM), web content management (WCM), digital asset management (DAM), and records management (RM). The differences and merging among these systems will be touched upon later in this paper. At this point, note that the core Documentum CM package grew out of—and might still be considered—a document management (DM) system[4], but Documentum, Inc. refers to their product suite as a content management system, so I will do so as well.
The Documentum Content Management System
The Documentum CMS stores files in a content repository called a “Docbase.” The user interface to the Docbase is either a desktop client application or a special password-protected website called a “webtop.” Either interface mimics a typical computer’s file-manager interface, with files appearing in cascading folders and “cabinets”, etc. An important difference is that the particular file/folder organization (structure) seen by the user does not necessarily reflect the actual arrangement of files on the server, but is rather a highly-managed view of specific files, organized in a specific structure that can be dynamic (customized) for each user. Thus, each user can only see and do things within a structure that is ostensibly appropriate for that user, and other users may see a different structure and a different subset of the same files, and may have different permissions (rights) to do things to the files or folders (such as read, edit, create, delete, etc). Meanwhile, the actual files themselves are stored in one place, so everyone who needs access to a given file is definitely working with the latest and “live” version of that file. This is a critical and basic function of a CMS, but it’s only the beginning…
Actually, it’s the Documentum ECMS
In fact, Documentum prefers to call their CMS an enterprise content management system (ECM system or ECMS). The enterprise part signifies that the system does much more than the basic functionality described above, and is sophisticated enough to be used by a large company for “mission-critical” work (see Figure 1 for example). An “enterprise” label implies that the system should be:
- Reliable (should not have a lot of bugs; the system should never “go down” because of a little glitch in one file or a simple mistake by an everyday user)
- Secure (access to everything is highly manageable and not easily “hacked”)
- Granular (many details can be controlled for specific situations, but “rules” can also be applied to many users and many situations)
- Scalable (new users, large amounts of files, and new servers can be added as needed and without too much trouble)
- Interoperable (can “talk to” and “work with” other popular software systems, such as Microsoft Office, Photoshop, Lotus Notes, SAP, or Dreamweaver)
- Extensible (upgrades will not be a nightmare, new features can be added in the future, and other software systems can be developed to work with this system)
- Usable (intuitive interfaces, well-documented, training and support are available)
I cannot fully evaluate whether Documentum’s system lives up to all these expectations, but this paper will touch on some of these considerations.
Figure 1 Examples of “enterprise” content, to be managed by the Documentum ECMS.[5]
A Typical User Case: SampleCo, Inc.
In order to provide less abstract explanations, let’s imagine a simple company (named SampleCo) with a CEO (named Ms. C), a manager (named Mr. M), and two workers (named Joe W. and Jane W.). We will assume these folks have some work to do and some content to work with. SampleCo’s basic reasons for needing a CMS include:
- Workers and managers have to pass files back and forth several times before the files are “final.” Then the CEO has to approve them. Then the files may go somewhere else outside the company, or may need to be stored as records.
- Workers sometimes work on different computers or work from home.
- In recent history, they have had trouble with losing files, overwriting each other’s work, failing to have back-up copies, and not having a standard organization of files on their server.
- SampleCo wishes to keep its content secure from competitors and prevent illegal copying of their content. It also needs to track and manage the use and re-use of various pieces of content, such as images, for which copyright or ownership may belong to other entities.
This paper will show how the Documentum ECMS helps address these needs.
The Documentum ECMS: Content Management Functions
Controlling and Organizing File Access
An early problem that any organization encounters when it starts storing electronic files on a centralized file server is that of access (as in: who has access to which files). If an organization does nothing to control access, it implicitly allows all users of the system to do whatever the users wish with any of the files, including editing, moving, and deleting files. Clearly, this becomes a problem if the users do not have a common understanding and level of skill regarding the manipulation of the files. A simple and common level of management of this issue is to control file access through directories (folders) on the server, where each user is limited in access to specific directories, such as a “personal” folder and a “workgroup” folder.
When controlled directory access is implemented on a traditional file server, the issue then arises as to how to organize these directories such that everyone has access to appropriate files. Often, the sensible directory structure seems to be one that mimics the organization’s management structure. But even if a very sensible directory structure is implemented, there is inevitably a need for files to be used or transferred in some way that “crosses” an organizational grouping. A common response in these cases is for individuals to send files to each other using email or “drop boxes” (folders on the network with unrestricted access).
When individuals “work around” the typical server-based file-access system in this way, network traffic is increased, multiple copies of the same file are stored, and email systems are burdened with large file attachments. Even people who share access to the same folder on the server may prefer to email attachments, because they are accustomed to the email interface and also tend to keep “active” files on their personal computer’s “desktop.” There are a variety of other ways that users may inadvertently or deliberately circumvent the intentions of a basic file-server system.
Separating File Storage Organization from File Access Organization
Documentum offers a robust solution to most of these issues. In the first place, the Documentum ECMS separates the users’ ability to access files from the actual organization of files on the server. The Documentum content repository (Docbase) uses a relational database management system (RDBMS) to keep track of files and manage access to files. The RDBMS works alongside the actual file-storage server (technically, there are a few more layers of software functioning between the user and the actual files). Immediately, this separation allows “rules” for file access to be made to match the work that needs to be done, regardless of organizational structure. Also, it allows for different “views” of the files to be made by or for each user.
Users view the files through the Documentum desktop client software, or through a browser via the Documentum “webtop” interface (Figure 2). When a user connects to a Docbase through the client interface, they see a familiar cabinet/folder/file organizational structure, but each file may have different rules for each user, regardless of the file’s apparent place in the folder organization. In other words, files may be organized logically for the user (for example, according to task or project) without arbitrarily limiting (or failing to limit) what the user can do with any particular file.
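To make the separation of storage from access more concrete, here is a minimal sketch in Python using the built-in sqlite3 module. It is not Documentum's actual data model: the table names, column names, storage paths, and the view_for function are all hypothetical, invented only to show how a database can record where each file really lives while presenting each user with a different logical folder view.

```python
import sqlite3

# Hypothetical schema for illustration; not Documentum's real data model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (
        object_id    INTEGER PRIMARY KEY,
        name         TEXT NOT NULL,
        storage_path TEXT NOT NULL   -- where the bytes really live
    );
    CREATE TABLE folder_links (
        username  TEXT NOT NULL,     -- whose view this entry belongs to
        folder    TEXT NOT NULL,     -- logical folder shown in the client
        object_id INTEGER NOT NULL REFERENCES objects(object_id)
    );
""")

# Each file is stored exactly once, at a path the user never sees.
conn.execute("INSERT INTO objects VALUES (1, 'budget.xls', '/store/00/a1/0001')")
conn.execute("INSERT INTO objects VALUES (2, 'logo.png', '/store/00/a1/0002')")

# The same stored file can appear in different folders for different users.
conn.executemany(
    "INSERT INTO folder_links VALUES (?, ?, ?)",
    [
        ("jane", "Project A", 1),
        ("jane", "Project A", 2),
        ("joe", "Artwork", 2),
    ],
)


def view_for(username):
    """Return the (folder, file name) pairs that one user sees."""
    rows = conn.execute(
        """
        SELECT l.folder, o.name
        FROM folder_links AS l
        JOIN objects AS o ON o.object_id = l.object_id
        WHERE l.username = ?
        ORDER BY l.folder, o.name
        """,
        (username,),
    )
    return rows.fetchall()


print(view_for("jane"))
print(view_for("joe"))
```

In this sketch, logo.png is stored once but appears under “Project A” for one user and under “Artwork” for another, which is the essence of a per-user view.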
Example: SampleCo’s File Access Management
In our example of SampleCo, suppose several users in the company need to use a document for a project. Mr. M creates the document in Microsoft Word and adds it to the Docbase. For this project, he wants Jane W. to be able to edit the file, and he wants Joe W. to be able to read the file, but not edit it. Even though Jane and Joe are “equal” in the organization hierarchy, they are not equal participants in this project. On another, concurrent project, their roles may be reversed.
The Documentum system allows Mr. M to control each user’s file access for each project without having to create a new folder organization on the server. Instead, Mr. M creates a “cabinet” for each project, to which he can “add” all files for the project (although he is really just placing a link, or reference, to the file); then, for each file and for each person involved in the project, he can allow or disallow actions such as “read” or “edit,” without moving or copying the files. This is a very basic example.
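The SampleCo scenario can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the dictionaries, the file ID "doc-001", the cabinet name, and the can function are hypothetical and do not correspond to any Documentum interface.

```python
# Hypothetical in-memory model; names are illustrative, not Documentum's API.

# One stored copy per file, keyed by an internal ID.
repository = {"doc-001": "Project plan.doc"}

# A cabinet holds references (IDs) to files, not copies of them.
cabinets = {"Project X": ["doc-001"]}

# Allowed actions per (file, user) pair, as set by the file's owner.
allowed = {
    ("doc-001", "Mr. M"): {"read", "edit"},
    ("doc-001", "Jane W."): {"read", "edit"},
    ("doc-001", "Joe W."): {"read"},
}


def can(user, action, file_id):
    """Return True if the user may perform the action on the file."""
    return action in allowed.get((file_id, user), set())


for user in ("Jane W.", "Joe W."):
    print(user, "may edit:", can(user, "edit", "doc-001"))
```

Reversing Jane's and Joe's roles on another project would mean adding different entries for that project's files; nothing is moved or copied in the repository.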
Figure 2 The Documentum desktop and “Webtop” client software interfaces to the Documentum “Docbase” (content repository).[6]
A Deep Hierarchy of Rules for Access
The Documentum ECMS provides a more-detailed hierarchy of rules for file access than most file-server systems do. The Documentum system has seven file-access permission levels (rules for what a given user may do with a given file), with the following hierarchy, from least to most permissive:
- None (the user may do nothing to the file, not even see that it exists)
- Browse (the user may only see that the file exists and see the properties of the file—more on this later)
- Read (the user may open and read the file, but not change it)
- Relate (the user may read the file and “annotate” it with comments—more on this later)
- Version (the user may edit the file, but must save the changes as a new version rather than overwriting the existing one)
- Write (the user may edit and save the file, overwriting the previous version)
- Delete (the user may completely delete the file from the system)
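One way to picture this hierarchy is as an ordered enumeration, sketched below in Python. I am assuming that each level includes the abilities of the levels beneath it, which the list implies but which I could not test. The numeric values only express the ordering, and the allows function is a hypothetical helper, not part of Documentum.

```python
from enum import IntEnum


class Permission(IntEnum):
    """The listed levels; the numbers are used here only to express order."""
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


def allows(granted, required):
    """Assumption: a higher level includes everything a lower level allows."""
    return granted >= required


print(allows(Permission.WRITE, Permission.READ))   # a writer can also read
print(allows(Permission.BROWSE, Permission.READ))  # a browser cannot read
```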
I will elaborate upon the ramifications of these permissions shortly, but I should first explain another important function of the Documentum system—check-in and check-out.
File Check-in and Check-out
Within the Documentum ECMS, any time a user wants to edit a file, they must “check out” the file through the client interface. The system then “locks” the file so that other users cannot try to edit it at the same time. When the user is done editing the file, the user is prompted to “check in” the file. Clearly, this is a critical function of a CMS in that it helps prevent users from trying to work on a file at the same time or proliferating multiple copies and versions of a file. The system also keeps a record of all check-in/check-out actions, so that a future user can find out who last edited a file, among other things. In fact, the system is capable of keeping track of most of the actions that can be performed with files, from reading to deleting. This is where the system begins to be powerful.
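The check-out and check-in cycle described above can be sketched as a simple lock plus an audit trail. The class and method names below (ManagedFile, check_out, check_in, last_editor, FileLockedError) are hypothetical and chosen for this illustration; the real system is certainly more elaborate.

```python
class FileLockedError(Exception):
    """Raised when a lock prevents the requested action."""


class ManagedFile:
    """A file with an exclusive check-out lock and a history of actions."""

    def __init__(self, name):
        self.name = name
        self.locked_by = None   # user currently holding the lock, if any
        self.history = []       # list of (action, user) tuples, oldest first

    def check_out(self, user):
        if self.locked_by is not None:
            raise FileLockedError(
                f"{self.name} is already checked out by {self.locked_by}"
            )
        self.locked_by = user
        self.history.append(("check-out", user))

    def check_in(self, user):
        if self.locked_by != user:
            raise FileLockedError(
                f"{user} does not hold the lock on {self.name}"
            )
        self.locked_by = None
        self.history.append(("check-in", user))

    def last_editor(self):
        """Return the user who most recently checked the file in."""
        for action, user in reversed(self.history):
            if action == "check-in":
                return user
        return None


doc = ManagedFile("report.doc")
doc.check_out("Jane W.")
try:
    doc.check_out("Joe W.")      # refused while Jane holds the lock
except FileLockedError as err:
    print(err)
doc.check_in("Jane W.")
print(doc.last_editor())
```

Because every action is appended to the history, a later user can ask who last edited the file without relying on anyone's memory.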
More Details: More-Powerful Content Management
It is too much to explain the full range of powerful and detailed controls that are possible within the Documentum ECMS, but a few are highlighted below.
Metadata
An increasing number of software packages (such as Microsoft Office) include some kind of “properties” embedded in the file-saving functions. Examples of such file properties include: author, date created, date last saved, and keywords. The more general term for such file properties is metadata, commonly defined as “information about the information contained within.” A file’s metadata are stored with or associated with the file.
In the Documentum system, file metadata can be browsed—just as files are browsed—by any user who has at least a “browse” permission level on the file. The metadata can be very detailed, including elements such as long titles, categories, keywords (for searching or indexing), multiple author names, or the dimensions of an image file. Custom metadata elements can be added. But more importantly, metadata can be searched.
Searching
As anyone who uses the Internet knows, the ability to search effectively within a content repository adds significant value to the content. The Documentum system offers two levels of search utility: searching properties (metadata) and searching the full text content of files. A user may only search among the properties of files to which the user has at least the “browse” access level, and may search among the full text of files to which the user has at least the “read” access level. The search utility is powered by the Verity® search technology[7] and allows complex search queries.
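The interaction between search and permissions can be sketched as follows. This is a toy substring search under my own assumptions, not the Verity technology: the sample documents, the user's permission levels, and the search function are all hypothetical, and only the three lowest permission levels are modeled.

```python
from enum import IntEnum


class Permission(IntEnum):
    """Only the levels that matter for searching in this sketch."""
    NONE = 1
    BROWSE = 2
    READ = 3


# Hypothetical sample content: keywords stand in for metadata.
documents = [
    {"id": "d1", "keywords": {"budget"},
     "text": "Quarterly figures for the logo redesign."},
    {"id": "d2", "keywords": {"logo"},
     "text": "Budget notes for the artwork."},
    {"id": "d3", "keywords": {"budget", "logo"},
     "text": "Budget and logo summary."},
]

# One user's permission level on each document.
user_levels = {
    "d1": Permission.READ,
    "d2": Permission.BROWSE,
    "d3": Permission.NONE,
}


def search(term, levels):
    """Return IDs of documents this user is allowed to find for the term."""
    term = term.lower()
    hits = []
    for doc in documents:
        level = levels.get(doc["id"], Permission.NONE)
        # Metadata is searchable from "browse" upward.
        in_metadata = level >= Permission.BROWSE and term in doc["keywords"]
        # Full text is searchable from "read" upward.
        in_text = level >= Permission.READ and term in doc["text"].lower()
        if in_metadata or in_text:
            hits.append(doc["id"])
    return hits


print(search("budget", user_levels))
print(search("logo", user_levels))
```

Note that d2 mentions “budget” in its text, yet a search for that word does not return it, because this user holds only the “browse” level on that file.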
Granular Permissions
In the Documentum ECMS, much of the control lies in the permission levels granted to users. Without sensible planning and management of user permissions, however, the system would become as dysfunctional as a traditional file server. Documentum helps manage permissions through “permission sets” and user “groups.” User groups can be created for users who would logically need access to similar files, and a given user can be added to and removed from a particular user group as needed (for example, when an employee is assigned to a new project). Permission sets can be created for each file, folder, or cabinet in a Docbase, defining which user groups should have which levels of access to each item. A permission set typically includes a permission level for a “world” group (anyone who has any kind of access to the Docbase). Permissions are set and changed by the “owner” of a file, who is typically the file’s creator, but could be a person to whom the ownership has been assigned. Like permission levels, permission sets have a hierarchical structure. If a given user belongs to several groups that have differing permissions for a given file, the user is granted the highest permission level among those groups.
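The rule that a user receives the highest level among their groups can be sketched as below. The group names, the sample permission set, and the effective_permission function are hypothetical, and I am assuming that the “world” entry applies to every user of the Docbase.

```python
from enum import IntEnum


class Permission(IntEnum):
    """The permission levels; the numbers only express their order."""
    NONE = 1
    BROWSE = 2
    READ = 3
    RELATE = 4
    VERSION = 5
    WRITE = 6
    DELETE = 7


# Hypothetical user groups for one SampleCo project.
groups = {
    "project-x-editors": {"Jane W.", "Mr. M"},
    "project-x-readers": {"Joe W.", "Jane W."},
}

# Hypothetical permission set for one file: group name -> level.
# Assumption: the "world" entry applies to every user of the repository.
permission_set = {
    "world": Permission.NONE,
    "project-x-readers": Permission.READ,
    "project-x-editors": Permission.WRITE,
}


def effective_permission(user, permission_set, groups):
    """Return the highest level granted by any group the user belongs to."""
    levels = [permission_set.get("world", Permission.NONE)]
    for group, level in permission_set.items():
        if user in groups.get(group, set()):
            levels.append(level)
    return max(levels)


for user in ("Jane W.", "Joe W.", "Ms. C"):
    print(user, effective_permission(user, permission_set, groups).name)
```

Jane belongs to both groups, so she ends up with the “write” level rather than “read.” Moving an employee to a new project then only requires changing group membership, not editing the permission set of every file.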