Implementing An Organizational Directory Service

5. Design the Database

The content and structure of the database will affect the way the directory is used. Aim to make the database as accessible (easily queried) and informative (rich in content) as possible. Weighed against this will be the need to facilitate the equally important directory management task. Balancing the two can be difficult, as putting more information in the database inevitably leads to a greater administrative overhead.

It is important to note that once the data model has been designed and the directory populated, it will require some effort to redesign and rebuild. This is even more true if the directory names it contains have already been recorded elsewhere, as any restructuring will invalidate those names. Thus, it is essential that a satisfactory design (there is probably no such thing as a perfect design) is found before the database is put forward as a service. Don't expect initial designs to fulfil the task - refinements will inevitably be made as experience is gained on the relevant issues and users have had a chance to give real feedback. Any data model you come up with should be assessed on the following factors:

Information content. Does it tell users what they want to know?
Usability. How easy will it be to search the database? Remember if search criteria are minimal then a combination of browsing and searching by the user may be required.
Manageability. How difficult will it be to maintain the database? If the directory hierarchy is complex, e.g. reflecting management structure, then keeping things up to date may become problematic.
Security policy. An involved access control policy might be difficult to implement and will result in extra management overhead.

User considerations should be paramount. Specifically, this means that the database should contain the data that people need, and that the hierarchy should be structured in a natural and logical way, so that users can both search and browse it easily. Service contact points of interest to external users could be held in a dedicated subtree in order to make browsing easier. The internal management hierarchy, which would be of internal use only, could be held in another.

The database design should also be strongly influenced by management and security issues. It may be advantageous to keep public and private information distinct in the database. This makes sense from the security and management perspectives as it eliminates inter-dependency. Implementation of the security policy in particular would be simplified by such an approach as access restrictions can then be applied prescriptively to relevant subsets of information.

The database is organized by a set of rules called the `directory schema'. The schema defines how entries in the tree are organized, the types of entry held and the types of information that they contain. The schema definition consists of a number of rules:

Entry Content. Entry types in the directory are defined by a set of associated object classes. Each object class tells the directory what attributes can be held in that entry. An entry of the person object class, for instance, must contain values for the surname and commonName attributes. The organizationalPerson object class allows an entry to contain, amongst other things, telephone numbers and postal addresses. A set of standard object classes exists, though this is extensible.
Entry Naming. Entries are named using one or more `distinguished' attributes from the entry. A personal entry will always be named using the commonName attribute as a minimum.
Structure Rules. Each node in the hierarchy is permitted to parent entries with any of a given set of object classes. This rule could be used to define, say, that person entries are only allowed to exist beneath departmental (organizationalUnit) entries. This enforces a consistent directory hierarchy.
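
The three kinds of rule above can be sketched in a few lines of Python. This is an illustration of the idea only, not the way any particular directory server stores or enforces its schema. The names `ObjectClass', `ALLOWED_PARENTS' and `check_entry' are hypothetical, and the optional attribute list given for `person' is a placeholder rather than the standard definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectClass:
    name: str
    must: frozenset = frozenset()  # attributes every entry of this class must hold
    may: frozenset = frozenset()   # attributes an entry of this class is allowed to hold


# Entry content rule. The mandatory attributes follow the text above;
# the optional list is a placeholder for illustration only.
PERSON = ObjectClass(
    "person",
    must=frozenset({"commonName", "surname"}),
    may=frozenset({"description"}),
)

# Structure rule (the example from the text): person entries may only
# exist beneath organizationalUnit entries. Classes not listed here are
# left unrestricted in this sketch.
ALLOWED_PARENTS = {"person": {"organizationalUnit"}}


def check_entry(object_class, attributes, naming_attribute, parent_class):
    """Return a list of schema violations; an empty list means the entry passes."""
    problems = []
    present = set(attributes)
    for missing in sorted(object_class.must - present):
        problems.append(f"missing mandatory attribute: {missing}")
    for extra in sorted(present - object_class.must - object_class.may):
        problems.append(f"attribute not permitted: {extra}")
    # Entry naming rule: the distinguished attribute must have a value.
    if naming_attribute not in present:
        problems.append(f"naming attribute has no value: {naming_attribute}")
    allowed = ALLOWED_PARENTS.get(object_class.name)
    if allowed is not None and parent_class not in allowed:
        problems.append(f"{object_class.name} not allowed beneath {parent_class}")
    return problems
```

Calling `check_entry(PERSON, {"commonName": ["Joe Soap"]}, "commonName", "organization")' would report both the missing surname and the misplaced parent, which is the kind of consistency the schema is there to enforce.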

5.1 Information Content

Start by listing the objects that the directory needs to represent. A White Pages service will contain personal entries. If service contact points are to be included then `role' entries should be added. Representing the organizational hierarchy will mean inclusion of departmental entries (organizational units). Locality entries will be needed if the organization has more than one site and each site is to be listed.

Entry Type	Key Object Class	Associated Attributes
Person	person	commonName, surname
Department, Office	organizationalUnit	organizationalUnitName
Place	locality	localityName
Organization	organization	organizationName
Role, Job	organizationalRole	commonName, roleOccupant

Table 5.1 Key Object Classes (the naming attribute is listed first)

Once this list has been formed you should go on to specify the data types that entries will support. Begin by defining the set of core object classes that will be used. Remember that the attributes that an entry can contain are specified by the object class of that entry (see section 2.2.2 for an explanation). Each entry will have a key object class that signifies the actual entry type, e.g. personal entries will be based around the `person' object class. The key object class is normally used to define an entry's naming attributes. Table 5.1 lists a set of common entry types, their corresponding key object classes and the attributes they define.

Further object classes will define the data properties of an entry. An example is the `labeledURIObject' object class, which is used to add support for World Wide Web URLs to an entry. Some supporting object classes can be applied generically to all entry types (`labeledURIObject' is an example). Some, however, only apply to given classes of entry. The `organizationalPerson' object class, which applies only to personal entries, is an example. Table 5.2 lists a few of the standard object classes, with associated attributes, that you should consider.

Table 5.2 Supporting Object Classes

Object Class	Additional Attributes
organizationalPerson	facsimileTelephoneNumber, telephoneNumber, postalAddress, description, businessCategory, seeAlso, userPassword
newPilotPerson	rfc822mailbox, roomNumber, userClass, homePhone, homePostalAddress, secretary, personalTitle, preferredDeliveryMethod, janetMailbox, otherMailbox, mobileTelephoneNumber, pagerTelephoneNumber, organizationalStatus, mailPreferenceOption, personalSignature
labeledURIObject	labeledURI

Table 5.3 Common Attribute Syntax

Syntax Name	Description	Typical Usage
CaseIgnoreString	A textual string supporting T.61 encodings for international characters. Case ignored when matching.	commonName, surname, organizationName
CaseIgnoreList	A list of strings each obeying the syntax and match rules of CaseIgnoreString.	
PrintableString	A printable string.	serialNumber
PostalAddress	Similar to CaseIgnoreList, except that only six lines are permitted and each line must be no longer than 32 characters.	postalAddress
NumericString	A string of numeric digits.	x121Address
CaseIgnoreIA5String	A string of ASCII characters.	rfc822Mailbox, domainComponent
DN	A directory name.	seeAlso, secretary, roleOccupant

You will now have a list of core data elements. Consideration should now be given to any further information not supported as standard by the directory. Some of this data, such as employee or payroll IDs, will have purely local scope. Other attributes are of general use but are nonetheless unavailable in the core schema. No attribute type in the recommended schema explicitly defines video conferencing numbers, for example. In order to handle unsupported attribute types you may have to define custom object classes and attribute types.

The contents of an attribute are structured according to an attribute syntax. The syntax defines the format of attribute values together with the rules used to match against that attribute when a search operation is performed. The `commonName' attribute has a syntax of `caseIgnoreString'. This means that the attribute can contain a standard set of character values, and that any matches against the attribute value will be performed without comparing case. Table 5.3 lists a few of the more common syntaxes, together with examples of the attributes that use them.
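
The effect of a syntax on matching can be sketched as a normalisation step applied to both the stored and the asserted value before they are compared. The sketch below is a simplification under stated assumptions: the function names are hypothetical, only two syntaxes are handled, and Python's `casefold' stands in for whatever case-insensitive comparison a real server applies.

```python
def normalise(syntax, value):
    """Reduce a value to the form in which it is compared for the given syntax."""
    if syntax == "caseIgnoreString":
        # Ignore case and collapse runs of whitespace.
        return " ".join(value.split()).casefold()
    if syntax == "numericString":
        # Spaces are not significant in a string of digits.
        return value.replace(" ", "")
    raise ValueError(f"syntax not handled in this sketch: {syntax}")


def values_match(syntax, stored, asserted):
    """True if the asserted value equals the stored value under the syntax."""
    return normalise(syntax, stored) == normalise(syntax, asserted)
```

Under these assumptions a search for `joe SOAP' would match a stored commonName of `Joe Soap', whereas the same comparison under a case-sensitive syntax would fail.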

To illustrate all of this let's consider a locally required attribute named `employeeID' which will contain every staff member's identifier (you might want this in the directory in order to correlate it with the personnel database). In order to add this attribute as an extension of the local schema you'll have to come up with the following information.

The syntax for the `employeeID' attribute. It is often adequate to use a generic syntax, such as `caseIgnoreString'. However, it may be better to use one that exactly describes the target data item. If the `employeeID' is just a number, then the `numericString' syntax should be used. It is possible to define new syntaxes, though these will require modifications to the directory server and user interface software.
The object class that attaches `employeeID' to an entry. We'll call this `localEmployee'. The object class has to define `employeeID' as either a mandatory or an optional attribute. If it's specified as mandatory then any staff entry created with the `localEmployee' object class has to contain a value for `employeeID'.
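
The two decisions above (which syntax, and mandatory or optional) can be made concrete with a small check. This is a sketch only: `check_local_employee' and `is_numeric_string' are hypothetical helpers, and the assumption that `employeeID' is purely numeric is taken from the example in the text rather than from any standard.

```python
DIGITS_AND_SPACE = set("0123456789 ")


def is_numeric_string(value):
    """True if the value is non-empty and holds only digits and spaces."""
    return bool(value) and set(value) <= DIGITS_AND_SPACE


def check_local_employee(attributes, employee_id_mandatory=True):
    """Check an entry's employeeID values against the localEmployee definition."""
    values = attributes.get("employeeID", [])
    problems = []
    if employee_id_mandatory and not values:
        problems.append("employeeID is mandatory for localEmployee entries")
    for value in values:
        if not is_numeric_string(value):
            problems.append(f"employeeID value is not a numericString: {value!r}")
    return problems
```

Declaring the attribute mandatory means an entry without it is rejected outright, which may be awkward if some staff records arrive before an identifier has been issued. Declaring it optional avoids that at the cost of allowing incomplete entries.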

Be aware that any attributes you define locally may not be supported by external directory services and thus will not necessarily be visible to external users. For this reason try not to extend the local schema if the attributes defined contain generally useful information, especially if that information can be shoehorned into standard attributes. Although not always appropriate, the employee identifier outlined above could be stored in the `uniqueIdentifier' attribute.

5.2 Organizing the Database Hierarchy

The design of the hierarchy is likely to be a trade off between the needs of the user and ease of data management. On the one hand the directory should contain all data required, be easily searchable and logical, and on the other information needs to be organized in such a way that the database is easily updated and, if necessary, merged with other sources. The security policy will also impact management aspects of the design.

The X.500 directory standard contains a recommended hierarchy. This structure assumes that organizations are divided into organizational units (departments) and/or localities. Entries for people and roles are then held beneath organizational unit entries. This model is common in directory services implemented to date. However, the suggested model was devised because it reflected real-world structure and, to a certain extent, came about with little emphasis on ease of management. In fact, several models can be used:

Organigram. Similar to the X.500 recommended model. Here the tree follows an organization's management structure.
Locality oriented. This also has roots in X.500. Here the tree follows the geographical distribution of the organization's sites.
Data oriented. Here the DIT is structured into data units, where each unit groups information on the basis of its function or management role within the directory.

Figure 5.1 Directory Hierarchy Based on Management Structure

The organigram model is structured around corporate management hierarchy. Upper levels consist of divisions, with subordinate departments below them (see Figure 5.1 for an example). Taking this to the furthest extent the tree could then consist of smaller management units.

This style of DIT is useful as directory names will contain information about the relevant position within the organization. This is especially beneficial for role entries, as the directory name of the entry will encode wider characteristics of the role (consider an entry with the RDN "CN=Manager"). This approach has associated drawbacks from the management perspective:

If many levels of organizational structure are reflected in the DIT then management overhead will increase as the database will be subject to greater change (due to movement of staff).

If the directory is to slave data from a personnel database then mapping entries back into the appropriate position in the management hierarchy will be difficult.
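
The first drawback can be seen by building directory names from a management path. The sketch below is illustrative only: `build_dn' is a hypothetical helper, the organization and unit names are invented, and no escaping of special characters is attempted.

```python
def build_dn(common_name, unit_path, organization, country):
    """Build a directory name; unit_path lists units from the top of the tree down."""
    rdns = [f"CN={common_name}"]
    # The most specific (deepest) unit comes first in the written name.
    rdns += [f"OU={unit}" for unit in reversed(unit_path)]
    rdns += [f"O={organization}", f"C={country}"]
    return ", ".join(rdns)


# Invented example: the same person before and after a departmental move.
before = build_dn("Joe Soap", ["Sales", "Exports"], "Example Ltd", "GB")
after = build_dn("Joe Soap", ["Sales", "Home Market"], "Example Ltd", "GB")
```

Here `before' and `after' differ, so any copy of the old name recorded elsewhere no longer resolves. The more levels of management structure the name encodes, the more often this happens.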

Figure 5.2 Locality Orientated Directory Hierarchy

A locality centred approach may be appropriate for geographically distributed organizations, especially multinationals. This approach has similar management properties to the organigram approach already outlined. It is, however, less helpful for users as directory names contain less useful information. For this reason a purely locational approach will probably have little real value, and some management structure is likely to be included, as shown in Figure 5.2.

This model is appropriate when the task of managing the directory is distributed in a similarly locality oriented fashion. In Figure 5.2, for example, data for the `London' office may be better defined and maintained by IT staff at that site. Organizing the hierarchy geographically would then simplify the task of assigning management responsibility.

Figure 5.3 Data Organised by Access Area

In the data oriented model information is organized according to use and access. Doing so can have benefits for manageability and also for usability. Separating private and public data in this way makes security mechanisms simpler to implement because access controls are easily applied to all entries in a subtree. In Figure 5.3 access control could be applied to the `ou=People' subtree (hide a particular attribute for unauthenticated access, for instance), with different restrictions applied to the `ou=Public Contacts' subtree.

The data oriented model has several handy features:

Listings of specific data are easily browsed and searched, as with the `ou=Public Contacts' area in Figure 5.3.
Data management responsibility can be assigned in a meaningful way. White Pages data (the `ou=People' subtree in Figure 5.3) and functional data (`ou=Public Contacts') could be maintained by different staff. Write restrictions enforcing this distribution of management are easily implemented using subtree-wide access controls.
Looking at Figure 5.3 again, the `ou=People' subtree can be updated in relatively straightforward fashion (i.e. by deleting the subtree and reloading from a master source).
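
Subtree-wide access control of the kind described above amounts to testing whether an entry's name ends with the name of a subtree root. The sketch below assumes comma-separated names with no escaped commas. The organization name, the rule table and the choice of `homePhone' as the hidden attribute are invented for illustration, and real products express such rules in their own access control syntax.

```python
def split_dn(dn):
    """Split a name into lower-cased RDNs (naive: no escaped commas)."""
    return tuple(rdn.strip().casefold() for rdn in dn.split(","))


def in_subtree(entry_dn, base_dn):
    """True if entry_dn is the subtree root base_dn or lies beneath it."""
    entry, base = split_dn(entry_dn), split_dn(base_dn)
    return len(entry) >= len(base) and entry[len(entry) - len(base):] == base


# Hypothetical rules: attributes hidden from unauthenticated users, per subtree.
RULES = [
    ("ou=People, o=Example Ltd, c=GB", {"homePhone"}),
    ("ou=Public Contacts, o=Example Ltd, c=GB", set()),
]


def visible_attributes(entry_dn, attributes, authenticated):
    """Return the attributes a user may read from one entry."""
    if authenticated:
        return dict(attributes)
    for base, hidden in RULES:
        if in_subtree(entry_dn, base):
            return {k: v for k, v in attributes.items() if k not in hidden}
    return {}  # entries outside every listed subtree are not shown at all
```

Because each rule is attached to a subtree rather than to individual entries, reloading `ou=People' from a master source does not disturb the access policy.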

5.3 User Accessibility

Aim to make the data as accessible as practically possible. This means that the database should be readable when browsing, and that the data should be amenable to ordinary search criteria. There are a number of steps that can be taken to accomplish these goals:

Friendly naming. Ensure that entries are named in a meaningful way. This ensures that users browsing the database know what they're looking at. In particular avoid the use of abbreviations and acronyms in names - these will be impenetrable to users who haven't encountered them before.
Use common variants. Ensure that entries contain variants of names that are likely to be used as search criteria. An entry named `cn=Joe Sidney Soap' should also contain values for `cn=J Soap', `cn=Joe Soap' and `cn=J S Soap'. This will increase the probability of the entry being located by a search.
Language variants. If the directory product you select supports different character sets then you may wish to store multi-language versions of the data. This is also useful where personal names contain extended characters, e.g. `Müller' versus `Muller'.
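
Name variants of the kind listed above are usually generated when the directory is loaded rather than typed in by hand. The following sketch shows one possible approach. The function names are hypothetical, the rules assume names written as given name(s) followed by surname, and stripping accents is only one crude way of producing an unaccented variant.

```python
import unicodedata


def name_variants(full_name):
    """Return the full name plus shortened forms likely to be searched for."""
    parts = full_name.split()
    if len(parts) < 2:
        return [full_name]
    given, middle, surname = parts[0], parts[1:-1], parts[-1]
    variants = [
        full_name,
        " ".join([given, surname]),
        " ".join([given[0], surname]),
        " ".join([given[0], *(m[0] for m in middle), surname]),
    ]
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(variants))


def unaccented(value):
    """Return the value with combining accents removed, e.g. for a second cn value."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(c for c in decomposed if not unicodedata.combining(c))
```

For the example in the text, `name_variants("Joe Sidney Soap")' yields `Joe Sidney Soap', `Joe Soap', `J Soap' and `J S Soap', each of which could be stored as an additional commonName value.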
