October 2005

Research Directory Services Architectures

   
Technology / Architecture

Understand the components and how they interact
There are three common components of an enterprise directory architecture: 1) the registry, which is a database of information about each entity of significance, current and historical; 2) the interface to consumer applications, usually an LDAP directory or an authentication service such as Kerberos; and 3) the metadirectory infrastructure, which controls the flow of information between systems of record, the enterprise directory components, and the consumer applications.

An enterprise directory is generally not a stand-alone service. Rather, it is a means of publishing institutional data in an easily accessible manner. As such, one or more systems of record will provide data for import into the directory. There may also be data that only exists in the directory. There will certainly be a number of users of the directory.

Below is a diagram of the core middleware in an integrated architecture. As noted in the picture, an enterprise directory comprises a number of services and processes and is typically more than one physical system.

[Figure: architecture diagram of the core middleware]

Data enters from the left, passes through a "join" process that merges the information under the correct identifiers, and is written to the person registry. The person registry is a database whose primary functions are identity management, reconciliation ("Is this person the same as that person?"), and cross-indexing ("Given this person's ID on system X, find their ID on system Y"). The person registry can also serve as a reference identifier for other systems. Other types of registries, such as organization registries or group registries, may also exist; registries in general are also referred to as metadirectories. Both directory and metadirectory products often come with person registries.
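The join, reconciliation, and cross-indexing functions described above can be illustrated with a small in-memory sketch. This is only an illustration under stated assumptions: the class and method names are hypothetical, the matching rule (family name, given name, and birth date) is a placeholder rather than a recommended reconciliation policy, and a production registry would be a database with far more careful matching.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Person:
    registry_id: int
    attributes: dict = field(default_factory=dict)
    source_ids: dict = field(default_factory=dict)  # system name -> ID on that system


class PersonRegistry:
    """Toy in-memory person registry (hypothetical; a real one is a database)."""

    def __init__(self, match_keys=("family_name", "given_name", "birth_date")):
        # match_keys is an assumed reconciliation rule, chosen only for illustration.
        self._match_keys = match_keys
        self._people = {}      # registry_id -> Person
        self._by_source = {}   # (system, source_id) -> registry_id
        self._next_id = count(1)

    def _find_match(self, record):
        """Reconciliation: is this person the same as one already known?"""
        key = tuple(record.get(k) for k in self._match_keys)
        if None in key:
            return None  # not enough data to match safely
        for person in self._people.values():
            if tuple(person.attributes.get(k) for k in self._match_keys) == key:
                return person
        return None

    def join(self, system, source_id, record):
        """Merge a source record under the correct registry identifier."""
        rid = self._by_source.get((system, source_id))
        person = self._people.get(rid) if rid else self._find_match(record)
        if person is None:
            person = Person(registry_id=next(self._next_id))
            self._people[person.registry_id] = person
        person.attributes.update(record)
        person.source_ids[system] = source_id
        self._by_source[(system, source_id)] = person.registry_id
        return person.registry_id

    def cross_index(self, from_system, source_id, to_system):
        """Given a person's ID on one system, find their ID on another."""
        rid = self._by_source.get((from_system, source_id))
        if rid is None:
            return None
        return self._people[rid].source_ids.get(to_system)
```

In this sketch, feeding the same person from an HR system and a student system yields a single registry identifier, and cross_index("hr", hr_id, "sis") then returns that person's student-system ID.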

Finally, not all institutions have a physical person registry. Some smaller schools, or those with simpler data feeds, may not need to address identifier reconciliation, or they can do it within the metadirectory intelligence before loading into the directory. This approach leaves open questions, including:

• Where are the unique identifiers assigned? In simpler cases, campuses use the identifier from the system of record as the UID.

• How will the campus assign identifiers and offer extended services to broader audiences, such as summer youth camp attendees or theater ticket purchasers? Having a separate person registry can be a place to keep these additional audiences separate from the systems of record and apply different aging/retaining rules to their data, if necessary.

For more information about person registries, see the Early Harvest Best Practices for Higher Education.

The data are then loaded into the physical directories used for authentication and for attribute and group services (represented in green) and served out to the applications. The other consumers could be application-specific or NOS-specific directories.

There are a number of questions to be considered, and they include:

• What are the data sources?

• How will the data be received (batch, real-time)?

• What are the data definitions?

• Are there attributes in the directory that will be updated from more than one source? If so, what are the rules to do the "join"?

• Will a metadirectory product be required to support the various data definitions and joins?

• How will directory services be made available to campus? Will there be a centralized service? Or will there be a series of distributed services?

• How much traffic is expected? Will the directory need to be replicated to support the traffic?

• How will the consuming applications or systems access the data? Do they need to be directly provisioned or can they access the directory directly?
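For the question about attributes updated from more than one source, one common way to express a join rule is a per-attribute precedence list. The sketch below assumes such a table; the source names, attribute choices, and ordering are hypothetical and would come from local data-stewardship decisions.

```python
# Hypothetical precedence table: for each attribute, sources are listed from
# most to least authoritative. Names are illustrative only.
PRECEDENCE = {
    "mail": ["email_system", "hr", "sis"],
    "displayName": ["hr", "sis"],
    "telephoneNumber": ["hr"],
}


def resolve_attributes(records, precedence=PRECEDENCE):
    """Pick one value per attribute from several source records.

    records maps a source name to that source's attribute dict.
    Attributes without a precedence rule are skipped rather than guessed.
    """
    resolved = {}
    for attr, sources in precedence.items():
        for source in sources:
            value = records.get(source, {}).get(attr)
            if value:  # ignore missing or empty values
                resolved[attr] = value
                break
    return resolved
```

With this rule, a telephone number supplied only by the student system is not published, because only the HR source is listed as authoritative for that attribute.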

For more information on this and schema design, see A Recipe for Configuring and Operating LDAP Directories.

Review campus technical infrastructure and requirements

Once the various directory service architectures are reviewed, the next step is to look at the current campus infrastructure. Which hardware platforms are already supported by the campus infrastructure? Supporting a new one will require significant human capital investment, through additional staff or training, or both. Will the network infrastructure be able to handle the required traffic? How big will the Directory (database) be, and how many copies will be in production simultaneously? Are the OS and disk technology needed to support the high-availability needs of the Directory, and the expertise in configuring and using that technology, already on campus? If an open-source system is considered, can the campus developers provide the level of support required? All of these factors will impose limitations on the choice of Directory Server.

The nearly universal acceptance of LDAP v.3 means that many of the major email and address book clients will communicate with any compliant directory product. However, there may be older clients that want to use the ph protocol, or finger, to read information from the directory. There are products available to translate these older protocols to LDAP, but they must be included in the overall project specification.

Research current higher-ed practices
Higher education institutions share a common set of problems and have a need for some amount of collaboration and data sharing to provide services to one another, as well as to our vendors and contractors. This implies the need for a common schema definition specific to the needs of higher education. A directory schema called eduPerson begins to address this. Make sure that eduPerson is defined in your schema for each person entry in the directory to enhance interoperability. Further examples include eduMember for including group membership and eduCourse for describing courses and course components. Every institution has local needs, so create a local schema with attributes defined by local requirements. For more information about local attributes other campuses have implemented, refer to the LocalDomainPerson Object Class Study.
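As an illustration of what a person entry carrying eduPerson might look like, the sketch below renders a minimal LDIF entry. The function name, the example.edu scope, and the base DN are assumptions for illustration; the attribute names (eduPersonPrincipalName, eduPersonAffiliation, eduPersonPrimaryAffiliation) are taken from the eduPerson object class. The sketch assumes plain ASCII values and does no LDIF escaping or base64 encoding, and some servers may require the full objectClass chain to be listed.

```python
def edu_person_ldif(uid, cn, sn, affiliations, primary,
                    scope="example.edu",
                    base_dn="ou=people,dc=example,dc=edu"):
    """Render one person entry as LDIF text (hypothetical helper).

    Assumes ASCII-safe values; real LDIF output needs escaping or base64.
    """
    # Sanity check used in this sketch: the primary affiliation should be
    # one of the listed affiliations.
    if primary not in affiliations:
        raise ValueError("primary affiliation must be one of the affiliations")
    lines = [
        f"dn: uid={uid},{base_dn}",
        "objectClass: inetOrgPerson",   # structural class providing cn, sn, uid
        "objectClass: eduPerson",       # auxiliary class adding eduPerson attributes
        f"uid: {uid}",
        f"cn: {cn}",
        f"sn: {sn}",
        f"eduPersonPrincipalName: {uid}@{scope}",
        f"eduPersonPrimaryAffiliation: {primary}",
    ]
    lines += [f"eduPersonAffiliation: {a}" for a in affiliations]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(edu_person_ldif("alee", "Ana Lee", "Lee", ["staff", "member"], "staff"))
```

Locally defined attributes would be added the same way, under a separate local object class, so that the eduPerson portion of the entry stays interoperable.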

Look for other campuses with a similar size population and funding model that have already implemented a directory or are in the process of doing so. Networking with others on some of the tough problems often helps. The vendors of the chosen product(s) should be able to provide references. There are campus affiliations by location (such as state organizations) and by commonality of purpose. Consult the national organizations as well: the user groups for various vendor products, and efforts such as Internet2 and EDUCAUSE, both of which have middleware resources.

Research security issues and models
Campuses need to protect the directory service systems, as well as the data contained in those systems, from security breaches. Data must also be secured during transit. The core LDAP specification allows for authenticated access to the directory, but does not by itself guarantee that data or passwords are encrypted on their way into or out of the directory, so transport protection must be planned explicitly. Will there be graduated levels of access (that is, will anonymous/public access entities see a certain set of attributes, certain authenticated users see an additional set of attributes, other authenticated users see a different additional set of attributes, etc.)? How will users of the directory be authenticated? Is there an existing authentication system (certificates, Kerberos, etc.) that can be leveraged? For more information about appropriate configurations of directories, see A Recipe for Configuring and Operating LDAP Directories.
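The idea of graduated access can be sketched as a mapping from access tier to the set of attributes that tier may read. The tier names and attribute sets below are hypothetical. In practice this policy is normally enforced by the directory server's own access controls; the sketch only illustrates the policy itself, not a way to implement it in application code.

```python
# Hypothetical tiers and attribute sets, for illustration only.
PUBLIC = {"cn", "mail"}
VISIBLE = {
    "anonymous": PUBLIC,
    "authenticated": PUBLIC | {"telephoneNumber", "eduPersonAffiliation"},
    "admin_office": PUBLIC | {"telephoneNumber", "eduPersonAffiliation", "postalAddress"},
}


def visible_view(entry, tier):
    """Return only the attributes of entry that the given tier may see."""
    # Unknown tiers fall back to the most restrictive (public) view.
    allowed = VISIBLE.get(tier, PUBLIC)
    return {name: value for name, value in entry.items() if name in allowed}
```

Defaulting unknown tiers to the public view is a deliberate choice in this sketch, so that a misconfigured client sees less rather than more.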

Review and decide on products
There are a number of commercial and open-source directory-service products available. In many cases, the actual directory server software is just one piece of the overall solution. Other products, such as delegated administration tools, may be implemented in addition to the directory server software. Decide on required additional functionality, such as multi-master replication. In some cases, products may require additional products in order to install (for example, a particular compiler version might be required for an open-source product) or to interoperate (will a tool be needed to do password sync between the existing security system and the directory?). Finally, hardware and OS platforms for the products will need to be selected. Some products will run on more than one platform. There may be constraints on performance or functionality if a second-tier platform is considered.

After all these issues are taken into consideration, the choice of products may be self-evident, based on issues of cost (purchase and support), functionality, and performance. It should be noted, however, that the selection committee should include representatives from the policy, data administration, and functional offices, as well as technical personnel. Consider that there may be product-loyal staff who will not be happy with some of the choices made by the project team.

Tools and Resources

Documents
For general information on middleware components, see Identifiers, Authentication, and Directories: Best Practices for Higher Education.

A Recipe for Configuring and Operating LDAP Directories outlines specific practices for directory design in the higher education sector.

Practices in Directory Groups offers ideas and methodologies for managing groups in directories, which for many campuses is the entry-level form of authorization.

Directory Schemas
eduCourse offers guidance for institutions interested in expressing course and course components in an LDAP directory.

eduMember offers a way to express groups in an LDAP directory.

eduPerson Object Class and accompanying LDIF files offer a directory person schema that, once installed, can be leveraged to serve inter-campus applications.

eduOrg Object Class and accompanying LDIF files offer a directory organization schema that, once installed, can be leveraged to serve inter-campus applications.

LocalDomainPerson Object Class Study highlights common attributes added to local directories across higher education.