DiGIR Provider Manual

Author: Dave Vieglais
Revision: 1.10
Date: 2003-05-20

Introduction

Distributed Generic Information Retrieval (DiGIR) is a client/server protocol for retrieving information from distributed resources. It uses HTTP as the transport mechanism and XML to encode the messages sent between client and server. It is an open source project hosted on SourceForge and is currently in a late beta stage of development. DiGIR was originally conceived as the replacement for the Z39.50 protocol used in the Species Analyst project, but it is intended to work with any type of information, not just natural history collections. A major contributor to DiGIR is the MaNIS project.

This document is formatted as "reStructuredText", which is plain text with some simple formatting rules. The Docutils tools can render this text to HTML and a variety of other formats.

Installation

This section provides detailed information on installing a DiGIR Provider service. The steps in the process are:

  1. Identify the machine on which the service will run
  2. Ensure PHP is functioning correctly on your web server
  3. Download and install the DiGIR provider code
  4. Copy DIGIR_ROOT/www/localconfig_dist.php to DIGIR_ROOT/www/localconfig.php and edit to reflect your local installation choices
  5. Use the DiGIR configurator to complete the process and add at least one resource: http://localhost/digir/admin/setup.php

Note

It is also possible to configure the service manually (without using the configurator). In this case you should follow these steps:

  1. Copy providerMeta.xml from DIGIR_ROOT/config/example to DIGIR_ROOT/config
  2. Edit providerMeta.xml to present appropriate metadata about your installation
  3. Copy resources.xml from DIGIR_ROOT/config/example to DIGIR_ROOT/config
  4. Create a database configuration file and save it in your configuration folder (there's a template you can use to create your file: DIGIR_ROOT/config/example/digir_test.xml)
  5. Update resources.xml to point to the new configuration file

After completing the process, the following URLs will be available:

Default page: http://localhost/digir or http://localhost/digir/index.php
DiGIR Provider service: http://localhost/digir/DiGIR.php
DiGIRm Service: http://localhost/digir/DiGIRm.php
DiGIR WMS Service: http://localhost/digir/DiGIR_wms.php
DiGIR Administration Interface: http://localhost/digir/admin

Note

The DiGIR Administration Interface, and the DiGIRm and DiGIR WMS services are experimental and are not yet completed.

Requirements

The following are required to operate a DiGIR Data Provider service:

Web Server
This can be any web server that supports PHP. Apache or Microsoft's IIS are both suitable candidates (my personal preference is Apache 2.x).
PHP
PHP is a cross platform web scripting language. Since the DiGIR data provider is written in PHP, you need a PHP interpreter for your system. DiGIR generally works best with the latest version of PHP. At the time of writing, the latest stable release version of PHP is "4.3.1" and DiGIR is known to work correctly with that version. Version 4.2.3 or later is required for all functionality of the DiGIR Data Provider.
Domain Name for Server

It is recommended that the machine running the DiGIR Provider service has a fully qualified domain name (FQDN). If you do not have an FQDN for your server, you can use one of the many Dynamic DNS services to register a name for it, regardless of whether you have a static or dynamic IP address. One free service that has proven to be particularly stable is DynDNS.

Note that if you only have a dynamically allocated IP address, you must use a Dynamic DNS service to name your machine; otherwise the location entered into the directory of DiGIR providers will become unusable once your IP address changes.

This document assumes installation on a functional web server with an operational PHP environment. Detailed instructions for installing PHP are available at the PHP website.

Setting Up the Provider Environment

Is PHP Working?

Warning

Once this test has been completed, it is a good idea to delete the test.php file that you create since the information it provides can be quite helpful to someone that may not have the best interests of the web server in mind.

If you are certain that PHP is working correctly then you can skip this step. In a location that is accessible by your web server (e.g. c:\inetpub\wwwroot or /var/www/html), create a short script called "test.php" using a text editor. The contents of the script should look like this:

<?php
phpinfo();
?>

Run the script by loading it with your web browser (e.g. http://localhost/test.php). You should see a page with lots of information about your PHP installation and the web server environment. Make sure that the version is at least 4.2.3. If the resulting page looks like the script you typed above, then there is a problem. This means that the web server is not correctly determining that PHP files should be processed with the PHP processing engine. Carefully review the steps you followed when setting up the PHP engine for your web server, and make sure that the script is operating correctly before proceeding with the installation of the DiGIR provider service.
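
The check described above can also be scripted. The following is a minimal sketch in Python, not part of the DiGIR distribution. It assumes that the phpinfo() page contains the text "PHP Version x.y.z" and that a web server which is not processing PHP returns the script source unchanged. The function name diagnose_phpinfo is hypothetical.

```python
import re

MINIMUM_PHP = (4, 2, 3)  # minimum version stated in this manual


def diagnose_phpinfo(body):
    """Classify the text returned when test.php is requested.

    Returns "unprocessed" if the raw script came back, "unknown" if no
    version string was found, and "too old" or "ok" otherwise.
    """
    if "<?php" in body and "phpinfo()" in body:
        return "unprocessed"
    # Assumption: the phpinfo() page contains "PHP Version x.y.z".
    match = re.search(r"PHP Version (\d+)\.(\d+)\.(\d+)", body)
    if match is None:
        return "unknown"
    version = tuple(int(part) for part in match.groups())
    return "ok" if version >= MINIMUM_PHP else "too old"
```

To use it, fetch http://localhost/test.php with any HTTP client (for example urllib.request.urlopen) and pass the decoded response body to diagnose_phpinfo.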

Note

Improper configuration of PHP can open significant security holes in your system. You should carefully examine the information provided in http://www.php.net/manual/en/security.php (especially http://www.php.net/manual/en/security.cgi-bin.php if you are considering a cgi-bin installation). If you are using a PHP-enabled hosting service, or are running a server shared by multiple users, you should probably consider everything placed in your DiGIR configuration files to be visible to all users on the machine, as it takes a good deal of care to make configuration files accessible to the web server but not to other users on the system.

PHP Configuration Directives

The PHP interpreter engine uses a configuration file to set default operating parameters. This file, php.ini, is a plain text file formatted like a Windows "ini" file. There are a few settings to check in the php.ini file. On Windows, php.ini is located in the %SystemRoot% folder (e.g. c:\windows). On Linux, it is located in /etc.

register_globals
This setting should be off. This setting does not alter the functionality of the DiGIR provider, but it does have a big impact on the security of your web server. See the discussion on php.net.
extension_dir

This entry indicates where PHP can find its extensions. These are compiled libraries which are dynamically loaded by the PHP interpreter. For the purposes of DiGIR, it should be a full path to the folder that contains the extension php_mbstring.*. If you installed PHP to c:\php then the extension folder setting is most likely:

extension_dir = c:/php/extensions
cgi.force_redirect
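If you are running PHP as a CGI under the IIS web server, then this must be 0 or Off (the comments in the php.ini file distributed with PHP state that this option must be turned off for IIS).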
extension=php_mbstring
The default configuration setting has this library commented out. If you are running PHP as a CGI application, then it is safe to leave it this way. If you are running PHP as a module loaded by the web server, then you must uncomment this entry so that it is loaded by the PHP interpreter, otherwise it will not be available to the DiGIR provider (and will cause error messages in the responses).
extension=php_sockets
This library, which implements low-level internet communication functions, is required by the DiGIRm service. If DiGIRm is disabled then there is no need to enable it. The default configuration setting has this library commented out. If you are running PHP as a CGI application, then it is safe to leave it this way. If you are running PHP as a module loaded by the web server, then you must uncomment this entry so that it is loaded by the PHP interpreter, otherwise it will not be available to the DiGIR provider (and will cause error messages in the responses).

There are numerous other settings in php.ini. Please review the php manual if you are curious about how these affect the operation of your PHP installation.
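
As an illustration of the checks above, the sketch below reads the text of a php.ini file and reports two of the settings discussed in this section. It is a stand-alone Python helper and not part of DiGIR. The function names are hypothetical, and the sketch assumes that register_globals counts as off when the directive is absent. The mbstring warning only matters when PHP runs as a web server module, as described above.

```python
def read_directives(ini_text):
    """Collect active 'name = value' lines from php.ini text.

    Lines starting with ';' are comments and '[...]' lines are section
    headers. A directive such as 'extension' may repeat, so every name
    maps to a list of values.
    """
    directives = {}
    for raw_line in ini_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(";") or line.startswith("["):
            continue
        name, separator, value = line.partition("=")
        if not separator:
            continue
        directives.setdefault(name.strip(), []).append(value.strip().strip('"'))
    return directives


def check_for_digir(ini_text):
    """Return a list of human-readable problems (empty if none were found)."""
    directives = read_directives(ini_text)
    problems = []
    # Assumption: a missing register_globals directive is treated as off.
    setting = directives.get("register_globals", ["off"])[-1].lower()
    if setting in ("1", "on", "true", "yes"):
        problems.append("register_globals should be off")
    extensions = directives.get("extension", [])
    if not any("php_mbstring" in entry for entry in extensions):
        problems.append("php_mbstring extension is not enabled")
    return problems
```

A typical call would be check_for_digir(open("/etc/php.ini").read()), adjusting the path to wherever php.ini lives on your system.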

DiGIR Files and Folders

The following examples are for a Windows installation. Please adjust as necessary for a Linux installation.

The distribution of the DiGIR Provider service will unzip to the following hierarchy. For example, if you unzip the distribution in C:\, a folder C:\DiGIRprov will be created along with all the subfolders described below. The term DIGIR_ROOT used elsewhere in this document refers to the file system path to the DiGIR folder (on Linux, this may be /var/www/DiGIRprov for example).

DiGIR
    admin             Location of administrative tools
    cache             Temporary files will be written here
    config            Configuration files
    doc               Documentation
    lib               Various PHP libraries
      adodb +         PHP ADODB database abstraction library
      pear +          PHP PEAR libraries
      xpath +         XPath interpreter library
    log               Log files written here
    www               The DiGIR Provider Services

The contents of these folders are described in more detail below. Also indicated is whether the folder should be browseable or writable. "Browseable" in this context means that scripts or pages contained within that folder may be retrieved via the web server when a web browser requests the URL of a document or script. Even though it will probably not hurt, it is generally not advisable to permit generation of "directory indexes" (file lists) when a directory rather than a file is requested.

Folder: admin
Browseable: Yes (controlled access)
Writable: No
Description: This folder is a placeholder for administrative tools currently under development. It will need to be accessible from the web to enable remote management of the DiGIR Provider service. It is important to use a security mechanism for restricting access to this folder (such as IP address restrictions or HTTP authentication).

Folder: cache
Browseable: No
Writable: Yes
Description: In order to improve performance, the DiGIR Provider service will write some information to disk. This folder should be writable by the process running the DiGIR Provider script (typically the user id of the web server). It should not be accessible by a web browser.

Folder: config
Browseable: No
Writable: Yes (if you want to use the DiGIR configurator)
Description: The config folder contains information about the configuration of the DiGIR Provider service. Some of the configuration files may contain sensitive information such as passwords for connecting to databases, and so this folder should be protected as necessary. The contents must be readable by the web server process, writable if you want to use the DiGIR configurator, but should not be browseable.

Folder: doc
Browseable: No (optional)
Writable: No
Description: Contains this document plus additional release notes and other documentation.

Folder: lib
Browseable: No
Writable: No
Description: Contains PHP libraries necessary for the operation of the DiGIR Provider service. The DiGIR Provider distribution should contain all the libraries required to operate, and will modify the include_path of your PHP installation during operation to override the settings in php.ini. Three sub-folders contain the PHP ADODB, a subset of PEAR, and XPath libraries. Each of these folders may have many sub-folders.

Folder: log
Browseable: No
Writable: Yes
Description: The DiGIR Provider service will record every transaction by appending information to a log file contained in this folder.

Folder: www
Browseable: Yes
Writable: No
Description: This folder contains the scripts that actually run the DiGIR Provider service. It should be accessible by web browsers, but should not be writable. Files contained in this folder are described in Appendix A.
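
The writability requirements listed above can be spot-checked with a short script. The following Python sketch is only an illustration and is not shipped with DiGIR. The function name is hypothetical, and the result reflects the permissions of the user running the script, which is not necessarily the user the web server runs as.

```python
import os


def check_writable_folders(digir_root):
    """Report whether the cache and log folders exist and are writable.

    The check uses the permissions of the current user. Run it as the
    web server user to get a meaningful answer.
    """
    report = {}
    for name in ("cache", "log"):
        path = os.path.join(digir_root, name)
        report[name] = os.path.isdir(path) and os.access(path, os.W_OK)
    return report
```

For the example installation in this document the call would be check_writable_folders("C:/DiGIRprov"). A value of False for either folder means the folder is missing or not writable.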

Web Server Configuration

This section provides an overview of how to configure the web server (Apache or IIS) to work with the DiGIR Provider service. In both cases, the installation is described for Windows systems. The Apache instructions will be quite similar for installation on a Linux box.

After following the example web server configurations below, you should be able to open a browser to http://localhost/digir/index.php. Several error messages will be displayed; this is normal at this stage of installation and is easily corrected by following the steps described below.

Apache

The primary configuration file for Apache is httpd.conf, located in APACHE_INSTALL_DIR\conf, which on Windows might be c:\program files\Apache Group\Apache2\conf, or in /etc/httpd/conf on Linux.

The file is plain text with pseudo-XML sections for configuring properties and functionality of the web server. You must read the manual to be certain of any changes you make to this file, otherwise you may inadvertently create a security hole that will be quickly breached.

An example configuration section for Apache 2 running mod_php with the DiGIR Provider service unpacked in the folder C:\var\ is shown below. Please check that the configuration settings are appropriate for your system if you are just copying the configuration information.

#PHP Configuration section - this portion enables the PHP interpreter


#The DiGIR administrative interface.  The settings allow access
#only from the localhost, meaning that you can only access the admin
# folder from the machine that is running the web server.  The admin
# folder is physically located at c:/var/DiGIRprov/admin, and is 
# accessible through the web browser at the address: http://localhost/digir/admin
Alias /digir/admin C:/var/DiGIRprov/admin
<Directory C:/var/DiGIRprov/admin>
      AllowOverride None
      Order deny,allow
      Deny from all
      Allow from 127.0.0.1
      DirectoryIndex index.html index.php
</Directory>

#Alias for the DiGIR Provider service.  The settings are for a
#provider that was installed to c:/var/DiGIRprov, with the default
#folder name (www) for the location of the DiGIR php scripts.
Alias /digir C:/var/DiGIRprov/www
<Directory C:/var/DiGIRprov/www>
      AllowOverride None
      Order allow,deny
      Allow from all
      DirectoryIndex index.html index.php
</Directory>

IIS

Assume IIS 5.x

TO BE COMPLETED

Configuration

Overview

Configuration of DiGIR involves editing of operating parameters for the service (by modifying localconfig.php), setting the description of the DiGIR Provider service (providerMeta.xml), and adding resources (connections to databases) by creating a database configuration file and adding an entry in the resource list file resources.xml. Descriptions of how these files should be modified are available in the sections that follow.

Note

You can also edit all xml configuration files using the DiGIR configurator, which is available inside the administrative tools directory: http://localhost/digir/admin/setup.php

For a new installation of a DiGIR Provider service, it will be necessary to create several required files. Examples of the necessary files are included with the distribution. Once the provider software is installed and the web server is properly configured to serve the DIGIR_ROOT/www folder, copy the default configuration files as follows and then update their settings to finalize the installation:

  1. Copy DIGIR_ROOT/www/localconfig_dist.php to DIGIR_ROOT/www/localconfig.php
  2. Copy DIGIR_ROOT/config/example/providerMeta.xml to DIGIR_ROOT/config
  3. Copy DIGIR_ROOT/config/example/resources.xml to DIGIR_ROOT/config

Once these steps have been completed, the DiGIR Provider service will be operational, but will not yet be serving content from a database.

Operating Parameters (localconfig.php)

DiGIR operational defaults are controlled by defining constants in the file localconfig.php, which override the default values defined in DiGIR_globals.php. It should not be necessary to edit any other file in the DIGIR_ROOT/www folder in order to have a functional default DiGIR provider installation.

The default settings are helpful for debugging and configuring an installation, however they are not optimal for a fully configured provider. The additional information these settings provide can also reduce the security of your installation. The following default values should be changed once your installation is functioning satisfactorily.

New Installation

Using a web browser, open the url to the location where you installed the DiGIR provider service. For the example installation in this document, this would be:

http://localhost/digir/index.php

You should see an error message something like the following:

Fatal Error due to a configuration problem. The file:
  localconfig.php
must exist in the same folder as DiGIR.php
If this is a new installation, copy
  www/localconfig_dist.php
 to
  www/localconfig.php

To resolve this problem, copy the file localconfig_dist.php to localconfig.php.

Other Commonly Modified Settings

DIGIR_CONFIG_DIR

Name:DIGIR_CONFIG_DIR
Default:../config/
Example:define('DIGIR_CONFIG_DIR','c:/my_secure_folder/DiGIR/config/');

This entry identifies the path to the folder which contains all the configuration files. Note the use of forward slashes as path separators.

DIGIR_CACHE_DIRECTORY

Name:DIGIR_CACHE_DIRECTORY
Default:../cache/
Example:define('DIGIR_CACHE_DIRECTORY','c:/temp/DIGIRcache');

This is the location of the cache files. It is ok to delete content from this folder any time since DiGIR will simply rebuild the files as necessary, and deleting the files is the best way to force DiGIR to reload cached information (such as after editing provider metadata). Note that the size of the cache can be fairly large depending on the popularity of your DiGIR Provider service. On some systems, it is probably better to locate the cache in your regular temp folder as shown in the example.

Important: The path defined in this entry must exist and be writable by the web server service.

DIGIR_LOG_NAME

Name:DIGIR_LOG_NAME
Default:log.txt
Example:define('DIGIR_LOG_NAME',strftime('digir_%d%m%Y.txt'));

Specifies the name of the log file for the DiGIR Provider service. Note that this specifies the file name only. To set the name of the folder that contains the log files, use DIGIR_LOG_PATH. The example shows how to create a log that uses a new file for every day.
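
To see which file names the pattern in the example produces, the following Python sketch expands the same format string. Python's strftime accepts the same %d, %m and %Y codes as PHP's. The function name log_name is hypothetical, and the second pattern is only a suggested alternative, not a DiGIR default.

```python
from datetime import date


def log_name(day, pattern="digir_%d%m%Y.txt"):
    """Expand a strftime pattern into the log file name for a given day."""
    return day.strftime(pattern)


# The pattern from the example puts the day first:
print(log_name(date(2003, 5, 20)))                      # digir_20052003.txt
# A year-month-day pattern makes file names sort chronologically:
print(log_name(date(2003, 5, 20), "digir_%Y%m%d.txt"))  # digir_20030520.txt
```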

DIGIR_LOG_PATH

Name:DIGIR_LOG_PATH
Default:../log/
Example:define('DIGIR_LOG_PATH','c:/logs/digir');

Sets the name of the folder that will contain the log files for the DiGIR Provider service. This folder must exist, and be writable by the process running the PHP script.

DIGIR_STATUSPICKLE

Name:DIGIR_STATUSPICKLE
Default:DIGIR_LOG_PATH/DiGIRStatus.txt
Example:define('DIGIR_STATUSPICKLE','c:/temp/digir/status.txt');

The name of the file that is used to cache DiGIR Provider service status information between requests. This file must be in a location that is writable by the process running the PHP scripts.

Provider Metadata (providerMeta.xml)

Server metadata describes the DiGIR Provider service and provides information on who to contact for more information about the service. The server metadata is defined in the file DIGIR_ROOT/config/providerMeta.xml. This is an XML file, so the normal rules for XML documents apply: element and attribute names are case sensitive, and special characters such as ">", "<", "&", and quotes must be escaped.
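
The escaping rules can be illustrated with Python's standard library. This is a sketch for illustration only. The helper names element and attribute are hypothetical, and the sample text is invented.

```python
from xml.sax.saxutils import escape


def element(tag, text):
    """Wrap text in an element, escaping &, < and > in the content."""
    return "<%s>%s</%s>" % (tag, escape(text), tag)


def attribute(name, value):
    """Format a double-quoted attribute, additionally escaping double quotes."""
    return '%s="%s"' % (name, escape(value, {'"': "&quot;"}))


print(element("name", "Fish & Wildlife <Kansas>"))
# <name>Fish &amp; Wildlife &lt;Kansas&gt;</name>
```

The attribute helper produces the same kind of &quot; escaping that appears in the Microsoft Access connection string example in the section on resource database connections.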

New Installation

After creating a localconfig.php file reload the default page again in a web browser:

http://localhost/digir/index.php

An error message is generated that looks something like the following:

Provider Configuration Error
The following problems have been identified with this DiGIR
Provider installation:

No Provider Metadata
The DiGIR Provider service metadata file (providerMeta.xml) could not 
be found at the expected location
(C:\var\phpdev\DiGIRprov\config\providerMeta.xml).

If this is a new installation, you can copy the file 
C:\var\phpdev\DiGIRprov\config\example\providerMeta.xml to 
C:\var\phpdev\DiGIRprov\config\providerMeta.xml to get started.

...

To resolve this problem, copy the files providerMeta.xml and resources.xml from the config/example folder to the config folder.

If you run the default page again by reloading it in the browser, you should see a status page that lists the services that have been enabled and their load indicators (which should all be zero).

Editing Provider Metadata

name

Element:<name>
Attributes:None
Repeatable:No
Example:<name>The Natural History Museum and Biodiversity Research Center, University of Kansas</name>

The name element should contain the full name of the institution or organization responsible for operating this instance of the DiGIR Provider service.

code

Element:<code>
Attributes:None
Repeatable:No
Example:<code>KUNHM</code>

This value should provide a unique identifier (e.g. abbreviation) for the entity identified in the <name> element.

relatedInformation

Element:<relatedInformation>
Attributes:None
Repeatable:Yes
Example:<relatedInformation>http://www.nhm.ku.edu</relatedInformation>

This element contains a reference to where further information about the provider service and/or the institution hosting the provider may be found. This element may be repeated so that multiple references can be cited. The reference should be in the form of a URL.

contact

Element:<contact>
Attributes:type, can be "administrative" or "technical"
Repeatable:Yes
Example:below
<contact type="technical">
  <name>Name of a technical contact</name>
  <title>Title of contact</title>
  <emailAddress>admin@nowhere.com</emailAddress>
  <phone>full phone number - including country code</phone>
</contact>

This element contains information about who should be contacted for details about the provider service (administrative contact) or for technical aspects of the provider operation (technical contact). This element contains several sub-elements.

contact/name

Element:<contact><name>
Attributes:None
Repeatable:No
Example:<name>Dave Vieglais</name>

The name of a person or group that expects to be contacted about the role identified by the contact type attribute, and also the location of the contact element (contact elements also occur in resource metadata).

contact/title

Element:<contact><title>
Attributes:None
Repeatable:No
Example:<title>Senior Scientist</title>

Title of the contact person or group.

contact/emailAddress

Element:<contact><emailAddress>
Attributes:None
Repeatable:No
Example:<emailAddress>vieglais(at)ku(dot)edu</emailAddress>

Obfuscated form of the email address for the contact. It is recommended that the email address is not written in its syntactically correct form, as this information may be harvested by robots and hence be subject to inappropriate or unexpected use.
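
One way to produce the form shown in the example is a simple substitution. The Python sketch below is an illustration only. The function name is hypothetical, and DiGIR does not mandate any particular obfuscation scheme.

```python
def obfuscate_email(address):
    """Rewrite an address in the '(at)' and '(dot)' style of the example."""
    return address.replace("@", "(at)").replace(".", "(dot)")


print(obfuscate_email("vieglais@ku.edu"))  # vieglais(at)ku(dot)edu
```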

contact/phone

Element:<contact><phone>
Attributes:None
Repeatable:Yes
Example:<phone>+1 785 864 4504</phone>

A telephone number that may be used to contact the person or group. The number should include relevant country and area codes.

abstract

Element:<abstract>
Attributes:None
Repeatable:No
Example:<abstract>This provider provides access to publicly available information. ...</abstract>

The abstract should provide a description of the service suitable for presenting to portal users.

Resource Configuration

Resource configuration affects at least two configuration files. The file resources.xml contains a list of resource names and, for each, the name of the file that contains the associated configuration information. That resource configuration file contains metadata describing the resource, database connection information, and the mapping of concepts to database columns.

The resource configuration files are XML files that define:

  1. Metadata about the resource
  2. Database connection information
  3. Database schema representation
  4. Mapping between database columns and conceptual schema elements

Provider Resource List (resources.xml)

The resource list file resources.xml is a simple XML file that associates a resource name with a configuration file. Its structure is best described by example:

<resources>
  <resource name="test" configFile="test_resource.xml"/>
</resources>

This resource list file has one entry that indicates that the resource identified with the name "test" has configuration information located in the file "test_resource.xml".

Note that all element and attribute names are case sensitive. The name of the resource is case sensitive. Case sensitivity of the file name is dependent on your operating system.
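
To confirm that a resource list is well formed and to see which configuration file each resource points to, the file can be read with a few lines of Python. This is a sketch for checking your own files, not part of the provider. The function name list_resources is hypothetical.

```python
import xml.etree.ElementTree as ET


def list_resources(xml_text):
    """Map each resource name to its configuration file name.

    Raises xml.etree.ElementTree.ParseError if the text is not
    well-formed XML.
    """
    root = ET.fromstring(xml_text)
    return {entry.get("name"): entry.get("configFile")
            for entry in root.findall("resource")}


sample = """<resources>
  <resource name="test" configFile="test_resource.xml"/>
</resources>"""
print(list_resources(sample))  # {'test': 'test_resource.xml'}
```

Because element and attribute names are case sensitive, an entry written as <Resource> or with a configfile attribute would not be picked up by this lookup.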

New Installation

After creating a localconfig.php file, load the default page in a web browser:

http://localhost/digir/index.php

If the file "resources.xml" cannot be found, then an error message that looks something like the following is generated:

...

No Provider Resource List
The DiGIR Provider service resources file (resources.xml) could not 
be found at the expected location 
(C:\var\phpdev\DiGIRprov\config\resources.xml).

If this is a new installation, you can copy the file 
C:\var\phpdev\DiGIRprov\config\example\resources.xml to 
C:\var\phpdev\DiGIRprov\config\resources.xml to get started.

To resolve this problem, copy the files providerMeta.xml and resources.xml from the config/example folder to the config folder.

Resource Metadata

How to edit metadata about a resource

TO BE COMPLETED

Resource Database Connection

How to configure a resource database connection.

<configuration>
  ...
  <datasource 
     type="SQL"
     constr="Provider=Microsoft.JET.OLEDB.4.0;Data Source=&quot;c:\data\testdb.mdb&quot;"
     uid=""
     pwd=""
     database=""
     dbtype="ado_access"
     encoding="utf-8">
  </datasource>
  ...
</configuration>

The <datasource> element has the following attributes, most of which can be mapped directly to database connection parameters of the PHP ADODB Library that provides the database connection library used by the DiGIR provider:

type

This value can be "SQL" or "Z3950".

Note that the Z3950 support is partially complete and is not recommended for general use.

SQL database types can be any relational database that is supported by the PHP ADODB Library. Since this library also supports ODBC and Microsoft's ADO, almost every type of relational database that supports the SQL language can be used as a data source for the DiGIR Provider.

constr

The value specified here is used as the server parameter of the PHP ADODB database driver connection. For Microsoft ADO (OLE/DB) type connections, this will be the "Connection String" parameter normally used to connect to the database from Win32 applications that use the Microsoft ADO libraries.

uid

The user ID used to connect to the database server. Note that the account information used to connect to the database is stored in clear text in the configuration file. You should therefore take appropriate measures to ensure that the file is not generally accessible (although it must be readable by the web server process). Preferably, the account should also have just enough database privileges to read the data required by the DiGIR Provider service and no more. Most importantly, this account should not have write access to the database.

pwd

The password used to connect to the database server. See the security notes in the description of uid above.

dbtype

Corresponds to the name of the PHP ADODB database driver. Possible values of dbtype are enumerated below.

TO BE COMPLETED

encoding

Specifies the character encoding of content that the database driver will provide to the DiGIR Provider code (this is not necessarily the same as the actual character encoding used by the database).

{{cross reference to php manual entry mb_string }}

TO BE COMPLETED

Example Database Connections

MySQL

The MySQL database testdb hosted on the server dataserver.mydomain.edu, accessible through the account readonlyAccount with password s#k1_01WZ:

<datasource 
   type="SQL"
   constr="dataserver.mydomain.edu"
   uid="readonlyAccount"
   pwd="s#k1_01WZ"
   database="testdb"
   dbtype="mysql"
   encoding="utf-8">
</datasource>

Microsoft Access

Microsoft Access database located at c:\data\testdb.mdb, connecting using the Microsoft OLEDB database libraries:

<datasource 
   type="SQL"
   constr="Provider=Microsoft.JET.OLEDB.4.0;Data Source=&quot;c:\data\testdb.mdb&quot;"
   uid=""
   pwd=""
   database=""
   dbtype="ado_access"
   encoding="utf-8">
</datasource>

ODBC

Database connection defined by an ODBC System DSN called "testdb":

<datasource 
   type="SQL"
   constr="testdb"
   uid=""
   pwd=""
   database=""
   dbtype="odbc"
   encoding="utf-8">
</datasource>
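
When editing connection settings by hand it can help to read the attributes back and confirm that they parse as intended, for example that &quot; in constr is decoded to a plain double quote. The Python sketch below does this for a resource configuration document. It is an illustration only. The function name read_datasource is hypothetical, and the sketch assumes that <datasource> is a direct child of <configuration>, as in the listings above.

```python
import xml.etree.ElementTree as ET


def read_datasource(config_xml):
    """Return the attributes of the <datasource> element as a dictionary.

    Assumption: <datasource> is either the root element or a direct
    child of the root element.
    """
    root = ET.fromstring(config_xml)
    node = root if root.tag == "datasource" else root.find("datasource")
    if node is None:
        raise ValueError("no <datasource> element found")
    return dict(node.attrib)
```

For the ODBC example above, read_datasource would return a dictionary in which constr is "testdb" and dbtype is "odbc".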

Microsoft SQL Server

TO BE COMPLETED

Resource Table Mapping

The table mapping section of the resource configuration file describes the relationships between two or more tables in the database from the point of view of an individual conceptual schema record. In the following example, the root of the record is contained in the table rootTable and has a unique row identifier available in the column rowid.

<configuration>
  ...
  <table name="rootTable" key="rowid">
    <table name="child1" key="child1field" join="parentfield1" />
    <table name="child2" key="child2field" join="parentfield2">
      <table name="subchild1" key="subchildField" join="child2joinField" />
    </table>
  </table>
  ...
</configuration>

The <table> element has the following attributes:

name
The name of the table contained in the database. This value is not case sensitive.
key
The name of a column in the table that provides a unique identifier for each row. This field is used internally by the DiGIR provider code when necessary to retrieve unique rows from the table, and so this field may or may not participate in a conceptual schema.
join
The name of the field in the parent table that this table will join with. In the example above, the table child1 is joined with the parent table rootTable by the fields child1field in the table child1 and parentfield1 in rootTable.
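
The nesting of <table> elements can be read as a set of join conditions. The following Python sketch walks the example above and prints one condition per child table, following the description of the join attribute given here (the child's key column matched against the named field of its parent). It only illustrates how the mapping reads. It is not the SQL that the provider actually generates, and the function name join_conditions is hypothetical.

```python
import xml.etree.ElementTree as ET


def join_conditions(table, parent=None):
    """List 'child.key = parent.join' conditions for a nested <table> tree."""
    conditions = []
    if parent is not None:
        conditions.append("%s.%s = %s.%s" % (
            table.get("name"), table.get("key"),
            parent.get("name"), table.get("join")))
    for child in table.findall("table"):
        conditions.extend(join_conditions(child, table))
    return conditions


mapping = ET.fromstring("""
<table name="rootTable" key="rowid">
  <table name="child1" key="child1field" join="parentfield1" />
  <table name="child2" key="child2field" join="parentfield2">
    <table name="subchild1" key="subchildField" join="child2joinField" />
  </table>
</table>""")
for condition in join_conditions(mapping):
    print(condition)
```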

Conceptual Schema Mapping

TO BE COMPLETED

Testing

Testing a provider installation can be broken into several areas:

  1. Checking that the Provider is interacting correctly with the web server environment
  2. Checking that necessary libraries and configuration files can be located
  3. Ensuring that cache and log directories are writable by the DiGIR scripts
  4. Checking that configuration files are formatted correctly and contain the necessary information
  5. Checking that resource configuration files correctly map the resource to a conceptual schema

Web Server Environment

Tests for checking the web server environment.

TO BE COMPLETED

Library and Configuration File Locations

Basic Configuration - Browse to the index.php file. This will report problems with localconfig.php, providerMeta.xml and resources.xml. For example:

http://localhost/digir/index.php

TO BE COMPLETED

Cache and Log Folder Accessibility

Execute the script DiGIR_globals.php. This script will check the values of the service operating parameters (defined in localconfig.php and DiGIR_globals.php). It will also check that the DIGIR_CACHE_DIRECTORY and DIGIR_LOG_PATH are writable.

TO BE COMPLETED

Operation

This section provides an overview of the operation of a DiGIR Provider service.

The DiGIR Provider service is a service application that has no implicit user interface, but rather is intended for machine to machine communications. As such, human interaction with the operation of a DiGIR Provider service is generally limited to monitoring the status of the provider through review of the dynamic status information, the log files, and any diagnostic information that may be reported by users of the service.

Retrieving Metadata

How to retrieve metadata from a DiGIR Provider service.

Metadata is the default response of a DiGIR Provider service, and hence can be invoked without any additional parameters. An example of a formal request for metadata is provided in the static html page eg_metadata.htm which is located in the DIGIR_ROOT/www folder. The text of a formal metadata request is repeated below:

<request 
    xmlns='http://digir.net/schema/protocol/2003/1.0' 
    xmlns:xsd='http://www.w3.org/2001/XMLSchema' 
    xmlns:digir='http://digir.net/schema/protocol/2003/1.0'>
  <header>
    <version>1.0.0</version>
    <sendTime>20030421T170441.431Z</sendTime>
    <source>127.0.0.1</source>
    <destination resource='test'>http://localhost/digir/DiGIR.php</destination>
    <type>metadata</type>
  </header>
</request>
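
Because metadata is the default response, a plain HTTP GET of the provider endpoint with no parameters should be enough to retrieve it. The Python sketch below illustrates this under that assumption. The function name and the injectable opener argument are illustrative choices, and the placeholder body used in the offline demonstration is not a real DiGIR response.

```python
import io
import urllib.request


def fetch_metadata(endpoint, opener=urllib.request.urlopen):
    """GET the endpoint with no parameters and return the response body as text."""
    with opener(endpoint) as response:
        # DiGIR documents are UTF-8 encoded (see Appendix C).
        return response.read().decode("utf-8")


# Offline demonstration with a stand-in opener, so nothing is sent over the
# network. The body is a placeholder, not an actual provider response.
fake_opener = lambda url: io.BytesIO(b"<placeholder/>")
print(fetch_metadata("http://localhost/digir/DiGIR.php", opener=fake_opener))
```

Against a live installation, call fetch_metadata with only the endpoint URL so that the default opener performs the request.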

Elements of Metadata Request Document

version
The version tag in a request is meant to indicate the version of the DiGIR protocol that is being used in the communications.
sendTime
A time stamp that records the time at which the request document is transmitted to the intended target. This information is intended to be used as a measure of network latency and performance, and so from a practical point of view, the time stamp should be created as close as is practical to the time the message is actually sent. The time stamp must include time zone information to be valid.
source
The source element contains the IP address of the machine from which the request originated. For example, if a client is accessing a DiGIR Provider via a DiGIR portal with a web browser, then the source element should contain the IP address of the client's web browser. The portal will normally retrieve this information from the internet connection parameters (e.g. the CGI environment variable REMOTE_ADDR).
destination
The destination element indicates the intended target of the message. This information could be used by DiGIR routers or proxies to forward requests to the intended target.
type
Indicates the type of operation being invoked. This can be one of metadata, search, or inventory.
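
The header elements described above can also be assembled programmatically. The following Python sketch builds a metadata request with the standard library; the function names are illustrative and not part of DiGIR. It declares only the default protocol namespace and omits the xsd and digir prefix declarations shown in the example document, on the assumption that they are not required when no element uses them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

DIGIR_NS = "http://digir.net/schema/protocol/2003/1.0"


def send_time(now=None):
    """Format a UTC time stamp in the style 20030421T170441.431Z."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    millis = now.microsecond // 1000
    return now.strftime("%Y%m%dT%H%M%S") + f".{millis:03d}Z"


def build_metadata_request(destination, resource, source, when=None):
    """Return a metadata request document as a string."""
    # Serialize the protocol namespace as the default namespace (no prefix).
    ET.register_namespace("", DIGIR_NS)

    def q(tag):
        return f"{{{DIGIR_NS}}}{tag}"

    request = ET.Element(q("request"))
    header = ET.SubElement(request, q("header"))
    ET.SubElement(header, q("version")).text = "1.0.0"
    # Create the time stamp as late as practical, just before sending.
    ET.SubElement(header, q("sendTime")).text = send_time(when)
    ET.SubElement(header, q("source")).text = source
    dest = ET.SubElement(header, q("destination"), resource=resource)
    dest.text = destination
    ET.SubElement(header, q("type")).text = "metadata"
    return ET.tostring(request, encoding="unicode")


print(build_metadata_request("http://localhost/digir/DiGIR.php", "test", "127.0.0.1"))
```

A portal would normally pass the address of the originating client as the source argument rather than its own address.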

Inventory of Resource

How to retrieve an inventory for a concept within a resource.

The static html file eg_inventory.htm located in the DIGIR_ROOT/www folder provides an example of an inventory request on the concept dwc:Species for the resource named test.

Resource name
test
Schema name space
http://digir.net/schema/conceptual/darwin/2003/1.0
Inventory concept
dwc:Species
Return record count
Yes

<request
  xmlns='http://digir.net/schema/protocol/2003/1.0' 
  xmlns:xsd='http://www.w3.org/2001/XMLSchema' 
  xmlns:digir='http://digir.net/schema/protocol/2003/1.0'>
<header>
  <version>1.0.0</version>
  <sendTime>20030421T170441.431Z</sendTime>
  <source>127.0.0.1</source>
  <destination resource='test'>http://localhost/digir/DiGIR.php</destination>
  <type>inventory</type>
</header>
<inventory xmlns:dwc='http://digir.net/schema/conceptual/darwin/2003/1.0'>
  <dwc:Species />
  <count>true</count>
</inventory>
</request>
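
The inventory element of such a request can be generated rather than written by hand. The Python sketch below builds only the inventory body for one concept; the function name is illustrative, and the assumption that a count value of "false" suppresses the record count is not confirmed by this manual.

```python
import xml.etree.ElementTree as ET

DIGIR_NS = "http://digir.net/schema/protocol/2003/1.0"
DWC_NS = "http://digir.net/schema/conceptual/darwin/2003/1.0"


def build_inventory_body(concept, count=True):
    """Return the <inventory> element for one concept in the dwc namespace."""
    # Serialize the protocol namespace as the default and Darwin Core as "dwc".
    ET.register_namespace("", DIGIR_NS)
    ET.register_namespace("dwc", DWC_NS)
    inventory = ET.Element(f"{{{DIGIR_NS}}}inventory")
    # The concept to inventory appears as an empty element, e.g. <dwc:Species />.
    ET.SubElement(inventory, f"{{{DWC_NS}}}{concept}")
    # Assumption: "false" is accepted as the opposite of "true" here.
    ET.SubElement(inventory, f"{{{DIGIR_NS}}}count").text = "true" if count else "false"
    return inventory


print(ET.tostring(build_inventory_body("Species"), encoding="unicode"))
```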

Retrieving Records from Resource

An example of a search request document is shown below. This search returns records based on the criteria:

Resource Name
test
Schema namespace
http://digir.net/schema/conceptual/darwin/2003/1.0
Filter
ScientificName LIKE 'f%'
Record structure
Schema located at: http://digir.sourceforge.net/schema/conceptual/darwin/brief/2003/1.0/darwin2brief.xsd
Start record index
0
Maximum number of records
10
Return count of records?
Yes

<request
  xmlns="http://digir.net/schema/protocol/2003/1.0" 
  xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
  xmlns:darwin="http://digir.net/schema/conceptual/darwin/2003/1.0" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xsi:schemaLocation="http://digir.net/schema/protocol/2003/1.0 
    http://digir.sourceforge.net/schema/protocol/2003/1.0/digir.xsd 
    http://digir.net/schema/conceptual/darwin/2003/1.0 
    http://digir.sourceforge.net/schema/conceptual/darwin/2003/1.0/darwin2.xsd">
<header>
  <version>1.0</version>
  <sendTime>2003-03-09T19:14:58-05:00</sendTime>
  <source>216.91.87.102</source>
  <destination resource="test">http://localhost/DiGIR/DiGIR.php</destination>
  <type>search</type>
</header>
<search>
  <filter>
    <like>
      <darwin:ScientificName>f%</darwin:ScientificName>
    </like>
  </filter>
  <records limit="10" start="0">
    <structure schemaLocation="http://digir.sourceforge.net/schema/conceptual/darwin/brief/2003/1.0/darwin2brief.xsd"/>
  </records>
  <count>true</count>
</search>
</request>
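
The search element above maps directly onto the listed criteria. As a minimal Python sketch, the function below builds the search body from those criteria; the function and parameter names are illustrative, and it covers only a single LIKE comparison on one concept.

```python
import xml.etree.ElementTree as ET

DIGIR_NS = "http://digir.net/schema/protocol/2003/1.0"
DARWIN_NS = "http://digir.net/schema/conceptual/darwin/2003/1.0"
BRIEF_STRUCTURE = (
    "http://digir.sourceforge.net/schema/conceptual/darwin/brief/2003/1.0/darwin2brief.xsd"
)


def q(ns, tag):
    """Build a namespace-qualified tag name for ElementTree."""
    return f"{{{ns}}}{tag}"


def build_search_body(concept, pattern, start=0, limit=10,
                      structure_url=BRIEF_STRUCTURE, count=True):
    """Return a <search> element with one LIKE filter on the given concept."""
    ET.register_namespace("", DIGIR_NS)
    ET.register_namespace("darwin", DARWIN_NS)
    search = ET.Element(q(DIGIR_NS, "search"))
    # Filter: <concept> LIKE <pattern>, e.g. ScientificName LIKE 'f%'.
    flt = ET.SubElement(search, q(DIGIR_NS, "filter"))
    like = ET.SubElement(flt, q(DIGIR_NS, "like"))
    ET.SubElement(like, q(DARWIN_NS, concept)).text = pattern
    # Paging window and the schema that defines the record structure.
    records = ET.SubElement(search, q(DIGIR_NS, "records"),
                            limit=str(limit), start=str(start))
    ET.SubElement(records, q(DIGIR_NS, "structure"), schemaLocation=structure_url)
    ET.SubElement(search, q(DIGIR_NS, "count")).text = "true" if count else "false"
    return search


print(ET.tostring(build_search_body("ScientificName", "f%"), encoding="unicode"))
```

Assuming start is a zero-based record index, as the value 0 in the criteria suggests, the next page of ten records would be requested with start=10.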

Registering a Provider in UDDI

DiGIR portals generally search the global public UDDI registry to locate DiGIR Provider service endpoints. All that is necessary to advertise your data to the rest of the world is to register the location of your DiGIR service in the global public UDDI registry. Once this is done, your service should become visible to portals; because some portals cache their provider lists, this may take a few hours.

Follow these steps to register your provider endpoint (i.e. its URL) in the Microsoft UDDI Registry, which is part of the global public UDDI registry. A similar procedure can be followed to register with the IBM UDDI Registry.

  1. Open the Microsoft UDDI Registry page
  2. Register yourself to use the registry administration tools. Keep a record of your logon information - there is no way to retrieve this information (and hence edit your registry information) if you are unable to log on to the registry publishing interface. Note: I discourage anyone from checking the "Log me on automatically" box. It has many unexpected consequences because of your implied acceptance of the .net Passport.
  3. Open the "Publish" page
  4. Click on the "Providers" node in the administration tree.
  5. Add a Provider and give it a name
  6. Add a service to the provider by clicking on the "Services" tab
  7. Give the service a name and press "Update"
  8. Click on the "Bindings" tab and edit the Access Point to reflect the endpoint of your DiGIR provider installation (e.g. http://speciesanalyst.net/DiGIR/DiGIR.php). Update when you are done.
     Note that it is always best to register your DiGIR Provider service using a fully qualified domain name rather than an IP address. This provides a small measure of indirection that allows you to move a server without having to change the registered endpoint (simply point the name to the IP address of the new machine).
  9. Click on the "Instance Info" tab, then click "Add Instance Info"
  10. In the query box, type "Digir" and press search. Select "DiGIR Provider" from the tModel list (currently this is the only item).
  11. Ensure that the information is updated by navigating to the "Instance Info" tab after the changes mentioned above have been completed.
  12. Your provider endpoint is now registered with UDDI in a minimal fashion.
  13. Note that it may take some time (up to a few hours) for your newly entered registration information to be propagated through the public UDDI registry.
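
The recommendation in the steps above to register a fully qualified domain name rather than an IP address can be checked before submitting the access point. The following Python sketch is an illustrative helper, not part of DiGIR; it only inspects the URL text and does not verify that the name resolves.

```python
import ipaddress
from urllib.parse import urlsplit


def uses_ip_address(endpoint):
    """Return True if the endpoint URL's host is an IP literal rather than a name."""
    host = urlsplit(endpoint).hostname
    if host is None:
        raise ValueError(f"no host in endpoint URL: {endpoint!r}")
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as IPv4 or IPv6, so it is treated as a host name.
        return False
    return True


for url in ("http://speciesanalyst.net/DiGIR/DiGIR.php",
            "http://127.0.0.1/digir/DiGIR.php"):
    print(url, "-> IP address" if uses_ip_address(url) else "-> host name")
```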

DiGIR providers are currently being registered in the global public UDDI registry. The DiGIR interface has been assigned a tModel (technical model = UUID:4DFAB7E8-6387-431D-BC20-6291E99A51A8), and your service registration should include a reference to the DiGIR protocol tModel to ensure that it can be found by portal (client) applications.

Instances of DiGIR providers are discovered by querying UDDI for services that expose the DiGIR interface, that is, the service has the DiGIR tModel associated with it.

Each DiGIR provider is shipped with a simple tool for searching the public UDDI registry for DiGIR installations. The tool is called "DiGIR_uddi.php" and resides in the same folder as "DiGIR.php". The operation of this tool is described in the tools section of this document.

Maintenance

Keeping the provider operational

Updating An Installation

How to update a provider

DiGIR Log Files

Interpreting DiGIR log files

Reporting Bugs

Bugs, feature requests, and other issues with the DiGIR Provider service should be reported to the Source Forge project bug list. It is very important to record bugs there; otherwise they will most likely be forgotten. When reporting a bug, please provide as much information as you can so that the problem can be reproduced.

Every bug that is submitted, fixed, or otherwise addressed triggers an email that is sent to the mailing list digir-bugs. This is a private list, so only DiGIR developers can subscribe to the list.

Contributing to DiGIR Development

DiGIR is an open source project hosted by Source Forge. Active participation is encouraged and will be greatly appreciated by all involved. For details on how to contribute, please send a message to the digir-developers list.

Appendix A. DiGIR Provider Service Files

Description of files contained in the DiGIR Provider service folder (DIGIR_ROOT/www).

DiGIR.php

Primary script of the DiGIR Provider service.

View current source in CVS

DiGIRm.php

A version of DiGIR that supports multiple targets. This provides similar functionality to a DiGIR portal, except that it provides no user interface.

DiGIR_checkConfig.php

Checks configuration. Note: this script needs to be updated.

DiGIR_clientJS.php

Generates JavaScript that can assist web pages that implement simple DiGIR client user interfaces.

DiGIR_config.php

Library file that reads DiGIR config files.

DiGIR_errors.php

Library that implements the error tracking and defines error constants.

DiGIR_flt.php

Library that implements the root filter object.

DiGIR_fltBuilder.php

Library implementing a filter builder. Includes parsers for DiGIR XML format and PQN (a common Z39.50 syntax).

DiGIR_fltCOP.php

Library implementing comparison operator functionality for filter objects.

DiGIR_fltLOP.php

Library implementing logical operators for filter objects.

DiGIR_getCaps.php

Library that generates the body of a metadata response.

DiGIR_getContent.php

Library that performs a search operation.

DiGIR_globals.php

Library that implements the default operating parameters of the DiGIR scripts.

DiGIR_help.php

Pumps out some simple help information when the DiGIR request operation is "help". This script needs to be updated.

DiGIR_Log.php

Library that implements the logging functionality of the provider services.

DiGIR_recStr.php

Library that implements the response record structure generator. Takes a resultset and maps it to an XML structure defined by an XML-Schema document.

DiGIR_scan.php

Library implementing the inventory operation.

DiGIR_searchResponse.php

Library that reads a DiGIR response document and returns an array representation of the response records. This library is used to support remote join functionality of the IN clause.

DiGIR_status.php

Implements a simple load monitoring system to track status of DiGIR Provider services.

DiGIR_uddiSearch.php

Script that searches UDDI for registered DiGIR providers and renders the result as html, xml, or javascript. This script is a hack and needs extensive updating.

DiGIR_utils.php

Library that implements various routines commonly used within DiGIR.

DiGIR_wms.php

Service endpoint that provides a simple implementation of an Open GIS Consortium Web Map Service (WMS).

eg_inventory.htm

Static html page giving an example inventory request.

eg_metadata.htm

Static html page giving an example metadata request.

eg_search.htm

Static html page giving an example search request.

index.php

Default page for the DiGIR folder. Lists the availability of the DiGIR Provider services, their revision number, and their status.

localconfig.php

User created file that sets global operating parameters for the DiGIR Provider services.

localconfig_dist.php

Example configuration file included in the distribution.

sock_http.php

Low level library that implements simultaneous, asynchronous http processing in a single thread. Provides the basis for a PHP based DiGIR Portal, and is used by DiGIRm and DiGIR_searchResponse.

testClient.php

Simple test client that generates metadata, search, and inventory requests against a single target. Targets are retrieved from UDDI. The query form is generated dynamically by reading a conceptual schema document. This script needs to be updated.

Appendix B. Retrieving the DiGIR Provider from CVS

The most up-to-date source for the DiGIR PHP Provider can be retrieved from the CVS repository on Source Forge. The Source Forge DiGIR project uses CVS to maintain code for data providers (such as the PHP Provider), portals, schemas and some documents. Code may be viewed via the CVS web interface, or retrieved using a CVS client.

CVS command for an anonymous checkout (use this if you don't plan to make changes to the code):

cvs -z3 -d:pserver:anonymous@cvs.digir.sourceforge.net:/cvsroot/digir co DiGIRprov

Typical authenticated checkout (replace "UserName" with your Source Forge account name):

set CVS_RSH=ssh
rem If ssh.exe is not on your PATH, give its full path instead, e.g.:
rem set CVS_RSH=c:\programfiles\ssh\ssh\ssh.exe
cvs -z3 -d:ext:UserName@cvs.digir.sourceforge.net:/cvsroot/digir co DiGIRprov

Updating your code with the latest CVS source:

cvs -z3 -d:ext:UserName@cvs.digir.sourceforge.net:/cvsroot/digir update

Committing changes to the CVS repository (do an Update first!):

cvs -z3 -d:ext:UserName@cvs.digir.sourceforge.net:/cvsroot/digir commit

See Also

Appendix C. Unicode

All DiGIR request and response documents are encoded with the UTF-8 charset.
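
In practice this means a client should serialize requests to UTF-8 bytes before sending them and decode responses as UTF-8. The Python sketch below shows one way to do this with the standard library (Python 3.8 or later for the xml_declaration argument). The element name and text are illustrative placeholders, not DiGIR vocabulary.

```python
import xml.etree.ElementTree as ET


def to_utf8_bytes(element):
    """Serialize an element to UTF-8 bytes with an explicit XML declaration."""
    return ET.tostring(element, encoding="utf-8", xml_declaration=True)


# Placeholder element carrying non-ASCII text.
example = ET.Element("example")
example.text = "Müller"

data = to_utf8_bytes(example)
print(data)

# Parsing the bytes back honours the declared encoding.
print(ET.fromstring(data).text)
```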

TO BE COMPLETED