Version
Last Updated: Friday, December 26, 2003 5:15 PM
dbXML is a Native XML Database (NXD). NXDs are databases that store XML using an internalized format for faster overall processing. dbXML was developed using the Java 2 Standard Edition version 1.4, and should operate properly on all platforms to which J2SE 1.4 has been ported.
This documentation uses the following formatting conventions to mark example source code, terminal sessions, and hints:
A terminal session is shown in a monospaced font on a black background. For example:
Collection: /myCollection
/myCollection>
Within a terminal session, the text that the developer is expected to type is shown in a bold yellow typeface. For example:
Collection: /myCollection
/myCollection> ls test*.xml
test1.xml
test2.xml
test39.xml
Document notes are added in locations where the note itself may not be vital to the document, but will be informative for other reasons. For example:
Reminder:
Don't forget to wash your hands after
you delete your XML documents
Example source code is displayed in a monospaced font on a light gray background. For example:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
The installation file for UNIX variants (Solaris, Linux, Mac OS X) is called
1) Untar the file to your installation location of choice. Expanding the file will result in a subdirectory called . For beta purposes, it is a good idea to install into a location under your home directory, rather than a privileged location such as /usr/local.
2) Set the DBXML_HOME environment variable to the location where you untarred the installation file. If you installed it in "", then your DBXML_HOME variable should be "". Under the bash shell, this may be done with the following command:
$ export DBXML_HOME=
3) Set the PATH environment variable to point to the DBXML_HOME/bin directory. Under the bash shell, this may be accomplished with the following command:
$ export PATH="$DBXML_HOME/bin:$PATH"
4) You are now ready to start up the server. This can be done by invoking the dbxml.server script. Under the bash shell, you might type the following:
$ cd $DBXML_HOME
$ ./dbxml.server start
5) Connect to the server to test it out using the dbXML command line tools:
$ dbxml
> connect user= pass=
Connected
/> col /system/SysConfig
Collection: /system/SysConfig
/system/SysConfig> show database.xml
... a document should be displayed ...
/system/SysConfig> shutdown
/system/SysConfig> exit
The installation file for Windows is called
1) Unzip the file to your installation location of choice. Expanding the file will result in a subdirectory called .
2) Set the DBXML_HOME environment variable to the location where you unzipped the installation file. If you installed it in "", then your DBXML_HOME variable should be "".
> set DBXML_HOME=
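3) Set the PATH environment variable to point to the DBXML_HOME\bin directory. Do not wrap the value in quotes, because the Windows command shell would store the quotes as part of PATH.
> set PATH=%DBXML_HOME%\bin;%PATH%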
4) You are now ready to start up the server. This can be done by invoking the startup.bat script.
> cd %DBXML_HOME%
> startup
5) Connect to the server to test it out using the dbXML command line tools:
> dbxml
> connect user= pass=
Connected
/> col /system/SysConfig
Collection: /system/SysConfig
/system/SysConfig> show database.xml
... a document should be displayed ...
/system/SysConfig> shutdown
/system/SysConfig> exit
Warning!
By default, dbXML installs with two users and two roles. One User is named ''
and is granted the Role of 'admin'. The other User is named '' and is granted
the Role of 'guest'. This '' user, by default, has read access to the dbXML
stylesheet Collection. The system administrator may wish to remove this User
and its assigned Role.
This section discusses some of the core concepts in dbXML. These include how dbXML manages collections of documents, how collections are indexed, how queries are executed, and how the server can be augmented using extensions and triggers.
dbXML manages documents in collections. Many collections can be created and managed at one time. Collections can also be laid out in a hierarchical fashion, much in the same way that an operating system's directory structure works. A single collection may be associated with multiple indexes, extensions, triggers, and child collections. Also, a collection may store XML documents or binary streams.
The underlying storage engine of a collection is called a filer. By default, dbXML uses a native filer called a BTreeFiler, but other filers are available for specific purposes. These include MemFiler, FSFiler, and DBFiler:
Regardless of the filer that is used, dbXML exposes the content using the same programming interfaces, so the underlying filer should be transparent to the developer.
Though the documents in a collection do not need to be bound by a common schema, it is good practice to keep similar documents together in a single collection, because this makes indexing and querying that collection easier.
Note:
Currently, dbXML collections
are schema independent, and it is up to an application developer to perform
schema-based validation against documents being stored in a collection.
A future version of dbXML will include internal XML schema support.
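Because validation is left to the application, one possible approach is to check each document against a W3C XML Schema before handing it to dbXML. The following is a minimal sketch under stated assumptions: it uses the standard JAXP validation package (javax.xml.validation), which arrived in Java releases later than the J2SE 1.4 that dbXML targets, and the class name PreStoreValidator and the sample schema are hypothetical and not part of dbXML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class; it is not part of dbXML.
class PreStoreValidator {
    // Illustrative schema: a <person> element holding exactly one <name>.
    static final String SAMPLE_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='person'><xs:complexType><xs:sequence>"
        + "<xs:element name='name' type='xs:string'/>"
        + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

    private final Schema schema;

    PreStoreValidator(String xsdText) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(xsdText)));
        } catch (SAXException e) {
            throw new IllegalArgumentException("schema could not be compiled", e);
        }
    }

    // True only if the text is well-formed and valid against the schema.
    boolean isValid(String xmlText) {
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xmlText)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        PreStoreValidator validator = new PreStoreValidator(SAMPLE_XSD);
        System.out.println(validator.isValid("<person><name>Ada</name></person>"));
        System.out.println(validator.isValid("<person><age>36</age></person>"));
    }
}
```

An application would call isValid before inserting or updating a document and skip the store operation when it returns false.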
dbXML collections can store either XML documents or binary streams (records), but not both at the same time. XML documents can be stored as binary streams, but won't benefit from tokenization, compression, and indexing. It is important to understand that dbXML is not a multimedia database, and so storing massive binary streams is not recommended. It is probably a good idea to limit binary streams to no more than 500 kilobytes.
Collections may have multiple indexes associated with them. An index is a file structure that is used to allow optimized retrieval of documents in a collection based on the structure or values in those documents. dbXML currently provides three types of indexers. These are ValueIndexer, NameIndexer, and FullTextIndexer. Indexes are created based on element and/or attribute patterns.
Collections of documents and indexes for values in those documents aren't much use if you don't have a way to query those documents or portions of them. dbXML provides several query resolving systems for you to do this. Query resolvers are registered with the entire database. Queries are executed against specific collections or documents within a collection, and are referenced by a style name. dbXML supports the following styles:
Extensions are a way of adding extra capabilities to the dbXML server. Extensions are Java classes that implement the Extension interface, and whose public methods are exposed as web service endpoints. Triggers and other extensions can also reference extensions. It's important to remember that only public methods that take a specific subset of generic parameters can be exposed as web service endpoints.
A trigger is a Java class that implements the Trigger interface. This interface specifies several methods that are to be implemented to handle triggered callbacks from a collection. A collection fires a trigger before and after the following events:
The events that are fired before insert, update, and deletion may be vetoed by the trigger implementation. Triggers also give the programmer the opportunity to perform tasks such as validation and document modification before the document itself reaches the database.
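The veto and modification behavior can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in: BeforeInsertHook, VetoException, and TinyCollection are invented for illustration and are not dbXML's Trigger interface or Collection class, whose actual method signatures are not reproduced here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: these are NOT dbXML's Trigger or Collection types.
class VetoSketch {
    /** Thrown by a hook to veto the pending operation. */
    static class VetoException extends RuntimeException {
        VetoException(String message) { super(message); }
    }

    /** Fired before an insert; returns the (possibly modified) document. */
    interface BeforeInsertHook {
        String beforeInsert(String key, String document);
    }

    /** Tiny in-memory "collection" that fires its hooks before storing. */
    static class TinyCollection {
        private final Map<String, String> documents = new LinkedHashMap<>();
        private final List<BeforeInsertHook> hooks = new ArrayList<>();

        void addHook(BeforeInsertHook hook) { hooks.add(hook); }

        /** Returns true if stored, false if a hook vetoed the insert. */
        boolean insert(String key, String document) {
            String current = document;
            try {
                for (BeforeInsertHook hook : hooks) {
                    current = hook.beforeInsert(key, current);
                }
            } catch (VetoException veto) {
                return false; // the document never reaches the store
            }
            documents.put(key, current);
            return true;
        }

        String get(String key) { return documents.get(key); }
    }

    /** Builds a collection with one modifying hook and one validating hook. */
    static TinyCollection demoCollection() {
        TinyCollection col = new TinyCollection();
        col.addHook((key, doc) -> doc.trim()); // modification
        col.addHook((key, doc) -> {            // validation with veto
            if (!doc.startsWith("<order")) {
                throw new VetoException("not an order: " + key);
            }
            return doc;
        });
        return col;
    }

    public static void main(String[] args) {
        TinyCollection col = demoCollection();
        System.out.println(col.insert("a", "  <order id='1'/>  "));
        System.out.println(col.insert("b", "<invoice/>"));
    }
}
```

The point of the sketch is the ordering: hooks run before the store is touched, so a veto leaves the collection unchanged.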
Presently, dbXML provides transactional journaling capabilities at the direct API level. At this level, all database actions must be executed using a Transaction reference, and only commits are allowed. These capabilities are not yet exposed via dbXML's client/server APIs, which is both a good and bad thing.
For client/server tasks, transactions are a detail that an application developer can conveniently ignore. The dbXML client/server APIs automatically instantiate and commit a transaction for every database call.
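The idea of one automatically committed transaction per call can be sketched as follows. This is only an illustration of the pattern: Tx and Journal are hypothetical stand-ins, not dbXML's Transaction class, and the sketch makes no claim about how the client/server layer is actually implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins; dbXML's real Transaction class is not reproduced here.
class AutoCommitSketch {
    /** Minimal transaction that records whether it was committed. */
    static class Tx {
        private boolean committed;
        void commit() { committed = true; }
        boolean isCommitted() { return committed; }
    }

    /** Remembers every transaction it has opened, like a very small journal. */
    static class Journal {
        final List<Tx> opened = new ArrayList<>();
        Tx begin() {
            Tx tx = new Tx();
            opened.add(tx);
            return tx;
        }
    }

    /** Runs one database call in its own transaction and commits it. */
    static <T> T autoCommit(Journal journal, Function<Tx, T> call) {
        Tx tx = journal.begin();
        T result = call.apply(tx);
        tx.commit(); // reached only if the call returned normally
        return result;
    }

    public static void main(String[] args) {
        Journal journal = new Journal();
        String key = autoCommit(journal, tx -> "doc-1");
        System.out.println(key + " committed=" + journal.opened.get(0).isCommitted());
    }
}
```

With a wrapper of this kind the caller never sees a transaction object, which matches the convenience described above, at the cost of not being able to group several calls into one unit of work.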
Note:
Future versions of dbXML
will include fully coordinated transactions at the client/server layer,
but for now, journaling transactions can only be controlled using the direct API.
The dbXML command line is a shell-like interface that provides database, collection, and document management capabilities for interactive and scripting purposes. To start the dbXML command line tools, be sure you have your DBXML_HOME environment variable properly set (see Quick Installation Guide) and also be sure that your PATH environment variable includes the 'bin' directory that is located under DBXML_HOME. The dbXML command line includes a help system that can guide you through using these commands.
Connecting to a dbXML database using the command line tools is easy: you use the connect command. The connect command controls both which dbXML driver type to use and the host and authentication parameters required to connect to a server using that driver type. By default, the connect command will use the XML-RPC drivers to connect to the local machine on dbXML's standard port using no authentication parameters. However, dbXML's standard security model requires a user name and password to authenticate against the database, so you will need to provide both. For example:
> connect user= pass=
Connected
This will connect you to the local database by attempting to log you in as username '' with a password of ''. In dbXML, '' is a bootstrap user whose role is to administer the newly created database.
Did You Know?
For database history buffs (all
three of you), the username 'scott' and
password 'tiger' were once used to log in to Oracle's demo schema, and
refer to one of Oracle's founders (Bruce Scott) and his cat (Tiger).
You can connect to an alternative computer by adding host and (optionally) port parameters:
> connect user= pass= host=myhost.mydomain.com port=
Connected
Command | Description | Usage |
CONNECT | Connect to a dbXML server | CONNECT [connection properties] |
DISCONNECT | Disconnect from the dbXML server | DISCONNECT |
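The connection properties shown in the sessions above are simple name=value pairs. As a rough illustration of how such an argument string could be turned into a property map, here is a hypothetical parser. It is not the command line's actual implementation; the "localhost" default and the credentials 'alice' and 'secret' are assumptions made for the example, and no default port is supplied because the guide does not state one here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for illustration; the real command line may differ.
class ConnectProps {
    /** Splits whitespace-separated "name=value" tokens into a property map. */
    static Map<String, String> parse(String arguments) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("host", "localhost"); // assumed default: the local machine
        for (String token : arguments.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // no arguments were given at all
            }
            int eq = token.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("expected name=value: " + token);
            }
            // A later token overrides an earlier value or a default.
            props.put(token.substring(0, eq), token.substring(eq + 1));
        }
        return props;
    }

    public static void main(String[] args) {
        // 'alice' and 'secret' are placeholder credentials.
        System.out.println(parse("user=alice pass=secret host=myhost.mydomain.com"));
    }
}
```

A map like this could then be copied into a client object one property at a time, for example through a setProperty(name, value) style method.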
Managing and navigating the collections of a dbXML database is a relatively straightforward process, and is in many respects very similar to managing and navigating UNIX directories.
Creating a collection is as simple as the following example:
/> mkcol myCollection
Collection 'myCollection' created
This will create a collection under the database's root collection called 'myCollection'. The default collection type stores compressed XML documents. If you'd like to create a collection that can store binary streams, you could add a type parameter to the command.
/> mkcol myCollection type=binary
Collection 'myCollection' created
Command | Description | Usage |
LSCOL | Lists the child Collections of a Collection | LSCOL [wildcard] |
COL | Set the default Collection | COL 'collection' |
MKCOL | Create a new Document Collection | MKCOL 'collection name' [properties] |
RMCOL | Remove the specified child Collection from the Collection | RMCOL 'collection name' |
dbXML currently supports three possible security managers.
There are three sets of commands related to security that one must become familiar with.
Note:
The command line tools are built around
the capabilities of the default security manager. If you are not using the
default security manager, using these commands to modify user, role, and
access control data will succeed, but will have no effect, because the other
two security managers do not utilize this data in any way.
When you create a collection in dbXML, that collection will initially be inaccessible to the outside world. The database will not automatically add a default set of permissions to the creating user, so it is the administrator's role to do so. The following example demonstrates the process:
/> mkcol myCollection
Collection 'myCollection' created
/> col myCollection
Collection: /myCollection
/myCollection> grant admin READ WRITE EXECUTE CREATE
Grant Successful (READ, WRITE, EXECUTE, CREATE)
Command | Description | Usage |
LSUSER | Lists the available Database Users | LSUSER [wildcard] |
ADDUSER | Adds a new User to the Database | ADDUSER 'user id' |
USER | Modifies User attributes | USER 'user id' 'action' [parameter] |
RMUSER | Removes a User from the Database | RMUSER 'user id' |
LSROLE | Lists the available Database Roles | LSROLE [wildcard] |
ADDROLE | Adds a new Role to the Database | ADDROLE 'role id' |
ROLE | Modifies Role attributes | ROLE 'role id' 'action' [parameter] |
RMROLE | Removes a Role from the Database | RMROLE 'role id' |
ACCESS | Displays the Collection's Access Control List | ACCESS [role id] |
GRANT | Grants permissions to a Role | GRANT 'role id' 'permissions' |
REVOKE | Revokes permissions from a Role | REVOKE 'role id' 'permissions' |
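The grant and revoke model described in this section can be pictured as a per-collection map from role to permission set that starts out empty. The sketch below is a hypothetical model for illustration only, not the implementation of any dbXML security manager; the permission names are the four that appear in the grant example above.

```java
import java.util.EnumSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical model; not dbXML's security manager implementation.
class AclSketch {
    enum Permission { READ, WRITE, EXECUTE, CREATE }

    // role id -> granted permissions; empty for a newly created collection
    private final Map<String, Set<Permission>> grants = new HashMap<>();

    void grant(String role, Permission... permissions) {
        Set<Permission> set =
            grants.computeIfAbsent(role, r -> EnumSet.noneOf(Permission.class));
        for (Permission p : permissions) {
            set.add(p);
        }
    }

    void revoke(String role, Permission... permissions) {
        Set<Permission> set = grants.get(role);
        if (set == null) {
            return; // nothing was ever granted to this role
        }
        for (Permission p : permissions) {
            set.remove(p);
        }
    }

    boolean isAllowed(String role, Permission permission) {
        Set<Permission> set = grants.get(role);
        return set != null && set.contains(permission);
    }

    public static void main(String[] args) {
        AclSketch acl = new AclSketch();
        System.out.println(acl.isAllowed("admin", Permission.READ)); // before any grant
        acl.grant("admin", Permission.READ, Permission.WRITE);
        System.out.println(acl.isAllowed("admin", Permission.READ)); // after the grant
    }
}
```

Starting from an empty map mirrors the behavior described above: until an administrator issues a grant, no role can access the new collection.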
Effective index management is the key to performance when developing applications with dbXML. A well-placed index can mean the difference between a query resolver having to check three documents and having to check three hundred thousand. The type of index that is used is also important.
There are currently three types of Indexers in dbXML.
Index patterns are simple expressions used to identify the elements or attributes that will be evaluated by a particular index. There are six possible pattern combinations.
Index patterns can also be namespace qualified. This is done by prepending the element or attribute name with its corresponding namespace URI enclosed in square brackets (ex: [http://www.w3.org/1999/xhtml]title). This type of pattern should not include a namespace prefix, because prefixes are not important to namespace resolution. It is also important not to include spaces in these patterns.
Note:
It's very easy to look at an Index pattern and mistake it for an XPath expression.
While the syntax is intentionally meant to resemble XPath expressions,
Index patterns are not governed by the XPath syntax. What this means is
that an Index pattern of '/foo[status='active']/bar@id' will not
work. In this case, try 'bar@id' instead.
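To make the shape of an index pattern concrete, the following sketch splits a pattern into an optional namespace URI, an element name, and an optional attribute name. It is a hypothetical parser written for illustration, not dbXML's own, and it only covers the forms that appear in this section (element, element@attribute, the * wildcard, and a [namespace URI] qualifier); it does not claim to cover every supported combination.

```java
// Hypothetical parser for illustration; it is not dbXML's own implementation.
class IndexPatternSketch {
    final String elementNamespace;   // null when the element is unqualified
    final String elementName;        // "*" stands for any element
    final String attributeNamespace; // null when unqualified or absent
    final String attributeName;      // null when the pattern targets elements

    private IndexPatternSketch(String[] element, String[] attribute) {
        elementNamespace = element[0];
        elementName = element[1];
        attributeNamespace = attribute == null ? null : attribute[0];
        attributeName = attribute == null ? null : attribute[1];
    }

    static IndexPatternSketch parse(String pattern) {
        if (pattern.isEmpty() || pattern.chars().anyMatch(Character::isWhitespace)) {
            throw new IllegalArgumentException("pattern is empty or contains whitespace");
        }
        // Skip a leading [namespace] so that an '@' inside the URI is not
        // mistaken for the attribute separator.
        int searchFrom = pattern.startsWith("[") ? pattern.indexOf(']') + 1 : 0;
        int at = pattern.indexOf('@', searchFrom);
        if (at < 0) {
            return new IndexPatternSketch(splitName(pattern), null);
        }
        return new IndexPatternSketch(
            splitName(pattern.substring(0, at)),
            splitName(pattern.substring(at + 1)));
    }

    // Splits "[uri]name" into {uri, name}; uri is null when there is no bracket.
    private static String[] splitName(String text) {
        String namespace = null;
        String name = text;
        if (text.startsWith("[")) {
            int close = text.indexOf(']');
            if (close < 0) {
                throw new IllegalArgumentException("unclosed '[' in: " + text);
            }
            namespace = text.substring(1, close);
            name = text.substring(close + 1);
        }
        // Accept "*" or a plain name; reject path steps and predicates.
        if (!name.matches("\\*|[^/\\[\\]@='*]+")) {
            throw new IllegalArgumentException("not a simple name: " + name);
        }
        return new String[] {namespace, name};
    }

    public static void main(String[] args) {
        IndexPatternSketch p = parse("[http://www.w3.org/1999/xhtml]title");
        System.out.println(p.elementNamespace + " " + p.elementName);
    }
}
```

Because the sketch accepts only simple names, an XPath-like input such as '/foo[status='active']/bar@id' is rejected with an IllegalArgumentException, while 'bar@id' is accepted.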
The following example creates five indexes on a collection:
/myCollection> mkidx myElem_ID_Value pattern=myElem@id type=int
Index 'myElem_ID_Value' created
/myCollection> mkidx myElem_Name pattern=myElem type=name
Index 'myElem_Name' created
/myCollection> mkidx myElem_Text pattern=myElem type=fulltext
Index 'myElem_Text' created
/myCollection> mkidx anyElem_Status_Value pattern=*@status type=trimmed
Index 'anyElem_Status_Value' created
/myCollection> mkidx nsElem_Value pattern=[http://www.dbxml.com/uri]elem
Index 'nsElem_Value' created
The first index, 'myElem_ID_Value', creates an integer coerced value index for all 'id' attributes of elements named 'myElem' in the collection.
The second index, 'myElem_Name', creates a name index for all documents that contain an element called 'myElem'.
The third index, 'myElem_Text', creates a full text index on the collection for all elements named 'myElem'.
The fourth index, 'anyElem_Status_Value', creates a trimmed string index for all 'status' attributes of ANY element that is encountered in the collection.
The fifth index, 'nsElem_Value', creates a default (string) value index on all 'elem' elements that belong to the 'http://www.dbxml.com/uri' namespace.
Command | Description | Usage |
LSIDX | Lists the Indexers of a Collection | LSIDX [wildcard] |
MKIDX | Create a new Collection Index | MKIDX 'index name' [properties] |
RMIDX | Remove the specified Indexer from the Collection | RMIDX 'indexer name' |
Managing and indexing collections is a wonderful thing, but it does you no good if there's nothing in the collection to be indexed. Fortunately, the dbXML command line tools also allow you to import and export content to and from a collection.
The import and export commands have two modes. They can either import/export a single file, or they can import/export multiple files using a wildcard. If a single file is being worked with, the commands allow you to rename the file between the file system and the database using the AS clause. If multiple files are being worked with, this option is not available. The import command also allows you to recurse a directory tree to store all matching files in the current collection.
The following example demonstrates importing a single document and renaming it:
/myCollection> import myFile.xml AS myDocument.xml
The next example demonstrates importing an entire directory tree into the current collection:
/myCollection> import *.xml RECURSIVE
Command | Description | Usage |
LS | Lists the Keys in a Collection | LS [wildcard] |
SHOW | Retrieves a Document from a Collection | SHOW 'document name' |
IMPORT | Imports Content in a Collection | IMPORT 'filespec' [AS 'name'] [RECURSIVE] |
EXPORT | Exports Content from a Collection | EXPORT 'filespec' [AS 'name'] |
RM | Remove the specified Content from the Collection | RM 'filespec' |
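To illustrate how a wildcard filespec narrows a set of names, the sketch below filters a list with the JDK's glob matcher. It assumes the wildcard behaves like a conventional shell-style glob; the exact matching rules used by the dbXML command line are not specified in this guide, and the class name WildcardSelect is hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes shell-style glob semantics for the wildcard.
class WildcardSelect {
    /** Returns the names that match the wildcard, in their original order. */
    static List<String> select(List<String> names, String wildcard) {
        PathMatcher matcher =
            FileSystems.getDefault().getPathMatcher("glob:" + wildcard);
        List<String> matches = new ArrayList<>();
        for (String name : names) {
            if (matcher.matches(Paths.get(name))) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "test1.xml", "test2.xml", "test39.xml", "notes.xml", "test1.txt");
        System.out.println(select(names, "test*.xml"));
    }
}
```

The same filter could be applied to the entries found while walking a directory tree, which is roughly what a recursive import has to do before storing each matching file.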
The way that triggers and extensions are managed in dbXML is nearly identical, so it's easier to discuss them in a combined section. The most important thing to understand about triggers and extensions is that their classes must be available to the server via the Java CLASSPATH or by packaging them into jars and storing them in the DBXML_HOME/lib directory before the dbXML server starts up.
This is an example of adding a trigger implementation to a collection:
/myCollection> mktrg name=myTrigger class=examples.ExampleTrigger
Trigger 'myTrigger' created
Command | Description | Usage |
LSTRG | Lists the Triggers of a Collection | LSTRG [wildcard] |
MKTRG | Register a Collection Trigger | MKTRG 'trigger name' [properties] |
RMTRG | Remove the specified Trigger from the Collection | RMTRG 'trigger name' |
LSEXT | Lists the Extensions of a Collection | LSEXT [wildcard] |
MKEXT | Register a Collection Extension | MKEXT 'extension name' [properties] |
RMEXT | Remove the specified Extension from the Collection | RMEXT 'extension name' |
There are three other commands that are hard to categorize, so they've been conveniently placed in the category of 'other' options. These commands include the ability to set command line options, to shut down the server, and to exit the command line.
The set command is interesting because it allows you to toggle and change various command line options. Some of these options include:
The following example demonstrates setting command line options:
/myCollection> set durations true
durations true
(Execution: 5ms)
Command | Description | Usage |
SET | Set or display interactive mode properties | SET ['prop name' [value]] |
VERSION | Displays the server's version | VERSION |
SHUTDOWN | Shuts down the server | SHUTDOWN [exit code] |
EXIT | Exit interactive mode | EXIT [exit code] |
dbXML provides several application programming interfaces (APIs), each with its own strengths and weaknesses. The API that an application developer selects will usually depend on the requirements of the particular project. This section briefly discusses each API and its benefits.
The complete set of dbXML API documentation can be found in the DBXML_HOME/docs/api directory.
Note:
Throughout these sections, class and interface definitions
will be presented. These definitions will almost always be incomplete, usually
omitting JavaDoc, non-public methods and fields, non-essential methods
and fields, and throws clauses. They are meant to be a quick reference
rather than a complete definition.
The direct API is the interface defined by dbXML's internal class hierarchy. It is the closest that a developer can get to the inner workings of the dbXML engine. The classes and methods exposed by this API are very low level, and can be a little daunting at first.
One way in which the direct API is more difficult is that it exposes XML documents using the internalized dbXML DocumentTable representation. DocumentTables are ideal for storage and indexing, but can be difficult for an application developer to work with. To alleviate potential frustration, the direct API also offers an Adapter system that allows a developer to circumvent the DocumentTable APIs. dbXML provides adapters for DOM Documents, SAX Handlers, and JAXB bound classes.
The dbXML client API is a common interface for accessing dbXML in both embedded and client/server scenarios. The interfaces and methods that are exposed are very similar in many ways to the direct API, but serve to buffer the application developer from some of the inner workings of the database. A good example of this buffering is that the dbXML client API exposes specialized methods for working with text or DOM representations of documents, where the direct API only exposes the internalized dbXML DocumentTable representation.
The XML:DB API is an abstract interface for accessing XML databases in a generic fashion. Several XML databases implement this API, and so it provides the most potential for developing portable applications.
Under the hood, dbXML is a web server, and provides complete HTTP access to its underlying facilities. Most of the features of the server can be accessed using the XML-RPC protocol. Some features can also be accessed directly in a REST style (URL encoding).
The direct API is the interface defined by dbXML's internal class hierarchy. It is the closest that a developer can get to the inner workings of the dbXML engine. The classes and methods exposed by this API are very low level, and can be a little daunting at first.
The direct API becomes important when developing triggers and extensions because the object references that are passed to these classes will be references to direct API objects. So if you're planning on extending dbXML, it's a good idea to become familiar with these classes and interfaces.
Collection represents a collection of Documents. It maintains links to the filer storage implementation, the indexes, and any extensions or triggers that may be associated with the collection. The Collection class is the focal point of the direct API. Nearly all methods for accessing and managing documents, indexes, triggers, and extensions can be found in this class.
public class Collection {
static final int TYPE_DOCUMENTS;
static final int TYPE_RECORDS;
String getName();
Collection getParentCollection();
boolean dropCollection(Collection collection);
Collection createCollection(String path, Configuration config);
Database getDatabase();
SystemCollection getSystemCollection();
QueryEngine getQueryEngine();
int getCollectionType();
IndexManager getIndexManager();
ExtensionManager getExtensionManager();
TriggerManager getTriggerManager();
String getCanonicalName();
String getCanonicalDocumentName(Key key);
boolean drop();
Key createNewOID();
Key insertDocument(Transaction tx, DocumentTable document);
void setDocument(Transaction tx, Object docKey, DocumentTable document);
void remove(Transaction tx, Object key);
DocumentTable getDocument(Transaction tx, Object docKey);
Container getContainer(Transaction tx, Object docKey);
Record getRecord(Transaction tx, Object recKey);
void setRecord(Transaction tx, Record rec);
void setRecord(Transaction tx, Object recKey, Value value);
Key insertRecord(Transaction tx, Value value);
ResultSet queryCollection(Transaction tx, String style, String query, NamespaceMap nsMap);
ResultSet queryDocument(Transaction tx, String style, String query, NamespaceMap nsMap, Object key);
ContainerSet getContainerSet(Transaction tx);
Key[] listKeys(Transaction tx);
long getKeyCount(Transaction tx);
}
Some of the important methods of the Collection class include:
The Database class, which is also a Collection class, serves as the top level container for a dbXML database. It provides additional capabilities such as providing access to the system collections, symbol tables, and security manager. It is also the class that is used to bootstrap the database and the collections therein.
public class Database extends Collection {
static Database getInstance();
SecurityManager getSecurityManager();
}
Container is a generic container for objects that are stored in a collection. A container can either contain a DocumentTable or a Value, depending on the collection type. Container associates the internal document or binary representation with its original collection and key, so that the contents of the container can be stored without having to maintain three separate references.
public interface Container {
static final int TYPE_DOCUMENT;
static final int TYPE_VALUE;
int getContainerType();
Collection getCollection();
Key getKey();
String getCanonicalName();
DocumentTable getDocument();
Value getValue();
void reset();
void remove();
void setDocument(DocumentTable document);
void setValue(Value value);
}
Some of the important methods of the Container interface include:
The ContainerSet interface is a set interface that allows the developer to iterate over the set of documents being stored by a collection.
public interface ContainerSet {
boolean hasMoreContainers();
Container getNextContainer();
}
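Iterating a ContainerSet follows the familiar "has more / get next" pattern. The sketch below shows the loop against local stand-in interfaces that mirror only the methods listed above, so that it can run without the dbXML jars; in a real application the types would come from the direct API, and the in-memory set and the sample canonical names are purely illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Local stand-ins that mirror only the methods listed above; in a real
// application these types would come from the dbXML direct API instead.
class ContainerSetSketch {
    interface Container {
        String getCanonicalName();
    }

    interface ContainerSet {
        boolean hasMoreContainers();
        Container getNextContainer();
    }

    /** Walks the whole set and collects each container's canonical name. */
    static List<String> canonicalNames(ContainerSet set) {
        List<String> names = new ArrayList<>();
        while (set.hasMoreContainers()) {
            names.add(set.getNextContainer().getCanonicalName());
        }
        return names;
    }

    /** In-memory ContainerSet used only to exercise the loop above. */
    static ContainerSet inMemory(String... canonicalNames) {
        Iterator<String> it = Arrays.asList(canonicalNames).iterator();
        return new ContainerSet() {
            public boolean hasMoreContainers() {
                return it.hasNext();
            }
            public Container getNextContainer() {
                String name = it.next();
                return () -> name;
            }
        };
    }

    public static void main(String[] args) {
        ContainerSet set = inMemory("/myCollection/test1.xml", "/myCollection/test2.xml");
        System.out.println(canonicalNames(set));
    }
}
```

With the direct API, the set itself would be obtained from Collection.getContainerSet(Transaction tx) rather than built in memory.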
The ResultSet interface is a set interface that allows the developer to iterate over a set of DocumentTables. These DocumentTables are the result of a query, and can either be the entire original document, portions of an original document, or synthesized nodes produced by the query resolver.
public interface ResultSet {
static final int RESULT_DOCUMENT;
static final int RESULT_ELEMENT;
static final int RESULT_ATTRIBUTE;
static final int RESULT_PROCINST;
static final int RESULT_COMMENT;
static final int RESULT_TEXT;
static final int RESULT_CDATA;
Collection getCollection();
Query getQuery();
boolean next();
void close();
int getResultType();
int getCount();
DocumentTable getResult();
Collection getResultCollection();
Key getResultKey();
}
Some of the important methods of the ResultSet interface include:
The dbXML client API is a common interface for accessing dbXML in both embedded and client/server scenarios. The interfaces and methods that are exposed are very similar in many ways to the direct API, but serve to buffer the application developer from some of the inner workings of the database. A good example of this buffering is that the dbXML client API exposes specialized methods for working with text or DOM representations of documents, where the direct API only exposes the internalized dbXML DocumentTable representation.
There are currently two implementations of the dbXML Client API. These are:
dbXMLClient is the standard interface for working with dbXML in a client/server fashion.
public interface dbXMLClient {
static final String HOST;
static final String PORT;
static final String USER;
static final String PASS;
void setProperty(String name, String value);
String getProperty(String name);
String[] listProperties();
Map getProperties();
void connect();
void disconnect();
String getServerVersion();
void shutdown(int exitCode);
CollectionClient getDatabase();
CollectionClient getCollection(String path);
ContentClient getContent(String path);
}
Some of the important methods of the dbXMLClient interface include:
CollectionClient is the standard interface for working with a dbXML Collection in a client/server fashion.
public interface CollectionClient {
static final int TYPE_DOCUMENTS;
static final int TYPE_VALUES;
dbXMLClient getClient();
String getName();
int getCollectionType();
String getCanonicalName();
CollectionClient getParentCollection();
CollectionClient getDatabase();
CollectionClient getSystemCollection();
// Collection Management
CollectionClient getCollection(String name);
CollectionClient createCollection(String path, Document configuration);
String[] listCollections();
boolean dropCollection(String name);
// Trigger Management
String createTrigger(Document configuration);
boolean dropTrigger(String name);
String[] listTriggers();
// Index Management
String createIndexer(Document configuration);
boolean dropIndexer(String name);
String[] listIndexers();
// Extension Management
String getExtension(String name);
String createExtension(Document configuration);
String[] listExtensions();
boolean dropExtension(String name);
// Document Management
String createKey();
Document getDocument(String docKey);
String getDocumentAsText(String docKey);
ContentClient getContent(String key);
String insertDocument(Document document);
String insertDocumentAsText(String document);
void setDocument(String docKey, Document document);
void setDocumentAsText(String docKey, String document);
void remove(String docKey);
String[] listKeys();
long getKeyCount();
// Record Content Management
String insertValue(byte[] value);
void setValue(String key, byte[] value);
byte[] getValue(String key);
// Query Processing
ResultSetClient queryCollection(String style, String query, Map nsMap);
ResultSetClient queryDocument(String style, String query, Map nsMap, String key);
}
Some of the important methods of the CollectionClient interface include:
ContentClient is the standard interface for objects that are stored in a Collection. A ContentClient can either represent a Document or a Value, depending on the Collection type.
public interface ContentClient {
static final int TYPE_DOCUMENT;
static final int TYPE_VALUE;
String getKey();
int getContentType();
String getCanonicalName();
CollectionClient getCollection();
Document getDocument();
String getDocumentAsText();
byte[] getValue();
void setDocument(Document document);
void setDocumentAsText(String document);
void setValue(byte[] value);
}
Some of the important methods of the ContentClient interface include:
ResultSetClient is the standard interface for iterating over the results of a query against a CollectionClient.
public interface ResultSetClient {
CollectionClient getCollection();
String getQueryStyle();
String getQueryString();
boolean next();
void close();
int getCount();
Node getResult();
String getResultAsText();
CollectionClient getResultCollection();
String getResultKey();
}
Some of the important methods of the ResultSetClient interface include:
The XML:DB API is an abstract interface for accessing XML databases in a generic fashion. Several XML databases implement this API, and so it provides the most potential for developing portable applications.
Only some of the more useful classes and interfaces of the XML:DB API that dbXML implements are covered here. For a complete reference, please visit the XML:DB API working group's web site at http://www.xmldb.org/xapi/index.html.
DatabaseManager is the entry point for the API and enables you to get the initial Collection references necessary to do anything useful with the API. DatabaseManager is intended to be provided as a concrete implementation in a particular programming language. Individual language mappings should define the exact syntax and semantics of its use.
public class DatabaseManager {
static Database[] getDatabases();
static void registerDatabase(Database database);
static void deregisterDatabase(Database database);
static Collection getCollection(String uri);
static Collection getCollection(String uri, String username, String password);
static String getConformanceLevel(String uri);
static String getProperty(String name);
static void setProperty(String name, String value);
}
Database is an encapsulation of the database driver functionality that is necessary to access an XML database. Each vendor must provide their own implementation of the Database interface. The implementation is registered with the DatabaseManager to provide access to the resources of the XML database.
In general usage, client applications should only access Database implementations directly during initialization.
public interface Database {
String getName();
Collection getCollection(String uri, String username, String password);
boolean acceptsURI(String uri);
String getConformanceLevel();
}
A Collection represents a collection of Resources stored within an XML database. An XML database MAY expose collections as a hierarchical set of parent and child collections.
A Collection provides access to the Resources stored by the Collection and to Service instances that can operate against the Collection and the Resources stored within it. The Service mechanism extends the functionality of a Collection in ways that allow optional functionality to be enabled for the Collection.
public interface Collection {
    String getName();
    Service[] getServices();
    Service getService(String name, String version);
    Collection getParentCollection();
    int getChildCollectionCount();
    String[] listChildCollections();
    Collection getChildCollection(String name);
    int getResourceCount();
    String[] listResources();
    Resource createResource(String id, String type);
    void removeResource(Resource res);
    void storeResource(Resource res);
    Resource getResource(String id);
    String createId();
    boolean isOpen();
    void close();
}
XMLResource provides access to XML resources stored in the database. An XMLResource can be accessed either as text XML or via the DOM or SAX APIs. By default, getContent and setContent work with XML data as text, so these methods operate on String content.
public interface XMLResource extends Resource {
    static final String RESOURCE_TYPE;
    String getDocumentId();
    Node getContentAsDOM();
    void setContentAsDOM(Node content);
    void getContentAsSAX(ContentHandler handler);
    ContentHandler setContentAsSAX();
    void setSAXFeature(String feature, boolean value);
    boolean getSAXFeature(String feature);
}
ResourceSet is a container for a set of resources. Generally a ResourceSet is obtained as the result of a query.
public interface ResourceSet {
    Resource getResource(long index);
    void addResource(Resource res);
    void removeResource(long index);
    ResourceIterator getIterator();
    Resource getMembersAsResource();
    long getSize();
    void clear();
}
ResourceIterator is used to iterate over a set of resources.
public interface ResourceIterator {
    boolean hasMoreResources();
    Resource nextResource();
}
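The iteration pattern implied by ResourceIterator can be sketched as follows. This is a minimal illustration, not code from the XML:DB distribution: the Resource and ResourceIterator types are local stand-ins that mirror only the methods used here and omit the exceptions that the real org.xmldb.api interfaces declare, and the iteratorOver helper is a hypothetical in-memory source used in place of a query result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class ResourceIterationSketch {
    // Local stand-ins (assumptions) for the XML:DB interfaces, reduced to what the loop needs.
    interface Resource {
        Object getContent();
    }

    interface ResourceIterator {
        boolean hasMoreResources();
        Resource nextResource();
    }

    /** Hypothetical in-memory iterator over fixed strings, used only to exercise the loop. */
    static ResourceIterator iteratorOver(String... contents) {
        final Iterator<String> it = Arrays.asList(contents).iterator();
        return new ResourceIterator() {
            public boolean hasMoreResources() {
                return it.hasNext();
            }

            public Resource nextResource() {
                final String content = it.next();
                return new Resource() {
                    public Object getContent() {
                        return content;
                    }
                };
            }
        };
    }

    /** The pattern itself: drain the iterator and collect each resource's content as text. */
    static List<String> collect(ResourceIterator iterator) {
        List<String> out = new ArrayList<String>();
        while (iterator.hasMoreResources()) {
            out.add(String.valueOf(iterator.nextResource().getContent()));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String content : collect(iteratorOver("<a/>", "<b/>"))) {
            System.out.println(content);
        }
    }
}
```

With the real API, the iterator would typically come from ResourceSet.getIterator() rather than from a helper like iteratorOver.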
Under the hood, dbXML is a web server, and provides complete HTTP access to its underlying facilities. Most of the features of the server can be accessed using the XML-RPC protocol. Some features can also be accessed directly using the REST protocol (URL encoding).
The easiest way to perform a call against the database using REST is to retrieve a document from a collection. Open a browser while dbXML is running on your local machine, and try retrieving the following URL:
http://localhost:7280/rest/system/SysConfig/database.xml
You will probably be required to enter a username and password. If you recall an earlier section in this guide (see Connecting to a Database), then you'll know that you can type '' as a username, and '' as a password. The result might look something like this:
<?xml version="1.0" encoding="UTF-8"?>
<database name="db"><extensions/></database>
This is the simplest way to utilize dbXML's web services capabilities. REST also supports invoking methods using a syntax that combines a URL with an accompanying query string. The following example performs the same task as the previous one, but uses the method call syntax to accomplish it:
http://localhost:7280/rest/system/SysConfig?method=getDocument&docKey=database.xml
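The two URL forms above can be assembled programmatically. The following is a minimal sketch that only builds the URL strings; the class and method names are hypothetical, the base URL is copied from the examples above, and it assumes the server accepts standard form-style encoding of query string values as produced by java.net.URLEncoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class RestUrls {
    // Base URL taken from the examples above; adjust host and port for your installation.
    static final String BASE = "http://localhost:7280/rest";

    /** Direct form: base URL, then the collection path, then the document key. */
    static String documentUrl(String collectionPath, String docKey) {
        return BASE + collectionPath + "/" + docKey;
    }

    /** Method call form: collection path followed by ?method=... and name=value pairs. */
    static String methodUrl(String collectionPath, String method, String... nameValuePairs) {
        StringBuilder sb = new StringBuilder(BASE).append(collectionPath);
        sb.append("?method=").append(encode(method));
        for (int i = 0; i + 1 < nameValuePairs.length; i += 2) {
            sb.append('&').append(encode(nameValuePairs[i]));
            sb.append('=').append(encode(nameValuePairs[i + 1]));
        }
        return sb.toString();
    }

    private static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(documentUrl("/system/SysConfig", "database.xml"));
        System.out.println(methodUrl("/system/SysConfig", "getDocument", "docKey", "database.xml"));
    }
}
```

Run as written, the program prints the same two URLs shown above. Fetching them still requires the username and password described earlier.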
Beyond the REST protocol, dbXML also supports the XML-RPC protocol, so developing client applications in languages other than Java should not be difficult. For a starting point to the XML-RPC interfaces that dbXML supports, refer to the com.dbxml.db.server.labrador package.
Note:
dbXML does not currently support the SOAP protocol. The next major release
of dbXML will include SOAP web services support, and a dbXML client API
implementation utilizing SOAP.
Collections of documents and indexes for values in those documents aren't much use if you don't have a way to query those documents or portions of them. dbXML provides several query resolving systems for you to do this. Query resolvers are registered with the entire database. Queries are executed against specific collections or documents within a collection, and are referenced by a style name.
The example queries in this section can be used by any of the APIs to perform a query against a collection. To perform a query, simply assign the query's XML to a string, and feed it to either the queryCollection or queryDocument method of a collection API.
XPath is a terse pathing syntax that is similar in some ways to UNIX or DOS directory paths. It allows the results returned to be filtered based on location and predicate evaluation. It is the simplest query that can be performed against a dbXML collection, and is also leveraged by the other three default query resolvers.
The standard XPath language is meant to be utilized for single document querying. dbXML extends XPath to support querying over an entire collection of documents, but is otherwise identical in behavior to the actual specification. For more information about the XPath specification see http://www.w3.org/TR/xpath.
<dbxml:xpath xmlns:dbxml="http://www.dbxml.com/db/query">
    <!-- The above dbxml:xpath element can be used for namespace definitions -->
    /myElem[@id='some value']/childElem/text()
</dbxml:xpath>
This query examines every document in the collection whose root element is 'myElem' and whose 'id' attribute equals "some value". It then selects each child element named 'childElem' and retrieves the text content of those elements.
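Because the query is plain XML, it can be assembled as a Java string before it is handed to a query method. The sketch below is one way to do that, not part of the dbXML API: the class and method names are hypothetical, and it assumes that extra namespace prefixes are declared as ordinary xmlns attributes on the dbxml:xpath element, as the comment in the example suggests.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class XPathQueryBuilder {
    // Namespace URI copied from the query example above.
    static final String QUERY_NS = "http://www.dbxml.com/db/query";

    /** Wraps an XPath expression in a dbxml:xpath element, declaring any extra prefixes. */
    static String build(String xpath, Map<String, String> namespaces) {
        StringBuilder sb = new StringBuilder();
        sb.append("<dbxml:xpath xmlns:dbxml=\"").append(QUERY_NS).append('"');
        for (Map.Entry<String, String> ns : namespaces.entrySet()) {
            sb.append(" xmlns:").append(ns.getKey());
            sb.append("=\"").append(escape(ns.getValue())).append('"');
        }
        sb.append('>').append(escape(xpath)).append("</dbxml:xpath>");
        return sb.toString();
    }

    /** Escapes the characters that are significant in XML text and attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        Map<String, String> namespaces = new LinkedHashMap<String, String>();
        namespaces.put("ex", "http://example.com/ns"); // hypothetical prefix and URI
        System.out.println(build("/ex:myElem[@id='some value']/ex:childElem/text()", namespaces));
    }
}
```

Escaping matters for expressions that use comparison operators such as < in a predicate, which would otherwise make the query document malformed.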
If you don't need all of the flexibility that the XML query syntax provides, you can also form your queries as a single XPath string. The limitation to this format is that you can't define namespace prefixes.
/myElem[@id='some value']/childElem/text()
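To see what this expression selects, it can be tried against a single in-memory document with the XPath engine that ships with the JDK. This is only an illustration of the expression itself: it does not use dbXML, it does not reproduce dbXML's collection-wide evaluation, and the class name and sample documents are made up for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathSketch {
    static final String QUERY = "/myElem[@id='some value']/childElem/text()";

    /** Evaluates QUERY against one XML document and returns the matched text nodes. */
    static String[] select(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    QUERY, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            String[] values = new String[nodes.getLength()];
            for (int i = 0; i < values.length; i++) {
                values[i] = nodes.item(i).getNodeValue();
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate the query", e);
        }
    }

    public static void main(String[] args) {
        // Sample documents (made up): only the first has the matching id attribute.
        String matching = "<myElem id='some value'><childElem>first</childElem>"
                + "<childElem>second</childElem></myElem>";
        String other = "<myElem id='other'><childElem>ignored</childElem></myElem>";
        for (String value : select(matching)) {
            System.out.println(value);
        }
        System.out.println(select(other).length + " matches in the second document");
    }
}
```

Every childElem under a matching root contributes a text node, and a document whose id attribute differs contributes nothing.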
XSLT is a transformation language that converts XML into other forms. These formats can include XML, text, HTML or even PDF when XSL formatting objects are used. dbXML XSLT queries can be executed against a single document, an entire collection, or the results of an XPath query.
<dbxml:xslt xmlns:dbxml="http://www.dbxml.com/db/query">
    <dbxml:source xpath="/myElem[@id='some value']" document="">
        <!-- optionally source XML -->
    </dbxml:source>
    <dbxml:params>
        <!-- <param name="" value=""> -->
    </dbxml:params>
    <dbxml:stylesheet document="/xslt/somestylesheet.xsl">
        <!-- optionally a stylesheet -->
    </dbxml:stylesheet>
</dbxml:xslt>
This example is fairly complex, so some additional explanation is in order.
The first section of the query is the transformation source. The transformation source can come from one of three places. If the 'xpath' attribute is defined, the source content will be produced by performing the specified XPath query against the collection. If the 'document' attribute is defined, the source content will be the document in the collection whose key is specified by the attribute value. If neither attribute is defined, the source will be any inline XML content that is specified within the 'dbxml:source' element.
The second section of the query contains the parameter definitions. These parameters will be bound to the stylesheet transformation context, and can be utilized by the stylesheet for dynamic rendering purposes.
The third section of the query is the stylesheet source. The stylesheet source can come from one of two places. If the 'document' attribute is defined, the stylesheet content will come from the specified path. The value of this attribute can be an absolute path to the document, if the document doesn't exist in the current collection. If the 'document' attribute is not defined, the stylesheet will be defined by any inline XML content that is specified within the 'dbxml:stylesheet' element.
Note:
If the 'document' attribute is used for the stylesheet definition, dbXML will
pre-compile the stylesheet and cache the results for faster querying in
subsequent calls. If the stylesheet source is inlined, this optimization
cannot be performed.
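Parameter binding works the same way in any XSLT 1.0 processor, so it can be tried outside the database. The sketch below uses the JDK's javax.xml.transform API rather than dbXML; the stylesheet, the 'greeting' parameter name, and the sample source document are all made up for the illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class XsltParamSketch {
    // Hypothetical stylesheet: declares a 'greeting' parameter and emits plain text.
    static final String STYLESHEET =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:param name='greeting'/>"
            + "<xsl:template match='/myElem'>"
            + "<xsl:value-of select='$greeting'/>, <xsl:value-of select='childElem'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";

    /** Transforms sourceXml with STYLESHEET after binding the 'greeting' parameter. */
    static String transform(String sourceXml, String greeting) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            // The bound value is visible inside the stylesheet as $greeting.
            transformer.setParameter("greeting", greeting);
            StringWriter out = new StringWriter();
            transformer.transform(
                    new StreamSource(new StringReader(sourceXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalArgumentException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String source = "<myElem id='some value'><childElem>world</childElem></myElem>";
        System.out.println(transform(source, "Hello"));
    }
}
```

In a dbXML XSLT query, the corresponding value would be supplied through the 'dbxml:params' section instead of a setParameter call.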
For more information about the XSLT specification see http://www.w3.org/TR/xslt.
XUpdate is also a transformation language with some of the same goals as XSLT, but its syntax is simpler, and its purpose is to modify the content of documents in place.
<xu:modifications version="1.0" xmlns:xu="http://www.xmldb.org/xupdate">
    <xu:insert-after select="/myElem[@id='some value']/childElem">
        <!-- Add insert-after nodes here -->
        <newElem>
            I'm a new element
        </newElem>
    </xu:insert-after>
</xu:modifications>
The XUpdate specification provides many more capabilities. For a complete reference, please visit the XUpdate working group's web site at http://www.xmldb.org/xupdate/index.html.
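The effect of the insert-after modification above can be approximated on a single in-memory document with the JDK's DOM and XPath APIs. This is a rough sketch of the resulting document shape, not an XUpdate processor and not dbXML code: the class name is hypothetical, the inserted text is trimmed for readability, and applying the insertion to every selected node is an assumption made for the illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class InsertAfterSketch {
    // Same select expression as the xu:insert-after element above.
    static final String SELECT = "/myElem[@id='some value']/childElem";

    /** Inserts a newElem element as the following sibling of each selected node. */
    static String apply(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NodeList targets = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(SELECT, doc, XPathConstants.NODESET);
            for (int i = 0; i < targets.getLength(); i++) {
                Node target = targets.item(i);
                Element added = doc.createElement("newElem");
                added.setTextContent("I'm a new element");
                // insertBefore with a null reference node appends, covering the last-child case.
                target.getParentNode().insertBefore(added, target.getNextSibling());
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not apply the modification", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("<myElem id='some value'><childElem>a</childElem></myElem>"));
    }
}
```

A document whose root does not match the select expression passes through unchanged.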
FullText is a search engine style query with the ability to search on many words with ANDed and ORed set evaluation. The results of a full text query can also be filtered using an XPath expression.
<dbxml:fulltext xmlns:dbxml="http://www.dbxml.com/db/query" xpath="">
    <!-- The above dbxml:fulltext element can be used for namespace definitions.
         Use the name attribute in the below select element to specify a document element -->
    <or>
        <select name="oneElem">
            anded set of words to find
        </select>
        <select name="anotherElem">
            another anded set of words to find
        </select>
    </or>
</dbxml:fulltext>
Within a full text query, 'and' and 'or' nested elements can be used to refine the result set. The 'name' attribute of the 'select' element is used to specify which element or attribute is to be queried. If more than one word is specified within the 'select' element, each word is queried, and the results are implicitly ANDed. If an 'xpath' attribute is specified in the 'dbxml:fulltext' element, then the results of the full text query will be further filtered by the specified XPath expression.
Be Aware:
A full text query requires a supporting full text index on
any of the elements or attributes that are queried. If any of the parts of
the full text query lack a supporting index, the entire query will fail.
If you don't need all of the flexibility that the XML query syntax provides, you can also form your queries as a set of words whose results will be ANDed. The limitation of this format is that you can't define namespace prefixes, and can't filter using XPaths. The possible formats for these types of queries follow.
interests="coffee cigarettes"
person@keywords=caffeine
*="jamaican blue"
*@*=cup
The first line checks for the words "coffee" and "cigarettes" in all 'interests' elements. The second line checks for the word "caffeine" in all 'keywords' attributes of 'person' elements. The third line checks for the words "jamaican" and "blue" in ALL elements. The fourth line checks for the word "cup" in ALL attributes. These queries assume that there are appropriate indexes to support the query.
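The structure of these shorthand queries can be made explicit with a small parser. The sketch below is inferred only from the four examples above (an element name, an optional '@' and attribute name, an '=' sign, and a value that may be quoted); it is not dbXML's own parser, the class name is hypothetical, and the real grammar may accept forms that this code rejects.

```java
import java.util.Arrays;
import java.util.List;

final class FullTextShorthand {
    final String element;     // element name, or "*" for any element
    final String attribute;   // attribute name, "*" for any attribute, or null for element text
    final List<String> words; // words whose matches are ANDed together

    private FullTextShorthand(String element, String attribute, List<String> words) {
        this.element = element;
        this.attribute = attribute;
        this.words = words;
    }

    /** Splits a shorthand query of the form element[@attribute]=value into its parts. */
    static FullTextShorthand parse(String query) {
        int eq = query.indexOf('=');
        if (eq < 0) {
            throw new IllegalArgumentException("Missing '=' in query: " + query);
        }
        String target = query.substring(0, eq).trim();
        String value = query.substring(eq + 1).trim();
        // Strip one pair of surrounding double quotes, if present.
        if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
            value = value.substring(1, value.length() - 1);
        }
        int at = target.indexOf('@');
        String element = at < 0 ? target : target.substring(0, at);
        String attribute = at < 0 ? null : target.substring(at + 1);
        return new FullTextShorthand(element, attribute, Arrays.asList(value.trim().split("\\s+")));
    }

    public static void main(String[] args) {
        String[] samples = {
            "interests=\"coffee cigarettes\"", "person@keywords=caffeine",
            "*=\"jamaican blue\"", "*@*=cup"
        };
        for (String sample : samples) {
            FullTextShorthand q = parse(sample);
            System.out.println(q.element + " | " + q.attribute + " | " + q.words);
        }
    }
}
```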
This section includes some example programs that demonstrate the process of connecting to and working with a dbXML database.
The following simple example program demonstrates the dbXML client API. It performs the tasks of connecting to a database, retrieving a collection reference, and fetching a document from that collection.
import com.dbxml.db.client.CollectionClient;
import com.dbxml.db.client.dbXMLClient;
import com.dbxml.db.client.xmlrpc.dbXMLClientImpl;
import com.dbxml.util.dbXMLException;
public class Example1 {
    public static void main(String[] args) {
        try {
            // Connect to the database on localhost port
            dbXMLClient client = new dbXMLClientImpl();
            client.setProperty(dbXMLClient.USER, "");
            client.setProperty(dbXMLClient.PASS, "");
            client.connect();
            // Retrieve a CollectionClient reference for /myCollection
            CollectionClient col = client.getCollection("/myCollection");
            // Retrieve the Document referred to by the command line
            String doc = col.getDocumentAsText(args[0]);
            // Display the Document on the system console
            System.out.println(doc);
            // Disconnect from the database
            client.disconnect();
        }
        catch ( dbXMLException e ) {
            e.printStackTrace(System.err);
        }
    }
}
The following example program demonstrates using the dbXML client API to perform a query against a collection and iterate over the results of that query. It performs the tasks of connecting to a database, retrieving a collection reference, querying the collection, and iterating over the query results.
import com.dbxml.db.client.CollectionClient;
import com.dbxml.db.client.ResultSetClient;
import com.dbxml.db.client.dbXMLClient;
import com.dbxml.db.client.xmlrpc.dbXMLClientImpl;
import com.dbxml.util.dbXMLException;
import java.util.HashMap;
public class Example2 {
    public static void main(String[] args) {
        try {
            // Connect to the database on localhost port
            dbXMLClient client = new dbXMLClientImpl();
            client.setProperty(dbXMLClient.USER, "");
            client.setProperty(dbXMLClient.PASS, "");
            client.connect();
            // Retrieve a CollectionClient reference for /myCollection
            CollectionClient col = client.getCollection("/myCollection");
            // Build up the query string and create an empty namespace map
            StringBuffer sb = new StringBuffer();
            sb.append("/myElem[@id='");
            sb.append(args[0]);
            sb.append("']");
            String query = sb.toString();
            HashMap nsMap = new HashMap();
            // Perform the query and iterate over its ResultSetClient.
            // The id attribute of root level myElem elements must match
            // the value specified on command line.
            ResultSetClient rs = col.queryCollection("XPath", query, nsMap);
            while ( rs.next() ) {
                String result = rs.getResultAsText();
                System.out.println(result);
            }
            // Close the ResultSetClient and Disconnect from the database
            rs.close();
            client.disconnect();
        }
        catch ( dbXMLException e ) {
            e.printStackTrace(System.err);
        }
    }
}