Berkeley DB XML 2.4.13 Change Log
Upgrade Requirements
Upgrade is only required for containers created using release 2.2.13
or earlier. Containers created using 2.3.X do not require upgrade.
However, most queries will benefit from reindexing 2.3.X-based containers
to add the new structural statistics information used in query cost
analysis, so reindexing is highly recommended. Reindexing is required in
order to enable the substring index to be used on 1- and 2-character
strings (a new feature in 2.4).
Reindexing should generally be performed offline as it is an expensive
operation that reads all content and regenerates index databases.
If an upgrade is performed (e.g. from 2.2.13) it is recommended
that the resulting container be run through dbxml_dump/dbxml_load to
reduce its file size.
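The rules above can be summarized as a small decision helper. The following
is only a sketch in Python: the function names and the returned labels are
hypothetical and are not part of any BDB XML API. It assumes release strings
are purely numeric (for example "2.3.10", not "2.3.X"). Treating reindexing
as still recommended after an upgrade from 2.2.13 or earlier is a reading of
the statistics note under "API Changes", not something this section states
directly.

```python
def parse_version(text):
    """Turn a release string such as "2.3.10" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def maintenance_plan(created_with, needs_short_substring=False):
    """Summarize the maintenance suggested for a container (hypothetical helper).

    created_with: release that created the container, e.g. "2.2.13".
    needs_short_substring: True if the application wants substring indexes
    to be used for 1- and 2-character strings (new in 2.4).
    """
    version = parse_version(created_with)
    plan = {
        "upgrade": "not needed",
        "dump_load": "not needed",
        "reindex": "not needed",
    }
    if version <= (2, 2, 13):
        # Containers from 2.2.13 or earlier must be upgraded; running the
        # result through dbxml_dump/dbxml_load is recommended to shrink it.
        plan["upgrade"] = "required"
        plan["dump_load"] = "recommended"
    if version < (2, 4):
        # Reindexing adds the new statistics; it becomes mandatory only
        # when short substring lookups must use the index.
        plan["reindex"] = "required" if needs_short_substring else "recommended"
    return plan
```

For example, maintenance_plan("2.3.10") reports that no upgrade is needed and
that reindexing is recommended.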
New Features:
- Conformance to Last-Call Working Draft of XQuery Update 1.0
- Added the ability to use "document projection" when querying whole-document
containers. This performance and memory optimization results in only
materializing that portion of the document relevant to the query.
- Partial document modifications will now only reindex those portions
of the document(s) affected by the modification itself. This is a
significant performance enhancement for partial update of large documents.
API Changes:
Unless otherwise noted, the API additions apply to all language bindings,
and all bindings use the same method name.
- Added the DBXML_DOCUMENT_PROJECTION flag to the various query interfaces
to enable use of this feature. In Java, this behavior is controlled by
XmlDocumentConfig.setDocumentProjection()
- Added a new XQuery extension function, dbxml:contains(), that
allows case- and diacritic-insensitive string searches and can
be optimized by a substring index
- Removed all C++ interfaces that used Xerces-C DOM, including:
- XmlDocument::setContentAsDOM()
- XmlDocument::getContentAsDOM()
- XmlValue::asNode()
- XmlModify is now a deprecated (but still-supported) class. XQuery Update
should be used instead. One method
has been removed as it is no longer supported by the internal
infrastructure: XmlModify::setNewEncoding()
- Added a new constructor to XmlEventReaderToWriter that allows an
XmlEventWriter instance to be multiple-use. This makes it possible to
write to it from multiple reader sources to concatenate content, for example.
- Removed XmlUpdateContext::{get,set}ApplyChangesToContainers() methods. This
behavior is no longer controllable. Changes to documents that are in containers
will always be written. If transient changes are required, content must be
copied (XQuery Update has syntax to do this directly)
- Removed unused variant of XmlValue::asString(std::string &encoding). This variant never actually changed the encoding [#15822]
- Added DBXML_STATISTICS and DBXML_NO_STATISTICS flags to enable/disable
creation of an additional statistics database that is used for query
optimization. The default is to create the database. Upgraded
containers will NOT have a statistics database added unless they are
explicitly reindexed. The cost of this optimization is a bit of extra
work during document insertion. In Java this behavior is controlled by
XmlContainerConfig.setStatisticsEnabled()
- Added the DBXML_NO_AUTO_COMMIT flag, which can be specified to the
XmlQueryExpression::execute() methods to turn off auto-commit of update
queries when it is not appropriate.
- Some XmlException error codes have changed: DOM_PARSER_ERROR and
NO_VARIABLE_BINDING have been removed, XPATH_PARSER_ERROR is now called
QUERY_PARSER_ERROR, and XPATH_EVALUATION_ERROR is now called
QUERY_EVALUATION_ERROR. [#15792]
- The enumeration, XmlQueryContext::DeadValues, has been removed.
The related method, XmlQueryContext::setReturnType() remains but is
a no-op. All results are LiveValues. This will not affect the vast
majority of applications.
- Added XmlQueryExpression::isUpdateExpression() to allow users to
know whether an expression is updating or not
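As a migration aid for the XmlException error code changes listed above, the
following Python sketch scans application source text for the old names. The
code names come from this change log; the helper audit_source() and its return
shape are hypothetical and only illustrate one way to find affected code.

```python
import re

# Error codes that were renamed in 2.4 (old name -> new name).
RENAMED = {
    "XPATH_PARSER_ERROR": "QUERY_PARSER_ERROR",
    "XPATH_EVALUATION_ERROR": "QUERY_EVALUATION_ERROR",
}

# Error codes that were removed and have no replacement.
REMOVED = ("DOM_PARSER_ERROR", "NO_VARIABLE_BINDING")


def audit_source(text):
    """Return (rewritten_text, removed_names_found) for a piece of source code.

    Renamed codes are replaced by their new names. Removed codes are only
    reported, because the surrounding code needs a manual decision.
    """
    found = sorted(
        name for name in REMOVED if re.search(r"\b%s\b" % name, text)
    )
    pattern = re.compile(r"\b(%s)\b" % "|".join(RENAMED))
    rewritten = pattern.sub(lambda match: RENAMED[match.group(1)], text)
    return rewritten, found
```

Removed codes are deliberately left in place so that each use can be reviewed
by hand rather than silently rewritten.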
Changes That May Require Application Modification:
- Some of the API changes above may result in the need to
make minor code changes
- Not all modification patterns that use XmlModify will
continue to work given the new infrastructure. Specifically,
operations that would both copy and delete the same content (emulating
a "move" operation) may not work. In all cases, such code can
be rewritten to use XQuery Update directly, resulting in simpler
code. Special attention should be paid to multi-step operations
that include such side effects.
- Changed default indexing type on containers to be node indexes
for node storage containers and document indexes for document storage
containers [#15863].
- Java only -- XmlContainer.getNode() has changed its signature and
will require a code change if used.
See details below under "Java-specific Functionality Changes."
General Functionality Changes:
- Partial document modification will result in only reindexing those
portions of document(s) affected by the modification
- The system now keeps better cost information and statistics
and the query optimizer uses this information to perform more effective
cost-based optimization
- The content processing internals have been reworked to make
heavy use of iterators and temporary Berkeley DB databases to
significantly reduce the memory footprint of query handling as well
as reduce the number of objects created and destroyed by a query.
This leads to a more scalable, high-performance system
- Substring indexes will now work on any length search string (e.g. 2-char)
rather than be restricted to a 3-character minimum. Reindexing the container
is required to get this functionality.
- Various fixes and memory leak elimination in XmlEventWriter [#15405]
- Fixed a problem where removing a default index could remove index
entries for an overlapping non-default index [#15412]
- Changed semantics of XmlQueryContext::setNamespace() to treat an empty
namespace prefix as the default element namespace [#15630]
- Fixed problem in XmlModify that could result in malformed XML if
a prefixed element name were used without a mapping for that prefix [#15586]
- Fixed URI resolution code to not add the base URI when the
URI being resolved is absolute [#15583]
- Fixed code to force explicit transactions (vs auto-commit) when
using XmlContainer::putDocumentAsEventWriter. This is necessary because
of the 2-part nature of this interface [#15578]
- Fixed a crash that could occur if XmlResults.next() were called
at the end of a result set [#15621]
- Fixed a bug where XQuery expressions involving unused global
variables were not being optimized correctly [#15661]
- Fixed problem in XmlEventReader::nextTag() where it would
mistakenly throw an exception on character data. Also changed
semantics of XmlEventReader to always return start and end document
events so that callers can know when content starts and ends [#15686]
- Fixed a case where the '>' character was not being escaped
properly (according to the XML specification) when it occurs in
the sequence "]]>" [#15739]
- Fixed a problem in XmlModify where removing a node that was the last
child and had leading text could cause a SEGV [#15615]
- Enhanced XmlEventReaderToWriter API to not unconditionally close
the XmlEventWriter object, allowing a single XmlEventWriter to be used
more than once via that API. This allows XmlEventReaderToWriter to be used,
for example, to coalesce a number of results into one document [#15446]
- Fixed a problem with XmlIndexLookup where a GT lookup that happens
to start with the last entry in the index might return results when
it should return none [#15408]
- Fixed a bug which incorrectly reported an error for fractional seconds
when the seconds field was "59" [#15389]
- Fixed a problem in statistics calculation for substring indexes that
could cause a crash in fn:contains() [#15823]
- Fixed a bad exception that might be thrown when inserting a
schema-invalid document, due to the length of the error message [#15824]
- Fixed a problem where a query that uses an XmlDocument that has just been
"put" into a container as context for the query might hang if done in the
same transaction as the putDocument call [#15905]
- Fixed a problem where querying empty CDATA sections could cause an
assertion failure or bad memory reference [#15906]
- Fixed a latent bug that could result in missing index entries after
an updateDocument or modifyDocument call. This is very obscure and has
never been seen by a user. It requires an odd combination of indexes and
updates [#15943]
- Fixed an open/close race condition on XmlContainer. An application that
concurrently opened/closed XmlContainer objects (not recommended...)
could possibly reference bad memory [#15890]
- Fixed several memory leaks that could occur if deadlock exceptions
are thrown during document processing (most likely put and delete)
- All update operations now work inside internal child transactions
to ensure that they are properly aborted if necessary. This is not
user visible
- Internal buffer size for DB get operations on nodes is
tuned to the calling operation (bulk vs single get) [#15607]
- Fixed a bug in XmlEventWriter where the behavior was dependent on an uninitialized variable [#15968]
- Changed dbxml_load_container to take a '-e' flag that causes the program to stop document loading in the event of a parse error. The default is to continue with the next document [#15777].
- Fixed problem in XmlModify where newly-inserted element content could
cause a bad memory reference and/or crash while calculating a new node id [#15974].
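The "]]>" escaping fix in the list above [#15739] concerns a rule in the XML
specification: a '>' in character data is normally legal as-is, but it must
be escaped when it ends the sequence "]]>". The Python sketch below shows that
rule in isolation. It is not the library's serializer, and the function name
escape_text() is hypothetical.

```python
def escape_text(text):
    """Escape character data for use as XML element content."""
    # '&' goes first so the entities produced below are not escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    # A lone '>' may stay as-is, but the '>' that closes "]]>" must be
    # escaped, otherwise the text looks like the end of a CDATA section.
    text = text.replace("]]>", "]]&gt;")
    return text
```

With this rule, the text "x]]>y" is written as "x]]&gt;y", while "a > b" is
left unchanged.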
Utility Changes:
- The dbxml shell added commands:
- setProjection allows control of the document projection feature
- The dbxml shell can be invoked using the #! syntax in a *nix
shell command, e.g. with the first line:
#!/dbxml -s
- Handling of '#' comment lines in the dbxml shell has been improved
so that they can occur anywhere in a line [#15689]
Java-specific Functionality Changes:
- Fixed a problem where committing or aborting a transaction that was
already committed or aborted could crash, especially after a failed
XmlManager.openContainer() call [#15729]
- It is no longer necessary to explicitly delete objects of type XmlValue,
XmlDocument, XmlQueryContext, XmlMetaData, XmlMetaDataIterator and XmlUpdateContext.
They are implemented entirely as pure Java
objects with no native memory to release. It is still necessary
to explicitly delete other Java objects to release native memory.
In general the validity of XmlValue and XmlDocument objects returned
via XmlResults (queries, index lookups, etc) is under control of the
XmlResults object. When the XmlResults object is deleted, node values
that may have been associated with that object may no longer be
accessible and an exception will be thrown if they are accessed [#15194]
- Added -source 1.5 -target 1.5 to Java builds to be explicit, especially
for the Windows binary build. The current code *will* work with 1.4 or 1.6
if the arguments are changed (manually) [#15986]
- XmlContainer.getNode() has changed its signature.
Instead of XmlValue it now returns XmlResults. The XmlValue that was
previously returned can be retrieved using XmlResults.next(). There will
never be more than one value in this result. It is necessary
to explicitly delete the returned XmlResults object (XmlResults.delete())
when the application no longer needs access to the returned value. Once
deleted, the information in the XmlValue may no longer be accessible.
Python-specific Functionality Changes:
- Fixed XmlEventReader in Python so that methods returning unsigned char *
would be mapped properly into Python strings [#15608]
- Changed implementation of XmlException and related classes to make them
part of the dbxml (vs _dbxml) module [#15617]
- Changed names of XmlException attributes to start with lower-case letters.
See src/python/README.exceptions.
- Moved examples to dbxml/examples/python directory and added some additional
basic examples
PHP-specific Functionality Changes:
- Fixed code that resulted in build and runtime errors on 64-bit
platforms. One symptom was "std::bad_alloc" exceptions. The issue
was a mix of 64- and 32-bit types resulting in attempts to allocate
huge amounts of memory [#15587]
- Fixed compilation problems in a threaded (ZTS) environment related
to the use of incorrect macros in a few places [#15746]
- Fixed problem (SEGV) constructing XmlIndexLookup objects as well as several
other problems with this class implementation [#16168]
- Moved examples to dbxml/examples/php
- Fixed XmlValue constructor to accept explicitly typed strings [#15996]
Perl-specific Functionality Changes:
- Moved examples to dbxml/examples/perl
Example Code Changes
- Added examples/cxx/xerces directory with example code that
provides the same functionality that the Xerces-C DOM interfaces
previously provided. They are written as example code to
illustrate an integration with Xerces-C DOM and to also illustrate
use of the XmlEvent* classes for such an adapter
- Moved examples for all languages to dbxml/examples/* to
consolidate them and make packaging simpler
Configuration, Documentation, Portability and Build Changes:
- XQilla 2.0 is bundled. XQilla 2.0 is released under a permissive
(Apache) license
- Windows static build projects are included
- Project and solution files for Visual Studio version 8.00 have been added
for use by Visual Studio 2005 and later releases. The new solution file is BDBXML_all_vs8.sln.
- Added Berkeley DB project files to the BDB XML build_windows directory for
Visual Studio 7.1 and 8 builds. This means that the included DB projects will be built
directly in the BDB XML tree and not in the Berkeley DB tree. This does not apply to
the VC6 projects and workspace and does not affect where the default build installs
executables and libraries. VS7.1 project files for Berkeley DB examples are no longer included.