Preprint version. Published in Workshop on Web Information Management (WIDM’04) at ACM CIKM’04 Proceedings, Washington, DC, November 12, 2004, pages 23-30.
Copyright © ACM 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Workshop on Web Information Management (WIDM’04) at ACM CIKM’04 Proceedings.
NOTE: At the time of publication, the author Alex Dekhtyar was not yet affiliated with Cal Poly.
Concurrent markup hierarchies appear often in document-centric XML documents as a result of different XML elements having overlapping scopes. They require a significantly different approach to management and maintenance. Management of XML documents composed of concurrent markup has been studied mostly by the document processing community and has attracted the attention of computer scientists only recently. In this paper we discuss the architecture of an XML parser for concurrent XML. This parser uses a GODDAG data structure in place of a traditional DOM tree to store concurrent markup on top of the document content. It provides a DOM-like API, so that developers of tools working with concurrent XML documents can use it instead of parsing each individual component with a traditional DOM XML parser. The paper describes the architecture of the parser, the data structures and algorithms used, and the DOM-like API.
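To make the idea of concurrent markup stored over shared content concrete, the following is a minimal illustrative sketch in Python. It is not the parser's actual data structure or API: the class names (Leaf, Element), the helper text_of, the hierarchy names ("physical", "linguistic") and the sample text are all assumptions made for this example. It only shows the general shape of a GODDAG-like structure, in which each hierarchy keeps its own element nodes while the text leaves are shared between hierarchies.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Leaf:
    """A fragment of document content shared by all hierarchies (hypothetical)."""
    text: str
    # Maps a hierarchy name to the element that contains this leaf there.
    parents: dict = field(default_factory=dict, repr=False)


@dataclass(eq=False)
class Element:
    """A markup element that belongs to exactly one hierarchy (hypothetical)."""
    tag: str
    hierarchy: str
    children: list = field(default_factory=list)

    def append(self, child):
        """Attach a child; a leaf also records this element as its parent."""
        self.children.append(child)
        if isinstance(child, Leaf):
            child.parents[self.hierarchy] = self
        return child


def text_of(node):
    """Concatenate the content below a node, in document order."""
    if isinstance(node, Leaf):
        return node.text
    return "".join(text_of(child) for child in node.children)


# Shared content, split wherever any hierarchy has an element boundary.
leaves = [Leaf("The quick "), Leaf("brown fox. "), Leaf("It jumps.")]

# Hierarchy 1: physical layout (lines).
physical = Element("page", "physical")
line1 = physical.append(Element("line", "physical"))
line2 = physical.append(Element("line", "physical"))
line1.append(leaves[0])
line2.append(leaves[1])
line2.append(leaves[2])

# Hierarchy 2: linguistic structure (sentences).
linguistic = Element("text", "linguistic")
s1 = linguistic.append(Element("s", "linguistic"))
s2 = linguistic.append(Element("s", "linguistic"))
s1.append(leaves[0])
s1.append(leaves[1])
s2.append(leaves[2])
```

In this sketch the second line and the first sentence overlap without nesting: both contain the leaf "brown fox. ", which therefore has one parent in each hierarchy. A single tree cannot express that relationship directly, which is the situation the abstract describes as overlapping scopes.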