XML and Web Services In The News - 03 May 2006

Provided by OASIS | Edited by Robin Cover

This issue of XML.org Daily Newslink is sponsored by Innodata Isogen

HEADLINES:

	All Eyes on Office as ODF Gets the Nod
	Web Services Gets SPML 2.0 Boost
	Java NVDL Implementation Alpha Release
	GeoRSS: Geographically Encoded Objects for RSS Feeds
	IBM DB2 "Viper" Revs XML Engine
	Not in Schema Wildcard
	Accessing the Web of Databases
	XQuery 1.0 and XPath 2.0 Full-Text Use Cases
	Nobody Reads a Column About Open Standards

All Eyes on Office as ODF Gets the Nod
Elizabeth Montalbano, InfoWorld
The International Organization for Standardization (ISO) this week gave formal approval to the Open Document Format for Office Applications (ODF), paving the way for office suites based on ODF to be more broadly adopted, proponents said Wednesday. The move comes as Microsoft's rival standard for its own Office productivity suite, OpenXML, awaits the same approval by the ISO. The ISO is an international consortium that works with the United Nations to maintain and approve international technology standards. ODF is a standard for office documents overseen by the Organization for the Advancement of Structured Information Standards (OASIS) and supported by Microsoft rivals IBM Corp. and Sun Microsystems Inc., among other companies. They want to see ODF adopted internationally as the standard for office documents and software that creates and manages these documents, such as Microsoft's popular Office suite and rivals such as Sun's StarOffice. The government of Massachusetts in the U.S. already has put in motion a plan to migrate its documents to ODF from proprietary formats, a process it hopes to implement beginning in January 2007. According to Andrew Updegrove: "With adoption of ODF by ISO/IEC now assured, software that implements the standard will now become more attractive to those European and other government purchasers for whom global adoption by ISO/IEC [International Electrotechnical Commission] is either desirable, or required. Offerings such as OpenOffice and KOffice therefore should receive a boost in appeal and usage, as well as for-sale versions, such as Sun's StarOffice and IBM's Internet-based offering."
See also: ODF references

Web Services Gets SPML 2.0 Boost
Mathew Schwartz, Enterprise Systems
How do businesses securely tie together systems with business partners using Web Services technology or service-oriented architectures? Today, such business-to-business (B2B) efforts typically require business partners to standardize on identical identity-management software or code laborious workarounds. A new standard should help. The international standards consortium OASIS announced it has ratified Service Provisioning Markup Language (SPML) version 2.0, which should facilitate easier out-of-the-box, B2B identity-management integration. The new OASIS Standard specifies an XML framework for identity management and provisioning. SPML defines how resources should be allocated between systems and organizations. It also handles provisioning -- managing user accounts and access rights -- in a variety of environments, including access to systems, networks, and applications, as well as to such physical resources as mobile phones and credit cards. According to Gavenraj Sodhi, the director of product management for security information management solutions at CA Inc. (formerly Computer Associates), and a co-chair of the SPML technical committee: "SPML can become a major component of the identity management stack... this will allow vendors to build hooks into their applications," to create easier out-of-the-box interoperability between applications, which should better facilitate B2B Web Services integration. That's because a growing requirement in Web services rollouts, as well as in the implementation of service-oriented architectures, is sharing user information across businesses -- and not just identities, but also permissions, groups, and access rights.
See also: the OASIS Standard

Java NVDL Implementation Alpha Release
Jirka Kosek, DSDL-Discuss Posting
Jirka Kosek announced a first alpha release of JNVDL -- an open-source NVDL implementation written in Java, including a binary distribution available for download from SourceForge. "NVDL is upcoming ISO standard which can be used to define 'meta-schemas' that define how to validate XML documents that are composed from elements from multiple namespaces. For each namespace, NVDL schema can define a schema against which validation should be performed. This schema can be written in an arbitrary schema language like RELAX NG, DTD or W3C XML Schema. NVDL was heavily inspired by NRL language. Although syntax details of NVDL are different from NRL, it is still useful to go through NRL specification to see what can be done with NRL -- and thus also with NVDL. Until now, there was only one NVDL implementation, written for .NET. This is no longer true, as you can now download Java-based implementation called JNVDL." According to Rick Jelliffe, "NVDL will provide a great mechanism for allowing you to selectively dispatch different parts of your document to different validators. So you can pick the best schema language for the job. Or, as is more often the case, you may be working with different vocabularies each defined in a different schema language (DTD, RELAX NG, XSD, Schematron, etc)."
See also: NVDL references
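
The dispatch idea described above can be illustrated with a rough sketch. This is not NVDL syntax and not the JNVDL API; it is plain Python that only finds the points in a document where the namespace changes and looks up which schema would apply to each such section. The schema file names (xhtml.rng, svg.xsd) and the function names are hypothetical, chosen for illustration.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical mapping from namespace to schema file (illustrative names only).
RULES = {XHTML_NS: "xhtml.rng", SVG_NS: "svg.xsd"}

SAMPLE_DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>text</p>'
    '<svg xmlns="http://www.w3.org/2000/svg"><circle r="1"/></svg>'
    "</body></html>"
)


def namespace_of(tag):
    """Return the namespace URI from an ElementTree tag like '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def local_name(tag):
    """Return the local part of an ElementTree tag."""
    return tag.split("}", 1)[-1]


def plan_validation(doc_xml, rules):
    """List (section root, namespace, schema) wherever the namespace changes."""
    plan = []

    def visit(elem, parent_ns):
        ns = namespace_of(elem.tag)
        if ns != parent_ns:
            # A new section starts here; record which schema would validate it.
            schema = rules.get(ns, "(no schema configured)")
            plan.append((local_name(elem.tag), ns, schema))
        for child in elem:
            visit(child, ns)

    visit(ET.fromstring(doc_xml), None)
    return plan


if __name__ == "__main__":
    for name, ns, schema in plan_validation(SAMPLE_DOC, RULES):
        print(f"<{name}> in {ns} -> {schema}")
```

A real NVDL processor would go further and hand each section to the corresponding validator; the sketch stops at planning which schema applies where.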

GeoRSS: Geographically Encoded Objects for RSS Feeds
GeoRSS GML is a formal GML Application Profile, and supports a greater range of features than Simple, notably coordinate reference systems other than WGS84 latitude/longitude. It is designed for use with Atom 1.0, RSS 2.0 and RSS 1.0, although it can be used just as easily in non-RSS XML encodings. GeoRSS Simple has greater brevity, but also has limited extensibility. It can be used in all the same ways and places as GeoRSS GML. The georss.org web site describes a number of ways to encode location in RSS feeds. As RSS becomes more and more prevalent as a way to publish and share information, it becomes increasingly important that location is described in an interoperable manner so that applications can request, aggregate, share and map geographically tagged feeds. Perhaps the most powerful advantages of GeoRSS feeds will be seen in the possibilities for geographic search and aggregation. More than just getting feeds for a particular city or zip code, using GeoRSS it will be possible to search with all sorts of geographic criteria.
See also: the OGC announcement
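
As a minimal sketch of what a consumer of such a feed might do, the Python below reads GeoRSS Simple points (a "latitude longitude" pair in a georss:point element) from an Atom feed and filters them by a bounding box. The sample feed, its titles, and the function names are made up for illustration; only the Atom and GeoRSS namespace URIs are taken as given.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GEORSS_NS = "http://www.georss.org/georss"

# Hypothetical sample feed, invented for this sketch.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <title>Hypothetical sample feed</title>
  <entry>
    <title>Entry with a location</title>
    <georss:point>45.256 -71.92</georss:point>
  </entry>
  <entry>
    <title>Entry without a location</title>
  </entry>
</feed>"""


def extract_points(feed_xml):
    """Return (title, latitude, longitude) for each entry with a georss:point."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(f"{{{ATOM_NS}}}entry"):
        point = entry.find(f"{{{GEORSS_NS}}}point")
        if point is None or not point.text:
            continue  # entry carries no location
        lat_text, lon_text = point.text.split()
        title = entry.findtext(f"{{{ATOM_NS}}}title", default="")
        results.append((title, float(lat_text), float(lon_text)))
    return results


def within_box(points, south, west, north, east):
    """Keep only points inside a latitude/longitude bounding box."""
    return [p for p in points
            if south <= p[1] <= north and west <= p[2] <= east]


if __name__ == "__main__":
    points = extract_points(SAMPLE_FEED)
    print(within_box(points, south=45.0, west=-72.0, north=46.0, east=-71.0))
```

The bounding-box filter is the simplest form of geographic search; it assumes WGS84 latitude/longitude and ignores boxes that cross the antimeridian.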

IBM DB2 "Viper" Revs XML Engine
Sean McCown, InfoWorld
The upcoming release of IBM's DB2 database promises a richer XML experience. Viper (codename) contains an extensive list of enhancements that covers everything from security and development to storage and administration, but topping the list is a newly integrated XML storage engine that Big Blue says will put Microsoft and Oracle to shame. The addition of the native XML engine makes XML a separate but equal partner to relational data in DB2. Whereas the Microsoft and Oracle databases use structured CLOBs (Character Large Objects) to store XML documents, IBM uses a new storage format that the company says better suits the complexities of XML data storage and retrieval. The IBM scheme is actually a parallel data storage manager that parses XML data into a hierarchical format that, unlike CLOBs, supports native XML querying without degrading performance, according to IBM. The company claims the separate XML storage engine gives Viper a performance increase of anywhere between 2 and 7 times over Microsoft SQL Server and Oracle Database. Along with the new engine comes a supporting cast of other features that is sure to please hardcore developers. An XQuery builder helps you create queries against XML data; and Control Center, Visual Explain, db2look, and CLP (command line processor) have all been enhanced to support the management of XML data. IBM has also developed all new index structures for dealing with the new XML model.
See also: the XML description

Not in Schema Wildcard
Dave Orchard, Blog
The W3C XML Schema WG is now talking about how to do "versioning" in XML Schema 1.1, yeah! There are a lot of different approaches that are possible and better than the status quo. Roughly the requirements are allowing additional content in mixed namespace documents with forwards and backwards compatible schemas. One approach that I think meets the 80/20 point is the "allow anything not declared in the Schema" extensibility model. I first published this way back in December 2003 and I like it more and more. The XML Schema WG has collected use cases and I contributed a bunch of Web services versioning use cases to help the discussion. I always like using "names" as examples. The first version of a name structure has given and family. But then we want to add a "middle" name. Because we want to combine with extension (which happens at the end of content models), we'll show the "new" content at the end...
See also: XML Schemas

Accessing the Web of Databases
Jon Udell, InfoWorld
A world of possibilities is revealed when you view the Web as a network of interconnected data. The Web is becoming a database -- or, more precisely, a network of databases. All of the trends that inform this 'Strategic Developer' column -- including Web services, REST (Representational State Transfer), AJAX (Asynchronous JavaScript and XML), and interpersonal as well as interprocess collaboration -- can be usefully refracted through that lens. I've always regarded the Web as a programmable data source as well as a platform for the document/software hybrid that we call a Web page. Early on, programmable access to Web data entailed a lot of screen scraping. Nowadays it often still does, but it's becoming common to find APIs that serve up the Web's data. If you want to remix the InfoWorld metadata explorer, for example, as Mike Parsons did, you can fetch its data directly as XML. The holistic view of that network should be our focus. In Kingsley Idehen's view, you'll use something like SPARQL -- a query language for the semantic Web -- to traverse a graph of interlinked sites, and to merge interesting sources into a virtual collection. Then you'll dispatch queries to each member of that collection. They'll offer a range of query styles ranging from free text search to iteration over simple key/value pairs (accessed by way of RSS or Atom) to tree traversal (XPath, XQuery) and relational query (SQL). I think he's got it exactly right.

XQuery 1.0 and XPath 2.0 Full-Text Use Cases
Sihem Amer-Yahia and Pat Case (eds)
W3C has announced the release of an updated version of a Working Draft for "XQuery 1.0 and XPath 2.0 Full-Text Use Cases", together with the companion "XQuery 1.0 and XPath 2.0 Full-Text." As XML becomes mainstream, users expect to be able to search their XML documents; this requires a standard way to do full-text search, as well as structured searches, against XML documents. The Use Cases document was produced through the joint efforts of the W3C XML Query Working Group and the XSL Working Group. The use cases illustrate important applications of full-text querying within an XML query language. Each use case exercises a specific functionality relevant to full-text querying. An XML Schema and sample input data are provided. Each use case specifies a query applied to the input data, a solution in XQuery, a solution in XPath (when possible), and the expected results. The full-text queries in the following use cases are performed on text which has been tokenized, i.e., broken into a sequence of words, units of punctuation, and spaces. A word is defined as any character, n-gram, or sequence of characters returned by a tokenizer as a basic unit to be queried. Each instance of a word consists of zero or more consecutive characters. Beyond that, words are implementation-defined. Note that consecutive words need not be separated by either punctuation or space, and words may overlap. A phrase is an ordered list of words. A phrase may contain any number of words. Three new use cases have been added: (1) a query calling a stop word list but excluding a word from the list, (2) a query with a weight declaration, and (3) a query with an embedded XQuery expression.
See also: XML and Query Languages
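
To make the tokenization vocabulary above concrete, here is a small sketch in Python. It is not XQuery Full-Text and not the tokenizer of any particular implementation; since the draft leaves words implementation-defined, the regular expression used here is only one assumed choice, and the function names are illustrative.

```python
import re

# One assumed tokenization: runs of word characters, runs of whitespace,
# and single punctuation characters. Real implementations may differ.
TOKEN_RE = re.compile(r"(?P<word>\w+)|(?P<space>\s+)|(?P<punct>[^\w\s])")


def tokenize(text):
    """Break text into (kind, token) pairs: words, punctuation, and spaces."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


def words(text):
    """Return only the word tokens, lowercased for case-insensitive matching."""
    return [tok.lower() for kind, tok in tokenize(text) if kind == "word"]


def contains_phrase(text, phrase):
    """True if the phrase (an ordered list of words) occurs in the text."""
    haystack, needle = words(text), words(phrase)
    if not needle:
        return False
    n = len(needle)
    return any(haystack[i:i + n] == needle
               for i in range(len(haystack) - n + 1))


if __name__ == "__main__":
    print(tokenize("Hi, all"))
    print(contains_phrase("Usability testing, step by step.", "step by step"))
```

Because the phrase match runs over word tokens only, punctuation and spacing between words do not affect the result, which mirrors the idea that a phrase is defined purely as an ordered list of words.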

Nobody Reads a Column About Open Standards
John Savarese, Campus Technology
The University System of Georgia institutions were stuck with doing a lot of hand work with transcripts. By fall 2006, however, transcripts passing in and out of most Georgia institutions will finally start to move at the pace we expect from electronic transmissions. Two developments made this possible: the creation of a standard by a truly community-based organization, and the implementation of that standard in software by a standards-savvy software vendor. First, the community, acting through its surrogate in matters like this (the Postsecondary Electronic Standards Council), developed a versatile way of representing transcripts in Extensible Markup Language (XML). PESC brings together subject-matter experts to hammer out standards for data exchange. In developing the XML transcript standard, the PESC committees included representatives of educational institutions that act as trading partners -- professional organizations like the American Association of Collegiate Registrars and Admissions Officers, and vendors that provide the software and services that have to interact with the standards. While PESC stands for the opposite of de facto, vendor-driven standards, the organization tries to build a cooperative community that includes vendor participation. The current PESC board of directors includes representation from companies like Oracle, Datatel, SunGard SCT, Sallie Mae, and the National Student Clearinghouse, alongside the University of Oklahoma, the University of Illinois, and Bowling Green State University (OH).
See also: PESC references

XML.org is an OASIS Information Channel sponsored by Innodata Isogen and SAP.
