XML and Web Services In The News - 22 June 2006

Provided by OASIS | Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by BEA Systems, Inc.

HEADLINES:

	Metadata Interoperability and Standardization: Schema Level Interop
	Achieving Metadata Interoperability at the Record and Repository Levels
	A Unified Standard Format for Proteomics Mass Spectrometry Data
	Scaling Up with XQuery, Part 2
	Motorola Joins Eclipse to Boost Mobile Linux Apps
	Big Guns Take Sides in Standards Shootout
	Real-World Rule Engines
	Metasearch Authentication and Access Management

Metadata Interoperability and Standardization -- A Study of Methodology, Part I, Achieving Interoperability at the Schema Level
Lois Mai Chan and Marcia Lei Zeng, D-Lib Magazine
The rapid growth of Internet resources and digital collections has been accompanied by a proliferation of metadata schemas, each of which has been designed based on the requirements of particular user communities, intended users, types of materials, subject domains, project needs, etc. This article contains an analysis of the methods that have been used to achieve or improve interoperability among metadata schemas and applications, for the purposes of facilitating conversion and exchange of metadata and enabling cross-domain metadata harvesting and federated searches. From a methodological point of view, implementing interoperability may be considered at different levels of operation: schema level, record level, and repository level. A metadata schema consists of a set of elements designed for a specific purpose, such as describing a particular type of information resource. In the literature, the words "schema", "scheme", and "element set" have been used interchangeably to refer to metadata standards. In practice, the word "schema" usually refers to an entire entity including the semantic and content components (which are usually regarded as an "element set") as well as the encoding of the elements with a markup language such as SGML and XML.
See also: OAI-PMH

Metadata Interoperability and Standardization -- A Study of Methodology, Part II, Achieving Interoperability at the Record and Repository Levels
Lois Mai Chan and Marcia Lei Zeng, D-Lib Magazine
This is the second part of an analysis of the methods that have been used to achieve or improve interoperability among metadata schemas and their applications in order to facilitate the conversion and exchange of metadata and to enable cross-domain metadata harvesting and federated searches. Results of efforts to improve interoperability can be observed at three different levels: (1) Schema level -- Efforts are focused on the elements of the schemas, being independent of any applications. The results usually appear as derived element sets or encoded schemas, crosswalks, application profiles, and element registries. (2) Record level -- Efforts are intended to integrate the metadata records through the mapping of the elements according to the semantic meanings of these elements. Common results include converted records and new records resulting from combining values of existing records. (3) Repository level -- With harvested or integrated records from varying sources, efforts at this level focus on mapping value strings associated with particular elements (e.g., terms associated with subject or format elements). The results enable cross-collection searching.
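The record-level conversion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the crosswalk table, the source element names, and the `convert_record` helper are hypothetical and are not taken from the article or from any published crosswalk.

```python
# Hypothetical crosswalk: source element name -> target element name.
# The names are illustrative only, not drawn from a published mapping.
CROSSWALK = {
    "title": "dc:title",
    "author": "dc:creator",
    "topic": "dc:subject",
}


def convert_record(record, crosswalk):
    """Map a source record onto the target schema.

    Returns (converted, unmapped): elements with no crosswalk entry are
    kept aside rather than silently dropped, so the loss is visible.
    """
    converted = {}
    unmapped = {}
    for element, value in record.items():
        target = crosswalk.get(element)
        if target is None:
            unmapped[element] = value
        else:
            # Target elements may repeat, so collect values in a list.
            converted.setdefault(target, []).append(value)
    return converted, unmapped


source = {"title": "Sample Report", "author": "A. Writer", "shelf": "B12"}
converted, unmapped = convert_record(source, CROSSWALK)
```

Returning the unmapped elements separately is a design choice for this sketch; a real conversion would need a policy for elements that have no semantic equivalent in the target schema.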

A Unified Standard Format for Proteomics Mass Spectrometry Data
Staff, GenomicsProteomics.com
The Human Proteome Organisation's Proteomics Standards Initiative (HUPO-PSI) has announced, at the Conference of the American Society for Mass Spectrometry, a roadmap for creating a unified data interchange format for proteomics mass spectrometry. The format will combine the current HUPO-PSI format (mzData) with the mzXML format and will include features from both: an interchange schema with split data vectors compatible with other analytical interchange formats, and support for both random access indexes and digital signatures via a wrapper schema. The project will also include tools to support developers and users of the format: a program to normalize XML files for random access and digital signatures; a validation program to ensure that the use of controlled vocabulary terms matches minimum reporting ("MIAPE") requirements; an application programming interface (API) including language bindings for popular programming languages; and abstract data models and other documentation to assist software developers who wish to implement systems based on the interchange format. In addition to the interchange format and software to help read and validate documents, the project will also develop reference implementations of data converters to create the format from as many mass spectrometry instruments as possible.
See also: the HUPO International website

Scaling Up with XQuery, Part 2
Bob DuCharme, XML.com
Although scaling up from Saxon's implementation of an in-memory XQuery database to a disk-based version requires a bit of extra effort, it's worth doing because you can create applications around much larger collections of data. And it can be done for free. The previous article in this series showed how to set up and use MarkLogic Server. In this article we'll see how to perform the same setup and usage tasks with two more servers: eXist and Sleepycat's Berkeley DB XML. As with MarkLogic, you usually interact with the open source eXist XQuery engine through an HTTP server that is part of the program. Sleepycat's open source Berkeley DB XML is not a server, but a library built on top of their Berkeley DB database. Sleepycat offers APIs for DB XML in C++, Java, Perl, Python, Ruby, and Tcl. Each of these XQuery engines has many more features than are covered in this article, such as index control, updating, and full-text searching; the goal is to get you to the point where you could start exploring those features with a reasonably large collection of your own data. Without spending any money, you can check them all out and discover the advantages of having large amounts of your XML stored in a database where you (or an application!) can use a W3C standard language to quickly retrieve what you want from that database.
See also: XML and Query Languages

Motorola Joins Eclipse to Boost Mobile Linux Apps
Paul Krill, InfoWorld
Motorola announced that it has joined the Eclipse Foundation as a Strategic Developer member and is proposing a project to boost mobile Linux application development. With Strategic Developer status, Motorola has a seat on the open source tools organization's board of directors and participates in the Eclipse Architecture, Requirements and Planning councils, the company said. The company had participated in Eclipse projects before but had not signed up as a Strategic Developer-level member. Motorola is working with Eclipse to propose an Eclipse Tools for mobile Linux (TmL) project, which would be part of the Device Software Development Platform (DSDP) Top-Level Project at Eclipse. The TmL effort is intended eventually to provide a home for mobile Linux extensions. Motorola's Eclipse membership is regarded by the company as another step in promoting awareness and adoption of Linux in the mobile space.
See also: the press release

Big Guns Take Sides in Standards Shootout
Chris Preimesberger, eWEEK
With data storage and so-called ILM (information lifecycle management) becoming hotter than the weather this summer, industry leaders are jockeying for position and political clout, much like the identity management market did a few years ago, when Microsoft started its Passport group and Sun Microsystems countered with the Liberty Alliance. This time, it's IBM leading the way in the Aperi consortium against a new one announced June 22 at Storage World Conference 2006 in Long Beach -- one that still needs a name but features five heavyweight competitors in EMC, Hewlett-Packard, Sun, Hitachi Data Systems and Symantec. Unlike Aperi, the five companies are working with established standards bodies to advance a common standard API (application programming interface) for storage customers. The companies, collectively representing more than half the worldwide market share for enterprise storage management software, will work together to ensure that the SNIA's (Storage Networking Industry Association) SMI-S (Storage Management Initiative specification) becomes a common, widely used industry standard. Aperi is mainly composed of IBM's OEM suppliers and partners, and it's modeling the APIs using the Eclipse software development environment.
See also: CIM-XML

Real-World Rule Engines
Geoffrey Wiseman, InfoQ
For many developers, rule engines are buzzwords, or black boxes on an architectural diagram: something to be feared or admired from afar, but not understood. A rule engine is, at its core, a mechanism for executing 'business rules'. Business rules are simple business-oriented statements that encode business decisions of some kind, often phrased very simply in an if/then conditional form. Rule engines are not limited to execution; they often come with other tools to manage rules: common options allow the creation, deployment, storage, versioning and other such administration of rules, either individually or in groups. One example of a rule engine is Drools, which has recently been brought under the banner of the JBoss group. Because Drools is freely available, is open-source, and has a good community, it's a good starting place for exploring rule engines. In general, you might consider a business rule solution if you need to externalize business rules, support rapid change and empower business users to change business rules. You'll get the most out of a rule engine if you accept the new paradigm by relinquishing flow control, using fine-grained rules and objects, avoiding cross-products, and understanding the combinatorics and recursion that a rule approach can create.
See also: W3C Rule Interchange Format Working Group
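The if/then form of business rules, and the idea of relinquishing flow control, can be illustrated with a toy forward-chaining loop in Python. This is a minimal sketch, not how Drools works internally: the `Rule` class, the `run_rules` function, and the two sample rules are all hypothetical, and the naive re-scanning loop stands in for the far more efficient matching a real engine would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # the "if" part, evaluated on the facts
    action: Callable[[dict], None]     # the "then" part, may change the facts


def run_rules(rules, facts, max_passes=10):
    """Fire each rule at most once, re-scanning until no rule fires.

    The caller does not decide the order of execution; a rule fires
    whenever its condition becomes true. Returns names in firing order.
    """
    fired = []
    for _ in range(max_passes):
        progress = False
        for rule in rules:
            if rule.name not in fired and rule.condition(facts):
                rule.action(facts)
                fired.append(rule.name)
                progress = True
        if not progress:
            break
    return fired


# Hypothetical rules, deliberately listed "out of order".
rules = [
    Rule("free-shipping",
         lambda f: f.get("tier") == "gold",
         lambda f: f.update(shipping=0)),
    Rule("gold-tier",
         lambda f: f["order_total"] >= 1000,
         lambda f: f.update(tier="gold")),
]
facts = {"order_total": 1200, "shipping": 15}
fired = run_rules(rules, facts)
```

Although "free-shipping" is listed first, it fires only after "gold-tier" has set the tier it depends on, which is the sense in which the rule author gives up explicit flow control.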

Metasearch Authentication and Access Management
Michael Teets and Peter Murray, D-Lib Magazine
Metasearch -- also called parallel search, federated search, broadcast search, and cross-database search -- has become commonplace in the information community's vocabulary. All speak to a common theme of searching and retrieving from multiple databases, sources, platforms, protocols, and vendors at the point of the user's request. Metasearch services rely on a variety of approaches including open standards (such as NISO's Z39.50 and SRU/SRW), proprietary programming interfaces, and 'screen scraping.' However, the absence of widely supported standards, best practices, and tools makes the metasearch environment less efficient for the metasearch provider, the content provider, and ultimately the end-user. This article summarizes the work and final recommendations of the Access Management Task Group, one of three groups chartered by NISO as part of the Metasearch Initiative. The focus of the group was on gathering requirements for Metasearch authentication and access needs, inventorying existing processes, developing a series of formal use cases describing the access needs, recommending best practices given today's processes, and recommending and pursuing changes to current solutions to better support metasearch applications. Metasearch Initiative task groups have approved an XML Gateway Implementors Guide, NISO Z39.92-200x Information Retrieval Service Description Specification, and the NISO Z39.91-200x Collection Description Specification (profile for DCMI Abstract Model with an XML binding).
See also: NISO MetaSearch Initiative
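The basic metasearch pattern, sending one query to several sources at the point of the user's request and merging the results, might look like the following Python sketch. Everything here is an assumption made for illustration: the in-memory sources, `make_source`, and `metasearch` are hypothetical, and a real service would reach its sources through protocols such as Z39.50 or SRU and would handle the authentication concerns the article discusses.

```python
from concurrent.futures import ThreadPoolExecutor


def make_source(name, documents):
    """Build a hypothetical search function over an in-memory title list."""
    def search(term):
        return [(name, doc) for doc in documents if term.lower() in doc.lower()]
    return search


# Stand-ins for remote databases; the names and titles are made up.
SOURCES = [
    make_source("catalog", ["XML Handbook", "Rule Engines in Practice"]),
    make_source("journals", ["Metadata and XML", "Storage Standards"]),
]


def metasearch(term, sources):
    """Query every source concurrently and merge the hits into one list."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        # pool.map returns results in the same order as `sources`.
        result_lists = pool.map(lambda search: search(term), sources)
        return [hit for hits in result_lists for hit in hits]


hits = metasearch("xml", SOURCES)
```

Each hit keeps the name of the source it came from, since a merged result list is of little use if the user cannot tell which database to go back to.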

XML.org is an OASIS Information Channel sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun Microsystems, Inc.

Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.