XML and Web Services In The News - 11 August 2006

Provided by OASIS | Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by IBM


HEADLINES:

 Solr: Indexing XML with Lucene and REST
 W3C Releases SVG Tiny 1.2 As a Candidate Recommendation
 BPEL: Creating Simple Asynchronous & Synchronous Business Processes
 The Java XML Validation API: Check Documents for Conformance to Schemas
 Yahoo Delivers Resource for Python Developers
 DB2: XML in Focus
 Call for Asia to Adopt ODF
 Why Microsoft Should Open XAML
 Healthcare, Meet Open Source
 XML Programming with PHP and Ajax
 Authoritative Metadata

Solr: Indexing XML with Lucene and REST
Bertrand Delacretaz, XML.com
Solr (pronounced "solar") builds on the well-known Lucene search engine library to create an enterprise search server with a simple HTTP/XML interface. Using Solr, large collections of documents can be indexed based on strongly typed field definitions, thereby taking advantage of Lucene's powerful full-text search features. This article describes Solr's indexing interface and its main features, and shows how field-type definitions are used for precise content analysis. Solr began at CNET Networks, where it is used to provide high-relevancy search and faceted browsing capabilities. Although quite new as a public project (the code was first published in January 2006), it is already used for several high-traffic websites. The project is currently in incubation at the Apache Software Foundation (ASF). This means that it is a candidate for becoming an official project of the ASF, after an observation phase during which the project's community and code are examined for conformity to the ASF's principles. With Solr you have all the indexing power of Lucene under the hood, with its highly customizable analyzers, similarity searches, controlled ranking of results, faceted browsing, etc. Also, having been designed for high-traffic systems means that Solr's performance and scalability are already up there with the best. Index replication between search servers is available, Solr's no-nonsense HTTP interface makes it possible to create search clusters using common HTTP load-balancing mechanisms, and powerful internal caches help get the most out of each Solr instance.
See also: Solr Apache Incubator Project
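As a rough illustration of the HTTP/XML indexing interface described above, the following Python sketch builds an update message in Solr's <add>/<doc>/<field> form and shows how it might be posted. The field names ("id", "title"), the helper names, and the server URL are assumptions made for this example; a real index accepts only the fields defined in its own schema.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build an <add> update message from a list of field dictionaries."""
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value in fields.items():
            field = ET.SubElement(doc, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > for us
    return ET.tostring(add, encoding="unicode")


def post_update(xml_message, url="http://localhost:8983/solr/update"):
    """POST an update message to a Solr server.

    The URL is an assumed local installation; adjust it for a real server.
    This function is defined for illustration and is not called below.
    """
    request = urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Hypothetical document with two assumed fields.
message = build_add_message([{"id": "doc-1", "title": "Indexing XML with Solr"}])
print(message)
```

Newly added documents typically become searchable only after a separate <commit/> message is posted to the same update URL.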

W3C Releases SVG Tiny 1.2 As a Candidate Recommendation
Ola Andersson, Robin Berjon, et al. (eds), W3C Technical Report
W3C has announced the advancement of the "Scalable Vector Graphics (SVG) Tiny 1.2 Specification" to W3C Candidate Recommendation as of 10 August 2006. With native support shipping in Opera and Firefox browsers on desktops, the SVG language describes interactive vector graphics, text, images, animation and graphical applications in XML. SVG Tiny 1.2 is designed for Web access by devices of all sizes from handhelds to desktops, automobile media centers and entertainment consoles. The specification describes a collection of abstract modules that provide specific units of functionality. These modules may be combined with each other and with modules defined in other specifications (such as XHTML) to create SVG subset and extension document types that qualify as members of the SVG family of document types. The SVG Working Group expects to request that the Director advance this document to Proposed Recommendation once the Working Group has demonstrated at least two interoperable implementations for each test in the SVG Tiny 1.2 test suite; furthermore, at least one of the passing implementations must be on a mobile platform. The SVG Working Group, working closely with the developer community, expects to show these implementations by January 2007. This estimate is based on the Working Group's preliminary implementation report. The Working Group expects to revise this report over the course of the implementation period. The Working Group does not plan to request to advance to Proposed Recommendation prior to 10 November 2006. The companion "SVGT 1.2 Requirements" specification has also been updated.
See also: the announcement

BPEL: Creating Simple Asynchronous & Synchronous Business Processes
Gopalan Suresh Raj, Web Cornucopia
This tutorial provides an overview of the sample project, AsynchronousSample, and illustrates deploying, executing and testing an asynchronous BPEL process using the NetBeans 5.5 Early Access bundle with all the necessary runtimes. The process is simple. It is basically an echo process, but it is an asynchronous echo, not a synchronous echo. A client sends the process a message. The process receives the input message and returns immediately. Then the process asynchronously calls the original client and sends the same message back. An asynchronous process is used when the BPEL process is long running (takes a long time to compute the result) and the results are returned to the client by doing an invocation on the client. In this tutorial you will use a simple BPEL project called AsynchronousSample and a Composite Application project called AsynchronousSampleApplication. The project includes WSDL and Schema files, a deployment descriptor, and input files for testing. The web service interface for this process is a single asynchronous operation. The NetBeans Enterprise Pack 5.5 Early Access that is part of the Java EE Tools Bundle is a free download that comes with a plethora of tooling that helps the SOA developer. Tools like the XML Schema (XSD) Editor, the WSDL Editor, the BPEL Visual Designer, and all the other tools that are part of this download help the SOA developer be extremely productive.
See also: BPEL Designer Feature
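The asynchronous echo pattern described above (accept the message, return immediately, call the client back later) can be sketched outside BPEL. The following is a Python analogue using asyncio, not the tutorial's BPEL process; the class and function names are hypothetical, and the short sleep merely stands in for long-running work.

```python
import asyncio


class AsyncEchoProcess:
    """Toy stand-in for an asynchronous echo process (not real BPEL)."""

    def __init__(self):
        self._pending = set()  # keep references so tasks are not garbage collected

    def receive(self, message, callback):
        """One-way operation: schedule the reply and return immediately."""
        task = asyncio.create_task(self._reply_later(message, callback))
        self._pending.add(task)
        task.add_done_callback(self._pending.discard)
        return "accepted"

    async def _reply_later(self, message, callback):
        await asyncio.sleep(0.01)  # placeholder for long-running work
        callback(message)          # invoke the original client with the same message

    async def drain(self):
        """Wait until every scheduled callback has been delivered."""
        if self._pending:
            await asyncio.gather(*self._pending)


async def run_client(message):
    events = []
    process = AsyncEchoProcess()
    status = process.receive(message, lambda reply: events.append(("callback", reply)))
    events.append(("returned", status))  # recorded before the echo arrives
    await process.drain()
    return events


events = asyncio.run(run_client("hello"))
print(events)
```

The recorded event order shows the point of the pattern: the client regains control before the echoed message comes back through its callback.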

The Java XML Validation API: Check Documents for Conformance to Schemas
Elliotte Rusty Harold, IBM developerWorks
Validation reports whether a document adheres to the rules specified by the schema. It enables you to quickly check that input is roughly in the form you expect and quickly reject any document that is too far away from what your process can handle. If there's a problem with the data, it's better to find out earlier than later. Different parsers and tools support different schema languages such as DTDs, the W3C XML Schema Language, RELAX NG, and Schematron. In the context of Extensible Markup Language (XML), validation normally involves writing a detailed specification for the document's contents in any of several schema languages such as the World Wide Web Consortium (W3C) XML Schema Language (XSD), RELAX NG, Document Type Definitions (DTDs), and Schematron. Sometimes validation is performed while parsing, sometimes immediately after. However, it's usually done before any further processing of the input takes place. Until recently, the exact Application Programming Interface (API) by which programs requested validation varied with the schema language and parser. DTDs and XSD were normally accessed as configuration options in Simple API for XML (SAX), Document Object Model (DOM), and Java API for XML Processing (JAXP). RELAX NG required a custom library and API. Schematron might use the Transformation API for XML (TrAX); and still other schema languages required programmers to learn still more APIs, even though they were performing essentially the same operation. Java 5 adds a uniform validation API that can compare documents to schemas written in these and other languages.

Yahoo Delivers Resource for Python Developers
Darryl K. Taft, eWEEK
Yahoo has created a new resource called the 'Yahoo Developer Network - Python Developer Center'. The Center is a Web site that provides Python developers with access to information to help them build applications in the Python object-oriented dynamic language. Yahoo officials said the Sunnyvale, Calif., company quietly launched the site as a developer resource for information about using Python with Yahoo Web Services APIs. Simon Willison, the developer who put together the Yahoo site, said the bulk of the information on the site is "how-tos" that show developers how to do various things with Python. Some of the specific how-tos Willison pointed out include: Make Yahoo Web Service REST calls with Python, Cache API calls using Python, Parse JSON using Python, Parse XML using Python, Access the Yahoo Search APIs using pYsearch, and Access Yahoo RSS feeds using Python. pYsearch is an open-source Python library for accessing the Yahoo Search APIs. The Yahoo Python Developer Center also features links to several Python educational resources, including Python.org, the home of Python on the Web; the Python Cookbook, a collection of useful Python code snippets; and the Python Package Index, which offers a range of open-source Python packages for developers to install.
See also: Python Developer Center
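To give a flavor of the "Parse JSON using Python" and "Parse XML using Python" how-tos mentioned above, here is a minimal sketch using only the Python standard library. The sample payloads and function names are invented for this example and are not the actual response format of any Yahoo Web Service.

```python
import json
import xml.etree.ElementTree as ET

# Invented sample payloads; real service responses will be shaped differently.
JSON_PAYLOAD = '{"results": [{"title": "Python.org", "url": "http://www.python.org/"}]}'
XML_PAYLOAD = (
    "<results>"
    "<result><title>Python.org</title><url>http://www.python.org/</url></result>"
    "</results>"
)


def titles_from_json(text):
    """Return the title of every entry in a JSON result list."""
    return [item["title"] for item in json.loads(text)["results"]]


def titles_from_xml(text):
    """Return the title of every <result> element in an XML result list."""
    root = ET.fromstring(text)
    return [result.findtext("title") for result in root.findall("result")]


print(titles_from_json(JSON_PAYLOAD))
print(titles_from_xml(XML_PAYLOAD))
```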

XML in Focus
Ken North, DB2 Magazine, Special Issue on XML
DB2 9's "pureXML" technology is speeding development for early customers, including financial-services giant Storebrand. Explore the developer-friendly features behind the radical improvements. Since IBM introduced object-relational technology with DB2 Universal Database 5.0, Internet technology, distributed computing, and, most recently, Extensible Markup Language (XML) have exerted a major influence on computing. XML turns a spotlight on document-centric computing, new standard formats for office documents, and SQL/XML:2003, the successor to the SQL standard. Content management and Web-facing applications often involve storing and retrieving XML data. XML provides the underpinnings for data integration, process integration, and enterprise information integration. XML also provides enabling technology for a new distributed computing model that includes Web services, grid services, and service-oriented architectures (SOA). DB2 9's ability to process both XML and SQL is a substantial benefit. It enables the use of a single database platform for data processing, document processing, and SOA. To someone grounded in SQL and tabular structures, XML opens the door to a structured document mindset and new query technology. A common approach to integrating XML into an SQL platform is to support queries over XML by mapping to relational algebra. This approach uses the existing relational engine, which DB2 XML Extender has done since DB2 UDB 6.1. In DB2 9, a single engine (optimized for both XML and relational data) processes relational and XML (hierarchical) data; however, the two data types reside in separate storage layers. The new engine treats an XML document as a parsed, annotated tree structure and supports indexing parts of documents. Hand-in-hand with the new XML data store, DB2 9 supports the SQL/XML:2003 XML type, SQL/XML functions, and XQuery.
DB2 9 lets you query XML data using XQuery alone, SQL alone, XQuery that invokes SQL, and SQL/XML functions that execute XQuery expressions.
See also: the Editor's intro

Call for Asia to Adopt ODF
Aaron Tan, ZDNet Asia News
An official from the United Nations (U.N.) has called for countries in the Asia-Pacific region to embrace the OpenDocument format. Sunil Abraham, manager of the International Open Source Network (IOSN) at the U.N., told ZDNet Asia that most governments in the region have already stated their support for open standards, through their respective government interoperability frameworks. He hopes that governments in the region will now extend that support and "seriously consider" the OpenDocument Format (ODF). Last month, Malaysia became one of the first Asian countries to propose the use of ODF as a national standard for office documents. Hasannudin Saidin, a member of Sirim, the country's standards development agency, said on his blog last month that the proposal will now undergo approval from a higher-level committee within Sirim. Public consultation on the proposal will stretch over two months, beginning in September and ending in October 2006, after which comments will be raised to the Malaysian Minister of Science, Technology and Innovation. According to Saidin, ODF is expected to become a Malaysian-defined standard, MS 26300, by the year-end. In the Philippines, there is no official policy on the adoption of ODF in the country, according to Peter Antonio Banzon, division chief of the Philippines' Advanced Science & Technology Institute, although the government agency has already standardized its internal documents on the ODF.

Why Microsoft Should Open XAML
Jon Udell, InfoWorld
Open standards are key to leading the rich Internet applications market. In his recent blog entry, Google's Joe Beda accepts partial blame for the excruciatingly slow progress of the Windows Presentation Foundation (aka Avalon). The idea, he admits, was to go big and 'build something only Microsoft can build.' With 20/20 hindsight, Beda wishes things had been done differently: a smaller team, incremental releases. And he holds out some hope for the awkwardly named Windows Presentation Foundation/Everywhere (WPF/E), the lightweight, portable, .Net-based 'Flash killer,' that I discussed in my interview with Bill Gates from the 2005 Professional Developers Conference. The WPF/E runtime won't implement all of XAML (Extensible Application Markup Language), a .Net language tuned for declarative application layout. But 'the portion of XAML we've picked,' Gates told me, 'will be everywhere, absolutely everywhere, and it has to be.' Here's a crazy idea: Open-source the WPF/E, endorse a Mono-based version, and make XAML an open standard. Why? Because an Adobe/Microsoft arms race ignores the real competition: Web 2.0, and the service infrastructure that supports it. The HTML/JavaScript browser has been shown to be capable of tricks once thought impossible. Meanwhile, though, we're moving inexorably toward so-called RIAs (rich Internet applications) that are defined, at least in part, by such declarative XML languages as Adobe's MXML, Microsoft's XAML, Mozilla's XUL (XML User Interface Language), and a flock of other variations on the theme. Imagine a world in which browsers are ubiquitous, yet balkanized by incompatible versions of HTML. That's just where RIA players and their XML languages are taking us. Is there an alternative? Sure. Open XAML. There's a stake in the ground that future historians could not forget.

Healthcare, Meet Open Source
Sean Michael Kerner, InternetNews.com
Though the ability to collaborate and share information is a critical component of modern IT infrastructures, it is often lacking in healthcare environments, where siloed information is the norm. Such information is housed on proprietary computing architectures that can't always be accessed by different platforms. Taking its cue to deliver a salve for this situation, IBM this week said it is open sourcing technology to the Eclipse Foundation's Open Healthcare Framework (OHF) project in an effort to bridge the information silos. "Medical facilities and doctors all have their own ways of communicating and distributing medical information, much of it hard copy," said Scott Handy, vice president of worldwide Linux and open source at IBM. "There is no good way to transmit medical information because there was no standard." Even with a standard in place, solutions would still be difficult to come by, which is why IBM is open sourcing an implementation of a health care information exchange standard. Handy said that because abstract specs are often so hard to collaborate on among different vendors, an open source implementation of a specification is the best way to collaborate. Eclipse OHF is endeavoring to create a standards-based platform for the healthcare software industry. IBM is no novice in open sourcing healthcare software. In 2005, the systems vendor began an effort called the Interoperable Healthcare Information Infrastructure (IHII) project, which includes an SOA approach to exchange information using OHF.
See also: Open Healthcare Framework (OHF) Project

XML Programming with PHP and Ajax
Hardeep Singh and Cindy Saracco, DB2 Magazine
DB2 and other relational databases have matured considerably in their XML offerings, making them an ideal choice to store and manage XML data in addition to relational data. DB2 9 XML support (called pureXML) provides the capability to store XML in its pure form (in other words, in annotated, tree-like, hierarchical storage). Inside DB2 9, XML data can be indexed using XML patterns, composed from relational data, decomposed to relational data, and queried, transformed, and published stand-alone or combined with relational data using a mix of SQL/XML and XQuery. Web browsers are also providing more functionality to client script to efficiently handle XML. Using Asynchronous JavaScript and XML (Ajax), Web pages can now make direct remote procedure calls to application servers and use DOM APIs on any returned XML data. This article shows how to exploit the capabilities provided by DB2 XML, Ajax, and PHP: Hypertext Preprocessor (PHP) to write simple XML-based applications. With the help of a sample scenario, you will learn how to make JavaScript calls to a PHP application, how to modify any XML data using DOM and SimpleXML APIs, how to transfer the XML from the client to application to database, and how to create a PHP Web service to publish reports on the XML data using SQL/XML and XQuery. XML provides developers with the ability to define rules and structures for business documents as well as to instantiate the documents in memory as hierarchical objects that can be navigated, modified, and serialized in any of the tiers using standard APIs. Ajax enables Web-based client scripts to call DOM APIs and make remote procedure calls to a middle tier. PHP provides one of the simplest approaches for handling XML and Web services, making it a perfect fit for XML-based application development.

Authoritative Metadata
Roy T. Fielding and Ian Jacobs (eds), Approved W3C TAG Finding
Vincent Quint announced that the W3C Technical Architecture Group (TAG) has approved the finding on "Authoritative Metadata." This release is an update to the previously approved finding of 25-February-2004. W3C created the TAG to document and build consensus around principles of Web architecture and to interpret and clarify these principles when necessary. The TAG also resolves issues involving general Web architecture brought to the TAG, and helps coordinate cross-technology architecture developments inside and outside W3C. In Web architecture, communication between agents consists of exchanging messages with predefined syntax and semantics: a shared expectation of how each message's control data and payload (representation data and metadata) will be interpreted by the recipient. When supported by the communication protocol, the Web architecture uses representation metadata to indicate the sender's intentions regarding how the recipient should interpret the representation data. For example, HTTP and MIME use the value of the "Content-Type" header field to indicate the Internet media type of the representation, which influences the dispatching of handlers and security-related decisions made by recipients of the message. The key architectural points of this finding: (1) Metadata received in an encapsulating container, such as the metadata within the header fields of a message that describe the data enclosed within that message, is authoritative in defining the nature of the data received. (2) Inconsistency between representation data and metadata is an error that should be discovered and corrected rather than silently ignored. (3) An agent MUST NOT ignore or override authoritative metadata without the consent of the party employing the agent. (4) Specifications MUST NOT work against the Web architecture by requiring or suggesting that a recipient override authoritative metadata without user consent.
See also: the document overview
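Points (1) and (2) of the finding can be pictured with a small Python sketch: a recipient dispatches on the "Content-Type" header rather than sniffing the body, and reports an inconsistency instead of silently working around it. The function names, the exception class, and the two handled media types are assumptions chosen for this example; this is one possible reading of the finding, not code taken from it.

```python
import json


class InconsistentRepresentation(ValueError):
    """Raised when the body does not match its declared media type."""


def media_type(headers):
    """Extract the bare media type from a Content-Type header value."""
    value = headers.get("Content-Type", "")
    return value.split(";", 1)[0].strip().lower()


def interpret(headers, body):
    """Choose a handler from the declared media type, never from the body."""
    kind = media_type(headers)
    if kind == "text/plain":
        # Declared as plain text, so it stays text even if it looks like markup.
        return ("text", body)
    if kind == "application/json":
        try:
            return ("json", json.loads(body))
        except json.JSONDecodeError as exc:
            # Surface the mismatch rather than guessing another type.
            raise InconsistentRepresentation(
                "body is not valid JSON despite its Content-Type"
            ) from exc
    raise ValueError("no handler for media type %r" % kind)


kind, data = interpret({"Content-Type": "text/plain; charset=utf-8"}, "<html>hi</html>")
print(kind, data)
```

A body that looks like HTML is still handled as plain text here because the header says so; overriding that would require the consent of the party employing the agent, per point (3).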


XML.org is an OASIS Information Channel sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun Microsystems, Inc.

Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.

