XML and Web Services In The News - 21 November 2006

Provided by OASIS | Edited by Robin Cover

This issue of XML Daily Newslink is sponsored by BEA Systems, Inc.



HEADLINES:

 How Much Do I Ignore Thee: An Architecture to Retain Unknown Extensions
 MochiKit: Lift Up Your DOM Manipulation of XML
 Draft Charter: OASIS Enterprise Key Management Infrastructure (EKMI) TC
 SIP Interface to VoiceXML Media Services
 Microsoft and Novell Brawl Over Linux Patent FUD
 The Digital Ice Age
 SGML and the Longevity of Information


How Much Do I Ignore Thee: An Architecture to Retain Unknown Extensions
Dave Orchard, Blog
We've been evangelizing a model of versioning called the "Must Ignore Unknown" rule for a while now, as described in a versioning article published at XML.com ("Extensibility, XML Vocabularies, and XML Schema"). Roughly, it means that any extra content that isn't recognized is ignored and, specifically, no error is generated. This works very well in the Web model because any extra markup is ignored by the browser: the human reader never sees the extra content. It also works very well when the software doing the ignoring is the last piece of software looking at the data. In many applications, however, the software that receives an extension isn't the last piece. So what does it mean for it to ignore the extra content? Should it throw it away? Should it keep it but not fault? I'll call these two models "Ignore and Discard" and "Ignore but Retain". The application designer must choose which of the Ignore models to implement, and there are pros and cons to each. The discard model has the advantage that it may be simpler to implement and gives at least a simple versioning story. Language designers who have designed their systems for extensibility and versioning will usually have some flavor of the Must Ignore Unknown rule. This article examines which flavor of ignoring to use, and presents a sample architecture that preserves unknown content.
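A minimal sketch of the "Ignore but Retain" model in Python (not taken from the article; the "contact" vocabulary and element names are invented for illustration). The processor acts only on the elements it understands, raises no fault for the rest, and re-serializes everything so the unknown extension survives for the next consumer in the chain:

```python
import xml.etree.ElementTree as ET

# Hypothetical "v1" vocabulary: the only elements this version understands.
KNOWN = {"name", "email"}

def process_contact(xml_text):
    """Ignore-but-Retain: act only on known elements, but keep the
    unknown ones so they survive a round trip to later software."""
    root = ET.fromstring(xml_text)
    known = [el for el in root if el.tag in KNOWN]
    retained = [el for el in root if el.tag not in KNOWN]
    # Process only what we understand; no error is generated for the rest.
    record = {el.tag: el.text for el in known}
    # Re-serialize everything, so a later (newer) processor still sees
    # the extension content this version could not interpret.
    out = ET.Element(root.tag)
    out.extend(known + retained)
    return record, ET.tostring(out, encoding="unicode")

doc = ("<contact><name>Ann</name><email>a@example.org</email>"
       "<phone>555-0100</phone></contact>")
record, rewritten = process_contact(doc)
```

Under "Ignore and Discard", the last two lines of the function would simply drop `retained`; the retain variant costs a little bookkeeping but keeps the intermediary from silently destroying newer-version content.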
See also: the XML.com article

MochiKit: Lift Up Your DOM Manipulation of XML
David Mertz, IBM developerWorks
MochiKit is a useful and high-level library for JavaScript. MochiKit takes its main inspiration from Python, and from the many conveniences the Python standard library offers. The "X" in Ajax is there largely because ECMAScript, as implemented in Web browsers, more-or-less supports the W3C Document Object Model (DOM) specification. While you might make an argument to use the formality of the W3C DOM for a strongly and statically typed, highly structured, and carefully encapsulated language like Java — what we programmers call a bondage-and-discipline language — there seems little motivation for it in a comparatively agile language like ECMAScript. A reader might be inclined to wonder why she should bother with the XML part at all, even given what MochiKit.DOM makes easier. After all, JSON is essentially just a native JavaScript data structure, which is lighter still. Nonetheless, XML retains some advantages. On one hand, in presentation contexts, XML can be styled directly with CSS2. You can, of course, transform JSON into a stylable DOM object, but essentially that just means moving back to XML or (X)HTML. On the other hand, a lot more tools outside the ECMAScript interpreter itself talk XML than they do JSON. Data can arrive from — or be delivered back to — servers that use XML to define structured data. In some cases, this XML follows well-known and well-defined schemas, including ones that conform to published standards. If some other system in the overall communication flow wants to communicate using SVG, or OpenDocument, or TEI, or some ebXML standard, there are probably good reasons not to insert JSON as an extra layer in that mix. Fortunately, MochiKit.DOM builds on what W3C DOM is intended to do — provide an API for abstract document structures — while making the easy things easy, and the hard things a lot less hard than they are in W3C DOM.
The real magic in MochiKit.DOM is its willingness to flexibly coerce various types of objects into the right types during method calls, including doing so recursively.

Draft Charter for OASIS Enterprise Key Management Infrastructure (EKMI) TC
Staff, OASIS Announcement
OASIS announced that a draft TC charter has been submitted to establish a new Enterprise Key Management Infrastructure (EKMI) Technical Committee. Many companies are showing new interest in the management of symmetric keys used for encrypting sensitive data in their computing infrastructure. Symmetric keys have traditionally been managed by applications doing their own encryption and decryption, and there is no architecture or protocol that provides symmetric key management services across applications, operating systems, databases, etc. While there are many industry standards around protocols for the life-cycle management of asymmetric (or public/private) keys — PKCS10, PKCS7, CRMF, CMS, etc. — there is no standard that describes how applications may request similar life-cycle services for symmetric keys from a server, or how public-key cryptography may be used to provide such services. Key management needs to be addressed by enterprises in its entirety, for both symmetric and asymmetric keys. While each type of technology will require specific protocols, controls, and management disciplines, there is sufficient common ground to justify looking at key management as a whole, rather than in parts. Therefore, the TC will define the request/response protocols for: (1) Requesting a new or existing symmetric key from a server; (2) Requesting policy information from a server related to caching of keys on the client; (3) Sending a symmetric key to a requestor, based on a request; (4) Sending policy information to a requestor, based on a request; (5) Other protocol pairs as deemed necessary.
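To make the request/response pairs concrete, here is a small Python sketch of what an exchange for pairs (1) and (3) might look like. The TC has not yet defined any schema, so every element and attribute name below (SymkeyRequest, KeyMaterial, Policy/Cache, and so on) is invented purely for illustration:

```python
import xml.etree.ElementTree as ET

def build_key_request(client_id, key_class, key_id=None):
    """Pair (1): ask the server for a new or existing symmetric key.
    Omitting key_id requests a brand-new key of the given class."""
    req = ET.Element("SymkeyRequest", {"client": client_id})
    ET.SubElement(req, "KeyClass").text = key_class
    if key_id is not None:  # an existing key is being requested
        ET.SubElement(req, "KeyID").text = key_id
    return ET.tostring(req, encoding="unicode")

def parse_key_response(xml_text):
    """Pair (3): the server's reply carries the key and, alongside it,
    the client-side caching policy (the subject of pairs 2 and 4)."""
    resp = ET.fromstring(xml_text)
    return {
        "key_id": resp.findtext("KeyID"),
        "key_material": resp.findtext("KeyMaterial"),  # would be encrypted
        "cache_policy": resp.findtext("Policy/Cache"),
    }

request = build_key_request("app-42", "database-encryption")
reply = ("<SymkeyResponse><KeyID>k-2006-001</KeyID>"
         "<KeyMaterial>BASE64DATA</KeyMaterial>"
         "<Policy><Cache>no-disk</Cache></Policy></SymkeyResponse>")
info = parse_key_response(reply)
```

In a real protocol the key material would of course travel encrypted under the requestor's public key — the charter's point about using public-key cryptography to deliver symmetric-key services.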

SIP Interface to VoiceXML Media Services
Dave Burke (et al., eds), IETF Internet Draft
This document describes a SIP interface to VoiceXML media services, which is commonly employed between application servers and media servers offering VoiceXML processing capabilities. VoiceXML is a World Wide Web Consortium (W3C) standard for creating audio and video dialogs that feature synthesized speech, digitized audio, recognition of spoken and DTMF key input, recording of audio and video, telephony, and mixed initiative conversations. VoiceXML allows Web-based development and content delivery paradigms to be used with interactive video and voice response applications. SIP is responsible for initiating a media session to the VoiceXML media server and simultaneously triggering the execution of a specified VoiceXML application. The interface described here owes its genesis to the 2001 draft of SIPVXML and leverages a mechanism for identifying dialog media services described in RFC 4240. A set of commonly implemented functions and extensions have been specified, including VoiceXML dialog preparation, outbound calling, video media support, and transfers. CCXML 1.0 applications provide services mainly through controlling the interaction between Connections, Conferences, and Dialogs. Although CCXML is capable of supporting arbitrary dialog environments, VoiceXML is commonly used as a dialog environment in conjunction with CCXML applications; CCXML is specifically designed to effectively support the use of VoiceXML. The interface described in this document can be used by CCXML 1.0 implementations to control VoiceXML Media Servers.
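The RFC 4240 mechanism the draft builds on identifies the VoiceXML application via a parameter on the SIP Request-URI of the dialog service. A small Python sketch of composing such a URI (the hostnames and script URL are invented; percent-encoding of the parameter value is assumed to be required, as the script URL itself contains URI-reserved characters):

```python
from urllib.parse import quote

def dialog_request_uri(media_server, script_url):
    """Compose an RFC 4240-style dialog-service Request-URI that both
    sets up the media session and names the VoiceXML application the
    media server should fetch and execute."""
    # Encode every reserved character in the embedded URL so it can
    # ride inside a single URI parameter.
    return "sip:dialog@%s;voicexml=%s" % (media_server, quote(script_url, safe=""))

uri = dialog_request_uri("ms.example.net", "http://as.example.com/menu.vxml")
```

Sending an INVITE to this URI is what lets one SIP transaction do double duty: open the RTP media session and trigger the named VoiceXML dialog at the same time.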

Microsoft and Novell Brawl Over Linux Patent FUD
Kevin Murphy, Computer Business Review Online
The honeymoon is over already. Microsoft Corp and Novell Inc said yesterday they've 'agreed to disagree' on the touchy subject of whether Microsoft has any intellectual property rights over Linux. Novell boss Ron Hovsepian spoke out to "strongly challenge" recent statements by Microsoft executives, which he characterized as "damaging". "We disagree with the recent statements made by Microsoft on the topic of Linux and patents," he wrote in an open letter to the Linux community. "Importantly, our agreement with Microsoft is in no way an acknowledgment that Linux infringes upon any Microsoft intellectual property." Microsoft issued its own statement yesterday in which it admitted that Novell had not made such an acknowledgment, but added that the two companies have "agreed to disagree" on whether Linux does in fact infringe on Microsoft's patents. The two companies announced on November 2 a deal whereby Microsoft would funnel hundreds of millions of dollars into Novell and resell its SUSE Linux software. In return Novell would pay Microsoft a royalty on its sales of SUSE. Crucially, the deal also involved pledges not to sue each other's customers on intellectual property grounds. But it did not involve any IP licensing, and Novell soon said that neither party was asserting patent rights over the other's software, which was confusing. That changed late last week, when Microsoft executives started hinting that users of non-SUSE variants of Linux were at risk of infringing Microsoft patents.

The Digital Ice Age
Brad Reagan, Popular Mechanics
The documents of our time are being recorded as bits and bytes with no guarantee of future readability. As technologies change, we may find our files frozen in forgotten formats. Will an entire era of human history be lost? Ken Thibodeau is head of the U.S. National Archives' Electronic Records Archive (ERA), charged with the daunting task of preserving all historically relevant documents and materials generated by the federal government — everything from White House e-mails to the storage locations of nuclear waste. Thibodeau: "The problem is that everything we build, whether it is a highway, tunnel, ship or airplane, is designed using computers. Electronic records are being sent to the archives at 100 times the rate of paper records. We don't know how to prevent the loss of most digital information that's being created today." To date, the ERA has identified more than 4500 file types that need to be accounted for. Each file type essentially requires an independent solution. What type of information needs to be preserved? How does that information need to be presented? As a relatively simple example, let's take an e-mail from the head of a regulatory agency. If the correspondence is pure text, it's a straightforward solution. But what if there is an attachment? What type of file is the attachment? If the attachment is a spreadsheet, does the behavior of the spreadsheet need to be retained? In other words, will it be important for future generations to be able to execute the formulas and play with the data? Lockheed is building what is primarily a "migration" system, in which files are translated into flexible formats such as XML (Extensible Markup Language), so the files can be accessed by technologies of the future. The idea is to make copies without losing essential characteristics of the data. Not everyone agrees with Lockheed's approach. Rothenberg, of the Rand Corp., for example, believes an "emulation" strategy would be more appropriate. 
[Clyde] Relick says the cost and technical effort involved in emulation are not feasible for a project the size of the ERA. In addition, he notes that the archives in their entirety will need to be accessible to anyone with a browser, and emulation becomes more difficult when you have to account for users with an infinite variety of hardware and software.
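The migration strategy can be illustrated in miniature with Python (a toy example, not the ERA's actual pipeline; the file layout and field names are invented). A legacy tabular file is re-expressed as XML so that future software needs only an XML parser, not the original application; only the data itself — the "essential characteristics" — is carried over, not the source format's quirks:

```python
import csv
import io
import xml.etree.ElementTree as ET

def migrate_csv_to_xml(csv_text, record_name="record"):
    """Translate a legacy CSV file into self-describing XML.
    Field names are taken from the CSV header row and become
    element names in the migrated copy."""
    rows = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element("archive")
    for row in rows:
        rec = ET.SubElement(root, record_name)
        for field, value in row.items():
            ET.SubElement(rec, field).text = value
    return ET.tostring(root, encoding="unicode")

legacy = "id,subject\n1,budget memo\n2,site survey\n"
xml_out = migrate_csv_to_xml(legacy)
```

This also shows the strategy's limit that Rothenberg objects to: a migrated spreadsheet preserves the values but not the behavior — the formulas future readers might have wanted to execute.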

SGML and the Longevity of Information
Erik Naggum, Post to Newsgroup comp.text.sgml [1995]
[Introduction: SGML and XML are formal metalanguage facilities for defining markup languages. SGML has the full power to configure a set of features for markup languages, whereas XML has a fixed set of these SGML features; XML is therefore a profile of SGML, rather like a proper subset of SGML features. Markup theorists have long believed that SGML and XML languages, using textual representation, play an important role in data preservation.] Erik Naggum: "to describe exactly what SGML is is very difficult. it's a language that can be used to build the infrastructure for interchange of and longevity for information. by way of analogy, one could describe it as 'SGML and the Art of Information Maintenance — An Inquiry into the Value of Information' (with apologies to Robert Pirsig). that is, a way of life once you have realized that the information we create take on a life of its own and it can die if we don't care for and feed it properly. in ancient times, you had to burn down a major library to destroy information, but you got to be remembered for it. today, you need only upgrade to the latest version of a particular software product, change a printer, use patented software in the compression of the data, etc, to destroy many orders of magnitude more information, but the history books have yet to notice that the previous generation was the last to leave permanent traces of its tools... outside of the publishing industry, understood suitably widely, SGML is thus regarded as a possible means to save the information that mankind generates and stores in perishable, proprietary, un(der)documented formats. e.g., during the time it takes to write and produce a dictionary, the computer industry will go through at least two major revolutions. in an industry where 'three seconds is a long time', the things it helps build: oil rigs, cities, laws, 'cultural heritage', standards, all have lifespans of several billion seconds..."


XML.org is an OASIS Information Channel sponsored by BEA Systems, Inc., IBM Corporation, Innodata Isogen, SAP AG and Sun Microsystems, Inc.

Use http://www.oasis-open.org/mlmanage to unsubscribe or change an email address. See http://xml.org/xml/news_market.shtml for the list archives.

