Greg,
thanks for your comments. A few follow ups if I may be so bold :-)
> Frankly it looks awfully complex for what you get.
Yes, there are some parts that I probably wouldn't favour (alternate
content for one).
> If we are dealing with document or text oriented XML then this is
> a rather more plausible approach because we can assume that the
Agreed. I am focussing on the exchange of business documents that are
loosely coupled to the consumer and provider applications.
> Putting this in-band in this way means that everyone shares the same
> processing requirement. Everyone must process the same thing in the
> same way.
Well, not really. The annotations identify parts of the message which
MAY be ignored by a receiver if that receiver has no interest in that
content. So those receivers that are interested implement the
corresponding processing; those that are not discard it for
processing purposes (although they may retain it for audit or legal reasons).
One of the reasons why this is of interest to me is in comparing this
explicit approach to one where a schema allows for extensibility at
various points (using xs:any and [say] namespace ##other). This can
allow a receiver to ignore content from a [foreign] namespace that is
unrecognised without raising an error, and is one approach described
by and others to enable extensibility and support for [minor]
versioning by both schema owners and schema users. That method though
implies that the protocol used by each pair of communicating parties
is arranged a priori, whereas the annotations make at least the caller's
intent explicit. I'm just trying to judge whether it's worth the effort
:-)
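
To make the comparison concrete, here is a minimal sketch of a receiver that ignores foreign-namespace content by default but rejects a message when the sender has flagged a namespace it does not implement. Everything named here is an assumption for illustration: the `mustUnderstand` attribute (holding namespace URIs on the root element), the `urn:example:*` namespaces and the function names are hypothetical and are not taken from the Markup Compatibility document or from any schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the namespaces this receiver has implemented processing for.
KNOWN_NS = {"urn:example:standard"}


def ns_of(tag):
    """Return the namespace URI of an ElementTree tag such as '{uri}local'."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def check_message(xml_text):
    """Split top-level children into (understood, ignored).

    Raises ValueError if the sender marked a namespace as must-understand
    and this receiver does not know it.
    """
    root = ET.fromstring(xml_text)
    # Hypothetical annotation: whitespace-separated namespace URIs on the root.
    must_understand = set(root.get("mustUnderstand", "").split())
    unknown_required = must_understand - KNOWN_NS
    if unknown_required:
        raise ValueError(
            f"cannot process required namespaces: {sorted(unknown_required)}"
        )
    understood, ignored = [], []
    for child in root:
        target = understood if ns_of(child.tag) in KNOWN_NS else ignored
        target.append(child)
    return understood, ignored
```

Without the annotation the receiver behaves exactly as it would under the xs:any / ##other approach; the annotation only adds a way for the caller to say "do not silently skip this".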
> The alternateContent stuff just makes me dizzy:
Agreed, I'm not keen on it either and don't think it's all that practical.
> The reference below to
> a consumer having to forward data that is not understood to an upstream
> consumer makes me nervous
Well, I don't really see this as much different from using a 'role' or
'actor' attribute in other protocols where the value might indicate a
specifically targeted end-point. In those cases, if you are an
intermediary node it is often required that you forward on data that
is not marked for your attention to the next node in the chain.
Of course all of our topologies are likely to be different. In my
case we have service consumers calling via a web service portal,
which then forwards messages to one or more service providers, each of
which may internally have various hubs and routers to deliver the
message to the actual service endpoint implementation.
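
A minimal sketch of that forwarding rule, assuming a hypothetical unqualified `role` attribute on each top-level block of the message (the attribute name, the element names and the `route` function are illustrative only and are not taken from any particular protocol): an intermediary handles the blocks addressed to it and passes everything else on untouched.

```python
import xml.etree.ElementTree as ET


def route(xml_text, my_role):
    """Return (mine, forward): blocks for this node and blocks to pass on.

    Assumption: a block with no 'role' attribute is not addressed to an
    intermediary, so it is forwarded to the next node in the chain.
    """
    root = ET.fromstring(xml_text)
    mine, forward = [], []
    for block in root:
        target = block.get("role")  # hypothetical targeting attribute
        (mine if target == my_role else forward).append(block)
    return mine, forward
```

The intermediary never needs to understand the forwarded blocks; it only needs to recognise that they are not marked for its attention.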
Regards
Fraser.
On 12/09/06, Greg Hunt <greg (AT) firmansyah (DOT) com> wrote:
Fraser,
There needs to be a way of handling this stuff, but I am not sure that
this is the right way to do it. I had a quick look at the MS spec.
Frankly it looks awfully complex for what you get. The idea of "Must
Understand" is really "must be able to process this bit of this
structure version". If the consuming application is tightly tied to the
data, shouldn't this just be handled through a different structure
version? If we are dealing with data oriented XML, there will be an
application that consumes the structure that also needs to be updated to
handle the different structure and the benefit of the MS approach is
less. If we are dealing with document or text oriented XML then this is
a rather more plausible approach because we can assume that the
structure is more plastic and that the consuming application is not
tightly bound to the structure that it processes.
Putting this in-band in this way means that everyone shares the same
processing requirement. Everyone must process the same thing in the
same way. For some industries this may be a reasonable requirement, but
if the consumers have different interests in the message (having
different types of processing) this approach devalues the idea of
understanding or processing. Some consumers will understand or process
some data by dropping it on the floor. Statistical processors will
understand the data by recording its existence or by ignoring it.
The MS markup moves the complexity from the consumer to the producer,
which is fine for MS, with a (probable) large population of difficult
to update clients that can sustain either specialised XML parsers (it
really should be a kind of extension to a validating parser, or should
use the preprocessing model that the MS document suggests) or which can
sustain much more complex application level interpretation of the XML.
In a scenario where there are multiple consumers of data oriented XML
with differing technology platforms it is harder to see the model being
readily (and reliably) implementable. Where the number of consumers is
small, simply issuing a new version of a structure looks like a more
feasible approach. With a large number of consumers, the limiting issue
seems to be the availability and complexity of the parsing support.
The alternateContent stuff just makes me dizzy: an alternation mechanism
in the schema language and a runtime version on top of it. This really
looks like a textual XML idea rather than a data-oriented thing. Had
XML Schema addressed these kinds of data-oriented versioning issues we
might have had tool support for this kind of thing, but absent that tool
support I am not convinced that there is a low-complexity/low-risk
approach to this stuff outside of the Schema extend/restrict and
substitution features.
I understand the desire to enable this type of control, but I am not
sure that this is the best way to implement it. The reference below to
a consumer having to forward data that is not understood to an upstream
consumer makes me nervous: it adds a level of uncertainty to the
interface between the first and second consumers. In the environment
that I am working in, we do not have this kind of long-chain
relationship. Data will be retained for audit purposes, but the data
size/storage space is not yet a major issue for us.
Greg
Fraser Goffin wrote:
A while back I posted a question about use of the Must Ignore Unknown
(retain/discard) pattern described (primarily) by David as an
approach to processing XML instances containing allowable content that
MAY be ignored by a receiver if that content is 'not understood' (see
below for original post).
One aspect of this which troubled me slightly was how the
communicating parties agree on what content can/should be ignored and
what content can/should be retained. In some vocabularies a protocol
agreement is made which can be asserted at run-time (e.g. ebXML CPA)
but for many exchanges, that protocol agreement (particularly in a B2B
scenario) may just be the subject of 'out of band'
discussion/documentation and often-times will not cover this aspect
specifically (or at least it may only come up some time after the
original agreement was made).
For a while I have had a document called ' XML Markup
Compatibility' ()
and today I gave it a read over. It's basically a specification which
describes formal annotations that can be used to assert
'mustUnderstand', 'ignorable' and 'preserve content' requirements for
message exchanges. That all sounds good. But I am wondering whether
anyone out there a) agrees that it *is* useful/necessary, b) is using
it in anger (or something equivalent).
My original post had no responses but I'm not sure if that was because
no-one is really all that bothered about this subject (for us it has
some potential in our versioning strategy) ?
Regards
Fraser.
original post - Must Ignore Unknown (retain/discard) - August 30
2006
Many of you will be familiar with this term which is used to describe
an approach to processing XML instances containing allowable content
that MAY be ignored by a receiver if that content is 'not understood'.
'Not understood' is typically related to content in particular
locations (extension points) which is contained in a namespace that is
[foreign] to that of the 'main' schema[s] and which a receiver MAY
have no prior knowledge of. This is a common (ish) approach where a
schema 'owner' wants to allow users of that schema to add arbitrary
(or possibly constrained) content without causing existing
implementations to fail during instance validation (at least if they
are only using standard schema validation capabilities of mainstream
parsers).
David has written much on this subject (as have a few others)
and also describes 2 variants of the must ignore unknown pattern,
specifically, 'discard' and 'retain'. As I understand it, the former
means that unknown content can be both ignored and discarded (not
passed to upstream processing) without generating an error, and the
latter, that content may be ignored but should *not* be removed. It is
the 'discard' aspect which, when I was discussing the possible use of
this approach recently, came under some challenge. I would be
interested in this forum's view :-)
The, not surprising, challenge was/is this :-
in a situation where :-
- message data is captured by some application
- the basic content model for the transaction is defined by a standard
schema to which all participants agree to conform
- the standard schema allows for extensibility at various points so
long as these are defined in a foreign namespace.
- some of the data captured has been specified by only one provider of
the service and that provider has arranged with the application owner
to put that data in the appropriate extensibility area in an agreed
foreign namespace.
- the message (including all extension data) will be sent to *all*
potential service providers of which there may be many
- the service provider who requested the additional data wants to use
the standards based data model *not* create a completely private
schema for this transaction.
so what should receivers of the message who do *not* understand
the extension do ?
Are they likely to be obliged (possibly by legal, regulatory or audit
requirements) to retain ALL data that a customer has agreed to send
(perhaps for non repudiation, DPA, or other reasons) regardless of
whether they intend to process it or not? And if so, does that make
it a practical non starter given that the size and content of 'unknown
data' requires them to provide an adequate (and equally unknown)
storage (and retrieval) capability (at least for those business
transactions to which these sort of obligations might apply) ?
welcome
Fraser
end of original post
XML-DEV is a publicly archived, unmoderated list hosted by ASIS
to support XML implementation and development. To minimize
spam in the archives, you must subscribe before posting.
[Un]Subscribe/change address:
unsubscribe: xml-dev-unsubscribe (AT) lists (DOT) xml.org
subscribe: xml-dev-subscribe (AT) lists (DOT) xml.org
List archive:
List Guidelines: