XML

  • Open XML Markup Compatibility

    8 answers - 4858 bytes

    A while back I posted a question about use of the Must Ignore Unknown
    (retain/discard) pattern described (primarily) by David as an
    approach to processing XML instances containing allowable content that MAY
    be ignored by a receiver if that content is 'not understood' (see below for
    original post).

    One aspect of this which troubled me slightly was how the communicating
    parties agree on what content can/should be ignored and what content
    can/should be retained. In some vocabularies a protocol agreement is made
    which can be asserted at run-time (e.g. ebXML CPA) but for many exchanges,
    that protocol agreement (particularly in a B2B scenario) may just be the
    subject of 'out of band' discussion/documentation and often-times will not
    cover this aspect specifically (or at least it may only come up some time
    after the original agreement was made).

    For a while I have had a document called 'Open XML Markup Compatibility'
    and today I gave it a read over. It's basically a specification which
    describes formal annotations that can be used to assert 'mustUnderstand',
    'ignorable' and 'preserve content' requirements for message exchanges.
    That all sounds good. But I am wondering whether anyone out there a) agrees
    that it *is* useful/necessary, b) is using it in anger (or something
    equivalent).

    My original post had no responses but I'm not sure if that was because
    no-one is really all that bothered about this subject (for us it has some
    potential in our versioning strategy)?

    Regards
    Fraser.

    original post - Must Ignore Unknown (retain/discard) - August 30 2006
    Many of you will be familiar with this term which is used to describe
    an approach to processing XML instances containing allowable content
    that MAY be ignored by a receiver if that content is 'not understood'.
    'Not understood' is typically related to content in particular
    locations (extension points) which is contained in a namespace that is
    [foreign] to that of the 'main' schema[s] and which a receiver MAY
    have no prior knowledge of. This is a common (ish) approach where a
    schema 'owner' wants to allow users of that schema to add arbitrary
    (or possibly constrained) content without causing existing
    implementations to fail during instance validation (at least if they
    are only using standard schema validation capabilities of mainstream
    parsers).

    David has written much on this subject (as have a few others)
    and also describes 2 variants of the must ignore unknown pattern,
    specifically, 'discard' and 'retain'. As I understand it, the former
    means that unknown content can be both ignored and discarded (not
    passed to upstream processing) without generating an error, and the
    latter, that content may be ignored but should *not* be removed. It is
    the 'discard' aspect that came under some challenge when I was
    discussing the possible use of this approach recently. I would be
    interested in this forum's view :-)
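To make the two variants concrete, here is a minimal Python sketch of 'discard' versus 'retain'. It is an illustration under stated assumptions rather than anything taken from David's writing: the namespace URIs, the element names and the function `must_ignore_unknown` are all hypothetical, and only the direct children of the root are treated as an extension point.

```python
import xml.etree.ElementTree as ET

def namespace_of(element):
    """Return the namespace URI of an element, or '' if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def must_ignore_unknown(root, understood, mode):
    """Apply Must Ignore Unknown to the direct children of `root`.

    mode="discard": unknown children are removed from the tree.
    mode="retain":  unknown children stay in the tree for later processing.
    Either way the unknown children are returned and no error is raised.
    """
    if mode not in ("discard", "retain"):
        raise ValueError("mode must be 'discard' or 'retain'")
    unknown = [c for c in list(root) if namespace_of(c) not in understood]
    if mode == "discard":
        for child in unknown:
            root.remove(child)
    return unknown

# Hypothetical message: a standard vocabulary plus one foreign extension.
MESSAGE = (
    '<q:Quote xmlns:q="urn:example:standard" xmlns:x="urn:example:insurer-ext">'
    '<q:Product>Motor</q:Product>'
    '<x:LoyaltyCode>ABC123</x:LoyaltyCode>'
    '</q:Quote>'
)

for mode in ("discard", "retain"):
    root = ET.fromstring(MESSAGE)
    unknown = must_ignore_unknown(root, {"urn:example:standard"}, mode)
    print(mode, len(unknown), len(root))
# discard 1 1
# retain 1 2
```

In both modes the receiver skips the extension; the only difference is whether the extension is still present in the tree afterwards.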
    The not unsurprising challenge was/is this :-

    In a situation where :-
    - message data is captured by some application
    - the basic content model for the transaction is defined by a standard
    schema to which all participants agree to conform
    - the standard schema allows for extensibility at various points so
    long as these are defined in a foreign namespace.
    - some of the data captured has been specified by only one provider of
    the service and that provider has arranged with the application owner
    to put that data in the appropriate extensibility area in an agreed
    foreign namespace.
    - the message (including all extension data) will be sent to *all*
    potential service providers of which there may be many
    - the service provider who requested the additional data wants to use
    the standards based data model, *not* create a completely private
    schema for this transaction.

    So what should receivers of the message who do *not* understand
    the extension do?
    Are they likely to be obliged (possibly by legal, regulatory, audit,
    requirements) to retain ALL data that a customer has agreed to send
    (perhaps for non-repudiation, DPA, or other reasons) regardless of
    whether they intend to process it or not? And if so, does that make
    it a practical non-starter given that the size and content of 'unknown
    data' requires them to provide an adequate (and equally unknown)
    storage (and retrieval) capability (at least for those business
    transactions to which these sort of obligations might apply)?
    Opinions welcome
    Fraser
    end of original post
  • No.1 | | 1016 bytes | |

    On 9/12/06, Fraser Goffin <goffinf (AT) hotmail (DOT) com> wrote:
    > A while back I posted a question about use of the Must Ignore Unknown
    > (retain/discard) pattern described (primarily) by David as an
    > approach to processing XML instances containing allowable content that MAY
    > be ignored by a receiver if that content is 'not understood' (see below for
    > original post).
    >
    > One aspect of this which troubled me slightly was how the communicating
    > parties agree on what content can/should be ignored and what content
    > can/should be retained.

    I guess that depends on how you publish. If the publishing is done as
    a pull, as resources that others can access, then the publisher doesn't
    have to worry about how the receiver decides to ignore content.

    If it is push, the publisher and receiver must have an agreement mechanism.

    Globally speaking, pull scales better because it lends itself better
    to being loosely joined.

    Cheers,
    Bryan Rasmussen
  • No.2 | | 8826 bytes | |

    Fraser,
    There needs to be a way of handling this stuff, but I am not sure that
    this is the right way to do it. I had a quick look at the MS spec.
    Frankly it looks awfully complex for what you get. The idea of "Must
    Understand" is really "must be able to process this bit of this
    structure version". If the consuming application is tightly tied to the
    data, shouldn't this just be handled through a different structure
    version? If we are dealing with data oriented XML, there will be an
    application that consumes the structure that also needs to be updated to
    handle the different structure and the benefit of the MS approach is
    less. If we are dealing with document or text oriented XML then this is
    a rather more plausible approach because we can assume that the
    structure is more plastic and that the consuming application is not
    tightly bound to the structure that it processes.

    Putting this in-band in this way means that everyone shares the same
    processing requirement. Everyone must process the same thing in the
    same way. For some industries this may be a reasonable requirement, but
    if the consumers have different interests in the message (having
    different types of processing) this approach devalues the idea of
    understanding or processing. Some consumers will understand or process
    some data by dropping it on the floor. Statistical processors will
    understand the data by recording its existence or by ignoring it.

    The MS markup moves the complexity from the consumer to the producer,
    which is fine for MS, with a (probably) large population of
    difficult-to-update clients that can sustain either specialised XML parsers (it
    really should be a kind of extension to a validating parser, or should
    use the preprocessing model that the MS document suggests) or which can
    sustain much more complex application level interpretation of the XML.
    In a scenario where there are multiple consumers of data oriented XML
    with differing technology platforms it is harder to see the model being
    readily (and reliably) implementable. Where the number of consumers is
    small, simply issuing a new version of a structure looks like a more
    feasible approach. With a large number of consumers, the limiting issue
    seems to be the availability and complexity of the parsing support.

    The alternateContent stuff just makes me dizzy: an alternation mechanism
    in the schema language and a runtime version on top of it. This really
    looks like a textual XML idea rather than a data-oriented thing. Had
    XML Schema addressed these kinds of data-oriented versioning issues we
    might have had tool support for this kind of thing, but absent that tool
    support I am not convinced that there is a low-complexity/low-risk
    approach to this stuff outside of the Schema extend/restrict and
    substitution features.

    I understand the desire to enable this type of control, but I am not
    sure that this is the best way to implement it. The reference below to
    a consumer having to forward data that is not understood to an upstream
    consumer makes me nervous; it adds a level of uncertainty to the
    interface between the first and second consumers. In the environment
    that I am working in, we do not have this kind of long-chain
    relationship. Data will be retained for audit purposes, but the data
    size/storage space is not yet a major issue for us.

    Greg

    XML-DEV is a publicly archived, unmoderated list hosted by OASIS
    to support XML implementation and development. To minimize
    spam in the archives, you must subscribe before posting.

    [Un]Subscribe/change address:
    unsubscribe: xml-dev-unsubscribe (AT) lists (DOT) xml.org
    subscribe: xml-dev-subscribe (AT) lists (DOT) xml.org
    List archive:
    List Guidelines:
  • No.3 | | 1776 bytes | |

    Original Message
    From: "Fraser Goffin" <goffinf (AT) hotmail (DOT) com>
    To: <xml-dev (AT) lists (DOT) xml.org>
    Sent: Tuesday, September 12, 2006 7:29 PM
    Subject: [xml-dev] Open XML Markup Compatibility

    > A while back I posted a question about use of the Must Ignore Unknown
    > (retain/discard) pattern described (primarily) by David as an
    > approach to processing XML instances containing allowable content that MAY
    > be ignored by a receiver if that content is 'not understood' (see below
    > for original post).

    Whoever did this 'may-ignore' stuff in the first place must have skipped
    communications 101.

    Firstly, if it may be ignored, then why send it in the first place?

    Secondly, any part of anything *may* be ignored, but it is up to the
    receiver to make that decision.

    When you get told by the communicator to ignore this and not that, you tend
    to think about just ignoring the whole jolly lot because the message
    becomes too convoluted, confusing and too much effort to understand.

    > My original post had no responses but I'm not sure if that was because
    > no-one is really all that bothered about this subject (for us it has some
    > potential in our versioning strategy)?

    Go back and check the setting of the <may_be_ignored_flag = 0>

    Regards

    David

  • No.4 | | 1501 bytes | |

    > Firstly, if it may be ignored, then why send it in the first place?

    Because it may also not be ignored?

    > Secondly, any part of anything *may* be ignored, but it is up to the
    > receiver to make that decision.

    Sure, but if the document is a business document, ignoring something
    that MUST NOT be ignored then voids the business transaction.

    > When you get told by the communicator to ignore this and not that, you tend
    > to think about just ignoring the whole jolly lot because the message
    > becomes too convoluted, confusing and too much effort to understand.

    However, if you get told that for the purpose of doing x you use the Y
    parts, for the purpose of doing x2 you use the Y parts + Y2, and in order
    to do x2, x3 or anything else you must first do x, then you may decide to
    implement your system for x and x2 but not anything above that, and
    anything beyond x2 will be ignored.
    In other words, your description of the meanings that may be layered on
    top of may-ignore is lacking in actual real-world usage of may-ignore.

    Cheers,
    Bryan Rasmussen

  • No.5 | | 12079 bytes | |

    Greg,

    Thanks for your comments. A few follow ups if I may be so bold :-)

    > Frankly it looks awfully complex for what you get.

    Yes, there are some parts that I probably wouldn't favour (alternate
    content for one).

    > If we are dealing with document or text oriented XML then this is
    > a rather more plausible approach because we can assume that the

    Agreed. I am focussing on the exchange of business documents that are
    loosely coupled to the consumer and provider applications.

    > Putting this in-band in this way means that everyone shares the same
    > processing requirement. Everyone must process the same thing in the
    > same way.

    Well, not really. The annotations identify parts of the message which
    MAY be ignored by a receiver if that receiver has no interest in that
    content. So those receivers that are interested implement
    corresponding processing; those that are not discard it for
    processing (although they may retain it for audit or legal reasons).

    One of the reasons why this is of interest to me is in comparing this
    explicit approach to one where a schema allows for extensibility at
    various points (using xs:any and [say] namespace ##other). This can
    allow a receiver to ignore content from a [foreign] namespace that is
    unrecognised without raising an error, and is one approach described
    by David and others to enable extensibility and support for [minor]
    versioning by both schema owners and schema users. That method though
    implies that the protocol used by each pair of communicating parties
    is arranged a priori, whereas the annotations make at least the caller's
    intent explicit. I'm just trying to judge whether it's worth the effort
    :-)
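As a rough illustration of what "making the caller's intent explicit" could look like to a receiver, here is a Python sketch that reads `Ignorable` and `MustUnderstand` annotations from the root element. Several things are assumptions: the attribute names and namespace URI are those of the published Office Open XML Markup Compatibility specification and may not match the draft discussed in this thread; the message, the `urn:example:*` namespaces and the function `check_message` are hypothetical; and the prefix handling is simplified to one flat prefix table with annotations read from the root only.

```python
import io
import xml.etree.ElementTree as ET

# Namespace of the Markup Compatibility attributes in the published
# specification (an assumption with respect to the 2006 draft).
MC_NS = "http://schemas.openxmlformats.org/markup-compatibility/2006"

def namespace_of(element):
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""

def check_message(xml_text, understood):
    """Return (accepted, reason, ignored_tags) for one receiver."""
    prefixes, root = {}, None
    events = ET.iterparse(io.BytesIO(xml_text.encode("utf-8")),
                          events=("start-ns", "start"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri  # simplification: one flat prefix table
        elif root is None:
            root = payload

    def declared(attribute):
        # Both attributes hold a whitespace-separated list of prefixes.
        names = root.get("{%s}%s" % (MC_NS, attribute), "").split()
        return {prefixes[name] for name in names}

    ignorable = declared("Ignorable")
    missing = declared("MustUnderstand") - understood
    if missing:
        return False, "must understand: " + ", ".join(sorted(missing)), []

    ignored, stack = [], [root]
    while stack:
        element = stack.pop()
        ns = namespace_of(element)
        if ns in understood or ns == MC_NS:
            stack.extend(element)          # keep walking understood content
        elif ns in ignorable:
            ignored.append(element.tag)    # skipped together with its content
        else:
            return False, "unknown and not ignorable: " + ns, []
    return True, "ok", ignored

MESSAGE = """<q:Quote xmlns:q="urn:example:standard"
    xmlns:x="urn:example:insurer-ext"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="x">
  <q:Product>Motor</q:Product>
  <x:LoyaltyCode>ABC123</x:LoyaltyCode>
</q:Quote>"""

print(check_message(MESSAGE, {"urn:example:standard"}))
```

With plain xs:any the same extension would also pass validation, but nothing in the instance would tell the receiver whether skipping it is what the sender intended; here that statement travels with the message.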

    > The alternateContent stuff just makes me dizzy:

    Agreed, I'm not keen on it either and don't think it's all that practical.

    > The reference below to
    > a consumer having to forward data that is not understood to an upstream
    > consumer makes me nervous

    Well, I don't really see this as much different from using a 'role' or
    'actor' attribute in other protocols where the value might indicate a
    specifically targeted end-point. In those cases, if you are an
    intermediary node it is often required that you forward on data that
    is not marked for your attention to the next node in the chain.
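The forwarding rule for an intermediary can be sketched in a few lines of Python. This is a simplified illustration only: the un-namespaced `role` attribute, the envelope layout and the function `route_blocks` are hypothetical and are not taken from any particular protocol.

```python
import xml.etree.ElementTree as ET

ROLE_ATTR = "role"  # hypothetical attribute name, for illustration only

def route_blocks(envelope, my_role):
    """Split the envelope's child blocks into (for_me, to_forward).

    Blocks whose role matches `my_role` are consumed by this node; every
    other block, understood or not, is passed on to the next node untouched.
    """
    for_me, to_forward = [], []
    for block in envelope:
        target = block.get(ROLE_ATTR)
        (for_me if target == my_role else to_forward).append(block)
    return for_me, to_forward

ENVELOPE = ET.fromstring(
    '<Envelope>'
    '<Routing role="portal">provider-17</Routing>'
    '<Quote role="provider"><Product>Motor</Product></Quote>'
    '<Extension role="provider">ABC123</Extension>'
    '</Envelope>'
)

mine, onward = route_blocks(ENVELOPE, "portal")
print([b.tag for b in mine])    # ['Routing']
print([b.tag for b in onward])  # ['Quote', 'Extension']
```

The intermediary never needs to understand the blocks it forwards, which is the same position a 'retain' receiver is in with unknown extension content.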

    Of course all of our topologies are likely to be different. In my
    case we have service consumers calling via a web service portal,
    which then forwards messages to one or more service providers, each of
    which may internally have various hubs and routers to deliver the
    message to the actual service endpoint implementation.

    Regards

    Fraser.

  • No.6 | | 5270 bytes | |

    David,

    Whoever did this in 'may-ignore' stuff in the first place must have skipped
    communications 101.

    a little more detail on my environment is clearly needed (sorry the
    original post was already quite long so I left it out).

    service consumers (typically high street brokers) bind to and call
    web services on an industry portal, an example is an insurance
    quotation service for a given product. The portal will forward the
    same message to one or more implementers of that service. All of the
    messages are defined to have an industry standard structure. Sometimes
    insurers do deals with some brokers to capture additional data and
    want to have that data carried within the standard message but
    explicitly 'called out', typically using a agreed namespace 'foreign'
    to the main transaction schema. When the message is received by a
    service provider that 'understands' the namespace, it may process it.
    When it is received by a service provider that does not recognise it,
    it may safely ignore that content without raising an error. This is
    all possible using xs:any at various points in the schema, that is,
    the message would still pass schema validation for a provider who has
    no interest in the actual content in the extension point. It also
    provides a mechanism for the standards body for minor non breaking
    versioning of standard schemata.

    of the issues I have with this is that a receiver who wants to
    ignore the additional content still needs to know whether that content
    MUST be retained (for [say] legal non repudiation) or can actually be
    discarded altogether, perhaps by applying a transformation before
    processing (as per the current UBL 2 proposal). I was thinking that
    maybe the annotations might help to make this clearer in the absence
    of something like an ebXML Collabortion Protocol Agreement (CPA) ?

    Secondly, any part of anything *may* be ignored, but it is up to the
    receiver to make that decision.

    Absolutely. I'm quite certain that most of us who implement services
    do not necessarily process every piece of data that is sent.

    When you get told by the communicator to ignore this and not that, you tend
    to think about just ignoring the whole jolly lot because the message
    becomes too convoluted, confusing and too much effort to understand.

    The assertion from the caller is that you MAY ignore some content IF
    you don't understand it WITHOUT this being considered as an error (i.e.
    that content in effect represents relationships between SOME of the
    parties that receive it but not necessarily all). It's a way of having
    a more generic message as opposed to lots of individual point-2-point
    tightly coupled services. Not saying that approach is wrong, both have
    merit IMO.

    Go back and check the setting of the <may_be_ignored_flag = 0>

    :-)

    Regards

    Fraser.

    On 13/09/06, David Lyon <david.lyon (AT) preisshare (DOT) net> wrote:

    ----- Original Message -----
    From: "Fraser Goffin" <goffinf (AT) hotmail (DOT) com>
    To: <xml-dev (AT) lists (DOT) xml.org>
    Sent: Tuesday, September 12, 2006 7:29 PM
    Subject: [xml-dev] XML Markup Compatibility
    --
    A while back I posted a question about use of the Must Ignore Unknown
    (retain/discard) pattern described (primarily) by David Orchard as an
    approach to processing XML instances containing allowable content that MAY
    be ignored by a receiver if that content is 'not understood' (see below
    for original post).

    Whoever did this 'may-ignore' stuff in the first place must have skipped
    communications 101.

    Firstly, if it may be ignored, then why send it in the first place?

    Secondly, any part of anything *may* be ignored, but it is up to the
    receiver to make that decision.

    When you get told by the communicator to ignore this and not that, you tend
    to think about just ignoring the whole jolly lot because the message
    becomes too convoluted, confusing and too much effort to understand.

    My original post had no responses but I'm not sure if that was because
    no-one is really all that bothered about this subject (for us it has some
    potential in our versioning strategy) ?

    Go back and check the setting of the <may_be_ignored_flag = 0>

    Regards

    David
    --

    XML-DEV is a publicly archived, unmoderated list hosted by OASIS
    to support XML implementation and development. To minimize
    spam in the archives, you must subscribe before posting.

    [Un]Subscribe/change address:
    unsubscribe: xml-dev-unsubscribe (AT) lists (DOT) xml.org
    subscribe: xml-dev-subscribe (AT) lists (DOT) xml.org
    List archive:
    List Guidelines:
  • No.7 | | 5516 bytes | |

    Dave,

    I think you could have 2 flavours of mU in your app, the ignore and must
    retain, and the ignore and may retain.

    Agreed.

    But, the question in my mind, is why would the sender want to force the
    receiver to fail if it can't retain but can ignore.

    Well I guess this was part of the thrust of my original post some
    weeks back. In some cases there can be a legal or regulatory
    requirement that the message that is actually sent to an endpoint MUST
    be reproducible, perhaps for business audit and/or non-repudiation
    purposes. So even though a receiver may not be interested in content
    which is marked as 'ignorable', it is only ignorable from a processing
    point of view (actually I believe you made this point in your
    materials on this subject). One of the practical problems this might
    give rise to is that the receiver is basically being required to store
    data of arbitrary size, something which it may or may not be able to
    do.

    For many of our operational systems we don't have ownership of the
    source code and in some cases can't really change the structure of the
    data store, so we would have to create something supplementary and
    correlate the two datasets. So, even if we aren't bothered about some
    ignorable content we may have to error because we can't retain it. In
    those cases where there isn't a requirement for retaining content we
    can safely ignore and discard. This is partly what Ken Holman
    describes in the UBL 2 validation model proposal (transform before
    validate to 'strip off' ignorable content - he does also mention that
    you may need to create an audit of the complete message in some
    cases).
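
    As a rough sketch of the 'audit the complete message, then strip
    before processing' idea, assuming the retain/discard decision is
    handed to the receiver as a simple flag. This is not the UBL 2
    mechanism itself; the namespace URI, the dictionary used as an audit
    store and the function names are assumptions made up for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical namespace of the standard transaction schema.
STANDARD_NS = "urn:example:quote:1.0"


def strip_foreign(element):
    """Remove, in place, every descendant that is outside STANDARD_NS."""
    for child in list(element):
        if child.tag.startswith("{" + STANDARD_NS + "}"):
            strip_foreign(child)
        else:
            element.remove(child)


def receive(raw, audit_store, must_retain):
    """Optionally keep the message as sent, then return a stripped tree.

    raw is the message exactly as received (bytes). When must_retain is
    true the untouched bytes are stored under their SHA-256 digest so the
    original can be reproduced later; otherwise nothing is kept.
    """
    if must_retain:
        audit_store[hashlib.sha256(raw).hexdigest()] = raw
    root = ET.fromstring(raw)
    strip_foreign(root)
    return root


raw = (
    b'<q:QuoteRequest xmlns:q="urn:example:quote:1.0" '
    b'xmlns:x="urn:example:other">'
    b"<q:Product>Motor</q:Product><x:Extra>1</x:Extra>"
    b"</q:QuoteRequest>"
)

audit = {}
stripped = receive(raw, audit, must_retain=True)

no_audit = {}
discarded = receive(raw, no_audit, must_retain=False)
```

    A receiver that cannot store arbitrary extra data would have to
    fault in the must_retain case instead of writing to the audit store,
    which is the situation described above.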

    I suppose this could be an argument for *not* allowing this form of
    extensibility, but I *do* see potential for it and don't want to throw
    the baby out with the bath water. IMO we *have* to find some
    accommodation in messaging schemata for representing private
    relationships (as user defined extensions) and allow minor
    non-breaking change.

    I'll point out that I think another big reason to "retain" is to be able to
    use the "extra stuff" in the future.

    Yes. One of the things that I find attractive about the ability to
    include private extensions in public schema is that some of these can
    be regarded as 'candidate' standards for possible inclusion in the
    future (I know this wasn't quite your point - I've read your stuff on
    adding extra structure and agree with it - but we still need somewhere
    to persist it!)

    It is the perma 'versioning and extensibility' howto that I still
    can't quite nail down despite the best efforts of people such as
    yourself, Ken Holman, Tim Ewald, Dare and many others. All have [good]
    ideas but none seem completely satisfactory.

    In my case I am trying to assist both my own organisation and the UK
    insurance industry standards body formulate some strategy in these
    areas. At present no extensibility whatsoever is allowed and version
    changes are to all intents and purposes all 'breaking'. I am convinced
    we can improve this position but need to spell out every use case
    detail.

    Regards

    Fraser.

    On 13/09/06, Dave <orchard (AT) pacificspirit (DOT) com> wrote:
    Hi Fraser,

    One of the issues I have with this is that a receiver who
    wants to ignore the additional content still needs to know
    whether that content MUST be retained (for [say] legal
    non-repudiation) or can actually be discarded altogether, perhaps
    by applying a transformation before processing (as per the
    current UBL 2 proposal). I was thinking that maybe the
    annotations might help to make this clearer in the absence of
    something like an ebXML Collaboration Protocol Agreement (CPA)?
    --
    I think you could have 2 flavours of mU in your app, the ignore and must
    retain, and the ignore and may retain.

    But, the question in my mind, is why would the sender want to force the
    receiver to fail if it can't retain but can ignore. Seems to me that an app
    would generally try to retain information if reasonably possible. The
    difference between the two is the fault behaviour if it can't retain.

    Having the 2 modes only makes sense if there are clients that are prepared
    to talk to receivers that can ignore the content AND only talk to receivers
    that will keep the content in some cases.

    I'll point out that I think another big reason to "retain" is to be able to
    use the "extra stuff" in the future. For example, a middle name is added but
    isn't known. If it's stored in the db in some "extra" table, then a future
    version of the db + app could know about the middle name and do something
    useful, such as return it in a query.
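
    The "extra" table idea might look something like the following
    sketch, using an in-memory SQLite database. The table names, column
    names and helper functions are hypothetical and chosen only to
    illustrate the middle name example.

```python
import sqlite3

# Fields the current version of the app knows how to store properly.
KNOWN_FIELDS = ("first", "last")


def open_store():
    """Create an in-memory database with a main table and an 'extra' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, first TEXT, last TEXT)"
    )
    conn.execute(
        "CREATE TABLE person_extra (person_id INTEGER, name TEXT, value TEXT)"
    )
    return conn


def save_person(conn, fields):
    """Store known fields in person and anything else in person_extra."""
    cur = conn.execute(
        "INSERT INTO person (first, last) VALUES (?, ?)",
        (fields.get("first"), fields.get("last")),
    )
    person_id = cur.lastrowid
    for name, value in fields.items():
        if name not in KNOWN_FIELDS:
            # Not understood today, but retained for a future version.
            conn.execute(
                "INSERT INTO person_extra VALUES (?, ?, ?)",
                (person_id, name, value),
            )
    return person_id


def load_extras(conn, person_id):
    """Return the retained, not-yet-understood fields as a dict."""
    rows = conn.execute(
        "SELECT name, value FROM person_extra WHERE person_id = ?",
        (person_id,),
    )
    return dict(rows.fetchall())


conn = open_store()
pid = save_person(conn, {"first": "Ann", "middle": "Lee", "last": "Smith"})
extras = load_extras(conn, pid)
```

    A later version that learns about middle names could read them back
    from person_extra rather than having lost them.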

    I'm not sure of the app, but I think those are a couple of the key points of
    interest.

    Cheers,
    Dave
  • No.8 | | 2391 bytes | |

    Quoting Fraser Goffin <goffinf (AT) googlemail (DOT) com>:

    A little more detail on my environment is clearly needed (sorry, the
    original post was already quite long so I left it out).

    Service consumers (typically high street brokers) bind to and call
    web services on an industry portal

    One of the issues I have with this is that a receiver who wants to
    ignore the additional content still needs to know whether that content
    MUST be retained (for [say] legal non-repudiation) or can actually be
    discarded altogether, perhaps by applying a transformation before
    processing (as per the current UBL 2 proposal). I was thinking that
    maybe the annotations might help to make this clearer in the absence
    of something like an ebXML Collaboration Protocol Agreement (CPA)?

    ok.

    Absolutely. I'm quite certain that most of us who implement services
    do not necessarily process every piece of data that is sent.

    That's right. At the moment, in my customer environments, data is just
    thrown away in the hundreds of megabytes weekly. Most of it falls
    outside of the ability of the businesses to be able to process it.

    As businesses have become more efficient at generating the data, the
    reverse is not always true for receiving it and loading it into some
    manageable form.

    The assertion from the caller is that you MAY ignore some content IF
    you don't understand it WITHOUT this being considered as an error (i.e.
    that content in effect represents relationships between SOME of the
    parties that receive it but not necessarily all). It's a way of having
    a more generic message as opposed to lots of individual point-2-point
    tightly coupled services. Not saying that approach is wrong, both have
    merit IMO.

    Fortunately I work in less tightly coupled systems with smaller
    companies who are less involved in the message elements down at that
    level.

    Regards

    David

