Making a silk purse out of the schema sow's ear - was Minimal XML Specification
10 answers - 3471 bytes -

I think the reality is that lots of people flipped the Bozo Bit on the XSD spec in 1999-2000. They went in different directions, however: Some to alternative schema languages, some to radical simplification of XML to de-emphasize schemas altogether.
In hindsight, had people foreseen today's reality that we're stuck with XSD as what the mainstream user thinks of as the "real standard", clearly the energy would have been better spent debugging the wretched thing rather than trying to pretend it doesn't exist or trying to drive a stake thru its heart. I'm more interested in discussing what to do going forward given the current mess. The problems I see are:
- The W3C is more interested in moving the XSD spec forward than fixing its numerous ambiguities. (Their pushback is that the people who want to fix it are not represented on the WG, and the people who have skin in the game want to move forward).
- RELAX NG is clearly "better" for textual documents but doesn't have much support for the data-oriented use cases. (Sure you can plug in the XSD type system, but that's a big part of the problem). We now have an unpleasant situation of fragmentation where there's little mainstream tool support for RELAX NG due to lack of demand, exploitation of its geek chic (partly to strike a blow against the empire, I suppose), with the result that the normative definitions of Atom and ODF can't be used with most commercial XML tools. Maybe a good guerilla tactic in the open source wars, but for the moment it's the innocent who suffer the collateral damage.
- Schematron is moving forward as an ISO standard and has some good implementations but has few normative references in vertical industry standards nor mindshare. (Correct me if I'm wrong about the normative references).
- Lots of people complain about the limitations of XSD that Schematron addresses and the W3C doesn't plan to, especially the lack of occurrence constraints.
The best way forward that I can see is to encourage end users to employ XSD + Schematron as necessary, and encourage W3C to address XSD's bugs and ambiguities before adding more onto an unstable foundation. What does that miss that the world actually values? (as much as it depresses me to say it, the world doesn't seem to value RELAX NG's elegance and mathematical foundation very much).
Date: Wed, 8 Feb 2006 18:04:33 +1100
From: rjelliffe (AT) allette (DOT) com.au
To: elharo (AT) metalab (DOT) unc.edu
CC: xml-dev (AT) lists (DOT) xml.org
Subject: Re: [xml-dev] Minimal XML Specification
Elliotte Harold wrote:
>Eric van der Vlist wrote:
What makes you think the community would have changed anything by spending more time to try to influence the spec?
What makes you think they couldn't have?
>It's possible a few of the minor inconsistencies, unclear wording, and
>outright mistakes could have been fixed had someone noticed them and
>spoken up at the right time. However the major issues of design and
>whether this was the right approach to a schema language really was
>never on the table
And people at the time knew this how?
Cheers
Rick
The xml-dev list is sponsored by XML.org, an initiative of OASIS
The list archives are at
To subscribe or unsubscribe from this list use the subscription manager:
No.1 | | 2010 bytes |
| 
On Wed, 2006-02-08 at 09:05 -0800, Michael Champion wrote:
>- RELAX NG is clearly "better" for textual documents but doesn't have
>much support for the data-oriented use cases. (Sure you can plug in
>the XSD type system, but that's a big part of the problem).
A separable part? Other than Jeni Tennison, I haven't seen any uptake on
this issue.
At least RELAX NG allows data type plug-ins.
>We now have an unpleasant situation of fragmentation where there's
>little mainstream tool support for RELAX NG due to lack of demand,
>exploitation of its geek chic (partly to strike a blow against the
>empire, I suppose), with the result that the normative definitions of
>Atom and ODF can't be used with most commercial XML tools.
Unless you use RELAX NG tools to convert to XSD?
>- Schematron is moving forward as an ISO standard and has some good
>implementations but has few normative references in vertical industry
>standards nor mindshare. (Correct me if I'm wrong about the normative
>references).
I've always viewed Schematron as providing additional functionality
beyond what my schema validation gives me, not as a replacement? Rick?
>The best way forward that I can see is to encourage end users to
>employ XSD + Schematron
I know of one tool that merges RELAX NG functionality with Schematron
processing.
I haven't heard of anything merging Schematron with XSD validation.
A single-stage validation is helpful, rather than pipelining.
>as necessary, and encourage W3C to address XSD's bugs and
>ambiguities before adding more onto an unstable foundation. What
>does that miss that the world actually values? (as much as it
>depresses me to say it, the world doesn't seem to value RELAX NG's
>elegance and mathematical foundation very much).
I think it's valued, Michael. I'm sure others on this list do too.
I guess we aren't 'the world'.
No.2 | | 1008 bytes |
| 
>>We now have an unpleasant situation of fragmentation where there's
>>little mainstream tool support for RELAX NG due to lack of demand,
>>exploitation of its geek chic (partly to strike a blow against the
>>empire, I suppose), with the result that the normative definitions of
>>Atom and ODF can't be used with most commercial XML tools.
>Unless you use relax ng tools to convert to xsd?
RELAX NG is a superset of XML Schema: you can express validity
constraints in RELAX NG which cannot be expressed in XML Schema. Try
converting that. Atom, for example, allows stuff like "element x has to
contain at least one y and any number of z, in any order". You can't do
that in XML Schema.
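To make that constraint concrete, here is a minimal sketch in Python rather than in any schema language. It only spells out which documents the quoted rule accepts; the function name and the sample documents are made up for illustration, and it is not how a RELAX NG validator works internally.

```python
import xml.etree.ElementTree as ET


def x_content_is_valid(x: ET.Element) -> bool:
    """True if <x> holds at least one <y>, any number of <z>, in any order, and nothing else."""
    tags = [child.tag for child in x]
    only_y_and_z = all(tag in ("y", "z") for tag in tags)
    return only_y_and_z and tags.count("y") >= 1


# y and z freely mixed, with at least one y: accepted
print(x_content_is_valid(ET.fromstring("<x><z/><y/><z/><y/></x>")))  # True
# no y at all: rejected
print(x_content_is_valid(ET.fromstring("<x><z/><z/></x>")))  # False
```

The check ignores order entirely, which is the point of the rule: any ordering of the children is fine as long as the counts hold.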
Martin
The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>
The list archives are at
To subscribe or unsubscribe from this list use the subscription
manager: <>
No.3 | | 2727 bytes |
| 
At 2006-02-08 09:05 -0800, Michael Champion wrote:
>- RELAX NG is clearly "better" for textual documents but doesn't
>have much support for the data-oriented use cases. (Sure you can
>plug in the XSD type system, but that's a big part of the problem).
RELAX NG is ISO/IEC 19757-2 (note that the compact syntax is also now
standardized as an amendment to the original ISO document), and its
data type system is "plug and play". Yes, W3C Schema Part 2 can be
used, but ISO/IEC 19757-5 Data Types is the standardization of the
Datatype Library Language (DTLL)
proposed by Jeni Tennison.
>We now have an unpleasant situation of fragmentation where there's
>little mainstream tool support for RELAX NG due to lack of demand
But that's the rub: where would the demand be without the
successful uses of it to draw out the demand? "Demand" for W3C
Schema support came from on high as edict, to which W3C-related
vendors responded; the grassroots demand for RELAX NG is merit-based
and users are in a position to make demands of vendors for support.
>- Schematron is moving forward as an ISO standard and has some good
>implementations but has few normative references in vertical
>industry standards nor mindshare. (Correct me if I'm wrong about
>the normative references).
I've incorporated ISO/IEC 19757-3 Schematron normatively in an aspect
of the Universal Business Language (UBL) project; a draft of the use
of Schematron in a code list value validation methodology is at:
UBL users in Denmark are also employing a lot of Schematron (perhaps
Bryan can talk to this).
I think we will see much more demand for ISO/IEC 19757-4
Namespace-based Validation Dispatching Language (NVDL) as grassroots
interest in the power of despatching separate validation tasks will
promote more heterogeneous use of XML vocabularies in instances.
I'm also excited about what will come from ISO/IEC 19757-7 Character
Repertoire Description Language (CRDL) to express the constraints on
the Unicode characters used in an XML document so as to ensure
processing systems can work on the characters found in documents.
All these (and others) are parts of ISO/IEC 19757 Document Schema
Definition Languages (DSDL); note the plural, to emphasize different
horses for different courses. See http://dsdl.org for more
details, or get involved with your country's National Body to ISO and
work on its development yourself!
I hope this helps.
. . . . . . . . . . Ken
No.4 | | 699 bytes |
| 
On Wed, 2006-02-08 at 19:10 +0100, Martin Probst wrote:
>>Unless you use relax ng tools to convert to xsd?
>RELAX NG is a super set of XML Schema, you can express validity
>constraints in RELAX NG which cannot be expressed in XML Schema. Try
>converting that
If you design for relax ng, use it.
If you design for xsd, be aware of what your restrictions are.
User beware perhaps?
regards DaveP
No.5 | | 5003 bytes |
| 
hi,
IMHO, the main difficulty that schema technologies encounter is their
limited ability to express constraints, because the constraints are
hard-coded in the schema. This is the case for occurrence constraints
and content model definitions.
I have experimented with a schema language that can compute the
occurrence constraints dynamically and that can switch from a
declarative language to an imperative one, which dramatically increases
the expressiveness of the schema. The idea is to push back the limits of
the declarative language when they are reached.
An example:
a RELAX NG user was complaining about a constraint that he couldn't
express: he had to design a <table> with any number of <column>s, but
all <column>s should have the same number of <cell>s.
I replied that he could consider an alternative schema technology, such
as the one that I designed:
<asl:element name="column">
  <asl:sequence>
    <xcl:if test="{ asl:element()/preceding-sibling::column }">
      <xcl:then>
        <asl:element ref-elem="cell" min-occurs="{ $asl:max-occurs }"
                     max-occurs="{ count( asl:element()//column[1]/cell ) }"/>
      </xcl:then>
      <xcl:else>
        <asl:element ref-elem="cell" min-occurs="1"
                     max-occurs="unbounded"/>
      </xcl:else>
    </xcl:if>
  </asl:sequence>
</asl:element>
The full schema and the running results are available here :
#2f92e65b7ad48dff
This demonstrates that a simple if-then-else statement is enough to build a
made-to-measure content model with dynamic occurrence constraints.
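For readers who want to see the table constraint itself spelled out, here is a minimal sketch in Python. It is not ASL and says nothing about how RefleX evaluates a schema; it only checks the rule "every <column> has as many <cell>s as the first one". The function name and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def columns_with_wrong_cell_count(table: ET.Element) -> list:
    """Return the 1-based positions of <column>s whose <cell> count differs from the first column."""
    columns = table.findall("column")
    if not columns:
        return []
    # The first column fixes the number of cells expected everywhere else.
    expected = len(columns[0].findall("cell"))
    return [
        position
        for position, column in enumerate(columns, start=1)
        if len(column.findall("cell")) != expected
    ]


sample = ET.fromstring(
    "<table>"
    "<column><cell/><cell/></column>"
    "<column><cell/><cell/></column>"
    "<column><cell/><cell/><cell/></column>"
    "</table>"
)
print(columns_with_wrong_cell_count(sample))  # [3]
```

The bound depends on the instance being validated (the size of the first column), which is why a fixed occurrence range written into a grammar cannot express it.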
I named this schema language the Active Schema Language (ASL), and I have an
almost complete implementation of it in Java, called RefleX:
http://reflex.gforge.inria.fr/
You can read the examples, download the tool and play with it.
Moreover, ASL makes it possible to design smart data types; there is a
tutorial on the RefleX web site that shows a semantic data type: the
"temperature" data type, which is able to parse "32F" and "20C". As this
type is used to augment the amount of information in the XML document, we
can sort a list of attributes of this type not on their string values but
on their typed values
#N800F69
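The idea of sorting on typed values instead of string values can be sketched in a few lines of Python. This is not the RefleX tutorial code: the `Temperature` class, the accepted lexical form (a number followed by C or F) and the helper names are assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Temperature:
    value: float
    scale: str  # "C" or "F"

    def celsius(self) -> float:
        """Typed value normalized to Celsius so that both scales compare."""
        if self.scale == "C":
            return self.value
        return (self.value - 32.0) * 5.0 / 9.0


# Assumed lexical form: optional sign, digits, optional fraction, then C or F.
_PATTERN = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s*([CF])\s*$")


def parse_temperature(text: str) -> Temperature:
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a temperature: {text!r}")
    return Temperature(float(match.group(1)), match.group(2))


def sort_by_typed_value(lexical_values):
    """Sort the strings by the temperature they denote, not by their characters."""
    return sorted(lexical_values, key=lambda s: parse_temperature(s).celsius())


values = ["32F", "20C", "10C", "100F"]
print(sorted(values))               # ['100F', '10C', '20C', '32F']
print(sort_by_typed_value(values))  # ['32F', '10C', '20C', '100F']
```

The plain string sort puts "100F" first, while the typed sort puts "32F" (0 degrees Celsius) first and "100F" last.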
It is worth a look, because all the problems you raise in your message
are identified and addressed in ASL.
No.6 | | 5680 bytes |
| 
In Schematron (though this relies on an XSLT implementation of Schematron,
since it uses the current() function):
<sch:rule context="table/column[1]">
  <sch:report
      test="following-sibling::column[count(cell) != count(current()/cell)]"
  >cells need to be the same number per column
  </sch:report>
</sch:rule>
Cheers,
Bryan Rasmussen
No.7 | | 9389 bytes |
| 
This is a remark that has been made in comp.text.xml.
Unlike Schematron, ASL computes content models, which allows
tools such as editors to predict which elements are allowed; Schematron
is not predictive: it can only warn that a rule has not been followed
AFTER the user has made the mistake, for example by inserting an element
that is not allowed.
There is a great difference between a tool such as Schematron, which
checks whether everything is right in the XML document, and other schema
technologies (DTD, RELAX NG, WXS, ASL) that are able to draw up a
contextual list of XML material (attributes, elements, text) that is
legal to use. You can consider the Active Schema Language to be a
deep integration of known schema languages (DTD, RNG, WXS) with an
assertion language such as Schematron; a deep integration goes further
than using Schematron in WXS or RNG because in that case, even though the
two are located in the same XML instance, they are processed separately.
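The difference between predicting and merely checking can be sketched in Python. This is not ASL, Schematron or any real validator: the content model format (an ordered list of name/min/max particles with distinct names), `TABLE_MODEL` and both functions are hypothetical and kept deliberately small.

```python
from typing import List, Set, Tuple

# Hypothetical content model: an ordered sequence of (name, min, max) particles.
Particle = Tuple[str, int, float]
TABLE_MODEL: List[Particle] = [("title", 1, 1), ("column", 1, float("inf"))]


def allowed_next(model: List[Particle], children: List[str]) -> Set[str]:
    """Grammar style: which element names may legally be appended after `children`?"""
    i, count = 0, 0  # current particle and how many times it has matched
    for name in children:
        # Move past particles that cannot take this child.
        while i < len(model) and (model[i][0] != name or count >= model[i][2]):
            if count < model[i][1]:
                return set()  # a required element was skipped: already invalid
            i, count = i + 1, 0
        if i == len(model):
            return set()  # the child fits nowhere in the model
        count += 1
    allowed: Set[str] = set()
    while i < len(model):
        name, lowest, highest = model[i]
        if count < highest:
            allowed.add(name)
        if count < lowest:
            break  # this particle is still required, nothing after it is legal yet
        i, count = i + 1, 0
    return allowed


def assertion_check(children: List[str]) -> bool:
    """Assertion style: a pass/fail verdict on finished content, no prediction."""
    return (
        children[:1] == ["title"]
        and len(children) > 1
        and all(name == "column" for name in children[1:])
    )


print(allowed_next(TABLE_MODEL, []))         # {'title'}
print(allowed_next(TABLE_MODEL, ["title"]))  # {'column'}
print(assertion_check(["column"]))           # False
```

An editor can call `allowed_next` before the user inserts anything, whereas `assertion_check` can only report a problem once the content exists.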
Here is a mix of RNG + Schematron:
<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="" xmlns=""
         xmlns:sch="">
  <start>
    <element name="table">
      <oneOrMore>
        <element name="column">
          <sch:pattern name="Check to have the same number of cells in each column" id="cells">
            <sch:rule context="column">
              <sch:assert test="count(../column[1]/cell) = count(cell)">The number of cells in this
                column should be the same as in the first column,
                expected <sch:value-of select="count(../column[1]/cell)"/> but got
                <sch:value-of select="count(cell)"/>.
              </sch:assert>
            </sch:rule>
          </sch:pattern>
          <oneOrMore>
            <element name="cell">
              <empty/>
            </element>
          </oneOrMore>
        </element>
      </oneOrMore>
    </element>
  </start>
</grammar>
Anyway, another point that is not covered by Schematron is the
ability to design smart data types such as semantic data types, as
shown in this example:
#N800F69
This is a very basic problem that the known schema languages can't solve.
I think that neither WXS nor Schematron could achieve the same result
(just tell me how if I'm wrong).
No.8 | | 10179 bytes |
| 
Well, looking at what you pointed to, the first example also has
sorting, which is somewhat outside the bounds of what anyone has
discussed as validation before. As for the other ones, I'm not exactly
sure what you're validating (maybe because I haven't had time to learn
ASL syntax).
Actually this one
looks like you're changing the value of the temp attributes if scale
attribute equals the Fahrenheit scale.
Another in the list of schema languages that people don't note too
often is DSD http://www.brics.dk/DSD/
Cheers,
Bryan Rasmussen
2/9/06, Philippe Poulard <Philippe.Poulard (AT) sophia (DOT) inria.frwrote:
This is a remark that has been made in comp.text.xml
Unlike schematron, ASL computes content models, which allow
tools such as editors to predict which element is allowed ; schematron
is not predictive, it can only warn that a rule has not been followed
AFTER the user has made the mistake, for example by inserting an element
that is not allowed.
There is a great difference between a tool such as schematron that
checks if everything is right in the XML document, and other schema
technologies (DTD, RelaxNG, WXS, ASL) that are able to draw up a
contextual list of XML material (attributes, elements, text) that is
legal to use. You can consider that the Active Schema Language is like a
deep integration of known schemata (DTD, RNG, WXS) with an assertion
language such as schematron ; a deep integration goes further than using
schematron in WXS or RNG because even if they are located in the same
XML instance, they are processed separately.
Here is a mix of RNG + schematron :
<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="" xmlns=""
xmlns:sch="">
<start>
<element name="table">
<More>
<element name="column">
<sch:pattern name="Check to have the same number of cells in
each column" id="cells">
<sch:rule context="column">
<sch:assert test="count(/column[1]/cell) =
count(cell)">The number of cells in this
column should be the same as in the firtst column,
expected <sch:value-of
select="count(/column[1]/cell)"/but got
<sch:value-of select="count(cell)"/>.
</sch:assert>
</sch:rule>
</sch:pattern>
<More>
<element name="cell">
<empty/>
</element>
</More>
</element>
</More>
</element>
</start>
</grammar>
Anyway, another point that is not covered by schematron is the
capability to design smart data types such as semantic data types, as
shown in this example :
#N800F69
This is a very basic problem that known schematas can't resolve.
I think that neither WXS nor schematron could perform the same result
(just tell me how if I'm wrong)
bryan rasmussen wrote:
In Schematron - however going for xslt implementation of schematron by
using the current function:
<sch:rule context="table/column[1]">
<sch:report
test="following-sibling::column[count(cell)>current()[count(cell)]]"
>cells need to be the same number per column
</sch:report>
</sch:rule>
Cheers,
Bryan Rasmussen
>
>
>
>
2/9/06, Philippe Poulard <Philippe.Poulard (AT) sophia (DOT) inria.frwrote:
>
>>hi,
>>
>>IMH, the main difficulty that schema technologies encounter is their
>>poor capabilities to express constraints because they are hard-coded in
>>the schema. This is the case for occurrence constraints and content
>>model definitions.
>>
>>I have experimented a schema language that allows to compute the
>>occurrence constraints dynamically and that allows to switch from a
>>declarative language to an imperative one, which increases dramatically
>>the expressiveness of the schema. The idea is to push back the limits of
>>the declarative language when they are reached.
>>
>>An example :
>>a RelaxNG user was complaining about a constraint that he couldn't
>>express : he had to design a <tablewith any <column>s but <column>s
>>should have the same number of <cell>s
>>
>>I respond that he could consider an alternative schema technology, such
>>as these that I designed :
>><asl:element name="column">
><asl:sequence>
><xcl:if test="{ asl:element()/preceding-sibling::column }">
><xcl:then>
><asl:element ref-elem="cell" min-occurs="{
>$asl:max-occurs }" max-occurs="{ count( asl:element()//column[1]/cell
>) }"/>
></xcl:then>
><xcl:else>
><asl:element ref-elem="cell" min-occurs="1"
>max-occurs="unbounded"/>
></xcl:else>
></xcl:if>
></asl:sequence>
>></asl:element>
>>The full schema and the running results are available here :
2f92e65b7ad48dff
>>
>>This demonstrates that a simple if-then-else statement allows to build a
>>made-to-measure content model with dynamic occurrence constraints.
>>
>>I named that schema language the Active Schema Language and I have an
>>almost full implementation of it in Java, called RefleX :
>>
>>http://reflex.gforge.inria.fr/
>>You can read the examples, download the tool and play with it.
>>
>>Moreover, ASL allows to design smart data types ; there is a tutorial in
>>the RefleX web site that shows a semantic data type : the "temperature"
>>data type, which is able to parse "32F" and "20C" ; as this type is
>>used to augment the amount of information of the XML document, we can
>>sort a list of attributes of this type not on the string values but on
>>the typed values
N800F69
>>
>>It is worth seeing because all the problems you consider in your message
>>are pointed out and solutioned in ASL.
>>
>>Michael Champion wrote:
>>
I think the reality is that lots of people flipped the Bozo Bit on the
XSD spec in 1999-2000. They went in different directions, however:
Some to alternative schema languages, some to radical simplification of
XML to de-emphasize schemas altogether.
In hindsight, had people foreseen today's reality that we're stuck with
XSD as what the mainstream user thinks of as the "real standard",
clearly the energy would have been better spent debugging the wretched
thing rather than trying to pretend it doesn't exist or trying to drive
a stake thru its heart. I'm more interested in discussing what to do
going forward given the current mess. The problems I see are:
- The W3C is more interested in moving the XSD spec forward than fixing
  its numerous ambiguities. (Their pushback is that the people who want
  to fix it are not represented on the WG, and the people who have skin in
  the game want to move forward).
- RELAX NG is clearly "better" for textual documents but doesn't have
  much support for the data-oriented use cases. (Sure you can plug in the
  XSD type system, but that's a big part of the problem). We now have an
  unpleasant situation of fragmentation where there's little mainstream
  tool support for RELAX NG due to lack of demand, exploitation of its
  geek chic (partly to strike a blow against the empire, I suppose), with
  the result that the normative definitions of Atom and ODF can't be used
  with most commercial XML tools. Maybe a good guerrilla tactic in the
  open source wars, but for the moment it's the innocent who suffer the
  collateral damage.
- Schematron is moving forward as an ISO standard and has some good
  implementations but has few normative references in vertical industry
  standards and little mindshare. (Correct me if I'm wrong about the
  normative references).
- Lots of people complain about the limitations of XSD that Schematron
  addresses and the W3C doesn't plan to, especially the lack of occurrence
  constraints.
The best way forward that I can see is to encourage end users to
employ XSD + Schematron as necessary, and encourage W3C to address XSD's
bugs and ambiguities before adding more onto an unstable foundation.
What does that miss that the world actually values? (as much as it
depresses me to say it, the world doesn't seem to value RELAX NG's
elegance and mathematical foundation very much).
>>
>>Cordialement,
>>
>///
>(. .)
>(_)
>>| Philippe Poulard |
>
>http://reflex.gforge.inria.fr/
>Have the RefleX !
>>
>>
>>The xml-dev list is sponsored by XML.org <http://www.xml.org>, an
>>initiative of OASIS <http://www.oasis-open.org>
>>
>>The list archives are at
>>
>>To subscribe or unsubscribe from this list use the subscription
>>manager: <>
No.9 | | 10462 bytes |
| 
As noted, since I don't know the ASL syntax, I could be interpreting
those examples you pointed to completely wrong.
Cheers,
Bryan Rasmussen
On 2/9/06, bryan rasmussen <rasmussen.bryan (AT) gmail (DOT) com> wrote:
Well, looking at what you pointed to the first example also has
sorting, which is somewhat outside the bounds of what anyone has
discussed as validation before. As for the other ones I'm not exactly
sure what you're validating (maybe due to not having time to learn ASL
syntax)?
Actually this one
looks like you're changing the value of the temp attributes if scale
attribute equals the Fahrenheit scale.
Another in the list of schema languages that people don't note too
often is DSD http://www.brics.dk/DSD/
Cheers,
Bryan Rasmussen
On 2/9/06, Philippe Poulard <Philippe.Poulard (AT) sophia (DOT) inria.fr> wrote:
This is a remark that has been made in comp.text.xml.
Unlike Schematron, ASL computes content models, which allow
tools such as editors to predict which element is allowed. Schematron
is not predictive: it can only warn that a rule has not been followed
AFTER the user has made the mistake, for example by inserting an element
that is not allowed.
There is a great difference between a tool such as Schematron, which
checks whether everything is right in the XML document, and other schema
technologies (DTD, RelaxNG, WXS, ASL) that are able to draw up a
contextual list of the XML material (attributes, elements, text) that is
legal to use. You can consider the Active Schema Language to be a
deep integration of known schema languages (DTD, RNG, WXS) with an assertion
language such as Schematron. A deep integration goes further than using
Schematron in WXS or RNG because, even when they are located in the same
XML instance, they are processed separately.
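The distinction between a predictive content model and an after-the-fact assertion can be sketched in a few lines of Python. This is only a toy, not how ASL, RELAX NG, or Schematron are implemented: `TABLE_MODEL` is a hypothetical content model (an optional <title>, then one or more <column>), `allowed_next` answers the editor's question "what may I insert here?", and `report_errors` can only judge content that already exists.

```python
UNBOUNDED = float("inf")

# Hypothetical content model for <table>: (name, min occurs, max occurs).
TABLE_MODEL = [("title", 0, 1), ("column", 1, UNBOUNDED)]

def allowed_next(model, children):
    """Predictive use: names that may legally be appended after `children`.

    Returns None if `children` already violates the model.
    """
    position, count = 0, 0
    for name in children:
        # Move past particles that do not match, provided they are satisfied.
        while position < len(model) and model[position][0] != name:
            if count < model[position][1]:
                return None  # a required particle was skipped
            position, count = position + 1, 0
        if position == len(model) or count >= model[position][2]:
            return None
        count += 1
    names = []
    while position < len(model):
        pname, low, high = model[position]
        if count < high:
            names.append(pname)
        if count < low:
            break  # still required, so nothing after it is reachable yet
        position, count = position + 1, 0
    return names

def report_errors(children):
    """Assertion use: rules checked against content that is already written."""
    errors = []
    if children.count("title") > 1:
        errors.append("at most one <title> is allowed")
    if "column" not in children:
        errors.append("at least one <column> is required")
    return errors

print(allowed_next(TABLE_MODEL, []))
print(allowed_next(TABLE_MODEL, ["title"]))
print(report_errors(["title", "title"]))
```

An editor can call something like `allowed_next` before the user types anything, whereas the assertion function has nothing to say until the mistake is in the document.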
Here is a mix of RNG + Schematron:
<?xml version="1.0" encoding="UTF-8"?>
<grammar ns="" xmlns=""
         xmlns:sch="">
    <start>
        <element name="table">
            <oneOrMore>
                <element name="column">
                    <sch:pattern name="Check to have the same number of cells in each column" id="cells">
                        <sch:rule context="column">
                            <sch:assert test="count(/table/column[1]/cell) = count(cell)">The number of cells in this
                                column should be the same as in the first column,
                                expected <sch:value-of select="count(/table/column[1]/cell)"/> but got
                                <sch:value-of select="count(cell)"/>.
                            </sch:assert>
                        </sch:rule>
                    </sch:pattern>
                    <oneOrMore>
                        <element name="cell">
                            <empty/>
                        </element>
                    </oneOrMore>
                </element>
            </oneOrMore>
        </element>
    </start>
</grammar>
Anyway, another point that is not covered by Schematron is the
capability to design smart data types such as semantic data types, as
shown in this example:
#N800F69
This is a very basic problem that known schema languages can't solve.
I think that neither WXS nor Schematron could achieve the same result
(just tell me how if I'm wrong).
bryan rasmussen wrote:
In Schematron, though this relies on an XSLT implementation of Schematron
because it uses the current() function:
<sch:rule context="table/column[1]">
    <sch:report test="following-sibling::column[count(cell) != count(current()/cell)]">
        cells need to be the same number per column
    </sch:report>
</sch:rule>
Cheers,
Bryan Rasmussen
No.10 | | 11965 bytes |
| 
bryan rasmussen wrote:
> Well, looking at what you pointed to the first example also has
> sorting, which is somewhat outside the bounds of what anyone has
> discussed as validation before. As for the other ones I'm not exactly
> sure what you're validating (maybe due to not having time to learn ASL
> syntax)?
The type simply checks that the value of the attribute is an xs:int;
if it is not, an error is reported.
Additionally, it builds a data model that is customizable.
> Actually this one
> looks like you're changing the value of the temp attributes if scale
> attribute equals the Fahrenheit scale.
In fact, it doesn't change the attribute value (but it could); it just
changes the typed data value that is bound to the attribute.
I think it's important for the attribute to keep its value as is; I
don't think that the role of a schema is to change the XML document.
Anyway, if some people really want to change the document, they could
replace the <xcl:update> instruction with:
<xcl:attribute referent="{ asl:element() }" name="temp" operand="{ (value( . ) - 32) * 5 div 9 }"/>
<xcl:attribute referent="{ asl:element() }" name="scale" value="C"/>
which really converts <town temp="32" scale="F"> to <town temp="0"
scale="C">.
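For readers who don't know XCL, the same rewrite can be sketched in Python with the standard library. This is only an analogy for the two <xcl:attribute> instructions above, not a description of what RefleX does; `to_celsius` is a hypothetical helper and the sketch assumes `temp` holds an integer, as the xs:int check implies.

```python
import xml.etree.ElementTree as ET

def to_celsius(town):
    """Rewrite temp/scale in place when the scale is Fahrenheit."""
    if town.get("scale") == "F":
        celsius = (int(town.get("temp")) - 32) * 5 / 9
        # Write whole numbers without a trailing ".0", so 32 F becomes "0".
        town.set("temp", str(int(celsius)) if celsius.is_integer() else str(celsius))
        town.set("scale", "C")
    return town

town = ET.fromstring('<town temp="32" scale="F"/>')
to_celsius(town)
print(town.get("temp"), town.get("scale"))
```

Unlike the typed-value approach, this changes the document itself, which is exactly the behaviour the message argues a schema should normally avoid.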
Notice that XCL is not part of ASL: it's a general-purpose language
that can be used to advantage in ASL, but more traditional
applications (web apps, batch scripts) can also be written with XCL.
ASL is part of a wider framework that I called "Active Tags", which
allows you to design native XML programs; ASL is a module (a library) of
Active Tags, XCL (the XML Control Language) is another, and there is
also a library for accessing RDBMSs. Thus, if you have an attribute
whose value must be one of the results of an SQL query, you can express it:
#N401309
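The idea of an attribute whose legal values come from an SQL query can be illustrated with a short Python sketch using an in-memory SQLite database. This is not the Active Tags RDBMS library; the table `scales`, its column `code`, and the function names are all invented for the example.

```python
import sqlite3

def allowed_values(connection):
    """Fetch the legal attribute values from a (hypothetical) lookup table."""
    rows = connection.execute("SELECT code FROM scales ORDER BY code")
    return {code for (code,) in rows}

def check_attribute(value, connection):
    """Return True if `value` is one of the values returned by the query."""
    return value in allowed_values(connection)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scales (code TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO scales VALUES (?)", [("C",), ("F",)])

print(check_attribute("F", conn))
print(check_attribute("K", conn))
```

The enumeration is therefore not fixed in the schema: adding a row to the table changes what the validator accepts.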
> Another in the list of schema languages that people don't note too
> often is DSD http://www.brics.dk/DSD/
I'll have a look at it.
> Cheers,
> Bryan Rasmussen
>>
>>Cordialement,
>>
>///
>(. .)
>(_)
>>| Philippe Poulard |
>
>http://reflex.gforge.inria.fr/
>Have the RefleX !