Michael Kay <mike (AT) saxonica (DOT) com> wrote:
I think that spaces in URIs, according to the RFC,
are not allowed
Yes, that's true.
If they are present
among the characters of a URI they should
be ignored; they are there just to allow
the URI to be split across multiple lines
(this part comes [historically] from the URL RFC).
I don't recall seeing any such statement: can you
give a reference?
In this section they talk about whitespace:
1.6. Syntax Notation and Common Elements
This document uses two conventions to describe and
define the syntax
for URI. The first, called the layout form, is a
general description
of the order of components and component
separators, as in
<first>/<second>;<third>?<fourth>
The component names are enclosed in angle-brackets
and any characters
outside angle-brackets are literal separators.
Whitespace should be
ignored. These descriptions are used informally
and do not define
the syntax requirements.
Then they say the space character is excluded; we agreed on this.
2.4.3. Excluded US-ASCII Characters
The space character is excluded because significant
spaces may
disappear and insignificant spaces may be
introduced when URI are
transcribed or typeset or subjected to the
treatment of word-
processing programs. Whitespace is also used to
delimit URI in many
contexts.
In Appendix E they say it should be removed:
E. Recommendations for Delimiting URI in Context
In some cases, extra whitespace (spaces, linebreaks,
tabs, etc.) may
need to be added to break long URI across lines.
The whitespace
should be ignored when extracting the URI.
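To illustrate the reading of Appendix E quoted above, here is a minimal Python sketch of extracting a URI that was broken across lines. The function name extract_uri and the choice to accept optional angle-bracket delimiters are assumptions made for this illustration; they are not taken from the RFC text.

```python
import re


def extract_uri(delimited: str) -> str:
    """Recover a URI from running text, ignoring whitespace added for layout.

    Hypothetical helper: strips one pair of enclosing angle brackets if
    present, then removes every run of whitespace (spaces, tabs, linebreaks).
    """
    text = delimited.strip()
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    return re.sub(r"\s+", "", text)


wrapped = "<http://www.example.com/a/very/long/\n    path/split.html>"
print(extract_uri(wrapped))
```

Note that under this reading a space inside a URI is dropped when the URI is extracted from its context; it is not preserved as part of the URI.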
I have to go; we can talk about the rest later.
Regards,
Michele
So
http://www.example.com/Example with two spaces
is not a valid xs:anyURI
You seem to be assuming that because it's not a
valid URI then it's not a
valid xs:anyURI. This doesn't follow. The schema
spec allows an xs:anyURI to
contain what I call a "wannabe URI": more formally,
it can contain any
string that can be mapped to a URI by following the
escaping procedure in
section 5.4 of XLink. This mapping performs
percent-encoding on all
"disallowed characters"; a space is a disallowed
character that maps to %20;
therefore a space is allowed in an xs:anyURI value
(even though it is not
allowed in an IRI as defined by RFC 3987).
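As a rough illustration of the mapping described above, here is a Python sketch of the escaping step. It assumes that the "disallowed characters" are the non-ASCII characters plus the RFC 2396 section 2.4 excluded characters, with "#", "%", "[" and "]" left untouched; the function name escape_wannabe_uri is made up for this example, and the XLink text itself remains the authority.

```python
# Assumed set of disallowed printable ASCII characters: the RFC 2396
# excluded characters minus "#", "%", "[" and "]".
DISALLOWED_ASCII = set('<>"{}|\\^` ')


def escape_wannabe_uri(value: str) -> str:
    """Percent-encode disallowed characters, leaving everything else alone."""
    out = []
    for ch in value:
        code = ord(ch)
        # Control characters, DEL and all non-ASCII characters are escaped,
        # together with the assumed ASCII set above.
        if code < 0x20 or code > 0x7E or ch in DISALLOWED_ASCII:
            # Each UTF-8 byte of the character becomes one %HH triplet.
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


print(escape_wannabe_uri("http://www.example.com/Example with two spaces"))
```

Existing "%" signs are not re-escaped in this sketch, so a value that is already escaped passes through unchanged.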
As further evidence that space is allowed in an
xs:anyURI, you yourself
quoted the statement that spaces are discouraged. It
wouldn't be necessary
to discourage them if they were invalid.
Michael Kay
http://www.saxonica.com/
The xml-dev list is sponsored by XML.org
<http://www.xml.org>, an
initiative of OASIS <http://www.oasis-open.org>
The list archives are at
To subscribe or unsubscribe from this list use the
subscription
manager:
<>