Standards
 
#1
July 6th, 2008, 06:20 AM
Julian Reschke
Why Microsoft's authoritative=true won't work and is a bad idea

Ian Hickson wrote:

If you would like the document to be processed as plain text, then there
might not be a good answer for you, sorry. Your use case is incompatible
with the use case of the many users who want to see feeds sent as
text/plain handled as feeds. Enough people mislabel their feeds as
text/plain that in practice documents labeled as text/plain are, in some
browsers, sniffed for feeds before being treated as plain text.


With the current text in HTML5, there's not only no "good answer" but no
answer at all (except by telling users to configure their UAs to respect
mime types).

Sam's use case could be made compatible by making the response
distinguishable from one sent by a misconfigured server.

At this point it seems to me that you are simply not interested in that
case. Is this correct?

BR, Julian

#2
July 7th, 2008, 08:40 AM
Henrik Nordstrom
Why Microsoft's authoritative=true won't work and is a bad idea

[Sorry for the lack of a common thread in this message; please read it in
full before responding.]

Mon, 2008-07-07 at 09:33 +0200, Julian Reschke wrote:

The IETF HTTPbis working group has no mandate to do so. Thus it would
need to be rechartered, or a new WG would have to start.

And from a protocol specification and common sense point of view it would
be the wrong thing to officially allow sniffing even when the
content-type is clearly specified.

HTTP already specifies when sniffing is allowed or not. Major browser
vendors have over time and by intent chosen to ignore this part of the
specifications, and now their ignorance is coming back and biting them
and their users. Does this mean that specifications should change to
allow for these bugs to grow into a standard feature encouraging
ignorance?

It also seems that some noticeable players have lost faith, thinking
that things won't improve over time and things will stay as bad or worse
over time. An attitude I find a bit disturbing when working with
specifications as it means nothing can be changed or fixed other than
documenting how broken the current implementations are today, ending in
the rationale that "UTF7 content sniffing is implemented by some, so it
must be supported by everyone even if completely stupid and current
specifications we all agreed to implement years ago say you MUST NOT".

Yes, it does take some years of effort before any result in these areas
is seen at all, but it's certainly not impossible. I have been fighting
some of these wars, and some hours per year over some years nagging the
right people about something which bothers you can make a difference.

Yes, in the end there will be some old minor sites no longer working
well with newer browsers if sniffing is deprecated. But there will also
be existing major sites working better, being able to use content types
as intended instead of having to find ways around the browsers' guessing
game.

HTTP intentionally does not specify how sniffing is to be implemented or
evaluated. That's a client implementation detail as far as HTTP is
concerned, an extra feature to be used when nothing else is known about
the content.
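
The rule referred to here can be sketched in a few lines. This is only an illustration, assuming the HTTP/1.1 wording that a recipient may guess the media type if and only if no Content-Type is given, and falls back to application/octet-stream when it cannot tell. The function names and the crude `toy_sniff` heuristic are hypothetical and do not describe any real browser.

```python
def effective_media_type(headers, body, sniff):
    """Return the media type a client would use under the HTTP rule."""
    declared = headers.get("Content-Type")
    if declared:
        # A Content-Type was given: use it as labelled, no guessing.
        return declared.split(";", 1)[0].strip().lower()
    # No Content-Type at all: guessing is permitted.
    return sniff(body)


def toy_sniff(body):
    """Hypothetical, deliberately crude heuristic for illustration only."""
    head = body.lstrip()[:64].lower()
    if head.startswith((b"<?xml", b"<rss", b"<feed")):
        return "application/xml"
    if head.startswith((b"<!doctype html", b"<html")):
        return "text/html"
    # Unknown content: treat as opaque bytes.
    return "application/octet-stream"


labelled = effective_media_type(
    {"Content-Type": "text/plain; charset=utf-8"}, b"<rss>", toy_sniff
)
unlabelled = effective_media_type({}, b"<rss version='2.0'>", toy_sniff)
```

Under this rule a feed served as text/plain stays text/plain; the sniffing behaviour discussed in this thread is what happens when a client takes the second branch even though a label was present.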


How is that possible?

Using Microsoft's proposal or by using a separate header, for instance.

That is, if it weren't for the Apache answer that, if such an extension
becomes commonly available, Apache will set it by default. By the
reasoning applied earlier, things would then go back to square -1, with
even more bits on the wire that nobody wants to trust, because server
admins are by definition not to be trusted to make their servers conform
with requirements, or are in general completely ignorant of whether their
content breaks for large parts of their user base because of this.

My concern about the proposal or added header is the reverse. Yes, it
will enable servers to tell the next generation of clients to trust them,
but on the downside it will give more slack to the proponents who
think sniffing is the solution for dealing with mislabelled content.
It's not a real solution to the problem; in fact it encourages that bug
to grow even bigger, just adding a workaround to be able to ask that bug
to go and hide for a while.

Well, the biggest vendor just put a proposal on the table that would
make it possible to disable sniffing altogether.

Maybe it would make sense to consider it seriously, instead of
immediately stating "won't work"?

It will work, at least temporarily, until there is again a sufficient
amount of mislabelled content.
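
For readers who have not seen the proposal, here is a rough sketch of how a sniffing client might honour such an opt-out. It assumes the flag travels as a parameter on the Content-Type header (for example `text/plain; authoritative=true`); that placement, the helper names, and the simplified parameter parsing are assumptions made for illustration, not a description of any shipped implementation.

```python
def parse_content_type(value):
    """Split a Content-Type value into (media_type, params).

    Simplified parsing for illustration: it does not handle quoted
    strings that themselves contain ';' or '='.
    """
    media_type, _, rest = value.partition(";")
    params = {}
    for part in rest.split(";"):
        name, sep, val = part.partition("=")
        if sep:
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type.strip().lower(), params


def may_sniff(content_type_header):
    """Model a client that sniffs unless the server claims authority."""
    if content_type_header is None:
        return True  # nothing declared, guessing is all that is left
    _, params = parse_content_type(content_type_header)
    return params.get("authoritative", "").lower() != "true"
```

Note that in this model a plain `text/plain` response is still sniffed, which is exactly the objection raised here: the flag works around the sniffing rather than removing it.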

The only real long-term solution I see to this problem is for major
browser vendors to gradually stop sniffing content even without this
extension. Add "server trust" levels similar to how cookie
black/whitelisting is managed, enabling the browsers to learn (by user
experience) which sites label their content properly and which don't. A
good start on this track is to add a visible indication when mislabelled
content is detected, enabling users to see when there is something wrong
without "destroying the web".
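
As a thought experiment only, the "server trust" idea could look something like the following. The class, its threshold, and the notion of comparing a declared type against a detected one are all hypothetical; the message above does not specify how such levels would be computed.

```python
from collections import defaultdict


class ServerTrust:
    """Hypothetical per-host record of how often labels looked wrong."""

    def __init__(self, max_mislabels=0):
        self.max_mislabels = max_mislabels
        self.mislabels = defaultdict(int)

    def record(self, host, declared, detected):
        """Note one response; return True if it looked mislabelled."""
        mislabelled = declared != detected
        if mislabelled:
            # This is where a UA could show a visible indication.
            self.mislabels[host] += 1
        return mislabelled

    def trusts_labels(self, host):
        """Hosts with few enough mislabels get their labels honoured."""
        return self.mislabels[host] <= self.max_mislabels


trust = ServerTrust()
trust.record("feeds.example", "text/plain", "application/xml")
trust.record("good.example", "text/html", "text/html")
```

A browser following this sketch would keep honouring labels from `good.example` and fall back to guessing only for hosts with a record of mislabelling.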

Regards
Henrik

#3
July 7th, 2008, 08:40 AM
Justin James
Why Microsoft's authoritative=true won't work and is a bad idea

There is no "URI group" -- there's a list of people subscribed to the
URI mailing list. That being said, I haven't seen *any* kind of
consensus that RFC3986 should be changed. I've seen some discussion
about whether RFC3987bis should expand on the "LEIRI" topic, and it
seems Martin Duerst was considering that input.

It seems to me that the following facts are true:

* The URI group/mailing list is not actively working to update or change the
URI specs.
* In the last few weeks, it has become clear that the URI specs need to
change for certain aspects of browser behavior and HTML to make sense and/or
work right.
* The current URI/URL/"HTTP URL"/IRI breakout is artificial and can/should
be fixed in the URI spec.

If what Julian says is correct (and I have no reason to doubt it), how do we
get some traction on this issue? Who do we engage? Does it make sense,
instead of trying to do the work of an active URI group within the HTML 5
spec (the "HTTP URL" initiative) for a number of us to get involved with
getting an *active* URI group going and simply working within that framework
on that issue? Yes, it might feel like "packing the court", but if the spec
is in desperate need of some reality-based changes, and there is no *active*
group willing or able to even consider changes, then I don't see any issue
with it.

J.Ja

#4
July 7th, 2008, 09:20 AM
Julian Reschke
URI/IRI vs HTML-URL, was: Why Microsoft's authoritative=true won't work and is a bad idea

Justin James wrote:
>There is no "URI group" -- there's a list of people subscribed to the
>URI mailing list. That being said, I haven't seen *any* kind of
>consensus that RFC3986 should be changed. I've seen some discussion
>about whether RFC3987bis should expand on the "LEIRI" topic, and it
>seems Martin Duerst was considering that input.


It seems to me that the following facts are true:

* The URI group/mailing list is not actively working to update or change the
URI specs.

There is no URI working group. URI is a stable specification (full IETF
standard), and there's no consensus that anything needs to be done with
it with respect to "HTML URL".

There are individuals (?) working on a revision of the IRI spec,
including Martin Duerst. That revision may contain more information about
what's currently called LEIRI (Legacy Extended IRI), but I don't think
there's consensus about whether this is really a good idea. Head over to
the URI mailing list and discuss it, if you're interested.

* In the last few weeks, it has become clear that the URI specs need to
change for certain aspects of browser behavior and HTML to make sense and/or
work right.

Nope.

What has become clear is that HTML needs to handle a superset of what
IRI allows, and also needs to special case IRI->URI conversion for query
components.

That can be done in a separate spec, defining a mapping from "HTTP URL"
to IRI reference, and then letting the default URI/IRI rules apply.

It's not yet clear whether the same is needed outside HTML. Still
waiting for examples.

* The current URI/URL/"HTTP URL"/IRI breakout is artificial and can/should
be fixed in the URI spec.

Not sure what you call "breakout", and what you want fixed.

If what Julian says is correct (and I have no reason to doubt it), how do we
get some traction on this issue? Who do we engage? Does it make sense,
instead of trying to do the work of an active URI group within the HTML 5
spec (the "HTTP URL" initiative) for a number of us to get involved with
getting an *active* URI group going and simply working within that framework
on that issue? Yes, it might feel like "packing the court", but if the spec
is in desperate need of some reality-based changes, and there is no *active*
group willing or able to even consider changes, then I don't see any issue
with it.

I think HTML5 defining local rules for treatment of identifiers in HTML
documents is fine, as long as this is done by defining a mapping to IRI
(which as far as I understand currently is not the case).

*If* more specifications need the same kind of mapping (and that's still
an "if" for me), it would make sense to extract these mapping rules into
a separate spec. Should these specs live in W3C land, it would probably
make sense to make this a W3C activity.

BR, Julian

#5
July 7th, 2008, 01:40 PM
Boris Zbarsky
Why Microsoft's authoritative=true won't work and is a bad idea

Henrik Nordstrom wrote:
HTTP already specifies when sniffing is allowed or not. Major browser
vendors have over time and by intent chosen to ignore this part of the
specifications

Indeed, though as far as I can tell all of them except IE did this in the face
of the #1 most-commonly-used HTTP server having a "feature" which essentially
forced them to do it if they were to have a hope of being compatible with
commonly-used websites. That's for text/plain. Feed sniffing was more a matter
of standalone feed readers ignoring Content-Type altogether and treating
everything as a feed, which meant that there was zero incentive to label feeds
as such. When browsers came to implement a feed reader, the status quo was that
a large fraction of feeds (easily double-digit percentages) was mislabeled.

and now their ignorance is coming back and biting them
and their users.

Excuse me? "Ignorance"? Everyone involved knew exactly what they were doing.
There were just no good solutions; the small amount of sniffing added seemed
like the least bad of a set of bad choices.

Does this mean that specifications should change to
allow for these bugs to grow into a standard feature encouraging
ignorance?

The specifications, the UAs, and the servers should change such that:

1) The UAs implement the specification.
2) The servers implement the specification.
3) The specification defines error-handling.
4) The ensemble is a stable equilibrium (Ideally no one has incentive to
change behavior).
5) At no point in between here and there is a UA required to do something
that would cause its users to stop using it (an obvious non-starter
from a UA point of view).
6) At no point in between here and there is a server required to do
something that would cause administrators to stop using it (also an
obvious non-starter, I would think).

I have no opinion as to what the final state should be, subject to the above
constraints.

It also seems that some noticeable players have lost faith, thinking
that things won't improve over time and things will stay as bad or worse
over time.

That's an empirical observation of the last 10 years, for what it's worth, not
just a "think". If you think the next 10 years will somehow be different, I'd
love to know why.

-Boris

#6
July 7th, 2008, 05:40 PM
Henrik Nordstrom
Why Microsoft's authoritative=true won't work and is a bad idea

Mon, 2008-07-07 at 09:36 -0400, Justin James wrote:

* In the last few weeks, it has become clear that the URI specs need to
change for certain aspects of browser behavior and HTML to make sense and/or
work right.

What's wrong with the HTTP URL specification that makes HTML not make
sense or not work right?

I know some cases where browsers behave oddly wrt Internet URLs in
general (mainly http:// and ftp://), and in all cases so far they are
not following specifications and would behave quite well if they did.
Regards
Henrik

#7
July 7th, 2008, 06:20 PM
Justin James
Why Microsoft's authoritative=true won't work and is a bad idea

What's wrong with the HTTP URL specification that makes HTML not make
sense or not work right?

I know some cases where browsers behave oddly wrt Internet URLs in
general (mainly http:// and ftp://), and in all cases so far they are
not following specifications and would behave quite well if they did.

Henrik -

The problem with the concept of HTML specifying its own URLs, from my viewpoint, is that developers need one standard to follow, not 3 (URI, IRI, HTTP URL). All too often, once you get more than 2 competing "standards", none of them is actually "standard", and each gets enough traction that it never dies. I truly think that everyone would be better served if there were simply 1 "U|IR*" standard (it's really sad when a regex is the best way to refer to a group of things) that developers learn and understand. All of the debate on this list over having a "U|IR*" standard added to the HTML spec, in order to compensate for discrepancies between how U|IR*'s are commonly used in HTML and the way the specs read, is further proof that the specs are broken.

A simple summary of my thoughts:
Any spec which is not properly followed by the majority of developers a majority of the time (where pertinent, of course) is not a "standard" and is a broken spec. Sometimes, it is broken outside of the spec itself, such as being sponsored or ratified by an unrecognized body. Other times it is broken within the spec, like 800 page specs describing a floor sweeping process or something. Sometimes it is just a marketing problem (like so many of the X* specs, like XHTML, XForms, XPath, and a zillion other X* specs which few people use).

From what I can tell, the W3C has a very, very hard time producing specs which don't qualify as "broken" by that measure, and HTML is heading that list.


Imagine if drive manufacturers followed the SATA spec as well as HTML authors followed the HTML spec. We'd still be using pen and paper. So we need to be asking ourselves, "what's wrong with HTML that no one follows it?" The answer is not *just* "browsers accept garbage". The answer also includes, "a spec so long and lengthy that only a select few people can understand it to the point where they can write valid HTML." In other words, HTML is broken from the inside.

J.Ja

#8
July 7th, 2008, 06:20 PM
Henrik Nordstrom
Why Microsoft's authoritative=true won't work and is a bad idea

Mon, 2008-07-07 at 14:21 -0400, Boris Zbarsky wrote:

Excuse me? "Ignorance"? Everyone involved knew exactly what they were doing.
There were just no good solutions; the small amount of sniffing added seemed
like the least bad of a set of bad choices.

I obviously disagree, but that's my opinion.

The specifications, the UAs, and the servers should change such that:

I'll add

0) The specifications makes sense and unambious to implement

1) The UAs implement the specification.
2) The servers implement the specification.
3) The specification defines error-handling.
4) The ensemble is a stable equilibrium (Ideally no one has incentive to
change behavior).
5) At no point in between here and there is a UA required to do something
that would cause its users to stop using it (an obvious non-starter
from a UA point of view).
6) At no point in between here and there is a server required to do
something that would cause administrators to stop using it (also an
obvious non-starter, I would think).

Yes, with some reservations for 5 & 6. I do expect UAs and servers to be
willing to correct bugs, even if correcting those bugs would cause some
slight interoperability issues with other broken implementations at the
benefit of enabling correct interoperability with correct
implementations. Even if this results in some users shifting one way or
another.

I have no opinion as to what the final state should be, subject to the above
constraints.

I have some opinions, based on

- Simplicity.

- No second-guessing or non-obvious side effects. If something is said
it is said and should be trusted to be correct.

- Consistent. As few special cases as possible.

That's an empirical observation of the last 10 years, for what it's worth, not
just a "think". If you think the next 10 years will somehow be different, I'd
love to know why.

Been in this business for more than 10 years, and have not yet lost
faith in the ability to work for a more standardized and predictable
computing environment.

But if standardisation discussions in general tend to focus on "making
current broken implementations the standardized status and assuming all
implementations will be broken in the same way" instead of what makes
sense from a long term technical standard point of view then things will
certainly spin in the direction of worse.

Regards
Henrik

#9
July 7th, 2008, 07:01 PM
Henrik Nordstrom
Why Microsoft's authoritative=true won't work and is a bad idea

Mon, 2008-07-07 at 18:56 -0400, Justin James wrote:

The problem with the concept of HTML specifying its own URLs, from my
viewpoint, is that developers need one standard to follow, not 3 (URI,
IRI, HTTP URL).

But I am still not aware of the problem which triggered this. I linger
on the HTTP WG, not the HTML one, and am therefore unaware of what
problem the HTTP URL/URI/IRI specifications cause for HTML.

Any spec which is not properly followed by the majority of developers
a majority of the time (where pertinent, of course) is not a
"standard" and is a broken spec.

There is a large grey zone there. But yes, if every implementer considers
what the spec says in some area to be nonsense and implements something
other than what the spec says, then the spec is most likely broken. But in
quite many cases it's just a poor choice of language making the intentions
of the specification not so obvious.

The spec is not broken, however, if every implementer implements something
else even though what the spec says is correct, because the will to
interoperate with existing/older broken implementations is greater than
the will to keep a sane implementation. And especially not when there are
multiple such areas for historical reasons (of which HTTP has its
noticeable share, with 3.5 generations in less than a handful of years).

Sometimes, it is broken outside of the spec itself, such as being
sponsored or ratified by an unrecognized body.

Or implemented before the effects have been properly analyzed.

Other times it is broken within the spec, like 800 page specs
describing a floor sweeping process or something.

Yes, and unfortunately many specifications are heading in that
direction, growing uncontrollably large with huge amounts of legacy
attached.

But quite often it's better to clearly define the original intents using
the original mechanisms and encourage compliance, than to reinvent the
same things again only because most implementers got it wrong the first
time.

Sometimes it is just a marketing problem (like so many of the X*
specs, like XHTML, XForms, XPath, and a zillion other X* specs which
few people use).

Heh

From what I can tell, the W3C has a very, very hard time producing specs
which don't qualify as "broken" by that measure, and HTML is heading
that list.

Can't comment. HTML is not my main field; I stay mostly in the area of
protocols and bits. But I do still feel a significant gap between HTML
(and related) specifications and user agent implementations, and quite
different gaps depending on the implementation. But I still have faith
that things will improve over time if one has a little patience, and
converge towards the specifications instead of diverging even further
apart.

A really big problem is how to get rid of legacy from earlier
specifications whose design choices perhaps weren't the best. Once a
feature gets into a standard and is implemented in more than one
implementation, it's likely to stay for a considerable time even if it
turned out to be a very bad idea.

Things which are only implemented but not officially standardised, or
only in the standards but never implemented, are a whole lot easier to
change, as you can always claim that one of the two is wrong/broken.

Same for when implementations misread specifications, resulting in
unintentional deviations from the specification, most often from not
understanding the specification or how it applies to what they do. Such
mistakes are often relatively easy to get corrected once the right people
are made aware of the issue and why it's important to follow the specs.

Regards
Henrik

#10
July 7th, 2008, 07:01 PM
Boris Zbarsky
Why Microsoft's authoritative=true won't work and is a bad idea

Henrik Nordstrom wrote:
>Excuse me? "Ignorance"? Everyone involved knew exactly what they were doing.
>There were just no good solutions; the small amount of sniffing added seemed
>like the least bad of a set of bad choices.


I obviously disagree, but that's my opinion.

You're entitled to it, and I should clarify that the above only applies to the
cases in which I've been able to see the reasoning process that led to the
decisions (namely Gecko and Webkit).

0) The specifications makes sense and unambious to implement

Assuming you meant "unambiguous", I agree. If you meant something else, what
did you mean?

>5) At no point in between here and there is a UA required to do something
>that would cause its users to stop using it (an obvious non-starter
>from a UA point of view).
>6) At no point in between here and there is a server required to do
>something that would cause administrators to stop using it (also an
>obvious non-starter, I would think).


Yes, with some reservations for 5 & 6. I do expect UAs and servers to be
willing to correct bugs, even if correcting those bugs would cause some
slight interoperability issues with other broken implementations at the
benefit of enabling correct interoperability with correct
implementations. Even if this results in some users shifting one way or
another.

So you're asking people to shoot themselves in the foot for the common good.
While some may be willing to, in general that's a tough sell if the shooting is
significant enough.

Put another way, I can't think of a browser that would be willing to, say,
sacrifice 5% of market share on this issue. I suspect sacrificing a single user
is acceptable. The line is somewhere in between.

- Simplicity.

Which is nice if possible, of course. Are we talking simplicity of
specification, of implementation, or of deployment?

- No second-guessing or non-obvious side effects. If something is said
it is said and should be trusted to be correct.

This is nice to have, yes.

- Consistent. As few special cases as possible.

Again, this is nice to have.

-Boris

#11
July 7th, 2008, 07:01 PM
Henrik Nordstrom
Why Microsoft's authoritative=true won't work and is a bad idea

Mon, 2008-07-07 at 19:31 -0400, Boris Zbarsky wrote:

0) The specifications makes sense and unambious to implement

Assuming you meant "unambiguous", I agree.

I did. I always have a hard time spelling that word for some reason.

So you're asking people to shoot themselves in the foot for the common good.
While some may be willing to, in general that's a tough sell if the shooting is
significant enough.

No I am not.

Put another way, I can't think of a browser that would be willing to, say,
sacrifice 5% of market share on this issue. I suspect sacrificing a single user
is acceptable. The line is somewhere in between.

Yes. The rule is that you sacrifice some share to gain another part and
improve long term stability and reliability.

- Simplicity.

Which is nice if possible, of course. Are we talking simplicity of
specification, of implementation, or of deployment?

In this discussion at least specification and implementation. Usually
goes hand in hand.

Regards
Henrik

#12
July 8th, 2008, 06:20 AM
Stefan Eissing
the "HTML URL" issue, was: Why Microsoft's authoritative=true won't work and is a bad idea

On 08.07.2008 at 09:27, Julian Reschke wrote:
The other issue that got a lot of discussion is whether the things
used in HTML should be called "URL", when in reality they are
something else.

Calling them HREFs (even though they also appear in other attributes)
would give everyone the right context (HTML) and topic (URLs) without
the confusion of redefining existing terms.

//Stefan
--
<green/>bytes GmbH, Hafenweg 16, D-48155 Münster, Germany
Amtsgericht Münster: HRB5782

#13
July 8th, 2008, 06:20 AM
Julian Reschke
URI/IRI vs HTML-URL, was: Why Microsoft's authoritative=true won't work and is a bad idea

Martin Duerst wrote:

It may or may not need such a special case. The truth is that some years
ago (less than 10), virtually all existing non-ASCII path information
in (U/I)RIs had to be interpreted in the encoding of the containing page.
This has changed, because people started to pick up on the idea of IRIs,
more and more systems used UTF-8 on the server side, and at least some
people understood that using the encoding of the containing page
made it impossible to treat such identifiers free-standing. Also, a
fallback for paths in legacy encodings is still available (and was always
available): %-encoding.

As long as query URIs are interpreted based on the encoding of the
containing page, they will stay useless without that context. I.e.
they cannot (without further pain) be put into bookmark lists, they
cannot be sent in email, and so on. The only sensible way to make
this possible is to do the same as for the path part, namely use
UTF-8 for the IRI->URI conversion. Freestanding (U/I)RIs with
query parts may be less important than freestanding (U/I)RIs
without query parts, but still, they are often convenient.
However, they won't work if implemented the way HTML5 is currently
describing them. Also, same as for path parts, a fallback for query
parts in legacy encodings is still available (and was always
available): %-encoding.
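
A small sketch may make the context dependence concrete. It uses Python's `urllib.parse.quote` purely as a stand-in for the escaping step of an IRI->URI conversion; the helper name and the sample value are illustrative assumptions, not anything taken from HTML5 or the IRI spec.

```python
from urllib.parse import quote


def escape_query_value(value, encoding):
    """Percent-encode one query value using the given character encoding."""
    return quote(value, safe="", encoding=encoding)


term = "caf\u00e9"  # "café"

# The same characters, escaped under two different page encodings.
as_utf8 = escape_query_value(term, "utf-8")
as_latin1 = escape_query_value(term, "iso-8859-1")
```

The two results differ (`caf%C3%A9` versus `caf%E9`), so a query written with raw non-ASCII characters only means something if the reader also knows the encoding of the page it came from. Writing the escaped form directly, or always converting via UTF-8, removes that dependency.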

In summary, there are cases where things changed to the better
in the last few years, and there are cases where some solutions
make the Web work better than others.


Note that HTML5 documents that aren't encoded in UTF-8 (or UTF-16)
and which carry non-ASCII query parameters are currently non-conformant.
(I personally don't think it makes a big difference in practice, as HTML5
normatively defines their handling, so people will rely on that
anyway.)

>That can be done in a separate spec, defining a mapping from "HTTP URL" to IRI reference, and then letting the default URI/IRI rules apply.


I'm very much confused by "HTTP URL". In case that's the term that HTML5
currently uses, it should use a different one, to avoid confusion.

Actually, I wanted to say "HTML URL" (URL as used in HTML5). HTML5
really uses just the term "URL".

BR, Julian

#14
July 8th, 2008, 07:01 AM
Robert J Burns
URI/IRI vs HTML-URL, was: Why Microsoft's authoritative=true won't work and is a bad idea

On Jul 8, 2008, at 9:13 AM, Martin Duerst wrote:

As long as query URIs are interpreted based on the encoding of the
containing page, they will stay useless without that context. I.e.
they cannot (without further pain) be put into bookmark lists, they
cannot be sent in email, and so on. The only sensible way to make
this possible is to do the same as for the path part, namely use
UTF-8 for the IRI->URI conversion. Freestanding (U/I)RIs with
query parts may be less important than freestanding (U/I)RIs
without query parts, but still, they are often convenient.
However, they won't work if implemented the way HTML5 is currently
describing them. Also, same as for path parts, a fallback for query
parts in legacy encodings is still available (and was always
available): %-encoding.


Some implementations also break the fallback %-encoding by first
trying to reinterpret the %-encoding within the current document
encoding and then translating where appropriate. For example, if the
percent-encoding represents a Unicode code point that maps to the
current document encoding, the implementation uses the translated
bytes instead of the literal percent-encoded bytes. I'm not sure
whether this is an unfixable implementation error or whether we could
use HTML5 to get these implementations back on track, though.
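
To illustrate the kind of breakage described here, the following is a hypothetical model of such an implementation next to the behaviour one would expect. Both function names are invented for this sketch, and the assumption that the escapes are first read as UTF-8 is mine; real implementations may differ in the details.

```python
from urllib.parse import quote, unquote


def reinterpret_escapes(escaped, document_encoding):
    """Hypothetical model of the problematic behaviour.

    Reads existing %-escapes as UTF-8 text, then re-escapes that text
    in the document's encoding, changing the bytes sent to the server.
    """
    text = unquote(escaped, encoding="utf-8", errors="strict")
    return quote(text, safe="", encoding=document_encoding)


def pass_through(escaped, document_encoding):
    """Expected behaviour: literal %-escapes are left untouched."""
    return escaped


original = "caf%C3%A9"
mangled = reinterpret_escapes(original, "iso-8859-1")
kept = pass_through(original, "iso-8859-1")
```

Here the author asked for the bytes C3 A9, but the first function sends E9 instead, which defeats the purpose of %-encoding as an encoding-independent fallback.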


On Jul 8, 2008, at 11:20 AM, Stefan Eissing wrote:

On 08.07.2008 at 09:27, Julian Reschke wrote:
>The other issue that got a lot of discussion is whether the things
>used in HTML should be called "URL", when in reality they are
>something else.
>

Calling them HREFs (even though they also appear in other
attributes) would give everyone the right context (HTML) and topic
(URLs) without the confusion of redefining existing terms.

From the relevant RFCs the term "URL reference" already exists and is
the appropriate term for the value taken by the @href, @cite, @src and
other attributes ("URI reference" or "IRI reference" might also make
sense).

Take care,
Rob

#15
July 8th, 2008, 09:01 AM
Justin James
the "HTML URL" issue, was: Why Microsoft's authoritative=true won't work and is a bad idea

The other issue that got a lot of discussion is whether the things
used in HTML should be called "URL", when in reality they are
something else.

Calling them HREFs (even though they also appear in other attributes)
would give everyone the right context (HTML) and topic (URLs) without
the confusion of redefining existing terms.

Having nearly identical concepts is the root of this problem, not the nearly
identical names (although that does not help either). There is no need to
have a different spec for URI, IRI, and "HTTP URL", "URL reference", "HREF"
(or whatever this mystery spec is being called). There should be *one* spec
for resource locations. Period.

Besides, defining resource locators is outside the domain of HTML as far as
I am concerned.

J.Ja

#16
July 8th, 2008, 09:40 AM
Stefan Eissing
the "HTML URL" issue, was: Why Microsoft's authoritative=true won't work and is a bad idea

On 08.07.2008 at 15:55, Justin James wrote:
Having nearly identical concepts is the root of this problem, not
the nearly
identical names (although that does not help either). There is no
need to
have a different spec for URI, IRI, and "HTTP URL", "URL
reference", "HREF"
(or whatever this mystery spec is being called). There should be
*one*