Re: RFC question

by "Bryan Bateman" <batemanb(at)home.com>
Date:	Fri, 8 Dec 2000 14:08:15 -0000
To:	<John.ksi(at)webplus.net>, <hwg-servers(at)hwg.org>
References:	webplus
What you are referring to is the difference between relative and absolute
URIs.  That is addressed by RFC 1808; a copy of it, and of all the other
RFCs, can be found here:

ftp://ftp.sunsite.utk.edu/pub/rfc/rfc1808.txt

The trailing slash is resolved as part of an absolute path and (I believe,
but don't quote me) does not follow the index setting in the server config.

There are workarounds, but I would have to do some digging for them.

<!------------------------ Snippets from the above RFC ------------------------>

2.4.6.  Parsing the Path

   After the above steps, all that is left of the parse string is the
   URL <path> and the slash "/" that may precede it.  Even though the
   initial slash is not part of the URL path, the parser must remember
   whether or not it was present so that later processes can
   differentiate between relative and absolute paths.  Often this is
   done by simply storing the preceding slash along with the path.
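
(Not part of the RFC text.)  As a rough illustration of the paragraph
above, a parser might record whether the leading slash was present before
splitting the rest of the path into segments.  The helper name split_path
is hypothetical and only the path portion of a URL is handled.

```python
def split_path(path_part):
    """Return (is_absolute, segments) for the path portion of a URL.

    Hypothetical helper: remembers whether a leading "/" was present,
    then splits whatever follows it into segments.
    """
    is_absolute = path_part.startswith("/")
    rest = path_part[1:] if is_absolute else path_part
    segments = rest.split("/") if rest else []
    return is_absolute, segments


print(split_path("/yadda/"))  # leading slash present, trailing empty segment
print(split_path("yadda"))    # no leading slash, so the path is relative
```

The trailing slash shows up as an empty final segment, which is why
"/yadda" and "/yadda/" are different paths to a resolver.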

4.  Resolving Relative URLs

   This section describes an example algorithm for resolving URLs within
   a context in which the URLs may be relative, such that the result is
   always a URL in absolute form.  Although this algorithm cannot
   guarantee that the resulting URL will equal that intended by the
   original author, it does guarantee that any valid URL (relative or
   absolute) can be consistently transformed to an absolute form given a
   valid base URL.

   The following steps are performed in order:

   Step 1: The base URL is established according to the rules of
           Section 3.  If the base URL is the empty string (unknown),
           the embedded URL is interpreted as an absolute URL and
           we are done.

   Step 2: Both the base and embedded URLs are parsed into their
           component parts as described in Section 2.4.

           a) If the embedded URL is entirely empty, it inherits the
              entire base URL (i.e., is set equal to the base URL)
              and we are done.

           b) If the embedded URL starts with a scheme name, it is
              interpreted as an absolute URL and we are done.

           c) Otherwise, the embedded URL inherits the scheme of
              the base URL.

   Step 3: If the embedded URL's <net_loc> is non-empty, we skip to
           Step 7.  Otherwise, the embedded URL inherits the <net_loc>
           (if any) of the base URL.

   Step 4: If the embedded URL path is preceded by a slash "/", the
           path is not relative and we skip to Step 7.

   Step 5: If the embedded URL path is empty (and not preceded by a
           slash), then the embedded URL inherits the base URL path,
           and

           a) if the embedded URL's <params> is non-empty, we skip to
              step 7; otherwise, it inherits the <params> of the base
              URL (if any) and

           b) if the embedded URL's <query> is non-empty, we skip to
              step 7; otherwise, it inherits the <query> of the base
              URL (if any) and we skip to step 7.

   Step 6: The last segment of the base URL's path (anything
           following the rightmost slash "/", or the entire path if no
           slash is present) is removed and the embedded URL's path is
           appended in its place.  The following operations are
           then applied, in order, to the new path:

           a) All occurrences of "./", where "." is a complete path
              segment, are removed.

           b) If the path ends with "." as a complete path segment,
              that "." is removed.

           c) All occurrences of "<segment>/../", where <segment> is a
              complete path segment not equal to "..", are removed.
              Removal of these path segments is performed iteratively,
              removing the leftmost matching pattern on each iteration,
              until no matching pattern remains.

           d) If the path ends with "<segment>/..", where <segment> is a
              complete path segment not equal to "..", that
              "<segment>/.." is removed.

   Step 7: The resulting URL components, including any inherited from
           the base URL, are recombined to give the absolute form of
           the embedded URL.
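
(Not part of the RFC text.)  Steps 4 through 6 can be sketched for the
path component alone as follows.  This is a minimal sketch, not a full
resolver: the function name resolve_path is hypothetical, and scheme,
<net_loc>, <params>, <query> and fragment handling are left out.

```python
def resolve_path(base_path, embedded_path):
    """Sketch of Steps 4-6 of the algorithm, path component only."""
    # Step 4: a leading slash means the embedded path is not relative.
    if embedded_path.startswith("/"):
        return embedded_path
    # Step 5 (path part only): an empty path inherits the base path.
    if embedded_path == "":
        return base_path
    # Step 6: drop the last segment of the base path (everything after
    # the rightmost "/") and append the embedded path in its place.
    merged = base_path[: base_path.rfind("/") + 1] + embedded_path
    rooted = merged.startswith("/")
    segs = (merged[1:] if rooted else merged).split("/")

    # 6a and 6b: remove "." segments; a final "." leaves a trailing slash.
    cleaned = []
    for i, seg in enumerate(segs):
        if seg != ".":
            cleaned.append(seg)
        elif i == len(segs) - 1:
            cleaned.append("")
    segs = cleaned

    # 6c: remove the leftmost "<segment>/../" until none remains.
    i = 0
    while i < len(segs) - 2:
        if segs[i] != ".." and segs[i + 1] == "..":
            del segs[i : i + 2]
            i = 0  # start again from the left
        else:
            i += 1

    # 6d: a trailing "<segment>/.." is removed, leaving a trailing slash.
    if len(segs) >= 2 and segs[-1] == ".." and segs[-2] != "..":
        segs[-2:] = [""]

    return ("/" if rooted else "") + "/".join(segs)


# The two base paths from the original question behave differently:
print(resolve_path("/yadda", "page.html"))
print(resolve_path("/yadda/", "page.html"))
```

Without the trailing slash, "yadda" is the last segment of the base path
and is replaced; with it, "yadda" is kept as a directory.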

   Parameters, regardless of their purpose, do not form a part of the
   URL path and thus do not affect the resolving of relative paths.  In
   particular, the presence or absence of the ";type=d" parameter on an
   ftp URL does not affect the interpretation of paths relative to that
   URL.  Fragment identifiers are only inherited from the base URL when
   the entire embedded URL is empty.
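
(Not part of the RFC text.)  For whole URLs, the steps can be tried out
with Python's standard urllib.parse.urljoin.  One caveat: urljoin follows
the later RFC 3986 rules rather than RFC 1808 itself, so treat this as an
approximation; for simple cases like the ones below I would expect the two
to agree.  The embedded URLs are made up for illustration.

```python
from urllib.parse import urljoin

base = "http://www.someplace.com/yadda/"

# Step 2a: an empty embedded URL inherits the entire base URL.
print(urljoin(base, ""))

# Step 2b: an embedded URL with its own scheme is already absolute.
print(urljoin(base, "ftp://ftp.sunsite.utk.edu/pub/rfc/rfc1808.txt"))

# Step 3: a non-empty <net_loc> inherits only the scheme of the base.
print(urljoin(base, "//help.netscape.com/kb/"))

# Step 4: a path preceded by a slash replaces the base path.
print(urljoin(base, "/index.html"))

# Step 6: a relative path is merged with the base path, so the
# trailing slash on the base changes the result.
print(urljoin(base, "page.html"))
print(urljoin("http://www.someplace.com/yadda", "page.html"))
```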


----- Original Message -----
From: <John.ksi(at)webplus.net>
To: <hwg-servers(at)hwg.org>
Sent: Friday, December 08, 2000 4:26 PM
Subject: RFC question


> I got into a conversation with another webmaster.
> He's got a URL like http://www.someplace.com/yadda
> which works, yet http://www.someplace.com/yadda/
> does not.  (Note the trailing slash and the lack of
> a filename extension.)
>
> At http://help.netscape.com/kb/corporate/19960513-120.html
> I read "'http://server/directory' is not technically
> valid..." (altho' most handle this anyway).  But to
> FORCE the omission of the trailing slash?  I don't
> understand.
>
> Does an RFC address this?  If so, where?
>
> -John
>
HWG: hwg-servers mailing list archives, maintained by Webmasters @ IWA