72 changes: 58 additions & 14 deletions url.bs
@@ -123,6 +123,14 @@ valid input. User agents, especially conformance checkers, are encouraged to rep
<p>"<code>https://exa%23mple.org</code>"
</div>
<td class=yes>Yes
<tr>
<td><dfn>domain-non-strict</dfn>
<td>
<p><a abstract-op lt=ToASCII>Unicode ToASCII</a> records an error or returns the empty string when
<i>CheckHyphens</i>, <i>UseSTD3ASCIIRules</i>, and <i>VerifyDnsLength</i> are all set to true.
[[UTS46]]
<p class=example id=example-domain-non-strict>"<code>https://_dmarc.example.com/</code>"
<td class=no>·
<tr>
<td><dfn>domain-to-Unicode</dfn>
<td>
@@ -132,6 +140,14 @@ valid input. User agents, especially conformance checkers, are encouraged to rep
<tbody>
<tr>
<th colspan=3 scope=rowgroup><a href=#host-parsing>Host parsing</a>
<!-- domain host -->
<tr>
<td><dfn>domain-percent-encoded</dfn>
<td>
<p>The input's <a for=/>host</a>, which is to be processed as a domain, contains a
<a>percent-encoded byte</a>.
<p class=example id=example-domain-percent-encoded>"<code>https://exam%70le.org</code>"
<td class=no>·
<!-- opaque-host parser -->
<tr>
<td><dfn>host-invalid-code-point</dfn>
@@ -912,19 +928,10 @@ concepts.
<var>domain</var> and a boolean <var>beStrict</var>, runs these steps:

<ol>
<li>
<p>Let <var>result</var> be the result of running <a abstract-op lt=ToASCII>Unicode ToASCII</a>
with <i>domain_name</i> set to <var>domain</var>, <i>CheckHyphens</i> set to <var>beStrict</var>,
<i>CheckBidi</i> set to true, <i>CheckJoiners</i> set to true, <i>UseSTD3ASCIIRules</i> set to
<var>beStrict</var>, <i>Transitional_Processing</i> set to false, <i>VerifyDnsLength</i> set to
<var>beStrict</var>, and <i>IgnoreInvalidPunycode</i> set to false. [[!UTS46]]

<p class=note>If <var>beStrict</var> is false, <var>domain</var> is an <a>ASCII string</a>, and
<a>strictly splitting</a> <var>domain</var> on U+002E (.) does not produce any
<a for=list>item</a> that <a for=string>starts with</a> an <a>ASCII case-insensitive</a> match for
"<code>xn--</code>", this step is equivalent to <a>ASCII lowercasing</a> <var>domain</var>.
<li><p>Let <var>result</var> be the result of running <a>domain to ASCII internal</a> with
<var>domain</var> and <var>beStrict</var>.

<li><p>If <var>result</var> is a failure value, <a>domain-to-ASCII</a> <a>validation error</a>,
<li><p>If <var>result</var> is failure, <a>domain-to-ASCII</a> <a>validation error</a>,
return failure.

<li>
@@ -942,6 +949,9 @@ concepts.
<a>forbidden domain code points</a> are a subset of those disallowed when
<i>UseSTD3ASCIIRules</i> is true. See also
<a href="https://github.com/whatwg/url/issues/397">issue #397</a>.

<li><p>If the result of running <a>domain to ASCII internal</a> with <var>domain</var> and true
is failure, <a>domain-non-strict</a> <a>validation error</a>.
</ol>

<li>
@@ -959,6 +969,29 @@ concepts.
<code>☕.example</code> becomes <code>xn--53h.example</code> and not failure. [[UTS46]] [[RFC5890]]
</div>
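
Reviewer sketch (not part of url.bs): the new <a>domain-non-strict</a> step reports an error without changing the return value. The Python below models that control flow only. The `internal` callable, `toy_internal`, and the `errors` list are hypothetical stand-ins, and the steps hidden in the collapsed context (the empty-string and forbidden-code-point checks) are deliberately left out.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in for "domain to ASCII internal":
# returns the ASCII domain, or None to signal failure.
Internal = Callable[[str, bool], Optional[str]]


def domain_to_ascii(domain: str, be_strict: bool, internal: Internal,
                    errors: List[str]) -> Optional[str]:
    """Model of the wrapper: a fatal error on failure, a non-fatal report otherwise."""
    result = internal(domain, be_strict)
    if result is None:
        errors.append("domain-to-ASCII")
        return None
    # Steps from the collapsed diff context are omitted in this sketch.
    # New step: re-run in strict mode purely to decide whether to report.
    if internal(domain, True) is None:
        errors.append("domain-non-strict")
    return result


def toy_internal(domain: str, be_strict: bool) -> Optional[str]:
    """Toy model, NOT UTS46: strict mode rejects any underscore."""
    if domain == "":
        return None
    if be_strict and "_" in domain:
        return None
    return domain.lower()
```

The sketch runs the strict check unconditionally. With a deterministic stand-in this gives the same report as running it only in the non-strict case, because a strict first call that succeeded will succeed again.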

<div algorithm>
<p>The <dfn>domain to ASCII internal</dfn> algorithm, given a <a for=/>string</a>
<var>domain</var> and a boolean <var>beStrict</var>, runs these steps:

<ol>
<li>
<p>Let <var>result</var> be the result of running <a abstract-op lt=ToASCII>Unicode ToASCII</a>
with <i>domain_name</i> set to <var>domain</var>, <i>CheckHyphens</i> set to <var>beStrict</var>,
<i>CheckBidi</i> set to true, <i>CheckJoiners</i> set to true, <i>UseSTD3ASCIIRules</i> set to
<var>beStrict</var>, <i>Transitional_Processing</i> set to false, <i>VerifyDnsLength</i> set to
<var>beStrict</var>, and <i>IgnoreInvalidPunycode</i> set to false. [[!UTS46]]

<p class=note>If <var>beStrict</var> is false, <var>domain</var> is an <a>ASCII string</a>, and
<a>strictly splitting</a> <var>domain</var> on U+002E (.) does not produce any
<a for=list>item</a> that <a for=string>starts with</a> an <a>ASCII case-insensitive</a> match for
"<code>xn--</code>", this step is equivalent to <a>ASCII lowercasing</a> <var>domain</var>.

<li><p>If <var>result</var> is a failure value, then return failure.

<li><p>Return <var>result</var>.
</ol>
</div>
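
Reviewer sketch (not part of url.bs): the note in the first step describes a fast path for the non-strict case. A minimal Python illustration of that condition follows. The function name `ascii_fast_path` is hypothetical, and returning `None` only means "the fast path does not apply, run the full Unicode ToASCII operation instead".

```python
from typing import Optional


def ascii_fast_path(domain: str) -> Optional[str]:
    """Return the ASCII-lowercased domain when the note's conditions hold, else None."""
    # Condition 1: the domain must be an ASCII string.
    if not domain.isascii():
        return None
    # Condition 2: no label may start with "xn--" (ASCII case-insensitive).
    # str.split(".") keeps empty items, matching a strict split on U+002E.
    for label in domain.split("."):
        if label[:4].lower() == "xn--":
            return None
    # For an ASCII string, str.lower() only affects A-Z.
    return domain.lower()
```

This covers only the beStrict-is-false case named in the note. With beStrict set to true the extra checks still have to run.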

<div algorithm>
<p>The <dfn id=concept-domain-to-unicode>domain to Unicode</dfn> algorithm, given a <a>domain</a>
<var>domain</var> and a boolean <var>beStrict</var>, runs these steps:
@@ -1045,6 +1078,9 @@ false), and then runs these steps. They return failure or a <a for=/>host</a>.

<li><p>Assert: <var>input</var> is not the empty string.

<li><p>If <var>input</var> contains a <a>percent-encoded byte</a>,
<a>domain-percent-encoded</a> <a>validation error</a>.

<li>
<p>Let <var>domain</var> be the result of running <a>UTF-8 decode without BOM</a> on the
<a for=string>percent-decoding</a> of <var>input</var>.
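
Reviewer sketch (not part of url.bs): the added step reports <a>domain-percent-encoded</a> and then parsing continues with percent-decoding as before. The Python below illustrates those two steps, assuming a percent-encoded byte is U+0025 (%) followed by two ASCII hex digits. The function names and the `errors` list are hypothetical, and `urllib.parse.unquote_to_bytes` stands in for the spec's percent-decode.

```python
import re
from typing import List, Tuple
from urllib.parse import unquote_to_bytes

# "%" followed by exactly two ASCII hex digits.
_PERCENT_ENCODED_BYTE = re.compile(r"%[0-9A-Fa-f]{2}")


def contains_percent_encoded_byte(text: str) -> bool:
    return _PERCENT_ENCODED_BYTE.search(text) is not None


def decode_host_input(host_input: str) -> Tuple[str, List[str]]:
    """Report the non-fatal validation error, then percent-decode and UTF-8 decode."""
    errors: List[str] = []
    if contains_percent_encoded_byte(host_input):
        errors.append("domain-percent-encoded")
    # The "utf-8" codec does not strip a leading BOM (unlike "utf-8-sig").
    domain = unquote_to_bytes(host_input).decode("utf-8", errors="replace")
    return domain, errors
```

The error does not stop parsing: `exam%70le.org` still decodes to `example.org`, it is only flagged.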
@@ -1650,10 +1686,15 @@ unified model would be, please file an issue.
<td>✅
<td><code>file:///C:/</code>
<tr>
<td><code>file://loc%61lhost/</code>
<td><code>file://localhost/</code>
<td>
<td>✅
<td><code>file:///</code>
<tr>
<td><code>file://loc%61lhost/</code>
<td>
<td>❌
<td><code>file:///</code>
<tr>
<td><code>https://user:password@example.org/</code>
<td>
@@ -1987,7 +2028,8 @@ an <a>absolute-URL string</a>, optionally followed by U+0023 (#) and a <a>URL-fr
<a>special scheme</a> and not an <a>ASCII case-insensitive</a> match for "<code>file</code>",
followed by U+003A (:) and a <a>scheme-relative-special-URL string</a>
<li><p>a <a>URL-scheme string</a> that is <em>not</em> an <a>ASCII case-insensitive</a> match for a
<a>special scheme</a>, followed by U+003A (:) and a <a>relative-URL string</a>
<a>special scheme</a>, followed by U+003A (:) and one of: a <a>scheme-relative-URL string</a>, a
<a>path-absolute-URL string</a>, or zero or more <a>URL units</a>
<li><p>a <a>URL-scheme string</a> that is an <a>ASCII case-insensitive</a> match for
"<code>file</code>", followed by U+003A (:) and a <a>scheme-relative-file-URL string</a>
</ul>
@@ -2943,6 +2985,8 @@ and then runs these steps:
<p>Otherwise, if <a>c</a> is U+0020 SPACE:

<ol>
<li><p><a>Invalid-URL-unit</a> <a>validation error</a>.

<li><p>If <a>remaining</a> starts with U+003F (?) or U+0023 (#), then append
"<code>%20</code>" to <var>url</var>'s <a for=url>path</a>.
