Skip to content

Fix escaping of '#' in parameter values and handling of unescaped '#' in uri#26

Open
nothingmuch wants to merge 4 commits into
Kixunil:masterfrom
nothingmuch:escaping-rules
Open

Fix escaping of '#' in parameter values and handling of unescaped '#' in uri#26
nothingmuch wants to merge 4 commits into
Kixunil:masterfrom
nothingmuch:escaping-rules

Conversation

@nothingmuch
Copy link
Copy Markdown

The first two commits add test assertions for existing behavior.

The next commit adds a failing test for parameter values containing '#', which should be escaped. The commit after that addresses this by specifying a more precise character set, causing the test to pass and should therefore be squashed into it before merge to preserve bisectability. For ease of review they were added separately.

The last three commits add two additional failing tests for correct handling of fragment (unescaped #) followed by a fix commit, and should also be squashed. This set of changes might be considered incomplete without a capability to parse RFC 3986 fragments analogous to the Extras mechanism, but since this is not specified in BIP 21 arguably this data should be extracted using the url crate.

@DanGould
Copy link
Copy Markdown
Contributor

@Kixunil, this helps us out in @payjoin land where we're using an (escaped) '#' fragment inside the pj query parameter

@DanGould
Copy link
Copy Markdown
Contributor

We thought we could work around it, but it turns out this change is actually vital to shipping a clean API for a stable Payjoin v2 version. Any way we can get prioritized review on this? Thanks in advance.

payjoin/rust-payjoin#373

Copy link
Copy Markdown
Contributor

@DanGould DanGould left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK f50ebb5

I checked that each commit behaves as advertised.

Would you like to squash failing tests as your commit messages suggest, or shall we proceed as-is?

'#'' is not in the set qchar indirectly defined in BIP 21, and therefore
should be escaped.

[BIP 21](https://github.com/bitcoin/bips/blob/master/bip-0021.mediawiki#abnf-grammar):

>     labelparam     = "label=" *qchar
>     messageparam   = "message=" *qchar
>     otherparam     = qchar *qchar [ "=" *qchar ]
...
> Here, "qchar" corresponds to valid characters of an RFC 3986 URI query
> component, excluding the "=" and "&" characters, which this BIP takes
> as separators.

[RFC 3986 § 3.4](https://www.rfc-editor.org/rfc/rfc3986#section-3.4):

> The query component is indicated by the first question mark ("?")
> character and terminated by a number sign ("#") character or by the
> end of the URI.

[RFC 3986 Appendix A](https://www.rfc-editor.org/rfc/rfc3986#appendix-A):

>     pchar         = unreserved / pct-encoded / sub-delims / ":" / "@"
>     query         = *( pchar / "/" / "?" )
...
>     pct-encoded   = "%" HEXDIG HEXDIG
>     unreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~"
...
>     sub-delims    = "!" / "$" / "&" / "'" / "(" / ")"
>                   / "*" / "+" / "," / ";" / "="
Although behavior for when encountering RFC 3986 fragments in BIP 21
URIs is not specified, according to RFC 3986 it is unambiguously not
query data and therefore should be excluded from BIP 21 query
parameters.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants