So? In any decent programming environment, checking if a string is a URL is a on...

dkdbejwi383 · on Nov 10, 2021

Not without false-positives.

E.g., is "foo.bar" a URL? Maybe. But it could also be a filename. How do you know if it's a "real" URL or not?

toyg · on Nov 10, 2021

In The Good Ol' Times, I would have replied "let's check the TLD", but now that list is basically trending to include the entire English dictionary... so I guess the only response these days is "ask DNS". So we've already gone from "pattern-match a string" to "pattern-match then make network calls", which (as anyone who's done any network work knows) also requires managing a bunch of possible/likely error states (offline, timeout, partial response, unexpected response format, and so on). So yeah, nothing is as easy as it looks.
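
To make the "bunch of error states" concrete, here is a rough Python sketch of what "ask DNS" might turn into. The names `Lookup` and `ask_dns` are made up for illustration, and the grouping of error codes is an assumption rather than a complete treatment; the resolver is injectable so the logic can be exercised without touching the network.

```python
import socket
from enum import Enum


class Lookup(Enum):
    RESOLVES = "resolves"
    NO_SUCH_HOST = "no such host"
    TIMEOUT = "timeout"
    NETWORK_ERROR = "network error"
    NOT_A_HOSTNAME = "not a hostname"


def ask_dns(hostname, resolver=socket.getaddrinfo):
    """Classify one lookup. `resolver` is injectable so tests stay offline."""
    try:
        resolver(hostname, None)
    except socket.gaierror as exc:
        # EAI_NONAME means the resolver answered "no such name"; other
        # codes (e.g. temporary failure) are lumped in with network trouble.
        if exc.errno == socket.EAI_NONAME:
            return Lookup.NO_SUCH_HOST
        return Lookup.NETWORK_ERROR
    except TimeoutError:
        # Only reachable if the resolver itself enforces a deadline.
        return Lookup.TIMEOUT
    except UnicodeError:
        # Raised when the text cannot be encoded as a hostname at all.
        return Lookup.NOT_A_HOSTNAME
    except OSError:
        return Lookup.NETWORK_ERROR
    return Lookup.RESOLVES
```

Note that `socket.getaddrinfo` takes no timeout argument, so a real version would still need its own deadline (a worker thread, an async resolver, or similar) on top of this.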

AstralStorm · on Nov 10, 2021

Can't quite do this most of the time due to privacy concerns. Leaking random URL-looking text to the network is a big no.

toyg · on Nov 10, 2021

So now we have to ask for user consent (installation time? first run?) and respond accordingly, adding another piece of UI... but it will only take an hour, right...?

dkdbejwi383 · on Nov 10, 2021

You probably also want a setting to detect slow networks and disable it there in case the user is tethering etc.

quesera · on Nov 10, 2021

Strictly speaking, a URL begins with a scheme followed by a colon.

Schemes can be registered with IANA (or not), and everyone knows the most common half-dozen or so. People often forget "mailto:" and "tel:".
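
As a minimal sketch of the "scheme followed by a colon" rule, assuming the RFC 3986 grammar for scheme names (a letter, then letters, digits, "+", "-" or "."), something like the following would do. The helper name `scheme_of` is hypothetical.

```python
import re

# RFC 3986: scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def scheme_of(text):
    """Return the lowercased scheme if `text` starts with one, else None."""
    match = SCHEME_RE.match(text)
    return match.group(1).lower() if match else None


print(scheme_of("foo.bar"))               # None
print(scheme_of("mailto:a@example.com"))  # mailto
print(scheme_of("tel:+15551234567"))      # tel
print(scheme_of("localhost:8080"))        # localhost
```

The last case shows why the strict rule alone is not enough in practice: "localhost:8080" is syntactically a scheme plus a colon, so you end up needing an allowlist of schemes or some other heuristic anyway.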

The project brief asks for one thing, but the practical implementation probably requires something else.

This is a good lesson to learn, and this is how two hours become two days, then two weeks.

tremon · on Nov 10, 2021

It isn't a URL. As per RFC 3986 [0]:

> The term "Uniform Resource Locator" (URL) refers to the subset of URIs that [..] provide a means of locating the resource by describing its primary access mechanism

Since "foo.bar" does not describe an access mechanism, it is not a URL. Yes, you could make the argument that "foo.bar" is a relative-path reference as described in section 4.2, but that is only used to:

> express a URI reference relative to the name space of another hierarchical URI

So "foo.bar" can only be considered a URL in the context of another given URL, and in your example there is none.
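
To illustrate with Python's standard `urllib.parse.urljoin` (the base URL here is an invented example, not anything from the thread): the relative reference only becomes something locatable once a base is supplied.

```python
from urllib.parse import urljoin

# Hypothetical base URL, chosen only for illustration.
base = "https://example.com/docs/index.html"

# With a base, "foo.bar" resolves to a sibling of index.html.
print(urljoin(base, "foo.bar"))  # https://example.com/docs/foo.bar

# Without a base there is nothing to resolve against; it stays as-is.
print(urljoin("", "foo.bar"))    # foo.bar
```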

[0] https://datatracker.ietf.org/doc/html/rfc3986#section-1.1.3

marcellus23 · on Nov 10, 2021

I don't have to worry about that, because I'll pick a language that offers a `URL` object or something similar, and which handles the validation for me.

Additionally, if foo.bar were a valid URL, then I would expect it to appear on the list. I can't read the user's mind as to whether the text should be treated as a URL or not.
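
For what it's worth, here is roughly what that looks like in Python, where `urllib.parse.urlsplit` stands in for the `URL` object. This is a sketch under the assumption that "valid" means "has both a scheme and an authority"; the helper name `is_absolute_url` is made up, and the standard parser is fairly lenient, so it is not a full validator.

```python
from urllib.parse import urlsplit


def is_absolute_url(text):
    """True if the parser finds both a scheme and a network location."""
    try:
        parts = urlsplit(text)
    except ValueError:
        # urlsplit rejects some malformed input, e.g. an unclosed IPv6 bracket.
        return False
    return bool(parts.scheme and parts.netloc)


print(is_absolute_url("foo.bar"))           # False
print(is_absolute_url("https://foo.bar/"))  # True
```

Under this definition "foo.bar" is rejected, but so are "mailto:" and "tel:" URLs, since they carry no authority part. Which way to lean is a product decision, not something the parser settles.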