I agree it's still an issue. Enterprise sysadmins should be allowed to block access to external registries, including Docker Hub. There is nothing contentious about it, and it has nothing to do with profit. If you send a properly implemented patch for it, the maintainers will merge it.
That's great news. What do you consider a "properly implemented patch"? I would think the most bulletproof & simplest patch would simply allow the default index (index.docker.io) to be reconfigured to something else in docker.conf. Would you support a patch that did that?
I think the best way is to allow a "whitelist mode" where only an explicitly specified list of URLs can be reached. Everything else would be blocked by default. This should give ops the peace of mind they need.
Note that this is an ACL change, and not a namespace change. That is important because we want image names to have the same meaning everywhere, regardless of site-specific configuration. So for example, "docker pull ubuntu" should always mean "install ubuntu from the official docker library". This is crucial to the developer experience and to respect the principle of separation of concerns between dev and ops. However, if ops chooses to block access to the standard library then "docker pull ubuntu" will fail with "access blocked by your administrator", which is totally acceptable. What we don't want is the operation silently substituting a site-specific image, without the knowledge of the end user, thereby breaking their build in a thousand invisible ways.
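To make the ACL-versus-namespace distinction concrete, here is a minimal sketch of what a "whitelist mode" check could look like. It is not Docker's implementation: `registry_host`, `check_pull`, `PullBlocked`, and the example host `registry.example.internal` are hypothetical names, and the rule for spotting a registry host in an image name is an assumption made for illustration. The point is only that a blocked pull fails loudly and is never redirected to a different image.

```python
DEFAULT_REGISTRY = "index.docker.io"  # the default index named earlier in this thread


class PullBlocked(Exception):
    """Raised when an image's registry is not on the admin's allowlist."""


def registry_host(image):
    """Return the registry host that an image name refers to.

    Assumed rule (for illustration): the first path component is a host
    only if it contains '.' or ':' or equals 'localhost'. Otherwise the
    name belongs to the default registry.
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return DEFAULT_REGISTRY


def check_pull(image, allowed_hosts):
    """Return the host to pull from, or raise PullBlocked.

    The name is never rewritten: a blocked pull is an error, not a
    silent substitution of a site-specific image.
    """
    host = registry_host(image)
    if host not in allowed_hosts:
        raise PullBlocked(
            f"{image}: access to {host} blocked by your administrator"
        )
    return host
```

With an allowlist of `{"registry.example.internal"}`, `check_pull("registry.example.internal/ubuntu", ...)` succeeds, while `check_pull("ubuntu", ...)` raises `PullBlocked` because the bare name still means the default registry.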
I hope this helps. Does this mean I should look forward to a patch from you? :)
> However, if ops chooses to block access to the standard library then "docker pull ubuntu" will fail with "access blocked by your administrator", which is totally acceptable.
Actually, it isn't, at least in production, because it forces us to rebuild a lot of images that reference the standard library, when a more reasonable approach would be to mirror them.
The index/registry/image identity problem is by far the weakest part of Docker, and what appears most attractive with Rocket, in my opinion. There are pretty much zero cases where I, in my ops role, can allow production deployments to have access to the official Docker repository, because it opens the door to pulling in all kinds of stuff that has not been vetted (e.g. referencing the "latest" images, and having that change between dev signing something off and deployment), and it creates all kinds of obnoxious failure scenarios.
At the same time, I don't want devs to have the hassle of having to repackage all the images to point them to our internal registries, when we could easily mirror the images that have been tested.
So if there's no easy way to point the default somewhere else, what we'll resort to instead is increasingly adding firewall rules to block the official registry, coupled with DNS tweaks to make *.docker.io point where we want it, or patching the code.
Or switch to Rocket once it gets more mature, if Docker continues to make custom image management more troublesome than necessary.
I totally agree that mirroring of official images should be easier, and right now it's an obstacle to easier production deployment. This is why it's important to have cryptographically signed, self-describing images. Then it becomes irrelevant where you download them from, and anyone could host a public or private mirror. I am 100% in favor of it and we are upgrading the registry system to allow it. Happy to chat more on #docker-dev.
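As a rough illustration of why content addressing makes the download location irrelevant, here is a minimal sketch that checks a downloaded blob against a digest obtained from a trusted source. It covers only the integrity half: the signature over the digest is left out, and the helper names `digest_of` and `verify_blob` are hypothetical, not part of any registry API.

```python
import hashlib
import hmac


def digest_of(blob):
    """Return a 'sha256:<hex>' content digest for the given bytes."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def verify_blob(blob, expected_digest):
    """Return the blob if it matches the expected digest, else raise.

    If the expected digest comes from a trusted (e.g. signed) source,
    it does not matter which mirror served the bytes.
    """
    actual = digest_of(blob)
    # Constant-time comparison of the two ASCII digest strings.
    if not hmac.compare_digest(actual, expected_digest):
        raise ValueError(
            f"digest mismatch: expected {expected_digest}, got {actual}"
        )
    return blob
```

A mirror that serves altered bytes would fail `verify_blob`, so an untrusted mirror can at worst deny service, not substitute content.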
> What we don't want is the operation silently substituting a site-specific image, without the knowledge of the end user, thereby breaking their build in a thousand invisible ways
But this is not preventable in any way except by explicitly pulling an image with a signing key. Who knows what index.docker.io resolves to or is serving?
I will happily provide a patch that implements the desired functionality :-)
However, I think people want the option of not using the docker registry. It is not under their control, it is slow, and it is a single point of failure. Having to recompile Docker to remove the magic constants is painful. I can see that the ACL functionality would be useful, but I don't think it addresses those concerns.
Even if you believe it is a bad idea to allow users to change the meaning of image names, you should allow users the freedom to choose their own path. I think this is a fairly basic tenet of open-source: that true innovation comes when people are allowed to try new things, even if they look like a bad idea at the time.
I totally agree that users should be given the freedom to choose their own path. For example that's why we added pluggable backends for storage and sandboxing very early, before even shipping 1.0 :)
At the same time, I care a lot about design and usability - and good design requires constraints. So there is a fundamental tension between flexibility and usability. My approach to this problem is to put usability first, and improve flexibility over time. The rationale is simple: you can always make things more flexible over time - but once usability is broken, it's almost impossible to fix.
So, to apply this to the topic at hand:
* Usability first: no matter what, don't break the predictable meaning of image names.
* Flexibility over time, step 0: allow admins to block access to the Hub. Since the standard library is only hosted on the Hub, this blocks access to the standard library. Note that you are not at all required to use the Hub. You can refer to any other registry, today, simply by "docker pull URL".
* Flexibility over time, step 1: allow hosting the standard library outside the Hub. The standard library is actually a community project, very similar to Homebrew. The fact that it can only be hosted on the Hub is a technical limitation. We should lift that limitation, so that even if you block access to the Hub, you can still access the standard library. This could be via a private mirror, community-controlled mirrors, or perhaps a bittorrent transport :)
* Flexibility over time, step 2: allow complete control over which images are downloaded from where. This means completely separating the naming of images from their transport. This allows a sysadmin to compose an image storage and transport setup that fits their needs: S3 for this namespace, high-performance NFS filer for that, public bittorrent transport for this authorized subset of public images, Hub for the QA department, etc. The key technical requirement here is to make images self-describing, and make cryptographic signing and verification mandatory.
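Step 2 above could be sketched as a routing table that maps image-name prefixes to transports while leaving the names themselves untouched. Everything here is hypothetical: the `ROUTES` table, the prefixes, the transport labels, and `transport_for` are made up for illustration and do not correspond to an existing Docker feature.

```python
# Hypothetical admin-defined routing: image-name prefix -> transport label.
# The image name keeps its meaning; only where the bytes come from changes.
ROUTES = {
    "library/": "bittorrent",  # authorized subset of public images
    "internal/": "s3",
    "builds/": "nfs",
    "qa/": "hub",
}


def transport_for(name, routes):
    """Return the transport for an image name using longest-prefix match.

    Raises LookupError when no route matches, so an unrouted name fails
    instead of falling back to an unexpected source.
    """
    matches = [prefix for prefix in routes if name.startswith(prefix)]
    if not matches:
        raise LookupError(f"no transport configured for {name!r}")
    return routes[max(matches, key=len)]
```

For example, `transport_for("internal/billing", ROUTES)` returns `"s3"`. This only stays safe if the fetched content is then verified against a signed digest, which is why the last sentence of step 2 matters.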
> * Usability first: no matter what, don't break the predictable meaning of image names.
Give up any idea that you have control over this. You don't.
As a user, I don't want you to have control over this, because it is directly contrary to my interests to not be able to redirect requests for "docker pull ubuntu" as per your example, to go somewhere I have full control over, so that I can choose to get exactly the image I expect matches the "ubuntu" image name rather than whatever the index will currently return for that name, and do so without having to change every reference to that name to something relative to my own registry.
You can make this hard, and have it continue to annoy users, or you can fix this, but this is the biggest usability problem with Docker to me the way it stands now.
I find it hard to take you seriously when you claim usability first, and then claim that there is predictable meaning of image names: that is only true if by "predictable meaning of image names" you mean "I don't really know what I'll get". This is especially true because of the lax use of tags all over the place.
The irony is that your "key technical requirement" for "flexibility over time, step 2" is far more important with the current situation than it actually would have been if we had the flexibility you describe for that step:
If Docker provided that flexibility today, then I could easily have complete control over exactly which images get accessed if I wanted to. Instead I need to resort to firewall rules and/or DNS hacks or changing the source if I want to prevent accidentally pulling in images different from what I expect.
> this is the biggest usability problem with Docker to me the way it stands now
> You can make this hard, and have it continue to annoy users
I have to respectfully disagree. From my experience talking to many, many Docker users, this is definitely not a usability issue. To achieve what you want, literally all you have to do is give an explicit name to your own Ubuntu image, under a DNS name you control. "docker pull mydomain.tld/ubuntu" will work out of the box, and you have full control over what goes in there. The overwhelming majority of Docker users have no problem with that. It's exactly how the Golang packaging system works, for example. It's also similar to the jar naming system with its mandatory reverse-dns notation. With this system you get a consistent user experience across all Docker installations, and you get flexibility. Can you point to a specific usage scenario where you found yourself stuck because of this design?
So, I don't believe you are actually complaining, as a user, about a usability issue. I believe you are annoyed, as a fellow domain expert, that we designed it differently than you would have. I respect the fact that there were different ways to approach the problem of naming and discovering images. We made a design decision, and so far the users seem to agree with it.
There is also a usability benefit of this design which is not obvious, but has a huge impact. If we allowed fragmentation of the namespace, the first thing that would happen is that every OS vendor and every cloud provider would start shipping Docker with a modified configuration, to override the meaning of "ubuntu" to mean "ubuntu in my walled garden app store". I know this because all of them are busy pressuring us into doing it. If they had their way, not only would it not solve any usability problems, it would fundamentally break the experience of using Docker because "ubuntu" would depend on which walled garden your particular machine is running in. This would fundamentally destroy the value of Docker, which is interoperability.
This doesn't mean I'm happy with the usability of Docker's image distribution in general. There are definitely issues which I look forward to fixing. For example, image signature should be mandatory. All layers should be content-addressed. It should be easier to extend a registry to customize authentication. And the standard library images should be easier to mirror.
EDIT: in another comment you mention the specific problem of mirroring official images in production. I agree that is a real usability issue for ops. We will fix it. But I think it's orthogonal to what you suggest here, which is freedom to fragment the namespace (which I believe is not a good thing).
The doublespeak here is that you say you would welcome a patch if it was "properly implemented", but then it turns out you mean "if it does ACLs and does not replace the default registry". You claim this is for the good of the users, but actually users are asking for the ability to replace the registry, and the primary beneficiary of this policy is the Docker Registry. You only want to disallow access to the registry if it provides an error message that places the blame on the administrator, so that in practice their users demand they turn it back on.
Anyway, I think it's clear that the patch that users want isn't welcome, so I won't be wasting any further time on it.
I think the problem is that we are talking about 2 different things. You want to access a particular piece of content without being forced to connect to a particular server. I want to avoid the same name designating completely different pieces of content depending on factors outside the control of the end user.
These are both good goals. It's possible to reach both. I just want to make sure we don't sacrifice one for the other.
> the primary beneficiary of this policy is the Docker Registry
That makes no sense. When you download an official image from Docker Hub's servers, you are not charged in any way, and you are not required to create an account. The hosting and bandwidth costs are enormous, and there is no business benefit. The only reason the Hub hosts these images is because it improves the experience of using Docker, which in turn creates a larger market of Docker users to sell various services to. It is absolutely in the company's interest to allow for mirroring of the standard library, so that the burden of storing and distributing it is spread out across the ecosystem, and the company can focus more resources on things it can actually sell. It is also in the community's interest, because official images maintained by open-source maintainers shouldn't become unavailable if Docker Hub goes down, for example.
> I think it's clear that the patch that users want isn't welcome
Respectfully, it would be more intellectually honest of you to talk about the patch that you want. Just because you happen to have a soapbox on this forum doesn't make you a representative of "the Docker users", and it doesn't bless you with any particular insight on what they want collectively. That is a rather large group of people.