I've said this at other companies I've worked at: datacenter and enterprise software distribution is underserved by P2P.
I wish more companies used BitTorrent internally for distributing tools and software. I feel like homogeneous environments, like those that benefit from centralized config management such as Puppet, would probably see performance gains from introducing P2P technologies.
This looks like a great contribution from Uber engineering and I look forward to playing with it!
This is partially why I built chihaya[0]. The idea is that with a powerful middleware model for traditional BitTorrent software, devs can easily extend the protocol for whatever they need internally. As a sibling comment mentions, an older fork of Chihaya is one of the tools used at FB for distribution in their orchestration system, Tupperware.
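For a flavor of what that middleware model enables, here is a minimal sketch of an announce hook that only serves torrents published by an internal pipeline. The Hook interface and types below are illustrative stand-ins, not chihaya's actual API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// AnnounceRequest and AnnounceResponse are simplified stand-ins for a
// tracker's announce types; real projects define much richer versions.
type AnnounceRequest struct {
	InfoHash string
	PeerIP   string
}

type AnnounceResponse struct {
	Peers []string
}

// Hook is what a middleware slot might look like: each hook gets to
// inspect or reject an announce before the tracker answers it.
type Hook interface {
	HandleAnnounce(ctx context.Context, req *AnnounceRequest, resp *AnnounceResponse) error
}

// whitelistHook rejects announces for torrents that were not published
// through the internal build pipeline.
type whitelistHook struct {
	allowed map[string]bool // infohash -> permitted
}

func (h *whitelistHook) HandleAnnounce(ctx context.Context, req *AnnounceRequest, resp *AnnounceResponse) error {
	if !h.allowed[req.InfoHash] {
		return errors.New("unknown infohash: refusing announce")
	}
	return nil
}

func main() {
	var hook Hook = &whitelistHook{allowed: map[string]bool{"deadbeef": true}}

	err := hook.HandleAnnounce(context.Background(),
		&AnnounceRequest{InfoHash: "deadbeef", PeerIP: "10.0.0.5"},
		&AnnounceResponse{})
	fmt.Println("announce for deadbeef:", err) // nil: allowed
}
```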
In practice, internal networks are such a mess that leveraging p2p is often more trouble than it's worth.
This is absolutely great if it fits your case, but otherwise maybe not the best time investment. We happen to do event-based stuff at $work, with short-running servers. Deploys go to several AWS regions, coordinated from servers in Europe. BitTorrent is really amazing for that sort of thing.
On this note, we’ve been happily using Resilio (formerly BTSync) here at QuasarDB, as an alternative to Dropbox and the likes. It’s P2P and without a central cloud, and serves us very well.
Enterprises want to consume software dependencies they can trust. In a peer-to-peer model, anybody can publish anything. The value in having some sort of intermediary (e.g., Artifactory) is that it can act as a gateway before vulnerable dependencies make it into your build pipeline.
> In a peer-to-peer model, anybody can publish anything.
That doesn't have to be the case. You can certainly have private peer-to-peer networks that are authenticated and/or restricted to your internal network.
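To illustrate (a minimal sketch, not any real tracker's code), a private tracker can simply refuse announces coming from outside the internal address range; the /announce path and the 10.0.0.0/8 subnet are just placeholders:

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
)

// A sketch of restricting a tracker's announce endpoint to an internal
// subnet, so only hosts inside the network can participate as peers.
func main() {
	_, internal, err := net.ParseCIDR("10.0.0.0/8")
	if err != nil {
		log.Fatal(err)
	}

	http.HandleFunc("/announce", func(w http.ResponseWriter, r *http.Request) {
		host, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil || !internal.Contains(net.ParseIP(host)) {
			http.Error(w, "announces accepted from the internal network only", http.StatusForbidden)
			return
		}
		// Hand off to the real tracker logic here.
		fmt.Fprintln(w, "ok")
	})

	log.Fatal(http.ListenAndServe(":6969", nil))
}
```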
Google did it even before that and it's closer to a P2P model than Kraken (all Borg nodes participate, IIRC). By 2010, quite a few services built on top of Borg/MPM with the feature enabled had already undergone SAS 70 (later SOC / SSAE) audits. I know because one of my teams had to work on compliance. So there's nothing about this that is intrinsically not fit for the enterprise or to meet regulations. You have to do the legwork to meet the auditors' requests, of course.
But torrent has built-in integrity checking. You should still check signatures on packages. Your distribution and storage media should be totally independent of the packages you choose to use and how you validate them.
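That split, transport-level integrity from the torrent's piece hashes plus an independent check of the artifact you actually asked for, can be as simple as re-hashing the downloaded blob against the digest your build pipeline published. A sketch; the file name and digest below are placeholders:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"log"
	"os"
)

// verifyBlob re-hashes a downloaded file and compares it to the digest
// recorded by the publisher (e.g. the sha256 in an image manifest or a
// signed checksum file). This is independent of how the bytes arrived.
func verifyBlob(path, wantHex string) error {
	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()

	h := sha256.New()
	if _, err := io.Copy(h, f); err != nil {
		return err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if got != wantHex {
		return fmt.Errorf("digest mismatch: got %s, want %s", got, wantHex)
	}
	return nil
}

func main() {
	// Placeholder path and digest; in practice these come from your
	// registry manifest or a signed checksum file.
	if err := verifyBlob("layer.tar.gz", "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"); err != nil {
		log.Fatal(err)
	}
	fmt.Println("blob verified")
}
```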
I think the idea is that enterprises would still prefer to offload that responsibility to a third-party provider. I agree with what you're saying: from a risk management perspective, you shouldn't trust that your devs will pull in safe third-party code (e.g., from the public NPM registry); instead, force them to go through your dependency gateway (e.g., Artifactory).
Under the hood, a tool like Artifactory could take advantage of this means of distribution, be responsible for signature checking, etc.
I didn't think anyone was suggesting to pull random stuff out of a p2p network but instead that you build your docker image and then use p2p to share it to thousands of servers. There are no trust issues here because you are the creator of the data being downloaded.
Having been on the hurting side of both a 500+ TB globally distributed Artifactory cluster and a relatively smaller 30 TB Quay cluster, I for one welcome this new contender! Scaling binary object stores globally is no small feat. FWIW, JFrog, the company behind Artifactory, is way better than CoreOS/RedHat from a vendor support perspective.
I'm chuckling inside thinking about how many people go and install/use this versus how many people actually work at a scale that they need to use this.
Given that e.g. AWS charges you for cross-region ECR image pulls, this can make a difference for scrappy companies that push large images on every green build (i.e., multiple times a day, with lots of cache misses) to multiple regions. That's even if your deployments have just tens of replicas. Larger companies probably worry about other parts of the bill.
It makes sense to plan ahead for increased scale. If you are working for a VC-backed company whose mission is to grow grow grow, scale scale scale, then you can't exactly build for the infrastructure you are currently using. It's perfectly acceptable to build out overbuilt infra, as long as your costs aren't shooting you in the foot. You know what's worse than paying too much for infrastructure? Losing money and clients because your infrastructure breaks anytime you get a real workload on it.
But even worse is not being able to release because the system complexity has shot through the roof. Plan (and test!) for 10x scale at a time, then optimize to squeeze another 5-10x while you build the 1000x system.
To some extent testing this out when you need it is a bit too late. If you anticipate having a problem, it's useful to play with solutions before you actually have said problem.
That's different from applying a 50,000 node solution to a 50 node problem though.
Naively I would have thought a docker registry with on-disk storage managed by ipfs would have been a low-effort way to meet this requirement.
Unfortunately you'd need something to manage the pinning settings, but it feels like a relatively small addition to some other registry that could be hacked together in a small amount of time.
I didn't know ipfs cluster existed -- thanks for the reference.
I was more thinking of a Harbor[0]-like cobbling together of these technologies; Harbor combines a bunch of F/OSS tools into one powerful registry solution. Here are a few:
- Distribution for storing images
- Notary for signing
- Clair for static analysis
Based on this, what I was thinking of was basically Distribution + ipfs cluster + a small management daemon. The daemon is only there to tie the other pieces together and present a unified interface, but the bulk of the work can be done by the other pieces.
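The glue daemon's job could be as small as: when the registry reports a new blob, ask ipfs-cluster to pin the corresponding CID. A rough sketch, assuming ipfs-cluster's REST API exposes a pin endpoint at POST /pins/{cid} and that some registry notification webhook would drive it; both are assumptions to verify against the versions you actually run:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// pinCID asks a local ipfs-cluster REST API to pin a CID.
// The POST /pins/{cid} endpoint is an assumption about the cluster
// REST API; check it against the version you deploy.
func pinCID(clusterAPI, cid string) error {
	resp, err := http.Post(fmt.Sprintf("%s/pins/%s", clusterAPI, cid), "application/json", nil)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("pin %s: unexpected status %s", cid, resp.Status)
	}
	return nil
}

func main() {
	// In the imagined setup, a registry notification would call this
	// with the blob's CID; here we just pin one placeholder CID by hand.
	if err := pinCID("http://127.0.0.1:9094", "QmExampleCidGoesHere"); err != nil {
		log.Fatal(err)
	}
	log.Println("pinned")
}
```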
I'm intrigued by p2p distribution like this but it makes me wonder if it's really more cost effective to switch/route all these small torrent packets instead of just using fileservers with 10G interfaces and maybe tiered caching.
"... a test where a 3G Docker image with 2 layers is downloaded by 2600 hosts concurrently (5200 blob downloads)"
They show this finishing in 20s, so that's 7,800 GB transferred in 20 seconds, which works out to 390 GB/sec. A 10 Gbps interface can transfer about 1 GB/sec, so you'd need 195 dual-attached fileservers, or about 100 if they were running at 40 Gbps.
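For reference, the same back-of-the-envelope math in code, using the comment's rough 1 GB/sec per 10 Gbps link:

```go
package main

import "fmt"

func main() {
	const (
		imageGB      = 3.0    // per-host download
		hosts        = 2600.0 // concurrent downloaders
		seconds      = 20.0   // observed completion time
		gbPer10GLink = 1.0    // rough usable GB/sec on a 10 Gbps NIC
	)

	totalGB := imageGB * hosts                     // 7800 GB moved
	aggregate := totalGB / seconds                 // 390 GB/sec aggregate
	dualAttached := aggregate / (2 * gbPer10GLink) // ~195 servers with 2x10G
	at40G := aggregate / (4 * gbPer10GLink)        // ~98 servers with 40G

	fmt.Printf("aggregate: %.0f GB/s, dual-10G servers: %.0f, 40G servers: %.0f\n",
		aggregate, dualAttached, at40G)
}
```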
This line is far funnier than it should be. It reminds me of the time that I was told about how awful some architecture was because it would overload the network.
Their improved architecture sent lots of smaller files over NFS instead.
Well, probably they meant network congestion. Network congestion depends on both the time distribution and the link distribution of data, which differ widely with the choice of network protocols and topologies (even if the total data flowing through the network remains the same).
That is definitely possible, but in my scenario the discussion wasn't that deep. It was more like the difference between NFS sending lots of small files and HTTP sending a tarball. The real differences were always going to be in the details of the implementation, not the choice of protocol.
Yep. I'll admit, I didn't crunch the numbers, but routing torrent traffic isn't cheap. I haven't found a definitive resource, but AFAIK/IIRC torrent uses at least an order of magnitude more packets per second for the same throughput as HTTP.
> torrent uses at least an order of magnitude more packets per second for the same throughput as HTTP.
This is total nonsense. It would mean that instead of using 1200+ byte packets, BitTorrent uses 120-byte packets to transfer data. This is easily disproven by looking at the implementation of any client, or just by looking at the traffic and seeing that it uses 1200+ byte packets to send chunks around.
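The arithmetic behind that: packets per second is just throughput divided by payload per packet, so a 10x packet rate at the same throughput would require payloads around a tenth of the MTU, which is not what clients send. A quick sketch:

```go
package main

import "fmt"

// packetsPerSecond: packets needed per second to sustain a throughput
// (bytes/sec) given a payload size per packet (bytes).
func packetsPerSecond(throughputBytes, payloadBytes float64) float64 {
	return throughputBytes / payloadBytes
}

func main() {
	const throughput = 1e9 / 8 // 1 Gbps in bytes/sec

	mtuSized := packetsPerSecond(throughput, 1400) // roughly what clients actually send
	claimed := packetsPerSecond(throughput, 140)   // what a 10x packet rate would imply

	fmt.Printf("~1400 B payloads: %.0f pkt/s\n", mtuSized)
	fmt.Printf("10x more packets implies ~140 B payloads: %.0f pkt/s\n", claimed)
}
```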
Or the PHP framework.
Or kraken.js.
Or the responsive CSS boilerplate.
Or the API gateway (KrakenD).
Or the Joomla theme.
Or the ransomware-as-a-service (RaaS) affiliate program.
Or the commonly used bioinformatics tool "kraken", which is used to predict the species a bit of DNA came from (useful in metagenomic studies of the microbiome or environment, where a sample may contain DNA from thousands of organisms):
https://ccb.jhu.edu/software/kraken/
I'm always wondering, though, whether those optimizations are really needed or whether they are created by engineers who want to have fun and do premature optimization. Ever wondered why a "WebApp" like Uber needs so many engineers and new projects?
Premature optimization is super fun. I did it a lot, and I would advise engineers to do it if they can.
But when I look at companies like Airbnb and Uber, for example, I cannot really understand why they require that type of tooling. Yes, they are big, even very big, but nowhere near the size of a Facebook or a Google. Most of those projects seem to be engineers having fun at work and marketing themselves by publishing cool blog posts.
I was at Google when this became the standard more than a decade ago (https://www.usenix.org/sites/default/files/conference/protec...). For jobs where you had tens or hundreds of replicas, i.e. for services with lots of traffic, it made the rollout of new binaries in Borg much faster, especially if you were deploying to a cluster in Asia and your packages were built and stored in the US. Even in the US, this being before the Firehose/Jupiter fabrics, network congestion had a noticeable impact.
MPM traffic was low priority, so for large enough binaries (and most of them were, because of static linking) you could easily see a Borg task spend seconds or even tens of seconds in "downloading" state if the borglet hadn't already seen and cached locally that version of the package.
It was about time this became available also for Docker and Kubernetes.
I guess what IPFS is missing and Kraken has are "pluggable storage options, and instead of managing data blobs, Kraken plugs into reliable blob storage options like S3, HDFS".
#1 - Dragonfly started in November 2017, whereas "Kraken was first deployed at Uber in early 2018", which suggests Uber was probably working on this well before Dragonfly even started.
#2 - Dragonfly is still listed as a Sandbox CNCF project, whereas Uber has been running this in production for roughly a year, it seems.
#3 - Dragonfly was just refactored completely from Java to Go. So it's kind of hard to say it's "battle tested" at all at this point in time. It's basically an entirely new project as of 2 days ago.
What got into CNCF may be a newer iteration based on what Alibaba has been running but that means it should at least hold a candle to what is described in the post. With that said, one can surmise the Java version is the one they used in production.
Apart from that, P2P image distribution is probably limited to internal uses that are not critical. What's missing is any indication of how much of a reliability difference this would give over, say, an internal CDN.
Gonna be really hard for devs to find tutorials[1] or other related things on this, given the name ('Kraken') already belongs to a well-established crypto exchange.
Same is the case with 'Discourse'. It's almost impossible to find relevant content because any search for 'discourse + keyword' invariably shows content related to "online discourse" (conversation).
My initial thought was that Kraken had released a Docker registry, which seemed odd.
But now I realise it is a separate project by Uber that is called Kraken.
But as it is just a project, I am okay with the name clash. It kind of suits the distributed nature of its arms.
I know there will be some confusion, but hopefully people's google-fu and search terms will be able to distinguish between Kraken the cryptocurrency company, Kraken the Docker registry tool, Kraken the mythical mega octopus, and more.