Be aware that when you use mesh VPN products such as ZeroTier, Nebula, or Tailscale, the company has access to your network, may add hidden nodes, and sees who is talking to whom and what services they are running, which can be a privacy concern.
ZeroTier and Tailscale, yes, maybe. Nebula, as I understand it, no... it's a fully open source project, the mesh is P2P, and the "lighthouse nodes" that serve as tunnel endpoint directories are something you run yourself as well. There is no "vendor" involved.
For Nebula, if you run the coordination server yourself (lighthouse or moon), the concern I mentioned obviously doesn't apply. It's self-hosted; the company isn't even involved, so there's nothing for it to be responsible for.
My comment applies only if Slack provides the lighthouse. I don't know if it does; usually the company operates a default one for users and also provides the option to self-host (Tailscale doesn't offer self-hosting itself, but there is third-party code).
Slack does not offer lighthouses or any part of Nebula.
They spun off a company, Defined Networks, which is enhancing Nebula, and they offer a hosted CA with key rotation and push configuration updates to your fleet, currently in a kind of early beta. You still run your own lighthouses, but they are entirely in control of your CA, so they can sign certs for nodes at will.
That could allow third parties into your networks, but I'm not sure if that would imply any ability to redirect or decrypt traffic.
Exactly. Slack does not offer "lighthouses as a service". Here is a quote by the authors from the Nebula GH repository README...
"Nebula lighthouses allow nodes to find each other, anywhere in the world. A lighthouse is the only node in a Nebula network whose IP should not change. Running a lighthouse requires very few compute resources, and you can easily use the least expensive option from a cloud hosting provider. If you're not sure which provider to use, a number of us have used $5/mo DigitalOcean droplets as lighthouses."
Agreed, hard to see how that could happen with Nebula, since there is no vendor that has your CA key.
The Defined Networks version, however, completely controls your CA key and could generate its own nodes. The config files it generates are plain text and can be inspected for "mirror traffic" configs (I don't remember if Nebula has that feature; ZeroTier does). Defined Networks has a pretty slick setup, which I can go into a bit further if there is interest; I've done an eval and ended up deciding it wasn't quite ready for our use.
Since these are mesh networks, you could examine the traffic yourself to verify that nothing is being shipped to external parties.
With ZeroTier, there are also some "self hosted" options. I haven't dug too deeply into them. I really like ZeroTier and wanted to use it for our work overlay network, but I'm a bit skeptical about the reliability. It's been good for my test use case, but last year they had some sort of controller outage, and when I asked their sales people about it and how we might be able to run a backup controller, I was told that sort of outage wasn't possible. When I asked what was meant by the specific tweet ZeroTier sent out saying it had happened, I got no reply. :-(
ZeroTier is super slick, but I can't move our entire infrastructure over to something that could have an outage that would take out our infrastructure until some third party resolved it.
They don’t need your private keys. The company is responsible for distributing public keys, so they can inject a public key into your network, and you will happily encrypt your traffic to that public key, to be decrypted on the other side with their private key.
It’s the same old key distribution problem; for instance, when you SSH somewhere, you need to verify the authenticity of the host key presented to you the first time. Approve the wrong public key and it’s over.
This is not to say Tailscale does that. The service is by far my favorite (Nebula is not as user-friendly, and ZeroTier uses nonstandard tunneling). Tailscale is dead simple, uses WireGuard, integrates with SSO, provides ACLs, relays, good NAT traversal, a good management interface, and lately a lot of DERP relays around the world. Just be aware of the limitations (in the US, they could even be forced to share your networks, even if they don’t want to).
Two other comments. These mesh networking products could use pre-shared keys to address this concern; for example, Tailscale could support WireGuard preshared keys as an optional feature for those concerned with key distribution. I don’t know why they don’t offer this option. Also, these services are not zero trust, contrary to what they often claim on their websites (usually they twist the meaning of the term zero trust).
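To illustrate with plain WireGuard (not something Tailscale exposes today): a preshared key generated out of band and added to both peers' configs mixes a symmetric secret into the handshake, so a public key injected by the coordination service alone would not be enough. Keys and addresses below are placeholders.

    # generate once, copy to both peers over a trusted channel
    wg genpsk > peer-a-peer-b.psk

    # in each peer's wg0.conf
    [Peer]
    PublicKey = <the other peer's public key>
    PresharedKey = <contents of peer-a-peer-b.psk>
    AllowedIPs = 10.0.0.2/32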
I'm afraid you don't understand how Nebula works. A Nebula cluster is fully self-contained: you are responsible for distributing your own certs and hosting your own lighthouse instances, and there is no phoning home to any outside party.
Under the hood, Nebula uses the Noise protocol, the same one used by WireGuard.
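For reference, the self-managed cert workflow looks roughly like this (names and overlay IPs are placeholders); the CA key never leaves your machine:

    # create your own CA (keep ca.key offline)
    nebula-cert ca -name "My Homelab"

    # sign a cert for each node, assigning its overlay IP
    nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
    nebula-cert sign -name "laptop" -ip "192.168.100.2/24"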
If this is part of your real threat model, then you're better off using a self-hosted control plane: Headscale for Tailscale, the built-in one for ZeroTier, a manually managed WireGuard mesh, or Nebula (which is always self-hosted).
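For example, with Headscale (exact commands vary a bit by version; the server URL and user name here are made up), the stock Tailscale client just gets pointed at your own coordination server:

    # on your own server, after installing and configuring headscale
    headscale users create homelab
    headscale preauthkeys create --user homelab

    # on each node
    tailscale up --login-server https://headscale.example.com --authkey <key from above>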
Security is always a compromise. If you want to access your homelab from outside (e.g. to keep your documents in your own hands), you have to open a way in. Opening your own VPN endpoint or an SSH port also carries non-zero risk, imho.
Not intending to drag the comments on, but I would argue that running a basic WireGuard VPN on a central VPS near your city (old-school hub and spoke) is both more secure and faster. The attack surface is minimal, and you have better control over the firewall, etc.
Mesh VPNs shine in small businesses with many users, where ACLs, SSO, etc. become useful. In home labs, a basic WireGuard server works fine.
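A minimal hub config on the VPS could look like this (keys and addresses are placeholders); each spoke then lists the hub as its single peer with the VPS as Endpoint:

    # /etc/wireguard/wg0.conf on the VPS (the hub)
    [Interface]
    Address = 10.8.0.1/24
    ListenPort = 51820
    PrivateKey = <hub private key>

    [Peer]
    # laptop
    PublicKey = <laptop public key>
    AllowedIPs = 10.8.0.2/32

    [Peer]
    # homelab box
    PublicKey = <homelab public key>
    AllowedIPs = 10.8.0.3/32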
I think mesh VPNs and VPS-based solutions are the same in terms of privacy. They all involve third parties that you have to trust. Mesh VPNs might even be slightly better because they use P2P connections whenever they can.
The best solution, IMHO, would be to use a mesh VPN and secure inter-node connections with an additional layer of encryption. SSH and TLS should cover most use cases here, and both are widely supported and easy to set up.
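For instance, tunneling a service over SSH across the overlay means the mesh operator (or an injected node) only ever sees SSH ciphertext between overlay IPs. The addresses and port below are just examples:

    # forward local port 8443 to a service on a peer, reached via its overlay IP
    ssh -L 8443:127.0.0.1:8443 admin@100.64.0.12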
Apologies if this might sound ignorant, but what is the purpose of such a complex setup? If it's to build a personal cloud host, why go through all the trouble of setting this up instead of just using AWS? I've been working as a web dev (Flask + Vue) for nearly 4 years now and have never come across any of this in my day job. While reading the article, I barely understood how all of these technologies (Ansible, Nomad, Tailscale, etc.) work together. And I'm now realising that it's high time I learn all these modern technologies and update myself.
What would be the best place to learn all of this? While the Ansible docs seem like a good place to start, what I'm really looking for is how to identify when I would need to use any of these technologies. At what point in the development cycle, starting from a basic web backend, would I need to consider them? Any pointers to good resources would be very much appreciated.
> why go through all the trouble of setting this up instead of just using AWS.
Because you may want to have control over your hardware, your costs, your network, your software. AWS and most cloud providers are very expensive and mostly lock you in (unless you only use EC2 instances as VMs, in which case you are getting terrible value for what you pay). I already have computers and an ISP, so why would I want to rent someone else’s?
The point is either to learn some new stuff in a semi-practical environment or build an environment where you can host things that are impractical to host in the cloud. Things like home assistant or whatever.
If you are doing anything with infrastructure, then the places you'd find Ansible or Nomad would be:
Ansible - Deploying VMs on-prem and in the cloud, creating a managed k8s cluster in the cloud, deploying k8s on-prem, possibly bare-metal server provisioning, and configuration of resources for all of the above.
Nomad - Alternative to K8s from Hashicorp (supposed to be simpler, I wouldn't know as I've never used it).
Tailscale - Mesh VPN / overlay network service
Where would you use these tools in your day-to-day? If you aren't deploying VMs, containers, or bare-metal servers, or configuring any of the above, you would probably be looking to pick these tech stacks up as a hobby.
Resources - While I always check the official source first, I'm not a fan of the Ansible getting started docs. The list I run down for resources after the source (based on cost and quality):
- Online resources provided by a local library - was able to find a good Java reference for a recent project.
- Getting started books ( your OReillys, Manning Pub, Starch Press, etc)
I'm running a similar setup, but with k3s on Raspberry Pis and an old Mac Mini. For me it's about running home automation workloads (which should stay on the LAN) as well as learning and experimenting with devops tooling. I'm currently working on managing everything with Ansible as well, still not sure whether to use a k3s role from Ansible Galaxy or just k3sup (anyone got a recommendation?), and I'm running one of the k3s nodes on a Raspberry Pi robot I was gifted for Christmas, mainly because I can.
If I did the same on AWS I'd pay a lot more, and that'd cut into budgets for other pastimes. Also, I wouldn't get to tinker with hardware and do some of the more exotic things like exposing GPIO pins to Kubernetes workloads, and if I ran home automation workloads there, they'd be useless if my uplink failed.
But if this was work, I'd do things differently for sure.
> why go through all the trouble of setting this up instead of just using AWS
1. Have you ever hosted something on AWS, for public consumption by parties not under your control, where you, personally, were footing the bill?
2. Have you ever hosted something on AWS where a misconfiguration on your side and/or upset/impatient customer caused a bill so large that you had to shut the company down?
AWS will scale to fill your bank balance, regardless of whether you can afford to pay for that scale. AWS is great for elasticity when you can afford to pay.
AWS has no functional billing limits, even after 10 years. Those who post a link to the AWS billing limits haven't actually handled any meaningful scale, at which point a set of dedicated servers is great value.
> why go through all the trouble of setting this up instead of just using AWS
I will never understand that mentality. Maybe I was always too poor to pay for the cloud, but a bunch of old notebooks, electricity, a switch, and internet are all you need to experiment and self-host. AWS is powerful if you need to scale, but for a homelab/experiments it's overkill. Today, if you need cloud, I'd rather get an Oracle Cloud free account, which gets you 24 GB RAM, 4 ARM cores, 200 GB storage, IPv4/IPv6 addresses, and internal subnets for free to experiment with.
Ansible is a lot of Python that's already been written, battle-tested, and made available to use. Once you get beyond one box, or need to deploy and tweak something more than once, getting an Ansible config set up is much easier in the long run.
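A minimal sketch of what that looks like once you have two boxes (the host names, package, and playbook below are made up for illustration):

    # inventory.ini
    [homelab]
    node1.lan
    node2.lan

    # site.yml
    - hosts: homelab
      become: true
      tasks:
        - name: Make sure docker is installed
          ansible.builtin.package:
            name: docker.io
            state: present
        - name: Make sure the service is running and enabled
          ansible.builtin.service:
            name: docker
            state: started
            enabled: true

    # run it against every host in one go
    ansible-playbook -i inventory.ini site.yml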
> Your cloud provider or your ISP might provide you with direct public IP access which is nice. But it makes no sense for a home lab anymore in 2022 with Cloudflare tunnel, Twingate, Tailscale, etc.
With IPv6 we'll be able to get back to direct access (where it makes sense) and do away with all these extra layers.
I think the point was avoiding exposing things to the public outside of your lab. If everything has a public IPv6 address, you need to concern yourself with firewalling things off yourself.
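Right, with a routed IPv6 prefix each host (or the router, in its forward chain) needs a default-deny inbound policy. A rough nftables sketch of the idea, not a complete ruleset:

    table inet filter {
        chain input {
            type filter hook input priority 0; policy drop;
            iif "lo" accept
            ct state established,related accept
            meta l4proto { icmp, ipv6-icmp } accept
            tcp dport 22 accept    # whatever you deliberately expose
        }
    }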
It's important to recognize snark and sarcasm for better or worse.
I think GP is referring to the idea that just because IPv6 is capable of providing universal connectivity in a technical sense, that doesn't mean the technopolies will implement it in its rawest form. They have "interests".
They are quite happy being the node that two individuals/devices depend on rather than having them talk to each other directly on the network. The "Who, ???, When, Where, ???" are very important to their ability to monetize you and keep their business going. No good/evil duality here; it's just business and the capitalist way in its basic form. Why would they want you to send messages directly to another individual/device when you could just as aptly use their "cloud/network" service instead? Why buy a VPS from AWS or DigitalOcean when you could just host the same services from your phone or a spare computer?
IPv6 can, in some sense, threaten the aforementioned dependency if left unrestricted. So expect that even if these massive operators implement IPv6 end to end, probably as a cost/complexity-saving measure, "security", "convenience", and "reliability" measures will be put in place so that you are not permitted to make direct connections across IPv6 unchecked; at best it becomes a technical upsell, at worst it's just not possible at all.
The sad story is that while NAT/CGNAT was intended as a technical stop-gap for IPv4 exhaustion (and security) while waiting for IPv6, it effectively moats users into network-centric power hierarchies where the ISPs, hardware vendors, and OS vendors get to dictate the level of access, which is useful from a business perspective.
Remember, IPv4 is now "scarce". Scarcity produces economies, which produce commodities, which produce futures/speculation, which produce business strategy. IPv6 promises universal abundance and connectivity, which is terrible for business on the strategic front. No wonder IPv6 is going nowhere fast.
I've tried really hard with Ansible and I think the complexity is more trouble than it's worth. It's kinda nice to have an engine to hit a lot of servers at the same time, but it's easier just to run some bash scripts.
I agree: bash scripts should always be the first thing people try. Usually a build.sh finds its way into a project to build the system on the pipeline, or a cron job runs a shell script to delete some temporary files, and so on.
But it simply doesn't work once you have more than one server. Each one becomes a so-called "snowflake", with diverging configuration, installed packages, and so on.
A book I really liked is "UNIX and Linux System Administration Handbook", which explains exactly this evolution in thinking that most people need to go through at some point if they want to automate things.
Ansible is not the only approach; a lot of other things have popped up in recent years:
- Nix is really cool
- Salt is an alternative to Ansible (but I didn't get it, although one of my former employers moved from Puppet to Salt, so it's probably a really good thing for complex setups)
- Terraform/Packer (if everything is on VMs or in the cloud)
- K8s (ubiquitous, omnipotent, but I've had enough of it tbh: each year I need to learn new tools, and the good practices and frameworks change constantly; it's like the "frontend frameworks" of infra ;))
In general I agree, first try bash scripts and then, if needed, something like Ansible. More often than not, though, the bash scripts I produce are buggy, non-idempotent, and really difficult to maintain. One could argue “know your tools!”, and while I agree with the sentiment, somehow for me at least, it’s way easier to write maintainable, less buggy, and idempotent Ansible plays than to do the same with bash.
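A tiny example of the difference (the sysctl file is just an illustration): the bash version appends a duplicate line every time it runs, while the equivalent Ansible task converges to the same state no matter how often it's applied.

    # bash: run it twice and the line is there twice
    echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf

    # Ansible: run it twenty times, the line exists exactly once
    - name: Enable IP forwarding
      ansible.builtin.lineinfile:
        path: /etc/sysctl.conf
        line: net.ipv4.ip_forward=1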
It's sad that we go from bash ---> massive enterprisey system. There's nothing in between.
I ran into this at my work. We have ansible and salt and terraform and nomad and spinnaker and kubernetes and docker, each in varying states of favor/disfavor.
Bash scripts are awful: no standard library, no functional IDE for debugging, a pathetic language, whitespace-sensitive. The only saving grace is the universality.
Everything else? Heavyweight servers, daemons, opinions through the wazoo.
What did I want? I want a way to get an inventory/list/cluster map of nodes, then be able to send CLI commands to them, then parse the response and send more. I want a decent language, a good runtime, and IDE/modern toolchain support.
Can I just ssh into boxes? Yeah, uh, kinda? Or maybe I use aws-ssm to send/receive commands. Or kubectl or docker run. Or Teleport. Or the Salt daemon. Each of them can do the job of "send command to a node".
And I want to write meta-applications on top of that.
So I settled on Groovy: optional typing, near-full JVM speed via CompileStatic if I want it, lots of scriptability, and full JVM parallelism/threading power. The JVM sucks in some ways at running CLI commands locally, but I eventually figured that monstrosity out, got JSch working for "pure" Java SSH, and kinda solved the caching thing...
So after about six rewrites, and an amount of configuration that makes a complicated ssh conf look like child's play, I can do cluster-level operations that make me happy: set up clusters, orchestrate/run/track load tests, do fully scripted backups and restores, migrations/upgrades, red/black deploys, and restore backups to analysis clusters.
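In case it helps picture it, the core "send a command over SSH from the JVM" piece with JSch is roughly this (host, user, and key path are placeholders; error handling, parallelism, and caching omitted):

    import com.jcraft.jsch.JSch
    import com.jcraft.jsch.ChannelExec

    def jsch = new JSch()
    jsch.addIdentity("${System.getProperty('user.home')}/.ssh/id_ed25519")
    jsch.setKnownHosts("${System.getProperty('user.home')}/.ssh/known_hosts")

    def session = jsch.getSession('ops', 'node1.example.com', 22)
    session.connect()

    def channel = (ChannelExec) session.openChannel('exec')
    channel.setCommand('uptime')
    def output = channel.inputStream   // grab the stream before connecting
    channel.connect()
    println output.text                // parse/act on the response here
    channel.disconnect()
    session.disconnect()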
I'm not tied to a specific cloud. I'm not tied to a command delivery substrate.
Is it useful to someone else? Dunno, I'll try to get it released and documented.
Having dabbled with Ansible and Saltstack -- and much preferring the latter -- it's very sad that Ansible has seemingly won most of the sentiment amongst the demographic of people who didn't want Ruby / DSL, but preferred Python / YAML / Jinja. RH's ownership and marketing was probably the compelling feature for many shops.
Less so with Ansible, and more so with Saltstack, I found it easy to slowly migrate my stack of bash scripts into the platform. As with any config management system, you can always just wrap some config boilerplate around your scripts and try to cater for idempotency requirements/shortfalls -- a bit like how learning PHP is relatively easy, as you can start safely by injecting tiny fragments into your HTML.
I have seen Saltstack references in the Kubernetes repo, and that alone was enough to pique my interest to read into it, but I guess maybe the "chicken and egg" is causing Salt to be just massively behind what one can do with Ansible.
For example, this[1] is all that seems to be available for "cloud" on AWS. Now, I am super, super cognizant that there is a camp of "TF all the cloud, then $something on the machine", but here Salt seems to be dipping its toes into the AWS API while still treating that machine as a pet, not even offering ASG or LaunchTemplate knobs that could bring the machine back to life if it falls over.
Yeah, there are certainly some fuzzy borders that delineate the various domains of deployment, fleet orchestration, config management, BCP, etc.
Similar to the author of TFA, I use Consul & Nomad plus config management (in my case Salt) to manage a fleet of 'home' systems and a couple of IaaS boxes (DO rather than AWS, because Jeff). $dayjob uses Ansible & Tower, but nothing about that experience has inspired me to try to lift and shift to that side of the fence. In any case, I've not hit many functionality gaps with Salt, though my requirements are relatively modest.
My favourite demonstration of one of the pain points with Ansible is:
What complexity are you hitting? Are you trying to go all-in on roles and a full setup, or just doing some simple scripts? The latter is a much easier jumping-in point. Also, you can ignore the YAML files entirely and use Ansible as a parallel SSH tool.
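For example, a one-liner against an inventory, no playbooks involved (inventory file name and fork count are arbitrary):

    ansible all -i inventory.ini -m ansible.builtin.shell -a 'uptime' -f 20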
I think the issue is that my requirements aren't that big. Download a tar file, unzip, switch some soft links, run some setup. A bash script does those things in a way people understand; with Ansible it ends up being a few people (me) who do everything.
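Something like this, presumably (URL, paths, and setup script are made up), which is indeed easy for anyone to read and run:

    #!/usr/bin/env bash
    set -euo pipefail

    version="1.2.3"
    curl -fsSL "https://example.com/app-${version}.tar.gz" -o /tmp/app.tar.gz
    mkdir -p "/opt/app/releases/${version}"
    tar -xzf /tmp/app.tar.gz -C "/opt/app/releases/${version}"
    ln -sfn "/opt/app/releases/${version}" /opt/app/current   # switch the soft link
    /opt/app/current/setup.sh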
Interesting choices. My smaller, more humble, totally local homelab uses InfluxDB v2 + Telegraf for observability and alerting. Plain old multicast-DNS with Avahi for, somewhat clunky, convention-based service discovery. Ansible for deploying Docker containers to the cluster. Would be interesting to clean it up with some of the services mentioned here.
I can attest to Caddy being excellent on ARM64. So good that I ditched Jellyfin for media-streaming and went with a simple auth-enabled directory listing file server.
Go-based tools and servers are a blessing for old ARM devices.
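For anyone curious, that kind of auth-enabled directory listing is only a few lines of Caddyfile (hostname, path, and user are examples; the hash comes from `caddy hash-password`, and newer Caddy releases may spell the directive basic_auth):

    media.example.lan {
        root * /srv/media
        basicauth {
            alice <bcrypt hash from caddy hash-password>
        }
        file_server browse
    }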
Aside: TIL Oracle Cloud was a thing. Spent 20 mins signing up and trying to set up an Ampere A1 compute instance to see how it compared with AWS Graviton2.
Failed to provision. "Out of capacity for shape...". Also, what a weird UX.
mDNS is cool and I use it for VMs. How do you set it up within a container? Do you run a service manager inside the container (for the containerized service and the Avahi daemon), or is it some kind of configuration at the OS level that runs the containers?
I would take a different approach when describing the distribution choice, and consequently, the modules.
Well-written tasks (and eventually roles) aim to use the generic modules when available. Skip 'apt' unless you truly need to target that platform for some reason. Use 'package' instead.
The 'package' module will figure out the appropriate package manager. This way the only real cross-distribution work you have to do tends to be:
- config file location overrides
- package names
... in limited cases. This is where roles, group_vars, and facts all get really useful.
Principles like this let me support a base of ~5 distributions spanning a few families/roots without too much trouble at all!
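A sketch of how that plays out, with hypothetical per-family group_vars feeding the generic module:

    # group_vars/debian.yml
    web_server_pkg: apache2

    # group_vars/redhat.yml
    web_server_pkg: httpd

    # roles/web/tasks/main.yml
    - name: Install the web server
      ansible.builtin.package:
        name: "{{ web_server_pkg }}"
        state: present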
In the sense that K8s doesn't really care about the OS underneath, I try to recreate that in my Ansible playbooks.
I do Ansible + hand-rolled systemd jobs for mine, and the mixing of all my apps into the Ansible config isn't great.
On the other hand, it avoids most of the fuss of getting stuff working in containers... unless I mistrust something, roll my own Podman container, and again orchestrate it with systemd, making a huge amount of work for myself and bloating my Ansible config even more.
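For what it's worth, the hand-rolled unit for a Podman container doesn't have to be huge; something along these lines (image name and ports are placeholders, and `podman generate systemd` can produce a fancier version for you):

    [Unit]
    Description=myapp container
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStartPre=-/usr/bin/podman rm -f myapp
    ExecStart=/usr/bin/podman run --rm --name myapp -p 8080:8080 localhost/myapp:latest
    ExecStop=/usr/bin/podman stop -t 10 myapp
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target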