Be aware that when you use mesh VPN products such as ZeroTier, Nebula, or Tailscale, the company has access to your network, may add hidden nodes, and sees who is talking to whom and what services they are running, which can be a privacy concern.
ZeroTier and Tailscale, yes, maybe. Nebula, as I understand it, no... it's a fully open source project, the mesh is P2P, and the "lighthouse nodes" that serve as tunnel endpoint directories are something you run yourself as well. There is no "vendor" involved.
For Nebula, if you run the coordination server yourself (lighthouse or moon), the concern I mentioned obviously doesn't apply. It's self-hosted; the company isn't even involved, so there's nothing for it to be responsible for.
My comment applies only if Slack provides the lighthouse. I don't know if it does; usually the company operates a default one for users and also provides the option to self-host (Tailscale doesn't offer self-hosting itself, but there is third-party code).
Slack does not offer lighthouses or any part of Nebula.
They spun off a company, Defined Networks, which is enhancing Nebula, and they offer a hosted CA with key rotation and push configuration updates to your fleet, currently in a kind of early beta. You still run your own lighthouses, but they are entirely in control of your CA, so they can sign certs for nodes at will.
That could allow third parties into your networks, but I'm not sure if that would imply any ability to redirect or decrypt traffic.
Exactly. Slack does not offer "lighthouses as a service". Here is a quote by the authors from the Nebula GH repository README...
"Nebula lighthouses allow nodes to find each other, anywhere in the world. A lighthouse is the only node in a Nebula network whose IP should not change. Running a lighthouse requires very few compute resources, and you can easily use the least expensive option from a cloud hosting provider. If you're not sure which provider to use, a number of us have used $5/mo DigitalOcean droplets as lighthouses."
Agreed, hard to see how that could happen with Nebula, since there is no vendor that has your CA key.
The Defined Networks version, however, completely controls your CA key and could generate its own nodes. The config files it generates are plain text and can be inspected for "mirror traffic" configs (I don't remember if Nebula has that feature; ZeroTier does). Defined Networks has a pretty slick setup, which I can go into a bit further if there is interest; I've done an eval and ended up deciding it wasn't quite ready for our use.
Since these are mesh networks, you could examine the traffic yourself to verify that nothing is being shipped to external parties.
With ZeroTier, there are also some "self hosted" options. I haven't dug too deeply into them. I really like ZeroTier and wanted to use it for our work overlay network, but I'm a bit skeptical about the reliability. It's been good for my test use case, but last year they had some sort of controller outage, and when I asked their sales people about it and how we might be able to run a backup controller, I was told that sort of outage wasn't possible. When I asked what was meant by the specific tweet ZeroTier sent out saying it had happened, I got no reply. :-(
ZeroTier is super slick, but I can't move our entire infrastructure over to something that could have an outage that would take out our infrastructure until some third party resolved it.
They don’t need your private keys. The company is responsible for distributing public keys, so they can inject a public key into your network, and you will happily encrypt your traffic to that public key, to be decrypted on the other side with their private key.
It’s the same old key distribution problem; for instance, when you SSH somewhere, you need to verify the authenticity of the host key presented to you the first time. Approve the wrong public key and it’s over.
This is not to say Tailscale does that. The service is by far my favorite (Nebula is not as user-friendly, and ZeroTier uses nonstandard tunneling). Tailscale is dead simple, uses WireGuard, integrates with SSO, provides ACLs, relays, good NAT traversal, a good management interface, and lately a lot of DERP relays around the world. Just be aware of the limitations (in the US, they could even be forced to share your networks, even if they don’t want to).
Two other comments. These mesh networking products could use pre-shared keys to address this concern; for example, Tailscale could support WireGuard preshared keys as an optional feature for those concerned with key distribution. I don’t know why they don’t offer this option. Also, these services are not zero trust, contrary to what they often claim on their websites (usually they twist the meaning of the term zero trust).
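To illustrate with plain WireGuard (not something Tailscale exposes today): a preshared key generated out of band and added to both peers' configs mixes a symmetric secret into the handshake, so a public key injected by the coordination service alone would not be enough. Keys and addresses below are placeholders.

    # generate once, copy to both peers over a trusted channel
    wg genpsk > peer-a-peer-b.psk

    # in each peer's wg0.conf
    [Peer]
    PublicKey = <the other peer's public key>
    PresharedKey = <contents of peer-a-peer-b.psk>
    AllowedIPs = 10.0.0.2/32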
I'm afraid you don't understand how Nebula works. A Nebula cluster is fully self-contained: you are responsible for distributing your own certs and hosting your own lighthouse instances, and there is no phoning home to any outside party.
Under the hood, Nebula uses the Noise protocol, the same one used by WireGuard.
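For reference, the self-managed cert workflow looks roughly like this (names and overlay IPs are placeholders); the CA key never leaves your machine:

    # create your own CA (keep ca.key offline)
    nebula-cert ca -name "My Homelab"

    # sign a cert for each node, assigning its overlay IP
    nebula-cert sign -name "lighthouse1" -ip "192.168.100.1/24"
    nebula-cert sign -name "laptop" -ip "192.168.100.2/24"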
If this is part of your real threat model, then you're better off using a self-hosted control plane: Headscale for Tailscale, the built-in one for ZeroTier, a manually managed WireGuard mesh, or Nebula (which is always self-hosted).
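For example, with Headscale (exact commands vary a bit by version; the server URL and user name here are made up), the stock Tailscale client just gets pointed at your own coordination server:

    # on your own server, after installing and configuring headscale
    headscale users create homelab
    headscale preauthkeys create --user homelab

    # on each node
    tailscale up --login-server https://headscale.example.com --authkey <key from above>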
Security is always a compromise. If you want to access your homelab from outside (e.g. to keep your documents in your own hands), you have to open a way in. Opening your own VPN endpoint or an SSH port also carries non-zero risk, imho.
Not intending to drag the comments on, but I would argue that running a basic WireGuard VPN on a central VPS near your city (old-school hub and spoke) is both more secure and faster. The attack surface is minimal, and you have better control over the firewall, etc.
Mesh VPNs shine in small businesses with many users, where ACLs, SSO, etc. become useful. In home labs, a basic WireGuard server works fine.
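A minimal hub config on the VPS could look like this (keys and addresses are placeholders); each spoke then lists the hub as its single peer with the VPS as Endpoint:

    # /etc/wireguard/wg0.conf on the VPS (the hub)
    [Interface]
    Address = 10.8.0.1/24
    ListenPort = 51820
    PrivateKey = <hub private key>

    [Peer]
    # laptop
    PublicKey = <laptop public key>
    AllowedIPs = 10.8.0.2/32

    [Peer]
    # homelab box
    PublicKey = <homelab public key>
    AllowedIPs = 10.8.0.3/32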
I think mesh VPNs and VPS-based solutions are the same in terms of privacy. They all involve third parties that you have to trust. Mesh VPNs might even be slightly better because they use P2P connections whenever they can.
The best solution, IMHO, would be to use a mesh VPN and secure inter-node connections with an additional layer of encryption. SSH and TLS should cover most use cases here, and both are widely supported and easy to set up.
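For instance, tunneling a service over SSH across the overlay means the mesh operator (or an injected node) only ever sees SSH ciphertext between overlay IPs. The addresses and port below are just examples:

    # forward local port 8443 to a service on a peer, reached via its overlay IP
    ssh -L 8443:127.0.0.1:8443 admin@100.64.0.12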
Apologies if this might sound ignorant, but what is the purpose of such a complex setup? If it's to build a personal cloud host, why go through all the trouble of setting this up instead of just using AWS? I've been working as a web dev (Flask + Vue) for nearly 4 years now and have never come across any of this in my day job. While reading the article, I barely understood how all of these technologies (Ansible, Nomad, Tailscale, etc.) work together. And I'm now realising that it's high time I learn all these modern technologies and update myself.
What would be the best place to learn all of this? While the Ansible docs seem like a good place to start, what I'm really looking for is how to identify when I would need to use any of these technologies. At what point in the development cycle, starting from a basic web backend, would I need to consider them? Any pointers to good resources would be very much appreciated.
> why go through all the trouble of setting this up instead of just using AWS.
Because you may want to have control over your hardware, your costs, your network, your software. AWS and most cloud providers are very expensive and mostly lock you in (unless you only use EC2 instances as VMs, in which case you are getting terrible value for what you pay). I already have computers and an ISP, so why would I want to rent someone else’s?
The point is either to learn some new stuff in a semi-practical environment or build an environment where you can host things that are impractical to host in the cloud. Things like home assistant or whatever.
If you are doing anything with infrastructure, then the places you'd find Ansible or Nomad would be:
Ansible - Deploying VMs on-prem and in the cloud, creating a managed k8s cluster in the cloud, deploying k8s on-prem, possibly bare-metal server provisioning, and configuration of resources for all of the above.
Nomad - Alternative to K8s from Hashicorp (supposed to be simpler, I wouldn't know as I've never used it).
Tailscale - Mesh VPN / overlay network service
Where would you use these tools in your day-to-day? If you aren't deploying VMs, containers, or bare-metal servers, or configuring any of the above, you would probably be looking to pick these tech stacks up as a hobby.
Resources - While I always check the official source first, I'm not a fan of the Ansible getting started docs. The list I run down for resources after the source (based on cost and quality):
- Online resources provided by a local library - was able to find a good Java reference for a recent project.
- Getting started books ( your OReillys, Manning Pub, Starch Press, etc)
I'm running a similar setup, but with k3s on Raspberry Pis and an old Mac Mini. For me it's about running home automation workloads (which should stay on the LAN) as well as learning and experimenting with devops tooling. I'm currently working on managing everything with Ansible as well, still not sure whether to use a k3s role from Ansible Galaxy or just k3sup (anyone got a recommendation?), and I'm running one of the k3s nodes on a Raspberry Pi robot I was gifted for Christmas, mainly because I can.
If I did the same on AWS I'd pay a lot more, and that'd cut into budgets for other pastimes. Also, I wouldn't get to tinker with hardware and do some of the more exotic things like exposing GPIO pins to Kubernetes workloads, and if I ran home automation workloads there, they'd be useless if my uplink failed.
But if this was work, I'd do things differently for sure.
> why go through all the trouble of setting this up instead of just using AWS
1. Have you ever hosted something on AWS, for public consumption by parties not under your control, where you, personally, were footing the bill?
2. Have you ever hosted something on AWS where a misconfiguration on your side and/or upset/impatient customer caused a bill so large that you had to shut the company down?
AWS will scale to fill your bank balance, regardless of whether you can afford to pay for that scale. AWS is great for elasticity when you can afford to pay.
AWS has no functional billing limits, even after 10 years. Those who post a link to the AWS billing limits haven't actually handled any meaningful scale, at which point a set of dedicated servers is great value.
> why go through all the trouble of setting this up instead of just using AWS
I will never understand that mentality. Maybe I was always too poor to pay for the cloud, but a bunch of old notebooks, electricity, a switch, and internet are all you need to experiment and self-host. AWS is powerful if you need to scale, but for a homelab/experiments it's overkill. Today, if you need cloud, I'd rather get an Oracle Cloud free account, which gets you 24 GB RAM, 4 ARM cores, 200 GB storage, IPv4/IPv6 addresses, and internal subnets for free to experiment with.
Ansible is a lot of Python that's already been written, battle-tested, and made available to use. Once you get beyond one box, or need to deploy and tweak something more than once, getting an Ansible config set up is much easier in the long run.
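A minimal sketch of what that looks like once you have two boxes (the host names, package, and playbook below are made up for illustration):

    # inventory.ini
    [homelab]
    node1.lan
    node2.lan

    # site.yml
    - hosts: homelab
      become: true
      tasks:
        - name: Make sure docker is installed
          ansible.builtin.package:
            name: docker.io
            state: present
        - name: Make sure the service is running and enabled
          ansible.builtin.service:
            name: docker
            state: started
            enabled: true

    # run it against every host in one go
    ansible-playbook -i inventory.ini site.yml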
> Your cloud provider or your ISP might provide you with direct public IP access which is nice. But it makes no sense for a home lab anymore in 2022 with Cloudflare tunnel, Twingate, Tailscale, etc.
With IPv6 we'll be able to get back to direct access (where it makes sense) and do away with all these extra layers.
I think the point was avoiding exposing things to the public outside of your lab. If everything has a public IPv6 address, you need to concern yourself with firewalling things off yourself.
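Right, with a routed IPv6 prefix each host (or the router, in its forward chain) needs a default-deny inbound policy. A rough nftables sketch of the idea, not a complete ruleset:

    table inet filter {
        chain input {
            type filter hook input priority 0; policy drop;
            iif "lo" accept
            ct state established,related accept
            meta l4proto { icmp, ipv6-icmp } accept
            tcp dport 22 accept    # whatever you deliberately expose
        }
    }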
It's important to recognize snark and sarcasm for better or worse.
I think GP is referring to the idea that just because IPv6 is capable of providing universal connectivity in a technical sense, that doesn't mean the technopolies will implement it in its rawest form. They have "interests".
They are quite happy being the node that two individuals/devices depend on rather than having them talk to each other directly on the network. The "Who, ???, When, Where, ???" are very important to their ability to monetize you and keep their business going. No good/evil duality here; it's just business and the capitalist way in its basic form. Why would they want you to send messages directly to another individual/device when you could just as aptly use their "cloud/network" service instead? Why buy a VPS from AWS or DigitalOcean when you could just host the same services from your phone or a spare computer?
IPv6 can, in some sense, threaten the aforementioned dependency if left unrestricted. So expect that even if these massive operators implement IPv6 end to end, probably as a cost/complexity-saving measure, "security", "convenience", and "reliability" measures will be put in place so that you are not permitted to make direct connections across IPv6 unchecked; at best it becomes a technical upsell, at worst it's just not possible at all.
The sad story is that while NAT/CGNAT was intended as a technical stop-gap for IPv4 exhaustion (and security) while waiting for IPv6, it effectively moats users into network-centric power hierarchies where the ISPs, hardware vendors, and OS vendors get to dictate the level of access, which is useful from a business perspective.
Remember, IPv4 is now "scarce". Scarcity produces economies, which produce commodities, which produce futures/speculation, which produce business strategy. IPv6 promises universal abundance and connectivity, which is terrible for business on the strategic front. No wonder IPv6 is going nowhere fast.
I've tried really hard with Ansible and I think the complexity is more trouble than it's worth. It's kinda nice to have an engine to hit a lot of servers at the same time, but it's easier just to run some bash scripts.
I agree: bash scripts should always be the first thing people try. Usually a build.sh finds its way into a project to build the system on the pipeline, or a cron job runs a shell script to delete some temporary files, and so on.
But it simply doesn't work once you have more than one server. Each one becomes a so-called "snowflake", with diverging configuration, installed packages, and so on.
A book I really liked is "UNIX and Linux System Administration Handbook", which explains exactly this evolution in thinking that most people need to go through at some point if they want to automate things.
Ansible is not the only approach; a lot of other things have popped up in recent years:
- Nix is really cool
- Salt is an alternative to Ansible (but I didn't get it, although one of my former employers moved from Puppet to Salt, so it's probably a really good thing for complex setups)
- Terraform/Packer (if everything is on VMs or in the cloud)
- K8s (ubiquitous, omnipotent, but I've had enough of it tbh: each year I need to learn new tools, and the good practices and frameworks change constantly; it's like the "frontend frameworks" of infra ;))
In general I agree, first try bash scripts and then, if needed, something like Ansible. More often than not, though, the bash scripts I produce are buggy, non-idempotent, and really difficult to maintain. One could argue “know your tools!”, and while I agree with the sentiment, somehow for me at least, it’s way easier to write maintainable, less buggy, and idempotent Ansible plays than to do the same with bash.
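A tiny example of the difference (the sysctl file is just an illustration): the bash version appends a duplicate line every time it runs, while the equivalent Ansible task converges to the same state no matter how often it's applied.

    # bash: run it twice and the line is there twice
    echo 'net.ipv4.ip_forward=1' >> /etc/sysctl.conf

    # Ansible: run it twenty times, the line exists exactly once
    - name: Enable IP forwarding
      ansible.builtin.lineinfile:
        path: /etc/sysctl.conf
        line: net.ipv4.ip_forward=1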
It's sad that we go from bash ---> massive enterprisey system. There's nothing in between.
I ran into this at my work. We have ansible and salt and terraform and nomad and spinnaker and kubernetes and docker, each in varying states of favor/disfavor.
Bash scripts are awful: no standard library, no functional IDE for debugging, a pathetic language, whitespace-sensitive. The only saving grace is the universality.
Everything else? Heavyweight servers, daemons, opinions through the wazoo.
What did I want? I want a way to get an inventory/list/cluster map of nodes, then be able to send CLI commands to them, then parse the response and send more. I want a decent language, a good runtime, and IDE/modern toolchain support.
Can I just ssh into boxes? Yeah, uh, kinda? Or maybe I use aws-ssm to send/receive commands. Or kubectl or docker run. Or Teleport. Or the Salt daemon. Each of them can do the job of "send command to a node".
And I want to write meta-applications on top of that.
So I settled on Groovy: optional typing, near-full JVM speed via CompileStatic if I want it, lots of scriptability, and full JVM parallelism/threading power. The JVM sucks in some ways at running CLI commands locally, but I eventually figured that monstrosity out, got JSch working for "pure" Java SSH, and kinda solved the caching thing...
So after about six rewrites, and an amount of configuration that makes a complicated ssh conf look like child's play, I can do cluster-level operations that make me happy: set up clusters, orchestrate/run/track load tests, do fully scripted backups and restores, migrations/upgrades, red/black deploys, and restore backups to analysis clusters.
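In case it helps picture it, the core "send a command over SSH from the JVM" piece with JSch is roughly this (host, user, and key path are placeholders; error handling, parallelism, and caching omitted):

    import com.jcraft.jsch.JSch
    import com.jcraft.jsch.ChannelExec

    def jsch = new JSch()
    jsch.addIdentity("${System.getProperty('user.home')}/.ssh/id_ed25519")
    jsch.setKnownHosts("${System.getProperty('user.home')}/.ssh/known_hosts")

    def session = jsch.getSession('ops', 'node1.example.com', 22)
    session.connect()

    def channel = (ChannelExec) session.openChannel('exec')
    channel.setCommand('uptime')
    def output = channel.inputStream   // grab the stream before connecting
    channel.connect()
    println output.text                // parse/act on the response here
    channel.disconnect()
    session.disconnect()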
I'm not tied to a specific cloud. I'm not tied to a command delivery substrate.
Is it useful to someone else? Dunno, I'll try to get it released and documented.
Having dabbled with Ansible and Saltstack -- and much preferring the latter -- it's very sad that Ansible has seemingly won most of the sentiment amongst the demographic of people who didn't want Ruby / DSL, but preferred Python / YAML / Jinja. RH's ownership and marketing was probably the compelling feature for many shops.
Less so with Ansible, and more so with Saltstack, I found it easy to slowly migrate my stack of bash scripts into the platform. As with any config management system, you can always just wrap some config boilerplate around your scripts and try to cater for idempotency requirements/shortfalls -- a bit like how learning PHP is relatively easy, as you can start safely by injecting tiny fragments into your HTML.
I have seen Saltstack references in the Kubernetes repo, and that alone was enough to pique my interest to read into it, but I guess maybe the "chicken and egg" is causing Salt to be just massively behind what one can do with Ansible.
For example, this[1] is all that seems to be available for "cloud" on AWS. Now, I am super, super cognizant that there is a camp of "TF all the cloud, then $something on the machine", but here Salt seems to be dipping its toes into the AWS API while still treating that machine as a pet, not even offering ASG or LaunchTemplate knobs that could bring the machine back to life if it falls over.
Yeah, there are certainly some fuzzy borders that delineate the various domains of deployment, fleet orchestration, config management, BCP, etc.
Similar to the author of TFA, I use Consul & Nomad plus config management (in my case Salt) to manage a fleet of 'home' systems and a couple of IaaS boxes (DO rather than AWS, because Jeff). $dayjob uses Ansible & Tower, but nothing about that experience has inspired me to try to lift and shift to that side of the fence. In any case, I've not hit many functionality gaps with Salt, though my requirements are relatively modest.
My favourite demonstration of one of the pain points with Ansible is:
What complexity are you hitting? Are you trying to go all-in on roles and a full setup, or just doing some simple scripts? The latter is a much easier jumping-in point. Also, you can ignore the YAML files entirely and use Ansible as a parallel SSH tool.
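For example, a one-liner against an inventory, no playbooks involved (inventory file name and fork count are arbitrary):

    ansible all -i inventory.ini -m ansible.builtin.shell -a 'uptime' -f 20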
I think the issue is that my requirements aren't that big. Download a tar file, unzip, switch some soft links, run some setup. A bash script does those things in a way people understand; with Ansible it ends up being a few people (me) who do everything.
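Something like this, presumably (URL, paths, and setup script are made up), which is indeed easy for anyone to read and run:

    #!/usr/bin/env bash
    set -euo pipefail

    version="1.2.3"
    curl -fsSL "https://example.com/app-${version}.tar.gz" -o /tmp/app.tar.gz
    mkdir -p "/opt/app/releases/${version}"
    tar -xzf /tmp/app.tar.gz -C "/opt/app/releases/${version}"
    ln -sfn "/opt/app/releases/${version}" /opt/app/current   # switch the soft link
    /opt/app/current/setup.sh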
Interesting choices. My smaller, more humble, totally local homelab uses InfluxDB v2 + Telegraf for observability and alerting. Plain old multicast-DNS with Avahi for, somewhat clunky, convention-based service discovery. Ansible for deploying Docker containers to the cluster. Would be interesting to clean it up with some of the services mentioned here.
I can attest to Caddy being excellent on ARM64. So good that I ditched Jellyfin for media-streaming and went with a simple auth-enabled directory listing file server.
Go-based tools and servers are a blessing for old ARM devices.
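For anyone curious, that kind of auth-enabled directory listing is only a few lines of Caddyfile (hostname, path, and user are examples; the hash comes from `caddy hash-password`, and newer Caddy releases may spell the directive basic_auth):

    media.example.lan {
        root * /srv/media
        basicauth {
            alice <bcrypt hash from caddy hash-password>
        }
        file_server browse
    }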
Aside: TIL Oracle Cloud was a thing. Spent 20 mins signing up and trying to set up an Ampere A1 compute instance to see how it compared with AWS Graviton2.
Failed to provision. "Out of capacity for shape...". Also, what a weird UX.
mDNS is cool and I use it for VMs. How do you set it up within a container? Do you run a service manager inside the container (for the containerized service and the Avahi daemon), or is it some kind of configuration at the OS level that runs the containers?
I would take a different approach when describing the distribution choice, and consequently, the modules.
Well-written tasks (and eventually roles) aim to use the generic modules when available. Skip 'apt' unless you truly need to target that platform for some reason. Use 'package' instead.
The 'package' module will figure out the appropriate package manager. This way the only real cross-distribution work you have to do tends to be:
- config file location overrides
- package names
... in limited cases. This is where roles, group_vars, and facts all get really useful.
Principles like this let me support a base of ~5 distributions spanning a few families/roots without too much trouble at all!
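A sketch of how that plays out, with hypothetical per-family group_vars feeding the generic module:

    # group_vars/debian.yml
    web_server_pkg: apache2

    # group_vars/redhat.yml
    web_server_pkg: httpd

    # roles/web/tasks/main.yml
    - name: Install the web server
      ansible.builtin.package:
        name: "{{ web_server_pkg }}"
        state: present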
In the sense that K8s doesn't really care about the OS underneath, I try to recreate that in my Ansible playbooks.
I do Ansible + hand-rolled systemd jobs for mine, and the mixing of all my apps into the Ansible config isn't great.
On the other hand, it avoids most of the fuss of getting stuff working in containers... unless I mistrust something, roll my own Podman container, and again orchestrate it with systemd, making a huge amount of work for myself and bloating my Ansible config even more.
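For what it's worth, the hand-rolled unit for a Podman container doesn't have to be huge; something along these lines (image name and ports are placeholders, and `podman generate systemd` can produce a fancier version for you):

    [Unit]
    Description=myapp container
    Wants=network-online.target
    After=network-online.target

    [Service]
    ExecStartPre=-/usr/bin/podman rm -f myapp
    ExecStart=/usr/bin/podman run --rm --name myapp -p 8080:8080 localhost/myapp:latest
    ExecStop=/usr/bin/podman stop -t 10 myapp
    Restart=on-failure

    [Install]
    WantedBy=multi-user.target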