> NOTE: Systemd has a nasty habit of delaying logins for 25 - 100 seconds while it waits for some service you never asked for to time out and fail, before it lets you have a prompt.
If you experience this when logging in to a host via SSH, the delays are almost certainly due to either missing or non-functional forward/reverse DNS (the SSH server will perform these lookups when connections are received).
You've got at least three options (in order of preference):
- Add the appropriate (A/PTR) resource records to DNS so that the queries get a valid response
- Add entries to /etc/hosts file
- Set "UseDNS no" in the /etc/ssh/sshd_config file and add "-u 0" to the parameters passed to `sshd` when it's started (how to do this will vary by distribution/operating system)
Thank you. I haven't seen any apparent DNS problems; I fixed it, on a Debian host, by deleting the gnupg packages; probably deleting just gpgconf would have sufficed. That said, the IP addresses I used really would not show up in a DNS lookup. If I see it again, I will look for DNS lookup attempts.
Dec 15 22:23:14 ip-99-99-99-100 sshd[1995]: pam_unix(sshd:session): session opened for user admin by (uid=0)
Dec 15 22:23:14 ip-99-99-99-100 systemd[1]: Created slice User Slice of UID 1000.
Dec 15 22:23:14 ip-99-99-99-100 systemd[1]: Starting User Runtime Directory /run/user/1000...
Dec 15 22:23:14 ip-99-99-99-100 systemd-logind[510]: New session 34 of user admin.
Dec 15 22:23:14 ip-99-99-99-100 systemd[1]: Finished User Runtime Directory /run/user/1000.
Dec 15 22:23:14 ip-99-99-99-100 systemd[1]: Starting User Manager for UID 1000...
Dec 15 22:23:14 ip-99-99-99-100 systemd[2001]: pam_unix(systemd-user:session): session opened for user admin by (uid=0)
Dec 15 22:24:44 ip-99-99-99-100 systemd[2003]: pam_unix(systemd-user:session): session closed for user admin
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: user@1000.service: Main process exited, code=exited, status=1/FAILURE
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: user@1000.service: Killing process 2007 (gpgconf) with signal SIGKILL.
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: user@1000.service: Killing process 2008 (awk) with signal SIGKILL.
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: user@1000.service: Killing process 2013 (dirmngr) with signal SIGKILL.
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: user@1000.service: Failed with result 'exit-code'.
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: Failed to start User Manager for UID 1000.
Dec 15 22:24:44 ip-99-99-99-100 systemd[1]: Started Session 34 of user admin.
But what are these evidently unnecessary session and User Manager things? What controls starting them? What are they supposed to do for me, if they ever work right? Why did starting them fail?
The user manager runs things like user services. That can mean starting pulseaudio on a desktop machine, for example, or gnupg on general systems so the user has a gnupg daemon available in their own UID space. This avoids stuffing them into .rc or .profile files, where they're hard to manage if they fail.
In this case, GnuPG is trying and failing to start (likely misconfigured), which in turn means systemd failed to start what was probably the only user service configured, hence the user-manager setup fails. The session is still created (managed via logind) to provide a shell and PAM environment.
On a non-systemd system this stuff would be managed by other scripts and tools that do essentially the same thing (minus being able to start user services).
It also creates a slice for the session, which enables systemd to kill all processes created in a session when the session ends (logout), minus some exceptions that can manage themselves (like tmux). That helps against people who forget they backgrounded and nohup'd a script (which isn't a good way to run things that are supposed to keep running).
If you want to see what services are available in your user session run 'systemctl --user'.
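A couple of related commands give more detail than the bare 'systemctl --user' listing:

systemctl --user list-units --type=service   # services the user manager currently has loaded
systemctl --user list-unit-files             # everything that could be started, and whether it's enabled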
In this case, gnupg was, both for that user and for that host, wholly unconfigured -- evidently it was just roped in by some package dependency. So, trying to run something that depended on the system or user having gnupg configured correctly (or at all) was wholly wrong.
Systemd killing my nohups does not seem like doing me a favor. If I had wanted the process killed when my connection dropped, I would not have nohup'd it. (Later in the article, I use nohup as intended.) How do I turn that off, without breaking other things? Or, failing that, what is the right way to get that behavior, without nohup?
And, what controls which services the session manager will try to run on behalf of a user who has not tried to customize anything? It would probably have been better to turn pushy gpg-agent off, in some system-wide way, than to have entirely deleted it.
Systemd doesn't kill nohups as long as the session exists. Hence your nohup works if you opened tmux or a graphical terminal and close it. But logging out terminates all processes belonging to that session that don't properly disassociate from the session.
You can run 'loginctl enable-linger' to disable this behaviour or set 'KillUserProcesses=no' in /etc/systemd/logind.conf
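Concretely, that looks something like this (the username here is the one from the log above; substitute your own):

# let this user's services keep running after logout
loginctl enable-linger admin

# or, globally, in /etc/systemd/logind.conf (then restart systemd-logind):
[Login]
KillUserProcesses=no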
You can also use 'systemd-run --scope --user /usr/bin/yourscript.sh'; this will explicitly keep the process running even when the session closes, even if lingering processes are killed. It'll also keep track of all processes created by that script, so if the script exits, it'll kill any processes that are left over and clean up. It also meshes with other parts of systemd, so 'systemd-run --scope --user --on-calendar=daily /usr/bin/script' will run the script daily until you tell it to stop or the machine restarts (since it's not configured on disk). You can even keep the service around if the script dies with '--remain-after-exit', which tells systemd to keep the service alive as long as processes for it exist or until you terminate it manually.
If whatever you're running in the background needs forking, you can create a service and start it under your user; that properly keeps it up and running.
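A minimal user unit, as a sketch (the path and names are hypothetical):

# ~/.config/systemd/user/myscript.service
[Unit]
Description=Long-running background script

[Service]
ExecStart=/usr/bin/yourscript.sh
Restart=on-failure

[Install]
WantedBy=default.target

Then 'systemctl --user daemon-reload' followed by 'systemctl --user enable --now myscript.service'.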
Systemd understands a nohup as "don't terminate the process when the terminal closes", not as "don't terminate the process when the session ends", which can be different events depending on the setup (i.e., a graphical terminal or tmux).
The session manager will start any user services that are activated, your distro usually configures that in '/usr/lib/systemd/user'. Manually activated units are in '/etc/systemd/user' if they are system-wide and user config is in '$XDG_CONFIG_HOME/systemd/user' or '$HOME/.local/share/systemd/user'.
To turn it off you can deactivate it via 'systemctl --user disable gpg-agent' for the current user or 'systemctl --global disable gpg-agent' for all users. If it activates via socket activation you can either disable the socket 'systemctl --global disable gpg-agent.socket' or mask the service 'systemctl --global mask gpg-agent'. Masking the service prevents it from being started for any reason.
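On Debian, for example, the gnupg-related user units are typically the following (names can vary by distro and version), so a blanket mask would look roughly like:

systemctl --global mask gpg-agent.service gpg-agent.socket gpg-agent-ssh.socket gpg-agent-extra.socket gpg-agent-browser.socket dirmngr.service dirmngr.socket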
https://zerotier.com does a very nice job of NAT "bouncing", free (hosted) for up to 50 hosts, or you can run your own.
You end up with another "zt" interface, so you can apply firewall rules to that interface, separate from your main NIC(s), kinda like assigning your own VPC but for your laptop, etc. It only takes a few seconds to install the agent and join your private network, and it seems to reconnect quite nicely as well; if your network is private, you can force all new attempts to connect to be approved (where you can check the mac address first). Basically, it's exactly the right level of technical config traits with a simple interface.
(no affiliation, just a very happy user, but I will say that running your own seems to be a lot more difficult than it should be..)
This is also my go-to. It's dead simple and just works. Tbh, the last thing I want to spend my time doing is tinkering with my personal "network architecture". I'm sure I'd manage to fuck it up at some point when I just want to reach my plex server and watch a movie.
Is zerotier FLOSS? I have been looking for something to fulfill this need, but especially after the Solarwinds hack, I am unwilling to run a closed-source agent on any of my hosts.
Looks like the Business Source License [0], which does not qualify under the Open Source Definition [1], but they claim it meets "most" of the OSI criteria.
This[1] is the best post I've seen on that _outside_ of the Tailscale writeup linked below. The clever bit is that the introduction server uses wireguard tunnels to find your endpoint information, then shares it out via DNS. Of course it still requires you to be able to run custom code on all of the endpoints, which requires supporting many different platforms.
I'm still looking for a FLOSS mesh network built on top of wireguard that can do NAT traversal between nodes and fall back to tunnelling traffic for the annoying cases where this fails. I don't really want to use Tailscale because (1) their server is not open (though this should not matter a ton for trust) and (2) they require a Google account or similar to sign up.
I can't see how the post you linked actually does the NAT traversal...
Alice finds Bob's external IP:port using the registry. That makes sense. But doesn't Bob need to send a packet to Alice to setup the NAT traversal on his side? More accurately, Alice uses the SRV field to populate the wg peer information on her side -- but how does Bob know that he needs to update the peer information on his side?
I think this is really close, but maybe using a custom DNS server for this might be trying to be a little too clever?
I also haven't fully grokked this, to be fair, but my understanding is:
(1) Alice and Bob both connect to coordination server
(2) Alice sets Bob's endpoint to bob_ip:bob_port
(3) Bob sets Alice's endpoint to alice_ip:alice_port
(4) Both try to ping each other, which makes both of them originate outbound packets from the wireguard socket.
IF Alice and Bob have the same public IP:port pair in their NAT with each other as they do with the coordination server (which turns out to be true in tests on my EdgeRouter X NAT, but certainly isn't true 100% of the time), then I believe this process will result in a working connection.
If you're asking how Bob & Alice know that the information in the coordination server has changed and that they need to reconfig their endpoints, then I'm not sure. Perhaps they could just re-query the coordination server on a timeout or whenever their connection went down.
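As a sketch of what that looks like on Alice's side once she has Bob's details from the registry (key and endpoint values are placeholders):

wg set wg0 peer <BOB_PUBLIC_KEY> endpoint <bob_ip>:<bob_port> persistent-keepalive 5
ping <bobs_wg_address>    # originate outbound packets so Alice's NAT opens/refreshes its mapping

Bob does the same with Alice's details, and if both NATs reuse the same external mapping, the tunnel comes up.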
I agree - using DNS as a transport seems orthogonal to the NAT punching process, and maybe too clever. If you're going to need to write custom endpoint code either way, I don't see a huge advantage. If wireguard supported DNS-based endpoint information by default (automatically re-resolving IP and detecting port from SRV), then it'd make a lot more sense.
> I can't see how the post you linked actually does the NAT traversal...
It assumes full-cone NAT - you punch a hole once by sending a packet to a third party, like a STUN server (which will, along the way, tell you your external ip:port), and then a full-cone NAT will forward packets coming to that ip:port back to you, regardless of the source.
I use wireguard on a cheap VPS that comes with an IPv6 /64. Connecting devices get a public IPv6 address, most of which are assigned a subdomain so I can remember them. It works wonderfully.
No, I just cobbled it together with standard Linux routing tools.
echo 1 > /proc/sys/net/ipv6/conf/all/forwarding
The trickiest part is making sure that the VPS answers NDP requests for the routed addresses:
echo 1 > /proc/sys/net/ipv6/conf/all/proxy_ndp
for i in `seq 0x0010 0x001f`; do
ip=2001:1111:2222:3333::$(printf '%x' $i)
ip neigh add proxy $ip dev ens3
done
Assign the routed address range to the wireguard interface:
ip addr add 2001:1111:2222:3333::10/124 dev wg0
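The peer entries are then ordinary WireGuard config; a sketch using the example prefix above (keys and the VPS address are placeholders):

# on the VPS, in wg0.conf
[Peer]
PublicKey = <laptop-public-key>
AllowedIPs = 2001:1111:2222:3333::11/128

# on the laptop
[Interface]
PrivateKey = <laptop-private-key>
Address = 2001:1111:2222:3333::11/128

[Peer]
PublicKey = <vps-public-key>
Endpoint = <vps-public-ip>:51820
AllowedIPs = ::/0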
That's the gist of it. I might do a proper writeup next week if I can find time.
So we've known about this for years, but does anyone actually rely on this behavior in production, or are we all still pretending that this clever hack could be patched out of existence any day now?
It clearly works. It's worked for a long time. But the common opinion still seems to be "well, this isn't supposed to work, so using it is a bit dodgy..."
Well, you need just one NAT to be cooperative, like a full-cone NAT. There's enough legitimate interest in NAT traversal that big NAT deployments like CGNATs tend to cooperate.
Traversing non-cooperative, fully symmetric NATs, which randomize ports, is hard even for pwnat. Though in theory it should be doable - you just need a lot of patience to brute-force ports (there are only 64k of them) until it finally clicks.
I've had basically this exact setup running on a cloud server for the past few months. It's pretty liberating to be able to open ports to 10.44.0.0/24 and have all of my different machines easily access them, regardless of what networks they're connected to. I've added in an instance of nsd that manages a zonefile so I can do:
ssh laptop.wg.mydomain.net
which resolves to 10.44.0.3, for example.
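The zonefile behind that is tiny; a sketch (names, serial, and addresses are illustrative):

$ORIGIN wg.mydomain.net.
$TTL 300
@        IN SOA  ns.wg.mydomain.net. hostmaster.mydomain.net. (2021010101 3600 900 604800 300)
@        IN NS   ns.wg.mydomain.net.
ns       IN A    10.44.0.1
laptop   IN A    10.44.0.3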
There are two major downsides with this approach:
(1) all of your traffic must go by way of the bounce server, even if your machines are on the same LAN.
(2) the bounce host can observe all of your traffic.
I've been considering adding a second layer of wireguard on top of this for end-to-end encryption between nodes. Each machine would have one interface configured like this article suggests simply to assign a stable NAT-traversing IP address. Then they'd each have a second interface using the first-level IP addresses as endpoints. This would result in 2x the encryption overhead, both in CPU and more importantly in MTU, but it would mean the bounce host was no longer able to intercept even unencrypted data. If you run a daemon on all of the hosts, they can play with the endpoints used on this second level network to avoid using the bounce host when they find they can directly connect to each other, but the default path would work (if somewhat slowly) by the double tunnel process even without this daemon.
For bonus points, run multiple bounce hosts, connect to all of them from each peer, then select the endpoint to use for each paired peer connection based on the total path latency.
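For the curious, the second-layer interface is just ordinary WireGuard config whose Endpoint is a first-layer address; a sketch (all addresses, ports, and keys are hypothetical):

# wg1.conf on the laptop (inner, end-to-end layer)
[Interface]
PrivateKey = <laptop-inner-private-key>
Address = 10.88.0.3/24
MTU = 1340    # leave room for the outer tunnel's overhead

[Peer]
PublicKey = <desktop-inner-public-key>
Endpoint = 10.44.0.4:51821    # the peer's first-layer (bounce-routed) address
AllowedIPs = 10.88.0.4/32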
"If you have someplace to put more secure equipment on the open net, that would be better.
A [your choice of HW] would do fine, if it runs only your code."
Idea: "Colocation centres" for users' computers instead of data centres for users' data. As directed by the author here, users would store no data on these computers.1
1. By their nature, each user-owned supernode computer would provide some discoverable metadata, as would any router, namely, the IP addresses of the members of the private network/s it supports and which of those IPs are connecting to each other, but this would be unlike the centralised repositories of millions of users' metadata we have now, managed by private tech companies.
As for the rest of this writeup, it sounds much like the need for a supernode in "LAN-over-Internet"-type P2P networks that encapsulate Ethernet-like packets in UDP packets. The supernode does not necessarily need to route traffic between nodes behind NAT (rarely necessary), it only needs to store the equivalent of an ARP table accessible by all of them.
> Idea: "Colocation centres" for users' computers instead of data centres for users' data. As directed by the author here, users would store no data on these computers.
This is basically AWS Workspaces. It can work, but it's annoying, expensive, and various latency and bandwidth issues are a killer.
Love how they market that: "Desktop as a Service". Contrast with tiny, self-served routers owned by users that store no user data and whose only access is via SSH. Who owns the computers in the "AWS Workspace"? You'd need to see the terms the AWS customer must agree to (non-negotiable, no doubt) to get a better understanding of what this service is really about.
A self-hosted "TURN-like" server seems like it would be simpler and more secure -- achieving similar goals, yet still more robust against funky NAT layers than a dynamic DNS solution alone would provide.
I'd love to see a nice guide for setting this up. I've found the world of STUN/TURN/ICE incredibly confusing, probably because of the SIP/telephony background.
Someone linked a write-up above [1]. It's not simple, but I found it well-written, and it doesn't assume too much networking knowledge. I guess it's still a young protocol and we have proofs of concept but not full-fledged, deployment-ready solutions yet...
Nice article. I had been maintaining a few servers doing a similar thing, but found the peer management, address allocation and key creation to be a bit tedious after some time, so I created a tool, https://github.com/naggie/dsnet/ -- it adds peers in one command, with IPv6 and PSK support.
> because Amazon techs and motivated hackers can peek at files you put there (yes, really, all of them! including your secret keys)
What did the author mean by this? I can understand Amazon employees having the ability to peek at private data, but what about "hackers"? Is he referring to the fact that Lightsail is containerized and not running on VMs? (Just a guess.)
I must be missing something. Why go through all of this when you can just buy a domain name, set up dynamic DNS on the NATted network (to keep the IP updated), and then set up wireguard to route to the domain name? If you have two different networks, then just use two different domains or subdomains.
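(The config I have in mind is just a normal peer entry whose Endpoint is the dyndns name; roughly, with hypothetical names, keys, and ranges:)

[Peer]
PublicKey = <home-router-public-key>
Endpoint = myhome.example-dyndns.net:51820
AllowedIPs = 10.0.0.0/24
PersistentKeepalive = 25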
What if you live in a college dorm or use apartment wifi where they do not give you control of the port forwarding rules? Or what if you want to deploy devices into other peoples' networks (e.g. IoT)? What if you're sending someone a bootable USB image so you can do data recovery on their hard drive in situ without them having to ship it to you? What if you're behind a carrier-grade NAT and don't have a public IPv4 address at all? In these cases and many more, you might find a wireguard bounce host useful.
Honestly, I'd probably just use Tailscale for that, so I guess the bounce server solution occupies some middle ground where one doesn't have control over the network and either doesn't have the money for tailscale or likes owning their infra end-to-end.
For me, it's indeed about FLOSS and owning the system end-to-end. Plus, I want to be able to build things like Tailscale, and I can't very well do that just using the existing tools. :)
If that does what you need, then it is a reasonable alternative. I would prefer the shorter routing you get. But if one of your NATs is corporate or a phone carrier's, it might not work.
If you might want to provide other services, particularly a remote exit node or web service, the bounce node is a good start on that.
OP's method would allow remoting into your home network without port forwarding, since the home network would establish a connection to the 'bounce' node, which would facilitate communication with the third WG client.
> Don't see how... Nftables is set up once and then left alone.
Same thing for a router configuration to accept inbound wireguard requests on a home network with dyndns. You set it up once and are done.
I see the benefit of the bounce server if you operate a network in an environment where you don't have the ability to control the router config; however, when you do have the ability to update firewall/router config, then I'd prefer just setting up a domain name and avoid the dependency on a third party server.
Dyndns does work when you have a say in at least one side of each potential connection, unless you want one of your NATted hosts to also act as a bounce server in a pinch.
But you do not seem to be getting that the use of nftables, for the open-network bounce server, is wholly optional.
Ah, okay. Yes, I missed that. So the point of nftables, then, is just to avoid sending REJECT messages, making it harder to determine what port wireguard is operating on?
Yes, it is my preference: When you drop packets, they stop costing you anything further, where rejecting them generates more work for you. And, you are providing attackers free information that you don't need to.
I am not sure the nftables configuration I have is right... It might permit using my bounce server to forward packets that then appear to come from it, if they happen to mention the right port. I would welcome advice.
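For context, the input side of what I'm aiming for is just a drop-by-default policy, roughly like this (a sketch, not my actual ruleset; port numbers illustrative):

table inet filter {
    chain input {
        type filter hook input priority 0; policy drop;
        iif lo accept
        ct state established,related accept
        udp dport 51820 accept    # wireguard
        tcp dport 22 accept       # ssh
    }
}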
After further investigation, I have discovered that dyndns would not solve my problem, because the firewall at one end is especially picky; even zerotier and tailscale admit (grudgingly) that they use bounce servers for such clients.
Yes, you set up the wireguard configuration to connect to the peer using the domain name, and once connected, the network assigns a local IP to the connected peer that is valid on the NATted network. I use this kind of setup to access my NAS system when I am traveling or working away from home.
You don't need to punch through a NAT when you have a bounce server. Both sides are already connected to the bounce server and all interaction between those peers is done through the bounce server. Thus, the peers don't need to communicate directly... so you don't need to do any NAT punching.
That's what makes this method simple... with the downside that you need a publicly exposed intermediary. But it is vastly simpler to set up than STUN.
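Roughly, the bounce host just lists every peer and forwards between them; a sketch (keys, addresses, and port are placeholders):

# /etc/wireguard/wg0.conf on the bounce host
[Interface]
Address = 10.0.0.1/24
ListenPort = 51820
PrivateKey = <bounce-private-key>

[Peer]    # laptop
PublicKey = <laptop-public-key>
AllowedIPs = 10.0.0.2/32

[Peer]    # NAS at home
PublicKey = <nas-public-key>
AllowedIPs = 10.0.0.3/32

Each client points its single [Peer] (the bounce host) at AllowedIPs = 10.0.0.0/24 with PersistentKeepalive set, and the bounce host needs net.ipv4.ip_forward=1 so it will relay between them.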
Seems like a good tutorial. I've been using this type of setup for about 6 months now and it works well for me. I wish I had a straightforward explanation like this when I was first learning how to use WireGuard.
That was my experience setting up a WireGuard connection to my home network about 12 months ago. It was really difficult to get a straightforward explanation of how to set things up. For example, I had (as mentioned above) dynamic DNS set to update from my home router, with port forwarding set up to a home Linux server (an RPi, actually).
This isn't as easy as A <=> B, but it is probably a common enough use-case. It was really hard to figure out what I needed to put in my wg.conf files. But once I got it, it was really solid.
Is there something like "WireGuard in anger" or even something like the GitHub introduction to Git for WireGuard? It seems like it would be very helpful.
Same! I set up something similar with DigitalOcean by combining a couple of tutorials together. I pretty much followed the same steps until the `nftables` stuff.
This works if both sides have wireguard installed and are in the network, but is there a way to get this to work for any connections from the internet? E.g., your VPS has IP 1.1.1.1; you want any traffic that goes to 1.1.1.1 to go to the wireguard interface on your PC, and any traffic sent from the wireguard interface on your PC to exit through 1.1.1.1. Bonus points if you can set it up so that the interface address on your PC shows up as 1.1.1.1.
Yes, this is relatively straightforward to accomplish via iptables PREROUTING and POSTROUTING rules. I think the key insight is to realize that you're basically asking for NAT with a DMZ:
Internet -> [eth0] WG BOX [wg0] -> [wg0] HOME SERVER
analogous to:
Internet -> [eth0] ROUTER [eth1] -> [eth0] HOME SERVER
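So, a minimal sketch of the rules on the WG box, assuming eth0 is the public interface, 10.0.0.2 is the home server's wg address, and you want to expose TCP 443 (addresses and ports are placeholders):

sysctl -w net.ipv4.ip_forward=1
iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 443 -j DNAT --to-destination 10.0.0.2
iptables -t nat -A POSTROUTING -o wg0 -j MASQUERADE

The MASQUERADE line keeps reply routing simple but means the home server sees connections as coming from the WG box; the bonus point (the PC's own outbound traffic appearing as 1.1.1.1) additionally needs a default route through the tunnel plus SNAT on the VPS.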
I'm looking at doing something similar: does the server actually need an IP address assigned to its wg0 interface, or can it work without one? I only need peer-to-peer.
Yes, because of Cryptokey routing[1]. Wireguard works at the IP level and determines which keys to use for encryption and authentication based on the destination IP. That said, if you only need peer-to-peer, you could pick any two RFC-1918 addresses like 10.0.0.1 and 10.0.0.2.
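For a pure point-to-point link, the whole thing can be as small as this (keys and port are placeholders):

# peer A
[Interface]
PrivateKey = <A-private-key>
Address = 10.0.0.1/32
ListenPort = 51820

[Peer]
PublicKey = <B-public-key>
AllowedIPs = 10.0.0.2/32

# peer B mirrors this, with Address = 10.0.0.2/32, AllowedIPs = 10.0.0.1/32,
# and Endpoint = <A's public address>:51820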
Even on my 20.04 box, I seem to have wireguard-dkms auto-installed in response to "apt install wireguard". Perhaps they are doing this to allow for faster wireguard module updates? Or perhaps this is just a bug.