Fixing xRDP Over VPN: The Ultimate Optimizer for Training Labs

Cisco Live is coming up again. Same problem as Melbourne.
Jörg and I had everything ready. Workshop materials, lab environments, the whole setup. Everything worked perfectly in testing. Then the actual workshop started. Twenty people connected over Cisco Secure Client VPN. xRDP sessions ground to a halt. Some participants got locked out of their sessions entirely. Others saw their work disappear when the VPN hiccupped for ten seconds.
We've been dealing with this for years. Every training event, same story. Standard xRDP optimization guides didn't help. The usual tuning (tcp_nodelay, color depth, compression) barely made a dent. Something else was breaking under VPN load.
Last month, I finally had time to dig in properly. What I found wasn't a single issue. It was five separate problems, all converging when you put real load on xRDP over a VPN tunnel. Each one fixable with sysctl tweaks and config changes. Together, they were making sessions unusable.
I've packaged everything into an automation script. Open sourced it at github.com/beye91/xrdp-ultimate-optimizer. This post walks through what I found and why it matters if you're running xRDP over any kind of tunnel.

The Scenario
Training lab with over 20 users connecting via RDP to our Ubuntu jumphost, which runs xRDP with XFCE. Participants connect from the event venue over Cisco Secure Client VPN. The VPN terminates at a firewall, which forwards RDP traffic to the lab network.
Under this load, we saw:
- Sessions dropping every few minutes
- 21% packet loss on RDP connections
- Users locked out for hours when sessions crashed
- Work lost whenever the VPN reconnected
Standard xRDP tuning didn't touch it. The basic optimization I wrote about in 2024 helped with responsiveness, but didn't stop the crashes.
The problem was deeper. VPN tunnels change everything about how TCP behaves. Default Linux network settings assume you're on a clean LAN. When you're tunneling through a VPN, those assumptions break.
Issue #1: TCP Buffer Exhaustion
First thing I noticed in the logs: constant socket buffer warnings.
syslog: xrdp-sesman[1234]: cannot set socket option TCP_MAXSEG
kernel: TCP: too many orphaned sockets
xRDP requests 4MB socket buffers for each connection. Makes sense for RDP, which streams a lot of screen updates. But Linux's default net.core.wmem_max is 212KB. Twenty times smaller.
Under load, those buffers overflow. The kernel drops packets. Connections stall. Sessions crash.
The fix is simple. Bump the maximum socket buffer size to 16MB in /etc/sysctl.conf:
net.core.wmem_max = 16777216
net.core.rmem_max = 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216

Apply with sysctl -p. Check it worked with sysctl net.core.wmem_max.
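To see whether the larger buffers actually get used under load, ss can show per-socket memory. A quick check (just a sketch, assuming xRDP is listening on the default port 3389):

# -m shows socket memory (skmem), -i shows TCP internals for each RDP connection
ss -tmi '( sport = :3389 or dport = :3389 )'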
This alone cut session drops by 60%. But we still had issues.
Issue #2: MTU Black Hole
VPN tunnels reduce the effective MTU. Cisco Secure Client typically caps it at 1350 bytes because of IPsec overhead. But our Ubuntu servers didn't know that. They were sending 1500-byte packets.
Here's what happens:
- Server sends a 1500-byte packet
- VPN gateway needs to fragment it, but the packet has "don't fragment" set
- Gateway tries to send ICMP "fragmentation needed" back to the server
- Firewall blocks ICMP (because of course it does)
- Packet vanishes. No error. No retry. Just gone.
This is called an MTU black hole. It's invisible to standard diagnostics because the packets disappear silently. You just see random connection stalls.
I found it by running tcpdump on both the server and the VPN gateway simultaneously. Saw packets leaving the server, never arriving at the gateway. No ICMP responses.
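A faster way to confirm the same thing is to send full-size pings with the don't-fragment bit set and see what makes it through. A sketch (192.0.2.1 stands in for your VPN gateway; payload plus 28 bytes of IP/ICMP headers gives the packet size):

# 1472 + 28 = 1500 bytes, DF set -- this one should vanish in a black hole
ping -M do -s 1472 -c 3 192.0.2.1
# 1322 + 28 = 1350 bytes -- this one should make it through the tunnel
ping -M do -s 1322 -c 3 192.0.2.1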
Three fixes needed:
Set server MTU to 1280:
ip link set dev ens18 mtu 1280

Make it persistent by adding to /etc/netplan/00-installer-config.yaml:

network:
  ethernets:
    ens18:
      mtu: 1280

Enable Path MTU Discovery:

echo "net.ipv4.tcp_mtu_probing = 1" >> /etc/sysctl.conf
sysctl -p

Add MSS clamping in iptables:

iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

This tells the kernel to actively probe for the correct MTU and clamp the TCP Maximum Segment Size to match.
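To double-check that all three pieces landed (a sketch; ens18 is the interface from the example above):

ip link show ens18 | grep mtu      # should report mtu 1280
sysctl net.ipv4.tcp_mtu_probing    # should return 1
iptables -S FORWARD | grep TCPMSS  # should list the clamp rule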
After this, packet loss dropped from 21% to under 1%. But sessions still died when the VPN reconnected.
Issue #3: Two-Hour Keepalive
Default tcp_keepalive_time is 7200 seconds. Two hours.
When a client's VPN reconnects, they get a new IP address. The server doesn't know the old connection is dead. It keeps the session alive, waiting for the old IP to send data. After two hours, it finally sends a keepalive probe, gets no response, and kills the session.
Meanwhile, the client has reconnected from a new IP. They're trying to start a new session. But the server says "you already have a session running" and refuses. User is locked out until the old session times out.
Two hours is unacceptable for a training environment.
Fix it by dropping keepalive to 60 seconds:
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

Now dead connections are detected in 2 minutes (60 + 10*6 = 120 seconds). VPN drops, server detects it quickly, client can reconnect.
This fixed the lockout issue entirely.
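You can verify the new values are actually in effect by watching the keepalive timer on a live connection with ss. A sketch, assuming keepalive is enabled on the xRDP sockets (the tcp_keepalive setting in xrdp.ini) and the default RDP port:

# the timer:(keepalive,...) column should now cycle within 60 seconds, not 2 hours
ss -to state established '( sport = :3389 )'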
Issue #4: Sessions Killed on Disconnect
Even with fast keepalive, we had another problem. Users' work was disappearing.
Found it in /etc/xrdp/sesman.ini:
[Sessions]
KillDisconnected=true

By default, xRDP terminates your session when you disconnect. The assumption is that "disconnect" means "I'm done." But in a VPN environment, disconnect usually means "my VPN hiccupped for 10 seconds."
Change it to:
[Sessions]
KillDisconnected=false

Now sessions persist. Users can reconnect and pick up exactly where they left off. Even if the VPN drops for a minute, their work is still there.
This is standard for production RDP environments but somehow not the default for xRDP.
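One trade-off to be aware of: with KillDisconnected=false, abandoned sessions hang around until someone logs out or the server reboots. If that bothers you, sesman.ini also has a DisconnectedTimeLimit setting (in seconds, 0 meaning never) that reaps long-disconnected sessions, so a middle ground might look like this:

[Sessions]
KillDisconnected=false
; kill sessions that have been disconnected for more than 24 hours
DisconnectedTimeLimit=86400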
Issue #5: XFCE Session Corruption
Last issue was weird. Users would log out normally. Next login, the session would freeze at "Starting session..." Then segfault.
Found this in ~/.xsession-errors:
xfce4-session[2345]: ICE I/O Error
xfce4-session[2345]: Segmentation fault
Ubuntu 22.04 ships with xfce4-session 4.16. There's a known bug where it leaves stale ICE (Inter-Client Exchange) sockets in /tmp/.ICE-unix/ on logout. Next login, xfce4-session tries to reuse them, fails, and crashes.
The fix is a cleanup script that runs on logout:
#!/bin/bash
# /usr/local/bin/xfce-cleanup.sh
# Remove stale ICE sockets left behind by xfce4-session
rm -rf /tmp/.ICE-unix/*
# Clear cached XFCE session state so the next login starts clean
rm -rf ~/.cache/sessions/*

Hook it into the logout process by adding to /etc/xrdp/startwm.sh:

trap '/usr/local/bin/xfce-cleanup.sh' EXIT

Now every logout clears the stale sockets. No more corruption.
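One detail that's easy to miss: the cleanup script has to be executable, or the trap will fail with a permission error on logout:

chmod +x /usr/local/bin/xfce-cleanup.sh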
This one took forever to track down because it only happened intermittently. Depended on timing between the logout and the next login.
Results
Before these fixes:
- 21% packet loss on RDP connections
- Sessions dropping every 10-15 minutes
- Users locked out for hours after VPN reconnects
- Work lost on every disconnect
After:
- 0.002% packet loss
- Sessions stable for days
- VPN reconnects recovered in under 2 minutes
- Work persists across disconnects
The difference is night and day. Jörg's last workshop ran perfectly. Twenty people, three days, zero RDP issues.
The Automation Script
I packaged all five fixes into a single Bash script. It analyzes your current configuration, applies the fixes, and can roll back if something breaks.
Get it at github.com/beye91/xrdp-ultimate-optimizer.
Features:
- Checks current sysctl and xRDP config
- Shows what will change before applying
- Backs up original configs
- Can restore from backup
- Includes the XFCE cleanup script
- Validates MTU settings
- Adds iptables rules for MSS clamping
Run it like this:
git clone https://github.com/beye91/xrdp-ultimate-optimizer
cd xrdp-ultimate-optimizer
sudo ./optimize-xrdp.sh

It'll show you what it's about to change. Confirm, and it applies everything. Restart xRDP, and you're done.
If something breaks, roll back:
sudo ./optimize-xrdp.sh --restore

When to Use This
These fixes matter most when you're running xRDP over:
- VPN tunnels (Cisco Secure Client, OpenVPN, WireGuard)
- Cloud environments with network address translation
- Any scenario with reduced MTU
- High-latency or lossy links
If you're on a clean LAN, you probably don't need this. Standard xRDP tuning is enough.
But if your sessions are unstable, start with the buffer sizes and keepalive settings. Those two alone fix most VPN-related issues. The MTU fixes and session persistence are icing.
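If you want to apply just those two by hand, a drop-in file keeps them separate from the rest of your sysctl config. A minimal sketch (the filename is my choice; the values are the ones from above):

# /etc/sysctl.d/99-xrdp-vpn.conf
net.core.wmem_max = 16777216
net.core.rmem_max = 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

Load it without a reboot with sysctl --system.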
Key Takeaways
The individual fixes aren't complicated. They're standard TCP tuning that any network engineer would recognize. But finding which settings matter for VPN environments took real debugging.
Buffer exhaustion and keepalive timing were the big ones. Without those, nothing else mattered. MTU black holes were harder to find but just as critical once I knew to look for them.
If you're dealing with xRDP over tunnels and sessions keep crashing, grab the script and run it. Or apply the fixes manually. Either way, it'll save you hours of troubleshooting.
Have you dealt with RDP performance over VPN? What fixed it for you? I'd love to hear about other edge cases I haven't run into yet.






