Fixing xRDP Over VPN: The Ultimate Optimizer for Training Labs

Cisco Live is coming up again. Same problem as Melbourne.
Jörg and I had everything ready. Workshop materials, lab environments, the whole setup. Everything worked perfectly in testing. Then the actual workshop started. Twenty people connected over Cisco Secure Client VPN. xRDP sessions ground to a halt. Some participants got locked out of their sessions entirely. Others saw their work disappear when the VPN hiccupped for ten seconds.
We've been dealing with this for years. Every training event, same story. Standard xRDP optimization guides didn't help. The usual tuning (tcp_nodelay, color depth, compression) barely made a dent. Something else was breaking under VPN load.
Last month, I finally had time to dig in properly. What I found wasn't a single issue. It was five separate problems, all converging when you put real load on xRDP over a VPN tunnel. Each one fixable with sysctl tweaks and config changes. Together, they were making sessions unusable.
I've packaged everything into an automation script. Open sourced it at github.com/beye91/xrdp-ultimate-optimizer. This post walks through what I found and why it matters if you're running xRDP over any kind of tunnel.

The Scenario
Training lab with over 20 users connecting via RDP to our Ubuntu jumphost, which runs xRDP with XFCE. Participants connect from the event venue over Cisco Secure Client VPN. The VPN terminates at a firewall, which forwards RDP traffic to the lab network.
Under this load, we saw:
- Sessions dropping every few minutes
- 21% packet loss on RDP connections
- Users locked out for hours when sessions crashed
- Work lost whenever the VPN reconnected
Standard xRDP tuning didn't touch it. The basic optimization I wrote about in 2024 helped with responsiveness, but didn't stop the crashes.
The problem was deeper. VPN tunnels change everything about how TCP behaves. Default Linux network settings assume you're on a clean LAN. When you're tunneling through a VPN, those assumptions break.
Issue #1: TCP Buffer Exhaustion
First thing I noticed in the logs: constant socket buffer warnings.
syslog: xrdp-sesman[1234]: cannot set socket option TCP_MAXSEG
kernel: TCP: too many orphaned sockets
xRDP requests 4MB socket buffers for each connection. Makes sense for RDP, which streams a lot of screen updates. But Linux's default net.core.wmem_max is 212KB. Twenty times smaller.
Under load, those buffers overflow. The kernel drops packets. Connections stall. Sessions crash.
The fix is simple. Bump the maximum socket buffer size to 16MB in /etc/sysctl.conf:
net.core.wmem_max = 16777216
net.core.rmem_max = 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216

Apply with sysctl -p. Check it worked with sysctl net.core.wmem_max.
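To see whether the larger buffers actually get used under load, ss can show per-socket memory. A quick check (just a sketch, assuming xRDP is listening on the default port 3389):

# -m shows socket memory (skmem), -i shows TCP internals for each RDP connection
ss -tmi '( sport = :3389 or dport = :3389 )'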
This alone cut session drops by 60%. But we still had issues.
Issue #2: MTU Black Hole
VPN tunnels reduce the effective MTU. Cisco Secure Client typically caps it at 1350 bytes because of IPsec overhead. But our Ubuntu servers didn't know that. They were sending 1500-byte packets.
Here's what happens:
- Server sends a 1500-byte packet
- VPN gateway needs to fragment it, but the packet has "don't fragment" set
- Gateway tries to send ICMP "fragmentation needed" back to the server
- Firewall blocks ICMP (because of course it does)
- Packet vanishes. No error. No retry. Just gone.
This is called an MTU black hole. It's invisible to standard diagnostics because the packets disappear silently. You just see random connection stalls.
I found it by running tcpdump on both the server and the VPN gateway simultaneously. Saw packets leaving the server, never arriving at the gateway. No ICMP responses.
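A faster way to confirm the same thing is to send full-size pings with the don't-fragment bit set and see what makes it through. A sketch (192.0.2.1 stands in for your VPN gateway; payload plus 28 bytes of IP/ICMP headers gives the packet size):

# 1472 + 28 = 1500 bytes, DF set -- this one should vanish in a black hole
ping -M do -s 1472 -c 3 192.0.2.1
# 1322 + 28 = 1350 bytes -- this one should make it through the tunnel
ping -M do -s 1322 -c 3 192.0.2.1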
Three fixes needed:
Set server MTU to 1280:
ip link set dev ens18 mtu 1280

Make it persistent by adding to /etc/netplan/00-installer-config.yaml:

network:
  ethernets:
    ens18:
      mtu: 1280

Enable Path MTU Discovery:

echo "net.ipv4.tcp_mtu_probing = 1" >> /etc/sysctl.conf
sysctl -p

Add MSS clamping in iptables:

iptables -A FORWARD -p tcp --tcp-flags SYN,RST SYN -j TCPMSS --clamp-mss-to-pmtu

This tells the kernel to actively probe for the correct MTU and clamp the TCP Maximum Segment Size to match.
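To double-check that all three pieces landed (a sketch; ens18 is the interface from the example above):

ip link show ens18 | grep mtu      # should report mtu 1280
sysctl net.ipv4.tcp_mtu_probing    # should return 1
iptables -S FORWARD | grep TCPMSS  # should list the clamp rule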
After this, packet loss dropped from 21% to under 1%. But sessions still died when the VPN reconnected.
Issue #3: Two-Hour Keepalive
Default tcp_keepalive_time is 7200 seconds. Two hours.
When a client's VPN reconnects, they get a new IP address. The server doesn't know the old connection is dead. It keeps the session alive, waiting for the old IP to send data. After two hours, it finally sends a keepalive probe, gets no response, and kills the session.
Meanwhile, the client has reconnected from a new IP. They're trying to start a new session. But the server says "you already have a session running" and refuses. User is locked out until the old session times out.
Two hours is unacceptable for a training environment.
Fix it by dropping keepalive to 60 seconds:
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

Now dead connections are detected in 2 minutes (60 + 10*6 = 120 seconds). VPN drops, server detects it quickly, client can reconnect.
This fixed the lockout issue entirely.
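You can verify the new values are actually in effect by watching the keepalive timer on a live connection with ss. A sketch, assuming keepalive is enabled on the xRDP sockets (the tcp_keepalive setting in xrdp.ini) and the default RDP port:

# the timer:(keepalive,...) column should now cycle within 60 seconds, not 2 hours
ss -to state established '( sport = :3389 )'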
Issue #4: Sessions Killed on Disconnect
Even with fast keepalive, we had another problem. Users' work was disappearing.
Found it in /etc/xrdp/sesman.ini:
[Sessions]
KillDisconnected=true

By default, xRDP terminates your session when you disconnect. The assumption is that "disconnect" means "I'm done." But in a VPN environment, disconnect usually means "my VPN hiccupped for 10 seconds."
Change it to:
[Sessions]
KillDisconnected=false

Now sessions persist. Users can reconnect and pick up exactly where they left off. Even if the VPN drops for a minute, their work is still there.
This is standard for production RDP environments but somehow not the default for xRDP.
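One trade-off to be aware of: with KillDisconnected=false, abandoned sessions hang around until someone logs out or the server reboots. If that bothers you, sesman.ini also has a DisconnectedTimeLimit setting (in seconds, 0 meaning never) that reaps long-disconnected sessions, so a middle ground might look like this:

[Sessions]
KillDisconnected=false
; kill sessions that have been disconnected for more than 24 hours
DisconnectedTimeLimit=86400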
Issue #5: XFCE Session Corruption
Last issue was weird. Users would log out normally. Next login, the session would freeze at "Starting session..." Then segfault.
Found this in ~/.xsession-errors:
xfce4-session[2345]: ICE I/O Error
xfce4-session[2345]: Segmentation fault
Ubuntu 22.04 ships with xfce4-session 4.16. There's a known bug where it leaves stale ICE (Inter-Client Exchange) sockets in /tmp/.ICE-unix/ on logout. Next login, xfce4-session tries to reuse them, fails, and crashes.
The fix is a cleanup script that runs on logout:
#!/bin/bash
# /usr/local/bin/xfce-cleanup.sh
# Remove stale ICE sockets left behind by xfce4-session
rm -rf /tmp/.ICE-unix/*
# Clear cached XFCE session state so the next login starts clean
rm -rf ~/.cache/sessions/*

Hook it into the logout process by adding to /etc/xrdp/startwm.sh:

trap '/usr/local/bin/xfce-cleanup.sh' EXIT

Now every logout clears the stale sockets. No more corruption.
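One detail that's easy to miss: the cleanup script has to be executable, or the trap will fail with a permission error on logout:

chmod +x /usr/local/bin/xfce-cleanup.sh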
This one took forever to track down because it only happened intermittently. Depended on timing between the logout and the next login.
Results
Before these fixes:
- 21% packet loss on RDP connections
- Sessions dropping every 10-15 minutes
- Users locked out for hours after VPN reconnects
- Work lost on every disconnect
After:
- 0.002% packet loss
- Sessions stable for days
- VPN reconnects recovered in under 2 minutes
- Work persists across disconnects
The difference is night and day. Jörg's last workshop ran perfectly. Twenty people, three days, zero RDP issues.
The Automation Script
I packaged all five fixes into a single Bash script. It analyzes your current configuration, applies the fixes, and can roll back if something breaks.
Get it at github.com/beye91/xrdp-ultimate-optimizer.
Features:
- Checks current sysctl and xRDP config
- Shows what will change before applying
- Backs up original configs
- Can restore from backup
- Includes the XFCE cleanup script
- Validates MTU settings
- Adds iptables rules for MSS clamping
Run it like this:
git clone https://github.com/beye91/xrdp-ultimate-optimizer
cd xrdp-ultimate-optimizer
sudo ./optimize-xrdp.sh

It'll show you what it's about to change. Confirm, and it applies everything. Restart xRDP, and you're done.
If something breaks, roll back:
sudo ./optimize-xrdp.sh --restore

When to Use This
These fixes matter most when you're running xRDP over:
- VPN tunnels (Cisco Secure Client, OpenVPN, WireGuard)
- Cloud environments with network address translation
- Any scenario with reduced MTU
- High-latency or lossy links
If you're on a clean LAN, you probably don't need this. Standard xRDP tuning is enough.
But if your sessions are unstable, start with the buffer sizes and keepalive settings. Those two alone fix most VPN-related issues. The MTU fixes and session persistence are icing.
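If you want to apply just those two by hand, a drop-in file keeps them separate from the rest of your sysctl config. A minimal sketch (the filename is my choice; the values are the ones from above):

# /etc/sysctl.d/99-xrdp-vpn.conf
net.core.wmem_max = 16777216
net.core.rmem_max = 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 6

Load it without a reboot with sysctl --system.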
Key Takeaways
The individual fixes aren't complicated. They're standard TCP tuning that any network engineer would recognize. But finding which settings matter for VPN environments took real debugging.
Buffer exhaustion and keepalive timing were the big ones. Without those, nothing else mattered. MTU black holes were harder to find but just as critical once I knew to look for them.
If you're dealing with xRDP over tunnels and sessions keep crashing, grab the script and run it. Or apply the fixes manually. Either way, it'll save you hours of troubleshooting.
Have you dealt with RDP performance over VPN? What fixed it for you? I'd love to hear about other edge cases I haven't run into yet.






