
Friday, May 16, 2025

Kamailio TLS connection lifetime

 TL;DR:

Pay attention to 

modparam("tls", "connection_timeout", 600) # Default

 A bit longer.

Usually, you don't need SIP OPTIONS pings over TCP connections, because you have TCP keepalives built in. And the default registration time of 1h is also generous in the case of TCP (TLS). But the default connection timeout, found in the tls module parameters, is only 10 minutes. So, if your registration interval is longer than 10 minutes and there are no calls on this phone (which is ok as well), you'll get a TLS connection drop every 10 minutes, with a corresponding indication on the phone itself.
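For example, if phones re-register every 30 minutes, the timeout could be raised comfortably above the registration interval (the value below is illustrative, not a recommendation):

```
modparam("tls", "connection_timeout", 2100) # > the 1800s registration interval
```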

 Again, defaults might not be 100% sane.

Tuesday, October 25, 2022

Protect Kamailio from TCP/TLS flood

 After stress-testing Kamailio with the sipflood tool from the sippts suite (which deserves an article of its own), the outcome was not so good.

With the CentOS 7 default OpenSSL library (1.0.2k-fips) and Kamailio 5.4-5.6 with TLS transport, it's quite easy to get a segfault inside the TLS routines. I've found that roughly 10,000 OPTIONS packets sent from 200 threads are enough to crash the Kamailio process.

Basically, you can DoS the whole server, regardless of its power, with a single mid-range computer.

The solution was found in using Kamailio 5.6, but with the tlsa module flavour and the latest OpenSSL 1.1.x compiled from source.

Turns out it's a really simple process. 

As we're going to compile Kamailio anyway, assume that all the necessary build packages are already on the system.

First - we need to get the OpenSSL sources:

# cd /usr/src

# wget https://www.openssl.org/source/openssl-1.1.<latest>.tar.gz

# tar xvf openssl-1.1.<latest>.tar.gz

# cd  openssl-1.1.<latest>

#  ./config

# make

(Optional) Here we can make sure that this release passes its tests:

# yum install perl-Test-Simple

# make test

Next step - point Kamailio to the newly compiled OpenSSL:

# cd /usr/src

# wget https://www.kamailio.org/pub/kamailio/5.6.<latest>/src/kamailio-5.6.<latest>_src.tar.gz

# tar xvf kamailio-5.6.<latest>_src.tar.gz

# cd kamailio-5.6.<latest>

#  sed -i "s?LIBSSL_STATIC_SRCLIB \?= no?LIBSSL_STATIC_SRCLIB \?= yes?g" ./src/modules/tlsa/Makefile

# sed -i "s?LIBSSL_STATIC_SRCPATH \?= /usr/local/src/openssl?LIBSSL_STATIC_SRCPATH \?= /usr/src/openssl-1.1.<latest>?g" ./src/modules/tlsa/Makefile

...

Then comes the usual Kamailio compilation. Don't forget to replace all "tls" module mentions in kamailio.cfg with "tlsa".

The results are much better. But then I found that it's still possible to exhaust all TCP connections on the Kamailio server with this type of flood.

First - ulimit. Never underestimate defaults.  

# ulimit -n unlimited
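Note that `ulimit -n` only affects the current shell session. For a Kamailio instance managed by systemd, the file-descriptor limit has to be raised in the unit itself; a sketch, assuming a systemd-managed service (the path and value are illustrative):

```
# /etc/systemd/system/kamailio.service.d/limits.conf
[Service]
LimitNOFILE=65536
```

followed by `systemctl daemon-reload` and a service restart.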

Next steps - tune TCP stack.

Disclaimer: the options below are debatable, were not discovered by me, and need to be adjusted to your case.

kamailio.cfg

...

tcp_connection_lifetime=3605
tcp_max_connections=4096
tls_max_connections=4096
tcp_connect_timeout=5
tcp_async=yes
tcp_keepidle=5
open_files_limit=4096

...
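With `tcp_keepidle` that low, it may also be worth setting the remaining core keepalive parameters explicitly; a sketch with illustrative values:

```
tcp_keepalive=yes     # enabled by default, stated here for clarity
tcp_keepintvl=5       # seconds between unanswered keepalive probes
tcp_keepcnt=3         # probes before the connection is declared dead
```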

/etc/sysctl.conf

...

# To increase the amount of memory available for socket input/output queues
net.ipv4.tcp_rmem = 4096 25165824 25165824
net.core.rmem_max = 25165824
net.core.rmem_default = 25165824
net.ipv4.tcp_wmem = 4096 65536 25165824
net.core.wmem_max = 25165824
net.core.wmem_default = 65536
net.core.optmem_max = 25165824

# To limit the maximum number of requests queued to a listen socket
net.core.somaxconn = 128

# Tells TCP to instead make decisions that would prefer lower latency.
net.ipv4.tcp_low_latency=1

# Optional (it will increase performance)
net.core.netdev_max_backlog = 1000
net.ipv4.tcp_max_syn_backlog = 128
...
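After editing /etc/sysctl.conf, the values have to be loaded with `sysctl -p`. A quick read-only spot-check that one of them actually took effect:

```shell
# Spot-check one of the applied values (read-only, no root needed)
cat /proc/sys/net/core/somaxconn
```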

This will help, but not fully (at least in my case; I must have missed something, and comments here are really welcome).

For the second part, I decided to go with Fail2Ban and block the flood at the iptables level.

Setup is quite simple as well.

First - make sure Kamailio will log flood attempts:

kamailio.cfg

...

 loadmodule "pike.so"

modparam("pike", "sampling_time_unit", 2)
modparam("pike", "reqs_density_per_unit", 30)
modparam("pike", "remove_latency", 120)

...

if (!pike_check_req()) {
    xlog("L_ALERT", "[SIP-FIREWALL][FAIL2BAN] $si\n");

    $sht(ipban=>$si) = 1;
    if ($proto != "udp") {
        tcp_close_connection();
    }
    drop;
}

...
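The snippet above references an `ipban` hash table that isn't declared in this excerpt; assuming a definition along these lines (size and expiry are illustrative):

```
loadmodule "htable.so"
modparam("htable", "htable", "ipban=>size=8;autoexpire=300;")
```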

Next - install and configure Fail2Ban

# yum install -y fail2ban

 /etc/fail2ban/jail.local

[DEFAULT]
# Ban hosts for one hour:
bantime = 3600

# Override /etc/fail2ban/jail.d/00-firewalld.conf:
banaction = iptables-multiport
action      = %(action_mwl)s

[mypbx-iptables]
enabled  = true
filter   = mypbx
action   = iptables-mypbx[name=mypbx, protocol=tcp, blocktype='REJECT --reject-with tcp-reset']
           sendmail[sender=<sender_addr>, dest=<dest_addr>, sendername=Fail2Ban]
logpath  = <your_kamailio_logfile>
maxretry = 1
bantime  = 3600s
findtime = 10s
 

 /etc/fail2ban/action.d/iptables-mypbx.conf

[INCLUDES]

before = iptables-common.conf

[Definition]

actionstart = <iptables> -N f2b-<name>
              <iptables> -A f2b-<name> -j <returntype>
              <iptables> -I <chain> -p <protocol> -j f2b-<name>

actionstop = <iptables> -D <chain> -p <protocol>  -j f2b-<name>
             <actionflush>
             <iptables> -X f2b-<name>


actioncheck = <iptables> -n -L <chain> | grep -q 'f2b-<name>[ \t]'

actionban = <iptables> -I f2b-<name> 1 -s <ip> -p <protocol> -j <blocktype>

actionunban = <iptables> -D f2b-<name> -s <ip> -p <protocol> -j <blocktype>

 

 /etc/fail2ban/filter.d/mypbx.local

[Definition]
# filter for kamailio messages
failregex = \[SIP-FIREWALL\]\[FAIL2BAN\] <HOST>$
 
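Before restarting Fail2Ban, the failregex can be sanity-checked against a sample log line. Fail2Ban expands `<HOST>` into a host-matching pattern; a rough stand-in with plain grep and a simple IPv4-only pattern (both the sample line and the pattern are illustrative):

```shell
# Emulate Fail2Ban's <HOST> substitution with a basic IPv4 pattern;
# grep prints the line if it matches
echo '[SIP-FIREWALL][FAIL2BAN] 203.0.113.7' \
  | grep -E '\[SIP-FIREWALL\]\[FAIL2BAN\] [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$'
```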

# systemctl enable fail2ban

# systemctl start fail2ban

With this in place, offending hosts get banned at the iptables level.



Thursday, October 29, 2020

Kamailio and mobile TCP endpoints. Unregistering

The modern world is mobile. You may like it or not, but it is. So, with all these technologies aimed at saving your battery life, all application reachability goes through vendor-locked push servers.
 
But the problem emerges in the following. A SIP client registers on the registrar (Kamailio) over TCP. Yes, you can do SIP keepalives here, but why have them if TCP has a built-in keepalive mechanism of its own? Plus, additional packets over the network drain the battery faster. And iOS, when putting an app into the background, simply cuts the app's network off. Literally killing it.
 
But we somehow need to know what the actual state of the endpoint is. And here Kamailio helps us with the tcpops module.
 
The idea is quite simple. On each query to the location table, we clean it up from "dead" TCP connections - that is, right after save(). That way only endpoints in an actually live state are taken into the branching and calling process. Yes, it will produce an extra initial INVITE over an already 'dead' connection, and, as it appears, it sometimes cleans up just-arrived REGISTERs, but that's acceptable.


# Save info for TCP connections for unregister on close/timeout/error
loadmodule "htable.so"

modparam("htable", "htable", "tcpconn=>size=15;autoexpire=7200;")

...

# Handle SIP registrations
route[REGISTRAR] {

   ...

    save("location");
    route(TRACK_TCP_STATE);

    $var(processed_subscriber) = $fu;

    route(TCP_REGISISTER_CLEANUP);

   ...

}

 

route[TRACK_TCP_STATE] {
    if (proto == UDP) {
        return;
    }
    xlog("[TRACK_TCP_STATE] Saving state for future disconnect tracking of $fu\n");

    $sht(tcpconn=>$conid) = $fu;


    tcp_enable_closed_event();
}


# Make sure you set up $var(processed_subscriber) before calling this route

route[TCP_REGISISTER_CLEANUP] {

    if ($var(processed_subscriber) == $null) {
        return;
    }

    xlog("[TCP_REGISISTER_CLEANUP] Processing subscriber $var(processed_subscriber)\n");

    # Getting registered endpoints for this AoR
    if (!reg_fetch_contacts("location", "$var(processed_subscriber)", "subscriber")) {
        xlog("[TCP_REGISISTER_CLEANUP] No registered contacts for $var(processed_subscriber)\n");
        $var(processed_subscriber) = $null;
        return;
    }

    $var(i) = 0;

    # Loop through registered endpoints
    while ($var(i) < $(ulc(subscriber=>count))) {

        $var(stored_subscriber_conid) = $(ulc(subscriber=>conid)[$var(i)]);

        # Make sure proto is TCP
        if ($var(stored_subscriber_conid) != $null) {

            # Check if entry is still active TCP connection. Unregister otherwise.
            if ($var(stored_subscriber_conid) == -1 || !tcp_conid_alive("$var(stored_subscriber_conid)")) {

                $var(stored_subscriber_ruid) = $(ulc(subscriber=>ruid)[$var(i)]);
                $var(stored_subscriber_address) = $(ulc(subscriber=>addr)[$var(i)]);

                xlog("[TCP_REGISISTER_CLEANUP]: Unregistering entry $var(i)/$var(stored_subscriber_conid) -> $var(stored_subscriber_address)\n");

                if (!unregister("location", "$var(processed_subscriber)", "$var(stored_subscriber_ruid)")) {
                    xlog("[TCP_REGISISTER_CLEANUP]: Unregistering entry $var(i)/$var(stored_subscriber_conid) -> $var(stored_subscriber_address) FAILED!\n");
                }
            }
        }
        $var(i) = $var(i) + 1;
    }
    $var(processed_subscriber) = $null;
}


event_route[tcp:closed] {
    xlog("[TCP:CLOSED] $proto connection closed conid=$conid\n");

    $var(processed_subscriber) = $sht(tcpconn=>$conid);

    if ($var(processed_subscriber) != $null) {
        route(TCP_REGISISTER_CLEANUP);
        $sht(tcpconn=>$conid) = $null;
    }
}

event_route[tcp:timeout] {
    xlog("[TCP:TIMEOUT] $proto connection timeout conid=$conid\n");

    $var(processed_subscriber) = $sht(tcpconn=>$conid);

    if ($var(processed_subscriber) != $null) {
        route(TCP_REGISISTER_CLEANUP);
        $sht(tcpconn=>$conid) = $null;
    }
}

event_route[tcp:reset] {
    xlog("[TCP:RESET] $proto connection reset conid=$conid\n");

    $var(processed_subscriber) = $sht(tcpconn=>$conid);

    if ($var(processed_subscriber) != $null) {
        route(TCP_REGISISTER_CLEANUP);
        $sht(tcpconn=>$conid) = $null;
    }
}
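The three event_routes above share an identical body; one way to keep them in sync is a common route (a sketch - the TCP_CONN_GONE route name is made up here):

```
route[TCP_CONN_GONE] {
    $var(processed_subscriber) = $sht(tcpconn=>$conid);

    if ($var(processed_subscriber) != $null) {
        route(TCP_REGISISTER_CLEANUP);
        $sht(tcpconn=>$conid) = $null;
    }
}

event_route[tcp:closed]  { xlog("[TCP:CLOSED] $proto connection closed conid=$conid\n");   route(TCP_CONN_GONE); }
event_route[tcp:timeout] { xlog("[TCP:TIMEOUT] $proto connection timeout conid=$conid\n"); route(TCP_CONN_GONE); }
event_route[tcp:reset]   { xlog("[TCP:RESET] $proto connection reset conid=$conid\n");     route(TCP_CONN_GONE); }
```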


A simple and efficient solution. Yes, some "dead" endpoints will still be present in the table until they expire, but I'm ok with that.