Monday, July 29, 2024

Sipsak → sipexer for small SIP tests

I think a lot of people who work with SIP know and use sipsak. It's a Swiss-army knife for SIP, as it describes itself. But not anymore. Somewhere in the migration to OpenSSL 3.0 and newer distributions, proper TLS transport support was lost. Or, in other words, I cannot make it work on Alma 9.

 

So, meet a replacement: sipexer, from the Kamailio author. It does mainly the same (for me) and supports TLS and WebSocket out of the box. As a bonus for simple testing, you get the SIP reply code as the return code of the program itself.

So, instead of checks like

sipsak -s sip:check_server_health@localhost:5061 --transport=tls
if [ $? -ne 0 ]; then
  restart_kamailio
fi

you can do

sipexer -timeout-write 500 -timeout 500 -vl 0 -mt options -ruser check_server_health tls:localhost:5061 > /dev/null
if [ $? -ne 200 ]; then
  restart_kamailio
fi

 

Quite a simple replacement, and since sipexer is written in Go, you can just get the binary and use it right away.
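If the health check has to live inside a monitoring script rather than a shell one-liner, the same idea can be sketched in Python. This is only a sketch: the sipexer arguments are copied from the command above, while the function names and the injectable runner parameter are my own illustration and not part of sipexer.

```python
import subprocess

# Same arguments as the shell example in the post.
SIPEXER_CMD = [
    "sipexer", "-timeout-write", "500", "-timeout", "500", "-vl", "0",
    "-mt", "options", "-ruser", "check_server_health", "tls:localhost:5061",
]

def sip_reply_code(cmd=SIPEXER_CMD, runner=subprocess.run):
    """Run the check and return the process exit code.

    Per the post, sipexer reports the SIP reply code as its exit code.
    `runner` is a hypothetical hook so the function can be exercised
    without a real sipexer binary.
    """
    result = runner(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode

def needs_restart(reply_code):
    """Anything other than 200 OK is treated as an unhealthy server."""
    return reply_code != 200
```

A caller would then do something like needs_restart(sip_reply_code()) and trigger its own restart logic when the result is true.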

I like it when things become simpler.

Thursday, June 27, 2024

No or distorted sound using SRTP and rtpengine

 

TL;DR;

SDES-pad 

Long version:

I recently had a chance to upgrade rtpengine from version 8 to version 12.

For those who don't know what it is: it's one of the most popular RTP proxies in the open-source ecosystem, and nowadays it usually comes with Kamailio and OpenSIPS as the default RTP proxy option.

And I must say, it's really stable software with great backwards compatibility. I did not expect any issues with this upgrade, but quite quickly I ran into a situation where, when receiving calls on Linphone (mobile, if it matters to someone), the sound was distorted on Android and completely absent on iOS.

First I blamed the new cipher suites that had been added, and disabled them (the SDES-no-<CIPHER> string in rtpengine_manage(), if someone is interested). But that did not help.

In the process of searching I found this issue on GitHub. And yes, turning on SDES-pad resolved the issue, but I was really interested in why this option was introduced in the first place.

So, according to the documentation,

pad

   RFC 4568 (section 6.1) is somewhat ambiguous regarding the base64 encoding format of a=crypto parameters added to an SDP body. The default interpretation is that trailing = characters used for padding should be omitted. With this flag set, these padding characters will be left in place.

Hm, let's read the RFC about it.

   When base64 decoding the key and salt, padding characters (i.e., one or two "=" at the end of the base64-encoded data) are discarded (see [RFC3548] for details).
   Base64 encoding assumes that the base64 encoding input is an integral number of octets.  If a given crypto-suite requires the use of a concatenated key and salt with a length that is not an integral number of octets, said crypto-suite MUST define a padding scheme that results in the base64 input being an integral number of octets.

What I can see here is a reference to how base64 is built, with just an explicit remark that padding characters do not carry any data; they are just.. padding, according to the corresponding RFC:

   In some circumstances, the use of padding ("=") in base encoded data is not required nor used.  In the general case, when assumptions on size of transported data cannot be made, padding is required to yield correct decoded data.
   Implementations MUST include appropriate pad characters at the end of encoded data unless the specification referring to this document explicitly states otherwise.

Really, I can't see any way that padding is "optional" in base64 encoding by default: both RFCs explicitly say that padding is needed, and that is the case for the SDP a=crypto attribute.
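To make the padding point concrete, here is a small Python sketch. The input sizes are my own choice for illustration: 30, 38 and 46 octets correspond to a 128-, 192- or 256-bit key concatenated with a 112-bit salt, and the dummy bytes stand in for real key material. It shows when "=" appears at all and what a strict decoder does once it is stripped. I am not claiming this is exactly what happened inside Linphone, only how base64 itself behaves.

```python
import base64
import binascii

def encode_key_salt(length):
    """Base64-encode `length` dummy octets standing in for key||salt."""
    return base64.b64encode(bytes(range(length))).decode("ascii")

def strict_decode(text):
    """Decode like a strict decoder; return None if the padding is wrong."""
    try:
        return base64.b64decode(text, validate=True)
    except binascii.Error:
        return None

def lenient_decode(text):
    """Restore any stripped '=' before decoding, as a tolerant receiver might."""
    return base64.b64decode(text + "=" * (-len(text) % 4))

for length in (30, 38, 46):
    padded = encode_key_salt(length)
    stripped = padded.rstrip("=")
    print(
        f"{length} octets: {len(padded)} chars, "
        f"{padded.count('=')} pad char(s), "
        f"strict decode without padding ok: {strict_decode(stripped) is not None}"
    )
```

Only the input whose length is a multiple of 3 octets survives stripping untouched, simply because it never had padding in the first place. For the others, a strict receiver rejects the value unless the sender leaves the "=" in place.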

As a takeaway for me: "default" options, even in popular software, do not mean "right".

Tuesday, May 28, 2024

Sublime syntax for Kamailio

For a long time I've been using Visual Studio Code for writing code, and while it's still a great product, it's becoming more and more bloated with M$ telemetry, and even using VSCodium does not help a lot (especially given how hard it is to get Pylance working there).

So now is a good time to "prepare a spare airfield", and every time I go back to Sublime. Yes, it's paid, but it's worth it. And Sublime Merge is the best git GUI out there for me.

One of the things I was missing was syntax highlighting for Kamailio in Sublime Text.

Luckily, there is a syntax file for VSCode from Daniel-Constantin Mierla, so it was really easy to adapt it for Sublime.

The file itself is here. It's less advanced than the original file, mainly because I need to dig into the syntax format a bit more, but as a start, it will work.

Now it's a bit easier to write Kamailio configs in Sublime.



Friday, March 1, 2024

Python and random task distribution in ThreadPoolExecutor

Not a usual article for me, but I will leave this boilerplate here for myself.

So, the task was the following: send multiple PUBLISH messages over SIPP without overloading the SIP proxy, distributing them in time rather than sending them all at once. Yes, I do know SIPP can do this, but I wanted to have the SIPP exit code after sending each message and to re-send it in case of issues. The order of sending does not matter; the messages just need to be delivered.

Here is the boilerplate Python code that I used to achieve this:

#!/usr/bin/python
import logging
import time
import random

from concurrent.futures import ThreadPoolExecutor

logging.getLogger('paramiko').setLevel(logging.INFO)

logging.basicConfig(
    format = '[%(asctime)s.%(msecs)03d] %(threadName)s %(name)s %(levelname)s: %(message)s',
    level=logging.INFO,
)

def get_timer_delay():
    '''
    Generator to be passed in ThreadPool to have delays in multiple iterated value processes.
    random delay between 5 and 200 ms. When reaching 1 sec - resets back to 0
    '''
    num = 0.0
    while True:
        yield num
        if num < 1:
            num += float(random.randrange(5, 200, 3)) / 1000
        else:
            num = 0.0

def process_iterated_value(value, start_delay):
    '''
    This function is called in thread-wise way to have multiple values processed at the moment
    '''
    time.sleep(start_delay)

    # Call SIPP process here (with exit code control)

    logging.info(f"Processed value {value} with delay {start_delay}")

#### ---------------- Script start ---------------- ########

execute_timeout = get_timer_delay()
iterated_value = range(0, 30)

logging.info("Process start")

with ThreadPoolExecutor(max_workers=5) as executor:
    # list() drains the result iterator, so an exception raised in a worker is re-raised here
    list(executor.map(process_iterated_value, iterated_value, execute_timeout))


logging.info("Process end")

The output looks like this. All values are processed in a more-or-less distributed way, as can be seen from the timestamps:

[00:06:43,375.375] MainThread root INFO: Process start
[00:06:43,375.375] ThreadPoolExecutor-0_0 root INFO: Processed value 0 with delay 0.0
[00:06:43,504.504] ThreadPoolExecutor-0_0 root INFO: Processed value 1 with delay 0.128
[00:06:43,620.620] ThreadPoolExecutor-0_1 root INFO: Processed value 2 with delay 0.244
[00:06:43,794.794] ThreadPoolExecutor-0_2 root INFO: Processed value 3 with delay 0.417
[00:06:43,928.928] ThreadPoolExecutor-0_3 root INFO: Processed value 4 with delay 0.5509999999999999
[00:06:44,053.053] ThreadPoolExecutor-0_4 root INFO: Processed value 5 with delay 0.6759999999999999
[00:06:44,334.334] ThreadPoolExecutor-0_0 root INFO: Processed value 6 with delay 0.828
[00:06:44,334.334] ThreadPoolExecutor-0_0 root INFO: Processed value 11 with delay 0.0
[00:06:44,475.475] ThreadPoolExecutor-0_0 root INFO: Processed value 12 with delay 0.14
[00:06:44,590.590] ThreadPoolExecutor-0_1 root INFO: Processed value 7 with delay 0.968
[00:06:44,689.689] ThreadPoolExecutor-0_0 root INFO: Processed value 13 with delay 0.21400000000000002
[00:06:44,769.769] ThreadPoolExecutor-0_2 root INFO: Processed value 8 with delay 0.973
[00:06:44,914.914] ThreadPoolExecutor-0_3 root INFO: Processed value 9 with delay 0.984
[00:06:44,941.941] ThreadPoolExecutor-0_1 root INFO: Processed value 14 with delay 0.35100000000000003
[00:06:45,185.185] ThreadPoolExecutor-0_0 root INFO: Processed value 15 with delay 0.494
[00:06:45,185.185] ThreadPoolExecutor-0_4 root INFO: Processed value 10 with delay 1.13
[00:06:45,413.413] ThreadPoolExecutor-0_2 root INFO: Processed value 16 with delay 0.643
[00:06:45,641.641] ThreadPoolExecutor-0_3 root INFO: Processed value 17 with delay 0.726
[00:06:45,686.686] ThreadPoolExecutor-0_1 root INFO: Processed value 18 with delay 0.743
[00:06:45,686.686] ThreadPoolExecutor-0_1 root INFO: Processed value 23 with delay 0.0
[00:06:45,740.740] ThreadPoolExecutor-0_1 root INFO: Processed value 24 with delay 0.053
[00:06:45,810.810] ThreadPoolExecutor-0_1 root INFO: Processed value 25 with delay 0.07
[00:06:45,886.886] ThreadPoolExecutor-0_1 root INFO: Processed value 26 with delay 0.07500000000000001
[00:06:45,952.952] ThreadPoolExecutor-0_0 root INFO: Processed value 19 with delay 0.766
[00:06:46,021.021] ThreadPoolExecutor-0_1 root INFO: Processed value 27 with delay 0.134
[00:06:46,071.071] ThreadPoolExecutor-0_4 root INFO: Processed value 20 with delay 0.885
[00:06:46,142.142] ThreadPoolExecutor-0_0 root INFO: Processed value 28 with delay 0.19
[00:06:46,354.354] ThreadPoolExecutor-0_1 root INFO: Processed value 29 with delay 0.33299999999999996
[00:06:46,376.376] ThreadPoolExecutor-0_2 root INFO: Processed value 21 with delay 0.962
[00:06:46,721.721] ThreadPoolExecutor-0_3 root INFO: Processed value 22 with delay 1.078
[00:06:46,721.721] MainThread root INFO: Process end
 

For sure, all values need to be adjusted afterwards, but the idea is there.

The only thing that bothers me: I'm not 100% sure how the generator value num would behave in a multithreaded environment due to the Python GIL. Here I'm asking anyone with Python knowledge to comment.
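On the generator question, a small experiment seems to settle it. Executor.map() collects its iterables immediately in the thread that calls it, before the workers receive anything, so the generator should only ever be advanced by the calling thread. The sketch below records which thread advances the generator. The names traced_delays, worker and run_demo are hypothetical, and a fixed 0.1 step replaces the random one to keep it deterministic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def traced_delays(seen_threads):
    """Like get_timer_delay(), but notes which thread advances the generator."""
    num = 0.0
    while True:
        seen_threads.append(threading.current_thread().name)
        yield num
        # Fixed step instead of a random one, to keep the demo deterministic.
        num = 0.0 if num >= 1 else num + 0.1

def worker(value, delay):
    """Stand-in for process_iterated_value(); reports the worker thread name."""
    return value, delay, threading.current_thread().name

def run_demo(count=10):
    seen_threads = []
    with ThreadPoolExecutor(max_workers=5) as executor:
        results = list(executor.map(worker, range(count), traced_delays(seen_threads)))
    return seen_threads, results

seen_threads, results = run_demo()
print("generator advanced by:", set(seen_threads))
print("tasks executed by:", sorted({name for _, _, name in results}))
```

If that holds, num is never touched by two threads at once, with or without the GIL. It would only become a concern if the workers themselves called next() on a shared generator.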

Tuesday, November 28, 2023

Asterisk and sharing <Custom:> device states

Asterisk has a mechanism for sharing device states across servers using just PJSIP.

But what is interesting is that this mechanism does not really apply to Custom: device states. That means if you have 2 servers, want to share BLF state via a hint, and have something like

extensions.conf

...

exten => MY_STATE,hint,Custom:MY_DEVICE


and change it via the Asterisk CLI


> devstate change Custom:MY_DEVICE BUSY


on the local server, it will change the hint on the remote server, but not the actual devstate that this hint is connected to:

> devstate list 

---------------------------------------------------------------------
--- Custom Device States --------------------------------------------
---------------------------------------------------------------------
---
--- Name: 'Custom:MY_DEVICE'  State: 'NOT_INUSE'
 

> core show hints

    -= Registered Asterisk Dial Plan Hints =-
MY_STATE@hints: Custom:MY_DEVICE  State:Busy            Presence:not_set         Watchers  0


This situation gave me a lot of hard times: you see a visual inconsistency on one server while totally forgetting that you have a second, spare server.

For me, the resolution was to sync the actual Custom: device states via an external service that uses AMI to monitor DeviceStateChange events and explicitly changes the state using Setvar.
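The core of such a sync service could be sketched as follows. This is a sketch under assumptions, not my production code: it only covers turning a DeviceStateChange event received from one server into a Setvar action for the other one. Writing the state through the DEVICE_STATE() function is the assumption here, the function names are hypothetical, and the AMI connection, login and protection against the two servers echoing changes back to each other are left out.

```python
def parse_ami_message(raw):
    """Parse one AMI message ("Key: Value" lines) into a dict."""
    fields = {}
    for line in raw.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields

def build_setvar_action(event):
    """Return a Setvar action for a Custom: DeviceStateChange event, else None."""
    if event.get("Event") != "DeviceStateChange":
        return None
    device = event.get("Device", "")
    state = event.get("State")
    if not device.startswith("Custom:") or not state:
        return None
    # Assumption: the state is written via the DEVICE_STATE() dialplan function.
    return (
        "Action: Setvar\r\n"
        f"Variable: DEVICE_STATE({device})\r\n"
        f"Value: {state}\r\n"
        "\r\n"
    )

sample = "Event: DeviceStateChange\r\nDevice: Custom:MY_DEVICE\r\nState: BUSY\r\n"
print(build_setvar_action(parse_ami_message(sample)))
```

Events for non-Custom devices are ignored on purpose, since those are already shared by the PJSIP mechanism.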

Just a note, in case someone falls into the same pit.

Thursday, September 21, 2023

Asterisk queues, local channels and transfers

Here comes the same old story about Asterisk queues and using Local channels within them, but with a PJSIP flavour.

So, it is known that Local channels in Asterisk can be tricky and might require disabling optimizations when used in Queues. Yes, the famous /n at the end of the channel name, so the Queue application can track the status of the channel.

But the drawback of this method is also known, and it is call transfers from the agents: when you do a transfer, the initial channel is not freed.

Or outgoing calls from the agents: if an agent makes an outbound call while still residing in a queue, the agent would not be considered busy.

There are many methods to work around this, and I'm no exception in building my own, based on GROUP and GROUP_COUNT, originating from the Asterisk: The Definitive Guide book, but with some modifications for the new ways of setting variables on PJSIP channels and for further optimisations of Local channels in Asterisk.


queues.conf

...
[QUEUE_A]
member => Local/AGENT_A@agents
...

extensions.conf

...
[incoming]
exten => QUEUE_A,1,Queue(QUEUE_A)

[agents]
; First to check if there are active calls on this agent
exten => AGENT_A,1,ExecIf($[ ${GROUP_COUNT(${EXTEN}@agents)} >= 1 ]?Congestion())
    ; We need to set GROUP on outgoing channel as current channel will be destroyed upon answer
    same => n,Dial(PJSIP/AGENT_A@trunk,,b(set_group^${EXTEN}^1))

[agents_outgoing]
; Make sure we're not getting calls to agent when it's on outgoing call
exten => OUT_NUM,1,Gosub(set_group,${CALLERID(num)},1)
    same => n,Dial(PJSIP/${EXTEN}@out_trunk)

[set_group]
exten => AGENT_A,1,Set(GROUP(agents)=${EXTEN})
    same => n,Return()

Just a small warning: this is more of a pseudo-code, just to give an overall idea.

Friday, August 18, 2023

Importance of creating dialog in Kamailio right

I know, the things described here might be obvious and considered "for beginners", but everyone can make stupid errors. Even the best of us :)
So, what am I using the Kamailio dialog module for?

  • Keepalives with the ka_timer parameter, mainly because mobile clients are unstable due to the nature of mobile networks
  • Presence PUBLISHes via the pua/pua_dialoginfo modules to a separate presence server

 

With the second one, I actually ran into some "problems" due to configuration.

Most examples of Kamailio configs (at least the ones I've seen) that operate with dialogs have more or less this structure:

request_route {
   route(REQINIT);

   if (is_method("INVITE")) {
       dlg_manage();
   }

   route(WITHINDLG);

   ...

   route(AUTH);

   ...

}

There is a problem with this code: it creates dialogs for EVERY INVITE, even if not authed.

So, the usual flow INVITE - 401 - INVITE (w/auth) - 200 will create 2 dialogs in this case. The first one will not be terminated correctly in connection with the pua module: it will generate a PUBLISH with the Trying state, but on 401 - ACK there will be no corresponding PUBLISH for the terminated state. And since the usual Expires value here is 3600, you will have a lot of "ringing" devices on your presence server.

The answer would be: create dialogs only for authed (and valid) INVITEs. This will also save some CPU by not creating dialogs for spammers.

request_route {
   route(REQINIT);

   route(WITHINDLG);

   ...

   route(AUTH);

   if (is_method("INVITE")) {
       dlg_manage();
   }

   ...

}
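The difference between the two orderings can be shown with a toy model in Python. This is not Kamailio code, just a counter that mimics where dlg_manage() sits relative to route(AUTH). The function name and the request tuples are made up for the illustration.

```python
def count_dialogs(requests, dialog_before_auth):
    """Count the dialogs a toy request_route would create.

    requests: list of (method, has_valid_credentials) tuples.
    dialog_before_auth: True models dlg_manage() placed before route(AUTH).
    """
    dialogs = 0
    for method, authed in requests:
        if dialog_before_auth and method == "INVITE":
            dialogs += 1  # dialog created before credentials are checked
        if not authed:
            continue      # route(AUTH) replies 401 and processing stops
        if not dialog_before_auth and method == "INVITE":
            dialogs += 1  # dialog created only for authed requests
    return dialogs

# INVITE - 401 - INVITE (w/auth) - 200
flow = [("INVITE", False), ("INVITE", True)]
print("dlg_manage() before AUTH:", count_dialogs(flow, dialog_before_auth=True))
print("dlg_manage() after AUTH: ", count_dialogs(flow, dialog_before_auth=False))
```

With the first ordering the unauthenticated INVITE leaves an extra dialog behind, and that is the one that stays in the Trying state on the presence server.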