The most exciting thing about this world is its ever changing quality.

Wednesday, August 26, 2009

Real-time signal-driven communication between kernel and user space

I have written a blog before about the standard use of the /proc and /dev interfaces as IPC between kernel space and user space applications. Of course you can do clever things such as asynchronous I/O (AIO) and non-blocking system calls, but they do not really solve the problem if what we need is truly event-driven behaviour rather than polling threads.

For standard non-real-time Linux, popular mechanisms include sockets (on top of which netlink is built), signals, and memory mapping. (These are all I know; please tell me if there are other tricks I don't!) There are other tricks like upcalls (using call_usermodehelper in a kernel module to invoke a user space program), but they are rather hacks and not well supported when porting to different hardware platforms. Of course you can also use named pipes or FIFOs (mknod and mkfifo), which are essentially device-node-based communications at the system level (similar to the /dev interface).

(To be extreme, here is what I have always believed: the only reason we distinguish user space from kernel space is that you can do whatever you want in user space without screwing up the whole OS, which is protected and allows others like you to screw up the system independently. Since 2.6, the whole Linux kernel can be considered a single process with multiple concurrent, schedulable threads.)

Basically, signals can be sent from the kernel, and some can be queued if you choose to. The Linux signal queue is interrupt-safe. I won't go through the whole list of signals and APIs, as you can find them copied and pasted all over the net. What I would like to note here are the POSIX.4 Real Time Signals, also known as RT signals. They are a group of signals (between SIGRTMIN and SIGRTMAX) supported by the Linux kernel which overcome some of the limitations of traditional UNIX signals. First of all, RT signals can be queued by the kernel, instead of just setting bits in a signal mask as traditional UNIX signals do. This allows multiple signals of the same type to be delivered to a process. In addition, each signal carries a siginfo_t payload which provides the process with the context in which the signal was raised. To say "process" is a little confusing; in fact, you can signal a specific thread or a group.

The catch is that you need to specify carefully which type of signal it is when you generate it in the kernel. Unfortunately the si_code values have to be manually matched to the send_sigxxx APIs you use to trigger the signal; i.e. if you want the signal to behave like one sent with sigqueue, si_code has to be SI_QUEUE. Unluckily, some Linux ports don't support sigqueue, e.g. Blackfin and PPC. There are workarounds: you can still use send_sig_info to queue the RT signal with a siginfo_t payload, but be aware that you can't use _sifields._rt.si_sigval.sival_ptr to pass a 32-bit pointer to a struct and hope to use it the same way as the value you can pass with sigqueue; you can only pass a 32-bit value in the union. I learnt it the hard way...

One problem with RT signals is that the signal queue is finite; once the queue overflows, a server using RT signals has to have some fallback. The good thing about RT signals is that they have very low overhead. They also provide a software-interrupt-driven approach, which to my mind is quite intuitive when you think about it: all the interesting events originally come from hardware interfaces and pass through device drivers sitting in the kernel. What is more efficient than building your higher-level logic directly on these events?

Also, I have wrapped these signal handlers up in an I/O lib which sits in user space, elegantly creating and posting events to whoever is interested - a post-office-like publish-subscribe mechanism. This way, you don't have to worry about errant signals. In the kernel, I have added a linked list to maintain all the threads (task_struct) which have registered to be signalled. They are signalled in a simple round-robin way. The code is dead simple:

// in the kernel driver
struct list_head *ptr;
struct user_task_struct *entry;
struct siginfo info;
int err;

list_for_each(ptr, &user_tasks.list) {
    entry = list_entry(ptr, struct user_task_struct, list);
    memset(&info, 0, sizeof(struct siginfo));
    info.si_signo = SIGGPIBUTTON;
    info.si_errno = 0;
    info.si_code = SI_QUEUE;           // mimic sigqueue() so si_int is delivered
    info.si_int = pdev->data;          // 32-bit payload for user space
    info.si_uid = pdev->minor_node_id; // stole this field for additional info
    err = send_sig_info(SIGGPIBUTTON, &info, entry->thread);
}

// in the I/O lib
memset(&m_actBt, 0, sizeof(m_actBt));
m_actBt.sa_sigaction = &CGPI::ButtonSignalHandler; // must be a static member
m_actBt.sa_flags = SA_SIGINFO;                     // deliver the siginfo_t payload
sigemptyset(&m_actBt.sa_mask);
err = sigaction(SIGGPIBUTTON, &m_actBt, NULL);

void CGPI::ButtonSignalHandler(int signum, siginfo_t *info, void *ptr)
{
    int data;
    printf("Received signal %d\n", signum); // debug only: not async-signal-safe
    if (signum != SIGGPIBUTTON)
        return;
    data = info->si_int;                    // payload set by the driver
    CEvent *evt = new CEvent(data);
    evt->id = info->si_uid;                 // minor node id smuggled in si_uid
    evt->Post(m_evtMgr);
}

In Xenomai (a real-time patch for Linux), you have the option to let kernel rt_tasks communicate with user space ones using real-time message queues (rt_queue_create). I have also tried to get real-time signals working with Xenomai 2.4.91. The Xenomai patch officially supports RT signals only via the POSIX skin. However, pthread_sigqueue_np only takes a pse51_thread instead of a standard POSIX thread, which means the signal can only be sent to a Xenomai POSIX skin thread, created by the overloaded pthread_create. This is a little messy, I know. In non-real-time Linux, you have GNU thread and POSIX thread implementations to choose from, or both, depending on how you link your libraries. Xenomai has its own implementation of a finer-grained kernel-mode thread (xnthread_t, if you like to call it that). These Xeno threads exist only in the Xeno real-time domain. When you choose the native skin, you will be dealing with the rt_task_xxx interfaces; however, all the signal handling still happens in the Linux domain. Within the POSIX skin, you can queue signals and set up signal handlers for Xenomai POSIX skin threads. Effectively, instead of using rt_task_xxx, you can use the familiar pthread_create APIs. Just bear in mind you need to link against the Xenomai libraries.
