What's soft lockup?

Robert Heller heller at deepsoft.com
Tue Jul 7 18:14:13 UTC 2020


At Tue, 7 Jul 2020 12:59:08 -0300 "Ubuntu user technical support,  not for general discussions" <ubuntu-users at lists.ubuntu.com> wrote:

> 
> I'm really sorry I can't spend more time on this right now, and the
> xenomai project (and CAN msgs) are new to me, but I got curious about
> the xenomai src code when I opened it, and indeed you're right: there
> are only socket operations in rtcansend.c or rtcanrcv.c...

Yes, CAN (Controller Area Network) is commonly used in automobiles as the 
communication "network" for all of the little processors in your car.  I 
believe CAN is also used in industrial systems for interprocessor 
communication.  (It is also being used in Model Railroading -- see 
OpenLCB.org.)  Under Linux, CAN drivers show up as sockets in the CAN family.  
Physically it is a bus-style network (daisy chained) using twisted pair / 
differential signaling.

> 
> Actually.. xenomai project is *very complex* =o)... as anything
> related to realtime...
> 
> but reading a bit the drivers sources ... they do implement file_operations...
> 
> Grepping for the ones with write function implementation I found:
> 
> ksrc/nucleus/pipe.c:static struct file_operations xnpipe_fops = {
> ksrc/nucleus/pipe.c- .owner = THIS_MODULE,
> ksrc/nucleus/pipe.c- .read = xnpipe_read,
> ksrc/nucleus/pipe.c- .write = xnpipe_write,
> ksrc/nucleus/pipe.c- .poll = xnpipe_poll,
> ksrc/nucleus/pipe.c- .unlocked_ioctl = xnpipe_ioctl,
> ksrc/nucleus/pipe.c- .open = xnpipe_open,
> ksrc/nucleus/pipe.c- .release = xnpipe_release,
> ksrc/nucleus/pipe.c- .fasync = xnpipe_fasync
> ksrc/nucleus/pipe.c-};
> 
> ksrc/nucleus/vfile.c:static struct file_operations vfile_snapshot_fops = {
> ksrc/nucleus/vfile.c- .owner = THIS_MODULE,
> ksrc/nucleus/vfile.c- .open = vfile_snapshot_open,
> ksrc/nucleus/vfile.c- .read = seq_read,
> ksrc/nucleus/vfile.c- .write = vfile_snapshot_write,
> ksrc/nucleus/vfile.c- .llseek = seq_lseek,
> ksrc/nucleus/vfile.c- .release = vfile_snapshot_release,
> ksrc/nucleus/vfile.c-};
> 
> ksrc/nucleus/vfile.c:static struct file_operations vfile_regular_fops = {
> ksrc/nucleus/vfile.c- .owner = THIS_MODULE,
> ksrc/nucleus/vfile.c- .open = vfile_regular_open,
> ksrc/nucleus/vfile.c- .read = seq_read,
> ksrc/nucleus/vfile.c- .write = vfile_regular_write,
> ksrc/nucleus/vfile.c- .llseek = seq_lseek,
> ksrc/nucleus/vfile.c- .release = vfile_regular_release,
> ksrc/nucleus/vfile.c-};
> 
> And reading xenomai docs a bit:
> 
> https://xenomai.org/documentation/xenomai-3/html/xeno3prm/group__cobalt__core__vfile.html
> 
> I see that the objects are represented as files, so that answers why
> the VFS layer could be in this stack trace. So it's a good start...
> 
> Checking xenomai userland code we have:
> 
> src/skins/posix/rtdm.c:
> 
> ssize_t __wrap_write(int fd, const void *buf, size_t nbyte)
> {
>     if (fd >= __pse51_rtdm_fd_start) {
>         int ret, oldtype;
> 
>         pthread_setcanceltype(PTHREAD_CANCEL_ASYNCHRONOUS, &oldtype);
> 
>         ret = set_errno(XENOMAI_SKINCALL3(__pse51_rtdm_muxid,
>                                           __rtdm_write,
>                                           fd - __pse51_rtdm_fd_start,
>                                           buf, nbyte));
> 
>         pthread_setcanceltype(oldtype, NULL);
> 
>         return ret;
>     } else
>         return __real_write(fd, buf, nbyte);
> }
> 
> We have a possibility of a real write here to a file descriptor =).
> And if we go through the kernel source code.. the drivers also
> implement real writes to file descriptors:
> 
> sys_rtdm_write() -> __rt_dev_write() -> MAJOR_FUNCTION_WRAPPER(write)
> 
> it creates a TH (top half) and a BH (bottom half) for scheduling
> purposes.. but it does wrap a write function.
> 
> So.. unfortunately, without knowing more of the xenomai internals, all
> that can be done here is to analyse the kernel dump and check what
> the issue in the kernel is (having its debug symbols etc) OR, easier,
> like I proposed before, you take the source code you have and bisect to
> find the change that is causing the deadlock.
> 
> It looks like xenomai is "wrapping" the system calls, not allowing them
> to happen, so it can keep the realtimeness of the application.. the
> __pse51_rtdm_fd_start global variable is handled by sys_rtdm_open()
> and all __wrap_SYSTEMCALL() functions... so it's likely that the issue
> happened doing a
> 
> __wrap_write() using the xenomai posix skin for the realtime operation
> (if that makes any sense to you).
> 
> Cheers o/
> 
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166257] Call Trace:
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166266]  _raw_spin_lock+0x20/0x30
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166275]  can_write+0x6c/0x2c0 [advcan]
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166281]  ? dequeue_signal+0xae/0x1a0
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166286]  ? recalc_sigpending+0x1b/0x50
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166291]  ? __set_task_blocked+0x3c/0xa0
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166297]  __vfs_write+0x3a/0x190
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166303]  ? apparmor_file_permission+0x1a/0x20
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166307]  ? security_file_permission+0x3b/0xc0
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166312]  vfs_write+0xb8/0x1b0
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166316]  ksys_write+0x5c/0xe0
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166321]  __x64_sys_write+0x1a/0x20
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166326]  do_syscall_64+0x87/0x250
> > > >> Jul  3 10:24:35 yx kernel: [ 1240.166331]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 

-- 
Robert Heller             -- 978-544-6933 Cell: 413-658-7953 GV: 978-633-5364
Deepwoods Software        -- Custom Software Services
http://www.deepsoft.com/  -- Linux Administration Services
heller at deepsoft.com       -- Webhosting Services
                                         



