Page MenuHomeFreeBSD

sound: Get rid of snd_clone and use DEVFS_CDEVPRIV(9)
ClosedPublic

Authored by christos on Mar 18 2024, 5:30 PM.
Tags
None
Referenced Files
F102092085: D44411.diff
Thu, Nov 7, 12:44 PM
F102021841: D44411.diff
Wed, Nov 6, 3:49 PM
F102021832: D44411.diff
Wed, Nov 6, 3:49 PM
Unknown Object (File)
Mon, Oct 21, 1:12 AM
Unknown Object (File)
Wed, Oct 16, 1:05 AM
Unknown Object (File)
Tue, Oct 15, 12:07 PM
Unknown Object (File)
Mon, Oct 14, 12:06 PM
Unknown Object (File)
Wed, Oct 9, 10:41 PM
Subscribers

Details

Summary

Currently the snd_clone framework creates device nodes on-demand for
every channel, through the dsp_clone() callback, and is responsible for
routing audio to the appropriate channel(s). This patch gets rid of the
whole snd_clone framework (including any related sysctls) and instead
uses DEVFS_CDEVPRIV(9) to handle device opening, channel allocation and
audio routing. This results in a significant reduction in code size as
well as complexity.

Behavior that is preserved:

  • hw.snd.basename_clone.
  • Exclusive access of an audio device (i.e VCHANs disabled).
  • Multiple processes can read from/write to the device.
  • A device can only be opened as many times as the maximum allowed channel number (see SND_MAXHWCHAN in pcm/sound.h).

Behavior changes:

Only one /dev/dspX device node is created (on attach) for each audio
device, as opposed to the current /dev/dspX.Y devices created by
snd_clone. According to the sound(4) man page, devices are not meant to
be opened through /dev/dspX.Y anyway, so it is best if we do not create
device nodes for them in the first place.

Sponsored by: The FreeBSD Foundation
MFC after: 3 weeks

Diff Detail

Repository
rG FreeBSD src repository
Lint
Lint Skipped
Unit
Tests Skipped
Build Status
Buildable 56675
Build 53563: arc lint + arc unit

Event Timeline

christos created this revision.

@markj Do you think using an SX lock for the cdevpriv TAILQ is a good choice?
@dev_submerge.ch Does this patch build on your machine? Also do you think setting the format SND_FORMAT(AFMT_U8, 1, 0) and the rate to DSP_DEFAULT_SPEED in dsp_open() is correct? They get overriden by the VCHAN settings anyway.

Additional work that will be implemented:

  • Update man page.
  • Be backwards compatible with the OSSv4 device aliases, as discussed in freebsd-multimedia@
  • Add some comments where appropriate.
  • Make sure no IOCTL breaks.
  • Attach more than one sound card to make sure everything works as expected.
  • See if we need to clean up anything from unit.c.

So far I've tested the following and they all seem to work fine:

  • jackd -R -d oss -r 48000 -C /dev/dsp
  • mpv and a few other players
  • /usr/share/examples/sound/basic.c
  • Hot-unplug during playback and see if hw.snd.timeout takes effect (it does).
  • for i in $(seq 1 n); do cat /dev/random >/dev/dsp & cat /dev/dsp >/dev/null & done
  • kldunload during card usage.
  • /dev/sndstat also reports things fine.
sys/dev/sound/pcm/dsp.c
53

Why do you need this list?

We should update sound(4) to indicate that /dev/dspX.Y are deprecated and will be removed, not just strongly discouraged. That said, we don't list /dev/dsp or /dev/dsp0 in sound(4) right now, only /dev/dsp%d.%d and related. We should add those, and probably also find a better representation than printf's %d

We should update sound(4) to indicate that /dev/dspX.Y are deprecated and will be removed, not just strongly discouraged. That said, we don't list /dev/dsp or /dev/dsp0 in sound(4) right now, only /dev/dsp%d.%d and related. We should add those, and probably also find a better representation than printf's %d

As discussed in https://lists.freebsd.org/archives/freebsd-multimedia/2024-March/002283.html, we can allow access to /dev/dspX.Y but this will actually have no effect in reality. sound(4) will just dev_ref(/dev/dspX) and create/select an existing channel without caring about what the user requested (basically /dev/dspX.Y will now be an alias of /dev/dspX). I think this could be a good idea to preserve backwards compatibility, but I'm not really sure if there are applications that currently access devices by accessing "hardcoded" channels. Currently each /dev/dspX.Y node can only be opened by a single process, so it doesn't make much sense for an application to open it like this, because if another application has the node, open() will fail.

Anyway, if more people think preserving the old interface, even as an alias, is a worthwhile effort, I have no strong objections.

sys/dev/sound/pcm/dsp.c
53

dsp_oss_audioinfo() needs to somehow access the appropriate clone's channels, which was previously embedded in cdev->si_drv1 (see dsp_cdevinfo_alloc() and the PCM_RDCH/PCM_WRCH/... macros in the left diff). Since this information is now part of dsp_cdevpriv, we need to somehow have access to it in dsp_oss_audioinfo().

An alternative solution could be to pass the priv structure as an argument in dsp_oss_audioinfo() but I find this to be quite ugly, because pcm/mixer.c also calls this function.

@dev_submerge.ch Does this patch build on your machine?

A full build still fails. There's a stale entry of clone.c in sys/conf/files. If I remove that, build succeeds.

sys/dev/sound/pcm/dsp.c
53

This does not quite answers my question, at least not in the way I can comprehend. Let me try to ask this differently.

Devfs already keeps the list of all cdevpriv's associated with the given cdev. I can easily add some KPI that would iterate over the list and execute a callback. Then only issue I see with it is that the callback would be called with some mutex held, which prevents doing anything non-trivial there, but just fetching the data. Would it be enough?

@dev_submerge.ch Also do you think setting the format SND_FORMAT(AFMT_U8, 1, 0) and the rate to DSP_DEFAULT_SPEED in dsp_open() is correct? They get overriden by the VCHAN settings anyway.

I suppose it's as good or bad as any other format / speed. Most of my pcm devices report that format / speed, when unused. So I think it's correct. The alternative of setting either format or speed to 0 for chn_reset could leave some fields uninitialized.

We should update sound(4) to indicate that /dev/dspX.Y are deprecated and will be removed, not just strongly discouraged. That said, we don't list /dev/dsp or /dev/dsp0 in sound(4) right now, only /dev/dsp%d.%d and related. We should add those, and probably also find a better representation than printf's %d

As discussed in https://lists.freebsd.org/archives/freebsd-multimedia/2024-March/002283.html, we can allow access to /dev/dspX.Y but this will actually have no effect in reality. sound(4) will just dev_ref(/dev/dspX) and create/select an existing channel without caring about what the user requested (basically /dev/dspX.Y will now be an alias of /dev/dspX). I think this could be a good idea to preserve backwards compatibility, but I'm not really sure if there are applications that currently access devices by accessing "hardcoded" channels. Currently each /dev/dspX.Y node can only be opened by a single process, so it doesn't make much sense for an application to open it like this, because if another application has the node, open() will fail.

Anyway, if more people think preserving the old interface, even as an alias, is a worthwhile effort, I have no strong objections.

I concur with Christos here, it's unlikely that applications use explicit clones (/dev/dspX.Y). If any applications do that, they should be fixed. There's no guarantee that a given clone isn't already in use.

On a side note, I know now why I couldn't use /dev/dspX.Y devices when developing Jack. Due to shortcomings of the OSS API, Jack opens the device separately for playback and recording, and opening the same /dev/dspX.Y a second time fails. Thus playback only would work, but duplex would not.

sys/dev/sound/pcm/dsp.c
2262

@kib

Devfs already keeps the list of all cdevpriv's associated with the given cdev. I can easily add some KPI that would iterate over the list and execute a callback. Then only issue I see with it is that the callback would be called with some mutex held, which prevents doing anything non-trivial there, but just fetching the data. Would it be enough?

So, here is where I am using the priv list. I think this operation could indeed just be part of a callback.

@dev_submerge.ch Also do you think setting the format SND_FORMAT(AFMT_U8, 1, 0) and the rate to DSP_DEFAULT_SPEED in dsp_open() is correct? They get overriden by the VCHAN settings anyway.

I suppose it's as good or bad as any other format / speed. Most of my pcm devices report that format / speed, when unused. So I think it's correct. The alternative of setting either format or speed to 0 for chn_reset could leave some fields uninitialized.

We should update sound(4) to indicate that /dev/dspX.Y are deprecated and will be removed, not just strongly discouraged. That said, we don't list /dev/dsp or /dev/dsp0 in sound(4) right now, only /dev/dsp%d.%d and related. We should add those, and probably also find a better representation than printf's %d

As discussed in https://lists.freebsd.org/archives/freebsd-multimedia/2024-March/002283.html, we can allow access to /dev/dspX.Y but this will actually have no effect in reality. sound(4) will just dev_ref(/dev/dspX) and create/select an existing channel without caring about what the user requested (basically /dev/dspX.Y will now be an alias of /dev/dspX). I think this could be a good idea to preserve backwards compatibility, but I'm not really sure if there are applications that currently access devices by accessing "hardcoded" channels. Currently each /dev/dspX.Y node can only be opened by a single process, so it doesn't make much sense for an application to open it like this, because if another application has the node, open() will fail.

Anyway, if more people think preserving the old interface, even as an alias, is a worthwhile effort, I have no strong objections.

I concur with Christos here, it's unlikely that applications use explicit clones (/dev/dspX.Y). If any applications do that, they should be fixed. There's no guarantee that a given clone isn't already in use.

On a side note, I know now why I couldn't use /dev/dspX.Y devices when developing Jack. Due to shortcomings of the OSS API, Jack opens the device separately for playback and recording, and opening the same /dev/dspX.Y a second time fails. Thus playback only would work, but duplex would not.

I think what we want is to make all dsp_cdevs[] aliases of /dev/dsp (like the OSSv4 compatibility aliases) to just preserve compatibility and add, and perhaps add a deprecation entry in the man page.

As for dspX.Y[r|p|vr|vp] we could either (I am more inclined to go with the 1st option):

  1. Disallow access altogether and only keep them in dsp_cdevs[] for /dev/sndstat use.
  2. Make them aliases as well.
sys/dev/sound/pcm/dsp.c
2262
christos marked 3 inline comments as done.
  • Remove stale clone.c from sys/conf/files.
  • Removed cdevpriv TAILQ, depends on D44469, see dsp_oss_audioinfo().
  • Update man page.
  • Support OSSv4 compatibility aliases.

Get rid of now-unused PCMMAXCLONE define.

christos retitled this revision from WIP: sound: Get rid of snd_clone and use DEVFS_CDEVPRIV(9) to sound: Get rid of snd_clone and use DEVFS_CDEVPRIV(9).Mar 23 2024, 3:54 AM
sys/dev/sound/pcm/dsp.c
66

Can this be simplified?

#define DSP_REGISTERED(dsp) (PCM_REGISTERED(dsp) && (dsp)->dsp_dev !=  NULL)
145

Why is it 0600 instead of 0666?

149

It would be nice to print the value of err in this message.

2126
christos marked 4 inline comments as done.

Address Mark's comments.

This looks reasonable to me. I will test it on my desktop at the end of the week once I'm back home.

This revision is now accepted and ready to land.Mar 26 2024, 3:06 AM

My first round of tests was all good, no regressions so far. I still have to test mmap'ed operation, probably tomorrow.

Do we have any systematic tests yet to check the whole OSS API? Or at least the most common ioctls?

The mmap'ed tests worked fine in general, but I hit the following while starting two instances simultaneously:

lock order reversal:
 1st 0xfffff8012e17e0a0 pcm7:virtual:dsp7.vr0 (pcm virtual record channel, sleep mutex) @ /usr/src/sys/dev/sound/pcm/dsp.c:2617
 2nd 0xfffff8012e17e120 pcm7 (sound cdev, sleep mutex) @ /usr/src/sys/dev/sound/pcm/channel.c:2214
lock order sound cdev -> pcm virtual record channel established at:
#0 0xffffffff80bc7b7a at witness_checkorder+0x2fa
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808e6b7d at sound_oss_sysinfo+0x1ed
#3 0xffffffff808cf810 at dsp_ioctl+0x960
#4 0xffffffff809db8f2 at devfs_ioctl+0xd2
#5 0xffffffff80c63092 at vn_ioctl+0xc2
#6 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#7 0xffffffff80bce186 at kern_ioctl+0x286
#8 0xffffffff80bcde93 at sys_ioctl+0x143
#9 0xffffffff8105b473 at amd64_syscall+0x153
#10 0xffffffff8102d10b at fast_syscall_common+0xf8
lock order pcm virtual record channel -> sound cdev attempted at:
#0 0xffffffff80bc83e3 at witness_checkorder+0xb63
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808c949a at chn_trigger+0x26a
#3 0xffffffff808e9c6c at vchan_trigger+0x12c
#4 0xffffffff808c9321 at chn_trigger+0xf1
#5 0xffffffff808d5acb at dsp_oss_syncstart+0x1bb
#6 0xffffffff808d0b40 at dsp_ioctl+0x1c90
#7 0xffffffff809db8f2 at devfs_ioctl+0xd2
#8 0xffffffff80c63092 at vn_ioctl+0xc2
#9 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#10 0xffffffff80bce186 at kern_ioctl+0x286
#11 0xffffffff80bcde93 at sys_ioctl+0x143
#12 0xffffffff8105b473 at amd64_syscall+0x153
#13 0xffffffff8102d10b at fast_syscall_common+0xf8

I was able to reproduce this a second time, with some "luck" in timing.

Didn't investigate yet, I want to do some more testing first.

My first round of tests was all good, no regressions so far. I still have to test mmap'ed operation, probably tomorrow.

Do we have any systematic tests yet to check the whole OSS API? Or at least the most common ioctls?

To my knowledge at least, we don't.

The mmap'ed tests worked fine in general, but I hit the following while starting two instances simultaneously:

lock order reversal:
 1st 0xfffff8012e17e0a0 pcm7:virtual:dsp7.vr0 (pcm virtual record channel, sleep mutex) @ /usr/src/sys/dev/sound/pcm/dsp.c:2617
 2nd 0xfffff8012e17e120 pcm7 (sound cdev, sleep mutex) @ /usr/src/sys/dev/sound/pcm/channel.c:2214
lock order sound cdev -> pcm virtual record channel established at:
#0 0xffffffff80bc7b7a at witness_checkorder+0x2fa
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808e6b7d at sound_oss_sysinfo+0x1ed
#3 0xffffffff808cf810 at dsp_ioctl+0x960
#4 0xffffffff809db8f2 at devfs_ioctl+0xd2
#5 0xffffffff80c63092 at vn_ioctl+0xc2
#6 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#7 0xffffffff80bce186 at kern_ioctl+0x286
#8 0xffffffff80bcde93 at sys_ioctl+0x143
#9 0xffffffff8105b473 at amd64_syscall+0x153
#10 0xffffffff8102d10b at fast_syscall_common+0xf8
lock order pcm virtual record channel -> sound cdev attempted at:
#0 0xffffffff80bc83e3 at witness_checkorder+0xb63
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808c949a at chn_trigger+0x26a
#3 0xffffffff808e9c6c at vchan_trigger+0x12c
#4 0xffffffff808c9321 at chn_trigger+0xf1
#5 0xffffffff808d5acb at dsp_oss_syncstart+0x1bb
#6 0xffffffff808d0b40 at dsp_ioctl+0x1c90
#7 0xffffffff809db8f2 at devfs_ioctl+0xd2
#8 0xffffffff80c63092 at vn_ioctl+0xc2
#9 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#10 0xffffffff80bce186 at kern_ioctl+0x286
#11 0xffffffff80bcde93 at sys_ioctl+0x143
#12 0xffffffff8105b473 at amd64_syscall+0x153
#13 0xffffffff8102d10b at fast_syscall_common+0xf8

I was able to reproduce this a second time, with some "luck" in timing.

Didn't investigate yet, I want to do some more testing first.

Can you share the exact commands you ran to get this?

The mmap'ed tests worked fine in general, but I hit the following while starting two instances simultaneously:

lock order reversal:
 1st 0xfffff8012e17e0a0 pcm7:virtual:dsp7.vr0 (pcm virtual record channel, sleep mutex) @ /usr/src/sys/dev/sound/pcm/dsp.c:2617
 2nd 0xfffff8012e17e120 pcm7 (sound cdev, sleep mutex) @ /usr/src/sys/dev/sound/pcm/channel.c:2214
lock order sound cdev -> pcm virtual record channel established at:
#0 0xffffffff80bc7b7a at witness_checkorder+0x2fa
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808e6b7d at sound_oss_sysinfo+0x1ed
#3 0xffffffff808cf810 at dsp_ioctl+0x960
#4 0xffffffff809db8f2 at devfs_ioctl+0xd2
#5 0xffffffff80c63092 at vn_ioctl+0xc2
#6 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#7 0xffffffff80bce186 at kern_ioctl+0x286
#8 0xffffffff80bcde93 at sys_ioctl+0x143
#9 0xffffffff8105b473 at amd64_syscall+0x153
#10 0xffffffff8102d10b at fast_syscall_common+0xf8
lock order pcm virtual record channel -> sound cdev attempted at:
#0 0xffffffff80bc83e3 at witness_checkorder+0xb63
#1 0xffffffff80b2d200 at __mtx_lock_flags+0x90
#2 0xffffffff808c949a at chn_trigger+0x26a
#3 0xffffffff808e9c6c at vchan_trigger+0x12c
#4 0xffffffff808c9321 at chn_trigger+0xf1
#5 0xffffffff808d5acb at dsp_oss_syncstart+0x1bb
#6 0xffffffff808d0b40 at dsp_ioctl+0x1c90
#7 0xffffffff809db8f2 at devfs_ioctl+0xd2
#8 0xffffffff80c63092 at vn_ioctl+0xc2
#9 0xffffffff809dbfce at devfs_ioctl_f+0x1e
#10 0xffffffff80bce186 at kern_ioctl+0x286
#11 0xffffffff80bcde93 at sys_ioctl+0x143
#12 0xffffffff8105b473 at amd64_syscall+0x153
#13 0xffffffff8102d10b at fast_syscall_common+0xf8

I was able to reproduce this a second time, with some "luck" in timing.

Didn't investigate yet, I want to do some more testing first.

Can you share the exact commands you ran to get this?

Unfortunately my mmap'ed Jack backend is still not in ports, I used the test executable of the sosso library. From memory:

pkg install cmake spdlog
git clone https://github.com/0EVSG/sosso.git sosso_src
mkdir sosso_build
cd sosso_build
cmake ../sosso_src
make

Then you start the test executable with ./sosso_test /dev/dsp1. It will open the pcm device in duplex mode, mmap'ed if available, and processes some audio data (silence) with simulated under- and overruns. Lots of noise on the console, but we only care about how it opens and starts the pcm channel here.

To reproduce, I started two instances manually within ~1s, standard settings with vchans enabled. Maybe needs scripting to get the timing right. Good luck!

I'm not sure this is a regression or actually related to your changes, though. Have to verify with CURRENT main branch first. AFAICT, you didn't touch the places where the order of locking is determined - either the locking mechanism of the old clones was not susceptible to this problem, or it's already present in main and we just never saw that occur.

"Good news", I can reproduce the lock order reversal on the main branch, without this patch - it's not related to the changes here. How should we proceed with that?

"Good news", I can reproduce the lock order reversal on the main branch, without this patch - it's not related to the changes here. How should we proceed with that?

I will try to reproduce the LOR and possibly fix it. In the meantime I think we can go ahead and commit this patch.

I've been running this on my desktop with no issues.

Re-opening the review. Apparently the patch introduced a regression where where
dsp_[get|set]_flags() would fetch the flags of the wrong softc, resulting in
generally strange behavior if the machine had more than 1 audio device attached
(EBUSY's from dsp_open() etc). Get rid of dsp_[get|set]_flags() as they are no
longer needed and use pcm_[get|set]_flags() directly.

tested successfully on my laptop, which faced the regression!

This revision is now accepted and ready to land.Apr 11 2024, 4:11 PM

@meka_tilda.center Let me know when you test it. I think this should fix your virtual_oss returning EBUSY as well.

Works here, too. I did only the short test, but no EBUSY.