26 September 2008 @ 12:54 pm
The ALSA/OSS debate is irrelevant  

After Lennart posted his summary of Linux sound APIs, the debate between ALSA and OSS zealots heated up again. Both ALSA and OSS developers think their API is the definitive interface for sound support in Unix, and they like to bash each other, explaining how the other API sucks and how the other side is spreading FUD.

This is not only a sterile debate, but it is completely irrelevant. Do you know why? Because neither of them is a suitable sound API.

Technical constraints, functional requirements and history make the “ideal” sound stack on Linux look like the following:

  • a sound driver, consisting of a kernel module and if necessary a low-level library
  • if configured so, a sound daemon which brings software mixing and network support – PulseAudio finally kicking out all crappy alternatives
  • a high-level programming API for use by developers.
Only the last bit should be of interest to application developers. Everything else is plumbing.
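
A minimal sketch of this layering, assuming hypothetical names (`Driver`, `FakeDriver`, `SoundDaemon`, `play`) that do not correspond to any real library. It only illustrates that the application calls the top layer and never needs to know whether a daemon sits in between.

```python
from typing import List, Protocol


class Driver(Protocol):
    """Bottom layer: whatever talks to the hardware (ALSA, OSS, ...)."""

    def write(self, samples: List[int]) -> None: ...


class FakeDriver:
    """Stand-in for a real driver; it just records what it receives."""

    def __init__(self) -> None:
        self.written: List[int] = []

    def write(self, samples: List[int]) -> None:
        self.written.extend(samples)


class SoundDaemon:
    """Optional middle layer: forwards audio to the driver.

    A real daemon would add software mixing and network support here.
    """

    def __init__(self, driver: Driver) -> None:
        self.driver = driver

    def write(self, samples: List[int]) -> None:
        self.driver.write(samples)


def play(samples: List[int], sink: Driver) -> None:
    """Top layer: the only call an application developer should need."""
    sink.write(samples)


driver = FakeDriver()
play([1, 2, 3], driver)               # no daemon configured
play([4, 5, 6], SoundDaemon(driver))  # daemon configured, same call
print(driver.written)
```

The application-facing call is identical in both cases; only the plumbing underneath differs.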

When you develop a multimedia application or a mixer controller, why would you care about the insanely complex ALSA library, or about making ioctls somewhere in /dev, when GStreamer and Phonon provide you with high-level functions that can do it in a snap? When you want to insert sounds in your game or application, why would you care about those bits when, with SDL_Mixer or libcanberra, a single line of code is enough to decode and play the sound?

If we want to see sound working properly under Linux, we need to keep the stack simple and to clearly separate the roles. In the end, this boils down to very simple things:

  • Application developers must stop using ALSA and OSS. Both of them are completely unsuitable for the task, and it only serves the purpose of letting ALSA and OSS developers troll. Whichever application you are writing, there is one of the high-level frameworks – SDL, GStreamer, Phonon, libcanberra, KNotify, JACK – that is better.
  • OSS and ALSA should stop trying to be application-level APIs. Instead of trolling about which API is better, they need to write drivers so that sound fucking works instead of asking users to use Windows to see their brand new computer output some sound.
  • The plumbing layers (backends for the high-level APIs and PulseAudio) need excellent support for both ALSA and OSS. In the end, what matters is to have a driver for your sound card and to have it usable by all applications. Keeping the driver layer independent from the sound API is the only reliable way to keep things working over time. If someone else starts writing yet another driver framework, we will be able to make applications use it, instead of adding yet another ton of compatibility stacks.
  • In the end, compatibility layers that cross layers in the stack must disappear. ALSA emulation for PulseAudio? OSS emulation with a LD_PRELOAD hack? Seriously, WTF? Applications should not have used them as a sound API, and we need to trash this crap eventually.
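
To illustrate the point about keeping the driver layer independent from the sound API, here is a rough sketch of a high-level `play()` call that picks the first usable backend from a registry. Everything in it (`Backend`, `register`, the `"newdriver"` label, the availability probes) is a made-up placeholder, not the design of any existing framework.

```python
from typing import Callable, List, NamedTuple


class Backend(NamedTuple):
    name: str
    is_available: Callable[[], bool]     # e.g. "is this driver loaded?"
    write: Callable[[List[int]], None]   # hands samples to the driver


BACKENDS: List[Backend] = []


def register(backend: Backend) -> None:
    """Plumbing-side call: make a driver framework known to the API."""
    BACKENDS.append(backend)


def play(samples: List[int]) -> str:
    """Application-side call: use the first backend that is available."""
    for backend in BACKENDS:
        if backend.is_available():
            backend.write(samples)
            return backend.name
    raise RuntimeError("no usable sound backend")


def discard(samples: List[int]) -> None:
    """Placeholder for real output code."""


register(Backend("alsa", lambda: False, discard))  # pretend: no ALSA driver
register(Backend("oss", lambda: True, discard))    # pretend: OSS driver present

used_before = play([0, 0, 0])

# Someone writes yet another driver framework: only the plumbing changes.
BACKENDS.insert(0, Backend("newdriver", lambda: True, discard))
used_after = play([0, 0, 0])
print(used_before, used_after)
```

The application code (`play([0, 0, 0])`) is the same before and after the new backend appears, which is the property argued for above.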

Please, guys, hold off a minute. You are developing drivers and you must be praised for that. Keep doing that well, and stop trying to invade applications. There are people who may not know how to develop a driver, but who obviously know better what a sound API should look like.

 
 
( 28 comments )
(Anonymous) on September 26th, 2008 12:36 pm (UTC)
SDL, GStreamer, Phonon, libcanberra, KNotify, JACK: none of these is applicable for a (non-QT4) application that simply wants to output sound.

SDL: more suited for games
GStreamer: hmm, where is my callback?
Phonon: QT4
libcanberra: no, no sound events
KNotify: no KDE here
JACK: a bit more buffering would be nice

In my opinion, the higher level ALSA api with an output of "default" is indeed *the* api to use: simple, to the point, easy to use, multiple backends (gstreamer/pulseaudio/...).
(Frozen)(Thread) (Link)
np237np237 on September 26th, 2008 12:55 pm (UTC)
If you simply want to output sound, I really don’t get what is missing in libcanberra for this purpose.
(Frozen)(Parent) (Thread) (Link)
np237np237 on September 26th, 2008 12:56 pm (UTC)
I also forgot to add that the ALSA API does *not* have multiple backends. It has gross hacks to get through PulseAudio and OSS, but certainly not through GStreamer.
(Frozen)(Parent) (Thread) (Link)
Sapphire Catsapphirecat on October 1st, 2008 02:55 am (UTC)
And really, the *driver* should not be sending anything back to *userland*. Userland should be smart enough to get the right driver/config the first time around.

Which was actually your original point, I think.
(Frozen)(Parent) (Thread) (Link)
Mako the Wolfmakomk on October 6th, 2008 06:26 pm (UTC)
Of course, the part of the ALSA code that handles handing audio to the correct backend is entirely in userland. (Apparently, it's theoretically possible to get enough of ALSA running on OSes with no ALSA drivers or kernel support whatsoever in order to output sound from ALSA-supporting apps.)
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on September 26th, 2008 12:58 pm (UTC)
Not right
You are not correct about how ALSA people see the ALSA API. Takashi himself thinks that the ALSA API is not ideal for normal application programmers. That's very similar to how I, Lennart, see the PulseAudio API. The PA API is comprehensive and not redundant, but due to its asynchronous nature very difficult to use.

The reason why I advocate using the safe ALSA subset is mostly because it's the easiest not-completely-fucked-up API we have right now and it already is well established and will be supported for quite a while. I do believe that we need to establish an API for PCM that goes beyond ALSA and OSS and fixes all the problems we have now. But that's not there yet and I thus cannot recommend it. Note that I wrote "In the future I hope to introduce a more suitable and portable replacement for the safe ALSA subset of functions."

So again, neither I as the PA guy, nor Takashi as the ALSA guy consider their APIs to be definitive choices. That is in contrast to the OSS people who apparently indeed are convinced their API is a good choice, although everyone else might disagree, especially me.

I thought my guide allowed people to read between the lines that neither ALSA nor PA as an API is perfect: why else would I list two ALSA APIs there, and why would I write sentences like "The full ALSA API can appear very complex and is large"? Also, please note that in the first section PA isn't even mentioned for usage for anything except mixers!

Lennart
(Frozen)(Thread) (Link)
np237np237 on September 26th, 2008 01:10 pm (UTC)
Re: Not right
FWIW I agree with almost everything that you said in your guide to sound APIs. However you recommend using ALSA for some cases (like mixer control applications or basic PCM playback) while there are much better replacements that do not have backend restrictions.

Of course it is only a good thing if the ALSA developers acknowledge their API is not what we need and if they can help design a better low-level API.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on September 26th, 2008 02:09 pm (UTC)
Re: Not right
I'd like to know which API you consider a better choice for writing mixers.

Also, I do think that the ALSA API is still the best choice for PCM. All other APIs have critical drawbacks.
(Frozen)(Parent) (Thread) (Link)
np237np237 on September 26th, 2008 03:08 pm (UTC)
Re: Not right
“I'd like to know which API you consider a better choice for writing mixers.”

High-level APIs like GStreamer, Phonon and JACK (maybe even SDL?) have all you need to write a mixer.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on September 26th, 2008 03:12 pm (UTC)
Re: Not right
The GStreamer mixer abstraction is a trainwreck. And they all only wrap hardware mixers, i.e. don't allow per-stream volume control which in my opinion is crucial to have.

Lennart
(Frozen)(Parent) (Thread) (Link)
np237np237 on September 26th, 2008 03:23 pm (UTC)
Re: Not right
Well, SDL does have a per-stream volume control, but only because it does software mixing by itself.

I agree that this is the point where you most need integration with the sound daemon layer, which is the one that actually does the mixing. For that, using the PulseAudio API is best, but you lose genericity. I’d prefer to see these features exposed in the high-level API instead. Another solution is of course to put a stop to all the backend madness and have all high-level APIs rely only on PulseAudio, which would not be a bad thing.
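
For readers unfamiliar with why software mixing gives you per-stream volume for free, here is a small sketch. It is not SDL's or PulseAudio's actual code; the function name `mix` and the 16-bit sample format are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

INT16_MIN, INT16_MAX = -32768, 32767


def mix(streams: Sequence[Tuple[Sequence[int], float]]) -> List[int]:
    """Mix 16-bit PCM streams, each with its own volume (gain).

    Shorter streams are padded with silence; the sum is clipped to int16.
    """
    length = max((len(samples) for samples, _ in streams), default=0)
    out: List[int] = []
    for i in range(length):
        total = 0.0
        for samples, gain in streams:
            if i < len(samples):
                total += samples[i] * gain  # per-stream volume applied here
        out.append(max(INT16_MIN, min(INT16_MAX, int(round(total)))))
    return out


music = [1000, 2000, 3000, 4000]
beep = [30000, 30000]
mixed = mix([(music, 0.5), (beep, 1.0)])  # music at half volume, beep at full
print(mixed)
```

Because whoever does the mixing touches every sample of every stream anyway, scaling each stream separately costs one multiplication. A hardware mixer only ever sees the already-summed signal.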
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on October 1st, 2008 12:33 am (UTC)
Re: Not right
they might have fooled the distro maintainers, but we users are not fooled. pulse audio is not ready for prime time and the fact seemingly every distro under the sun is now using it is not a good thing at all!
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on September 26th, 2008 05:15 pm (UTC)
*Bing bing bing*

We have a winner!
(Frozen)(Thread) (Link)
(Anonymous) on October 1st, 2008 03:55 am (UTC)
Have you used OSS4 yet? Didn't think so. I know game developers (yes, the real ones who do port to Linux) who absolutely hate ALSA. Many drivers are broken, many times it will lie to you about the frequency and cause buffering issues, etc.

All agree that OSS is way simpler and just works^TM. And wow, it would even be portable to other Unix systems!

OSS4 has per application volume control and more hardware support than ALSA with better sound quality. Why is anyone using ALSA? ALSA can't use my multiple mic inputs without crashing applications (Teamspeak, Mumble, Skype at the same time).

And for the record, until Pulseaudio becomes unbroken (how many ALSA/OSS apps out there don't work correctly? WHY do we need networked audio? HOW can a high level sound server achieve true low latency?), I'm sticking with OSS4. Perhaps you and the rest of the world should try it instead of sticking with your preconceived notions from 6 years ago. It's almost a completely different beast now.
(Frozen)(Thread) (Link)
np237np237 on October 1st, 2008 07:47 am (UTC)
Thanks for making my point, this aggressive and useless behavior is exactly what I’m talking about. And until you understand the OSS API is just as unsuitable as ALSA, we’re going nowhere.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on October 1st, 2008 01:30 pm (UTC)
It's still obvious you haven't touched OSS4 as it has its own evolved API while still being compatible with the ancient OSS.

"OSS 4.0 version fixes all the drawbacks of the earlier API. We have improved the device abstraction so that differences between the devices have been completely hidden from the applications. At the same time we have implemented some API extensions that permit control of every detail of the device if the application wants to do that (the utilities shipped with OSS provide complete control of the device features so there is no need to support them in the application)."

http://manuals.opensound.com/developer/ossapi.html


You say I'm being hostile, but it's obvious you're being naive.
(Frozen)(Parent) (Thread) (Link)
acidtaoistacidtaoist on October 1st, 2008 06:40 pm (UTC)
does OSS4 also allow multiple applications to use the soundcard concurrently?
(Frozen)(Parent) (Thread) (Link)
acidtaoistacidtaoist on October 1st, 2008 06:42 pm (UTC)
oh, and i forgot: does OSS4 also run on windows and OSX?

why do you guys keep forgetting portaudio? you behave as if unix was the only operating system around.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on December 2nd, 2008 09:15 am (UTC)
As a Unix programmer for ten years, may I ask you:
why do you think the OSS ioctl-style APIs are not suitable for
audio application programming? And could you give me some examples of how
ioctl gave you a headache in your programming experience?

And why do I need a sound server to just play some music, do a voice chat,
play some games, etc?

(Frozen)(Parent) (Thread) (Link)
np237np237 on December 2nd, 2008 01:19 pm (UTC)
“As a Unix programmer for ten years”

You’ve said it all. As a Unix programmer for ten years as well, I have no trouble coding something involving ioctls. But I am a Unix programmer, not an audio application developer. Low-level Unix programming is not suitable at all for coding audio applications; you need a dedicated high-level API for that.

“And why do I need a sound server to just play some music, do a voice chat,
play some games, etc?”


I never said you need it. The sound server is and should remain an optional layer between the high-level API and the sound driver.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on October 1st, 2008 10:39 am (UTC)
I think there are two high-level APIs missing from the list of APIs that applications should probably use.
Pulseaudio - for full duplex desktop applications
OpenAL - for games that need complex 3d sound
(Frozen)(Thread) (Link)
Wandering Sagethe_wordspinner on October 1st, 2008 03:06 pm (UTC)
I'll toss in that in my situation, they're both worthless.

Bought a new desktop, with a Creative X-Fi Gamer, which meant that ALSA was out. I figured that was it for my short-lived Ubuntu days, but someone on the forums linked me to a lengthy article on how to switch to OSS. Mind you, I'd been on Linux for all of a day at this point, but I shrugged and slogged my way through the tutorial over the course of another day or so.

OSS now works. Every other bootup. With awful sound quality. In exactly two applications (Totem and Rhythmbox).

Honestly, this is pretty sad. I am completely over the "hardware developers need to give us drivers" argument--when I use an OS, I expect functionality, not broken drivers and obscure command line based installs to make things halfway work.



Which makes your entire point all the worse: why on earth are ALSA and OSS attempting to be high level when they can't even perform the tasks they are supposed to (make my sound card functional under Linux)?
(Frozen)(Thread) (Link)
londonplumber on July 29th, 2009 09:25 am (UTC)
First off, OSS doesn't support enough hardware. I don't know the exact numbers, but I remember support for ESI Juli@ recently announced as "big news!". ALSA supported this card for so long that I don't remember how long it was.

(Frozen)(Thread) (Link)
(Anonymous) on November 22nd, 2009 06:49 am (UTC)
Quotation of Plato

Those who intend on becoming great should love neither themselves nor their own things, but only what is just, whether it happens to be done by themselves or others.
Quotation of Plato
(Frozen)(Thread) (Link)
(Anonymous) on January 22nd, 2010 08:12 pm (UTC)
pulseaudio over alsa or OSS?!?!?
are you for real???

pulseaudio sucks crap, it is the first thing to go if i am doing realtime audio in linux...1st off, you are comparing apples(oss/alsa) to oranges(jack,phonon,gstreamer). none of the latter actually are low-level. none communicate with the hardware...

...so your article makes no sense at all!!

and i think in the end it really isn't an alsa/OSS debate. alsa is clearly winning that war, and phonon and the rest can pretty much dissolve...
bigger companies in the pro-audio world actually are writing alsa drivers for their hardware - ie: native instruments.. which is a big deal.

jack is the future, coupled with a refined ALSA set of tools..

furthermore, pulseaudio is extremely high-level and isn't suitable for anything but playing youtube and mp3's. pure garbage!

we need to stick to one standard - ALSA and potentially jack on top of that, if you have uses for it!
(Frozen)(Thread) (Link)
np237np237 on January 22nd, 2010 09:43 pm (UTC)
Re: pulseaudio over alsa or OSS?!?!?
You need to learn to read.
(Frozen)(Parent) (Thread) (Link)
(Anonymous) on April 20th, 2010 11:37 am (UTC)
I sort of agree...
As someone who does a lot of DSP work I can see why you'd have a lot to say about sound APIs. However there is a lot more to this debate than just APIs. Playback is not playback. There are things you should also take into account... for instance resampling. How resampling is done has a rather dramatic effect on sound accuracy. This doesn't matter for many applications... however it does matter for just as many. There are countless other scenarios similar to this that can vary depending on what you use.

I don't really disagree with what you say entirely, just that things like OSS, ALSA, and so on still matter a great deal. There are many times when you need to know exactly how your samples are being handled. If you're trying to write an application that has ANY emphasis on sound quality and you just pass it off to gstreamer/phonon/jack/etc blindly then you'll surely fail at that task.
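
As a rough illustration of the resampling point, the sketch below converts a 1 kHz sine from 44100 Hz to 48000 Hz with two deliberately simple methods and compares the error against the ideal signal. Nearest-neighbour and linear interpolation are chosen only because they are short; this is not a claim about what ALSA, OSS, GStreamer or any other layer actually uses, and all function names are made up for the example.

```python
import math
from typing import Callable, List

IN_RATE, OUT_RATE, FREQ = 44100, 48000, 1000.0


def nearest(samples: List[float], pos: float) -> float:
    """Pick the closest input sample."""
    return samples[min(len(samples) - 1, int(round(pos)))]


def linear(samples: List[float], pos: float) -> float:
    """Interpolate linearly between the two surrounding input samples."""
    left = int(pos)
    right = min(left + 1, len(samples) - 1)
    frac = pos - left
    return samples[left] * (1.0 - frac) + samples[right] * frac


def resample(samples: List[float],
             interpolate: Callable[[List[float], float], float]) -> List[float]:
    n_out = len(samples) * OUT_RATE // IN_RATE
    # Position of each output sample, measured in input samples.
    return [interpolate(samples, i * IN_RATE / OUT_RATE) for i in range(n_out)]


def rms_error(a: List[float], b: List[float]) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


source = [math.sin(2 * math.pi * FREQ * n / IN_RATE) for n in range(441)]
ideal = [math.sin(2 * math.pi * FREQ * n / OUT_RATE) for n in range(480)]

err_nearest = rms_error(resample(source, nearest), ideal)
err_linear = rms_error(resample(source, linear), ideal)
print(f"nearest: {err_nearest:.5f}  linear: {err_linear:.5f}")
```

Even between these two toy methods the error differs noticeably, which is why it can matter which layer of the stack performs the conversion and how.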
(Frozen)(Thread) (Link)
np237np237 on April 20th, 2010 11:48 am (UTC)
Re: I sort of agree...
When you write an application to play MP3/Vorbis audio, the one thing that matters most in it is sound quality. Yet, using OSS or ALSA directly for the task of playing an MP3 would be completely inappropriate.

It’s exactly the opposite: you need to pass it to GStreamer or Phonon directly. And if there’s a problem with the quality of the output, fix it in GStreamer or Phonon. Don’t just work around the bug by using inappropriate APIs.
(Frozen)(Parent) (Thread) (Link)