
Bits to notes. To money?
Previously on Stream of unconsciousness: a first (simple) audio streaming implementation has been put in place using the Audio File Stream Example. The main parser mechanism has been analysed, but a lot of mysteries about the callbacks still have to be uncovered. If you are unsatisfied with this micro-summary, you can read the whole story in this post.
The time has come to introduce another audio-related library: the Audio Queue Services library. This is a C API under the Audio Toolbox framework (the same framework that hosts the Audio File Stream library introduced in the previous post).
This library can be used to record and, as in our case, to play audio. Two fundamental documents for understanding audio playback are About Audio Queues, which introduces the library’s main concepts and structure, and Playing Audio, which, as its name implies, deals with the details of audio playback.
As described in these documents, playing audio is just a matter of:
1. Define a custom structure to manage state, format, and path information.
2. Write an audio queue callback function to perform the actual playback.
3. Write code to determine a good size for the audio queue buffers.
4. Open an audio file for playback and determine its audio data format.
5. Create a playback audio queue and configure it for playback.
6. Allocate and enqueue audio queue buffers. Tell the audio queue to start playing. When done, the playback callback tells the audio queue to stop.
7. Dispose of the audio queue. Release resources.
Let’s see how these tasks are dealt with in the code example starting with the first callback we still have to describe: PropertyListenerProc.
The PropertyListenerProc callback definition is:
typedef void (*AudioFileStream_PropertyListenerProc) (
    void                      *inClientData,
    AudioFileStreamID         inAudioFileStream,
    AudioFileStreamPropertyID inPropertyID,
    UInt32                    *ioFlags
);
This callback has to be implemented in your code and is invoked by the parser each time it finds (parsing is its job, after all) or sets the value of an audio property. The meaning is (parser speaking now): “Let’s see these bytes, and these, and these... oh wait, I’ve found/set an audio property!”.
The example code does the following:
- retrieves the user data (as a MyData structure, more on this later)
- if the property that has been set is the Ready to Produce Packets property (kAudioFileStreamProperty_ReadyToProducePackets), meaning that the parser has covered all the useful properties and is about to parse the audio data:
  - the audio queue and the audio queue buffers are initialized
  - a listener (the MyAudioQueueIsRunningCallback callback) is attached to changes of the audio queue ‘running’ property.
The first thing worth some explanation is the MyData structure and its use in the code example. Remember that the Audio File Stream library is function-oriented; in particular, note that the inPropertyListenerProc and inPacketsProc parameters of the AudioFileStreamOpen function are bound to specific C function signatures and cannot be ObjC instance methods. When a callback is called it has no context per se (it is, in a sense, stateless); the context of the callback (i.e. all the relevant information that describes the streaming session) has to be passed in as a parameter, and that’s the purpose of the inClientData parameter passed to the AudioFileStreamOpen function. The client code uses this parameter to track the state of a given streaming session and to retrieve its context in the different callbacks. In this code example, a struct called MyData keeps track of everything the process needs (e.g. the audio file stream identifier, the audio queue and queue buffers, the mutex and condition variables that synchronize the consumption of bytes with the music being played, and so on).
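To make the client-data idea concrete, here is a minimal, portable sketch of the pattern. It does not use the Audio Toolbox API at all: ToyParser, SessionState, the property IDs and every function name below are hypothetical stand-ins invented for illustration. The only point is to show how a plain C callback gets its context back from an opaque pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the library's types; NOT the Audio Toolbox API. */
typedef unsigned int PropertyID;
enum { kPropReadyToProducePackets = 1, kPropOther = 2 };

typedef void (*PropertyListener)(void *clientData, PropertyID propertyID);

/* Session state, playing the role that MyData plays in the example. */
typedef struct {
    bool queueCreated;      /* set once the (pretend) audio queue exists */
    unsigned int notified;  /* how many times the listener was invoked   */
} SessionState;

/* A toy "parser" that only remembers the listener and the opaque pointer. */
typedef struct {
    PropertyListener listener;
    void *clientData;
} ToyParser;

void ToyParserOpen(ToyParser *parser, PropertyListener listener, void *clientData)
{
    parser->listener = listener;
    parser->clientData = clientData;
}

/* Called when the toy parser "finds" a property: it hands the pointer back. */
void ToyParserFoundProperty(const ToyParser *parser, PropertyID propertyID)
{
    parser->listener(parser->clientData, propertyID);
}

/* The callback: a plain C function that recovers its context from the pointer. */
void MyPropertyListener(void *clientData, PropertyID propertyID)
{
    assert(clientData != NULL);
    SessionState *state = (SessionState *)clientData;

    state->notified++;
    if (propertyID == kPropReadyToProducePackets && !state->queueCreated) {
        /* In the real example, the audio queue and its buffers are created here. */
        state->queueCreated = true;
    }
}
```

The caller would create a SessionState, pass its address to ToyParserOpen together with MyPropertyListener, and from then on every callback invocation can read and update that same state without any global variable.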
The structure is declared as follows:
struct MyData
{
    AudioFileStreamID audioFileStream;                            // the audio file stream parser
    AudioQueueRef audioQueue;                                     // the audio queue
    AudioQueueBufferRef audioQueueBuffer[kNumAQBufs];             // audio queue buffers
    AudioStreamPacketDescription packetDescs[kAQMaxPacketDescs];  // packet descriptions for enqueuing audio
    unsigned int fillBufferIndex;                                 // the index of the audioQueueBuffer that is being filled
    size_t bytesFilled;                                           // how many bytes have been filled
    size_t packetsFilled;                                         // how many packets have been filled
    bool inuse[kNumAQBufs];                                       // flags to indicate that a buffer is still in use
    bool started;                                                 // flag to indicate that the queue has been started
    bool failed;                                                  // flag to indicate an error occurred
    pthread_mutex_t mutex;                                        // a mutex to protect the inuse flags
    pthread_cond_t cond;                                          // a condition variable for handling the inuse flags
    pthread_cond_t done;                                          // a condition variable signaled when the queue is done playing
};
typedef struct MyData MyData;
The definition of this structure takes care of point 1 of the above list.
Two other important things that happen in the PropertyListenerProc callback are the initialization of the audio queue and its buffers and the setup of MyAudioQueueIsRunningCallback.
The audio queue is initialized (AudioQueueNewOutput function) with the format of the audio to play (&asbd), an audio queue callback (MyAudioQueueOutputCallback, see point 2 of the above list) and the timeless MyData structure. An output parameter is used to save a reference to the newly created playback audio queue; guess where? In a field of the MyData structure, of course! This covers points 4 (more or less) and 5 of the list.
As part of the audio queue initialization, the magic cookie is also set. Some file formats (e.g. MPEG-4 AAC) need a structure that contains some audio metadata: that’s the magic cookie. In the code example its value is read from the audio stream session (AudioFileStreamGetProperty function) and written to the audio queue (AudioQueueSetProperty function).
Three audio queue buffers (the number suggested by the documentation) are also allocated (AudioQueueAllocateBuffer function) with a constant size, which is a shortcut for point 3 of the list. Part of point 6 of the list is also covered (the other part comes later).
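The example uses a constant buffer size, but point 3 of the list asks for code that determines a good one. The following is only a sketch of one possible heuristic, written with plain C types so that it compiles anywhere: size the buffer to hold roughly a given duration of audio, then clamp the result. The function name, its parameters and the two bounds are assumptions made for this illustration, not values taken from the example.

```c
#include <assert.h>
#include <stdint.h>

/* Bounds chosen for this sketch only (assumptions, not from the example). */
static const uint32_t kMinBufferSize = 0x4000;   /* 16 KB floor    */
static const uint32_t kMaxBufferSize = 0x50000;  /* 320 KB ceiling */

/* Returns a buffer size in bytes that holds about `seconds` of audio.
   framesPerPacket == 0 means the packet duration is not constant,
   so no estimate is possible and the ceiling is used instead. */
uint32_t DeriveBufferSize(double sampleRate, uint32_t framesPerPacket,
                          uint32_t maxPacketSize, double seconds)
{
    assert(sampleRate > 0.0 && seconds >= 0.0);

    double bytes;
    if (framesPerPacket != 0) {
        /* packets per second * seconds * worst-case bytes per packet */
        bytes = sampleRate / framesPerPacket * seconds * maxPacketSize;
    } else {
        bytes = kMaxBufferSize;
    }

    /* Clamp before converting, so the cast can never overflow. */
    if (bytes > kMaxBufferSize) bytes = kMaxBufferSize;
    uint32_t size = (uint32_t)bytes;

    if (size < kMinBufferSize) size = kMinBufferSize;
    /* Whatever happens, one whole packet must fit in the buffer. */
    if (size < maxPacketSize) size = maxPacketSize;
    return size;
}
```

In a real player the sample rate and frames per packet would come from the stream’s audio format description, and the maximum packet size from the parser; here they are just numbers passed in by the caller.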
That’s the whole setup that comes with the PropertyListenerProc callback, performed when the parser is ready to produce packets.
When everything is in place, the real audio packets make their appearance. Each time some packets are parsed the PacketsProc callback is called.
It is declared as follows:
typedef void (*AudioFileStream_PacketsProc) (
    void                         *inClientData,
    UInt32                       inNumberBytes,
    UInt32                       inNumberPackets,
    const void                   *inInputData,
    AudioStreamPacketDescription *inPacketDescriptions
);
This callback fills the current queue buffer with the audio data (the bytes of the audio packets) and, each time a buffer is full, enqueues it (MyEnqueueBuffer function, which calls the AudioQueueEnqueueBuffer function and starts the queue if needed) and asks for the next buffer to fill. The latter task is performed by the WaitForFreeBuffer function: it moves on to the next buffer and, if that buffer is still in use (its content has not been played yet), blocks the thread on a condition variable, guarded by the mutex, until the buffer is free again. Keep in mind that the bytes producer (the stream parser) is faster than the bytes consumer (the audio ‘player’), so some delay has to be artificially introduced on the producer side (and that is exactly what this function does). That completes point 6 of the above list. Point 7 (dispose and release) is managed at the end of the loop in the main function.
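The fill-and-enqueue logic can be sketched in portable C as follows. This is a simplified illustration, not the example’s code: PacketDesc, FillState, the tiny buffer size and the function names are all hypothetical, and the ‘enqueue’ step only bumps a counter where the real code would hand the buffer to the audio queue and wait for a free one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define kBufSize        16  /* tiny on purpose, to make the hand-off visible */
#define kMaxPacketDescs 4

/* Hypothetical, simplified packet description: where a packet starts, how long it is. */
typedef struct { size_t startOffset; size_t byteSize; } PacketDesc;

typedef struct {
    unsigned char buffer[kBufSize];     /* the buffer currently being filled  */
    PacketDesc descs[kMaxPacketDescs];  /* descriptions of the packets in it  */
    size_t bytesFilled;
    size_t packetsFilled;
    unsigned int buffersEnqueued;       /* stands in for the real enqueue call */
} FillState;

/* Stand-in for "enqueue the buffer, then wait for the next free one". */
void EnqueueCurrentBuffer(FillState *s)
{
    s->buffersEnqueued++;
    s->bytesFilled = 0;
    s->packetsFilled = 0;
}

/* Copies each packet into the current buffer, handing the buffer off whenever
   the next packet does not fit or the description table is full.
   Returns 0 on success, -1 if a packet can never fit in a buffer. */
int HandlePackets(FillState *s, const void *data, const PacketDesc *in, size_t numPackets)
{
    assert(s != NULL && data != NULL && in != NULL);

    for (size_t i = 0; i < numPackets; ++i) {
        size_t size = in[i].byteSize;
        if (size > kBufSize) return -1;

        if (kBufSize - s->bytesFilled < size) EnqueueCurrentBuffer(s);

        memcpy(s->buffer + s->bytesFilled,
               (const unsigned char *)data + in[i].startOffset, size);

        /* The stored offset is relative to the queue buffer, not to the input. */
        s->descs[s->packetsFilled].startOffset = s->bytesFilled;
        s->descs[s->packetsFilled].byteSize = size;
        s->bytesFilled += size;
        s->packetsFilled++;

        if (s->packetsFilled == kMaxPacketDescs) EnqueueCurrentBuffer(s);
    }
    return 0;
}
```

Note that a partially filled buffer is left pending when the function returns: the next batch of packets keeps filling it, which is why the fill counters have to live in the state structure rather than in local variables.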
Each time an audio buffer has been ‘played’, the audio queue callback (MyAudioQueueOutputCallback) is called. It marks the buffer as no longer in use and signals the condition variable, waking up the thread blocked in the WaitForFreeBuffer function. Remember the listener on the change of state of the audio queue ‘running’ property (MyAudioQueueIsRunningCallback)? In the same way, this function signals a second condition variable (done) when the audio queue has finished playing, so that whoever is waiting for the end of playback can proceed.
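The hand-shake between the producer and the played-buffer callback boils down to a classic mutex plus condition variable pattern. Here is a minimal pthread sketch of it; BufferPool and the three function names are hypothetical and much simpler than the example’s code, but the locking discipline is the one described above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define kNumBufs 3

/* Hypothetical, stripped-down version of the bookkeeping kept in MyData. */
typedef struct {
    bool inuse[kNumBufs];    /* true while a buffer is waiting to be played */
    unsigned int fillIndex;  /* the buffer the producer is filling          */
    pthread_mutex_t mutex;   /* protects inuse                              */
    pthread_cond_t cond;     /* signaled whenever a buffer becomes free     */
} BufferPool;

void BufferPoolInit(BufferPool *p)
{
    for (int i = 0; i < kNumBufs; ++i) p->inuse[i] = false;
    p->fillIndex = 0;
    pthread_mutex_init(&p->mutex, NULL);
    pthread_cond_init(&p->cond, NULL);
}

/* Producer side: mark the current buffer as busy, move on to the next one
   and block until that one is free. */
void EnqueueAndWaitForFreeBuffer(BufferPool *p)
{
    pthread_mutex_lock(&p->mutex);
    p->inuse[p->fillIndex] = true;
    p->fillIndex = (p->fillIndex + 1) % kNumBufs;
    /* A loop, not an if: the condition is re-checked after every wake-up. */
    while (p->inuse[p->fillIndex])
        pthread_cond_wait(&p->cond, &p->mutex);
    pthread_mutex_unlock(&p->mutex);
}

/* Consumer side: what the output callback does once a buffer has been played. */
void BufferPlayed(BufferPool *p, unsigned int index)
{
    assert(index < kNumBufs);
    pthread_mutex_lock(&p->mutex);
    p->inuse[index] = false;
    pthread_cond_signal(&p->cond);
    pthread_mutex_unlock(&p->mutex);
}
```

In a running player BufferPlayed would be called from the audio queue’s own thread while the producer sits in EnqueueAndWaitForFreeBuffer; pthread_cond_wait releases the mutex while sleeping, which is what lets the callback get in and flip the flag.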
More or less, that’s the whole story of the Audio File Stream Example. It seems rather convoluted (OK, it actually IS convoluted), but the more you study it, the more you feel at ease with it.
The example is just that, an example to introduce two audio libraries; it is far from complete (for example, it doesn’t handle CBR data) and has some limitations. For a detailed analysis of its problems and a brilliant example of how to overcome them, have a look at Streaming and playing an MP3 stream and its revision, Revisiting an old post: Streaming and playing an MP3 stream. They offer a very interesting analysis of the Audio File Stream Example and its limitations, and extend the example into a sample application that streams and plays an audio file from a URL on the iPhone or Mac (source code downloadable from GitHub!). The whole blog (Cocoa with Love) by Matt Gallagher is a GREAT place to find technical advice, hints, code examples and more; definitely a place to go if you are involved in OS X / iOS development (we’ve also added the link to our blogroll section on the right).
That’s it. See you next time with our inglorious first implementation of the audio streaming.