Introduction
Last September, I wrote a post titled "Streaming and playing an MP3 stream". The post was largely an experiment — I just wanted to see if I could play a streaming MP3 by quickly adapting Apple's AudioFileStreamExample to accept an HTTP data stream.
Unexpectedly, the post became one of my most popular. The attention quickly revealed the limitations in my approach:
- The blend of Objective-C and C was muddled and led to a situation where neither was being used cleanly.
- The boolean flags I copied from the original example were a poor way to describe the playback state, and many situations were not covered by them.
- Sending notifications to the user-interface on a thread that isn't the main thread causes problems.
- The extra thread I added (the download thread) was never thread-safe.
I've finally decided to take the time to address these issues and present an approach which is a little more robust and a little easier to extend if needed.
You can download the complete AudioStreamer project as a zip file (around 110kB) which contains Xcode projects for both iPhone and Mac OS. You can also browse the source code repository.
Limited scope
One point should be clarified before I continue: this class is intended for streaming audio. By streaming, I don't simply mean "an audio file transferred over HTTP". Instead, I mean a continuous HTTP source without an end that continues indefinitely (like a radio station, not a single song).
Yes, this class will handle fixed-length files transferred over HTTP but it is not ideal for the task.
This class does not handle:
- Buffering of data to a file
- Seeking within downloaded data
- Feedback about the total length of the file
- Parsing of ID3 metadata
These things often can't be done on streaming data, so this class doesn't try. See the "Adding other functionality" section for hints about how the class could be reorganised to handle some of these features.
Taking code out of C functions
Since I had borrowed the AudioFileStream and AudioQueue callback functions from Apple's example, they were Standard C.
My first change was to make these 6 callback functions (7 including the CFReadStream callback) little more than wrappers around Objective-C methods:
void MyPacketsProc(
    void *inClientData,
    UInt32 inNumberBytes,
    UInt32 inNumberPackets,
    const void *inInputData,
    AudioStreamPacketDescription *inPacketDescriptions)
{
    // this is called by audio file stream when it finds packets of audio
    AudioStreamer *streamer = (AudioStreamer *)inClientData;
    [streamer
        handleAudioPackets:inInputData
        numberBytes:inNumberBytes
        numberPackets:inNumberPackets
        packetDescriptions:inPacketDescriptions];
}
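For context, the method on the other side of this wrapper mirrors the C callback's parameters, and the inClientData pointer is simply self, passed in when the stream parser is opened. The following is a sketch rather than the literal code from the class; MyPropertyListenerProc, audioFileStream and kAudioFileMP3Type are the names used in Apple's example and are assumptions here:

// declared in the class; the parameter types mirror the C callback
- (void)handleAudioPackets:(const void *)inInputData
    numberBytes:(UInt32)inNumberBytes
    numberPackets:(UInt32)inNumberPackets
    packetDescriptions:(AudioStreamPacketDescription *)inPacketDescriptions;

// when opening the parser, self is passed as the inClientData pointer that
// MyPacketsProc casts back to an AudioStreamer
OSStatus err = AudioFileStreamOpen(self, MyPropertyListenerProc, MyPacketsProc,
    kAudioFileMP3Type, &audioFileStream);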
At a compiled code level, this is a step backwards: all I've done is slow the program down by an extra Objective-C message send.
Technically, a C function that takes a "context" pointer (like the inClientData pointer here) is not significantly different to a method. What a method does is make data hiding and data abstraction easier. Within a method, you can easily access the instance variables of an object and you don't need to explicitly pass context into each function.
This is the cliché argument in favor of object-orientation — but it isn't why I reorganized these functions and methods.
The honest reason why I did it is aesthetics: it is easier to read a class that is implemented using Objective-C methods alone — it's more consistent. I chose to move towards an Objective-C aesthetic and away from the Standard C aesthetic of the CoreAudio sample code to promote consistent formatting, consistent means of accessing state variables, consistent ways of invoking methods and consistent ways of synchronizing access to the class.
Describing state
With the majority of code now inside the class, I was in a better position to start handling changes through methods rather than direct member access.
My original approach to state came from Apple's original example. This example had just one piece of state: a bool named finished (which indicated that the run loop should exit).
The problem with this flag is how simple it is. It is unable to distinguish between the following:
- End of file, normal automatic stop.
- The user has asked the AudioStreamer to stop but the AudioQueue thread has not yet responded.
- An error has occurred before the AudioQueue thread is created and we must exit.
- We are stopping the AudioQueue for temporary reasons (clearing it, changing device, seeking to a new point) but we don't want the loop to stop.
For Apple's example, there was no problem: the first case was the only one that ever occurred.
As a hasty solution, I had added started and failed flags but these really only covered the first and third case adequately.
In the end, I realized that the AudioStreamer needed much more descriptive state, where every combination of progress across the threads is given its own value:
typedef enum
{
    AS_INITIALIZED = 0,
    AS_STARTING_FILE_THREAD,
    AS_WAITING_FOR_DATA,
    AS_WAITING_FOR_QUEUE_TO_START,
    AS_PLAYING,
    AS_BUFFERING,
    AS_STOPPING,
    AS_STOPPED,
    AS_PAUSED
} AudioStreamerState;
and when stopping, one of the following values would also be needed:
typedef enum
{
    AS_NO_STOP = 0,
    AS_STOPPING_EOF,
    AS_STOPPING_USER_ACTION,
    AS_STOPPING_ERROR,
    AS_STOPPING_TEMPORARILY
} AudioStreamerStopReason;
In this way, the state always describes where every thread is and the stop reason explains why a transition is occurring.
Combining this with an error code that replaces the old failed flag, I now have a complete description of the state.
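For example, the isFinishing check used later in the run loop can decide whether the object is on its way to a stop that actually matters by combining all three pieces. This is a sketch of the idea (the errorCode instance variable and AS_NO_ERROR value are assumed names for the error code mentioned above):

- (BOOL)isFinishing
{
    @synchronized(self)
    {
        // an error has been recorded, or we are stopping for some reason
        // other than a temporary (seek/flush) stop
        if ((errorCode != AS_NO_ERROR && state != AS_INITIALIZED) ||
            ((state == AS_STOPPING || state == AS_STOPPED) &&
                stopReason != AS_STOPPING_TEMPORARILY))
        {
            return YES;
        }
    }
    return NO;
}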
By cleaning up the state of the object, I was able to make the object capable of state transitions that weren't previously possible, including pausing/unpausing and returning to the AS_INITIALIZED state after a stop (instead of requiring that the class be released after stopping).
Notifications
In the old version of the project, the only way for the user-interface to follow the playback state was to observe the isPlaying property on the object, which reflected the kAudioQueueProperty_IsRunning property of the AudioQueue.
This observing was handled through KeyValueObserving. I'm a big fan of KeyValueObserving for its simplicity and ubiquity but this was not the correct place to use it.
KeyValueObserving always invokes the observer methods in the same thread as the change. Since all changes in AudioStreamer happen in secondary threads, this means that the observer methods were getting invoked in secondary threads.
Why is this bad? A minor drawback is simply that it is unexpected for the observer, but the bigger problem is that the sole purpose of observing this property was to update the user-interface, and the user-interface on the iPhone cannot be updated from any thread except the main thread. Even on the Mac, performing updates off the main thread can have unexpected and glitchy results.
The solution is to retain the NSNotificationCenter of the thread that first calls start on the object and use this center to send messages as follows:
NSNotification *notification =
    [NSNotification
        notificationWithName:ASStatusChangedNotification
        object:self];
[notificationCenter
    performSelector:@selector(postNotification:)
    onThread:[NSThread mainThread]
    withObject:notification
    waitUntilDone:NO];
Don't invoke postNotification: directly from the secondary thread as, like most methods, it is not thread safe and it could be in use from the main thread.
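On the receiving side, the user-interface registers with the default center as usual and its handler is then guaranteed to run on the main thread. A minimal sketch follows; the playbackStateChanged: handler name is hypothetical and it assumes the class still exposes an isPlaying accessor:

// somewhere in the controller's setup (e.g. viewDidLoad or awakeFromNib)
[[NSNotificationCenter defaultCenter]
    addObserver:self
    selector:@selector(playbackStateChanged:)
    name:ASStatusChangedNotification
    object:streamer];

// the notification is posted on the main thread, so UI updates are safe here
- (void)playbackStateChanged:(NSNotification *)aNotification
{
    AudioStreamer *streamer = (AudioStreamer *)[aNotification object];
    if ([streamer isPlaying])
    {
        // e.g. change the button title to "Stop"
    }
    else
    {
        // e.g. change the button title to "Play"
    }
}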
Thread safety
Despite adding an extra thread on top of Apple's AudioFileStreamExample, I never really spent any time thinking about thread safety — a reckless approach to stability. In my defence, Apple's example wasn't exactly cautious with its threads and would quit while the AudioQueue's thread was still playing the last buffer.
The most efficient approach to threading is to carefully enter @synchronized (or NSLock or pthread_mutex_lock) in a tight region around any use of a shared variable.
Unfortunately for the AudioStreamer class, almost everything in the class is shared. Instead, I decided to go for the decidedly less efficient approach of running almost everything in the class within a @synchronized section, emerging only at points when control must be yielded to other threads.
The drawback is that the code rarely runs simultaneously on multiple threads (although the threading here exists for blocking and I/O reasons, not for multi-threaded performance, so that's not a problem). The advantage of this heavy-handed locking approach is that the only threading condition that can still cause problems is a deadlock.
When do deadlocks occur? Only when you're waiting for another thread to do something while you're inside the synchronized section needed by that other thread. The simple solution: never wait for another thread inside a synchronized section.
AudioStreamer has three situations where one thread waits for another:
- The run loop (the AudioFileStream thread waits for any kind of control communication from the main thread or a playback finished notification from the AudioQueue thread).
- The enqueueBuffer method (the AudioFileStream thread waits for the AudioQueue thread to free up a buffer).
- Synchronous AudioQueueStop invocations (waits for the AudioQueue to release all buffers).
The first two points are easy: perform these actions (and any method invocation which invokes them) outside of the @synchronized section.
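As a concrete illustration of the second point, the wait for a free audio buffer can sit on its own mutex and condition variable, entirely outside @synchronized(self). The sketch below uses the queueBuffersMutex mentioned in a moment; the inuse array, fillBufferIndex and queueBufferReadyCondition names are my assumptions for illustration:

// in enqueueBuffer (sketch): wait for the AudioQueue thread to mark a buffer
// free, without holding the @synchronized(self) lock
pthread_mutex_lock(&queueBuffersMutex);
while (inuse[fillBufferIndex])
{
    pthread_cond_wait(&queueBufferReadyCondition, &queueBuffersMutex);
}
pthread_mutex_unlock(&queueBuffersMutex);

@synchronized(self)
{
    // only re-enter the main lock once the wait is over, and check whether
    // we were asked to finish while we were unlocked
    if ([self isFinishing])
    {
        return;
    }
    // ...continue filling the newly freed buffer...
}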
The final point is harder: the synchronous stop must be performed inside the @synchronized section to prevent multiple AudioQueueStop actions occurring at once. To address this, the release of buffers by the AudioQueue (in handleBufferCompleteForQueue:buffer:) must perform its work without entering the @synchronized section (although it's allowed to use the queueBuffersMutex as normal since that isn't used by anything else during a synchronous stop).
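The buffer-complete handler is then the other half of that handshake. Something like the following sketch, with bufIndex and buffersUsed as assumed names; only the buffer mutex is taken, never @synchronized(self):

// in handleBufferCompleteForQueue:buffer: (sketch): signal the waiting
// thread using only queueBuffersMutex so a synchronous AudioQueueStop
// running inside @synchronized(self) can't deadlock against us
pthread_mutex_lock(&queueBuffersMutex);
inuse[bufIndex] = false;
buffersUsed--;
pthread_cond_signal(&queueBufferReadyCondition);
pthread_mutex_unlock(&queueBuffersMutex);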
Of course, every time the @synchronized section is re-entered, a check must be performed to see if "control communication" has occurred (the class checks this by invoking the isFinishing method and exiting if it returns YES).
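Put together, the body of the secondary thread ends up shaped roughly like this: yield to the run loop while unlocked, then re-take the lock and perform the check. This is a structural sketch, not the literal code:

BOOL isRunning = YES;
BOOL done = NO;
do
{
    // run the loop *outside* the @synchronized section so the CFReadStream
    // and AudioQueue callbacks are free to do their work
    isRunning = [[NSRunLoop currentRunLoop]
        runMode:NSDefaultRunLoopMode
        beforeDate:[NSDate dateWithTimeIntervalSinceNow:0.25]];

    @synchronized(self)
    {
        // re-entry check: did another thread ask us to finish while we
        // were unlocked?
        done = [self isFinishing];
    }
} while (isRunning && !done);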
Adding other functionality
Get metadata
The easiest source of metadata comes from the HTTP headers. Inside the handleReadFromStream:eventType: method, use CFReadStreamCopyProperty to copy the kCFStreamPropertyHTTPResponseHeader property from the CFReadStreamRef, then you can use CFHTTPMessageCopyAllHeaderFields to copy the header fields out of the response. For many streaming audio servers, the stream name is one of these fields.
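Sketched out, that looks something like the following. The stream variable stands in for the CFReadStreamRef passed to the method; "icy-name" is the field that Shoutcast/Icecast-style servers typically use for the station name, although the exact field (and its capitalisation) varies by server:

// inside handleReadFromStream:eventType: (sketch)
CFHTTPMessageRef response = (CFHTTPMessageRef)CFReadStreamCopyProperty(
    stream, kCFStreamPropertyHTTPResponseHeader);
if (response)
{
    NSDictionary *headers =
        [(NSDictionary *)CFHTTPMessageCopyAllHeaderFields(response) autorelease];
    // many Shoutcast/Icecast-style servers put the station name here
    NSString *streamName = [headers objectForKey:@"icy-name"];
    NSLog(@"Stream name: %@", streamName);
    CFRelease(response);
}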
The considerably harder source of metadata is the ID3 tags. ID3v1 is always at the end of the file (so it is useless when streaming). ID3v2 is located at the start, so it may be more accessible.
I've never read the ID3 tags but I suspect that if you cache the first few hundred kilobytes of the file somewhere as it loads, open that cache with AudioFileOpenWithCallbacks and then read the kAudioFilePropertyID3Tag with AudioFileGetProperty, you may be able to read the ID3 data (if it exists). Like I said though: I've never actually done this so I don't know for certain that it would work.
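If you want to experiment with that idea, the AudioFile property calls themselves look like this. This is a purely speculative sketch that assumes you already have an AudioFileID (here audioFile) opened over the cached start of the download:

// speculative sketch: ask how big the ID3 tag data is, then copy it out
UInt32 id3DataSize = 0;
OSStatus err = AudioFileGetPropertyInfo(
    audioFile, kAudioFilePropertyID3Tag, &id3DataSize, NULL);
if (err == noErr && id3DataSize > 0)
{
    void *id3Data = malloc(id3DataSize);
    err = AudioFileGetProperty(
        audioFile, kAudioFilePropertyID3Tag, &id3DataSize, id3Data);
    if (err == noErr)
    {
        // id3Data now holds the raw ID3v2 tag; it still needs to be parsed
    }
    free(id3Data);
}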
Stream fixed-length files
The biggest variation you may want to make to the class is to download fixed-length files, rather than streaming audio.
To handle this, the best approach is to remove the download from the class entirely. Download elsewhere and when "enough" (an amount you should determine on your own) of the file is downloaded, start a variation of the class that plays by streaming from a file on disk.
To adapt the class for streaming from a file on disk, remove the CFHTTPMessageRef and CFReadStreamRef code from openFileStream and replace it with NSFileHandle code that uses waitForDataInBackgroundAndNotify to asynchronously stream the file in the same way that CFReadStreamRef streamed the network data.
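A rough sketch of that NSFileHandle replacement is below. The filePath variable and the handleFileData: handler name are hypothetical; the rest is the standard NSFileHandle asynchronous pattern:

// open the local file and ask to be notified when data can be read
// (waitForDataInBackgroundAndNotify needs a running run loop on this thread)
fileHandle = [[NSFileHandle fileHandleForReadingAtPath:filePath] retain];
[[NSNotificationCenter defaultCenter]
    addObserver:self
    selector:@selector(handleFileData:)
    name:NSFileHandleDataAvailableNotification
    object:fileHandle];
[fileHandle waitForDataInBackgroundAndNotify];

- (void)handleFileData:(NSNotification *)aNotification
{
    NSData *data = [fileHandle availableData];
    if ([data length] == 0)
    {
        // end of file: handle it the same way as the end of the network stream
        return;
    }
    // feed the bytes to AudioFileStreamParseBytes as the network path does,
    // then ask for the next notification
    [fileHandle waitForDataInBackgroundAndNotify];
}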
Once you're streaming from a file, you'll probably want to permit seeking within the file. I've already put hooks within the class to seek (set the seekNeeded flag to true and set the seekTime to the time in seconds to which you want to seek) — however, the mechanics of seeking within the file would be dependent on how you access the file.
Incidentally, the AudioFileStreamSeek function seems completely broken. If you can't get it to work (as I couldn't) just seek to a new point in the file, set discontinuous to true and let AudioFileStream deal with it.
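In other words, something along these lines when a seek is requested. This is a sketch: newByteOffset stands in for whatever byte-offset calculation suits your format, and audioFileStream is assumed to be the AudioFileStreamID:

// jump the file handle to the new position and flag the discontinuity
[fileHandle seekToFileOffset:newByteOffset];
discontinuous = YES;

// ...then, when the next chunk of data arrives, parse it with the
// discontinuity flag so AudioFileStream resynchronises itself
OSStatus err = AudioFileStreamParseBytes(
    audioFileStream,
    (UInt32)[data length],
    [data bytes],
    discontinuous ? kAudioFileStreamParseFlag_Discontinuity : 0);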
Handling data interruptions
At the moment, if the AudioQueue has no more buffers to play, the state will transition to AS_BUFFERING. At this point, no specific action is taken to resolve this situation — it assumes that the network will eventually resume and requeue enough buffers.
I actually expect there will be cases where this action is insufficient — you may need to ensure that the AudioQueue is paused until enough buffers are filled before resuming, or even restart the download entirely. I haven't experimented much since, with streaming audio, it is easiest just to stop and start again.
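If you do want to try the pause-until-buffered approach, the AudioQueue calls involved are straightforward. A sketch of the idea follows; buffersUsed, kNumAQBufs and audioQueue are assumed names for the count of queued buffers, the total number of buffers and the AudioQueueRef:

// when the last buffer drains (sketch): pause rather than let the queue starve
if (buffersUsed == 0 && state == AS_PLAYING)
{
    AudioQueuePause(audioQueue);
    state = AS_BUFFERING;
}

// later, once the download has refilled most of the buffers, resume playback
if (state == AS_BUFFERING && buffersUsed >= kNumAQBufs - 1)
{
    state = AS_PLAYING;
    AudioQueueStart(audioQueue, NULL);
}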
Incidentally, if you're curious to know how many audio buffers are in use at any given time, uncomment the NSLog line in the handleBufferCompleteForQueue:buffer: method. This will log how many 1 kilobyte audio buffers are queued waiting for playback (when the queue reaches zero, the AudioStreamer enters the AS_BUFFERING state).
Conclusion
You can download the complete AudioStreamer project as a zip file (around 110kB) which contains Xcode projects for both iPhone and Mac OS. You can also browse the source code repository.
The functionality of this new version has not changed greatly — my purpose was to present a version that is more stable and tolerant of unexpected situations, rather than to add new features.
As before, the AudioStreamer class should work on Mac OS X 10.5 and on the iPhone (SDK 2.0 and greater).
The source repository is hosted on github so you can browse, fork or track updates as you choose. I will likely update again in future (I can't imagine I've written this much code without causing more problems) and this way, you can see the changes I've made.
I hope this post has shown you a number of problems that can happen when code is written hastily. This doesn't mean you should always avoid hastily written code (timeliness and proofs of concept are important) but it does mean you should be practised at refactoring code, not simply slapping poor fixes onto code that didn't cleanly solve the problem in the first place.