What Is the AudioTrack in Android?

The AudioTrack class in Android is used for streaming and playing audio data. It provides an interface between applications and the audio framework that handles playback of PCM audio buffers to the audio hardware. AudioTrack lets apps output audio streams such as music or voice by writing raw audio data to a buffer managed by the AudioTrack instance.

Essentially, AudioTrack handles the streaming, timing, and playback of audio data, and it plays a central role in controlling audio output in Android apps. It allows precise scheduling of audio data delivery to the audio sink and provides low-latency, high-performance playback.

Creating an AudioTrack

An AudioTrack object must be created to play a stream of audio content in Android. There are two ways to construct one: a legacy constructor and the AudioTrack.Builder class.

The older constructor takes the following parameters:

  • Stream type (e.g. STREAM_MUSIC)
  • Sample rate in Hz
  • Channel configuration (e.g. CHANNEL_OUT_STEREO)
  • Audio format (e.g. ENCODING_PCM_16BIT)
  • Buffer size in bytes
  • Mode (e.g. MODE_STREAM)

This constructor was deprecated in API level 26. The recommended approach is to use the AudioTrack.Builder class (available since API level 23) instead:

The AudioTrack.Builder configures the following parameters (the sample rate, channel mask, and encoding are grouped into an AudioFormat passed to setAudioFormat()):

  • Sample rate
  • Channel mask
  • Audio format
  • Buffer size
  • Transfer mode

Some optional parameters that can be set via the builder include session ID, performance mode, audio attributes, etc. See the Android documentation for more details on the AudioTrack.Builder.
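As a rough sketch, a Builder-based setup might look like the following. The 44.1kHz stereo 16-bit PCM configuration and the USAGE_MEDIA attributes are illustrative choices for a music-style app, not requirements:

import android.media.AudioAttributes;
import android.media.AudioFormat;
import android.media.AudioTrack;

// Illustrative configuration: CD-quality stereo PCM.
int sampleRate = 44100;
int channelMask = AudioFormat.CHANNEL_OUT_STEREO;
int encoding = AudioFormat.ENCODING_PCM_16BIT;

// Ask the framework for the smallest workable buffer for this format.
int bufferSize = AudioTrack.getMinBufferSize(sampleRate, channelMask, encoding);

AudioTrack track = new AudioTrack.Builder()
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                .build())
        .setAudioFormat(new AudioFormat.Builder()
                .setSampleRate(sampleRate)
                .setChannelMask(channelMask)
                .setEncoding(encoding)
                .build())
        .setBufferSizeInBytes(bufferSize)
        .setTransferMode(AudioTrack.MODE_STREAM)
        .build();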

AudioTrack Methods

The AudioTrack class contains many important methods for controlling audio playback:

The play() method starts playing an audio track. For example:


AudioTrack track = new AudioTrack(...);
track.play();

This begins audio playback. The pause() method can be used to pause playback:


track.pause(); 

Calling play() again resumes playback from where it was paused. The flush() method discards queued audio data that hasn’t been played yet (it applies only to tracks in streaming mode, while they are stopped or paused):


track.flush();

This clears the queued data rather than resetting the whole track. The release() method frees all resources associated with an AudioTrack object:


track.release();

After calling release(), the AudioTrack can no longer be used. See the Android developer documentation for details on all available AudioTrack methods.

Audio Configuration

The audio configuration in Android refers to settings like sample rate, channel count, and encoding, which determine audio quality and performance. Some key points on audio configuration:

Higher sample rates like 44.1kHz or 48kHz can capture wider frequency ranges and result in better audio quality compared to lower rates like 8kHz or 16kHz. However, higher sample rates require more data and processing power which can negatively impact performance on resource-constrained devices. Most modern Android devices can support 44.1kHz or 48kHz sample rates.

The channel count determines whether audio is mono, stereo, or surround sound. Stereo audio uses two channels and can convey directional sound, while mono uses just one. Encoding is often discussed in terms of file formats like MP3, AAC, or FLAC, but note that AudioTrack itself plays uncompressed PCM (e.g. ENCODING_PCM_16BIT); compressed formats are normally decoded first, for example with MediaCodec. Lossless formats like FLAC preserve full quality, while lossy formats like MP3 trade quality for smaller file sizes.

The optimal configuration depends on the use case – music streaming apps may prefer a 44.1kHz sample rate, 2 channels, and a lossless source format like FLAC for quality, while a voice assistant may only need 16kHz, mono, and a compact codec like Opus. Developers need to choose settings suitable for their app based on factors like audio quality needs, network bandwidth, and expected device capabilities; APIs such as AudioTrack.getNativeOutputSampleRate() can help query what the device supports.
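To make this concrete, here is a sketch of how those choices map onto AudioFormat objects; the values mirror the examples above and are not prescriptive:

// Music-style configuration: 44.1 kHz, stereo, 16-bit PCM.
AudioFormat musicFormat = new AudioFormat.Builder()
        .setSampleRate(44100)
        .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
        .build();

// Voice-style configuration: 16 kHz, mono, 16-bit PCM.
AudioFormat voiceFormat = new AudioFormat.Builder()
        .setSampleRate(16000)
        .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
        .build();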

Audio Buffers

AudioTrack stores audio data in buffers before sending it to the audio hardware for playback. There are two main approaches for buffering audio data, corresponding to the track’s transfer mode:

Streaming buffers (MODE_STREAM): with this method, the audio data is continuously fed to AudioTrack in small chunks as it plays. The buffer is generally small (a few KB) and refilled often by the app. Playback can start quickly because the full clip never has to be queued up front, but the app must keep pushing new data to AudioTrack for the duration of playback [1].

Static buffers (MODE_STATIC): here, the audio data for the full clip is written into the buffer before playback starts, so the buffer must be large enough to hold the entire clip. Playback can’t begin until that data has been written, but the app doesn’t need to push more data during playback, and the same buffer can be replayed cheaply. This method suits short sounds like game sound effects [2].

Choosing between streaming and static buffers depends on the audio length, whether low latency is required, and the app’s ability to continuously supply data. For long tracks like songs, streaming is commonly used so playback starts quickly without a huge buffer, with data provided continuously.
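As a sketch of how the two modes differ in practice, the snippet below sizes a streaming buffer with getMinBufferSize() and preloads a static track. Here loadClip() is a hypothetical helper returning decoded PCM bytes, and staticTrack is assumed to be an AudioTrack built with MODE_STATIC and a buffer of clip.length bytes:

// MODE_STREAM: use at least the minimum hardware buffer, refill in chunks.
int streamBufferSize = AudioTrack.getMinBufferSize(44100,
        AudioFormat.CHANNEL_OUT_STEREO, AudioFormat.ENCODING_PCM_16BIT);

// MODE_STATIC: write the entire clip once, then play it.
byte[] clip = loadClip(); // hypothetical helper returning decoded PCM
staticTrack.write(clip, 0, clip.length);
staticTrack.play(); // the buffer can be rewound later with reloadStaticData()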

AudioTrack Threads

Playing audio using the AudioTrack class in Android can be done on the main UI thread or in a separate thread. There are tradeoffs with each approach:

Playing audio on the UI thread is simpler to implement, but it can cause performance problems or UI glitches: a blocking write() call stalls the UI from updating smoothly. This is especially problematic for long audio files or continuous playback [1].

Playing audio in a separate thread avoids blocking the UI thread, but it introduces complexity around thread safety and synchronization. Methods like play(), stop(), and write() should be called from a single thread, or access to the track must be synchronized properly [2].

In general, playing short audio clips on the UI thread is okay, but longer playback or streaming should use a separate thread. Care must be taken to properly initialize AudioTrack, handle threading, and synchronize state between threads.
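A minimal sketch of the worker-thread approach, assuming the track built earlier and a pcmStream (a java.io.InputStream supplying decoded PCM) as illustrative stand-ins:

Thread playbackThread = new Thread(() -> {
    byte[] chunk = new byte[8192]; // chunk size is an arbitrary choice
    track.play();
    try {
        int n;
        while ((n = pcmStream.read(chunk)) > 0) {
            track.write(chunk, 0, n); // blocks until buffer space frees up
        }
    } catch (java.io.IOException e) {
        // handle or log the read failure
    } finally {
        track.stop();
        track.release();
    }
});
playbackThread.start();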

Volume and Playback Control

Android provides programmatic control over volume levels and playback for AudioTrack. The preferred method is setVolume() (API level 21+), which takes a single gain value from 0.0 to 1.0 applied to all channels; the older setStereoVolume(), which took separate left and right values, is deprecated. For example:

audioTrack.setVolume(1.0f); // Full volume

Playback can be controlled using methods like play(), pause(), stop(), and flush(). To start playback, call play(). To temporarily pause, use pause(). To resume after pausing, call play() again. To stop completely, use stop(). And to discard pending buffers, use flush().

Here is an example playback control flow:


audioTrack.play(); // Start playback

// ...

audioTrack.pause(); // Pause playback

// ...

audioTrack.play(); // Resume playback

// ...

audioTrack.stop(); // Stop playback

Together, these methods give AudioTrack full control over volume levels and playback state, enabling a robust audio experience.

Audio Focus

Audio focus in Android moderates audio playback between apps. It keeps apps from playing over one another by letting the system signal an app to pause or lower its audio when another app requests focus [1]. The system manages audio focus, and apps need to properly request and handle focus changes to cooperate with other media apps on the device.

To play audio, your app should request audio focus before starting playback. On API level 26 and above this is done by passing an AudioFocusRequest to AudioManager.requestAudioFocus(); the older requestAudioFocus() overload is deprecated. When playback is done, the app should abandon focus (abandonAudioFocusRequest() on API 26+) so other apps can regain it.

When another app requests audio focus, your app receives a callback through its AudioManager.OnAudioFocusChangeListener. The callback indicates whether your app should pause or temporarily attenuate (“duck”) its playback volume until focus returns [2]. This allows multiple apps to cooperatively share audio output.
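Here is a sketch of the API 26+ focus flow described above, assuming a Context and the track from earlier examples; the ducking level and the pause/stop choices are illustrative policy, not requirements:

import android.content.Context;
import android.media.AudioAttributes;
import android.media.AudioFocusRequest;
import android.media.AudioManager;

AudioManager am = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);

AudioFocusRequest focusRequest = new AudioFocusRequest.Builder(AudioManager.AUDIOFOCUS_GAIN)
        .setAudioAttributes(new AudioAttributes.Builder()
                .setUsage(AudioAttributes.USAGE_MEDIA)
                .build())
        .setOnAudioFocusChangeListener(change -> {
            switch (change) {
                case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT_CAN_DUCK:
                    track.setVolume(0.2f);  // duck to a low level
                    break;
                case AudioManager.AUDIOFOCUS_LOSS_TRANSIENT:
                    track.pause();          // pause until focus returns
                    break;
                case AudioManager.AUDIOFOCUS_LOSS:
                    track.stop();           // focus lost for good
                    break;
                case AudioManager.AUDIOFOCUS_GAIN:
                    track.setVolume(1.0f);  // restore volume and resume
                    track.play();
                    break;
            }
        })
        .build();

if (am.requestAudioFocus(focusRequest) == AudioManager.AUDIOFOCUS_REQUEST_GRANTED) {
    track.play();
}

// When playback finishes:
am.abandonAudioFocusRequest(focusRequest);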

Overall, properly managing audio focus improves the user experience by preventing multiple media sources from playing over each other. Apps should request focus when needed and gracefully handle external focus changes.

Common Use Cases

Some common use cases for AudioTrack in Android include:

Music and Video Playback

AudioTrack is often used for streaming and playing back music or video soundtracks in real time. For example, music players and video streaming services need to play audio without stutter or lag. AudioTrack allows feeding raw audio data to the audio subsystem continuously while it plays, which makes it better suited than alternatives like SoundPool (designed for short clips) when long, uninterrupted playback is required. [1]

Sound Effects

AudioTrack can play sound effects generated programmatically to match the action in games or other interactive apps. Compared to pre-recorded sound clips, this allows precise timing and playback control: effects can be synthesized and played at exactly the right moment, and AudioTrack also supports adjusting playback speed and pitch (via setPlaybackParams()) dynamically. [2]
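As an illustrative sketch, the snippet below synthesizes a quarter-second 440 Hz sine beep and plays it through a MODE_STATIC track; the frequency, duration, and sample rate are arbitrary choices:

// Synthesize a short 440 Hz beep as 16-bit mono PCM.
int rate = 44100;
int samples = rate / 4; // quarter-second beep
short[] pcm = new short[samples];
for (int i = 0; i < samples; i++) {
    pcm[i] = (short) (Math.sin(2 * Math.PI * 440 * i / rate) * Short.MAX_VALUE);
}

AudioTrack beep = new AudioTrack.Builder()
        .setAudioFormat(new AudioFormat.Builder()
                .setSampleRate(rate)
                .setChannelMask(AudioFormat.CHANNEL_OUT_MONO)
                .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                .build())
        .setBufferSizeInBytes(samples * 2) // 2 bytes per 16-bit sample
        .setTransferMode(AudioTrack.MODE_STATIC)
        .build();

beep.write(pcm, 0, pcm.length);
beep.play();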

Voice Chat

For Voice over IP (VoIP) apps like messaging or voice chat, AudioTrack provides low-latency playback of voice data: incoming audio can be decoded and immediately played over the speaker. Paired with AudioRecord, which captures microphone input for encoding and transmission, this enables real-time two-way conversations.
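A sketch of the receive side only, where voiceTrack is assumed to be a mono stream-mode AudioTrack, inCall a flag toggled by call state, and receiveFromNetwork() a hypothetical helper that fills a byte array with decoded PCM:

voiceTrack.play();
byte[] packet = new byte[1600]; // e.g. 100 ms of 8 kHz 16-bit mono audio
while (inCall) {
    int n = receiveFromNetwork(packet); // hypothetical network/decoder helper
    if (n > 0) {
        voiceTrack.write(packet, 0, n); // play the decoded voice data immediately
    }
}
voiceTrack.stop();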

By providing precise timing and minimized lag, AudioTrack enables audio experiences like music, video, sound effects, and voice chat that need to work smoothly in real-time.

Conclusion

The AudioTrack class in Android provides developers with robust audio playback capabilities. By creating AudioTrack instances and calling methods like play(), pause(), and stop(), apps can seamlessly stream audio files, generate sounds, play music, handle audio focus changes, and more.

Properly initializing the audio session, audio attributes, buffer size, and sample rate is key to ensuring glitch-free, high-quality audio playback. Developers should also be mindful of audio focus and volume controls so that audio playback does not interfere with other apps. Additionally, using worker threads and callbacks allows audio to progress smoothly without interrupting the main UI thread.

Best practices include minimizing lag and latency, handling headphone and Bluetooth route changes gracefully, releasing resources properly, and providing fallback options for unsupported formats. With the versatile AudioTrack class and careful programming, Android apps can deliver robust audio capabilities and great user experiences.
