What Is the AudioRecord in Android?

AudioRecord is a class in the android.media package that allows for recording audio on Android devices. It provides an interface for reading audio data from an audio source or stream.

The AudioRecord class manages the audio resources to capture audio input from the microphone or other audio input hardware on an Android device. This allows developers to build audio recording functionality into their Android apps.

Some key things to know about AudioRecord:

  • Part of the Android SDK since API level 1
  • Provides access to audio data from the audio hardware as an uncompressed PCM stream (16-bit PCM is the most widely supported encoding)
  • Can specify the audio source (microphone, voice call uplink/downlink, etc.)
  • Configurable audio settings (sample rate, channel configuration, audio format)
  • Audio is read by polling with read(); position-notification callbacks are also available

In summary, AudioRecord is the foundation class for recording audio and accessing raw audio data streams on Android. It abstracts away the complexity of interfacing with audio driver APIs.

Creating an AudioRecord Object

To start recording audio, we need to create an AudioRecord object. The AudioRecord constructor requires several parameters:

  • Audio source – For example, MediaRecorder.AudioSource.MIC to record audio from the device’s microphone.
  • Sample rate – The number of samples per second. Common values are 8000, 16000, 44100 Hz.
  • Channel configuration – Recording in mono, stereo, etc.
  • Audio format – Encoding of the audio data, like AudioFormat.ENCODING_PCM_16BIT.
  • Buffer size – The size of the buffer where the audio data is stored.

After creating the AudioRecord, verify that construction succeeded by calling getState() and checking that it returns AudioRecord.STATE_INITIALIZED. All configuration (sample rate, channel configuration, and so on) happens in the constructor itself; contrary to what some tutorials suggest, there is no separate initialize() method in the Android API.
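To make the buffer-size parameter concrete, here is a minimal sketch of the arithmetic behind it in plain Java, with no Android dependency. The class and method names are my own for illustration; real code should treat AudioRecord.getMinBufferSize() as the lower bound rather than computing a size from scratch:

```java
public class BufferSizing {
    // Bytes per audio frame: one sample per channel, 2 bytes per 16-bit sample.
    static int bytesPerFrame(int channels, int bitsPerSample) {
        return channels * (bitsPerSample / 8);
    }

    // Buffer size in bytes needed to hold `millis` milliseconds of audio.
    static int bufferBytes(int sampleRate, int channels, int bitsPerSample, int millis) {
        int frames = sampleRate * millis / 1000; // frames captured in that window
        return frames * bytesPerFrame(channels, bitsPerSample);
    }

    public static void main(String[] args) {
        // 20 ms of 16-bit mono audio at 44100 Hz:
        System.out.println(bufferBytes(44100, 1, 16, 20)); // 882 frames * 2 bytes = 1764
    }
}
```

In practice you would pass something like the larger of this value and getMinBufferSize() to the AudioRecord constructor.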

Starting and Stopping Recording

Once you have initialized the AudioRecord, you need to start and stop the actual recording. This is done using the startRecording() and stop() methods:

startRecording() begins capturing audio from the configured source, using the parameters supplied to the constructor. Call it once the AudioRecord has been successfully constructed and you are ready to record.

According to the Android developer documentation, startRecording() can throw an IllegalStateException if the recorder is not properly initialized or configured. It’s important to set up the AudioRecord properly before calling this method.

To stop the recording, call the stop() method; audio capture ends and any buffered data can still be read out. When you are finished with the recorder entirely, also call release() to free the underlying native audio resources. A common pattern is to call stop() and release() in a finally block so the hardware is freed even if recording fails partway through.

Proper use of startRecording() and stop() is critical to control the recording lifecycle when using AudioRecord. These two methods start and stop actual audio capture.

Reading Audio Data

The read() method of the AudioRecord class reads audio data from the audio hardware into a buffer. It takes in three parameters:

  • A byte array buffer to read the audio data into
  • An offset within the buffer to start writing
  • Number of bytes to read

The format of the audio data depends on how the AudioRecord was configured; there is no default format. With AudioFormat.ENCODING_PCM_16BIT, each sample contains 16 bits per channel, so a mono frame is 2 bytes and a stereo frame is 4 bytes (16 bits per channel). Note that read() returns the number of bytes actually read, which may be less than the requested size, or a negative error code such as AudioRecord.ERROR_INVALID_OPERATION, so always check the return value.

So the read() method fills the buffer with raw audio samples that can then be processed or played back as needed. The format and sampling rate determine how the byte data maps back to an audio waveform. For more details see the AudioRecord reference.
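Since the byte-array overload of read() delivers 16-bit PCM as little-endian byte pairs, a common first step is reassembling those bytes into signed samples. A minimal sketch in plain Java (the bytesToSamples name is my own, not part of the Android API):

```java
public class PcmDecode {
    // Reassemble little-endian 16-bit PCM bytes into signed short samples.
    static short[] bytesToSamples(byte[] buf, int byteCount) {
        short[] out = new short[byteCount / 2];
        for (int i = 0; i < out.length; i++) {
            int lo = buf[2 * i] & 0xFF;  // low byte, treated as unsigned
            int hi = buf[2 * i + 1];     // high byte carries the sign
            out[i] = (short) ((hi << 8) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, 0x01, (byte) 0xFF, (byte) 0xFF}; // two samples
        short[] s = bytesToSamples(raw, raw.length);
        System.out.println(s[0] + " " + s[1]); // 256 -1
    }
}
```

Reading with the short-array overload of read() avoids this step entirely, since the conversion is done for you.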

Processing the Recorded Audio

Once the audio data is recorded, the next step is to process it in some way. The raw data read from the AudioRecord is already uncompressed PCM, stored in a byte (or short) array, so no decoding step is needed before analyzing it or playing it back.

From these PCM samples you can compute properties of the audio such as frequency content and volume. (Note that getMaxAmplitude() is a MediaRecorder method, not an AudioRecord one; with AudioRecord you compute amplitude from the samples yourself, and getMinBufferSize() is used for sizing the capture buffer, not for analysis.)

Some common ways to process the PCM audio data include:

  • Performing Fast Fourier Transforms (FFT) to analyze the frequency domain
  • Looking at the amplitude to determine volume levels
  • Applying filters like low-pass, high-pass, etc.
  • Compressing or changing the sample rate

Audio processing on Android does have some challenges, such as latency and performance limitations, to be aware of. Registering an OnRecordPositionUpdateListener (via setRecordPositionUpdateListener()) to be notified as data arrives, and raising the capture thread's priority with Process.setThreadPriority(Process.THREAD_PRIORITY_URGENT_AUDIO), can help.
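The volume-level analysis mentioned above comes down to simple arithmetic over the samples. A sketch in plain Java (the class and method names are my own, for illustration):

```java
public class LevelMeter {
    // Peak absolute amplitude of a block of 16-bit samples (0..32768).
    static int peak(short[] samples) {
        int max = 0;
        for (short s : samples) max = Math.max(max, Math.abs(s));
        return max;
    }

    // Root-mean-square level, a better proxy for perceived loudness than peak.
    static double rms(short[] samples) {
        long sum = 0;
        for (short s : samples) sum += (long) s * s;
        return Math.sqrt((double) sum / samples.length);
    }

    public static void main(String[] args) {
        short[] block = {100, -2000, 500};
        System.out.println(peak(block)); // 2000
    }
}
```

Running either function over each buffer returned by read() gives a simple real-time level meter.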

Playing Back Recorded Audio

Once audio data has been recorded, it can be played back using an AudioTrack object. To play back the recorded audio, you need to write the audio data to the AudioTrack in a streaming fashion.

Here is an example of playing back recorded PCM audio data (from Stack Overflow):


int buffsize = AudioRecord.getMinBufferSize(...);   // size in bytes
short[] audiobuffer = new short[buffsize / 2];      // 2 bytes per 16-bit sample

AudioTrack audioTrack = new AudioTrack(..., buffsize, AudioTrack.MODE_STREAM);
audioTrack.play();

while (recording) {
  // read() returns the number of shorts actually read, or a negative error code
  int readsize = recorder.read(audiobuffer, 0, audiobuffer.length);
  if (readsize > 0) {
    audioTrack.write(audiobuffer, 0, readsize);
  }
}

The key things to note are:

  • Create an AudioTrack object in streaming mode.
  • In a loop, read audio data from the AudioRecord into a buffer.
  • Write the audio data to the AudioTrack to play it back.
  • This streams the audio playback while recording is ongoing.

Streaming playback allows the recorded audio to be played in real-time as it’s being captured. This avoids having to store the entire recording in memory before playback.

Setting the Recording Source

The audio input source for recording with AudioRecord can be set using the AudioSource parameter when constructing the AudioRecord object. The most common value is MediaRecorder.AudioSource.MIC which allows recording audio input from the device’s microphone.

Some other audio input sources that can be used with AudioRecord include:

  • MediaRecorder.AudioSource.VOICE_UPLINK – the outgoing (transmitted) side of a voice call
  • MediaRecorder.AudioSource.VOICE_DOWNLINK – the incoming (received) side of a voice call
  • MediaRecorder.AudioSource.VOICE_CALL – both uplink and downlink of a voice call
  • MediaRecorder.AudioSource.CAMCORDER – a microphone tuned for video recording, oriented like the camera
  • MediaRecorder.AudioSource.REMOTE_SUBMIX – a submix of the audio being streamed to a remote device
  • MediaRecorder.AudioSource.UNPROCESSED – raw input without noise suppression or gain control (API 24+)

So in most cases, MediaRecorder.AudioSource.MIC will be used to simply record audio input from the device’s microphone. The other sources cover calls, camera use cases, and so on; note that the voice-call sources require privileged system permissions and are not available to ordinary third-party apps.

Configuring Audio Encoding

When initializing an AudioRecord, there are three main encoding parameters you can configure:

  • Sample Rate – How many samples per second are captured. Higher sample rates mean better audio quality, but more data. Common rates are 44100 Hz, 48000 Hz, and 96000 Hz; 44100 Hz is the only rate guaranteed to work on all devices, and the maximum supported rate varies by device.
  • Bit Depth – The number of bits captured per sample. More bits means better dynamic range. 16 bits per sample (ENCODING_PCM_16BIT) is the most common choice and the encoding guaranteed to be supported; higher-resolution formats such as float PCM are available on newer API levels.
  • Mono/Stereo – You can choose between mono (single channel) or stereo (two channel) recording. Stereo audio sounds more natural, but takes up twice the space.

There are always tradeoffs when selecting encoding parameters. Higher sample rates and bit depths will produce better quality audio, but require more processing power to encode and decode, as well as taking up more storage space. Mono vs stereo depends on the audio source and use case – for example, speech can usually be encoded as mono without losing quality.
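The storage side of these tradeoffs is easy to quantify: uncompressed PCM consumes sampleRate × channels × (bitDepth / 8) bytes per second. A quick sketch in plain Java (the class name is my own):

```java
public class DataRate {
    // Uncompressed PCM data rate in bytes per second.
    static int bytesPerSecond(int sampleRate, int channels, int bitDepth) {
        return sampleRate * channels * (bitDepth / 8);
    }

    public static void main(String[] args) {
        // CD-quality stereo (44100 Hz, 16-bit) vs. speech-grade mono (16000 Hz, 16-bit):
        System.out.println(bytesPerSecond(44100, 2, 16)); // 176400 bytes/s (~10.6 MB/min)
        System.out.println(bytesPerSecond(16000, 1, 16)); // 32000 bytes/s  (~1.9 MB/min)
    }
}
```

The roughly 5x difference between those two configurations is why speech apps usually record 16 kHz mono.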

When initializing the AudioRecord, use getMinBufferSize() to determine the minimum buffer size for your chosen encoding parameters on the current device, and pass a buffer at least that large (often a multiple of it) to the constructor. getMinBufferSize() returns AudioRecord.ERROR_BAD_VALUE if the device does not support that combination of parameters.

Use Cases

The AudioRecord API can be useful for a variety of audio recording applications on Android, including:

Streaming audio/voice chat: By reading audio data from the AudioRecord and transmitting it over the network in real time, apps can enable streaming audio or voice chat functionalities. The AudioRecord allows low latency audio capture critical for these use cases.

Music recording apps: With AudioRecord, apps can capture high quality audio input for music recording, while applying audio effects and encoding as needed. This is useful for building music memo apps, digital audio workstations, and other music creation tools.

Background audio: AudioRecord can keep capturing while the app’s UI is not in the foreground, which on modern Android versions generally requires running in a foreground service with microphone access declared. This is helpful for features like voice memos or ambient-sound capture. (Recording phone calls is heavily restricted and generally unavailable to third-party apps.)

Conclusion

AudioRecord provides a powerful framework for recording audio directly in Android applications. Some key points about AudioRecord:

AudioRecord lets you configure the audio source, sample rate, channel count, encoding and other parameters to customize audio recording. It captures the raw audio for analysis or playback in the app. This differs from MediaRecorder which outputs audio files.

Key capabilities include low latency audio input, ability to process data during recording, and flexible buffering. Limitations are that it only records uncompressed audio, lacks some MediaRecorder features like effects, and requires more work to play back recordings.

Overall, AudioRecord opens up real-time audio processing and voice recognition abilities. With proper configuration, it can record high quality audio for games, VOIP, speech recognition and many other use cases. Just be ready to handle the raw data and account for performance impacts.
