TRVAudioPlayer

A component for sound playback, recording, and speech recognition.

Unit [VCL and LCL] MRVAudioPlayer;

Unit [FMX] fmxMRVAudioPlayer;

Syntax

TRVAudioPlayer = class (TCustomRVAudioPlayer)

Hierarchy

Description

This component must be linked to a TRVMicrophone, TRVCamSound, or TRVCamReceiver component to play sound.

Without TRVAudioPlayer, TRVMicrophone cannot play or record sound.

Without TRVAudioPlayer, TRVCamSound cannot play or record sound, and cannot synchronize video playback speed to audio playback speed.

TRVCamReceiver can play sound even without TRVAudioPlayer, but only on the default audio output device, with default sound parameters, and without recording to a file.

To play sound with this component, assign it to the TRVMicrophone.AudioOutput, TRVCamSound.AudioOutput, or TRVCamReceiver.AudioOutput property.
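
As a minimal sketch, the link can also be made in code. The form class TForm1 and the component names RVMicrophone1 and RVAudioPlayer1 are assumptions (components placed on a form at design time); only the AudioOutput property comes from this topic.

```pascal
// Sketch only: TForm1, RVMicrophone1 and RVAudioPlayer1 are assumed names
// of a form and of components placed on it at design time.
procedure TForm1.FormCreate(Sender: TObject);
begin
  // Route the microphone's sound through the audio player
  RVMicrophone1.AudioOutput := RVAudioPlayer1;
end;
```

The same assignment applies to TRVCamSound.AudioOutput and TRVCamReceiver.AudioOutput. The property can usually be set in the Object Inspector instead, in which case no code is needed.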

Playing sound

See the topic on TCustomRVAudioPlayer for properties that control sound playback.

Recording

The component can record sound to a file. The FFmpeg library must be available to the application for this feature.

The component supports recording to mp3, ogg, wav, flac, and other formats; see the EncodeAudioCodec property.

Warning: Some audio formats may be patent-protected in some countries, and supporting these formats may require you to obtain licenses from the patent owners.

Recording starts when you assign True to the Recording property. Sound is recorded to the file specified in OutputFileName.

Recording stops when you assign False to the Recording property, or when you change any of the Encode* properties. When recording stops, OnStopRecording occurs.
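
The start/stop sequence described above might look like the following sketch. The form, the button handlers, the component name RVAudioPlayer1, and the file name are hypothetical; the audio codec is assumed to be already configured through EncodeAudioCodec.

```pascal
// Sketch only: TForm1, the button handlers, RVAudioPlayer1 and the file
// name are assumptions made for illustration.
procedure TForm1.btnStartClick(Sender: TObject);
begin
  // Hypothetical output file; the format is chosen via EncodeAudioCodec
  RVAudioPlayer1.OutputFileName := 'recording.wav';
  RVAudioPlayer1.Recording := True;   // recording starts
end;

procedure TForm1.btnStopClick(Sender: TObject);
begin
  RVAudioPlayer1.Recording := False;  // recording stops; OnStopRecording occurs
end;
```

Because changing any Encode* property also stops recording, it is safer to set those properties before assigning True to Recording.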

Alternatively, you can use the TRVCamRecorder component for sound recording.

Speech to text

Speech recognition requires FFmpeg 8 or newer with integrated Whisper support, and a model file specified in the SpeechToTextProperty.ModelFileName property.

Speech recognition starts when you assign True to Recording, provided that SpeechToTextProperty.Active = True. It stops when you assign False to Recording.

If OutputFileName is not empty, speech recognition is performed together with recording, using sound parameters specified in Encode* properties.

If OutputFileName is empty, speech recognition is performed without recording; Encode* properties are ignored, and sound is processed as 16-bit mono samples at a sample rate of 16000 Hz.
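
A recognition-only setup (no file is written) could be sketched as follows. The form, the handler, the component name RVAudioPlayer1, and the model file name are hypothetical. How the recognized text is delivered is not covered in this topic, so the sketch does not show it.

```pascal
// Sketch only: TForm1, btnListenClick, RVAudioPlayer1 and the model file
// name are assumptions made for illustration.
procedure TForm1.btnListenClick(Sender: TObject);
begin
  // Hypothetical path to a Whisper model file
  RVAudioPlayer1.SpeechToTextProperty.ModelFileName := 'ggml-base.en.bin';
  RVAudioPlayer1.SpeechToTextProperty.Active := True;
  // Empty file name: recognition without recording, Encode* is ignored
  RVAudioPlayer1.OutputFileName := '';
  RVAudioPlayer1.Recording := True;   // speech recognition starts
end;
```

To record and recognize at the same time, assign a non-empty OutputFileName instead; the Encode* properties then define the sound parameters.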

Platform notes

Linux: RVMedia uses ALSA (Advanced Linux Sound Architecture) to play sound. If ALSA is not available, it falls back to OSS (Open Sound System). However, OSS support has less functionality, so ALSA is highly recommended.