C¶
-
int
DS_CreateModel
(const char *aModelPath, unsigned int aBeamWidth, ModelState **retval)¶ Create an object providing an interface to a trained DeepSpeech model.
- Return
Zero on success, non-zero on failure.
- Parameters
aModelPath
: The path to the frozen model graph.
aBeamWidth
: The beam width used by the decoder. A larger beam width generates better results at the cost of decoding time.
[out] retval
: A ModelState pointer.
-
void
DS_FreeModel
(ModelState *ctx)¶ Frees associated resources and destroys model object.
-
int
DS_EnableDecoderWithLM
(ModelState *aCtx, const char *aLMPath, const char *aTriePath, float aLMAlpha, float aLMBeta)¶ Enable decoding using beam scoring with a KenLM language model.
- Return
Zero on success, non-zero on failure (invalid arguments).
- Parameters
aCtx
: The ModelState pointer for the model being changed.
aLMPath
: The path to the language model binary file.
aTriePath
: The path to the trie file built from the same vocabulary as the language model binary.
aLMAlpha
: The alpha hyperparameter of the CTC decoder. Language model weight.
aLMBeta
: The beta hyperparameter of the CTC decoder. Word insertion weight.
-
int
DS_GetModelSampleRate
(ModelState *aCtx)¶ Return the sample rate expected by a model.
- Return
Sample rate expected by the model for its input.
- Parameters
aCtx
: A ModelState pointer created with DS_CreateModel.
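Because every audio buffer passed to the API is expected to be at the model's sample rate, that rate also fixes how much time a given number of samples covers. The following is a minimal sketch of that arithmetic. The helper names `buffer_duration_ms` and `rate_matches` are hypothetical and not part of the DeepSpeech API; in real use the `model_rate` argument would come from DS_GetModelSampleRate().

```c
#include <assert.h>

/* Hypothetical helper: duration in milliseconds of `sample_count` mono
 * samples at `sample_rate` Hz. Returns 0 for a non-positive rate. */
unsigned long long buffer_duration_ms(unsigned int sample_count, int sample_rate) {
    if (sample_rate <= 0) {
        return 0;
    }
    /* Widen before multiplying so large buffers do not overflow. */
    return ((unsigned long long)sample_count * 1000ULL)
           / (unsigned long long)sample_rate;
}

/* Hypothetical helper: returns 1 when the audio's rate equals the rate the
 * model expects (e.g. the value returned by DS_GetModelSampleRate()). */
int rate_matches(int audio_rate, int model_rate) {
    return audio_rate == model_rate;
}
```

A caller might reject or resample audio when `rate_matches` returns 0 rather than passing it to the model unchanged.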
-
char *
DS_SpeechToText
(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize)¶ Use the DeepSpeech model to perform Speech-To-Text.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString(). Returns NULL on error.
- Parameters
aCtx
: The ModelState pointer for the model to use.
aBuffer
: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).
aBufferSize
: The number of samples in the audio signal.
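Since aBufferSize counts samples rather than bytes, audio read from a file as raw bytes needs its length converted first. The sketch below shows one way to do that conversion defensively. The function `pcm16_sample_count` is a hypothetical helper, not part of the DeepSpeech API, and it assumes `short` is the 16-bit sample type the API expects.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: converts the byte length of a raw 16-bit mono PCM
 * buffer into a sample count suitable for an `unsigned int aBufferSize`.
 * Returns 0 on success, -1 if the length is not a whole number of samples
 * or does not fit in an unsigned int. */
int pcm16_sample_count(size_t byte_len, unsigned int *out_samples) {
    size_t n;

    if (out_samples == NULL) {
        return -1;
    }
    if (byte_len % sizeof(short) != 0) {
        return -1; /* trailing partial sample */
    }
    n = byte_len / sizeof(short);
    if (n > UINT_MAX) {
        return -1; /* too large for the unsigned int parameter */
    }
    *out_samples = (unsigned int)n;
    return 0;
}
```

The resulting count, not the byte length, is the value to pass as aBufferSize.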
-
Metadata *
DS_SpeechToTextWithMetadata
(ModelState *aCtx, const short *aBuffer, unsigned int aBufferSize)¶ Use the DeepSpeech model to perform Speech-To-Text and output metadata about the results.
- Return
Outputs a struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Parameters
aCtx
: The ModelState pointer for the model to use.
aBuffer
: A 16-bit, mono raw audio signal at the appropriate sample rate (matching what the model was trained on).
aBufferSize
: The number of samples in the audio signal.
-
int
DS_CreateStream
(ModelState *aCtx, StreamingState **retval)¶ Create a new streaming inference state. The streaming state returned by this function can then be passed to DS_FeedAudioContent() and DS_FinishStream().
- Return
Zero on success, non-zero on failure.
- Parameters
aCtx
: The ModelState pointer for the model to use.
[out] retval
: An opaque pointer that represents the streaming state. Can be NULL if an error occurs.
-
void
DS_FeedAudioContent
(StreamingState *aSctx, const short *aBuffer, unsigned int aBufferSize)¶ Feed audio samples to an ongoing streaming inference.
- Parameters
aSctx
: A streaming state pointer returned by DS_CreateStream().
aBuffer
: An array of 16-bit, mono raw audio samples at the appropriate sample rate (matching what the model was trained on).
aBufferSize
: The number of samples in aBuffer.
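Streaming callers typically hand audio over in slices as it arrives. The sketch below illustrates that slicing under stated assumptions: `feed_fn` is a hypothetical callback type that mirrors the shape of DS_FeedAudioContent() but takes `void *` instead of `StreamingState *`, so the example compiles without the DeepSpeech header, and `counting_feed` is a stand-in sink used only for demonstration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type with the same shape as DS_FeedAudioContent,
 * using void* in place of StreamingState*. */
typedef void (*feed_fn)(void *stream, const short *buf, unsigned int n);

/* Feeds `total` samples to `feed` in slices of at most `chunk` samples.
 * Returns the number of calls made (0 on invalid arguments). */
unsigned int feed_in_chunks(void *stream, const short *samples,
                            unsigned int total, unsigned int chunk,
                            feed_fn feed) {
    unsigned int calls = 0;
    unsigned int off = 0;

    if (samples == NULL || feed == NULL || chunk == 0) {
        return 0;
    }
    while (off < total) {
        unsigned int n = total - off;
        if (n > chunk) {
            n = chunk; /* last slice may be shorter */
        }
        feed(stream, samples + off, n);
        off += n;
        calls++;
    }
    return calls;
}

/* Stand-in sink: treats `stream` as a counter and adds up the samples. */
void counting_feed(void *stream, const short *buf, unsigned int n) {
    (void)buf;
    *(unsigned int *)stream += n;
}
```

With the real library, a thin adapter that casts the `void *` back to `StreamingState *` and calls DS_FeedAudioContent() could be passed as `feed`.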
-
char *
DS_IntermediateDecode
(StreamingState *aSctx)¶ Compute the intermediate decoding of an ongoing streaming inference.
- Return
The STT intermediate result. The user is responsible for freeing the string using DS_FreeString().
- Parameters
aSctx
: A streaming state pointer returned by DS_CreateStream().
-
char *
DS_FinishStream
(StreamingState *aSctx)¶ Signal the end of an audio signal to an ongoing streaming inference, returns the STT result over the whole audio signal.
- Return
The STT result. The user is responsible for freeing the string using DS_FreeString().
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx
: A streaming state pointer returned by DS_CreateStream().
-
Metadata *
DS_FinishStreamWithMetadata
(StreamingState *aSctx)¶ Signal the end of an audio signal to an ongoing streaming inference, returns per-letter metadata.
- Return
Outputs a struct of individual letters along with their timing information. The user is responsible for freeing Metadata by calling DS_FreeMetadata(). Returns NULL on error.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx
: A streaming state pointer returned by DS_CreateStream().
-
void
DS_FreeStream
(StreamingState *aSctx)¶ Destroy a streaming state without decoding the computed logits. This can be used if you no longer need the result of an ongoing streaming inference and don’t want to perform a costly decode operation.
- Note
This method will free the state pointer (aSctx).
- Parameters
aSctx
: A streaming state pointer returned by DS_CreateStream().
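DS_FinishStream(), DS_FinishStreamWithMetadata() and DS_FreeStream() each free the state pointer, so exactly one of them should be called per stream. One possible way to enforce this on the caller's side is a small owning handle that clears its pointer after the first release. Everything in the sketch below (`StreamHandle`, `stream_finish`, `stream_discard`, and the `fake_*` stand-ins) is hypothetical illustration code, not part of the DeepSpeech API; the callbacks use `void *` so the example compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback shapes mirroring DS_FinishStream / DS_FreeStream. */
typedef char *(*finish_fn)(void *state);
typedef void (*discard_fn)(void *state);

/* Hypothetical owning handle: `state` is non-NULL only while the stream
 * has not yet been released. */
typedef struct {
    void *state;
} StreamHandle;

/* Finishes the stream once; later calls return NULL without touching it. */
char *stream_finish(StreamHandle *h, finish_fn finish) {
    char *text;

    if (h == NULL || h->state == NULL || finish == NULL) {
        return NULL;
    }
    text = finish(h->state);
    h->state = NULL; /* the finish call freed the state pointer */
    return text;
}

/* Discards the stream without decoding; a no-op if already released. */
void stream_discard(StreamHandle *h, discard_fn discard) {
    if (h == NULL || h->state == NULL || discard == NULL) {
        return;
    }
    discard(h->state);
    h->state = NULL;
}

/* Stand-ins for demonstration: count how often the state is released. */
char *fake_finish(void *state) {
    *(int *)state += 1;
    return NULL;
}

void fake_discard(void *state) {
    *(int *)state += 1;
}
```

In this sketch a discard after a finish becomes a harmless no-op instead of a second free of the same pointer.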
-
void
DS_FreeString
(char *str)¶ Free a char* string returned by the DeepSpeech API.
-
void
DS_PrintVersions
()¶ Print version of this library and of the linked TensorFlow library.