Mozilla DeepSpeech
latest

Introduction

  • Using a Pre-trained Model
    • CUDA dependency (inference)
    • Getting the pre-trained model
    • Important considerations on model inputs
    • Model compatibility
    • Using the Python package
      • Create a DeepSpeech virtual environment
      • Activating the environment
      • Installing DeepSpeech Python bindings
    • Using the Node.JS / Electron.JS package
    • Using the command-line client
    • Installing bindings from source
    • Dockerfile for building from source
    • Third party bindings
  • Training Your Own Model
    • Prerequisites for training a model
    • Getting the training code
    • Creating a virtual environment
    • Activating the environment
    • Installing DeepSpeech Training Code and its dependencies
    • Recommendations
    • Basic Dockerfile for training
    • Common Voice training data
    • Training a model
    • Training with automatic mixed precision
    • Checkpointing
    • Exporting a model for inference
    • Exporting a model for TFLite
    • Making a mmap-able model for inference
      • Continuing training from a release model
    • Fine-Tuning (same alphabet)
    • Transfer-Learning (new alphabet)
    • UTF-8 mode
    • Augmentation
      • Sample domain augmentations
      • Spectrogram domain augmentations
      • Multi domain augmentations
    • Training from an Anaconda or miniconda environment
  • Supported platforms for inference
    • Linux / AMD64 without GPU
    • Linux / AMD64 with GPU
    • Linux / ARMv7
    • Linux / Aarch64
    • Android / ARMv7
    • Android / Aarch64
    • macOS / AMD64
    • Windows / AMD64 without GPU
    • Windows / AMD64 with GPU
  • Building DeepSpeech Binaries
    • Dependencies
      • Checkout source code
      • Bazel: Download & Install
      • TensorFlow: Configure with Bazel
    • Compile DeepSpeech
      • Compile libdeepspeech.so
      • Compile generate_scorer_package
      • Compile Language Bindings
    • Installing your own Binaries
      • Install Python bindings
      • Install NodeJS / ElectronJS bindings
      • Install the CTC decoder package
    • Cross-building
      • RPi3 ARMv7 and LePotato ARM64
    • Android devices support
      • Using the library from Android project
      • Building libdeepspeech.so
      • Building libdeepspeech.aar
      • Building C++ deepspeech binary
      • Android demo APK
      • Running deepspeech via adb
      • Delegation API

Decoder and scorer

  • CTC beam search decoder
    • Introduction
    • External scorer
    • Decoding modes
    • Default mode (alphabet based)
    • Bytes output mode
    • Implementation
  • External scorer scripts
    • Reproducing our external scorer
    • Building your own scorer

Architecture and training

  • DeepSpeech Model
  • Geometric Constants
    • n_input
    • n_context
    • n_hidden_1, n_hidden_2, n_hidden_5
    • n_cell_dim
    • n_hidden_3
    • n_hidden_6
  • Parallel Optimization
    • Asynchronous Parallel Optimization
    • Synchronous Optimization
    • Hybrid Parallel Optimization
    • Adam Optimization

API Reference

  • Error codes
  • C API
    • Data structures
      • Metadata
      • CandidateTranscript
      • TokenMetadata
  • .NET Framework
    • DeepSpeech Class
    • DeepSpeechStream Class
    • ErrorCodes
    • Metadata
    • CandidateTranscript
    • TokenMetadata
    • DeepSpeech Interface
  • Java
    • DeepSpeechModel
    • Metadata
    • CandidateTranscript
    • TokenMetadata
  • JavaScript (NodeJS / ElectronJS)
    • Model
    • Stream
    • Module exported methods
    • Metadata
    • CandidateTranscript
    • TokenMetadata
  • Python
    • Model
    • Stream
    • Metadata
    • CandidateTranscript
    • TokenMetadata

Examples

  • C API Usage example
    • Creating a model instance and loading model
    • Performing inference
    • Full source code
  • .NET API Usage example
    • Creating a model instance and loading model
    • Performing inference
    • Full source code
  • Java API Usage example
    • Creating a model instance and loading model
    • Performing inference
    • Full source code
  • JavaScript API Usage example
    • Creating a model instance and loading model
    • Performing inference
    • Full source code
  • Python API Usage example
    • Creating a model instance and loading model
    • Performing inference
    • Full source code
  • User contributed examples

© Copyright 2016-2020 Mozilla Corporation, 2020 DeepSpeech authors Revision bfccca32.
