StellaScript Project Architecture
This document details the structure of the StellaScript project, the role of each file, and how the modules interact to perform audio transcription and diarization.
Overview
The project is structured around a main module, stellascript, which contains all the application logic. Execution starts from main.py at the project root, which acts as the entry point.
Root Files
main.py: Application entry point. It parses command-line arguments, initializes the orchestrator, and launches the transcription process (either live or from a file).
README.md: Main documentation. Provides an overview of the project, installation instructions, and usage guidelines.
pyproject.toml & uv.lock: Dependency management. pyproject.toml declares the Python libraries the project requires, and uv.lock pins their resolved versions.
.gitignore: Git configuration file specifying the files and folders to ignore.
LICENSE: Contains the MIT license under which the project is distributed.
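The entry-point flow described for main.py (parse arguments, build the orchestrator, pick live or file processing) can be sketched as follows. This is a minimal illustration, not the project's code: only the flag names --file, --language, and --mode come from this document, while StubOrchestrator, its methods, the default language, and the rule "no --file means live capture" are assumptions made for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser exposing the three flags named in this document."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("--file", default=None,
                        help="audio file to transcribe; omitted here means live capture (assumption)")
    parser.add_argument("--language", default="en",
                        help="language code; the default value is an assumption")
    parser.add_argument("--mode", default=None,
                        help="processing mode; accepted values are not listed in this document")
    return parser


class StubOrchestrator:
    """Stand-in for the real orchestrator class; method names are hypothetical."""

    def __init__(self, language: str) -> None:
        self.language = language

    def transcribe_file(self, path: str) -> str:
        # A real implementation would run the full pipeline on the file.
        return f"file:{path}:{self.language}"

    def start_live(self) -> str:
        # A real implementation would open the microphone stream.
        return f"live:{self.language}"


def run(argv: list[str]) -> str:
    """Parse arguments, build the orchestrator, and choose the live or file path."""
    args = build_parser().parse_args(argv)
    orchestrator = StubOrchestrator(language=args.language)
    if args.file is not None:
        return orchestrator.transcribe_file(args.file)
    return orchestrator.start_live()
```

In a real entry point, run(sys.argv[1:]) would be called under an if __name__ == "__main__": guard. Keeping the dispatch in a function that takes argv makes it testable without touching the process arguments.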
stellascript Module (Application Core)
The stellascript/ directory contains the main source code of the application, organized into several modules and sub-modules.
orchestrator.py: The conductor. This is the most important file in the project. The StellaScriptTranscription class manages the entire processing pipeline. It initializes the various components (transcriber, diarizer, etc.) and coordinates their interactions, whether for real-time or file-based processing.
config.py: Central configuration. Centralizes all technical constants and parameters used in the application (e.g., sampling rate, audio buffer duration, voice detection thresholds), so the application’s behavior can be changed from a single location.
cli.py: Command-line interface. Defines all the arguments that the user can pass to the program (such as --file, --language, --mode) and ensures they are correctly interpreted.
logging_config.py: Logging configuration. Sets up the logging system to display informational messages, warnings, and errors during execution, which is crucial for debugging.
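As an illustration of the roles of config.py and logging_config.py, the sketch below groups a few constants in one immutable object and configures logging in one function. Every name and value here (AudioConfig, sample_rate_hz, buffer_seconds, vad_threshold, setup_logging, the "stellascript" logger name) is an assumption chosen for the example; the project's actual constants may differ.

```python
import logging
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioConfig:
    """Illustrative constants; names and values are assumptions, not the project's."""

    sample_rate_hz: int = 16_000   # samples per second
    buffer_seconds: float = 2.0    # length of the audio buffer
    vad_threshold: float = 0.5     # voice detection threshold

    @property
    def buffer_samples(self) -> int:
        # Derived value, so changing one constant keeps the others consistent.
        return int(self.sample_rate_hz * self.buffer_seconds)


def setup_logging(level: int = logging.INFO) -> logging.Logger:
    """Configure root logging once and return a named logger for the package."""
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return logging.getLogger("stellascript")


# A single shared instance that other modules could import.
CONFIG = AudioConfig()
```

Freezing the dataclass means an accidental assignment such as CONFIG.sample_rate_hz = 8000 raises an error instead of silently changing behavior for every module that shares the object.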
stellascript/audio Sub-module
This module is dedicated to handling raw audio data.
capture.py: Audio capture. Manages interaction with the microphone to record the audio stream in real time.
enhancement.py: Audio enhancement. Contains the logic for applying audio cleaning models, such as DeepFilterNet or Demucs, to reduce background noise and improve voice clarity before transcription.
stellascript/processing Sub-module
This module contains the components responsible for the intelligent analysis and processing of audio.
transcriber.py: Transcription module. Encapsulates the speech recognition model (Whisper via whisperx). Its sole responsibility is to take an audio segment and convert it into text.
diarizer.py: Diarization module. Its role is to answer the question: “who is speaking and when?”. It uses models like pyannote.audio or a combination of VAD (Voice Activity Detection) and clustering to segment the audio based on speakers.
speaker_manager.py: Speaker manager. Works closely with the diarizer, especially for the cluster method. It is responsible for creating and managing “voiceprints” (embeddings) to identify and differentiate speakers consistently.
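The idea of keeping voiceprints to label speakers consistently can be sketched with a small registry that compares each new embedding against stored ones. This is a simplified illustration under stated assumptions, not the logic of speaker_manager.py: the class name SimpleSpeakerRegistry, the cosine-similarity comparison, the 0.75 threshold, and the SPEAKER_NN labels are all hypothetical, and real embeddings would come from a speaker-embedding model rather than hand-written vectors.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SimpleSpeakerRegistry:
    """Hypothetical stand-in for a speaker manager; not the project's class."""

    def __init__(self, threshold: float = 0.75) -> None:
        self.threshold = threshold                      # assumed similarity cut-off
        self.voiceprints: dict[str, list[float]] = {}   # label -> stored embedding

    def identify(self, embedding: list[float]) -> str:
        """Return the label of the closest known speaker, or register a new one."""
        best_label, best_score = None, -1.0
        for label, reference in self.voiceprints.items():
            score = cosine_similarity(embedding, reference)
            if score > best_score:
                best_label, best_score = label, score
        if best_label is not None and best_score >= self.threshold:
            return best_label
        # No stored voiceprint is similar enough: treat this as a new speaker.
        new_label = f"SPEAKER_{len(self.voiceprints):02d}"
        self.voiceprints[new_label] = list(embedding)
        return new_label
```

With this registry, two embeddings pointing in nearly the same direction receive the same label, while an orthogonal one is registered as a new speaker. A fuller version might also update the stored voiceprint as more audio from the same speaker arrives; that refinement is left out here.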