See this Delphi console application code...
Created on: February 25, 2025
See this Delphi console application code, which uses the Python library Kokoro TTS through P4D (Python4Delphi):
program pP4DKokoro;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Types,
  System.Diagnostics,
  System.IOUtils, // TPath, TDirectory, TFile
  Windows,
  PythonEngine,
  VarPyth,
  System.Classes,
  System.Net.HttpClient, // For downloading models
  System.Net.URLClient, // For downloading models
  System.Zip,
  System.Variants; // Required for Variant operations

var
  PythonEngine: TPythonEngine;
  PythonModule: TPythonModule; // Structured Python module for Delphi functions
  PythonHome: string;
  PyFuncGenerate: PPyObject; // Global reference to the generation function

// -----------------------------------------------------------------------------
// Constants for Kokoro Model and Voice Management
// -----------------------------------------------------------------------------
const
  KOKORO_REPO_ID = 'hexgrad/Kokoro-82M';
  KOKORO_MODELS_BASE_PATH = 'kokoroModels'; // Base path for Kokoro models

// -----------------------------------------------------------------------------
// Helper Functions: Directory and File Management
// -----------------------------------------------------------------------------

// Check if a directory exists; create if it doesn't. Returns True if exists or was created.
function EnsureDirectoryExists(const DirPath: string): Boolean;
begin
  Result := TDirectory.Exists(DirPath);
  if not Result then
  begin
    try
      TDirectory.CreateDirectory(DirPath);
      Result := True;
    except
      // Handle potential exceptions during directory creation (e.g., permissions)
      Result := False;
    end;
  end;
end;

// -----------------------------------------------------------------------------
// Helper Functions: Model and Voice Download and Verification
// -----------------------------------------------------------------------------
function DownloadFile(const URL, LocalFilename: string): Boolean;
var
  HTTPClient: THTTPClient;
  Response: IHTTPResponse;
  FileStream: TFileStream;
begin
  Result := False;
  HTTPClient := THTTPClient.Create;
  try
    Response := HTTPClient.Get(URL);
    if Response.StatusCode = 200 then
    begin
      FileStream := TFileStream.Create(LocalFilename, fmCreate);
      try
        FileStream.CopyFrom(Response.ContentStream, 0);
        Result := True;
      finally
        FileStream.Free;
      end;
    end
    else
    begin
      Writeln(Format(' Error downloading %s. HTTP Status Code: %d', [URL, Response.StatusCode]));
    end;
  finally
    HTTPClient.Free;
  end;
end;

function CheckKokoroModelInstallation(const ModelsBasePath: string): Boolean;
var
  ModelConfigPath: string;
  ModelPath: string;
begin
  ModelConfigPath := TPath.Combine(ModelsBasePath, 'config.json');
  ModelPath := TPath.Combine(ModelsBasePath, 'kokoro-v1_0.pth');
  Result := TFile.Exists(ModelConfigPath) and TFile.Exists(ModelPath);
  if not Result then
    Writeln('Kokoro base model is NOT installed correctly!');
end;

function DownloadKokoroModelFiles(const BasePath: string): Boolean;
var
  BaseURL: string;
  Filename: string;
  LocalFilePath: string;
  FilesToDownload: TArray<string>;
  i: Integer;
begin
  Result := True; // Assume success, set to False on any failure

  // List of necessary files from the base model.
  FilesToDownload := ['config.json', 'kokoro-v1_0.pth'];
  BaseURL := Format('https://huggingface.co/%s/resolve/main/', [KOKORO_REPO_ID]);

  for i := Low(FilesToDownload) to High(FilesToDownload) do
  begin
    Filename := FilesToDownload[i];
    LocalFilePath := TPath.Combine(BasePath, Filename); // Combine base path with filename
    Writeln(Format(' Downloading file "%s"...', [Filename]));
    if DownloadFile(BaseURL + Filename, LocalFilePath) then
    begin
      Writeln(Format(' File "%s" downloaded successfully.', [Filename]));
    end
    else
    begin
      Writeln(Format(' Failed to download file "%s".', [Filename]));
      Result := False; // Set Result to False on failure
      // Consider whether to exit early or continue to try downloading other files.
    end;
  end;
end;

function CheckKokoroVoiceInstallation(const ModelsBasePath, VoiceName: string): Boolean;
var
  VoicePath: string;
begin
  VoicePath := TPath.Combine(ModelsBasePath, 'voices\' + VoiceName + '.pt');
  Result := TFile.Exists(VoicePath);
  if not Result then
    Writeln('Kokoro voice ' + VoiceName + ' is NOT installed correctly!');
end;

function DownloadKokoroVoice(const ModelsBasePath: String; const VoiceName: string): Boolean;
var
  VoiceURL: string;
  VoiceFilePath: string;
begin
  Result := False; // Assume failure until successful download

  // Construct the URL and file path for the voice
  VoiceURL := Format('https://huggingface.co/%s/resolve/main/voices/%s.pt', [KOKORO_REPO_ID, VoiceName]);
  VoiceFilePath := TPath.Combine(ModelsBasePath, 'voices\' + VoiceName + '.pt');
  Writeln(Format(' Downloading voice "%s"...', [VoiceName]));

  // Download the voice file
  if DownloadFile(VoiceURL, VoiceFilePath) then
  begin
    Writeln(Format(' Voice "%s" downloaded successfully.', [VoiceName]));
    Result := True; // Set Result to True on success
  end
  else
  begin
    Writeln(Format(' Failed to download voice "%s".', [VoiceName]));
    // You might choose to raise an exception or handle this failure differently
  end;
end;

// -----------------------------------------------------------------------------
// Embedded Python Code (Optimized) - Kokoro
// -----------------------------------------------------------------------------
const
  // Initialization script: import dependencies, define globals and model init for Kokoro
  EMBEDDED_PYTHON_SCRIPT_INIT_KOKORO: string =
    'from kokoro import KPipeline' + sLineBreak +
    'import soundfile as sf' + sLineBreak +
    'import torch' + sLineBreak +
    'from loguru import logger' + sLineBreak +
    'import time' + sLineBreak +
    'import os' + sLineBreak +
    '# Set the PHONEMIZER_ESPEAK_LIBRARY and PHONEMIZER_ESPEAK_PATH environment variables' + sLineBreak +
    'os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = r"C:\\Program Files\\eSpeak NG\\libespeak-ng.dll"' + sLineBreak +
    'os.environ["PHONEMIZER_ESPEAK_PATH"] = r"C:\\Program Files\\eSpeak NG\\espeak-ng.exe"' + sLineBreak +
    'pipeline = None' + sLineBreak + // Global KPipeline instance
    'def init_kokoro(lang_code, device):' + sLineBreak +
    '    global pipeline' + sLineBreak +
    '    pipeline = KPipeline(lang_code=lang_code, device=device)' + sLineBreak;

  // Generation script: batch process text in one go for Kokoro
  EMBEDDED_PYTHON_SCRIPT_GENERATE_OPTIMIZED_KOKORO: string =
    'def generate_audio_optimized(text, voice, speed):' + sLineBreak +
    '    global pipeline' + sLineBreak +
    '    generator = pipeline(text, voice=voice, speed=speed, split_pattern=r"\n+")' + sLineBreak +
    '    for result in generator:' + sLineBreak +
    '        if result.audio is not None:' + sLineBreak +
    '            return result.audio.cpu().numpy()' + sLineBreak + // Explicitly move to CPU and convert to NumPy
    '    return None' + sLineBreak;

// -----------------------------------------------------------------------------
// CUDA and Python Engine Setup Functions (Pre-caching and GPU init)
// -----------------------------------------------------------------------------
procedure SetupCUDAEnvironment;
var
  OldPath: string;
begin
  // Set CUDA-related environment variables
  SetEnvironmentVariable('CUDA_PATH', 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4');
  OldPath := GetEnvironmentVariable('PATH');
  SetEnvironmentVariable('PATH', PChar(OldPath +
    ';C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin' +
    ';C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\libnvvp'));

  // Preload key CUDA and CTranslate2 libraries (and potentially torch/faster-whisper deps)
  LoadLibrary(PChar('C:\Users\user\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\lib\cudnn_graph64_9.dll'));
  // Preload CUDA DLL (if needed) - Adapt path if necessary
  LoadLibrary(PChar('C:\Windows\system32\DriverStore\FileRepository\nvdmui.inf_amd64_fdc98cdf10f69918\nvcuda64.dll'));
  // Add any other necessary CUDA/cuDNN related DLLs if needed based on Kokoro dependencies
end;

procedure InitializeCUDAContext;
begin
  with PythonEngine do
  begin
    // Minimal CUDA initialization for fast startup
    ExecString(AnsiString('import torch; torch.cuda.init(); print("CUDA Device:", torch.cuda.get_device_name(0))'));
    CheckError;
  end;
end;

// Dummy Delphi method (not used in translation)
function DoNothing(Self, Args: PPyObject): PPyObject; cdecl;
begin
  Result := PythonEngine.ReturnNone;
end;

// -----------------------------------------------------------------------------
// Python Engine Initialization (pre-cache engine core and libraries)
// -----------------------------------------------------------------------------
procedure InitializePythonEngine;
begin
  PythonEngine := TPythonEngine.Create(nil);
  PythonEngine.Name := 'PythonEngine';
  // Specify the Python 3.9 DLL and home directory
  PythonEngine.DllName := 'C:\Users\user\AppData\Local\Programs\Python\Python39\python39.dll';
  PythonHome := 'C:\Users\user\AppData\Local\Programs\Python\Python39';
  PythonEngine.SetPythonHome(PWideChar(PythonHome));
  PythonEngine.LoadDll;

  // Create a Python module for any Delphi exports (if needed)
  PythonModule := TPythonModule.Create(nil);
  PythonModule.Engine := PythonEngine;
  PythonModule.ModuleName := 'delphi_module';
  PythonModule.AddMethod('do_nothing', @DoNothing, 'A dummy method.');
end;

// -----------------------------------------------------------------------------
// Initialize and Pre-load Kokoro Model
// -----------------------------------------------------------------------------
procedure InitializeKokoro(const LangCode, Device: string);
var
  pyArgs, pyInitFunc, pyMainModule: PPyObject;
  pyLangCode, pyDevice: PPyObject;
begin
  with PythonEngine do
  begin
    // Run the initialization script (pre-cache libraries and globals)
    ExecString(AnsiString(EMBEDDED_PYTHON_SCRIPT_INIT_KOKORO));
    CheckError;

    pyMainModule := GetMainModule;
    if not Assigned(pyMainModule) then
      raise Exception.Create('Cannot retrieve __main__ module.');

    pyInitFunc := PyObject_GetAttrString(pyMainModule, 'init_kokoro');
    Py_XDECREF(pyMainModule);
    if not Assigned(pyInitFunc) then
      raise Exception.Create('Function init_kokoro not found.');

    // Prepare arguments for Python function
    pyLangCode := PyUnicode_FromWideChar(PWideChar(LangCode), Length(LangCode));
    if not Assigned(pyLangCode) then
      raise Exception.Create('Error creating Python string for language code.');

    pyDevice := PyUnicode_FromWideChar(PWideChar(Device), Length(Device));
    if not Assigned(pyDevice) then
      raise Exception.Create('Error creating Python string for device.');

    pyArgs := MakePyTuple([pyLangCode, pyDevice]);
    Py_XDECREF(pyLangCode);
    Py_XDECREF(pyDevice);

    // Call the Python init function
    PyObject_CallObject(pyInitFunc, pyArgs);
    Py_XDECREF(pyArgs);
    Py_XDECREF(pyInitFunc);

    if PyErr_Occurred <> nil then
    begin
      WriteLn('Error initializing Kokoro model.');
      PyErr_Print; // Print detailed Python error to console
      PyErr_Clear;
      raise Exception.Create('Error initializing Kokoro model (see Python error output).');
    end;

    // Load the optimized generation script and retrieve the generation function
    ExecString(AnsiString(EMBEDDED_PYTHON_SCRIPT_GENERATE_OPTIMIZED_KOKORO));
    CheckError;

    PyFuncGenerate := PyObject_GetAttrString(GetMainModule, 'generate_audio_optimized');
    if not Assigned(PyFuncGenerate) then
      raise Exception.Create('Generation function not found.');
    CheckError;
  end;
end;

// -----------------------------------------------------------------------------
// Fast Path: Call Optimized Generation Function (minimal overhead)
// -----------------------------------------------------------------------------
function CallOptimizedGenerate(const TextToGenerate, Voice: string; Speed: Double): TBytes;
var
  pyArgsTuple, pyResult: PPyObject;
  AudioBytes: TBytes;
  NumPyArray: Variant;
  pVarArray: System.PVarArray; // Pointer to Variant array
  DataPtr: Pointer;
  DataSize: NativeInt;
  VarTypeStr: string;
begin
  // Initialize Result to an empty array
  SetLength(AudioBytes, 0);
  Result := AudioBytes;

  with PythonEngine do
  begin
    // Build tuple: (text, voice, speed)
    pyArgsTuple := MakePyTuple([
      PyUnicode_FromWideChar(PWideChar(TextToGenerate), Length(TextToGenerate)),
      PyUnicode_FromWideChar(PWideChar(Voice), Length(Voice)),
      PyFloat_FromDouble(Speed)
    ]);

    // Call the optimized generation function
    pyResult := PyObject_CallObject(PyFuncGenerate, pyArgsTuple);
    Py_XDECREF(pyArgsTuple);

    // If an error occurred, return an empty byte array.
    if (pyResult = nil) or (PyErr_Occurred <> nil) then
    begin
      WriteLn('Generation Error: pyResult is nil or Python error occurred.');
      PyErr_Print; // Print Python error to console for debugging
      PyErr_Clear;
      Exit; // Exit the function, returning the empty array
    end;

    // Check if the returned object is None (using Python's None object)
    if pyResult = PythonEngine.Py_None then
    begin
      Py_XDECREF(pyResult); // Decrement the reference count of Py_None
      WriteLn('No audio generated (Python returned None).');
      Exit; // Return empty array
    end;

    NumPyArray := PyObjectAsVariant(pyResult);
    VarTypeStr := VarToStr(VarType(NumPyArray)); // Get Variant type as string. Use VarToStr
    WriteLn('Debug: Variant Type: ' + VarTypeStr); // Output Variant type
    Py_XDECREF(pyResult);

    if VarIsEmpty(NumPyArray) or VarIsClear(NumPyArray) then
    begin
      WriteLn('Error: Could not convert audio data to Variant (Variant is Empty or Clear).');
      Exit;
    end;

    if VarIsArray(NumPyArray) and (VarArrayDimCount(NumPyArray) = 1) then
    begin
      pVarArray := VarArrayAsPSafeArray(NumPyArray);
      if Assigned(pVarArray) then
      begin
        DataPtr := pVarArray.Data;
        // Correct DataSize calculation using pVarArray^.ElementSize
        DataSize := pVarArray.Bounds[0].ElementCount * pVarArray^.ElementSize;
        WriteLn('Debug: DataSize calculated: ' + IntToStr(DataSize) + ' bytes'); // Output DataSize
        SetLength(AudioBytes, DataSize);
        Move(DataPtr^, AudioBytes[0], DataSize);
      end
      else
        WriteLn('Failed to access variant array (pVarArray is nil).');
    end
    else
    begin
      WriteLn('Data format received is incorrect (Not a 1D array or not an array at all).');
    end;
  end;
  Result := AudioBytes;
end;

procedure SaveAudioToFile(const AudioBytes: TBytes; const FilePath: string);
var
  FileStream: TFileStream;
  NumSamples: Integer;
  ChunkID: array[0..3] of Char;
  Format: array[0..3] of Char;
  Subchunk1ID: array[0..3] of Char;
  Subchunk2ID: array[0..3] of Char;
  ChunkSize: Integer;
  Subchunk1Size: Integer;
  AudioFormat: SmallInt;
  NumChannels: SmallInt;
  SampleRate: Integer;
  ByteRate: Integer;
  BlockAlign: SmallInt;
  BitsPerSample: SmallInt;
  Subchunk2Size: Integer;
begin
  // Create a file stream to write the WAV file
  FileStream := TFileStream.Create(FilePath, fmCreate);
  try
    // Check if AudioBytes has data before writing header
    if Length(AudioBytes) = 0 then
    begin
      WriteLn('Warning: AudioBytes is empty, WAV file might be invalid.');
    end;

    // Calculate the number of samples
    NumSamples := Length(AudioBytes) div 2; // 2 bytes per sample for 16-bit audio

    // Prepare WAV header values
    ChunkID := 'RIFF';
    ChunkSize := 36 + NumSamples * 2;
    Format := 'WAVE';
    Subchunk1ID := 'fmt ';
    Subchunk1Size := 16; // 16 for PCM
    AudioFormat := 1; // 1 for PCM
    NumChannels := 1; // 1 for mono
    SampleRate := 24000; // 24000 Hz
    ByteRate := SampleRate * 2; // SampleRate * NumChannels * BytesPerSample
    BlockAlign := 2; // NumChannels * BytesPerSample
    BitsPerSample := 16; // 16 bits
    Subchunk2ID := 'data';
    Subchunk2Size := NumSamples * 2; // NumSamples * BytesPerSample

    // Debug output for header values
    WriteLn('Debug: WAV Header Values:');
    WriteLn('Debug: ChunkSize: ' + IntToStr(ChunkSize));
    WriteLn('Debug: Subchunk2Size: ' + IntToStr(Subchunk2Size));

    // Write WAV file header
    FileStream.Write(ChunkID, 4);
    FileStream.Write(ChunkSize, 4);
    FileStream.Write(Format, 4);
    FileStream.Write(Subchunk1ID, 4);
    FileStream.Write(Subchunk1Size, 4);
    FileStream.Write(AudioFormat, 2);
    FileStream.Write(NumChannels, 2);
    FileStream.Write(SampleRate, 4);
    FileStream.Write(ByteRate, 4);
    FileStream.Write(BlockAlign, 2);
    FileStream.Write(BitsPerSample, 2);
    FileStream.Write(Subchunk2ID, 4);
    FileStream.Write(Subchunk2Size, 4);

    // Write the audio data
    FileStream.Write(AudioBytes[0], Length(AudioBytes));
  finally
    FileStream.Free;
  end;
end;

// -----------------------------------------------------------------------------
// Destroy Python Engine and Clean-up
// -----------------------------------------------------------------------------
procedure DestroyEngine;
begin
  if Assigned(PyFuncGenerate) then
    PythonEngine.Py_XDECREF(PyFuncGenerate);
  if Assigned(PythonModule) then
    PythonModule.Free;
  if Assigned(PythonEngine) then
    PythonEngine.Free;
end;

// -----------------------------------------------------------------------------
// Main Program: Pre-cache engine/core and perform generation
// -----------------------------------------------------------------------------
var
  TotalStopwatch: TStopwatch;
  CreateEngineTime, GenerationTime, DestroyEngineTime, InitKokoroTime, CUDAInitTime: Int64;
  AudioFilePath, TextToSynthesize, Voice: string;
  EngineTimer, GenerationTimer, InputTimer, KokoroInitTimer, CUDAInitTimer, DestroyEngineTimer: TStopwatch;
  Device, LangCode: string;
  AudioBytes: TBytes;
  KokoroModelsBasePath: string; // Moved declaration here
  ModelInstalled, VoiceInstalled: Boolean;
  DownloadConfirmation: string;
begin
  try
    MaskFPUExceptions(True);
    TotalStopwatch := TStopwatch.StartNew;
    SetupCUDAEnvironment; // Preload CUDA/related DLLs

    WriteLn('=== Kokoro Text-to-Speech ===');
    InputTimer := TStopwatch.StartNew;
    Write('Enter text to synthesize: ');
    ReadLn(TextToSynthesize);
    Write('Enter Kokoro voice name (e.g., pf_dora): ');
    ReadLn(Voice);
    LangCode := 'p'; // Hardcoded for Portuguese
    Device := 'cuda'; // Hardcoded for CUDA (GPU)
    InputTimer.Stop;
    WriteLn('User Input Time: ', InputTimer.ElapsedMilliseconds, ' ms');

    KokoroModelsBasePath := TPath.Combine(ExtractFilePath(ParamStr(0)), KOKORO_MODELS_BASE_PATH);

    // Ensure the base model directory exists
    if not EnsureDirectoryExists(KokoroModelsBasePath) then
    begin
      Writeln('Failed to create Kokoro models base directory: ' + KokoroModelsBasePath);
      Exit; // Critical error: Cannot proceed without model directory
    end;

    ModelInstalled := CheckKokoroModelInstallation(KokoroModelsBasePath);
    if not ModelInstalled then
    begin
      Write('Kokoro base model not installed. Do you want to download it? (yes/no): ');
      ReadLn(DownloadConfirmation);
      if SameText(DownloadConfirmation, 'yes') then
      begin
        if DownloadKokoroModelFiles(KokoroModelsBasePath) then // Pass base path
          ModelInstalled := CheckKokoroModelInstallation(KokoroModelsBasePath) // Recheck
        else
        begin
          WriteLn('Failed to download Kokoro model files.');
          Exit;
        end;
      end
      else
        Exit;
    end;

    // Ensure the 'voices' subdirectory exists
    if not EnsureDirectoryExists(TPath.Combine(KokoroModelsBasePath, 'voices')) then
    begin
      Writeln('Failed to create voices subdirectory.');
      Exit;
    end;

    VoiceInstalled := CheckKokoroVoiceInstallation(KokoroModelsBasePath, Voice);
    if not VoiceInstalled then
    begin
      Write('Voice "' + Voice + '" not found. Do you want to download it? (yes/no): ');
      ReadLn(DownloadConfirmation);
      if SameText(DownloadConfirmation, 'yes') then
      begin
        if DownloadKokoroVoice(KokoroModelsBasePath, Voice) then
          VoiceInstalled := CheckKokoroVoiceInstallation(KokoroModelsBasePath, Voice)
        else
          WriteLn('Failed to download voice')
      end;
    end;

    if ModelInstalled and VoiceInstalled then
    begin
      // Pre-initialize Python engine and load required models
      EngineTimer := TStopwatch.StartNew; // Start Engine Timer
      InitializePythonEngine;
      EngineTimer.Stop; // Stop Engine Timer
      CreateEngineTime := EngineTimer.ElapsedMilliseconds;

      CUDAInitTimer := TStopwatch.StartNew; // Start CUDA Init Timer
      InitializeCUDAContext;
      CUDAInitTimer.Stop; // Stop CUDA Init Timer
      CUDAInitTime := CUDAInitTimer.ElapsedMilliseconds;

      KokoroInitTimer := TStopwatch.StartNew;
      InitializeKokoro(LangCode, Device); // Initialize with language and device
      KokoroInitTimer.Stop; // Stop Kokoro Init Timer
      InitKokoroTime := KokoroInitTimer.ElapsedMilliseconds;

      WriteLn(Format('Synthesizing speech for text using voice: %s', [Voice]));
      GenerationTimer := TStopwatch.StartNew;
      AudioBytes := CallOptimizedGenerate(TextToSynthesize, Voice, 1.0); // Generate with text and voice
      GenerationTimer.Stop;
      GenerationTime := GenerationTimer.ElapsedMilliseconds;

      if Length(AudioBytes) > 0 then
      begin
        AudioFilePath := TPath.Combine(ExtractFilePath(ParamStr(0)), 'kokoro_output.wav');
        SaveAudioToFile(AudioBytes, AudioFilePath);
        WriteLn('Audio file saved to: ' + AudioFilePath);
      end
      else
        WriteLn('No audio generated.');

      WriteLn('');
      WriteLn('=== Synthesis Result ===');
      WriteLn('Voice Name: ' + Voice);
      WriteLn('Text: ' + TextToSynthesize);
      WriteLn('');
      WriteLn('--- Performance Metrics (ms) ---');
      WriteLn('Engine creation: ', CreateEngineTime, ' ms');
      WriteLn('CUDA Init: ', CUDAInitTime, ' ms');
      WriteLn('Init Kokoro: ', InitKokoroTime, ' ms');
      WriteLn('Generation call: ', GenerationTime, ' ms');
    end
    else
      Writeln('Synthesis cancelled because model and/or voice are missing.')
  except
    on E: Exception do
      WriteLn(E.ClassName, ': ', E.Message);
  end;

  DestroyEngineTimer := TStopwatch.StartNew; // Start Engine Timer
  DestroyEngine;
  DestroyEngineTimer.Stop; // Stop Engine Timer
  DestroyEngineTime := DestroyEngineTimer.ElapsedMilliseconds;
  WriteLn('Engine Destruct: ', DestroyEngineTime, ' ms');

  TotalStopwatch.Stop;
  WriteLn('Total Program Execution Time: ', TotalStopwatch.ElapsedMilliseconds, ' ms');
  Write('Press Enter to exit...');
  ReadLn;
end.

initialization
  // Pre-cache Python engine to avoid multiple loads
  InitializePythonEngine;

finalization
  DestroyEngine;

end.
However, the file still could not be played in Windows Media Player; the message was the same as before:
Cannot open kokoro_output.wav. This may be because the file type is not supported, the file extension is incorrect, or the file is corrupt.
0xC00D36C4
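The error code 0xC00D36C4 only says that the player could not recognize the file. One thing worth checking in the listing above is the header. `SaveAudioToFile` declares `ChunkID`, `Format`, `Subchunk1ID` and `Subchunk2ID` as `array[0..3] of Char`, and in Unicode versions of Delphi `Char` is two bytes wide. If that applies here, `FileStream.Write(ChunkID, 4)` would put `R`, NUL, `I`, NUL on disk instead of `RIFF`. The Python sketch below is one way to test that guess on the first 12 bytes of the file; the function name `describe_riff_header` and the two sample buffers are made up for illustration.

```python
import struct


def describe_riff_header(header: bytes) -> str:
    """Classify the first 12 bytes of a file that is supposed to be a WAV."""
    if len(header) < 12:
        return "too short"
    if header[0:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "valid RIFF/WAVE"
    # 'RIFF' stored as UTF-16LE starts with R, NUL, I, NUL
    if header[0:4] == "RIFF".encode("utf-16-le")[:4]:
        return "tag written as 2-byte characters"
    return "unknown"


# Assumed failure mode: only the first 4 bytes of each UTF-16 tag reach the file.
wide = (
    "RIFF".encode("utf-16-le")[:4]
    + struct.pack("<I", 36)
    + "WAVE".encode("utf-16-le")[:4]
)
# Reference: what a well-formed header starts with.
good = b"RIFF" + struct.pack("<I", 36) + b"WAVE"

print(describe_riff_header(wide))
print(describe_riff_header(good))
```

To run it against the real output, read 12 bytes from kokoro_output.wav in binary mode and pass them to `describe_riff_header`. If it reports two-byte tags, declaring those four fields as `array[0..3] of AnsiChar` would be the likely fix on the Delphi side.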
See the information from MediaInfo:
General
Complete name : C:\agents\SOURCE_neural_extern_TTS_qlty_prese_Kokoro\Kokoro_v0_m0_p0\Win64\Debug\kokoro_output.wav
File size : 3.82 MiB
Note that the file now has a "body": it is 3.82 MiB. It seems to have been filled with audio data, but it is still unplayable.
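The size gives another hint. With the header values used in `SaveAudioToFile` (24000 Hz, mono, 16-bit), one second of audio takes 48000 bytes, so 3.82 MiB would mean well over a minute of speech. If the text typed in was a short sentence, the payload is probably not 16-bit PCM. The sketch below, using only the Python standard library, estimates that duration and shows one way to turn floating-point samples into the 16-bit values the header announces. It assumes the samples coming back from Kokoro are floats in the range -1.0 to 1.0; the helper names `pcm16_seconds`, `float_to_pcm16` and `wav_bytes` are illustrative and not part of Kokoro or P4D.

```python
import io
import struct
import wave

SAMPLE_RATE = 24000  # same value as in SaveAudioToFile


def pcm16_seconds(file_size: int, header_size: int = 44) -> float:
    """Duration implied by a mono 16-bit PCM file of the given size."""
    return (file_size - header_size) / (SAMPLE_RATE * 2)


def float_to_pcm16(samples) -> bytes:
    """Clip floats to [-1.0, 1.0] and pack them as little-endian int16."""
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    return struct.pack("<%dh" % len(ints), *ints)


def wav_bytes(samples) -> bytes:
    """Build a complete mono 16-bit WAV in memory with the wave module."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes = 16 bits per sample
        w.setframerate(SAMPLE_RATE)
        w.writeframes(float_to_pcm16(samples))
    return buf.getvalue()


# Duration a 3.82 MiB file would have if it really held 16-bit mono PCM.
print(round(pcm16_seconds(int(3.82 * 1024 * 1024)), 1), "seconds")

# Tiny reference file: four samples, one of them out of range and clipped.
data = wav_bytes([0.0, 0.5, -1.0, 2.0])
print(data[:4], data[8:12], len(data))
```

Doing the conversion inside `generate_audio_optimized` and returning a `bytes` object might also avoid depending on how the Variant conversion lays out a NumPy array in memory.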
Below is the SMPlayer log from trying to open the file with it:
[19:04:54:706] global_init
[19:04:54:707] global_init: config file: 'C:/Users/user/.smplayer/smplayer.ini'
[19:04:54:707] Preferences::load
[19:04:54:707] AssStyles::load
[19:04:54:707] Preferences::load: config_version: 5, CURRENT_CONFIG_VERSION: 5
[19:04:54:709] Translator::loadCatalog: can't load "qt_pt_BR" from "C:/Program Files/SMPlayer/translations"
[19:04:54:709] Translator::loadCatalog: can't load "qtbase_pt_BR" from "C:/Program Files/SMPlayer/translations"
[19:04:54:709] Translator::loadCatalog: successfully loaded "smplayer_pt_BR" from "C:/Program Files/SMPlayer/translations"
[19:04:54:709] SMPlayer v.22.7.0 (revisão 10091) (64-bit) executando em Windows (Windows 10)
[19:04:54:709] Compiled with Qt v. 5.15.2, using 5.15.2
[19:04:54:709] * application path: "C:/Program Files/SMPlayer"
[19:04:54:709] * data path: "C:/Program Files/SMPlayer"
[19:04:54:709] * translation path: "C:/Program Files/SMPlayer/translations"
[19:04:54:709] * doc path: "C:/Program Files/SMPlayer/docs"
[19:04:54:710] * themes path: "C:/Program Files/SMPlayer/themes"
[19:04:54:710] * shortcuts path: "C:/Program Files/SMPlayer/shortcuts"
[19:04:54:710] * config path: "C:/Users/user/.smplayer"
[19:04:54:710] * ini path: "C:/Users/user/.smplayer"
[19:04:54:710] * file for subtitles' styles: "C:/Users/user/.smplayer/styles.ass"
[19:04:54:710] * current path: "C:/Program Files/SMPlayer"
[19:04:54:710] main: hdpi_config_path: "C:/Users/user/.smplayer"
[19:04:54:710] SMPlayer::processArgs: arguments: 2
[19:04:54:710] SMPlayer::processArgs: 0 = C:\Program Files\SMPlayer\smplayer.exe
[19:04:54:710] SMPlayer::processArgs: 1 = C:\agents\SOURCE_neural_extern_TTS_qlty_prese_Kokoro\Kokoro_v0_m0_p0\Win64\Debug\kokoro_output.wav
[19:04:54:710] SMPlayer::processArgs: files_to_play: count: 1
[19:04:54:710] SMPlayer::processArgs: files_to_play[0]: 'C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav'
[19:04:54:711] SMPlayer::gui: changed working directory to app path
[19:04:54:711] SMPlayer::gui: current directory: C:/Program Files/SMPlayer
[19:04:54:711] ScreenHelper::setAutoHideCursor: 0
[19:04:54:711] Images::setTheme: "H2O" is an internal theme
[19:04:54:711] Images::setThemesPath: ""
[19:04:54:721] MediaSettings::reset
[19:04:54:721] Core::changeFileSettingsMethod: hash
[19:04:54:721] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:54:721] PlayerProcess::createPlayerProcess: creating MPVProcess
[19:04:54:722] MPVProcess::initializeOptionVars
[19:04:54:722] MPVProcess::MPVProcess: socket_name: "C:/Users/user/AppData/Local/Temp/smplayer-mpv-657c"
[19:04:54:722] MediaSettings::reset
[19:04:54:722] Preferences::monitor_aspect_double
[19:04:54:722] warning: monitor_aspect couldn't be parsed!
[19:04:54:722] monitor_aspect set to 0
[19:04:54:722] WinScreenSaver::retrieveState
[19:04:54:723] WinScreenSaver::retrieveState: lowpower: 0, poweroff: 0, screensaver: 0
[19:04:55:748] Playlist::setModified: 0
[19:04:55:748] Playlist::updateWindowTitle: "Lista de reprodução sem nome"
[19:04:55:756] Recents::addItem: 'http://smplayer.info/sample.m3u8'
[19:04:55:756] Playlist::setConfigPath: "C:/Users/user/.smplayer"
[19:04:55:756] Playlist::setConfigPath: ini file: "C:/Users/user/.smplayer/playlist.ini"
[19:04:55:946] Playlist::loadSettings
[19:04:55:946] Helper::qtVersion: 6502
[19:04:57:231] Playlist::updateWindowTitle: "C:/Users/user/Downloads/Video/no desc_2.m3u8"
[19:04:57:231] Playlist::setModified: 0
[19:04:57:231] Playlist::updateWindowTitle: "C:/Users/user/Downloads/Video/no desc_2.m3u8"
[19:04:57:241] BaseGui::BaseGui: default_style: "windowsvista"
[19:04:57:241] Favorites::load
[19:04:57:242] Favorites::load
[19:04:57:242] Favorites::load
[19:04:57:244] BaseGui::initializeMenus
[19:04:57:244] BaseGui::updateBookmarks
[19:04:57:254] BaseGui::initializeMenus
[19:04:57:254] BaseGui::updateBookmarks
[19:04:57:254] BaseGui::updateRecents
[19:04:57:254] BaseGui::updateWidgets
[19:04:57:254] Core::mute
[19:04:57:254] Core::pausing_prefix
[19:04:57:254] MplayerVersion::isMplayerAtLeast: comparing 27665 with 33472
[19:04:57:254] MPVProcess::sendCommand: "set mute yes"
[19:04:57:254] WARNING: MPVProcess::sendCommand: mpv is not running. Command ignored.
[19:04:57:254] Core::updateWidgets
[19:04:57:254] BaseGui::updateWidgets
[19:04:57:254] Core::changeUseCustomSubStyle: 1
[19:04:57:254] Core::changeSubVisilibity: 1
[19:04:57:254] Core::displayMessage
[19:04:57:254] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:255] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:255] BaseGui::setupNetworkProxy
[19:04:57:255] BaseGui::setupNetworkProxy: no proxy
[19:04:57:255] BaseGui::setStayOnTop: 0
[19:04:57:255] BaseGui::setStayOnTop: nothing to do
[19:04:57:255] BaseGui::updateWidgets
[19:04:57:255] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:255] BaseGui::updateRecents
[19:04:57:255] UpdateChecker::UpdateChecker: enabled: true
[19:04:57:255] UpdateChecker::UpdateChecker: days_to_check: 7
[19:04:57:255] UpdateChecker::UpdateChecker: days since last check: 0
[19:04:57:256] BaseGuiPlus::updateSendToScreen
[19:04:57:256] BaseGuiPlus::updateSendAudioMenu
[19:04:57:256] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:256] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:256] DeviceInfo::mpvAudioDevices
[19:04:57:444] DeviceInfo::mpvAudioDevices: "List of detected audio devices:"
[19:04:57:444] DeviceInfo::mpvAudioDevices: "'auto' (Autoselect device)"
[19:04:57:444] DeviceInfo::mpvAudioDevices: "'wasapi/{44eb922c-cea2-47e6-82ab-1b72368e7fd1}' (Speakers/Headphones (2- Realtek(R) Audio))"
[19:04:57:444] DeviceInfo::mpvAudioDevices: device: "{44eb922c-cea2-47e6-82ab-1b72368e7fd1}" name: "Speakers/Headphones (2- Realtek(R) Audio)"
[19:04:57:444] DeviceInfo::mpvAudioDevices: "'openal' (Default (openal))"
[19:04:57:444] DeviceInfo::mpvAudioDevices: "'sdl' (Default (sdl))"
[19:04:57:444] GlobalShortcuts::GlobalShortcuts
[19:04:57:444] GlobalShortcuts::setEnabled: false
[19:04:57:459] Chromecast::loadSettings
[19:04:57:461] BaseGui::initializeMenus
[19:04:57:461] BaseGui::updateBookmarks
[19:04:57:461] BaseGui::updateRecents
[19:04:57:461] BaseGuiPlus::updateWidgets
[19:04:57:461] BaseGui::updateWidgets
[19:04:57:461] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:470] BaseGuiPlus::loadConfig
[19:04:57:470] DefaultGui::createStatusBar
[19:04:57:471] StateWidget::StateWidget: supported formats for QMovie: ("gif", "webp")
[19:04:57:471] DefaultGui::createActions
[19:04:57:472] DefaultGui::createControlWidget
[19:04:57:472] DefaultGui::createControlWidgetMini
[19:04:57:472] AutohideWidget::installFilter: child name: ""
[19:04:57:472] AutohideWidget::installFilter: child name: "mplayerwindowlogo"
[19:04:57:472] AutohideWidget::installFilter: child name: ""
[19:04:57:472] DefaultGui::adjustFloatingControlSize
[19:04:57:472] DefaultGui::populateMainMenu
[19:04:57:472] BaseGuiPlus::populateMainMenu
[19:04:57:472] BaseGui::populateMainMenu
[19:04:57:473] BaseGuiPlus::initializeSystrayMenu
[19:04:57:474] BaseGui::initializeMenus
[19:04:57:474] BaseGui::updateBookmarks
[19:04:57:474] BaseGui::updateRecents
[19:04:57:474] DefaultGui::updateWidgets
[19:04:57:474] BaseGuiPlus::updateWidgets
[19:04:57:474] BaseGui::updateWidgets
[19:04:57:474] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:474] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:475] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:476] DefaultGui::loadConfig
[19:04:57:568] DesktopInfo::isInsideScreen: center point of window: QPoint(967,491)
[19:04:57:568] DesktopInfo::isInsideScreen: virtual geometry: QRect(0,0 1920x1030)
[19:04:57:568] ToolbarEditor::load: 'toolbar1'
[19:04:57:568] ToolbarEditor::load: loading action open_file
[19:04:57:568] ToolbarEditor::load: loading action open_url
[19:04:57:568] ToolbarEditor::load: loading action favorites_menu
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action screenshot
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action show_file_properties
[19:04:57:568] ToolbarEditor::load: loading action show_playlist
[19:04:57:568] ToolbarEditor::load: loading action show_tube_browser
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action show_preferences
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action play_prev
[19:04:57:568] ToolbarEditor::load: loading action play_next
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action audiotrack_menu
[19:04:57:568] ToolbarEditor::load: loading action subtitlestrack_menu
[19:04:57:568] ToolbarEditor::load: 'controlwidget'
[19:04:57:568] ToolbarEditor::load: loading action play_or_pause
[19:04:57:568] ToolbarEditor::load: loading action stop
[19:04:57:568] ToolbarEditor::load: loading action separator
[19:04:57:568] ToolbarEditor::load: adding separator
[19:04:57:568] ToolbarEditor::load: loading action rewindbutton_action
[19:04:57:568] ToolbarEditor::load: loading action timeslider_action
[19:04:57:568] TimeSlider::setDragDelay: 100
[19:04:57:568] ToolbarEditor::load: loading action forwardbutton_action
[19:04:57:569] ToolbarEditor::load: loading action separator
[19:04:57:569] ToolbarEditor::load: adding separator
[19:04:57:569] ToolbarEditor::load: loading action fullscreen
[19:04:57:569] ToolbarEditor::load: loading action mute
[19:04:57:569] ToolbarEditor::load: loading action volumeslider_action
[19:04:57:569] ToolbarEditor::load: 'controlwidget_mini'
[19:04:57:569] ToolbarEditor::load: loading action play_or_pause
[19:04:57:569] ToolbarEditor::load: loading action stop
[19:04:57:569] ToolbarEditor::load: loading action separator
[19:04:57:569] ToolbarEditor::load: adding separator
[19:04:57:569] ToolbarEditor::load: loading action rewind1
[19:04:57:569] ToolbarEditor::load: loading action timeslider_action
[19:04:57:569] TimeSlider::setDragDelay: 100
[19:04:57:569] ToolbarEditor::load: loading action forward1
[19:04:57:569] ToolbarEditor::load: loading action separator
[19:04:57:569] ToolbarEditor::load: adding separator
[19:04:57:569] ToolbarEditor::load: loading action mute
[19:04:57:569] ToolbarEditor::load: loading action volumeslider_action
[19:04:57:569] ToolbarEditor::load: 'floating_control'
[19:04:57:569] ToolbarEditor::load: loading action play_or_pause
[19:04:57:569] ToolbarEditor::load: loading action stop
[19:04:57:569] ToolbarEditor::load: loading action separator
[19:04:57:569] ToolbarEditor::load: adding separator
[19:04:57:569] ToolbarEditor::load: loading action rewind1
[19:04:57:569] ToolbarEditor::load: loading action current_timelabel_action
[19:04:57:569] ToolbarEditor::load: loading action timeslider_action
[19:04:57:569] TimeSlider::setDragDelay: 100
[19:04:57:569] ToolbarEditor::load: loading action total_timelabel_action
[19:04:57:569] ToolbarEditor::load: loading action forward1
[19:04:57:570] ToolbarEditor::load: loading action separator
[19:04:57:570] ToolbarEditor::load: adding separator
[19:04:57:570] ToolbarEditor::load: loading action fullscreen
[19:04:57:570] ToolbarEditor::load: loading action mute
[19:04:57:570] ToolbarEditor::load: loading action volumeslider_action
[19:04:57:571] Helper::qtVersion: 6502
[19:04:57:571] DefaultGui::updateWidgets
[19:04:57:571] BaseGuiPlus::updateWidgets
[19:04:57:571] BaseGui::updateWidgets
[19:04:57:571] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:572] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:572] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe"
[19:04:57:572] BaseGui::applyStyles
[19:04:57:572] BaseGui::applyStyles: stylesheet: "H2O"
[19:04:57:572] BaseGui::changeStyleSheet: "H2O"
[19:04:57:572] BaseGui::loadQss: :/default-theme/style.qss
[19:04:57:572] Images::setTheme: "H2O" is an internal theme
[19:04:57:572] Images::setThemesPath: ""
[19:04:57:572] BaseGui::changeStyleSheet: ":/H2O/style.qss"
[19:04:57:572] BaseGui::loadQss: :/H2O/style.qss
[19:04:57:572] Images::setTheme: "H2O" is an internal theme
[19:04:57:572] Images::setThemesPath: ""
[19:04:57:585] BaseGui::applyStyles: style: "Fusion"
[19:04:57:608] BaseGui::showEvent
[19:04:57:660] BaseGui::setInitialSecond: 0
[19:04:57:660] BaseGui::openFiles
[19:04:57:660] Playlist::setModified: 0
[19:04:57:660] Playlist::updateWindowTitle: "C:/Users/user/Downloads/Video/no desc_2.m3u8"
[19:04:57:660] Playlist::addFiles
[19:04:57:661] Playlist::addFiles: latest_dir: "C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug"
[19:04:57:661] BaseGui::open: 'C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav'
[19:04:57:661] Core::open: 'C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav'
[19:04:57:661] Core::open: * identified as local file
[19:04:57:661] Core::openFile: 'C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav'
[19:04:57:661] Core::playNewFile: 'C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav'
[19:04:57:661] Core::saveMediaInfo
[19:04:57:661] MediaSettings::reset
[19:04:57:661] FileSettingsHash::existSettingsFor: 
"C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav" [19:04:57:662] FileSettingsHash::existSettingsFor: config_file: "C:/Users/user/.smplayer/file_settings/0/010cdc9801149451.ini" [19:04:57:662] Core::playNewFile: we have settings for this file [19:04:57:662] Core::restoreSettingsForMedia: "C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav" type: 0 [19:04:57:662] FileSettings::loadSettingsFor: "C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav" type: 0 [19:04:57:663] FileSettingsHash::loadSettingsFor: config_file: "C:/Users/user/.smplayer/file_settings/0/010cdc9801149451.ini" [19:04:57:663] MediaSettings::reset [19:04:57:663] MediaSettings::load [19:04:57:663] MediaSettings::load: demuxer_section: demuxer_unknown [19:04:57:663] MediaSettings::load: list of video tracks: [19:04:57:664] MediaSettings::load: list of audio tracks: [19:04:57:664] MediaSettings::load: list of subtitle tracks: [19:04:57:664] Core::restoreSettingsForMedia: media settings read [19:04:57:664] Core::playNewFile: volume: 40, old_volume: 40 [19:04:57:664] Core::initPlaying [19:04:57:664] MplayerWindow::setLogoVisible: false [19:04:57:664] Core::startMplayer: file: "C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav" seek: 0 [19:04:57:664] RetrieveYoutubeUrl::close [19:04:57:664] Core::startMplayer: checking if stream is a playlist [19:04:57:664] Core::startMplayer: url path: '/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav' [19:04:57:664] Core::startMplayer: url_is_playlist: 0 [19:04:57:664] InfoReader::setPlayerBin: mplayerbin: "mpv/mpv.exe" [19:04:57:668] InfoReader::getInfo: sname: "mpv.exe_4a8f9e8017aa512e" [19:04:57:669] InfoReader::getInfo: loaded info from "C:/Users/user/.smplayer/player_info.ini" [19:04:57:743] Core::startMplayer: edl file: '' 
[19:04:57:743] Core::startMplayer: extra_params: () [19:04:57:743] ScreenHelper::playingStarted [19:04:57:743] ScreenHelper::setAutoHideCursor: 1 [19:04:57:743] VideoLayer::playingStarted [19:04:57:743] Core::disableScreensaver [19:04:57:743] WinScreenSaver::disable [19:04:57:744] Core::startMplayer: command: 'mpv/mpv.exe --no-quiet --terminal --no-msg-color --input-ipc-server=C:/Users/user/AppData/Local/Temp/smplayer-mpv-657c --msg-level=ffmpeg/demuxer=error --video-rotate=no --no-config --no-fs --hwdec=no --sub-auto=fuzzy --priority=normal --no-input-default-bindings --input-vo-keyboard=no --no-input-cursor --cursor-autohide=no --no-keepaspect --wid=6167854 --monitorpixelaspect=1 --osd-level=1 --osd-scale=1 --osd-bar-align-y=0.6 --sub-ass --embeddedfonts --sub-ass-line-spacing=0 --sub-scale=1 --sub-font=Arial --sub-color=#ffffffff --sub-shadow-color=#ff000000 --sub-border-color=#ff000000 --sub-border-size=0.75 --sub-shadow-offset=2.5 --sub-font-size=50 --sub-bold=no --sub-italic=no --sub-margin-y=8 --sub-margin-x=20 --sub-codepage=ISO-8859-1 --sub-pos=100 --volume=100 --mute=yes --cache=auto --screenshot-template=cap_%F_%p_%02n --screenshot-format=jpg --screenshot-directory=C:\Users\user\Pictures\smplayer_screenshots --audio-pitch-correction=yes --af-add=@aeq:lavfi=[firequalizer=gain='cubic_interpolate(f)':zero_phase=on:wfunc=tukey:delay=0.027:gain_entry='entry(0,0);entry(62.5,0);entry(125,0);entry(250,0);entry(500,0);entry(1000,0);entry(2000,0);entry(4000,0);entry(8000,0);entry(16000,0)'] --volume-max=100 --term-playing-msg=MPV_VERSION=${=mpv-version:} INFO_VIDEO_WIDTH=${=width} INFO_VIDEO_HEIGHT=${=height} INFO_VIDEO_ASPECT=${=video-params/aspect} INFO_VIDEO_FPS=${=container-fps:${=fps}} INFO_VIDEO_FORMAT=${=video-format} INFO_VIDEO_CODEC=${=video-codec} INFO_DEMUX_ROTATION=${=track-list/0/demux-rotation} INFO_AUDIO_FORMAT=${=audio-codec-name} INFO_AUDIO_CODEC=${=audio-codec} INFO_AUDIO_RATE=${=audio-params/samplerate} 
INFO_AUDIO_NCH=${=audio-params/channel-count} INFO_LENGTH=${=duration:${=length}} INFO_DEMUXER=${=current-demuxer:${=demuxer}} INFO_SEEKABLE=${=seekable} INFO_TITLES=${=disc-titles} INFO_CHAPTERS=${=chapters} INFO_TRACKS_COUNT=${=track-list/count} METADATA_TITLE=${metadata/by-key/title:} METADATA_ARTIST=${metadata/by-key/artist:} METADATA_ALBUM=${metadata/by-key/album:} METADATA_GENRE=${metadata/by-key/genre:} METADATA_DATE=${metadata/by-key/date:} METADATA_TRACK=${metadata/by-key/track:} METADATA_COPYRIGHT=${metadata/by-key/copyright:} INFO_MEDIA_TITLE=${=media-title:} INFO_STREAM_PATH=${stream-path} --term-status-msg=STATUS: ${=time-pos} / ${=duration:${=length:0}} P: ${=pause} B: ${=paused-for-cache} I: ${=core-idle} VB: ${=video-bitrate:0} AB: ${=audio-bitrate:0} C:/agents/SOURCE_neural_extern_TTS_qlty_prese_Kokoro/Kokoro_v0_m0_p0/Win64/Debug/kokoro_output.wav' [19:04:57:744] MyProcess::start: environment path: "C:\\Program Files\\SMPlayer;C:\\Program Files\\NVIDIA\\CUDNN\\v9.1\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.4\\bin;C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.4\\libnvvp;;C:\\Program Files (x86)\\Embarcadero\\Studio\\22.0\\bin;C:\\Users\\Public\\Documents\\Embarcadero\\Studio\\22.0\\Bpl;C:\\Program Files (x86)\\Embarcadero\\Studio\\22.0\\bin64;C:\\Users\\Public\\Documents\\Embarcadero\\Studio\\22.0\\Bpl\\Win64;C:\\Windows\\system32;C:\\Windows;C:\\Windows\\System32\\Wbem;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;C:\\Windows\\System32\\OpenSSH\\;C:\\Program Files\\Meld\\;C:\\Windows\\System32\\Wbem;C:\\openjdk\\jdk-22.0.1\\bin;C:\\ProgramData\\chocolatey\\bin;C:\\msys64\\ucrt64\\bin;C:\\Program Files\\NVIDIA\\CUDNN\\v12.4\\bin;C:\\Program Files\\Tesseract-OCR;C:\\Program Files\\CMake\\bin;C:\\Program Files\\Git\\cmd;C:\\Program Files\\NVIDIA Corporation\\Nsight Compute 2024.1.1\\;C:\\Program Files (x86)\\NVIDIA Corporation\\PhysX\\Common;C:\\Program Files\\NVIDIA Corporation\\NVIDIA NvDLISR;C:\\Program 
Files\\eSpeak NG\\;C:\\Users\\user\\AppData\\Local\\Programs\\Python\\Python39\\Scripts\\;C:\\Users\\user\\AppData\\Local\\Programs\\Python\\Python39\\;C:\\Users\\user\\.cargo\\bin;C:\\Users\\user\\AppData\\Local\\Microsoft\\WindowsApps;C:\\texlive\\2023\\bin\\windows;C:\\Program Files\\JetBrains\\PyCharm Community Edition 2024.1.3\\bin;C:\\Users\\user\\.cache\\lm-studio\\bin;C:\\Users\\user\\AppData\\Local\\Programs\\Ollama;C:\\Users\\user\\AppData\\Local\\Programs\\Microsoft VS Code\\bin;C:\\Program Files\\NVIDIA\\CUDNN\\v12.4\\bin" [19:04:57:744] MyProcess::start: current directory: "C:/Program Files/SMPlayer" [19:04:57:770] BaseGui::loadActions [19:04:57:770] ActionsEditor::loadFromConfig [19:04:57:774] BaseGuiPlus::updateShortcutsContext [19:04:57:818] BaseGui::initializeMenus [19:04:57:818] BaseGui::updateBookmarks [19:04:57:818] BaseGui::updateRecents [19:04:57:818] DefaultGui::updateWidgets [19:04:57:818] BaseGuiPlus::updateWidgets [19:04:57:818] BaseGui::updateWidgets [19:04:57:820] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe" [19:04:57:821] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe" [19:04:57:822] PlayerID::Player: player_bin: "mpv/mpv.exe" filename: "mpv.exe" [19:04:58:256] BaseGui::checkReminder [19:04:58:534] MPVProcess::parseLine: "Failed to recognize file format." [19:04:58:534] MPVProcess::parseLine: "Exiting... 
(Errors when loading file)" [19:04:58:592] MyProcess::procFinished [19:04:58:592] MyProcess::procFinished: Bytes available: 0 [19:04:58:592] MPVProcess::processFinished: exitCode: 2, status: 0 [19:04:58:592] ScreenHelper::playingStopped [19:04:58:592] ScreenHelper::setAutoHideCursor: 0 [19:04:58:592] VideoLayer::playingStopped [19:04:58:592] Core::enableScreensaver [19:04:58:592] WinScreenSaver::enable [19:04:58:592] WinScreenSaver::restoreState: lowpower: 0, poweroff: 0, screensaver: 0 [19:04:58:592] Core::processFinished [19:04:58:592] Core::processFinished: we_are_restarting: 0 [19:04:58:592] Core::processFinished: play has finished! [19:04:58:592] Core::processFinished: exit_code: 2 [19:04:58:592] BaseGui::displayState: "Stopped" [19:04:58:592] DefaultGui::togglePlayAction [19:04:58:592] BaseGui::togglePlayAction [19:04:58:592] StateWidget::watchState: 0 [19:04:58:592] BaseGui::showExitCodeFromMplayer: 2 [19:04:58:592] BaseGui::showExitCodeFromMplayer: not displaying error dialog [19:04:58:592] Playlist::playerFinishedWithError: 2 [19:04:58:592] BaseGui::checkStayOnTop [19:05:16:940] BaseGui::showFilePropertiesDialog [19:05:16:940] BaseGui::createFilePropertiesDialog [19:05:16:943] InfoFile::style: stylesheet_file: ":/H2O/infofile.css" [19:05:16:944] InfoReader::setPlayerBin: mplayerbin: "mpv/mpv.exe" [19:05:16:945] InfoReader::getInfo: sname: "mpv.exe_4a8f9e8017aa512e" [19:05:16:945] InfoReader::getInfo: loaded info from "C:/Users/user/.smplayer/player_info.ini" [19:05:16:946] FilePropertiesDialog::setDemuxer [19:05:16:946] FilePropertiesDialog::find [19:05:16:946] * demuxer: '', pos: -1 [19:05:16:946] FilePropertiesDialog::setAudioCodec [19:05:16:946] FilePropertiesDialog::find [19:05:16:946] * ac: '', pos: -1 [19:05:16:946] FilePropertiesDialog::setVideoCodec [19:05:16:946] FilePropertiesDialog::find [19:05:16:946] * vc: '', pos: -1 [19:05:16:946] InfoFile::style: stylesheet_file: ":/H2O/infofile.css" [19:05:32:747] BaseGui::showLog [19:05:58:594] 
BaseGui::clear_just_stopped
I hope this additional information helps you resolve this issue once and for all.
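The log above shows mpv rejecting `kokoro_output.wav` with "Failed to recognize file format.", so one way to narrow things down is to look at the first bytes of the file. The following is a minimal sketch in Python, assuming the file is meant to start with the canonical 44-byte RIFF/WAVE PCM header; the function name `describe_wav_header` is hypothetical and not part of the program in this conversation.

```python
import struct


def describe_wav_header(data: bytes) -> dict:
    """Check the first 44 bytes against the canonical RIFF/WAVE PCM layout.

    Hypothetical helper for illustration only.
    """
    if len(data) < 44:
        return {"valid": False, "reason": "shorter than 44 bytes"}
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        return {"valid": False, "reason": f"unexpected magic {data[0:12]!r}"}
    if data[12:16] != b"fmt ":
        return {"valid": False, "reason": "no 'fmt ' chunk at offset 12"}
    # Offsets 20..35: format tag, channels, sample rate, byte rate,
    # block align, bits per sample (all little-endian).
    audio_format, channels, sample_rate, byte_rate, block_align, bits = (
        struct.unpack_from("<HHIIHH", data, 20)
    )
    return {
        "valid": True,
        "audio_format": audio_format,
        "channels": channels,
        "sample_rate": sample_rate,
        "byte_rate": byte_rate,
        "block_align": block_align,
        "bits_per_sample": bits,
    }
```

Calling it on the first 44 bytes of the file (for example `describe_wav_header(open("kokoro_output.wav", "rb").read(44))`) either returns the parsed fields or reports which part of the header does not match. Any leading bytes other than plain ASCII `RIFF` (for instance, characters interleaved with zero bytes) are reported as unexpected magic.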
I tried again like this:
program pP4DKokoro;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Types,
  System.Diagnostics,
  System.IOUtils, // TPath, TDirectory, TFile
  Windows,
  PythonEngine,
  VarPyth,
  System.Classes,
  System.Net.HttpClient, // For downloading models
  System.Net.URLClient, // For downloading models
  System.Zip,
  System.Variants; // Required for Variant operations

var
  PythonEngine: TPythonEngine;
  PythonModule: TPythonModule; // Structured Python module for Delphi functions
  PythonHome: string;
  PyFuncGenerate: PPyObject; // Global reference to the generation function

// -----------------------------------------------------------------------------
// Constants for Kokoro Model and Voice Management
// -----------------------------------------------------------------------------
const
  KOKORO_REPO_ID = 'hexgrad/Kokoro-82M';
  KOKORO_MODELS_BASE_PATH = 'kokoroModels'; // Base path for Kokoro models

// -----------------------------------------------------------------------------
// Helper Functions: Directory and File Management
// -----------------------------------------------------------------------------

// Check if a directory exists; create if it doesn't. Returns True if exists or was created.
function EnsureDirectoryExists(const DirPath: string): Boolean;
begin
  Result := TDirectory.Exists(DirPath);
  if not Result then
  begin
    try
      TDirectory.CreateDirectory(DirPath);
      Result := True;
    except
      // Handle potential exceptions during directory creation (e.g., permissions)
      Result := False;
    end;
  end;
end;

// -----------------------------------------------------------------------------
// Helper Functions: Model and Voice Download and Verification
// -----------------------------------------------------------------------------
function DownloadFile(const URL, LocalFilename: string): Boolean;
var
  HTTPClient: THTTPClient;
  Response: IHTTPResponse;
  FileStream: TFileStream;
begin
  Result := False;
  HTTPClient := THTTPClient.Create;
  try
    Response := HTTPClient.Get(URL);
    if Response.StatusCode = 200 then
    begin
      FileStream := TFileStream.Create(LocalFilename, fmCreate);
      try
        FileStream.CopyFrom(Response.ContentStream, 0);
        Result := True;
      finally
        FileStream.Free;
      end;
    end
    else
    begin
      Writeln(Format(' Error downloading %s. HTTP Status Code: %d', [URL, Response.StatusCode]));
    end;
  finally
    HTTPClient.Free;
  end;
end;

function CheckKokoroModelInstallation(const ModelsBasePath: string): Boolean;
var
  ModelConfigPath: string;
  ModelPath: string;
begin
  ModelConfigPath := TPath.Combine(ModelsBasePath, 'config.json');
  ModelPath := TPath.Combine(ModelsBasePath, 'kokoro-v1_0.pth');
  Result := TFile.Exists(ModelConfigPath) and TFile.Exists(ModelPath);
  if not Result then
    Writeln('Kokoro base model is NOT installed correctly!');
end;

function DownloadKokoroModelFiles(const BasePath: string): Boolean;
var
  BaseURL: string;
  Filename: string;
  LocalFilePath: string;
  FilesToDownload: TArray<string>;
  i: Integer;
begin
  Result := True; // Assume success, set to False on any failure
  // List of necessary files from the base model.
  FilesToDownload := ['config.json', 'kokoro-v1_0.pth'];
  BaseURL := Format('https://huggingface.co/%s/resolve/main/', [KOKORO_REPO_ID]);
  for i := Low(FilesToDownload) to High(FilesToDownload) do
  begin
    Filename := FilesToDownload[i];
    LocalFilePath := TPath.Combine(BasePath, Filename); // Combine base path with filename
    Writeln(Format(' Downloading file "%s"...', [Filename]));
    if DownloadFile(BaseURL + Filename, LocalFilePath) then
    begin
      Writeln(Format(' File "%s" downloaded successfully.', [Filename]));
    end
    else
    begin
      Writeln(Format(' Failed to download file "%s".', [Filename]));
      Result := False; // Set Result to False on failure
      // Consider whether to exit early or continue to try downloading other files.
    end;
  end;
end;

function CheckKokoroVoiceInstallation(const ModelsBasePath, VoiceName: string): Boolean;
var
  VoicePath: string;
begin
  VoicePath := TPath.Combine(ModelsBasePath, 'voices\' + VoiceName + '.pt');
  Result := TFile.Exists(VoicePath);
  if not Result then
    Writeln('Kokoro voice ' + VoiceName + ' is NOT installed correctly!');
end;

function DownloadKokoroVoice(const ModelsBasePath: String; const VoiceName: string): Boolean;
var
  VoiceURL: string;
  VoiceFilePath: string;
begin
  Result := False; // Assume failure until successful download
  // Construct the URL and file path for the voice
  VoiceURL := Format('https://huggingface.co/%s/resolve/main/voices/%s.pt', [KOKORO_REPO_ID, VoiceName]);
  VoiceFilePath := TPath.Combine(ModelsBasePath, 'voices\' + VoiceName + '.pt');
  Writeln(Format(' Downloading voice "%s"...', [VoiceName]));
  // Download the voice file
  if DownloadFile(VoiceURL, VoiceFilePath) then
  begin
    Writeln(Format(' Voice "%s" downloaded successfully.', [VoiceName]));
    Result := True; // Set Result to True on success
  end
  else
  begin
    Writeln(Format(' Failed to download voice "%s".', [VoiceName]));
    // You might choose to raise an exception or handle this failure differently
  end;
end;

// -----------------------------------------------------------------------------
// Embedded Python Code (Optimized) - Kokoro
// -----------------------------------------------------------------------------
const
  // Initialization script: import dependencies, define globals and model init for Kokoro
  EMBEDDED_PYTHON_SCRIPT_INIT_KOKORO: string =
    'from kokoro import KPipeline' + sLineBreak +
    'import soundfile as sf' + sLineBreak +
    'import torch' + sLineBreak +
    'from loguru import logger' + sLineBreak +
    'import time' + sLineBreak +
    'import os' + sLineBreak +
    'import numpy as np' + sLineBreak + // Import NumPy
    '# Set the PHONEMIZER_ESPEAK_LIBRARY and PHONEMIZER_ESPEAK_PATH environment variables' + sLineBreak +
    'os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = r"C:\\Program Files\\eSpeak NG\\libespeak-ng.dll"' + sLineBreak +
    'os.environ["PHONEMIZER_ESPEAK_PATH"] = r"C:\\Program Files\\eSpeak NG\\espeak-ng.exe"' + sLineBreak +
    'pipeline = None' + sLineBreak + // Global KPipeline instance
    'def init_kokoro(lang_code, device):' + sLineBreak +
    '    global pipeline' + sLineBreak +
    '    pipeline = KPipeline(lang_code=lang_code, device=device)' + sLineBreak;

  // Generation script: batch process text in one go for Kokoro
  EMBEDDED_PYTHON_SCRIPT_GENERATE_OPTIMIZED_KOKORO: string =
    'def generate_audio_optimized(text, voice, speed):' + sLineBreak +
    '    global pipeline' + sLineBreak +
    '    generator = pipeline(text, voice=voice, speed=speed, split_pattern=r"\n+")' + sLineBreak +
    '    for result in generator:' + sLineBreak +
    '        if result.audio is not None:' + sLineBreak +
    '            return result.audio.cpu().numpy().astype(np.float32)' + sLineBreak + // Explicitly convert to float32
    '    return None' + sLineBreak;

// -----------------------------------------------------------------------------
// CUDA and Python Engine Setup Functions (Pre-caching and GPU init)
// -----------------------------------------------------------------------------
procedure SetupCUDAEnvironment;
var
  OldPath: string;
begin
  // Set CUDA-related environment variables
  SetEnvironmentVariable('CUDA_PATH', 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4');
  OldPath := GetEnvironmentVariable('PATH');
  SetEnvironmentVariable('PATH', PChar(OldPath +
    ';C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin' +
    ';C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\libnvvp'));
  // Preload key CUDA and CTranslate2 libraries (and potentially torch/faster-whisper deps)
  LoadLibrary(PChar('C:\Users\user\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\lib\cudnn_graph64_9.dll'));
  // Preload CUDA DLL (if needed) - Adapt path if necessary
  LoadLibrary(PChar('C:\Windows\system32\DriverStore\FileRepository\nvdmui.inf_amd64_fdc98cdf10f69918\nvcuda64.dll'));
  // Add any other necessary CUDA/cuDNN related DLLs if needed based on Kokoro dependencies
end;

procedure InitializeCUDAContext;
begin
  with PythonEngine do
  begin
    // Minimal CUDA initialization for fast startup
    ExecString(AnsiString('import torch; torch.cuda.init(); print("CUDA Device:", torch.cuda.get_device_name(0))'));
    CheckError;
  end;
end;

// Dummy Delphi method (not used in translation)
function DoNothing(Self, Args: PPyObject): PPyObject; cdecl;
begin
  Result := PythonEngine.ReturnNone;
end;

// -----------------------------------------------------------------------------
// Python Engine Initialization (pre-cache engine core and libraries)
// -----------------------------------------------------------------------------
procedure InitializePythonEngine;
begin
  PythonEngine := TPythonEngine.Create(nil);
  PythonEngine.Name := 'PythonEngine';
  // Specify the Python 3.9 DLL and home directory
  PythonEngine.DllName := 'C:\Users\user\AppData\Local\Programs\Python\Python39\python39.dll';
  PythonHome := 'C:\Users\user\AppData\Local\Programs\Python\Python39';
  PythonEngine.SetPythonHome(PWideChar(PythonHome));
  PythonEngine.LoadDll;
  // Create a Python module for any Delphi exports (if needed)
  PythonModule := TPythonModule.Create(nil);
  PythonModule.Engine := PythonEngine;
  PythonModule.ModuleName := 'delphi_module';
  PythonModule.AddMethod('do_nothing', @DoNothing, 'A dummy method.');
end;

// -----------------------------------------------------------------------------
// Initialize and Pre-load Kokoro Model
// -----------------------------------------------------------------------------
procedure InitializeKokoro(const LangCode, Device: string);
var
  pyArgs, pyInitFunc, pyMainModule: PPyObject;
  pyLangCode, pyDevice: PPyObject;
begin
  with PythonEngine do
  begin
    // Run the initialization script (pre-cache libraries and globals)
    ExecString(AnsiString(EMBEDDED_PYTHON_SCRIPT_INIT_KOKORO));
    CheckError;
    pyMainModule := GetMainModule;
    if not Assigned(pyMainModule) then
      raise Exception.Create('Cannot retrieve __main__ module.');
    pyInitFunc := PyObject_GetAttrString(pyMainModule, 'init_kokoro');
    Py_XDECREF(pyMainModule);
    if not Assigned(pyInitFunc) then
      raise Exception.Create('Function init_kokoro not found.');
    // Prepare arguments for Python function
    pyLangCode := PyUnicode_FromWideChar(PWideChar(LangCode), Length(LangCode));
    if not Assigned(pyLangCode) then
      raise Exception.Create('Error creating Python string for language code.');
    pyDevice := PyUnicode_FromWideChar(PWideChar(Device), Length(Device));
    if not Assigned(pyDevice) then
      raise Exception.Create('Error creating Python string for device.');
    pyArgs := MakePyTuple([pyLangCode, pyDevice]);
    Py_XDECREF(pyLangCode);
    Py_XDECREF(pyDevice);
    // Call the Python init function
    PyObject_CallObject(pyInitFunc, pyArgs);
    Py_XDECREF(pyArgs);
    Py_XDECREF(pyInitFunc);
    if PyErr_Occurred <> nil then
    begin
      WriteLn('Error initializing Kokoro model.');
      PyErr_Print; // Print detailed Python error to console
      PyErr_Clear;
      raise Exception.Create('Error initializing Kokoro model (see Python error output).');
    end;
    // Load the optimized generation script and retrieve the generation function
    ExecString(AnsiString(EMBEDDED_PYTHON_SCRIPT_GENERATE_OPTIMIZED_KOKORO));
    CheckError;
    PyFuncGenerate := PyObject_GetAttrString(GetMainModule, 'generate_audio_optimized');
    if not Assigned(PyFuncGenerate) then
      raise Exception.Create('Generation function not found.');
    CheckError;
  end;
end;

// -----------------------------------------------------------------------------
// Fast Path: Call Optimized Generation Function (minimal overhead)
// -----------------------------------------------------------------------------
function CallOptimizedGenerate(const TextToGenerate, Voice: string; Speed: Double): TBytes;
var
  pyArgsTuple, pyResult: PPyObject;
  AudioBytes: TBytes;
  NumPyArray: Variant;
  pVarArray: System.PVarArray; // Pointer to Variant array
  DataPtr: Pointer;
  DataSize: NativeInt;
  VarTypeStr: string;
begin
  // Initialize Result to an empty array
  SetLength(AudioBytes, 0);
  Result := AudioBytes;
  with PythonEngine do
  begin
    // Build tuple: (text, voice, speed)
    pyArgsTuple := MakePyTuple([
      PyUnicode_FromWideChar(PWideChar(TextToGenerate), Length(TextToGenerate)),
      PyUnicode_FromWideChar(PWideChar(Voice), Length(Voice)),
      PyFloat_FromDouble(Speed)
    ]);
    // Call the optimized generation function
    pyResult := PyObject_CallObject(PyFuncGenerate, pyArgsTuple);
    Py_XDECREF(pyArgsTuple);
    // If an error occurred, return an empty byte array.
    if (pyResult = nil) or (PyErr_Occurred <> nil) then
    begin
      WriteLn('Generation Error: pyResult is nil or Python error occurred.');
      PyErr_Print; // Print Python error to console for debugging
      PyErr_Clear;
      Exit; // Exit the function, returning the empty array
    end;
    // Check if the returned object is None (using Python's None object)
    if pyResult = PythonEngine.Py_None then
    begin
      Py_XDECREF(pyResult); // Decrement the reference count of Py_None
      WriteLn('No audio generated (Python returned None).');
      Exit; // Return empty array
    end;
    NumPyArray := PyObjectAsVariant(pyResult);
    VarTypeStr := VarToStr(VarType(NumPyArray)); // Get Variant type as string. Use VarToStr
    WriteLn('Debug: Variant Type: ' + VarTypeStr); // Output Variant type
    Py_XDECREF(pyResult);
    if VarIsEmpty(NumPyArray) or VarIsClear(NumPyArray) then
    begin
      WriteLn('Error: Could not convert audio data to Variant (Variant is Empty or Clear).');
      Exit;
    end;
    if VarIsArray(NumPyArray) and (VarArrayDimCount(NumPyArray) = 1) then
    begin
      pVarArray := VarArrayAsPSafeArray(NumPyArray);
      if Assigned(pVarArray) then
      begin
        DataPtr := pVarArray.Data;
        DataSize := pVarArray.Bounds[0].ElementCount * pVarArray^.ElementSize; // CORRECT
        WriteLn('Debug: DataSize calculated: ' + IntToStr(DataSize) + ' bytes'); // Output DataSize
        SetLength(AudioBytes, DataSize);
        Move(DataPtr^, AudioBytes[0], DataSize);
      end
      else
        WriteLn('Failed to access variant array (pVarArray is nil).');
    end
    else
    begin
      WriteLn('Data format received is incorrect (Not a 1D array or not an array at all).');
    end;
  end;
  Result := AudioBytes;
end;

procedure SaveAudioToFile(const AudioBytes: TBytes; const FilePath: string);
var
  FileStream: TFileStream;
  NumSamples: Integer;
  NumBytes: Integer;
  ChunkID: array[0..3] of Char;
  Format: array[0..3] of Char;
  Subchunk1ID: array[0..3] of Char;
  Subchunk2ID: array[0..3] of Char;
  ChunkSize: Integer;
  Subchunk1Size: Integer;
  AudioFormat: SmallInt;
  NumChannels: SmallInt;
  SampleRate: Integer;
  ByteRate: Integer;
  BlockAlign: SmallInt;
  BitsPerSample: SmallInt;
  Subchunk2Size: Integer;
  i: Integer;
  Sample: Single; // Corrected to Single
  ScaledSample: Int16;
begin
  // Create a file stream to write the WAV file
  FileStream := TFileStream.Create(FilePath, fmCreate);
  try
    // Check if AudioBytes has data before writing header
    if Length(AudioBytes) = 0 then
    begin
      WriteLn('Warning: AudioBytes is empty, WAV file might be invalid.');
      Exit; // Don't create an invalid WAV file.
    end;
    // Calculate the number of samples and total bytes
    NumBytes := Length(AudioBytes);
    NumSamples := NumBytes div SizeOf(Single); // 4 bytes per sample (Single/float32)
    // Prepare WAV header values. PCM *FLOAT* format.
    ChunkID := 'RIFF';
    ChunkSize := 36 + NumBytes; // Total size, excluding RIFF and ChunkSize
    Format := 'WAVE';
    Subchunk1ID := 'fmt ';
    Subchunk1Size := 16; // 16 for PCM. Use 18 for floats, and add extra bytes.
    AudioFormat := 1; // 1 for PCM, 3 for IEEE Float
    NumChannels := 1; // 1 for mono
    SampleRate := 24000; // 24000 Hz
    ByteRate := SampleRate * NumChannels * 2; // NumChannels * BytesPerSample * SampleRate. BytesPerSample = 2 for Int16!
    BlockAlign := NumChannels * 2; // NumChannels * BytesPerSample
    BitsPerSample := 16; // 16 bits for Int16
    Subchunk2ID := 'data';
    Subchunk2Size := NumSamples * NumChannels * 2; // NumChannels * BytesPerSample * NumSamples
    // Debug output for header values
    WriteLn('Debug: WAV Header Values:');
    WriteLn('Debug: ChunkSize: ' + IntToStr(ChunkSize));
    WriteLn('Debug: Subchunk2Size: ' + IntToStr(Subchunk2Size));
    WriteLn('Debug: NumSamples: ' + IntToStr(NumSamples));
    WriteLn('Debug: NumBytes: ' + IntToStr(NumBytes));
    // Write WAV file header
    FileStream.Write(ChunkID, 4);
    FileStream.Write(ChunkSize, 4);
    FileStream.Write(Format, 4);
    FileStream.Write(Subchunk1ID, 4);
    FileStream.Write(Subchunk1Size, 4);
    FileStream.Write(AudioFormat, 2);
    FileStream.Write(NumChannels, 2);
    FileStream.Write(SampleRate, 4);
    FileStream.Write(ByteRate, 4);
    FileStream.Write(BlockAlign, 2);
    FileStream.Write(BitsPerSample, 2);
    FileStream.Write(Subchunk2ID, 4);
    FileStream.Write(Subchunk2Size, 4);
    // Convert float32 to int16 and write the audio data
    for i := 0 to NumSamples - 1 do
    begin
      // Get the float sample
      Sample := Single(PSingle(@AudioBytes[i * SizeOf(Single)])^);
      // Scale and convert to Int16. Clamp to prevent overflow.
      ScaledSample := Round(Sample * 32767.0);
      if ScaledSample > 32767 then
        ScaledSample := 32767
      else if ScaledSample < -32768 then
        ScaledSample := -32768;
      // Write as Int16 (Little Endian)
      FileStream.Write(ScaledSample, 2);
    end;
  finally
    FileStream.Free;
  end;
end;

// -----------------------------------------------------------------------------
// Destroy Python Engine and Clean-up
// -----------------------------------------------------------------------------
procedure DestroyEngine;
begin
  if Assigned(PyFuncGenerate) then
    PythonEngine.Py_XDECREF(PyFuncGenerate);
  if Assigned(PythonModule) then
    PythonModule.Free;
  if Assigned(PythonEngine) then
    PythonEngine.Free;
end;

// -----------------------------------------------------------------------------
// Main Program: Pre-cache engine/core and perform generation
// -----------------------------------------------------------------------------
var
  TotalStopwatch: TStopwatch;
  CreateEngineTime, GenerationTime, DestroyEngineTime, InitKokoroTime, CUDAInitTime: Int64;
  AudioFilePath, TextToSynthesize, Voice: string;
  EngineTimer, GenerationTimer, InputTimer, KokoroInitTimer, CUDAInitTimer, DestroyEngineTimer: TStopwatch;
  Device, LangCode: string;
  AudioBytes: TBytes;
  KokoroModelsBasePath: string; // Moved declaration here
  ModelInstalled, VoiceInstalled: Boolean;
  DownloadConfirmation: string;
begin
  try
    MaskFPUExceptions(True);
    TotalStopwatch := TStopwatch.StartNew;
    SetupCUDAEnvironment; // Preload CUDA/related DLLs
    WriteLn('=== Kokoro Text-to-Speech ===');
    InputTimer := TStopwatch.StartNew;
    Write('Enter text to synthesize: ');
    ReadLn(TextToSynthesize);
    Write('Enter Kokoro voice name (e.g., pf_dora): ');
    ReadLn(Voice);
    LangCode := 'p'; // Hardcoded for Portuguese
    Device := 'cuda'; // Hardcoded for CUDA (GPU)
    InputTimer.Stop;
    WriteLn('User Input Time: ', InputTimer.ElapsedMilliseconds, ' ms');
    KokoroModelsBasePath := TPath.Combine(ExtractFilePath(ParamStr(0)), KOKORO_MODELS_BASE_PATH);
    // Ensure the base model directory exists
    if not EnsureDirectoryExists(KokoroModelsBasePath) then
    begin
      Writeln('Failed to create Kokoro models base directory: ' + KokoroModelsBasePath);
      Exit; // Critical error: Cannot proceed without model directory
    end;
    ModelInstalled := CheckKokoroModelInstallation(KokoroModelsBasePath);
    if not ModelInstalled then
    begin
      Write('Kokoro base model not installed. Do you want to download it? (yes/no): ');
      ReadLn(DownloadConfirmation);
      if SameText(DownloadConfirmation, 'yes') then
      begin
        if DownloadKokoroModelFiles(KokoroModelsBasePath) then // Pass base path
          ModelInstalled := CheckKokoroModelInstallation(KokoroModelsBasePath) // Recheck
        else
        begin
          WriteLn('Failed to download Kokoro model files.');
          Exit;
        end;
      end
      else
        Exit;
    end;
    // Ensure the 'voices' subdirectory exists
    if not EnsureDirectoryExists(TPath.Combine(KokoroModelsBasePath, 'voices')) then
    begin
      Writeln('Failed to create voices subdirectory.');
      Exit;
    end;
    VoiceInstalled := CheckKokoroVoiceInstallation(KokoroModelsBasePath, Voice);
    if not VoiceInstalled then
    begin
      Write('Voice "' + Voice + '" not found. Do you want to download it? (yes/no): ');
      ReadLn(DownloadConfirmation);
      if SameText(DownloadConfirmation, 'yes') then
      begin
        if DownloadKokoroVoice(KokoroModelsBasePath, Voice) then
          VoiceInstalled := CheckKokoroVoiceInstallation(KokoroModelsBasePath, Voice)
        else
          WriteLn('Failed to download voice')
      end;
    end;
    if ModelInstalled and VoiceInstalled then
    begin
      // Pre-initialize Python engine and load required models
      EngineTimer := TStopwatch.StartNew; // Start Engine Timer
      InitializePythonEngine;
      EngineTimer.Stop; // Stop Engine Timer
      CreateEngineTime := EngineTimer.ElapsedMilliseconds;
      CUDAInitTimer := TStopwatch.StartNew; // Start CUDA Init Timer
      InitializeCUDAContext;
      CUDAInitTimer.Stop; // Stop CUDA Init Timer
      CUDAInitTime := CUDAInitTimer.ElapsedMilliseconds;
      KokoroInitTimer := TStopwatch.StartNew;
      InitializeKokoro(LangCode, Device); // Initialize with language and device
      KokoroInitTimer.Stop; // Stop Kokoro Init Timer
      InitKokoroTime := KokoroInitTimer.ElapsedMilliseconds;
      WriteLn(Format('Synthesizing speech for text using voice: %s', [Voice]));
      GenerationTimer := TStopwatch.StartNew;
      AudioBytes := CallOptimizedGenerate(TextToSynthesize, Voice, 1.0); // Generate with text and voice
      GenerationTimer.Stop;
      GenerationTime := GenerationTimer.ElapsedMilliseconds;
      if Length(AudioBytes) > 0 then
      begin
        AudioFilePath := TPath.Combine(ExtractFilePath(ParamStr(0)), 'kokoro_output.wav');
        SaveAudioToFile(AudioBytes, AudioFilePath);
        WriteLn('Audio file saved to: ' + AudioFilePath);
      end
      else
        WriteLn('No audio generated.');
      WriteLn('');
      WriteLn('=== Synthesis Result ===');
      WriteLn('Voice Name: ' + Voice);
      WriteLn('Text: ' + TextToSynthesize);
      WriteLn('');
      WriteLn('--- Performance Metrics (ms) ---');
      WriteLn('Engine creation: ', CreateEngineTime, ' ms');
      WriteLn('CUDA Init: ', CUDAInitTime, ' ms');
      WriteLn('Init Kokoro: ', InitKokoroTime, ' ms');
      WriteLn('Generation call: ', GenerationTime, ' ms');
    end
    else
      Writeln('Synthesis cancelled because model and/or voice are missing.')
  except
    on E: Exception do
      WriteLn(E.ClassName, ': ', E.Message);
  end;
  DestroyEngineTimer := TStopwatch.StartNew; // Start Engine Timer
  DestroyEngine;
  DestroyEngineTimer.Stop; // Stop Engine Timer
  DestroyEngineTime := DestroyEngineTimer.ElapsedMilliseconds;
  WriteLn('Engine Destruct: ', DestroyEngineTime, ' ms');
  TotalStopwatch.Stop;
  WriteLn('Total Program Execution Time: ', TotalStopwatch.ElapsedMilliseconds, ' ms');
  Write('Press Enter to exit...');
  ReadLn;
end.

initialization
  // Pre-cache Python engine to avoid multiple loads
  InitializePythonEngine;

finalization
  DestroyEngine;

end.
But it failed, see the output:
```text
=== Kokoro Text-to-Speech ===
Enter text to synthesize: Elabora relatórios de acompanhamento cronológico para as diferentes unidades do Departamento que propõem contratos.
Enter Kokoro voice name (e.g., pf_dora): pf_dora
User Input Time: 5142 ms
CUDA Device: NVIDIA GeForce GTX 1060 with Max-Q Design
C:\Users\user\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\modules\rnn.py:123: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
  warnings.warn(
C:\Users\user\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\nn\utils\weight_norm.py:143: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.
  WeightNorm.apply(module, name, dim)
Synthesizing speech for text using voice: pf_dora
Debug: Variant Type: 8204
Debug: DataSize calculated: 4003200 bytes
Debug: WAV Header Values:
Debug: ChunkSize: 4003236
Debug: Subchunk2Size: 2001600
Debug: NumSamples: 1000800
Debug: NumBytes: 4003200
Audio file saved to: C:\agents\SOURCE_neural_extern_TTS_qlty_prese_Kokoro\Kokoro_v0_m0_p0\Win64\Debug\kokoro_output.wav

=== Synthesis Result ===
Voice Name: pf_dora
Text: Elabora relatórios de acompanhamento cronológico para as diferentes unidades do Departamento que propõem contratos.

--- Performance Metrics (ms) ---
Engine creation: 186 ms
CUDA Init: 11989 ms
Init Kokoro: 27661 ms
Generation call: 6076 ms
Engine Destruct: 1167 ms
Total Program Execution Time: 55078 ms
Press Enter to exit...
```
MediaInfo:
```text
General
Complete name : C:\agents\SOURCE_neural_extern_TTS_qlty_prese_Kokoro\Kokoro_v0_m0_p0\Win64\Debug\kokoro_output.wav
File size     : 1.91 MiB
```
Solve this once and for all.
Below is a concise explanation of why the WAV file was corrupt and how to fix it, followed by a corrected version of your SaveAudioToFile routine. Once you make these adjustments, Windows Media Player (and most other players) should be able to open the resulting WAV.
Mismatch in Chunk Sizes

You were originally using:

```pascal
ChunkSize := 36 + NumBytes;      // NumBytes = float buffer size in bytes
Subchunk2Size := NumSamples * 2; // writing 16-bit samples
```

But the actual data you write to the file is only NumSamples * 2 bytes (2 bytes per sample). NumBytes is the size of your float32 array in memory (4 bytes per sample), so ChunkSize overstates the size of the file.

Writing 16-bit PCM data from float32 samples

Converting float32 samples to 16-bit is fine, but your header must declare 16-bit PCM and must specify the correct total size of the final data, not the size of the float array.

Because of this inconsistency, the RIFF header declared about 4 MB (ChunkSize: 4003236 in your debug output) while only about 2 MB of PCM data (Subchunk2Size: 2001600) followed it, which matches the 1.91 MiB file size MediaInfo reports. Many players refuse to read a file whose ChunkSize and/or data size are obviously incorrect, and display a "file corrupt" or "format not recognized" error.
The fix has two parts:

- Write consistent values for ChunkSize and Subchunk2Size
- Declare AudioFormat = 1 (PCM), BitsPerSample = 16, and so on

In short:

- NumFloats = number of float samples = Length(AudioBytes) div 4
- You write 2 bytes per sample in the file
- Size of the data chunk = NumFloats * 2
- ChunkSize = 36 + Subchunk2Size
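To make that arithmetic concrete, here is a minimal sketch that recomputes the sizes from the NumBytes value in your debug output. The function names Pcm16DataSize and RiffChunkSize are illustrative helpers, not part of your program, and the 44-byte header assumes the canonical PCM layout with no extra chunks.

```pascal
program WavSizeCheck;

{$APPTYPE CONSOLE}

const
  BYTES_PER_FLOAT = 4;  // float32 sample in memory
  BYTES_PER_PCM16 = 2;  // 16-bit sample on disk
  WAV_HEADER_SIZE = 44; // RIFF (12) + fmt (24) + data header (8)

// Size of the data chunk after converting a float32 buffer to 16-bit PCM.
function Pcm16DataSize(FloatBufferBytes: Cardinal): Cardinal;
begin
  Result := (FloatBufferBytes div BYTES_PER_FLOAT) * BYTES_PER_PCM16;
end;

// RIFF ChunkSize is the file size minus the 8 bytes of 'RIFF' + ChunkSize.
function RiffChunkSize(DataSize: Cardinal): Cardinal;
begin
  Result := 36 + DataSize;
end;

var
  DataSize: Cardinal;
begin
  DataSize := Pcm16DataSize(4003200); // NumBytes from the debug output
  WriteLn('Subchunk2Size: ', DataSize);                   // 2001600
  WriteLn('ChunkSize:     ', RiffChunkSize(DataSize));    // 2001636
  WriteLn('File size:     ', DataSize + WAV_HEADER_SIZE); // 2001644
end.
```

The expected file size of 2001644 bytes is about 1.91 MiB, which agrees with MediaInfo, whereas the header written by the original code claimed a ChunkSize of 4003236.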
Below is a corrected version of SaveAudioToFile. This ensures the WAV header is consistent with the actual data you write.

SaveAudioToFile (PCM 16-bit)

```pascal
procedure SaveAudioToFile(const AudioBytes: TBytes; const FilePath: string);
var
  FileStream: TFileStream;
  FloatDataSize: Integer; // size of all floats in bytes
  NumFloats: Integer;     // number of float32 samples
  ChunkID: array[0..3] of AnsiChar;
  Format: array[0..3] of AnsiChar;
  Subchunk1ID: array[0..3] of AnsiChar;
  Subchunk2ID: array[0..3] of AnsiChar;
  ChunkSize: Cardinal;
  Subchunk1Size: Cardinal;
  AudioFormat: Word;
  NumChannels: Word;
  SampleRate: Cardinal;
  ByteRate: Cardinal;
  BlockAlign: Word;
  BitsPerSample: Word;
  Subchunk2Size: Cardinal;
  i: Integer;
  SampleFloat: Single;
  SampleInt16: SmallInt;
begin
  // If there is no audio, don't bother creating a broken file
  if Length(AudioBytes) = 0 then
  begin
    WriteLn('Warning: AudioBytes is empty, WAV file will not be created.');
    Exit;
  end;

  // Create the file
  FileStream := TFileStream.Create(FilePath, fmCreate);
  try
    // Number of floats in the buffer
    FloatDataSize := Length(AudioBytes);           // total bytes of float buffer
    NumFloats := FloatDataSize div SizeOf(Single); // each float is 4 bytes

    // We are converting from float32 to 16-bit PCM, so:
    // total bytes of data = NumFloats * 2
    Subchunk2Size := NumFloats * 2; // 2 bytes per sample

    // Build standard PCM 16-bit WAV header
    ChunkID := 'RIFF';
    Format := 'WAVE';
    Subchunk1ID := 'fmt ';
    Subchunk2ID := 'data';

    // For linear PCM:
    Subchunk1Size := 16; // PCM header size
    AudioFormat := 1;    // 1 = PCM
    NumChannels := 1;    // 1 channel (mono)
    SampleRate := 24000; // 24 kHz
    BitsPerSample := 16; // 16 bits per sample

    // Bytes per second = sampleRate * numChannels * bytesPerSample
    ByteRate := SampleRate * NumChannels * (BitsPerSample div 8);
    // BlockAlign = numChannels * bytesPerSample
    BlockAlign := NumChannels * (BitsPerSample div 8);
    // Main chunk size = (file size - 8)
    ChunkSize := 36 + Subchunk2Size; // 36 = size of the rest of the header

    // Debug
    WriteLn('--- WAV Header Debug ---');
    WriteLn('Num float32 samples (NumFloats): ', NumFloats);
    WriteLn('Subchunk2Size (actual PCM bytes): ', Subchunk2Size);
    WriteLn('ChunkSize: ', ChunkSize);

    // Write the header
    FileStream.Write(ChunkID[0], 4);
    FileStream.Write(ChunkSize, 4);
    FileStream.Write(Format[0], 4);
    FileStream.Write(Subchunk1ID[0], 4);
    FileStream.Write(Subchunk1Size, 4);
    FileStream.Write(AudioFormat, 2);
    FileStream.Write(NumChannels, 2);
    FileStream.Write(SampleRate, 4);
    FileStream.Write(ByteRate, 4);
    FileStream.Write(BlockAlign, 2);
    FileStream.Write(BitsPerSample, 2);
    FileStream.Write(Subchunk2ID[0], 4);
    FileStream.Write(Subchunk2Size, 4);

    // Convert float32 to int16 and write the PCM samples
    for i := 0 to NumFloats - 1 do
    begin
      // Read one float sample
      SampleFloat := PSingle(@AudioBytes[i * SizeOf(Single)])^;

      // Clamp to [-1.0..+1.0] BEFORE scaling. Clamping the SmallInt after
      // Round would be too late: the assignment itself would already overflow.
      if SampleFloat > 1.0 then
        SampleFloat := 1.0
      else if SampleFloat < -1.0 then
        SampleFloat := -1.0;
      SampleInt16 := Round(SampleFloat * 32767);

      // Write 2 bytes (little-endian)
      FileStream.Write(SampleInt16, SizeOf(SampleInt16));
    end;
  finally
    FileStream.Free;
  end;
end;
```
The points that matter:

- ChunkSize: 36 + Subchunk2Size (not 36 + FloatDataSize)
- Subchunk2Size: the bytes you are actually writing, i.e. NumFloats * 2, since you write 2 bytes per sample
- AudioFormat = 1, BitsPerSample = 16: standard PCM, which Windows Media Player supports
- Conversion: float samples in [-1.0..+1.0] are scaled to the 16-bit range [-32768..32767]

.With these changes, the resulting file’s header will match the size of the raw sample data appended, so Windows Media Player, ffplay, mpv, etc., should all accept and play it correctly.
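If you want to confirm this on disk rather than by ear, a small check along these lines could read the two size fields back and compare them with the real file size. This is a sketch only: WavSizesAreConsistent is a hypothetical helper, and it assumes the canonical 44-byte layout that SaveAudioToFile writes (ChunkSize at offset 4, Subchunk2Size at offset 40, no extra chunks).

```pascal
program CheckWavHeader;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Returns True when both size fields agree with the actual file size.
// Assumes a canonical 44-byte PCM header (an assumption, not a general parser).
function WavSizesAreConsistent(const FilePath: string): Boolean;
var
  Stream: TFileStream;
  ChunkSize, Subchunk2Size: Cardinal;
begin
  Result := False;
  if not FileExists(FilePath) then
    Exit;
  Stream := TFileStream.Create(FilePath, fmOpenRead or fmShareDenyWrite);
  try
    if Stream.Size < 44 then
      Exit; // too short to contain the header
    Stream.Position := 4;
    Stream.ReadBuffer(ChunkSize, SizeOf(ChunkSize));
    Stream.Position := 40;
    Stream.ReadBuffer(Subchunk2Size, SizeOf(Subchunk2Size));
    // ChunkSize excludes 'RIFF' + itself (8 bytes); data size excludes the header
    Result := (Int64(ChunkSize) = Stream.Size - 8) and
              (Int64(Subchunk2Size) = Stream.Size - 44);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount < 1 then
    WriteLn('Usage: CheckWavHeader <file.wav>')
  else if WavSizesAreConsistent(ParamStr(1)) then
    WriteLn('Header sizes match the file size.')
  else
    WriteLn('Header sizes do NOT match the file size.');
end.
```

Run against the original kokoro_output.wav, this check should fail on the ChunkSize comparison, since the header stored 4003236 for a file of roughly 2 MB.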
If for some reason you wanted a 32-bit float WAV, and your player recognizes float WAV (not all do), you would change:

- AudioFormat := 3; // WAVE_FORMAT_IEEE_FLOAT
- BitsPerSample := 32;
- Subchunk2Size := FloatDataSize; // just copy them as-is
- FileStream.Write(AudioBytes[0], FloatDataSize) (no conversion)

ByteRate and BlockAlign need no manual change, because they are computed from BitsPerSample. But for maximum compatibility, 16-bit PCM is by far the safest bet on Windows.
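For completeness, the float variant described above could look roughly like this. SaveFloatWav is a hypothetical helper, not part of your program, and it keeps the same simple 44-byte header. Stricter readers may also expect an extended fmt chunk and a fact chunk for non-PCM formats, which this sketch leaves out.

```pascal
program FloatWavSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: writes the raw float32 buffer as a mono
// IEEE-float WAV, copying the samples without conversion.
procedure SaveFloatWav(const AudioBytes: TBytes; const FilePath: string;
  SampleRate: Cardinal);
const
  BYTES_PER_SAMPLE = 4; // float32, one channel
var
  Stream: TFileStream;
  Tag: array[0..3] of AnsiChar;
  ChunkSize, Subchunk1Size, ByteRate, Subchunk2Size: Cardinal;
  AudioFormat, NumChannels, BlockAlign, BitsPerSample: Word;
begin
  if Length(AudioBytes) = 0 then
    Exit; // nothing to write

  Subchunk2Size := Length(AudioBytes); // floats are copied as-is
  ChunkSize := 36 + Subchunk2Size;     // file size - 8
  Subchunk1Size := 16;
  AudioFormat := 3;                    // WAVE_FORMAT_IEEE_FLOAT
  NumChannels := 1;
  BitsPerSample := 32;
  BlockAlign := BYTES_PER_SAMPLE;
  ByteRate := SampleRate * BYTES_PER_SAMPLE;

  Stream := TFileStream.Create(FilePath, fmCreate);
  try
    Tag := 'RIFF'; Stream.WriteBuffer(Tag, 4);
    Stream.WriteBuffer(ChunkSize, 4);
    Tag := 'WAVE'; Stream.WriteBuffer(Tag, 4);
    Tag := 'fmt '; Stream.WriteBuffer(Tag, 4);
    Stream.WriteBuffer(Subchunk1Size, 4);
    Stream.WriteBuffer(AudioFormat, 2);
    Stream.WriteBuffer(NumChannels, 2);
    Stream.WriteBuffer(SampleRate, 4);
    Stream.WriteBuffer(ByteRate, 4);
    Stream.WriteBuffer(BlockAlign, 2);
    Stream.WriteBuffer(BitsPerSample, 2);
    Tag := 'data'; Stream.WriteBuffer(Tag, 4);
    Stream.WriteBuffer(Subchunk2Size, 4);
    Stream.WriteBuffer(AudioBytes[0], Length(AudioBytes));
  finally
    Stream.Free;
  end;
end;

var
  Demo: TBytes;
begin
  // Four samples of silence: SetLength zero-fills, and 0.0 is all zero bytes
  SetLength(Demo, 4 * SizeOf(Single));
  SaveFloatWav(Demo, 'float_demo.wav', 24000);
  WriteLn('Wrote float_demo.wav');
end.
```

Here the data size equals the float buffer size, so reusing the byte count of the float array in ChunkSize is correct for this format, unlike in the 16-bit case.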
That’s the root cause: the RIFF header claimed more data than was actually present, because ChunkSize reused NumBytes, the size of the float array. Correcting ChunkSize and Subchunk2Size to match the final 16-bit data length resolves the corruption problem. Once you apply the corrected SaveAudioToFile above, you should be able to open the WAV in Windows Media Player without error.