TTS Warnings and Fixes
Created: January 13, 2025
```
C:\Users\muham\PycharmProjects\coqui-XTTS.venv\Scripts\python.exe C:\Users\muham\PycharmProjects\coqui-XTTS\tts_server.py
C:\Users\muham\PycharmProjects\coqui-XTTS.venv\Lib\site-packages\TTS\api.py:70: UserWarning: `gpu` will be deprecated. Please use `tts.to(device)` instead.
  warnings.warn("`gpu` will be deprecated. Please use `tts.to(device)` instead.")
tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
Using model: xtts
C:\Users\muham\PycharmProjects\coqui-XTTS.venv\Lib\site-packages\TTS\utils\io.py:54: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  return torch.load(f, map_location=map_location, **kwargs)
GPT2InferenceModel has generative capabilities, as `prepare_inputs_for_generation` is explicitly overwritten. However, it doesn't directly inherit from `GenerationMixin`. From 👉v4.50👈 onwards, `PreTrainedModel` will NOT inherit from `GenerationMixin`, and this model will lose the ability to call `generate` and other related functions.
  - If you're using `trust_remote_code=True`, you can get rid of this warning by loading the model with an auto class. See https://huggingface.co/docs/transformers/en/model_doc/auto#auto-classes
  - If you are the owner of the model architecture code, please modify your model class to inherit from `GenerationMixin` (after `PreTrainedModel`, otherwise you'll get an exception).
Text splitted to sentences.
['Lisan al-Ghaib (also spelled Lisān al-Ġayb or Lisan ul Ghaib) is a title given to the Prophet Muhammad in Shia.']
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Processing time: 9.72713017463684
Real-time factor: 1.07542729818864
```
These are the warnings I'm getting while running Coqui TTS locally. Please help me troubleshoot them.
Below is a breakdown of each warning you’re seeing and possible steps to address (or safely ignore) them. In many cases, these warnings won’t break your code immediately, but they foreshadow future deprecations or potential bugs.
**1. "`gpu` will be deprecated. Please use `tts.to(device)` instead."**

Coqui TTS used to take a `gpu` parameter (e.g. `TTS(gpu=True)`), but it is being phased out in favor of the standard PyTorch way of placing a model on a device (e.g. `tts.to("cuda")`).

**Fix:** Remove `gpu=True` (or any `gpu` argument) from your code and move the model explicitly:

```python
tts = TTS(model_path=...)
tts.to("cuda")  # or "cpu", depending on your environment
```

Once the parameter is removed from the library, code that still passes `gpu` will stop working.
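For a fuller picture, here is a minimal sketch of that pattern with device auto-detection. It assumes the standard `TTS.api` entry point and reuses the XTTS v2 model id from your log:

```python
import torch
from TTS.api import TTS

# Choose the device explicitly instead of passing gpu=True
device = "cuda" if torch.cuda.is_available() else "cpu"

tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.to(device)  # replaces the deprecated gpu=True flag
```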
**2. `torch.load` with `weights_only=False` FutureWarning**

PyTorch is warning that loading entire pickled model objects can be risky if the source is untrusted. In future versions, `weights_only=True` will be the default for `torch.load`, which means it will only load the weights rather than arbitrary pickled Python objects.

**Fix:** Make sure your code (and the library code that calls `torch.load`) supports loading weights only:

```python
# Example usage with weights_only=True
checkpoint = torch.load(model_file, map_location="cpu", weights_only=True)
```

Otherwise, you might have to wait until Coqui TTS (or any library you rely on) officially supports this option.
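If you do switch to `weights_only=True` and the XTTS checkpoint contains config objects that aren't plain tensors, the warning itself names the escape hatch: `torch.serialization.add_safe_globals`. A sketch, assuming PyTorch 2.4+ and that the checkpoint only references `XttsConfig` (the filename is a placeholder, and your checkpoint may reference additional classes, which you would allowlist the same way):

```python
import torch
from TTS.tts.configs.xtts_config import XttsConfig

# Allowlist the non-tensor class(es) stored inside the checkpoint so
# weights_only=True loading can still reconstruct them during unpickling.
torch.serialization.add_safe_globals([XttsConfig])

checkpoint = torch.load("model.pth", map_location="cpu", weights_only=True)
```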
**3. "`GPT2InferenceModel` has generative capabilities but does NOT inherit from `GenerationMixin`..."**

Hugging Face Transformers is alerting you that models deriving from `PreTrainedModel` will soon lose the built-in `.generate()` functionality unless they explicitly inherit from `GenerationMixin`. This is part of Transformers' internal design changes from v4.50 onwards.

**Fix (if you own the `GPT2InferenceModel` class):** Make it inherit from `GenerationMixin` after `PreTrainedModel` (the order matters, otherwise you'll get an exception):

```python
from transformers import PreTrainedModel, GenerationMixin  # import path may vary by version

class GPT2InferenceModel(PreTrainedModel, GenerationMixin):
    ...
```

This keeps `.generate()` and related methods working. If you never call `.generate()`, you can safely ignore the warning. If you do rely on `.generate()` but don't own the class, consider loading the model with `trust_remote_code=True` and an auto class, if the model host supports it:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path_or_id_to_model",
    trust_remote_code=True,
)
```
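Since the behavior change is tied to a specific release, one pragmatic option is a startup check. A small sketch, where the v4.50 threshold comes straight from the warning text and `packaging` ships as a transformers dependency:

```python
import transformers
from packaging.version import Version

# The PreTrainedModel/GenerationMixin split lands in transformers v4.50.
if Version(transformers.__version__) >= Version("4.50"):
    print("transformers >= 4.50 detected: custom models must inherit "
          "GenerationMixin explicitly to keep .generate() working.")
```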
**4. "The attention mask is not set and cannot be inferred from input because pad token is same as eos token."**

This occurs when a Transformer-based model (often GPT-2 style) sees a `pad_token_id` that's the same as the `eos_token_id`. Because the library cannot infer where actual padding ends and actual end-of-sequence begins, you might get unexpected generation or attention behavior.

**Fix:** Pass an explicit `attention_mask`. For example:

```python
inputs = tokenizer("some text here", padding=True, return_tensors="pt")
# returns something like {'input_ids': ..., 'attention_mask': ...}

# Then feed these into the model
outputs = model(**inputs)
```
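If you call `.generate()` rather than a plain forward pass, the same idea applies. A hedged sketch building on the `inputs` dict above (pinning `pad_token_id` explicitly is an assumption about your setup, not something shown in your log):

```python
# Pass the mask to generate() as well, and pin pad_token_id explicitly
# so the model can tell padding apart from end-of-sequence.
outputs = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    pad_token_id=tokenizer.eos_token_id,
)
```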
**Summary**

- Remove `gpu=True` (or the `gpu` arg) in Coqui TTS and use `tts.to("cuda")` (or `"cpu"`) instead.
- Move to the `torch.load(weights_only=True)` flow, depending on whether you trust your model source and whether the library is prepared to handle weight-only loading.
- Either ignore the `.generate()` warning, or modify your custom model class to inherit from `GenerationMixin` if you rely on generation capabilities.
- Pass an explicit `attention_mask` where the model usage and code path allow it, which prevents confusion between padding and the end-of-sequence token.

Generally, these are all warnings; none should completely break your code immediately. They do indicate future deprecations or possibly unexpected behavior, so it's a good idea to address them sooner rather than later.
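In the meantime, if you want quieter logs without hiding anything new, Python's standard `warnings` filters can target just these known messages. A sketch, where the message pattern and module name are assumptions based on your log output:

```python
import warnings

# Silence only the two known, already-understood warnings; anything
# new will still surface in the console.
warnings.filterwarnings("ignore", message=".*gpu.*will be deprecated.*")
warnings.filterwarnings("ignore", category=FutureWarning, module="TTS.utils.io")
```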