Layers
Base layers
- class audiotools.ml.layers.base.BaseModel
Bases: Module
This is a class that adds useful save/load functionality to a torch.nn.Module object. BaseModel objects can be saved as torch.package easily, making them easy to port between machines without requiring many dependencies. Files can also be saved as just weights, in the standard way.
>>> import tempfile
>>> import torch
>>> from torch import nn
>>> from audiotools import ml
>>>
>>> class Model(ml.BaseModel):
>>>     def __init__(self, arg1: float = 1.0):
>>>         super().__init__()
>>>         self.arg1 = arg1
>>>         self.linear = nn.Linear(1, 1)
>>>
>>>     def forward(self, x):
>>>         return self.linear(x)
>>>
>>> # Assumed helper (not defined in the original example): seed, then run.
>>> def seed_and_run(model, *args, **kwargs):
>>>     torch.manual_seed(0)
>>>     return model(*args, **kwargs)
>>>
>>> model1 = Model()
>>> x = torch.randn(10, 1)  # assumed input; the last dimension must be 1
>>> out1 = seed_and_run(model1, x)
>>>
>>> with tempfile.NamedTemporaryFile(suffix=".pth") as f:
>>>     # Save with the defaults, then load the file back.
>>>     model1.save(
>>>         f.name,
>>>     )
>>>     model2 = Model.load(f.name)
>>>     out2 = seed_and_run(model2, x)
>>>     assert torch.allclose(out1, out2)
>>>
>>>     # Save as a package, reload, re-save as weights only, reload again.
>>>     model1.save(f.name, package=True)
>>>     model2 = Model.load(f.name)
>>>     model2.save(f.name, package=False)
>>>     model3 = Model.load(f.name)
>>>     out3 = seed_and_run(model3, x)
>>>
>>> with tempfile.TemporaryDirectory() as d:
>>>     model1.save_to_folder(d, {"data": 1.0})
>>>     Model.load_from_folder(d)
- EXTERN = ['audiotools.**', 'tqdm', '__main__', 'numpy.**', 'julius.**', 'torchaudio.**', 'scipy.**', 'einops']
Names of libraries that are external to the torch.package saving mechanism. Source code from these libraries will not be packaged into the model. Users of this class can change the list by editing model.EXTERN.
- INTERN = []
Names of libraries that are internal to the torch.package saving mechanism. Source code from these libraries will be saved alongside the model.
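The entries in EXTERN mix plain names such as 'tqdm' with glob patterns such as 'numpy.**'. The sketch below illustrates how such patterns could be matched against module names, assuming the torch.package pattern rules in which `*` stands for part of one dotted segment and `**` stands for any number of whole segments. The helpers `glob_to_regex` and `is_listed` are hypothetical and are not part of audiotools or torch.

```python
import re


def glob_to_regex(pattern):
    """Translate a module glob such as 'numpy.**' into a compiled regex (hypothetical helper)."""
    parts = []
    for segment in pattern.split("."):
        if segment == "**":
            # '**' stands for zero or more whole dotted segments.
            parts.append(r"(\.[^.]+)*")
        elif "**" in segment:
            raise ValueError("'**' must be an entire segment")
        else:
            # '*' inside a segment matches any run of characters except the dot.
            parts.append(r"\." + "[^.]*".join(re.escape(p) for p in segment.split("*")))
    return re.compile("".join(parts))


def is_listed(module_name, patterns):
    """Return True if module_name matches any pattern in the list."""
    # Prefix with '.' so the first segment is handled like every other one.
    return any(glob_to_regex(p).fullmatch("." + module_name) for p in patterns)


EXTERN = ["audiotools.**", "tqdm", "__main__", "numpy.**",
          "julius.**", "torchaudio.**", "scipy.**", "einops"]

print(is_listed("numpy.linalg", EXTERN))  # covered by 'numpy.**'
print(is_listed("tqdm.auto", EXTERN))     # plain 'tqdm' names only the top-level module
```

Under these assumed rules, a plain entry covers only the module it names, while a `.**` entry also covers every submodule.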
- property device
Gets the device the model is on by looking at the device of the first parameter. May not be valid if model is split across multiple devices.
- classmethod load(location: str, *args, package_name: Optional[str] = None, strict: bool = False, **kwargs)
Load model from a path. Tries first to load as a package, and if that fails, tries to load as weights. The arguments to the class are specified inside the model weights file.
- Parameters
location (str) – Path to file.
package_name (str, optional) – Name of package, by default cls.__name__.
strict (bool, optional) – Whether to require an exact match between the saved keys and the model's keys; unmatched keys are ignored when False, by default False
kwargs (dict) – Additional keyword arguments to the model instantiation, if not loading from package.
- Returns
A model that inherits from BaseModel.
- Return type
BaseModel
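The "package first, weights second" behaviour described for load is a try-then-fall-back pattern. The following is a minimal sketch of that pattern only, not the library's implementation: `load_with_fallback`, `load_as_package`, and `load_as_weights` are hypothetical names, and the two loaders are stand-ins that do not read any file.

```python
from typing import Any, Callable, Sequence


def load_with_fallback(location: str, loaders: Sequence[Callable[[str], Any]]) -> Any:
    """Try each loader in order and return the first successful result."""
    errors = []
    for loader in loaders:
        try:
            return loader(location)
        except Exception as exc:  # broad on purpose: any failure triggers the fallback
            errors.append(f"{loader.__name__}: {exc}")
    raise RuntimeError(f"Could not load {location!r}: " + "; ".join(errors))


# Hypothetical stand-ins for a package loader and a weights loader.
def load_as_package(location: str) -> dict:
    raise ValueError("not a package archive")


def load_as_weights(location: str) -> dict:
    return {"format": "weights", "location": location}


result = load_with_fallback("model.pth", [load_as_package, load_as_weights])
print(result["format"])
```

Collecting the error from every attempt keeps the final exception informative when no loader succeeds.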
- classmethod load_from_folder(folder: Union[str, Path], package: bool = True, strict: bool = False, **kwargs)
Loads the model from a folder generated by audiotools.ml.layers.base.BaseModel.save_to_folder(). Like that function, this one looks for a subfolder that has the name of the class (e.g. folder/generator/[package, weights].pth if the model name was Generator).
- Parameters
folder (Union[str, Path]) – Path to the folder that save_to_folder() wrote to.
package (bool, optional) – Whether to use torch.package to load the model, loading the model from package.pth, by default True
strict (bool, optional) – Whether to require an exact match between the saved keys and the model's keys; unmatched keys are ignored when False, by default False
- Returns
tuple of model and extra data as saved by audiotools.ml.layers.base.BaseModel.save_to_folder().
- Return type
tuple
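The folder layout described above can be sketched with pathlib. This is an illustration under one assumption: the subfolder is the lowercased class name, as the folder/generator/ example for a class named Generator suggests. `resolve_model_path` is a hypothetical helper and not part of audiotools.

```python
from pathlib import Path
from typing import Union


def resolve_model_path(folder: Union[str, Path], class_name: str, package: bool = True) -> Path:
    """Hypothetical helper: locate the saved model file for a given class name."""
    # Assumption: the subfolder is the lowercased class name.
    filename = "package.pth" if package else "weights.pth"
    return Path(folder) / class_name.lower() / filename


print(resolve_model_path("runs/exp1", "Generator"))
print(resolve_model_path("runs/exp1", "Generator", package=False))
```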
- save(path: str, metadata: Optional[dict] = None, package: bool = True, intern: list = [], extern: list = [], mock: list = [])
Saves the model, either as a torch package, or just as weights, alongside some specified metadata.
- Parameters
path (str) – Path to save model to.
metadata (dict, optional) – Any metadata to save alongside the model, by default None
package (bool, optional) – Whether to use torch.package to save the model in a format that is portable, by default True
intern (list, optional) – List of additional libraries that are internal to the model, used with torch.package, by default []
extern (list, optional) – List of additional libraries that are external to the model, used with torch.package, by default []
mock (list, optional) – List of libraries to mock, used with torch.package, by default []
- Returns
Path to saved model.
- Return type
str
- save_to_folder(folder: Union[str, Path], extra_data: Optional[dict] = None)
Dumps a model into a folder, as both a package and as weights, as well as anything specified in extra_data. extra_data is a dictionary of other pickleable objects, with the keys being the paths to save them in. The model is saved under a subfolder specified by the name of the class (e.g. folder/generator/[package, weights].pth if the model name was Generator).
>>> with tempfile.TemporaryDirectory() as d:
>>>     extra_data = {
>>>         "optimizer.pth": optimizer.state_dict()
>>>     }
>>>     model.save_to_folder(d, extra_data)
>>>     Model.load_from_folder(d)
- Parameters
folder (Union[str, Path]) – Folder to save the model into.
extra_data (dict, optional) – Additional pickleable objects to save, keyed by the path to save each one under, by default None
- Returns
Path to folder
- Return type
str
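The extra_data convention (keys are paths, values are pickleable objects) can be sketched as a round trip. This sketch makes two assumptions: the subfolder is the lowercased class name, and the standard pickle module stands in for whatever serializer the library actually uses. `dump_extra_data` and `read_extra_data` are hypothetical helpers.

```python
import pickle
import tempfile
from pathlib import Path


def dump_extra_data(folder, class_name, extra_data=None):
    """Hypothetical helper: write each value under folder/<class name>/<key>."""
    target = Path(folder) / class_name.lower()
    target.mkdir(parents=True, exist_ok=True)
    for relative_path, obj in (extra_data or {}).items():
        # pickle is a stand-in here; the library may use a different serializer.
        with open(target / relative_path, "wb") as f:
            pickle.dump(obj, f)
    return target


def read_extra_data(folder, class_name, skip=("package.pth", "weights.pth")):
    """Hypothetical helper: read back every file except the model files."""
    target = Path(folder) / class_name.lower()
    loaded = {}
    for path in sorted(target.iterdir()):
        if path.is_file() and path.name not in skip:
            with open(path, "rb") as f:
                loaded[path.name] = pickle.load(f)
    return loaded


with tempfile.TemporaryDirectory() as d:
    dump_extra_data(d, "Generator", {"optimizer.pth": {"lr": 1e-4, "step": 10}})
    restored = read_extra_data(d, "Generator")

print(restored)
```

Skipping package.pth and weights.pth on the way back keeps the model files separate from the extra data.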
- training: bool
Spectral gate
- class audiotools.ml.layers.spectral_gate.SpectralGate(n_freq: int = 3, n_time: int = 5)
Bases: Module
Spectral gating algorithm for noise reduction, as in Audacity/Ocenaudio. The steps are as follows:
1. An FFT is calculated over the noise audio clip
2. Statistics are calculated over the FFT of the noise (in frequency)
3. A threshold is calculated based upon the statistics of the noise (and the desired sensitivity of the algorithm)
4. An FFT is calculated over the signal
5. A mask is determined by comparing the signal FFT to the threshold
6. The mask is smoothed with a filter over frequency and time
7. The mask is applied to the FFT of the signal, and the result is inverted back to the time domain
Implementation inspired by Tim Sainburg’s noisereduce:
https://timsainburg.com/noise-reduction-python.html
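The statistics, threshold, and mask steps above can be illustrated on toy data. The sketch below is a simplification and not the layer's implementation: it starts from magnitude spectrograms that are already computed, works on linear magnitudes, uses a hard 0/1 mask, and skips the smoothing step. It assumes the threshold is the per-frequency noise mean plus n_std standard deviations. The names `noise_threshold` and `gate_mask` are hypothetical.

```python
from statistics import mean, pstdev


def noise_threshold(noise_mag, n_std=3.0):
    """Per-frequency threshold: mean + n_std * std of the noise magnitudes over time."""
    # noise_mag[f][t] is the magnitude of frequency bin f at frame t.
    return [mean(row) + n_std * pstdev(row) for row in noise_mag]


def gate_mask(signal_mag, threshold):
    """1.0 where the signal exceeds the noise threshold for its bin, else 0.0."""
    return [
        [1.0 if value > thresh else 0.0 for value in row]
        for row, thresh in zip(signal_mag, threshold)
    ]


# Toy magnitude spectrograms: 2 frequency bins, 4 frames each.
noise_mag = [[1.0, 1.0, 1.0, 1.0], [2.0, 4.0, 2.0, 4.0]]
signal_mag = [[0.5, 3.0, 1.0, 9.0], [5.0, 7.0, 2.0, 6.5]]

threshold = noise_threshold(noise_mag, n_std=3.0)
mask = gate_mask(signal_mag, threshold)

# Applying the mask zeroes every bin that does not rise above the noise floor.
gated = [[m * v for m, v in zip(m_row, v_row)] for m_row, v_row in zip(mask, signal_mag)]
print(gated)
```

Raising n_std lifts the threshold, so fewer bins pass the gate and more of the signal is treated as noise.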
- Parameters
n_freq (int, optional) – Number of frequency bins to smooth by, by default 3
n_time (int, optional) – Number of time bins to smooth by, by default 5
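To give a feel for what n_freq and n_time control, the sketch below builds a separable triangular smoothing kernel of the kind used in noisereduce-style spectral gating. That SpectralGate builds its filter exactly this way is an assumption; `triangular_window` and `smoothing_kernel` are hypothetical helpers.

```python
def triangular_window(n):
    """Symmetric ramp of length 2 * n + 1 that peaks at 1.0 in the middle."""
    ramp = [(i + 1) / (n + 1) for i in range(n)]
    return ramp + [1.0] + ramp[::-1]


def smoothing_kernel(n_freq=3, n_time=5):
    """Outer product of two triangular windows, normalized to sum to 1."""
    freq_win = triangular_window(n_freq)
    time_win = triangular_window(n_time)
    kernel = [[f * t for t in time_win] for f in freq_win]
    total = sum(sum(row) for row in kernel)
    # Normalizing keeps the overall level of the mask unchanged after smoothing.
    return [[value / total for value in row] for row in kernel]


kernel = smoothing_kernel(n_freq=3, n_time=5)
print(len(kernel), len(kernel[0]))  # rows span frequency, columns span time
```

Convolving the mask with such a kernel softens its edges, so larger n_freq and n_time spread each gating decision over more neighbouring bins.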
- forward(audio_signal: AudioSignal, nz_signal: AudioSignal, denoise_amount: float = 1.0, n_std: float = 3.0, win_length: int = 2048, hop_length: int = 512)
Perform noise reduction.
- Parameters
audio_signal (AudioSignal) – Audio signal that noise will be removed from.
nz_signal (AudioSignal) – Noise signal to compute noise statistics from.
denoise_amount (float, optional) – Amount to denoise by, by default 1.0
n_std (float, optional) – Number of standard deviations above the noise mean at which to place the threshold, by default 3.0
win_length (int, optional) – Length of window for STFT, by default 2048
hop_length (int, optional) – Hop length for STFT, by default 512
- Returns
Denoised audio signal.
- Return type
AudioSignal
- training: bool