Using AIminify

Using aiminify.minify in your Python code

To compress a model from a Python script, import aiminify and call the minify(...) function, as shown in the example below:
from aiminify import minify
from your_project import model
compressed_model, _ = minify(model)
Depending on the type of model you are compressing, it must be stored using the appropriate method. Since aiminify supports both PyTorch and TensorFlow as backends, the backend-specific storing methods are implemented in the save_model() function.
#
# Using PyTorch as a backend
#
from aiminify import minify
from aiminify import save_model
from your_project import model
# For quantized Torch models, the input shape is needed to save the model
input_shape = (1, 3, 224, 224)  # example; replace with your model's input shape
compressed_model, _ = minify(model, quantization=True)
save_model(compressed_model, "./compressed-pytorch-model.pt", input_shape)
# For non-quantized models, you can save without the input shape
compressed_model, _ = minify(model, quantization=False)
save_model(compressed_model, "./compressed-pytorch-model.pt")
#
# Using TensorFlow as a backend
#
from aiminify import minify
from aiminify import save_model
from your_project import model
compressed_model, _ = minify(model)
save_model(compressed_model, "./compressed-tensorflow-model.keras")

aiminify.minify arguments

| Name | Type and default value | Description |
| --- | --- | --- |
| model | Any, required | The model to compress |
| compression_strength | int(0, 5), default 3 | Compression strength, setting the strength of the pruner |
| training_generator | Any, default None | Training set used for fine-tuning |
| validation_generator | Any, default None | Validation set used for fine-tuning |
| training_function | Callable, defaults to our built-in training function | Function used for training the model |
| optimizer | Any, default None | Optimizer used for fine-tuning |
| loss_function | Any, default None | Loss function used for fine-tuning |
| verbose | int(0, 5), default 0 | Verbosity level |
| quantization | bool, default False | Quantize the model |
| fine_tune | bool, default False | Fine-tune the model after compression |
| precision | str, default 'fp32' | Precision for fine-tuning; the other option is 'mixed' |
| accumulation_steps | int, default 1 | Number of training steps to accumulate gradients before a backward update |
| smart_pruning | bool, default True | Use a smart strategy for determining the number of filters to prune instead of flat pruning x% of all filters |
| debug_mode | bool, default False | Debug logging mode |
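For example, a stronger compression pass with quantization enabled might look like the following sketch; the argument values are illustrative, not recommendations.
from aiminify import minify
from your_project import model

# Illustrative values: maximum pruning strength, quantization on, some logging.
compressed_model, feedback = minify(
    model,
    compression_strength=5,
    quantization=True,
    verbose=1,
)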
The return value of minify is (compressed_model, feedback_dictionary). compressed_model is, as the name suggests, your compressed model, using the same backend as the input (TensorFlow or PyTorch). feedback_dictionary contains logs and miscellaneous messages from the compression algorithm.
training_generator, validation_generator, and loss_function can be specific to the backend you're using. For example, when using PyTorch, the training_generator and validation_generator need to be subclasses of torch.utils.data.Dataset; for TensorFlow, they can be subclasses of tf.data.Dataset. Similarly for loss_function: for PyTorch it can be any member of torch.nn.modules.loss, and for TensorFlow it can be any implementation of tf.keras.losses.Loss.
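As an illustration, the following sketch fine-tunes during compression with the PyTorch backend. It assumes a classification model taking 224x224 RGB inputs; MyDataset and the random tensors are hypothetical stand-ins for your real data.
import torch
from torch.utils.data import Dataset

from aiminify import minify
from your_project import model

# Hypothetical wrapper for illustration; any torch.utils.data.Dataset
# subclass yielding (inputs, targets) pairs works here.
class MyDataset(Dataset):
    def __init__(self, inputs, targets):
        self.inputs = inputs
        self.targets = targets

    def __len__(self):
        return len(self.inputs)

    def __getitem__(self, idx):
        return self.inputs[idx], self.targets[idx]

train_set = MyDataset(torch.randn(256, 3, 224, 224), torch.randint(0, 10, (256,)))
val_set = MyDataset(torch.randn(64, 3, 224, 224), torch.randint(0, 10, (64,)))

compressed_model, feedback = minify(
    model,
    fine_tune=True,
    training_generator=train_set,
    validation_generator=val_set,
    loss_function=torch.nn.CrossEntropyLoss(),
)
print(feedback)  # logs and miscellaneous messages from the compression run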

Custom training functions

It is possible to pass your own training function to AIminify. We expect a function with the following signature for PyTorch models:
def fine_tune_pytorch_model(
    model: torch.nn.Module,
    epochs: int,
    learning_rate: float,
    loss_function: Any = None,
    optimizer: Callable = None,
    training_generator: torch.utils.data.DataLoader = None,
    validation_generator: torch.utils.data.DataLoader = None,
    precision: str = 'fp32',
    accumulation_steps: int = 1,
) -> Tuple[torch.nn.Module, Dict]:
    """
    Fine-tunes a PyTorch model on the provided training dataset for a specified number of epochs.
    Optionally performs validation if a loss function and validation data are provided.

    Args:
        model (torch.nn.Module): The PyTorch model to be fine-tuned.
        epochs (int): Number of training epochs.
        learning_rate (float): Learning rate for the optimizer.
        loss_function (Any, optional): The loss function for training and validation. If not provided,
            it is assumed that the model directly returns the loss.
        optimizer (torch.optim.Optimizer, optional): The optimizer to use.
        training_generator (torch.utils.data.DataLoader, optional): DataLoader for the training set.
            Must provide batches of (inputs, targets) for training.
        validation_generator (torch.utils.data.DataLoader, optional): DataLoader for the validation set.
            Used to evaluate model performance after each epoch.
        precision (str, optional): Precision for training (default is 'fp32').
            Supported values are ['fp32', 'mixed'].
        accumulation_steps (int, optional): Number of training steps to accumulate gradients before a backward update.
            Emulates a larger batch size by accumulating gradients over multiple steps.

    Returns:
        tuple: (model, feedback), where:
            - model (torch.nn.Module): The fine-tuned model.
            - feedback (dict): Contains information about the fine-tuning process, with keys:
                - "fine_tuning": str, provides information if no training set is provided.
                - "success": bool, indicates whether the training was successful.

    Notes:
        - If `training_generator` is None, no training occurs and the model is returned unchanged.
        - If `loss_function` is None, the model is expected to directly return the loss during training.
        - The function logs training and validation losses for each epoch.

    Raises:
        ValueError: if `epochs` is not a positive value.
        ValueError: if `precision` is not one of ['fp32', 'bf16', 'mixed'].
    """
    ...
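For reference, a minimal implementation satisfying this contract might look as follows. This is a sketch, not AIminify's built-in trainer: it covers only the 'fp32' path, assumes the DataLoaders yield (inputs, targets) batches, and defaults to torch.optim.Adam when no optimizer is given.
import torch
import torch.utils.data
from typing import Any, Callable, Dict, Tuple

def my_fine_tune(
    model: torch.nn.Module,
    epochs: int,
    learning_rate: float,
    loss_function: Any = None,
    optimizer: Callable = None,
    training_generator: torch.utils.data.DataLoader = None,
    validation_generator: torch.utils.data.DataLoader = None,
    precision: str = 'fp32',
    accumulation_steps: int = 1,
) -> Tuple[torch.nn.Module, Dict]:
    if epochs <= 0:
        raise ValueError("epochs must be a positive value")
    if precision != 'fp32':
        raise ValueError("this sketch only implements 'fp32'")
    if training_generator is None:
        # Per the contract: without training data the model is returned unchanged.
        return model, {"fine_tuning": "no training set provided", "success": False}

    opt = optimizer or torch.optim.Adam(model.parameters(), lr=learning_rate)
    model.train()
    for epoch in range(epochs):
        opt.zero_grad()
        for step, (inputs, targets) in enumerate(training_generator):
            if loss_function is not None:
                loss = loss_function(model(inputs), targets)
            else:
                # Per the contract the model returns the loss itself; this call
                # assumes it accepts targets as well -- adapt to your model.
                loss = model(inputs, targets)
            # Scale by accumulation_steps so accumulated gradients average out,
            # emulating a larger batch size.
            (loss / accumulation_steps).backward()
            if (step + 1) % accumulation_steps == 0:
                opt.step()
                opt.zero_grad()
        if validation_generator is not None and loss_function is not None:
            model.eval()
            with torch.no_grad():
                val_loss = sum(
                    float(loss_function(model(x), y)) for x, y in validation_generator
                )
            print(f"epoch {epoch + 1}: validation loss {val_loss:.4f}")
            model.train()
    return model, {"success": True}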
And with the following signature for TensorFlow models:
def fine_tune_model(
    model: Model,
    epochs: int,
    learning_rate: float,
    loss_function: Any,
    optimizer: Optional[tf.keras.optimizers.Optimizer] = None,
    training_generator: Optional[Union[tf.keras.utils.Sequence, tf.data.Dataset]] = None,
    validation_generator: Optional[Union[tf.keras.utils.Sequence, tf.data.Dataset]] = None,
    precision: str = 'fp32',
    accumulation_steps: int = 1,
    verbose: int = 0,
) -> Tuple[Model, Dict[str, Any]]:
    """
    Fine tuning of a model after pruning or other methods used for compression.

    Parameters
    ----------
    model : Model
        The original model.
    epochs : int
        Number of epochs to fine tune.
    learning_rate : float
        Learning rate for fine tuning the model.
    loss_function : Any
        Loss function used to compile the model.
    optimizer : tf.keras.optimizers.Optimizer, optional
        The optimizer class to use.
    training_generator : Union[tf.keras.utils.Sequence, tf.data.Dataset], optional
        A data generator containing the training set.
    validation_generator : Union[tf.keras.utils.Sequence, tf.data.Dataset], optional
        Validation data generator. The default is None.
    precision : str, optional
        Precision for training (default is 'fp32'). Supported values are ['fp32', 'mixed'].
    accumulation_steps : int, optional
        Number of steps to accumulate gradients.
    verbose : int, optional
        Verbosity level passed to the model.

    Returns
    -------
    (Model, dict)
        A tuple containing the fine-tuned model and a feedback dictionary.
    """
    ...