Check the total number of parameters in a PyTorch model

Deep Learning · PyTorch

Deep Learning Problem Overview


How to count the total number of parameters in a PyTorch model? Something similar to model.count_params() in Keras.

Deep Learning Solutions


Solution 1 - Deep Learning

PyTorch doesn't have a function to calculate the total number of parameters as Keras does, but it's possible to sum the number of elements for every parameter group:

pytorch_total_params = sum(p.numel() for p in model.parameters())

If you want to calculate only the trainable parameters:

pytorch_total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
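
For example, freezing a layer changes only the trainable count. A quick sanity check on a small, purely illustrative two-layer model:

import torch.nn as nn

model = nn.Sequential(nn.Linear(10, 20), nn.Linear(20, 5))  # hypothetical toy model
model[0].weight.requires_grad = False  # freeze the first weight matrix (10*20 = 200 elements)

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(total, trainable)  # 325 125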

Answer inspired by this answer on PyTorch Forums.

Note: I'm answering my own question. If anyone has a better solution, please share with us.

Solution 2 - Deep Learning

To get the parameter count of each layer, as in Keras, PyTorch has model.named_parameters(), which returns an iterator over both the parameter name and the parameter itself.

Here is an example:

from prettytable import PrettyTable

def count_parameters(model):
    table = PrettyTable(["Modules", "Parameters"])
    total_params = 0
    for name, parameter in model.named_parameters():
        if not parameter.requires_grad:
            continue  # skip frozen parameters
        params = parameter.numel()
        table.add_row([name, params])
        total_params += params
    print(table)
    print(f"Total Trainable Params: {total_params}")
    return total_params
    
count_parameters(net)

The output would look something like this:

+-------------------+------------+
|      Modules      | Parameters |
+-------------------+------------+
| embeddings.weight |   922866   |
|    conv1.weight   |  1048576   |
|     conv1.bias    |    1024    |
|     bn1.weight    |    1024    |
|      bn1.bias     |    1024    |
|    conv2.weight   |  2097152   |
|     conv2.bias    |    1024    |
|     bn2.weight    |    1024    |
|      bn2.bias     |    1024    |
|    conv3.weight   |  2097152   |
|     conv3.bias    |    1024    |
|     bn3.weight    |    1024    |
|      bn3.bias     |    1024    |
|    lin1.weight    |  50331648  |
|     lin1.bias     |    512     |
|    lin2.weight    |   265728   |
|     lin2.bias     |    519     |
+-------------------+------------+
Total Trainable Params: 56773369

Solution 3 - Deep Learning

If you want to avoid double counting shared parameters, you can deduplicate by storage address with torch.Tensor.data_ptr(). E.g.:

sum(dict((p.data_ptr(), p.numel()) for p in model.parameters()).values())

Here's a more verbose implementation that includes an option to filter out non-trainable parameters:

import torch

def numel(m: torch.nn.Module, only_trainable: bool = False):
    """
    returns the total number of parameters used by `m` (only counting
    shared parameters once); if `only_trainable` is True, then only
    includes parameters with `requires_grad = True`
    """
    parameters = list(m.parameters())
    if only_trainable:
        parameters = [p for p in parameters if p.requires_grad]
    unique = {p.data_ptr(): p for p in parameters}.values()
    return sum(p.numel() for p in unique)
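
To see why the deduplication matters, consider a contrived sketch (the TiedModel class below is hypothetical): two distinct nn.Parameter objects wrapping the same storage both show up in model.parameters(), yet they occupy memory only once.

import torch
import torch.nn as nn

class TiedModel(nn.Module):
    def __init__(self):
        super().__init__()
        w = torch.randn(10, 10)
        # two distinct Parameter objects sharing one underlying storage
        self.w_in = nn.Parameter(w)
        self.w_out = nn.Parameter(w)

model = TiedModel()
naive = sum(p.numel() for p in model.parameters())                             # counts the storage twice
deduped = sum({p.data_ptr(): p.numel() for p in model.parameters()}.values())  # counts it once
print(naive, deduped)  # 200 100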

Solution 4 - Deep Learning

If you want to calculate the number of weights and biases in each layer without instantiating the model, you can simply load the saved state_dict file and iterate over the resulting collections.OrderedDict like so:

import torch

tensor_dict = torch.load('model.dat', map_location='cpu')  # OrderedDict of layer name -> tensor
for layer_tensor_name, tensor in tensor_dict.items():
    print('{}: {}'.format(layer_tensor_name, torch.numel(tensor)))

You'll get something like:

conv1.weight: 312
conv1.bias: 26
batch_norm1.weight: 26
batch_norm1.bias: 26
batch_norm1.running_mean: 26
batch_norm1.running_var: 26
conv2.weight: 2340
conv2.bias: 10
batch_norm2.weight: 10
batch_norm2.bias: 10
batch_norm2.running_mean: 10
batch_norm2.running_var: 10
fcs.layers.0.weight: 135200
fcs.layers.0.bias: 260
fcs.layers.1.weight: 33800
fcs.layers.1.bias: 130
fcs.batch_norm_layers.0.weight: 260
fcs.batch_norm_layers.0.bias: 260
fcs.batch_norm_layers.0.running_mean: 260
fcs.batch_norm_layers.0.running_var: 260
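
Keep in mind that a state_dict also contains buffers, such as the running_mean and running_var entries above, which are not learnable parameters. Summing everything therefore gives a total element count rather than a strict parameter count. A minimal sketch, reusing tensor_dict from the snippet above:

total_elements = sum(torch.numel(t) for t in tensor_dict.values())
print('Total elements (parameters + buffers): {}'.format(total_elements))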

Solution 5 - Deep Learning

You can use torchsummary to do the same thing. It's just two lines of code.

from torchsummary import summary

summary(model, input_size)
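
For example, for an image model that expects 3x224x224 inputs (this shape is an assumption; pass your model's actual input size), torchsummary prints per-layer output shapes and parameter counts along with the totals:

from torchsummary import summary

summary(model, input_size=(3, 224, 224))  # input_size excludes the batch dimension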

Solution 6 - Deep Learning

Another possible solution, printing a per-layer parameter count:

def model_summary(model):
    print("model_summary")
    print()
    print("Layer_name" + "\t" * 7 + "Number of Parameters")
    print("=" * 100)
    model_parameters = [layer for layer in model.parameters() if layer.requires_grad]
    layer_names = [child for child in model.children()]
    j = 0
    total_params = 0
    for layer in layer_names:
        print()
        # a layer with a bias owns two parameter tensors (weight and bias);
        # a layer without one owns a single tensor
        try:
            has_bias = layer.bias is not None
        except AttributeError:
            has_bias = False
        if has_bias:
            param = model_parameters[j].numel() + model_parameters[j + 1].numel()
            j = j + 2
        else:
            param = model_parameters[j].numel()
            j = j + 1
        print(str(layer) + "\t" * 3 + str(param))
        total_params += param
    print("=" * 100)
    print(f"Total Params:{total_params}")

model_summary(net)

This would give output similar to the following:

model_summary

Layer_name							Number of Parameters
====================================================================================================
										
Conv2d(1, 6, kernel_size=(3, 3), stride=(1, 1))			    60
Conv2d(6, 16, kernel_size=(3, 3), stride=(1, 1))			880
Linear(in_features=576, out_features=120, bias=True)	    69240
Linear(in_features=120, out_features=84, bias=True)			10164
Linear(in_features=84, out_features=10, bias=True)			850
====================================================================================================
Total Params:81194

Solution 7 - Deep Learning

Straight and simple:

print(sum(p.numel() for p in model.parameters()))

Solution 8 - Deep Learning

There is a built-in utility function that flattens an iterable of tensors into a single vector, torch.nn.utils.parameters_to_vector; combine it with torch.numel:

torch.nn.utils.parameters_to_vector(model.parameters()).numel()

Or, shorter, with a named import (from torch.nn.utils import parameters_to_vector):

parameters_to_vector(model.parameters()).numel()

Note that this concatenates a flat copy of every parameter, so it allocates extra memory and requires all parameters to be on the same device.

Solution 9 - Deep Learning

As @fábio-perez mentioned, there is no such built-in function in PyTorch.

However, I found this to be a compact and neat way of achieving the same result:

num_of_parameters = sum(map(torch.numel, model.parameters()))

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Fábio Perez | View Question on Stackoverflow
Solution 1 - Deep Learning | Fábio Perez | View Answer on Stackoverflow
Solution 2 - Deep Learning | Thong Nguyen | View Answer on Stackoverflow
Solution 3 - Deep Learning | teichert | View Answer on Stackoverflow
Solution 4 - Deep Learning | Zhanwen Chen | View Answer on Stackoverflow
Solution 5 - Deep Learning | Srujan2k21 | View Answer on Stackoverflow
Solution 6 - Deep Learning | Shashank Nigam | View Answer on Stackoverflow
Solution 7 - Deep Learning | Prajot Kuvalekar | View Answer on Stackoverflow
Solution 8 - Deep Learning | Ivan | View Answer on Stackoverflow
Solution 9 - Deep Learning | A. Maman | View Answer on Stackoverflow