Layers

Neural network layers for compression models.

GDN - Generalized Divisive Normalization

The GDN layer is commonly used in learned image compression for its effectiveness at decorrelating features.

GDN

GDN(in_channels, inverse=False, beta_min=1e-06, gamma_init=0.1)

Bases: Module

Generalized Divisive Normalization layer.

Introduced in "Density Modeling of Images Using a Generalized Normalization Transformation" <https://arxiv.org/abs/1511.06281>, by Johannes Ballé, Valero Laparra, and Eero P. Simoncelli (2016).

.. math::

y[i] = \frac{x[i]}{\sqrt{\beta[i] + \sum_j(\gamma[j, i] * x[j]^2)}}
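As a sanity check, the normalization above can be computed directly for a single spatial location. This is a NumPy sketch of the formula, not the library's implementation, using the default initialization shown below (`beta = 1`, `gamma = 0.1 * I`):

```python
import numpy as np

# One spatial location with C = 3 channels.
x = np.array([0.5, -1.0, 2.0])

# beta and gamma as GDN initializes them: beta = 1, gamma = 0.1 * identity.
beta = np.ones(3)
gamma = 0.1 * np.eye(3)

# y[i] = x[i] / sqrt(beta[i] + sum_j gamma[j, i] * x[j]^2)
norm = beta + gamma.T @ (x ** 2)
y = x / np.sqrt(norm)

# With a diagonal gamma, each channel is normalized only by itself:
# norm[i] = 1 + 0.1 * x[i]^2.
print(norm)
print(y)
```

With the identity-scaled `gamma`, GDN starts out as a simple per-channel squashing nonlinearity; training lets off-diagonal entries grow so channels normalize each other.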

Source code in tinify/layers/gdn.py
def __init__(
    self,
    in_channels: int,
    inverse: bool = False,
    beta_min: float = 1e-6,
    gamma_init: float = 0.1,
) -> None:
    super().__init__()

    beta_min = float(beta_min)
    gamma_init = float(gamma_init)
    self.inverse = bool(inverse)

    self.beta_reparam = NonNegativeParametrizer(minimum=beta_min)
    beta = torch.ones(in_channels)
    beta = self.beta_reparam.init(beta)
    self.beta = nn.Parameter(beta)

    self.gamma_reparam = NonNegativeParametrizer()
    gamma = gamma_init * torch.eye(in_channels)
    gamma = self.gamma_reparam.init(gamma)
    self.gamma = nn.Parameter(gamma)

forward

forward(x)
Source code in tinify/layers/gdn.py
def forward(self, x: Tensor) -> Tensor:
    _, C, _, _ = x.size()

    beta = self.beta_reparam(self.beta)
    gamma = self.gamma_reparam(self.gamma)
    gamma = gamma.reshape(C, C, 1, 1)
    norm = F.conv2d(x**2, gamma, beta)

    if self.inverse:
        norm = torch.sqrt(norm)
    else:
        norm = torch.rsqrt(norm)

    out = x * norm

    return out
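The `F.conv2d` call uses `gamma` as a bank of 1x1 kernels, so it reduces to a channel-mixing matrix product at every pixel, with `beta` as the bias. A NumPy sketch of that equivalence (illustrative shapes, not the library code):

```python
import numpy as np

rng = np.random.default_rng(0)
N, C, H, W = 2, 4, 5, 5
x = rng.standard_normal((N, C, H, W))

# Default GDN initialization: beta = 1, gamma = 0.1 * identity.
beta = np.ones(C)
gamma = 0.1 * np.eye(C)

# conv2d(x**2, gamma.reshape(C, C, 1, 1), beta) with 1x1 kernels is, per pixel:
#   norm[n, i, h, w] = beta[i] + sum_j gamma[i, j] * x[n, j, h, w]**2
norm = np.einsum('ij,njhw->nihw', gamma, x ** 2) + beta[None, :, None, None]

# Forward GDN divides by sqrt(norm); the inverse (IGDN) multiplies instead.
y = x / np.sqrt(norm)
```

Because `beta` and `gamma` are reparametrized to stay non-negative, the quantity under the square root is always positive, so the division is numerically safe.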

Attention Modules

AttentionBlock

AttentionBlock(N)

Bases: Module

Self attention block.

Simplified variant from "Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules" <https://arxiv.org/abs/2001.01568>, by Zhengxue Cheng, Heming Sun, Masaru Takeuchi, and Jiro Katto (2020).

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `N` | `int` | Number of channels | *required* |

Convolutional Layers

conv3x3

conv3x3(in_ch, out_ch, stride=1)

3x3 convolution with padding.

Source code in tinify/layers/layers.py
def conv3x3(in_ch: int, out_ch: int, stride: int = 1) -> nn.Module:
    """3x3 convolution with padding."""
    return nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1)
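With `kernel_size=3` and `padding=1`, this layer preserves spatial size at stride 1 and roughly halves it at stride 2, following the standard output-size formula `floor((H + 2*padding - kernel) / stride) + 1`. A quick check in plain Python (no torch required):

```python
def conv_out_size(size: int, kernel: int = 3, stride: int = 1, padding: int = 1) -> int:
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

print(conv_out_size(32))            # stride 1: size is preserved
print(conv_out_size(32, stride=2))  # stride 2: size is halved
print(conv_out_size(33, stride=2))  # odd sizes round up with padding=1
```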

subpel_conv3x3

subpel_conv3x3(in_ch, out_ch, r=1)

3x3 sub-pixel convolution for up-sampling.

Source code in tinify/layers/layers.py
def subpel_conv3x3(in_ch: int, out_ch: int, r: int = 1) -> nn.Sequential:
    """3x3 sub-pixel convolution for up-sampling."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * r**2, kernel_size=3, padding=1), nn.PixelShuffle(r)
    )
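`nn.PixelShuffle(r)` rearranges a `(C * r^2, H, W)` tensor into `(C, H*r, W*r)`, trading channels for resolution; that is why the convolution above produces `out_ch * r**2` channels. A NumPy sketch of the rearrangement (illustrative, not torch's implementation):

```python
import numpy as np

def pixel_shuffle(x: np.ndarray, r: int) -> np.ndarray:
    """Rearrange (C * r^2, H, W) -> (C, H * r, W * r), matching torch's channel order."""
    cr2, h, w = x.shape
    c = cr2 // (r * r)
    # Split channels into (C, r, r), then interleave the two r-axes
    # into the height and width dimensions.
    return x.reshape(c, r, r, h, w).transpose(0, 3, 1, 4, 2).reshape(c, h * r, w * r)

x = np.arange(2 * 4 * 3 * 3).reshape(8, 3, 3)  # C=2, r=2, H=W=3
y = pixel_shuffle(x, r=2)
print(y.shape)  # each 2x2 output patch gathers one pixel from 4 input channels
```

Each output pixel `y[c, h*r + i, w*r + j]` comes from input channel `c*r^2 + i*r + j` at position `(h, w)`, so no information is created or lost, only rearranged.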

Residual Blocks

ResidualBlock

ResidualBlock(in_ch, out_ch)

Bases: Module

Simple residual block with two 3x3 convolutions.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `in_ch` | `int` | number of input channels | *required* |
| `out_ch` | `int` | number of output channels | *required* |

ResidualBlockUpsample

ResidualBlockUpsample(in_ch, out_ch, upsample=2)

Bases: Module

Residual block with sub-pixel upsampling on the last convolution.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `in_ch` | `int` | number of input channels | *required* |
| `out_ch` | `int` | number of output channels | *required* |
| `upsample` | `int` | upsampling factor (default: 2) | `2` |

ResidualBlockWithStride

ResidualBlockWithStride(in_ch, out_ch, stride=2)

Bases: Module

Residual block with a stride on the first convolution.

Parameters:

| Name | Type | Description | Default |
| ---- | ---- | ----------- | ------- |
| `in_ch` | `int` | number of input channels | *required* |
| `out_ch` | `int` | number of output channels | *required* |
| `stride` | `int` | stride value (default: 2) | `2` |
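When the first convolution has `stride > 1` (or `in_ch != out_ch`), a plain identity skip no longer matches the main path's output shape, so residual blocks of this kind typically downsample or project the skip connection as well (commonly with a strided 1x1 convolution); that design detail is an assumption here, not quoted from the source. The shape mismatch itself is easy to verify with the output-size formula:

```python
def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1

h = 32
main = conv_out(h, kernel=3, stride=2, padding=1)  # first 3x3 conv, stride 2
skip = h                                           # plain identity keeps 32

# The two branches cannot be added element-wise until the
# skip path is downsampled to the main path's resolution.
print(main, skip)
```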