Complex Networks

When dealing with complex functions, we usually need to code a lot to build a complex Neural Network . Therefore, in order to make it easier for users to build complex network models, we provide some common basic function modules to simplify the user’s code and reduce development costs. These modules are usually composed of fine-grained functions combined based on certain logics. For implementation, please refer to nets .


simple_img_conv_pool is got by concatenating conv2d with pool2d . This module is widely used in image classification models, such as the MNIST number classification.

For API Reference, please refer to simple_img_conv_pool


img_conv_group is composed of conv2d , batch_norm, dropout and pool2d. This module can implement the combination of multiple conv2d , batch_norm , dropout and a single pool2d. Among them, the number of conv2d , batch_norm and dropout can be controlled separately, resulting in various combinations. This module is widely used in more complex image classification tasks, such as VGG.

For API Reference, please refer to img_conv_group


sequence_conv_pool is got by concatenating sequence_conv with sequence_pool. The module is widely used in the field of natural language processing and speech recognition . Models such as the text classification model , TagSpace and Multi view Simnet.

For API Reference, please refer to sequence_conv_pool


The full name of glu is Gated Linear Units, which originates from the paper Language Modeling with Gated Convolutional Networks . It consists of split , sigmoid and elementwise_mul. It divides the input data into 2 equal parts, calculates the Sigmoid of second part, and then performs dot product of the sigmoid vlaue with the first part to get the output.

For API Reference, please refer to glu


scaled_dot_product_attention originates from the paper Attention Is All You Need , mainly composed of fc and softmax . For the input data Queries , Key and Values, calculate the Attention according to the following formula.

\[Attention(Q, K, V)= softmax(QK^\mathrm{T})V\]

This module is widely used in the model of machine translation, such as Transformer .

For API Reference, please refer to scaled_dot_product_attention