Add grouped convolution to arm_convolve_s8 (ARM-software#99)
- Adds support for grouped convolution in arm_convolve_s8. It is used when
input_dims->c = filter_dims->c * N and output_dims->c = N * M, with N > 1 and
M >= 1, giving N groups of M output channels each
- Adds the argument row_address_offset to arm_nn_mat_mult_nt_t_s8
- Adds arm_nn_mat_mult_kernel_row_offset_s8_s16
- Updates arm_convolve_wrapper to use only arm_convolve_s8 for grouped
convolutions
- Adds unit tests for grouped convolution

Change-Id: Iaa2701a8bb460f4470c41a0e1f82ea09ca095802
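
The channel conditions in the first bullet can be checked up front. The following is a minimal sketch in C, not code from this commit: the helper `grouped_conv_params` and its parameter names are hypothetical, and it only mirrors the stated conditions C_IN = N * CK and C_OUT = N * M.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of CMSIS-NN.
 * Derives the group count N and the number of output channels per group M
 * from the channel counts. Returns false if the shapes do not satisfy
 * C_IN = N * CK and C_OUT = N * M. */
static bool grouped_conv_params(int32_t input_ch,  /* C_IN,  as in input_dims->c  */
                                int32_t kernel_ch, /* CK,    as in filter_dims->c */
                                int32_t output_ch, /* C_OUT, as in output_dims->c */
                                int32_t *groups,
                                int32_t *group_size)
{
    if (input_ch <= 0 || kernel_ch <= 0 || output_ch <= 0)
    {
        return false;
    }
    /* C_IN must be a whole multiple of CK. */
    if (input_ch % kernel_ch != 0)
    {
        return false;
    }
    const int32_t n = input_ch / kernel_ch;
    /* C_OUT must split evenly across the N groups. */
    if (output_ch % n != 0)
    {
        return false;
    }
    *groups = n;
    *group_size = output_ch / n;
    return true;
}
```

For example, C_IN = 8, CK = 4 and C_OUT = 6 give N = 2 groups of M = 3 output channels each. N = 1 corresponds to an ordinary, non-grouped convolution.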
AdrianLundell authored Jan 22, 2024
1 parent 3b4e406 commit ffeca90
Showing 59 changed files with 1,447 additions and 170 deletions.
10 changes: 6 additions & 4 deletions Include/arm_nnfunctions.h
@@ -21,8 +21,8 @@
* Title: arm_nnfunctions.h
* Description: Public header file for CMSIS NN Library
*
- * $Date: 9 January 2024
- * $Revision: V.12.5.1
+ * $Date: 11 January 2024
+ * $Revision: V.12.6.0
*
* Target : Arm(R) M-Profile Architecture
* -------------------------------------------------------------------- */
@@ -365,8 +365,10 @@ arm_cmsis_nn_status arm_convolve_s4(const cmsis_nn_context *ctx,
* It contains the multiplier and shift values to be applied to each output channel
* @param[in] input_dims Input (activation) tensor dimensions. Format: [N, H, W, C_IN]
* @param[in] input_data Input (activation) data pointer. Data type: int8
- * @param[in] filter_dims Filter tensor dimensions. Format: [C_OUT, HK, WK, C_IN] where HK and WK are the
- * spatial filter dimensions
+ * @param[in] filter_dims Filter tensor dimensions. Format: [C_OUT, HK, WK, CK] where HK, WK and CK are the
+ * spatial filter dimensions. CK != C_IN is used for grouped convolution, in which
+ * case the required conditions are C_IN = N * CK and C_OUT = N * M for N groups of
+ * size M.
* @param[in] filter_data Filter data pointer. Data type: int8
* @param[in] bias_dims Bias tensor dimensions. Format: [C_OUT]
* @param[in] bias_data Optional bias data pointer. Data type: int32
47 changes: 45 additions & 2 deletions Include/arm_nnsupportfunctions.h
@@ -21,8 +21,8 @@
* Title: arm_nnsupportfunctions.h
* Description: Public header file of support functions for CMSIS NN Library
*
- * $Date: 9 January 2024
- * $Revision: V.17.6.3
+ * $Date: 11 January 2024
+ * $Revision: V.17.7.0
*
* Target : Arm(R) M-Profile Architecture
* -------------------------------------------------------------------- */
@@ -423,6 +423,7 @@ arm_cmsis_nn_status arm_nn_mat_mult_nt_t_s4(const int8_t *lhs,
* @param[in] dst_offset Offset to be applied the output result
* @param[in] activation_min Minimum value to clamp down the output. Range : int8
* @param[in] activation_max Maximum value to clamp up the output. Range : int8
+ * @param[in] row_address_offset Address offset between rows in output. NOTE: Only used for MVEI extension.
* @param[in] lhs_cols_offset Column offset between subsequent lhs_rows
*
* @return The function returns <code>ARM_CMSIS_NN_SUCCESS</code>
@@ -441,6 +442,7 @@ arm_cmsis_nn_status arm_nn_mat_mult_nt_t_s8(const int8_t *lhs,
const int32_t dst_offset,
const int32_t activation_min,
const int32_t activation_max,
+ const int32_t row_address_offset,
const int32_t lhs_cols_offset);

/**
@@ -1006,6 +1008,47 @@ int8_t *arm_nn_mat_mult_kernel_s8_s16(const int8_t *input_a,
const int32_t *const output_bias,
int8_t *out_0);

+ /**
+ * @brief Matrix-multiplication function for convolution with per-channel requantization, supporting an address offset
+ * between rows.
+ * @param[in] input_a pointer to operand A
+ * @param[in] input_b pointer to operand B, always consists of 2 vectors.
+ * @param[in] output_ch number of rows of A
+ * @param[in] out_shift pointer to per output channel requantization shift parameter.
+ * @param[in] out_mult pointer to per output channel requantization multiplier parameter.
+ * @param[in] out_offset output tensor offset.
+ * @param[in] activation_min minimum value to clamp the output to. Range : int8
+ * @param[in] activation_max maximum value to clamp the output to. Range : int8
+ * @param[in] num_col_a number of columns of A
+ * @param[in] aligned_num_col_a number of columns of A aligned by 4
+ * @param[in] output_bias per output channel bias. Range : int32
+ * @param[in] row_address_offset address offset between rows in the output
+ * @param[in,out] out_0 pointer to output
+ * @return The function returns one of the two
+ * 1. The incremented output pointer for a successful operation or
+ * 2. NULL if implementation is not available.
+ *
+ * @details This function does the matrix multiplication of weight matrix for all output channels
+ * with 2 columns from im2col and produces two elements/output_channel. The outputs are
+ * clamped in the range provided by activation min and max.
+ *
+ * This function is slightly less performant than arm_nn_mat_mult_kernel_s8_s16, but allows support for
+ * grouped convolution. Supported framework: TensorFlow Lite micro.
+ */
+ int8_t *arm_nn_mat_mult_kernel_row_offset_s8_s16(const int8_t *input_a,
+ const int16_t *input_b,
+ const uint16_t output_ch,
+ const int32_t *out_shift,
+ const int32_t *out_mult,
+ const int32_t out_offset,
+ const int16_t activation_min,
+ const int16_t activation_max,
+ const int32_t num_col_a,
+ const int32_t aligned_num_col_a,
+ const int32_t *const output_bias,
+ const int32_t row_address_offset,
+ int8_t *out_0);
+
/**
* @brief Common softmax function for s8 input and s8 or s16 output
* @param[in] input Pointer to the input tensor
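
To illustrate what an address offset between output rows makes possible, here is a minimal scalar sketch in C. It is not the library's implementation: the helper `store_tile_with_row_offset` is hypothetical, and the assumption that each group writes its M channels into an output row of C_OUT = N * M channels goes beyond what the declarations above state. The non-grouped callers changed in this commit pass rhs_rows for the new argument, which corresponds to contiguous output rows.

```c
#include <stdint.h>

/* Hypothetical helper, not part of CMSIS-NN.
 * Copies a dense rows x cols tile into dst, where consecutive output rows
 * are row_address_offset elements apart. With row_address_offset == cols
 * the output is contiguous; a larger value leaves room in each row for the
 * results of other groups. */
static void store_tile_with_row_offset(const int8_t *tile,
                                       int32_t rows,
                                       int32_t cols,
                                       int32_t row_address_offset,
                                       int8_t *dst)
{
    for (int32_t r = 0; r < rows; r++)
    {
        for (int32_t c = 0; c < cols; c++)
        {
            dst[r * row_address_offset + c] = tile[r * cols + c];
        }
    }
}
```

With two output positions, N = 2 groups and M = 2 channels per group, storing the first group's tile at dst and the second group's tile at dst + M, both with a row offset of N * M = 4, interleaves the groups so that each output position holds all four channels in order.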
8 changes: 4 additions & 4 deletions Source/ConvolutionFunctions/arm_convolve_1_x_n_s8.c
@@ -1,5 +1,5 @@
/*
- * SPDX-FileCopyrightText: Copyright 2010-2023 Arm Limited and/or its affiliates <open-source-office@arm.com>
+ * SPDX-FileCopyrightText: Copyright 2010-2024 Arm Limited and/or its affiliates <open-source-office@arm.com>
*
* SPDX-License-Identifier: Apache-2.0
*
@@ -21,16 +21,15 @@
* Title: arm_convolve_1_x_n_s8.c
* Description: s8 version of 1xN convolution using symmetric quantization.
*
- * $Date: 8 March 2023
- * $Revision: V.3.4.0
+ * $Date: 04 January 2024
+ * $Revision: V.3.5.0
*
* Target : Arm(R) M-Profile Architecture
*
* -------------------------------------------------------------------- */

#include "arm_nnfunctions.h"
#include "arm_nnsupportfunctions.h"
-
/**
* @ingroup Public
*/
@@ -149,6 +148,7 @@ arm_cmsis_nn_status arm_convolve_1_x_n_s8(const cmsis_nn_context *ctx,
conv_params->output_offset,
conv_params->activation.min,
conv_params->activation.max,
+ rhs_rows,
lhs_offset);

output_data += lhs_rows * rhs_rows;
7 changes: 4 additions & 3 deletions Source/ConvolutionFunctions/arm_convolve_1x1_s8.c
@@ -1,5 +1,5 @@
/*
- * SPDX-FileCopyrightText: Copyright 2022-2023 Arm Limited and/or its affiliates <open-source-office@arm.com>
+ * SPDX-FileCopyrightText: Copyright 2022-2024 Arm Limited and/or its affiliates <open-source-office@arm.com>
*
* SPDX-License-Identifier: Apache-2.0
*
@@ -21,8 +21,8 @@
* Title: arm_convolve_1x1_s8.c
* Description: Generic s8 version of 1x1 convolution
*
- * $Date: 20 January 2023
- * $Revision: V.1.0.1
+ * $Date: 04 January 2024
+ * $Revision: V.1.1.0
*
* Target : Arm(R) M-Profile Architecture
*
@@ -96,6 +96,7 @@ arm_cmsis_nn_status arm_convolve_1x1_s8(const cmsis_nn_context *ctx,
conv_params->output_offset,
conv_params->activation.min,
conv_params->activation.max,
+ rhs_rows,
rhs_cols * stride_w);
if (result != ARM_CMSIS_NN_SUCCESS)
{
7 changes: 4 additions & 3 deletions Source/ConvolutionFunctions/arm_convolve_1x1_s8_fast.c
@@ -1,5 +1,5 @@
/*
- * SPDX-FileCopyrightText: Copyright 2010-2023 Arm Limited and/or its affiliates <open-source-office@arm.com>
+ * SPDX-FileCopyrightText: Copyright 2010-2024 Arm Limited and/or its affiliates <open-source-office@arm.com>
*
* SPDX-License-Identifier: Apache-2.0
*
@@ -21,8 +21,8 @@
* Title: arm_convolve_1x1_s8_fast.c
* Description: Fast s8 version of 1x1 convolution (non-square shape)
*
- * $Date: 30 October 2023
- * $Revision: V.3.4.0
+ * $Date: 04 January 2024
+ * $Revision: V.3.5.0
*
* Target : Arm(R) M-Profile Architecture
*
@@ -86,6 +86,7 @@ arm_cmsis_nn_status arm_convolve_1x1_s8_fast(const cmsis_nn_context *ctx,
conv_params->output_offset,
conv_params->activation.min,
conv_params->activation.max,
+ rhs_rows,
rhs_cols);

/* Return to application */