VSP algorithm porting guide
- To help algorithm engineers deploy their own algorithms faster, some configurations mentioned in this article are based on the `GX8008C_Wukong_Prime` development board, referred to below as the Wukong development board. If you do not have this development board on hand, please contact our local sales personnel.
1. Development board system block diagram
2. Example firmware compilation and download
2.1 Example configuration
You can modify the following configurations according to your needs.
- example_lib: an algorithm package that includes TX and RX processing
- TX: a simple multichannel mixing algorithm that mixes the input n-channel microphone data as output
- RX: amplifies the RX audio
8008c_wukong_v1.4_example_lib_16k_2Amic_1Aref_UAC4ch_SPKtxout.config
- Sampling rate: 16k
- UAC 1.0 sound card mode
- UAC upstream output 4-channel data: 1ch-TX_OUT, 2ch-Amic, 1ch-Aref
- AudioOut outputs TX_OUT
8008c_wukong_v1.4_example_lib_16k_2Amic_1Aref_UAC6ch_SPKrxout.config
- Sampling rate: 16k
- UAC 1.0 sound card mode
- UAC upstream output 6-channel data: 1ch-TX_OUT, 1ch-RX_OUT, 2ch-Amic, 1ch-Aref, 1ch-RX_IN
- AudioOut outputs RX_OUT
8008c_wukong_v1.4_example_lib_16k_4Dmic_1Aref_UAC6ch_SPKtxout.config
- Sampling rate: 16k
- UAC 1.0 sound card mode
- UAC upstream output 6-channel data: 1ch-TX_OUT, 4ch-Dmic, 1ch-Aref
- AudioOut outputs TX_OUT
8008c_wukong_v1.4_example_lib_16k_4Dmic_1Aref_UAC8ch_SPKrxout.config
- Sampling rate: 16k
- UAC 1.0 sound card mode
- UAC upstream output 8-channel data: 1ch-TX_OUT, 1ch-RX_OUT, 4ch-Dmic, 1ch-Aref, 1ch-RX_IN
- AudioOut outputs RX_OUT
8008c_wukong_v1.4_example_lib_48k_UAC2ch_SPKrxout2ch.config
- Sampling rate: 48k
- UAC 1.0 sound card mode
- UAC upstream output 2-channel data: 2ch-RX_OUT, UAC downstream dual-channel 48k
- AudioOut outputs RX_OUT
2.2 How to compile
- After downloading `vsp_sdk` to the local computer, execute the following commands in the `vsp_sdk` directory:

      $ cp configs/example_lib/8008c_wukong_v1.4_example_lib_16k_2Amic_1Aref_UAC4ch_SPKtxout.config .config
      $ make menuconfig   # Open menuconfig, save and exit
      $ make clean
      $ make

- After compilation is completed, the generated firmware is in the `output` directory. `mcu_nor.bin` is the MCU firmware, `dsp.fw` is the DSP firmware, and `vsp.bin` is the merged firmware of the two.
2.3 How to burn firmware [two methods]
- Burn `vsp.bin`:

      $ sudo tools/bootx/bootx -m leo_mini -tu -c "download 0 output/vsp.bin;reboot"

- Burn `mcu_nor.bin` and `dsp.fw`:

      $ cd tools/bootx
      $ ./flash_nor_mini.sh

- After the download is complete, the PC will recognize the sound card device, and you can record from it with Audacity (a recording tool).
3. Configuration of input and output channels
- Execute `make menuconfig` and enter `VSP I/O Buffer settings`.
- Channel settings: Set the `mic` and `ref` channels according to your requirements. The output channels are configured the same way. If you need `uac` recording, you must select `Interlaced` output, check `Interlaced OUT Channels`, and configure the number of channels you need to record via `uac`.
- Frame settings: Configure the sampling rate and frame length as needed.
- Context settings: The object processed by the `dsp` algorithm is a `context`, so `Frame Number in a Context` needs to be configured as needed.
4. How to build your own algorithm package
The `vsp_sdk/dsp/vpa` directory contains independent algorithm packages. You can select the required algorithm package through `menuconfig`. In this section, we use `example_lib` as a reference to build our own algorithm package.
- Copy the `example_lib` algorithm directory and rename it after the algorithm to be ported, for example `gx_lib`.
- Modify the `vpa.name`, `Makefile` and `Kconfig` files in the `gx_lib` directory, mainly to update paths and macros:
  - Replace the contents of `vpa.name` with:

        config VSP_VPA_GX_LIB
            bool "GX [Library]"

  - Replace `VSP_VPA_EXAMPLE_LIB` in `Kconfig` with `VSP_VPA_GX_LIB`.
  - Replace `CONFIG_VSP_VPA_EXAMPLE_LIB` in `Makefile` with `CONFIG_VSP_VPA_GX_LIB`, and replace `SRC_DIR=vpa/example_lib` with `SRC_DIR=vpa/gx_lib`.
- After building the algorithm package, execute `make menuconfig` and select the newly added algorithm `gx_lib` in `Voice Process Algorithm select`. The next build then compiles your newly added algorithm package.
5. Algorithm porting
5.1 Directory introduction
- The entire directory structure of `example_lib` is as follows. The highlighted parts are the parts that engineers need to focus on.
[Directory tree of `example_lib`: listing not preserved]
5.1 Algorithm initialization
- Algorithm initialization code goes in the `VspInitialize` interface, which is called once after power-on.

      XIP_TEXT_ATTR int VspInitialize(VSP_CONTEXT_HEADER *context_header)
      {
          VspDoExampleInit(context_header);
          return 0;
      }

      IRAM0_TEXT_ATTR int VspDoExampleInit(VSP_CONTEXT_HEADER *context_header)
      {
          /* Algorithm initialization code */
          return 0;
      }
5.2 Algorithm processing
- The entry function for algorithm processing is `VspProcessActive` in `vpa_process.c`. All algorithm processing happens in this function. After initialization, it is called back once per configured context duration. For example, if `Frame Number in a Context` is 2 and the frame length in `Frame settings` is 16 ms, then `VspProcessActive` is called back every 32 ms.
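The relationship between frame length, frames per context, and the callback period can be sketched in a few lines of C. This is only an illustration: `ContextTiming` and the three helper functions are hypothetical names, not part of the SDK; the real values come from `VSP_CONTEXT_HEADER`.

```c
#include <assert.h>

/* Hypothetical stand-in for the timing fields of VSP_CONTEXT_HEADER. */
typedef struct {
    int sample_rate;            /* Hz, e.g. 16000 */
    int frame_length_ms;        /* frame length in milliseconds */
    int frame_num_per_context;  /* "Frame Number in a Context" */
} ContextTiming;

/* Samples per channel in one frame. */
static int samples_per_frame(const ContextTiming *t)
{
    assert(t->sample_rate > 0 && t->frame_length_ms > 0);
    return t->frame_length_ms * t->sample_rate / 1000;
}

/* Samples per channel in one context (all frames together). */
static int samples_per_context(const ContextTiming *t)
{
    assert(t->frame_num_per_context > 0);
    return samples_per_frame(t) * t->frame_num_per_context;
}

/* Interval between two processing callbacks, in milliseconds. */
static int context_period_ms(const ContextTiming *t)
{
    return t->frame_length_ms * t->frame_num_per_context;
}
```

With a 16 kHz sampling rate, 16 ms frames, and 2 frames per context, these helpers give 256 samples per frame, 512 samples per context, and a 32 ms callback period, matching the example above.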
5.2.1 TX Processing

    /* all_data is a static buffer defined elsewhere in the package, e.g.
     * static short all_data[6*FRAME_LEN] DRAM0_DATA_ATTR; (see 8.2) */
    IRAM0_TEXT_ATTR int VspDoTxExampleProc(VSP_CONTEXT *context, int *output_index)
    {
        VSP_CONTEXT_HEADER *ctx_header = context->ctx_header;
        /* Samples per frame = frame length (ms) * sample rate / 1000 */
        int frame_length = ctx_header->frame_length * ctx_header->sample_rate / 1000;
        int context_sample_num = frame_length * ctx_header->frame_num_per_context;
        int ref_num = ctx_header->ref_num;
        int mic_num = ctx_header->mic_num;
        int channel_num = mic_num + ref_num;
        short *mic_buffer[mic_num];   /* one pointer per mic channel */
        short *ref_buffer[ref_num];   /* one pointer per ref channel */
        short out[context_sample_num];
        int i, j;

        for (i = 0; i < mic_num; i++) {
            mic_buffer[i] = VspProcessGetMicFrame(context, i, 0); // Get N mic data
        }
        for (i = 0; i < ref_num; i++) {
            ref_buffer[i] = VspProcessGetRefFrame(context, i, 0); // Get N ref data
        }

        // Organize mic and ref according to the format required by the algorithm api
        for (j = 0; j < context_sample_num; j++) {
            for (i = 0; i < mic_num; i++) {
                all_data[i + j * channel_num] = mic_buffer[i][j];
            }
            for (i = 0; i < ref_num; i++) {
                all_data[mic_num + i + j * channel_num] = ref_buffer[i][j];
            }
        }

        // TX data processing. eg:
        MixAudio((short *)all_data, (short *)out, mic_num, ref_num, context_sample_num);

        // Copy algorithm output data to context->out_buffer
        memcpy(context->out_buffer, out, context_sample_num * sizeof(short));
        return 0;
    }
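The TX example calls `MixAudio` without showing its body. The following is a minimal sketch of what such a mixer could look like, assuming it simply averages the mic channels of each interleaved sample group and ignores the ref channels. The body is an assumption made for illustration and is not the SDK's implementation; only the argument order is taken from the call above.

```c
#include <assert.h>

/*
 * Hypothetical mixer (not the SDK implementation).
 * Input layout for sample j, with channel_num = mic_num + ref_num:
 *   in[j*channel_num + 0 .. mic_num-1]   mic samples
 *   in[j*channel_num + mic_num .. ]      ref samples (ignored here)
 * Output: one mono sample per j, the average of the mic samples.
 */
void MixAudio(const short *in, short *out, int mic_num, int ref_num, int sample_num)
{
    int channel_num = mic_num + ref_num;
    assert(mic_num > 0 && ref_num >= 0 && sample_num >= 0);

    for (int j = 0; j < sample_num; j++) {
        int sum = 0;  /* 32-bit accumulator avoids 16-bit overflow */
        for (int i = 0; i < mic_num; i++) {
            sum += in[i + j * channel_num];
        }
        out[j] = (short)(sum / mic_num);  /* an average always fits in 16 bits */
    }
}
```

Averaging rather than summing keeps the result inside the 16-bit range without a separate saturation step, at the cost of lower output level when only one microphone carries signal.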
5.2.2 RX Processing

    IRAM0_TEXT_ATTR int VspDoRxExampleProc(VSP_CONTEXT *context, int *output_index)
    {
        VSP_CONTEXT_HEADER *ctx_header = context->ctx_header;

        if (ctx_header->rx_num == 2) { // Dual channel processing
            VspCopyRxChannelToOut(context, 0, *output_index);
            VspCopyRxChannelToOut(context, 1, *output_index + 1);
            // RX data processing. eg:
            VSPDoSetOutputGain(context, *output_index, -3);     // -3 dB gain on RX data
            VSPDoSetOutputGain(context, *output_index + 1, -3); // -3 dB gain on RX data
            *output_index += 2;
        }
        else if (ctx_header->rx_num == 1) { // Single channel processing
            VspCopyRxChannelToOut(context, 0, *output_index);
            // RX data processing. eg:
            VSPDoSetOutputGain(context, *output_index, -3);     // -3 dB gain on RX data
            *output_index += 1;
        }
        return 0;
    }
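To make the "-3 dB gain" comment concrete, here is a hedged sketch of applying a fixed-point gain to 16-bit samples. It does not show how `VSPDoSetOutputGain` works internally; `apply_gain_q15` and `GAIN_Q15_MINUS_3DB` are hypothetical names, and the Q15 format is an assumption chosen because it avoids floating point on the DSP.

```c
#include <assert.h>
#include <stdint.h>

/* -3 dB as a Q15 factor: round(10^(-3/20) * 32768) = 23198 */
#define GAIN_Q15_MINUS_3DB 23198

/* Hypothetical helper: scale n samples in place by gain_q15 / 32768,
 * saturating to the 16-bit PCM range. */
void apply_gain_q15(short *buf, int n, int32_t gain_q15)
{
    assert(gain_q15 >= 0);
    for (int i = 0; i < n; i++) {
        /* 64-bit intermediate; division truncates toward zero */
        int64_t v = (int64_t)buf[i] * gain_q15 / 32768;
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        buf[i] = (short)v;
    }
}
```

A sample of 10000 scaled by `GAIN_Q15_MINUS_3DB` becomes 7079, roughly 0.708 of the original amplitude. Gains above 32768 (more than 0 dB) rely on the saturation step to avoid wrap-around.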
6. Algorithm-related helper interfaces
- All related `helper` interfaces are in `vsp_sdk/dsp/vsp/vsp_helper.c`.

Function Interface | Usage |
---|---|
`int VspProcessInvalidateMicBuffer(VSP_CONTEXT *context)` | Invalidate cache data based on the corresponding mic_buffer address and size in the context |
`int VspProcessWritebackMicBuffer(VSP_CONTEXT *context)` | Write back cache data to SRAM based on the corresponding mic_buffer address and size in the context |
`int VspProcessInvalidateRefBuffer(VSP_CONTEXT *context)` | Invalidate cache data based on the corresponding ref_buffer address and size in the context |
`int VspProcessWritebackRefBuffer(VSP_CONTEXT *context)` | Write back cache data to SRAM based on the corresponding ref_buffer address and size in the context |
`int VspProcessInvalidateRxBuffer(VSP_CONTEXT *context)` | Invalidate cache data based on the corresponding rx_buffer address and size in the context |
`short *VspProcessGetMicFrame(VSP_CONTEXT *context, unsigned int channel_num, int frame_index)` | Function: get the start address of a frame of mic data from a mic channel in the current context. channel_num: mic channel number. frame_index: frame number in the current context |
`short *VspProcessGetRefFrame(VSP_CONTEXT *context, unsigned int channel_num, int frame_index)` | Function: get the start address of a frame of ref data from a ref channel in the current context. channel_num: ref channel number. frame_index: frame number in the current context |
`short *VspProcessGetRxFrame(VSP_CONTEXT *context, unsigned int channel_num, int frame_index)` | Function: get the start address of a frame of rx data from an rx channel in the current context. channel_num: rx channel number. frame_index: frame number in the current context |
`short *VspProcessGetOutFrame(VSP_CONTEXT *context, unsigned int channel_num, unsigned int frame_index)` | Function: get the start address of a frame of output data from an output channel in the current context. channel_num: output channel number. frame_index: frame number in the current context |
`VSP_CONTEXT *VspProcessGetContext(const VSP_CONTEXT *context, unsigned int index)` | Function: get the address of the index-th context, counting from the current context |
Note
Before `VspProcessActive` runs the algorithm, call the `VspProcessInvalidateMicBuffer` and `VspProcessInvalidateRefBuffer` interfaces to invalidate the corresponding cache data on the DSP. This ensures that the `mic` and `ref` data read by the algorithm is current rather than stale cached data.
Note
When using `VspProcessGetRxFrame` to get dual-channel UAC downlink data, the two channels are stored interleaved.
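If an algorithm expects each RX channel in its own buffer, the interleaved data has to be split first. The sketch below assumes the common sample-by-sample ordering (channel 0, channel 1, channel 0, channel 1, ...); the source does not spell out the ordering, so verify it on the board. `deinterleave_stereo` is a hypothetical helper, not an SDK function.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: split an interleaved dual-channel buffer into two
 * per-channel buffers. Assumes the layout c0[0], c1[0], c0[1], c1[1], ...
 * samples_per_channel is the number of samples in each output buffer.
 */
void deinterleave_stereo(const short *interleaved, short *ch0, short *ch1,
                         size_t samples_per_channel)
{
    assert(interleaved != NULL && ch0 != NULL && ch1 != NULL);
    for (size_t i = 0; i < samples_per_channel; i++) {
        ch0[i] = interleaved[2 * i];
        ch1[i] = interleaved[2 * i + 1];
    }
}
```

The input buffer must hold `2 * samples_per_channel` samples, and the output buffers must not overlap the input.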
7. Viewing DSP computing power
We also provide a way to view the DSP computing load in real time.
- Execute `make menuconfig` and enter `DSP settings`.
- Enable `Enable Process Cycle Statistic` and `Enable log printing on DSP`; the default options under `Enable log printing on DSP` are fine. Then recompile the firmware and connect to the serial port of the `dsp`. It prints the computing load as a percentage; a value above 100% means the algorithm consumes too much computing power and needs to be optimized.
Reminder: how to measure the computing power consumed by a hot function
We provide `unsigned xthal_get_ccount(void)` to read the current value of the CCOUNT register, which the DSP increments by 1 on every clock cycle. If the DSP runs at 400 MHz, the register increases by 400M per second. To measure a hot function, call `xthal_get_ccount()` before and after it and subtract the first value from the second; the difference is the number of cycles the function consumed.
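The subtraction and the conversion to a load percentage can be sketched as follows. `cycles_elapsed` and `load_percent` are hypothetical helpers written for this illustration; only `xthal_get_ccount()` comes from the text above, and it appears here only in a comment so that the snippet also compiles on a host PC.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles between two CCOUNT readings. Unsigned 32-bit subtraction stays
 * correct even if the counter wrapped once between the two readings. */
static uint32_t cycles_elapsed(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Share of the cycle budget of one callback period, in percent.
 * cpu_hz: DSP clock (e.g. 400000000), period_ms: context duration. */
static uint32_t load_percent(uint32_t cycles, uint32_t cpu_hz, uint32_t period_ms)
{
    assert(cpu_hz >= 1000 && period_ms > 0);
    uint64_t budget = (uint64_t)(cpu_hz / 1000) * period_ms;  /* cycles per period */
    return (uint32_t)((uint64_t)cycles * 100 / budget);
}

/* Intended use on the DSP (hot_function is a placeholder name):
 *
 *     unsigned t0 = xthal_get_ccount();
 *     hot_function();
 *     unsigned t1 = xthal_get_ccount();
 *     uint32_t cycles = cycles_elapsed(t0, t1);
 */
```

For example, 6,400,000 cycles spent inside a 32 ms context at 400 MHz is 50% of the available budget. Note that at 400 MHz a 32-bit counter wraps roughly every 10.7 seconds, so this approach only suits measurements shorter than that.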
8. Memory usage
8.1 SRAM
- Both the 8008 and the 8008C have `1536KB` of SRAM, and code runs from SRAM by default. The `1536KB` is shared by the MCU and the DSP: the amount of SRAM used by the DSP is configurable, and the remainder is reserved for the MCU.
- Execute `make menuconfig`, enter `DSP settings`, and configure `(1300) SRAM size kept for DSP(KB)`.
How to determine MCU memory
The MCU does not use dynamic memory. As long as `make mcu` compiles normally, there will be no problem with MCU memory.
8.2 DRAM0 and IRAM0
- In addition to SRAM, the DSP also has 64k of `DRAM0` and 64k of `IRAM0`. SRAM is used by default. To place code or data in these memories, use the following macros:

      #define IRAM0_TEXT_ATTR   __attribute__((section(".iram0.text")))
      #define DRAM0_BSS_ATTR    __attribute__((section(".dram0.bss")))
      #define DRAM0_DATA_ATTR   __attribute__((section(".dram0.data")))
      #define DRAM0_RODATA_ATTR __attribute__((section(".dram0.rodata")))

- `DRAM0` holds data, for example:

      static short all_data[6*FRAME_LEN] DRAM0_DATA_ATTR;

- `IRAM0` holds code, for example:

      IRAM0_TEXT_ATTR int VspProcessActive(VSP_CONTEXT *context)
8.3 XIP
- On the 8008C (not supported on the 8008), you can also use XIP. Code that is executed infrequently and read-only data can be placed in the XIP segment with the following macros:

      #define XIP_TEXT_ATTR   __attribute__((section(".xip.text")))
      #define XIP_RODATA_ATTR __attribute__((section(".xip.rodata")))

- Execute `make menuconfig`, enter `DSP settings`, and configure `[*] Enable XIP`.
- XIP can hold code, for example:

      XIP_TEXT_ATTR int VspInitialize(VSP_CONTEXT_HEADER *context_header)

- XIP can hold read-only data, for example:

      short data[3] XIP_RODATA_ATTR = {1, 2, 3};
9. Manuals related to algorithm development
- After installing the DSP Compilation Toolchain, open Xplorer and click Help→PDF Documentation to find extensive HiFi4 documentation, including the HiFi4 specification, the instruction set reference, and the Xtensa compilation toolchain manuals.