Evaluation Methodology
Python Model
A bit-exact Python model for the accelerator is defined in hwdesign/demos/conv2d/conv2d_demo.py. The same file serves three roles:
- It defines the command, response, debug, and enum types with the PySilicon DataList and EnumField schema system.
- It implements a software reference model in the Conv2DAccel class.
- It provides the Conv2DTest driver that generates random data, prepares Vitis inputs, launches the HLS flow, and compares hardware outputs against the reference model.
The message schemas used by both the Python model and the C++ kernel are:
- Conv2DCmd: image dimensions, kernel size, and the memory addresses of the input image, output image, and kernel
- Conv2DResp: a single error code reporting whether the request completed successfully
- Conv2DDebug: per-stage instrumentation containing a row index and an event code
This schema-first approach is important for evaluation because the generated C++ include files and the Python serializer/deserializer are derived from the same definitions. That keeps the software model, the C++ testbench, and the HLS kernel aligned on the exact bit layout of every message.
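The benefit of deriving both sides from one definition can be illustrated with a small, generic sketch. This is not the PySilicon API: the FIELDS table, the field widths, and the pack/unpack helpers are all hypothetical, and they only show why a single schema keeps a serializer and a deserializer in agreement on bit positions.

```python
# Hypothetical schema: (field name, bit width), listed from the least
# significant bit. The widths are illustrative and are NOT taken from the
# real Conv2DCmd layout.
FIELDS = [("nrows", 10), ("ncols", 10), ("kernel_size", 3)]


def pack(values: dict) -> int:
    """Pack field values into one integer word using the schema order."""
    word, offset = 0, 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << offset
        offset += width
    return word


def unpack(word: int) -> dict:
    """Recover field values from a packed word using the same schema."""
    values, offset = {}, 0
    for name, width in FIELDS:
        values[name] = (word >> offset) & ((1 << width) - 1)
        offset += width
    return values


cmd = {"nrows": 480, "ncols": 512, "kernel_size": 3}
# The round trip holds because both directions iterate over the same FIELDS.
assert unpack(pack(cmd)) == cmd
```

If the two directions were written by hand in separate files, a change to one field width would have to be repeated in each place; with a shared table it is made once.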
The reference computation itself is implemented with scipy.signal.correlate2d(..., mode="same", boundary="fill", fillvalue=0). This matches the hardware kernel’s zero-padded same-size 2D correlation behavior. After correlation, the Python model applies the same fixed-point post-processing as the HLS implementation:

$$y = \left\lfloor \frac{s}{2^{7}} \right\rfloor$$

where $s$ is the correlation sum for one output pixel and the division by $2^7$ is implemented as an arithmetic right shift because the signed 8-bit kernel coefficients use kernel_fbits = 7 fractional bits. The result $y$ is then saturated into the unsigned 8-bit pixel range [0, 255].
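The correlation, shift, and saturation steps can be sketched as follows. This is a minimal illustration and not the code in Conv2DAccel: the function name reference_conv2d and the constant KERNEL_FBITS are hypothetical, and the sketch assumes the image is an unsigned 8-bit array and the kernel a signed 8-bit array.

```python
import numpy as np
from scipy.signal import correlate2d

KERNEL_FBITS = 7  # fractional bits of the signed 8-bit kernel coefficients


def reference_conv2d(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Zero-padded same-size correlation with fixed-point post-processing."""
    # Widen both operands so the accumulated products do not wrap in 8 bits.
    acc = correlate2d(
        image.astype(np.int32),
        kernel.astype(np.int32),
        mode="same",
        boundary="fill",
        fillvalue=0,
    )
    # Arithmetic right shift: divides by 2**7 and rounds toward negative infinity.
    scaled = acc >> KERNEL_FBITS
    # Saturate into the unsigned 8-bit pixel range [0, 255].
    return np.clip(scaled, 0, 255).astype(np.uint8)


image = np.array([[255, 128], [64, 0]], dtype=np.uint8)
kernel = np.zeros((3, 3), dtype=np.int8)
kernel[1, 1] = 127  # roughly 0.992 with 7 fractional bits
out = reference_conv2d(image, kernel)
print(out.tolist())  # [[253, 127], [63, 0]]
```

Because 127/128 is slightly below one, each pixel is scaled down and floored: 255 becomes 253, 128 becomes 127, and 64 becomes 63.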
Within Conv2DAccel.compute_conv2d(), the Python model also mirrors the hardware-side argument checks:
- nrows must be between 1 and MAX_NROW = 512
- ncols must be between 1 and MAX_NCOL = 512
- kernel_size must be between 1 and MAX_KERNEL_SIZE = 4
- invalid memory accesses raise an address error
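A sketch of these checks is shown below. The limits come from the list above; the ErrCode names and values, the check order, and the helpers check_args and check_region are hypothetical and stand in for whatever the real enum and Conv2DAccel.compute_conv2d() define.

```python
from enum import IntEnum

MAX_NROW = 512
MAX_NCOL = 512
MAX_KERNEL_SIZE = 4


class ErrCode(IntEnum):
    """Hypothetical error codes; the real enum lives in conv2d_demo.py."""
    NO_ERROR = 0
    INVALID_NROWS = 1
    INVALID_NCOLS = 2
    INVALID_KERNEL_SIZE = 3
    ADDRESS_ERROR = 4


def check_args(nrows: int, ncols: int, kernel_size: int) -> ErrCode:
    """Return the first failing range check, or NO_ERROR if all pass."""
    if not 1 <= nrows <= MAX_NROW:
        return ErrCode.INVALID_NROWS
    if not 1 <= ncols <= MAX_NCOL:
        return ErrCode.INVALID_NCOLS
    if not 1 <= kernel_size <= MAX_KERNEL_SIZE:
        return ErrCode.INVALID_KERNEL_SIZE
    return ErrCode.NO_ERROR


def check_region(addr: int, nbytes: int, mem_size: int) -> ErrCode:
    """Flag an access that starts or ends outside a memory of mem_size bytes."""
    if addr < 0 or nbytes < 0 or addr + nbytes > mem_size:
        return ErrCode.ADDRESS_ERROR
    return ErrCode.NO_ERROR
```

Returning a code instead of raising keeps the sketch close to a response message that carries a single error field, which is how Conv2DResp is described above.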
As a result, the Python model is not just an approximate algorithmic model. It is the functional golden reference for both output data and error behavior.
Testbench
The Vitis C++ testbench is implemented in hwdesign/demos/conv2d/conv2d_tb.cpp. It is a file-based testbench that exchanges data with the Python harness through the hwdesign/demos/conv2d/data directory.
The testbench runs as follows:
- It reads params.json to obtain nrows, ncols, and kernel_size.
- It reads the binary input image and kernel files produced by the Python harness.
- It allocates regions in a simulated external memory and writes the packed image and kernel data into that memory.
- It constructs a Conv2DCmd message with the image dimensions and memory addresses.
- It sends the command on the AXI4-Stream input, invokes the kernel, and waits for the AXI4-Stream response.
- It drains the debug stream, decodes the emitted Conv2DDebug messages, and records them.
- It reads the output image back from memory and writes all results to disk for post-processing by Python.
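The file exchange in the first two steps can be sketched from the Python side. Only params.json and its three keys are named in this document; the input file names im_in_array.bin and kernel_array.bin, the one-byte-per-element row-major layout, and the helper write_testbench_inputs are assumptions made for illustration.

```python
import json
import struct
from pathlib import Path


def write_testbench_inputs(data_dir, image, kernel):
    """Write params.json and raw binary inputs for a file-based testbench.

    `image` is a list of rows of unsigned 8-bit pixels and `kernel` is a
    square list of rows of signed 8-bit coefficients.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    params = {
        "nrows": len(image),
        "ncols": len(image[0]),
        "kernel_size": len(kernel),
    }
    (data_dir / "params.json").write_text(json.dumps(params, indent=2))

    # Hypothetical file names and layout: row-major, one byte per element.
    pixels = [p for row in image for p in row]
    coeffs = [c for row in kernel for c in row]
    # bytes() rejects values outside 0..255, which guards the pixel range.
    (data_dir / "im_in_array.bin").write_bytes(bytes(pixels))
    # The "b" format packs signed 8-bit values and rejects out-of-range ones.
    (data_dir / "kernel_array.bin").write_bytes(
        struct.pack(f"{len(coeffs)}b", *coeffs)
    )
    return params
```

A C++ testbench reading these files would only need the three parameters to know how many bytes to expect from each binary file.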
The main output artifacts produced by the testbench are:
- resp_data.bin: serialized Conv2DResp
- im_out_array.bin: output image written by the kernel
- debug_data.bin: raw debug event stream
- sync_status.json: summary of termination status, last debug event, and event count
The sync_status.json file is especially useful for evaluation because it confirms that:
- the output response terminated with TLAST
- each debug event terminated correctly on the AXI stream
- the final event and row index are sensible
This catches interface-level errors that would not necessarily appear as pixel mismatches alone.
Correctness Checks
Correctness is established by comparing the HLS testbench outputs against the Python golden model generated by Conv2DTest.simulate().
The evaluation checks two things:
- Response correctness: the Conv2DResp read from resp_data.bin must match the error code produced by the Python model
- Output-image correctness: the contents of im_out_array.bin must exactly match the Python reference image element-by-element
These checks are performed by Conv2DTest.read_vitis_outputs(). If the response or output image differs from the reference model, the script raises a runtime error and prints both the observed and expected results.
Because the same Python harness generates the inputs, computes the expected result, and verifies the Vitis outputs, the full correctness loop is reproducible and automatable. The test is therefore stronger than a manual visual inspection of images.
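The element-by-element image comparison can be sketched as follows. This is not the body of Conv2DTest.read_vitis_outputs(): the helper check_output_image is hypothetical, and it assumes the output file stores one unsigned byte per pixel in row-major order.

```python
import numpy as np


def check_output_image(path, expected: np.ndarray) -> None:
    """Raise RuntimeError unless the file matches the reference image exactly."""
    # Assumed layout: raw uint8 pixels, row-major, no header.
    observed = np.fromfile(path, dtype=np.uint8)
    if observed.size != expected.size:
        raise RuntimeError(
            f"size mismatch: observed {observed.size} pixels, "
            f"expected {expected.size}"
        )
    observed = observed.reshape(expected.shape)
    if not np.array_equal(observed, expected):
        bad = np.argwhere(observed != expected)
        row, col = bad[0]
        raise RuntimeError(
            f"{len(bad)} mismatched pixels; first at ({row}, {col}): "
            f"observed {observed[row, col]}, expected {expected[row, col]}"
        )
```

Reporting the first mismatching coordinate along with the count makes it easier to tell a boundary-handling bug (errors on the border only) from a scaling bug (errors everywhere).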
Performance and Timing Evaluation
The repository also includes timing-oriented evaluation infrastructure.
During kernel execution, the accelerator emits debug events for:
- MAIN_START and MAIN_END
- LOAD_START and LOAD_END
- COMPUTE_START and COMPUTE_END
- STORE_START and STORE_END
These events are written once per major phase and once per processed row, allowing the user to reconstruct the execution timeline.
For RTL-level timing analysis, the flow can generate a VCD waveform dump. The helper script hwdesign/demos/conv2d/timing_analysis.py parses the VCD, extracts the input, output, and debug AXI4-Stream bursts, and decodes them back into typed Python objects. The notebook hwdesign/demos/conv2d/view_timing.ipynb then uses these decoded events to measure the duration of load, compute, and store phases across rows.
This provides a simple way to evaluate:
- end-to-end command-to-response timing
- the relative cost of memory load, compute, and store stages
- whether the row-processing schedule behaves as expected
- whether the pipelined inner loop is dominating the execution time, as intended
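Once the debug events have been decoded and timestamped, the phase durations follow from pairing each START event with its matching END. The sketch below assumes a hypothetical input format, a time-ordered list of (time, event_name, row) tuples with times in arbitrary units such as clock cycles; it does not reproduce the data structures used by timing_analysis.py or the notebook.

```python
from collections import defaultdict


def phase_durations(events):
    """Pair *_START and *_END events and collect durations per phase.

    `events` is an iterable of (time, event_name, row) tuples in time order.
    Returns a dict mapping phase name to a list of durations.
    """
    open_starts = {}
    durations = defaultdict(list)
    for time, name, _row in events:
        # "LOAD_START" -> phase "LOAD", kind "START"
        phase, _, kind = name.rpartition("_")
        if kind == "START":
            open_starts[phase] = time
        elif kind == "END" and phase in open_starts:
            durations[phase].append(time - open_starts.pop(phase))
    return dict(durations)


# Hypothetical event trace for a single row.
events = [
    (0, "MAIN_START", 0),
    (10, "LOAD_START", 0), (50, "LOAD_END", 0),
    (50, "COMPUTE_START", 0), (180, "COMPUTE_END", 0),
    (180, "STORE_START", 0), (220, "STORE_END", 0),
    (400, "MAIN_END", 0),
]
result = phase_durations(events)
print(result)
```

For this trace the compute phase lasts 130 time units against 40 each for load and store, which is the kind of ratio one would inspect to judge whether the inner loop dominates.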
For synthesis-level metrics, Conv2DTest.write_csynth_reports() parses the Vitis HLS csynth.xml report and writes summarized loop and resource tables into the data directory:
- csynth_loop_info.csv
- csynth_loop_info.json
- csynth_resources.csv
These files can be used to inspect loop pipelining, achieved initiation interval, and estimated resource usage without manually opening the Vitis GUI.
Reproducing the Evaluation
The standard way to run the evaluation flow is from the hwdesign/demos/conv2d directory.
To run the Python model only:
python conv2d_demo.py --skip_vitis
To run C simulation and synthesis:
python conv2d_demo.py --through csynth
To run through co-simulation and generate a VCD for timing analysis:
python conv2d_demo.py --through generate_vcd --trace_level port
The most important evaluation outputs are then available under hwdesign/demos/conv2d/data and hwdesign/demos/conv2d/vcd.
Summary
The evaluation methodology combines three complementary checks:
- a bit-exact Python golden model for functional correctness
- a file-based C++ testbench that verifies the HLS kernel and its AXI interfaces
- timing and synthesis-report post-processing for performance analysis
Together, these provide confidence that the Conv2D accelerator is correct at the algorithmic level, consistent at the message/interface level, and measurable at the implementation level.