Memory Bandwidth Test
A benchmark application that measures memory read/write bandwidth using the PXL runtime and MU kernels.
Overview
The memory_test example measures memory subsystem performance by executing parallel read or write operations across multiple tasks. It is located in the example/memory_test directory.
Use cases:
- Evaluate memory bandwidth performance
- Compare different memory access patterns
- Benchmark system configuration changes
Build and Run
Build
cd example/memory_test
./build.sh
Run
./memory_test [options]
Options:
| Option | Description | Default |
|---|---|---|
| -s | Total memory size (GiB) | 192 |
| -t | Number of parallel tasks | 3072 |
| -o | Operation: read or write | read |
| -i | Number of iterations | 1 |
| -u | Loop unrolling: 0 or 1 | 1 |
Examples:
# Default test: 192 GiB read with 3072 tasks
./memory_test
# Write test with 64 GiB, 1024 tasks
./memory_test -s 64 -t 1024 -o write
# Compare with/without loop unrolling
./memory_test -s 128 -t 2048 -u 1
./memory_test -s 128 -t 2048 -u 0
Output example (QEMU)
./memory_test -s 1
=== Memory Bandwidth Test ===
Operation : read
Total Memory Size: 1023.94 MiB (1.00 GiB)
Task Count : 3072
Size per Task : 349504 bytes (341.31 KiB)
Iterations : 1
Use Unrolling : 1
-----------------------------
Elapsed Time : 35275.891 ms
Throughput : 0.028 GiB/s (0.030 GB/s)
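The figures in this sample output can be reproduced with a little arithmetic. The sketch below is not taken from main.cpp; it assumes that each task's share is rounded down to a multiple of 64 bytes (8 x 64-bit accesses), which is an inference from the sample numbers (1 GiB / 3072 tasks gives 349504 bytes only with that rounding), and that throughput covers the bytes moved in all iterations. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed alignment unit: 8 x 64-bit accesses = 64 bytes.
// This is inferred from the sample output, not confirmed by the source.
constexpr std::uint64_t kChunkBytes = 8 * sizeof(std::uint64_t);

// Bytes assigned to each task, rounded down to a whole number of chunks.
std::uint64_t size_per_task(std::uint64_t total_bytes, std::uint64_t task_count) {
    assert(task_count > 0);
    return (total_bytes / task_count) / kChunkBytes * kChunkBytes;
}

// Bytes actually covered once every task gets the same rounded-down share.
std::uint64_t effective_total(std::uint64_t total_bytes, std::uint64_t task_count) {
    return size_per_task(total_bytes, task_count) * task_count;
}

// Throughput in GiB/s (binary units, 2^30 bytes per GiB).
// Multiplying by the iteration count is an assumption.
double throughput_gib_per_s(std::uint64_t bytes, std::uint64_t iterations,
                            double elapsed_ms) {
    assert(elapsed_ms > 0.0);
    const double seconds = elapsed_ms / 1000.0;
    return static_cast<double>(bytes) * iterations /
           (1024.0 * 1024.0 * 1024.0) / seconds;
}

// Throughput in GB/s (decimal units, 10^9 bytes per GB).
double throughput_gb_per_s(std::uint64_t bytes, std::uint64_t iterations,
                           double elapsed_ms) {
    assert(elapsed_ms > 0.0);
    const double seconds = elapsed_ms / 1000.0;
    return static_cast<double>(bytes) * iterations / 1e9 / seconds;
}
```

With a 1 GiB request and 3072 tasks, size_per_task returns 349504 and effective_total returns 1073676288 bytes (1023.94 MiB), which matches the report above. Dividing that total by the 35275.891 ms elapsed time gives roughly 0.028 GiB/s and 0.030 GB/s.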
Code Structure
Host Application
main.cpp performs the following steps:
- Parse command-line arguments
- Initialize PXL runtime and load kernel
- Allocate memory and flush host cache
- Execute kernel across all tasks
- Measure time and calculate throughput
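The measurement step can be illustrated with a host-only timing harness. This is a minimal sketch, not the code in main.cpp: it uses no PXL calls, read_buffer is a hypothetical stand-in for the kernel launch, and the use of std::chrono::steady_clock is an assumption about how elapsed time might be taken.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the kernel launch: reads every word of the buffer.
std::uint64_t read_buffer(const std::vector<std::uint64_t>& buf) {
    std::uint64_t sum = 0;
    for (std::uint64_t v : buf) {
        sum += v;
    }
    return sum;
}

struct TimedResult {
    double elapsed_ms;       // wall-clock time for all iterations
    std::uint64_t checksum;  // returned so the reads are not optimized away
};

// Runs the workload `iterations` times and times the whole batch.
TimedResult time_reads(const std::vector<std::uint64_t>& buf, int iterations) {
    assert(iterations > 0);
    std::uint64_t checksum = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        checksum += read_buffer(buf);
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return {elapsed.count(), checksum};
}
```

A monotonic clock such as steady_clock is a reasonable choice here because it is not affected by system time adjustments during the run. Throughput then follows from the bytes moved divided by elapsed_ms.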
MU Kernel
mu_kernel/mu_memory_test.cpp implements:
- Memory read/write operations
- Loop unrolling optimization (8 x 64-bit accesses per iteration)
- Per-task memory region calculation using mu::getTaskIdx()
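The kernel ideas listed above can be sketched in plain host C++. This is an illustration under stated assumptions, not the contents of mu_memory_test.cpp: task_idx is passed in as an ordinary argument where the kernel would obtain it from mu::getTaskIdx(), the region layout (contiguous, equal-sized slices) is assumed, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Start of the slice owned by one task, assuming contiguous equal-sized slices.
// In the kernel, task_idx would come from mu::getTaskIdx().
const std::uint64_t* task_region(const std::uint64_t* base,
                                 std::size_t words_per_task,
                                 std::size_t task_idx) {
    return base + task_idx * words_per_task;
}

// Read loop without unrolling: one 64-bit access per iteration.
std::uint64_t read_simple(const std::uint64_t* p, std::size_t words) {
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < words; ++i) {
        sum += p[i];
    }
    return sum;
}

// Read loop unrolled by eight: 8 x 64-bit accesses per iteration.
// Requires the word count to be a multiple of 8.
std::uint64_t read_unrolled(const std::uint64_t* p, std::size_t words) {
    assert(words % 8 == 0);
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < words; i += 8) {
        sum += p[i] + p[i + 1] + p[i + 2] + p[i + 3] +
               p[i + 4] + p[i + 5] + p[i + 6] + p[i + 7];
    }
    return sum;
}
```

Both loops read the same data and return the same sum; the unrolled version spends fewer branch and counter instructions per byte, which is the effect the -u option lets you compare.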
Next Steps
- Explore more examples in Tutorials
- Learn kernel programming at Kernel Programming
- Monitor performance with pxltop