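Preface

Following the evaluation plan, this installment covers performance testing. There are many CPU benchmarks; CoreMark is one of the most common, so we use it for the baseline CPU test. Performance is a composite metric that depends on more than the CPU's compute capability. Storage is also a key factor, so RAM and eMMC performance are tested afterwards as well.

CoreMark

Preparing the code

Open a WSL terminal and download the code, then:

cd coremark/
vi simple/core_portme.h

Change

#define COMPILER_FLAGS FLAGS_STR /* "Please put compiler flags here (e.g. -o3)" */
#endif

to

#define COMPILER_FLAGS "-O3" /* "Please put compiler flags here (e.g. -o3)" */
#endif

For the -O0 build, use "-O0" instead.

Because pointers are 64 bits wide on aarch64, also change

typedef ee_u32 ee_ptr_int;

to

typedef unsigned long ee_ptr_int;

Copy the port files:

cp simple/core_portme.c simple/core_portme.h .

Build

aarch64-linux-gnu-gcc -o coremarko0 *.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -O0
aarch64-linux-gnu-gcc -o coremarko3 *.c -DPERFORMANCE_RUN=1 -DITERATIONS=100000 -O3

Copy the binaries to the Windows side:

cp coremarko0 coremarko3 /mnt/d

Transfer them to the development board over the serial port with rz, then add execute permission:

chmod +x coremarko0 coremarko3

Run the test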
root@g2uliot:~# ./coremarko0
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 142807122
Total time (secs): 142.807122
Iterations/Sec : 700.245188
Iterations : 100000
Compiler version : GCC9.4.0
Compiler flags : -O0
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xd340
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 700.245188 / GCC9.4.0 -O0 / STACK
root@g2uliot:~#
root@g2uliot:~# ./coremarko3
2K performance run parameters for coremark.
CoreMark Size : 666
Total ticks : 27313274
Total time (secs): 27.313274
Iterations/Sec : 3661.223477
Iterations : 100000
Compiler version : GCC9.4.0
Compiler flags : -O3
Memory location : STACK
seedcrc : 0xe9f5
[0]crclist : 0xe714
[0]crcmatrix : 0x1fd7
[0]crcstate : 0x8e3a
[0]crcfinal : 0xd340
Correct operation validated. See README.md for run and reporting rules.
CoreMark 1.0 : 3661.223477 / GCC9.4.0 -O3 / STACK
root@g2uliot:~#
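As a quick cross-check of the two logs above, the reported score is the iteration count divided by the total run time. The short Python sketch below recomputes both scores and the ratio between them from the logged numbers. The variable names are my own and are not part of CoreMark.

```python
# Values copied from the two CoreMark logs above.
ITERATIONS = 100000
runs = {
    "-O0": {"ticks": 142807122, "secs": 142.807122},
    "-O3": {"ticks": 27313274, "secs": 27.313274},
}

scores = {}
for flags, r in runs.items():
    # CoreMark score = iterations per second.
    scores[flags] = ITERATIONS / r["secs"]
    # Ticks per second implied by the log (the timer resolution).
    ticks_per_sec = r["ticks"] / r["secs"]
    print(f"{flags}: {scores[flags]:.2f} iterations/s, {ticks_per_sec:.0f} ticks/s")

# Ratio of the two scores, i.e. how much faster the -O3 build runs.
speedup = scores["-O3"] / scores["-O0"]
print(f"-O3 is about {speedup:.2f}x faster than -O0")
```

The ticks-to-seconds ratio also suggests that this port's timer counts in microseconds.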
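The gap between optimization levels is large: the -O3 build scores about 5.2 times higher than the -O0 build (3661 vs. 700 iterations/s). Scores for other CPUs can be compared at https://www.eembc.org/coremark/scores.php.

RAM performance test

In WSL, download the code, then:

cd STREAM/

Build:

aarch64-linux-gnu-gcc -O3 -DSTREAM_ARRAY_SIZE=5000000 stream.c -o stream.5M

Copy the binary to the Windows side:

cp stream.5M /mnt/d

Then transfer it to the development board over the serial port with rz and add execute permission:

chmod +x stream.5M

Run:

./stream.5M

The result is as follows: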
root@g2uliot:~# ./stream.5M
-------------------------------------------------------------
STREAM version $Revision: 5.10 $
-------------------------------------------------------------
This system uses 8 bytes per array element.
-------------------------------------------------------------
Array size = 5000000 (elements), Offset = 0 (elements)
Memory per array = 38.1 MiB (= 0.0 GiB).
Total memory required = 114.4 MiB (= 0.1 GiB).
Each kernel will be executed 10 times.
The *best* time for each kernel (excluding the first iteration)
will be used to compute the reported bandwidth.
-------------------------------------------------------------
Your clock granularity/precision appears to be 1 microseconds.
Each test below will take on the order of 50853 microseconds.
(= 50853 clock ticks)
Increase the size of the arrays if this shows that
you are not getting at least 20 clock ticks per test.
-------------------------------------------------------------
WARNING -- The above is only a rough guideline.
For best results, please be sure you know the
precision of your system timer.
-------------------------------------------------------------
Function Best Rate MB/s Avg time Min time Max time
Copy: 2460.4 0.032590 0.032515 0.032736
Scale: 2681.5 0.029885 0.029834 0.030033
Add: 2639.2 0.045563 0.045468 0.045694
Triad: 2723.6 0.044255 0.044060 0.045144
-------------------------------------------------------------
Solution Validates: avg error less than 1.000000e-13 on all three arrays
-------------------------------------------------------------
root@g2uliot:~#
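To make the "Best Rate" column in the log above easier to interpret, the following Python sketch recomputes it from the array size and the minimum times. It assumes STREAM's usual byte counting (Copy and Scale touch two arrays, Add and Triad touch three) and 1 MB = 10^6 bytes. The names are illustrative only.

```python
# Values copied from the STREAM log above.
ARRAY_SIZE = 5_000_000      # -DSTREAM_ARRAY_SIZE
BYTES_PER_ELEMENT = 8       # "This system uses 8 bytes per array element."

# (arrays touched per iteration, minimum time in seconds)
# Copy/Scale read one array and write one; Add/Triad read two and write one.
kernels = {
    "Copy": (2, 0.032515),
    "Scale": (2, 0.029834),
    "Add": (3, 0.045468),
    "Triad": (3, 0.044060),
}

rates = {}
for name, (arrays, min_time) in kernels.items():
    bytes_moved = arrays * BYTES_PER_ELEMENT * ARRAY_SIZE
    rates[name] = bytes_moved / min_time / 1e6   # MB/s, 1 MB = 10^6 bytes
    print(f"{name:6s} {rates[name]:8.1f} MB/s")

# Size of one array in MiB, matching "Memory per array" in the log.
mib_per_array = ARRAY_SIZE * BYTES_PER_ELEMENT / 2**20
print(f"Memory per array: {mib_per_array:.1f} MiB")
```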
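For reference results, see https://www.cs.virginia.edu/stream/ref.html

RAM stress test

Reference: https://pyropus.ca./software/memtester/

In WSL, download the code, then:

tar -xvf memtester-4.5.1.tar.gz
cd memtester-4.5.1/

Build:

aarch64-linux-gnu-gcc -O3 memtester.c tests.c -o memtester

Copy the binary to the Windows side, download it to the development board, and add execute permission:

cp memtester /mnt/d
chmod +x memtester

Run:

./memtester 128M 1

Here 128M is the amount of RAM to test and 1 is the number of test loops. The -p option can also be used to specify a physical address directly, which is useful during board bring-up for testing a given physical address, much as bare-metal code would.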
root@g2uliot:~# ./memtester 128M 1
memtester version 4.5.1 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 128MB (134217728 bytes)
got 128MB (134217728 bytes), trying mlock ...locked.
Loop 1/1:
Stuck Address : ok
Random Value : ok
Compare XOR : ok
Compare SUB : ok
Compare MUL : ok
Compare DIV : ok
Compare OR : ok
Compare AND : ok
Sequential Increment: ok
Solid Bits : ok
Block Sequential : ok
Checkerboard : ok
Bit Spread : ok
Bit Flip : ok
Walking Ones : ok
Walking Zeroes : ok
Done.
root@g2uliot:~#
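The size and mask values at the top of the log above follow directly from the command-line argument and the page size. The Python sketch below reproduces them as a sanity check. It is plain arithmetic and is not taken from memtester's source.

```python
# Values copied from the memtester log above.
PAGE_SIZE = 4096                            # "pagesize is 4096"

# "128M" is interpreted as 128 MiB.
requested = 128 * 2**20

# Mask that rounds a 64-bit address down to a page boundary.
page_mask = ~(PAGE_SIZE - 1) & (2**64 - 1)

# Number of pages that memtester has to lock for this run.
pages = requested // PAGE_SIZE

print(requested)
print(hex(page_mask))
print(pages)
```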
eMMC performance test
root@g2uliot:~# dmesg | grep mmc
[ 0.000000] Kernel command line: rw rootwait earlycon root=/dev/mmcblk0p2
[ 3.951821] renesas_sdhi_internal_dmac 11c10000.mmc: Got CD GPIO
[ 3.997336] renesas_sdhi_internal_dmac 11c00000.mmc: mmc0 base at 0x0000000011c00000, max clock rate 133 MHz
[ 4.013901] renesas_sdhi_internal_dmac 11c10000.mmc: mmc1 base at 0x0000000011c10000, max clock rate 133 MHz
[ 4.164498] mmc0: new HS200 MMC card at address 0001
[ 4.172879] mmcblk0: mmc0:0001 8GTF4R 7.28 GiB
[ 4.180387] mmcblk0boot0: mmc0:0001 8GTF4R partition 1 4.00 MiB
[ 4.189148] mmcblk0boot1: mmc0:0001 8GTF4R partition 2 4.00 MiB
[ 4.197937] mmcblk0rpmb: mmc0:0001 8GTF4R partition 3 512 KiB, chardev (241:0)
[ 4.215165] mmcblk0: p1 p2
[ 4.402583] EXT4-fs (mmcblk0p2): recovery complete
[ 4.410651] EXT4-fs (mmcblk0p2): mounted filesystem with ordered data mode. Opts: (null)
[ 6.411398] EXT4-fs (mmcblk0p2): re-mounted. Opts: (null)
[ 15.961660] FAT-fs (mmcblk0p1): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
root@g2uliot:~#
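The log shows that the eMMC runs in HS200 mode. df shows that part of /dev/root is free for the user, so a file on the root filesystem is used for the test.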
root@g2uliot:~# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 6844776 1616192 4861176 25% /
devtmpfs 241244 0 241244 0% /dev
tmpfs 438940 0 438940 0% /dev/shm
tmpfs 438940 9876 429064 3% /run
tmpfs 438940 0 438940 0% /sys/fs/cgroup
tmpfs 438940 0 438940 0% /tmp
tmpfs 438940 28 438912 1% /var/volatile
tmpfs 87788 0 87788 0% /run/user/0
/dev/mmcblk0p1 511720 21456 490264 5% /run/media/mmcblk0p1
root@g2uliot:~#
Write test
root@g2uliot:~# time dd if=/dev/zero of=/test.bin bs=16k count=65536
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB) copied, 22.9686 s, 46.7 MB/s
real 0m22.973s
user 0m0.056s
sys 0m6.155s
Read test
root@g2uliot:~# time dd if=/test.bin of=/dev/null bs=16k count=65536
65536+0 records in
65536+0 records out
1073741824 bytes (1.1 GB) copied, 18.4697 s, 58.1 MB/s
real 0m18.474s
user 0m0.081s
sys 0m3.257s
root@g2uliot:~#
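The throughput that dd prints in the two runs above is the byte count divided by the elapsed time, with 1 MB = 10^6 bytes. The Python sketch below redoes that arithmetic from the logged values. The names are my own.

```python
# Values copied from the dd logs above.
BLOCK_SIZE = 16 * 1024      # bs=16k
COUNT = 65536               # count=65536
total_bytes = BLOCK_SIZE * COUNT

# Elapsed seconds as reported by dd.
elapsed = {"write": 22.9686, "read": 18.4697}

throughput = {}
for op, secs in elapsed.items():
    throughput[op] = total_bytes / secs / 1e6    # MB/s, 1 MB = 10^6 bytes
    print(f"{op}: {throughput[op]:.1f} MB/s")

# dd's "1.1 GB" is the decimal size; the same amount is exactly 1 GiB.
print(f"{total_bytes} bytes = {total_bytes / 2**30:.0f} GiB")
```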
| Test | bs/count (1 GB total) | Command | Result |
| --- | --- | --- | --- |
| Read | 16k/65536 | time dd if=/test.bin of=/dev/null bs=16k count=65536 | 58.1 MB/s |
| Write | 16k/65536 | time dd if=/dev/zero of=/test.bin bs=16k count=65536 | 46.7 MB/s |
At 58.1 MB/s read and 46.7 MB/s write, the eMMC performance is decent.
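Summary

The tests above give an overall picture of performance, and the board does quite well. The results are for reference only: they vary with the environment, and the storage test method is not rigorous. For example, it does not account for caching. These tests give only a qualitative impression. Board performance is a composite experience that is meaningful only in a real application scenario, and optimizing for that scenario also matters.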
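On the caching caveat: one way to reduce the page-cache effect on a write test is to include the flush to storage in the timed region. The following is a minimal Python sketch of that idea, not the method used in this article. The function name timed_write, the demo size, and the temporary file path are all hypothetical.

```python
import os
import tempfile
import time


def timed_write(path, total_bytes, block_size=16 * 1024):
    """Write total_bytes of zeros to path; return MB/s including the flush."""
    block = bytes(block_size)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()               # push Python's buffer to the kernel
        os.fsync(f.fileno())    # ask the kernel to write the page cache out
    elapsed = time.perf_counter() - start
    return written / elapsed / 1e6


if __name__ == "__main__":
    # Small demo size and a temporary file; both are placeholders.
    with tempfile.TemporaryDirectory() as d:
        rate = timed_write(os.path.join(d, "test.bin"), 4 * 1024 * 1024)
        print(f"{rate:.1f} MB/s (write + fsync)")
```

GNU dd offers conv=fsync for a similar effect. Whether the dd on this board supports it is not confirmed here.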