艾玛

What does the Intel HEX file format look like?

Replies (1)

孙婷婷

2022-2-9 15:27:41
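Introduction
Files in this format usually have the .hex extension. In embedded MCU development, compiling and linking a project often produces such a file, which is then programmed into the MCU's ROM.
It is therefore one of the output file types of a development project.
The format was created to store the final binary program and data, so that the file can be used to transfer them and then program them into ROM.
As a result, many programmers and debug emulators accept Intel HEX files for programming a chip.
For example, when developing for an STM32 Cortex-M4 part with IAR EWARM, you create a project, write the code, and once the project builds successfully you can output a hex file.
This hex file is the final program file and serves as the releasable software deliverable.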

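Full name of the HEX file type: Intel hexadecimal object file format, Intel hex format, or Intellec Hex.
It is in effect a binary file converted into an ASCII text file, so opening a hex file in a text editor shows its contents.
The format is intended for programming microcontrollers, EPROMs, and other programmable logic devices.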

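The object file originally contains the machine code produced by compiling the source code.
That binary machine code is then converted into a hex file.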

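As the name suggests, the format was invented by Intel. It has been in use since 1973, in the days of punched paper input.
To load and run a program, the data had to be passed to the ROM production stage on punched paper.
Concretely, each byte of data is converted into the characters of its hexadecimal representation.
For example, a byte holding 'A', whose value is 0x41, is represented by the two characters "41".
The benefit is that the receiver gets the correct data, and because the sender represents all data uniformly with printable characters, the data is also easier to edit and manipulate.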
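The byte-to-text encoding described above can be sketched in a few lines of Python. The sample bytes and variable names are only illustrative and are not part of the format:

```python
# Each byte becomes two printable ASCII hex digits.
raw = b"A\x3f\x00\xff"           # arbitrary example bytes; 'A' is 0x41

text = raw.hex().upper()         # binary -> hex text
print(text)                      # 413F00FF

restored = bytes.fromhex(text)   # hex text -> binary again
```

The round trip is lossless, which is why a hex file can carry arbitrary binary data as plain text.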

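As an example, open a hex file and look inside:
:020000040800F2
:20000000F0A10520C9AD030849AA03084BAA03084DAA03084FAA030851AA0308000000009F
:20002000000000000000000000000000C598010853AA0308000000006D98010855AA03083A
:200040003DAE030841AE030845AE030849AE03084DAE030851AE030855AE030859AA030884
:2000600059AE03085DAE030861AE030865AE030869AE03086DAE030871AE030875AE030880
:2000800079AE03087DAE030881AE030885AE030889AE03088DAE030891AE030895AE030860

There is one more consideration. When transferring or storing a programmable image, a plain binary (bin) file becomes awkward if the file is large or its addresses are not contiguous, because a bin file is a pure memory image. A hex file handles these cases better. For example, if the host MCU has to program the firmware of another, external MCU, it must store that firmware's data in advance. With a bin file this means defining one very large array; with the hex format the code only needs to define the data as lines of strings, which is easier to manage and work with.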
For example, for a Cypress Bluetooth chip, the firmware used for an update can be placed in a source file like this:
BTCypress_fw_hex.cpp:
/* Cypress Mini Driver */
const char * minidriver =
    ":02000004000DED\n"
    ":10020000F8B5664800F0CEF8654C0025E561E5627A\n"
    ":1002100000F0D2FC01F0A1F810B10120A07001E0C3\n"
    ":1002200000F038FD00F046FD01F066FF00F0A5FC8F\n"
    ":100230005B4FA5627F1C07F12F06207840F001007C\n"
    ":10024000207001F0F6FFE07021780D2801F08001A8\n"
    ":1002500021702581F1D00A28EFD03A2812D04C28FD\n"
    ":1002600013D04D2814D04E2815D0CE2816D05E2895\n"
    ":1002700017D0E52818D0CC2819D018281AD057281C\n"
    ":100280001ED050E000F0E7FB16E000F0BDFB13E0ED\n"
    ":1002900000F08EFB10E000F0F5FA0DE000F0AFFA90\n"
    /* ...... (records omitted in the original post) */
    ":10C6D000000000000000000000000000000000005A\n"
    ":10C6E000000000000000000000000000000000004A\n"
    ":10C6F000000000000000000000000000000000003A\n"
    ":10C700000000000000000000000000000000000029\n"
    ":10C710000000000000000000000000000000000019\n"
    ":10C7200000000000281F38A138AD823586A04313D1\n"
    ":10C730005C471E5DAE03283802FF1B666C080A5773\n"
    ":10C740008E83994EA7F7BF50DDA302290328080561\n"
    ":10C75000FF26FE2EE709244FB7914061D97A6CE895\n"
    ":10C76000A203280207FF4BDEC4EDD4753B91EB47D3\n"
    ":10C770002D2E08767FA40002120012000002150080\n"
    ":10C780001500018900000000000A040002000002F8\n"
    ":10C7900015001500018900000000000215001500B9\n"
    ":10C7A0000182000000000806090B0A092B282D034E\n"
    ":10C7B0000B2C092B030000802622000806090B0A17\n"
    ":10C7C0000000000B0000000044000001000000FF1A\n"
    ":10C7D0000000000100000002000000011100000044\n"
    ":10C7E0000000000000000000000000FFFFFFFFFF4E\n"
    ":10C7F000FF0000AA55F00F68E597D2000000000A7C\n"
    ":10C8000050007CC722000B480C4908B5884202D072\n"
    ":10C810000B4A05F698FB0A4A00210A4805F697FBE1\n"
    ":10C82000F4F71BF9094B094A23F00F031360BDE825\n"
    ":10C830000840F4F7A0B89026220090262200740B3E\n"
    ":10C84000000078950000043222007CCB22001C24DA\n"
    ":0CC8500020000B04007DC72200FE000049\n"
    ":00000001FF\n";

Format
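1. The content of a hex file is organized in lines, i.e. records are separated by line breaks. On Windows each line ends with CR (carriage return) + LF (line feed); on Linux there is only an LF. LF is 0x0A and CR is 0x0D.
2. Each record consists of hexadecimal characters. For example, a data byte of value 0x3F is written in the hex file as the two characters "3F".
3. Each line is one record and contains six fields. A hex file may contain any number of records. The fields are as follows: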



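The first field is the colon, a fixed start character that begins every HEX record.
The second field is the length of the data field, in hexadecimal. Common values are 10 and 20, i.e. 16 or 32 data bytes per line. The maximum is FF, i.e. 255 bytes.
The third field is the offset address, four hexadecimal digits giving the start address of this line's data. The offset must be added to the base address to obtain the real physical address.
Once the physical address is known, the data is placed at that address in the chip's address space when it is programmed.
A four-digit offset can address at most 64 KB. Beyond 64 KB, different base addresses must be used.
If no base address is specified it defaults to 0. Base addresses are set with other record types, described below.
Note that everything here is big-endian. For example, the four offset digits 0100 mean address 0x0100, which is easier to read than little-endian.
The fourth field is the record type. Besides plain data there are other types that carry additional information: 00 to 05, six types in total.
The meaning of the data field depends on the record type; see the record type descriptions below.
The fifth field is the data, a sequence of bytes whose length is given by the second field (byte count). For n bytes the data field is 2n characters long, since each byte is written as two hexadecimal characters.
The sixth field is the checksum, two hexadecimal digits (one byte), used to verify the record.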
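As a rough illustration of the six-field layout, the following Python sketch splits one record into its fields. `HexRecord` and `parse_record` are hypothetical names introduced here, and the sketch deliberately does not check the checksum:

```python
from typing import NamedTuple


class HexRecord(NamedTuple):
    byte_count: int    # field 2: number of data bytes
    offset: int        # field 3: 16-bit offset address (big-endian)
    record_type: int   # field 4: 00..05
    data: bytes        # field 5: byte_count data bytes
    checksum: int      # field 6: one checksum byte


def parse_record(line: str) -> HexRecord:
    line = line.strip()                    # tolerate CR/LF line endings
    if not line.startswith(":"):           # field 1: the start colon
        raise ValueError("record must start with ':'")
    raw = bytes.fromhex(line[1:])          # every two hex digits -> one byte
    if len(raw) < 5 or len(raw) != raw[0] + 5:
        raise ValueError("byte count does not match record length")
    offset = int.from_bytes(raw[1:3], "big")
    return HexRecord(raw[0], offset, raw[3], raw[4:-1], raw[-1])


rec = parse_record(":1000300093EB0308000000003D46010895EB030820")
print(rec.byte_count, hex(rec.offset), rec.record_type, hex(rec.checksum))
# 16 0x30 0 0x20
```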


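Checksum calculation
:10000000F0A10520B1FD030889EB03088BEB030881
Taking the line above as an example:
1. Add up all the bytes that precede the checksum.
10+00+00+00+F0+A1+05+20+B1+FD+03+08+89+EB+03+08+8B+EB+03+08 = 67F
2. Take the least significant byte: 7F
3. Take the two's complement: 0x100 - 0x7F = 0x81

The final checksum byte is thus the sum of the encoded bytes on the line, reduced to its least significant byte (LSB) and then two's-complemented (invert every bit, then add 1).
To verify a line, add up all of its bytes including the checksum; if the least significant byte of the sum is 0, the data is correct.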
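The calculation and the verification rule can be checked with a short Python sketch. The helper names `checksum` and `is_valid` are made up for this illustration:

```python
def checksum(body: bytes) -> int:
    """Two's complement of the low byte of the sum of the bytes before the checksum."""
    return (-sum(body)) & 0xFF


def is_valid(line: str) -> bool:
    """A record is valid when all bytes, checksum included, sum to 0 modulo 256."""
    raw = bytes.fromhex(line.strip()[1:])
    return (sum(raw) & 0xFF) == 0


line = ":10000000F0A10520B1FD030889EB03088BEB030881"
body = bytes.fromhex(line[1:-2])           # everything between ':' and the checksum
print(hex(sum(body)), hex(checksum(body)))  # 0x67f 0x81
print(is_valid(line))                       # True
```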

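Record types
00: Data
The data record format explained above.

01: End of file
The last line of a hex file, marking the end of the file. The byte count is 0, the offset address is 0 (ignored), the data field is empty, and a checksum byte follows.
Example: :00000001FF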
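Going the other way, a record of type 00 or 01 can be generated from its parts. This is a minimal sketch, and `make_record` is a hypothetical helper, not something provided by any toolchain:

```python
def make_record(offset: int, record_type: int, data: bytes = b"") -> str:
    """Build one Intel HEX record line (without a line ending)."""
    if len(data) > 0xFF or not 0 <= offset <= 0xFFFF:
        raise ValueError("data too long or offset out of range")
    body = bytes([len(data), offset >> 8, offset & 0xFF, record_type]) + data
    check = (-sum(body)) & 0xFF            # two's complement of the low byte
    return ":" + (body + bytes([check])).hex().upper()


eof_line = make_record(0x0000, 0x01)
data_line = make_record(0x0030, 0x00,
                        bytes.fromhex("93EB0308000000003D46010895EB0308"))
print(eof_line)    # :00000001FF
print(data_line)   # :1000300093EB0308000000003D46010895EB030820
```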

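02: Extended segment address
For 80x86 processors; specifies a 16-bit segment base address, so the byte count is always 2.
Example: :020000021200EA
Byte count 2, offset address fixed at 0x0000 (ignored), record type 0x02, segment base address 0x1200, checksum 0xEA.
How it is used: once this record appears, the offset address of every following data record is combined with the segment base address, until the next segment base address record appears.
Calculation: multiply the segment base by 16 and add the data record's offset. This lets the whole hex file address up to 1 MB, while a single record can still only address 64 KB.
The segment base is 16 bits; multiplying by 16 adds 4 bits, giving a 20-bit address. 10 bits address 1 K, and another 10 bits bring that to 1 M.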
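The segment arithmetic is a shift and an add. In this sketch the segment 0x1200 comes from the example record, while the offset 0x2462 is just an assumed data-record offset:

```python
def segment_to_absolute(segment: int, offset: int) -> int:
    """Absolute address = segment base * 16 + data record offset."""
    return (segment << 4) + offset


address = segment_to_absolute(0x1200, 0x2462)
print(f"{address:05X}")   # 14462
```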

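03: Start segment address
For 80x86 processors; specifies the initial value of the CS:IP registers, i.e. the address where execution starts. CS is the code segment register and IP is the instruction pointer.
Example: :0400000300003800C1
The byte count is always 4 and the offset address is 0 (ignored). The first two data bytes are the CS value and the last two are the IP value.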
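Decoding the example record might look like this in Python. The variable names are illustrative, and the physical address line assumes the usual 8086 real-mode rule CS * 16 + IP:

```python
line = ":0400000300003800C1"
raw = bytes.fromhex(line[1:])
data = raw[4:-1]                       # the four data bytes

cs = int.from_bytes(data[0:2], "big")  # first two bytes: CS
ip = int.from_bytes(data[2:4], "big")  # last two bytes: IP
entry = (cs << 4) + ip                 # assumed real-mode physical address

print(f"CS={cs:04X} IP={ip:04X} physical={entry:05X}")
# CS=0000 IP=3800 physical=03800
```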


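04: Extended linear address
This record type is for a 32-bit address space of up to 4 GB.
Example: :02000004FFFFFC
Byte count always 2, offset address 0 (ignored), type 04, address 0xFFFF, checksum FC.
The address is also big-endian.
The offset of each following data record is combined with this address to form the real absolute address, until a new type 04 record appears.
Calculation: the data record's address forms the low 16 bits and the type 04 record's address forms the high 16 bits.
Example:
:020000040800F2
:1000300093EB0308000000003D46010895EB030820
The first record sets the upper 16 bits of the address to 0x0800 (base address 0x08000000). The data record that follows has offset 0x0030, so its data starts at 0x08000030.
If no type 04 record is given, the base address is 0.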
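A small Python sketch of how type 04 records change the address of the data records that follow. `absolute_addresses` is a hypothetical helper that handles only types 00 and 04 and skips checksum validation:

```python
def absolute_addresses(lines):
    """Return the absolute start address of every data record, in file order."""
    upper = 0                                  # default when no type 04 record was seen
    result = []
    for line in lines:
        raw = bytes.fromhex(line.strip()[1:])
        offset = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        if record_type == 0x04:                # new upper 16 bits
            upper = int.from_bytes(raw[4:6], "big")
        elif record_type == 0x00:              # data: upper 16 bits | offset
            result.append((upper << 16) | offset)
    return result


lines = [
    ":020000040800F2",
    ":1000300093EB0308000000003D46010895EB030820",
]
print([f"{a:08X}" for a in absolute_addresses(lines)])   # ['08000030']
```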

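05: Start linear address
:04000005000000CD2A
Length 0x04, offset address 0x0000 (unused), type 0x05, 32-bit address 0x000000CD, checksum 2A.
On the 80386 and later CPUs, this 32-bit big-endian address is loaded into the EIP register.
EIP (extended instruction pointer) holds the address from which the processor starts fetching instructions.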
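Reading the 32-bit start address out of a type 05 record is a single big-endian conversion. `start_linear_address` below is a hypothetical helper name used only for this sketch:

```python
def start_linear_address(line: str) -> int:
    """Extract the 32-bit big-endian start address from a type 05 record."""
    raw = bytes.fromhex(line.strip()[1:])
    if raw[0] != 4 or raw[3] != 0x05:
        raise ValueError("not a start linear address record")
    return int.from_bytes(raw[4:8], "big")


start = start_linear_address(":04000005000000CD2A")
print(f"{start:08X}")   # 000000CD
```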

Summary
When selecting the output file type for a project in IAR EWARM, there is an Intel Extended Hex option, meaning a hex file that supports the extended record types.
In practice only four record types are generally used: 00, 01, 04, and 05.
An excerpt from an actually generated hex file:
:020000040800F2
:10000000F0A10520B1FD030889EB03088BEB030881
:100010008DEB03088FEB030891EB03080000000051
:106EA0001C6014605C6021462B68104810229847D3
:106EB0002B680F482146102298472B680D48214621
......
:10C2E000FFF0FFF0FFF0FFF0FFF0FFF0FFF00000C5
:10C2F000F0F000000000000000000000000000005E
:0AC300000000000000000000000033
:040000050803F8E113
:00000001FF
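Putting the record types together, a loader for this kind of file could be sketched as follows. This is only an illustration under stated assumptions: `load_hex`, `contiguous_ranges` and `SAMPLE` are names invented here, `SAMPLE` contains a few of the records from the excerpt, and type 03 records are simply ignored:

```python
SAMPLE = """\
:020000040800F2
:10000000F0A10520B1FD030889EB03088BEB030881
:100010008DEB03088FEB030891EB03080000000051
:106EA0001C6014605C6021462B68104810229847D3
:106EB0002B680F482146102298472B680D48214621
:040000050803F8E113
:00000001FF
"""


def load_hex(text):
    """Return (memory, start): memory maps absolute address -> byte value."""
    memory, base, start = {}, 0, None
    for number, line in enumerate(text.splitlines(), 1):
        line = line.strip()                     # handles CR+LF and LF endings
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError(f"line {number}: missing ':'")
        raw = bytes.fromhex(line[1:])
        if len(raw) != raw[0] + 5 or sum(raw) & 0xFF:
            raise ValueError(f"line {number}: bad length or checksum")
        offset, record_type = int.from_bytes(raw[1:3], "big"), raw[3]
        data = raw[4:-1]
        if record_type == 0x00:                 # data
            for i, value in enumerate(data):
                memory[base + offset + i] = value
        elif record_type == 0x01:               # end of file
            break
        elif record_type == 0x02:               # extended segment address
            base = int.from_bytes(data, "big") << 4
        elif record_type == 0x04:               # extended linear address
            base = int.from_bytes(data, "big") << 16
        elif record_type == 0x05:               # start linear address
            start = int.from_bytes(data, "big")
    return memory, start


def contiguous_ranges(memory):
    """Collapse the populated addresses into (first, last) ranges."""
    ranges = []
    for address in sorted(memory):
        if ranges and address == ranges[-1][1] + 1:
            ranges[-1][1] = address
        else:
            ranges.append([address, address])
    return [tuple(r) for r in ranges]


memory, start = load_hex(SAMPLE)
for first, last in contiguous_ranges(memory):
    print(f"{first:08X}-{last:08X}")
print(f"start: {start:08X}")
```

Storing bytes in a dictionary keyed by absolute address keeps the sketch simple and makes gaps between blocks explicit; a real programmer tool would more likely write each block straight to flash.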
------------------------------------------------------------------------------------
Data analysis example
Record Format
An Intel HEX file is composed of any number of HEX records. Each record is made up of five fields that are arranged in the following format:
:llaaaatt[dd...]cc
Each group of letters corresponds to a different field, and each letter represents a single hexadecimal digit. Each field is composed of at least two hexadecimal digits, which make up a byte, as described below:
* : is the colon that starts every Intel HEX record.
* ll is the record-length field that represents the number of data bytes (dd) in the record.
* aaaa is the address field that represents the starting address for subsequent data in the record.
* tt is the field that represents the HEX record type, which may be one of the following:
00 - data record
01 - end-of-file record
02 - extended segment address record
04 - extended linear address record
05 - start linear address record (MDK-ARM only)
* dd is a data field that represents one byte of data. A record may have multiple data bytes. The number of data bytes in the record must match the number specified by the ll field.
* cc is the checksum field that represents the checksum of the record. The checksum is calculated by summing the values of all hexadecimal digit pairs in the record modulo 256 and taking the two's complement.
Data Records
The Intel HEX file is made up of any number of data records that are terminated with a carriage return and a linefeed. Data records appear as follows:
:10246200464C5549442050524F46494C4500464C33
This record is decoded as follows:
:10246200464C5549442050524F46494C4500464C33
|||||||||||                              CC->Checksum
|||||||||DD->Data
|||||||TT->Record Type
|||AAAA->Address
|LL->Record Length
:->Colon
where:
* 10 is the number of data bytes in the record.
* 2462 is the address where the data are to be located in memory.
* 00 is the record type 00 (a data record).
* 464C...464C is the data.
* 33 is the checksum of the record.
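As a quick check of this decoding, the record can be split in Python. The variable names are illustrative; the data bytes of this particular record happen to be ASCII text:

```python
line = ":10246200464C5549442050524F46494C4500464C33"
raw = bytes.fromhex(line[1:])

length = raw[0]                              # 0x10 data bytes
address = int.from_bytes(raw[1:3], "big")    # 0x2462
record_type = raw[3]                         # 0x00, a data record
data = raw[4:-1]
check = raw[-1]                              # 0x33

print(length, f"{address:04X}", record_type, f"{check:02X}")  # 16 2462 0 33
print(data)                                  # b'FLUID PROFILE\x00FL'
```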

Extended Linear Address Records (HEX386)
Extended linear address records are also known as 32-bit address records and HEX386 records. These records contain the upper 16 bits (bits 16-31) of the data address. The extended linear address record always has two data bytes and appears as follows:
:02000004FFFFFC
where:
* 02 is the number of data bytes in the record.
* 0000 is the address field. For the extended linear address record, this field is always 0000.
* 04 is the record type 04 (an extended linear address record).
* FFFF is the upper 16 bits of the address.
* FC is the checksum of the record and is calculated as
01h + NOT(02h + 00h + 00h + 04h + FFh + FFh).
When an extended linear address record is read, the extended linear address stored in the data field is saved and is applied to subsequent records read from the Intel HEX file. The linear address remains effective until changed by another extended address record.
The absolute-memory address of a data record is obtained by adding the address field in the record to the shifted address data from the extended linear address record. The following example illustrates this process.
Address from the data record's address field      2462
Extended linear address record data field     FFFF
                                              --------
Absolute-memory address                       FFFF2462
Extended Segment Address Records (HEX86)
Extended segment address records, also known as HEX86 records, contain bits 4-19 of the data address segment. The extended segment address record always has two data bytes and appears as follows:
:020000021200EA
where:
* 02 is the number of data bytes in the record.
* 0000 is the address field. For the extended segment address record, this field is always 0000.
* 02 is the record type 02 (an extended segment address record).
* 1200 is the segment of the address.
* EA is the checksum of the record and is calculated as
01h + NOT(02h + 00h + 00h + 02h + 12h + 00h).
When an extended segment address record is read, the extended segment address stored in the data field is saved and is applied to subsequent records read from the Intel HEX file. The segment address remains effective until changed by another extended address record.
The absolute-memory address of a data record is obtained by adding the address field in the record to the shifted-address data from the extended segment address record. The following example illustrates this process.
Address from the data record's address field     2462
Extended segment address record data field      1200
                                             --------
Absolute memory address                      00014462
Start Linear Address Records (MDK-ARM only)
Start linear address records specify the start address of the application. These records contain the full linear 32 bit address. The start linear address record always has four data bytes and appears as follows:
:04000005000000CD2A
where:
* 04 is the number of data bytes in the record.
* 0000 is the address field. For the start linear address record, this field is always 0000.
* 05 is the record type 05 (a start linear address record).
* 000000CD is the 4 byte linear start address of the application.
* 2A is the checksum of the record and is calculated as
01h + NOT(04h + 00h + 00h + 05h + 00h + 00h + 00h + CDh).
The Start Linear Address specifies the address of the __main (pre-main) function but not the address of the startup code which usually calls __main after calling SystemInit(). An odd linear start address specifies that __main is compiled for the Thumb instruction set.
The Start Linear Address Record can appear anywhere in hex file. In most cases this record can be ignored because it does not contain information which is needed to program flash memory.


End-of-File (EOF) Records
An Intel HEX file must end with an end-of-file (EOF) record. This record must have the value 01 in the record type field. An EOF record always appears as follows:
:00000001FF
where:
* 00 is the number of data bytes in the record.
* 0000 is the address where the data are to be located in memory. The address in end-of-file records is meaningless and is ignored. An address of 0000h is typical.
* 01 is the record type 01 (an end-of-file record).
* FF is the checksum of the record and is calculated as
01h + NOT(00h + 00h + 00h + 01h).
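The `01h + NOT(...)` expressions used for the checksums in this section can be reproduced with a few lines of Python. `checksum_not_plus_one` is a name invented for this sketch:

```python
def checksum_not_plus_one(body: bytes) -> int:
    """01h + NOT(sum of bytes), with everything kept to 8 bits."""
    total = sum(body) & 0xFF
    return ((~total & 0xFF) + 1) & 0xFF


# Record bodies (without ':' and checksum) of the example records above.
bodies = ["02000004FFFF", "020000021200", "04000005000000CD", "00000001"]
results = [checksum_not_plus_one(bytes.fromhex(b)) for b in bodies]
print([f"{r:02X}" for r in results])   # ['FC', 'EA', '2A', 'FF']
```

Inverting the bits and adding 1 is the two's complement, so this gives the same result as subtracting the low byte of the sum from 0x100.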


Example Intel HEX File
Following is an example of a complete Intel HEX file:
:10001300AC12AD13AE10AF1112002F8E0E8F0F2244
:10000300E50B250DF509E50A350CF5081200132259
:03000000020023D8
:0C002300787FE4F6D8FD7581130200031D
:10002F00EFF88DF0A4FFEDC5F0CEA42EFEEC88F016
:04003F00A42EFE22CB
:00000001FF
--------------------------------------------------------------------------------------
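The records in the example file are not in address order, which the format allows. A rough Python sketch of converting such a file into a flat binary image is shown below. `hex_to_image` and `EXAMPLE` are hypothetical names, the sketch assumes a file that uses only 16-bit addresses (types 00 and 01), and gaps are filled with 0xFF as an assumption about erased flash:

```python
EXAMPLE = """\
:10001300AC12AD13AE10AF1112002F8E0E8F0F2244
:10000300E50B250DF509E50A350CF5081200132259
:03000000020023D8
:0C002300787FE4F6D8FD7581130200031D
:10002F00EFF88DF0A4FFEDC5F0CEA42EFEEC88F016
:04003F00A42EFE22CB
:00000001FF
"""


def hex_to_image(text: str, fill: int = 0xFF) -> bytes:
    """Place every data record at its offset in one flat image."""
    chunks = []
    for line in text.split():                  # one record per line
        raw = bytes.fromhex(line[1:])
        if raw[3] == 0x01:                     # end of file
            break
        if raw[3] == 0x00:                     # data record
            chunks.append((int.from_bytes(raw[1:3], "big"), raw[4:-1]))
    size = max((addr + len(data) for addr, data in chunks), default=0)
    image = bytearray([fill]) * size
    for addr, data in chunks:
        image[addr:addr + len(data)] = data
    return bytes(image)


image = hex_to_image(EXAMPLE)
print(len(image), image[:3].hex().upper())   # 67 020023
```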
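In an IAR-ARM project (for example an STM32L4 chip, Cortex-M4), the output format option for the program file is Intel Extended Hex, i.e. Extended Linear Address Records (HEX386). It contains type 04 records and can therefore represent larger programs.
In a Keil C51 project the output format is HEX-80. Such programs are small, and the file may contain no base address records.

The different HEX file variants you may encounter differ only in how the base address is set; they are simply extensions of the Intel Hex file format.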
"Intel Hex" is the name. "Hex-80" indicates that none of the extensions introduced for the 8086 (20-bit addresses) and/or the 386 (32-bit addresses) are used.

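Unlike the 32-bit Intel HEX variant, HEX-80 is meant for systems within a 64 KB address range, so it has no base address records. If code banking in Keil C51 pushes the program past 64 KB, the toolchain generates several HXX files, one per bank, for example:
test.h00
test.h01
test.h02
Each file presents the 64 KB code space as seen from one bank. Because the common bank is visible from every bank, the same content appears in each of the files, which the programmer needs to keep in mind.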