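1. Introduction to the SPI Protocol
SPI (Serial Peripheral Interface) is a synchronous serial interface. Compared with UART and I2C, covered in earlier posts, SPI is much faster: clock rates of tens of MHz are common, and the fastest devices reach roughly 100 MHz. The protocol itself is simple, so I will only highlight how it differs from UART and I2C. SPI is a single-master, multi-slave protocol in which each slave connects to the master through its own chip-select line, so it has no advantage over I2C in wire count. On the other hand, SPI is full duplex: it has two data lines and can read and write at the same time. There are also the higher-bandwidth Dual SPI and Quad SPI variants, which move data over two and four data lines at once; these are generally half duplex, so reads and writes cannot happen simultaneously. Compared with UART, the biggest difference is that SPI is synchronous: data travels together with the bus clock, so the receiver does not need to match a baud rate, which makes the interface more flexible.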
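2. Approach to Implementing SPI in Verilog
SPI timing is simple. To start a transfer, the master pulls the target slave's chip-select low and then shifts data onto the bus one bit per bus clock. Unlike UART, SPI usually sends the most significant bit first, and bytes follow one another back to back with no start or stop bits, until the master pulls chip-select high again to end the operation.
The first thing an SPI implementation needs is the SPI clock. It is usually derived by dividing the system clock, and the divider is parameterized so that the SPI rate can be set flexibly.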
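Next come the four SPI modes, which are selected by two parameters: clock polarity and clock phase. Modes 0 and 3 sample on the rising edge, and modes 1 and 2 sample on the falling edge; the clock idles low in modes 0 and 1 and idles high in modes 2 and 3. These two distinctions are the ones worth remembering.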
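The two distinctions above can be written down as a tiny piece of combinational logic. This is only a sketch: the module name `spi_mode_info` and its output names are made up for illustration, while `CLK_POL` and `CLK_PHA` follow the parameter names used by the master later in this post.

```verilog
// Illustrative helper (not part of the design in this post):
// derives the two properties worth remembering from CPOL/CPHA.
//
//   mode | CLK_POL | CLK_PHA | SCK idle | sampling edge
//   -----+---------+---------+----------+--------------
//     0  |    0    |    0    |   low    |   rising
//     1  |    0    |    1    |   low    |   falling
//     2  |    1    |    0    |   high   |   falling
//     3  |    1    |    1    |   high   |   rising
module spi_mode_info #(
    parameter CLK_POL = 0,      // clock polarity
    parameter CLK_PHA = 0       // clock phase
)(
    output wire sck_idle,       // SCK level while the bus is idle
    output wire sample_on_rise  // 1: sample on rising edge, 0: on falling edge
);
    // The idle level is simply the clock polarity
    assign sck_idle = (CLK_POL != 0);

    // Modes 0 and 3 are exactly the cases where polarity equals phase
    assign sample_on_rise = ((CLK_POL != 0) == (CLK_PHA != 0));
endmodule
```

Seen this way, the sampling edge depends only on whether polarity and phase agree, which is why a slave can serve modes 0 and 3 (or 1 and 2) with the same logic.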
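The previous post on I2C ended by controlling the I2C interface through registers. SPI will be handled the same way here, so the module reserves the ports those registers need.
With the parameters and ports planned, the remaining work is the timing itself. The clock divider is naturally a counter, and while counting it can also generate pulses at the clock transitions for use elsewhere. Each SPI clock period corresponds to one bit, so the bits are counted as well, and every eight bits make one byte. The data lines are handled exactly as in the UART: the bit counter selects which bit to send or receive.
The timing diagram drawn from this approach:
In the diagram, chip-select goes low one clock period early. This is not actually required; it is enough for chip-select to stay low throughout the read or write.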
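The counter-based divider described above might be sketched as follows. It is a stand-alone illustration rather than the code used in this post: the module name `sck_divider`, the `enable` input and the tick outputs are assumptions, and SCK simply starts low here instead of following a mode-dependent initial level.

```verilog
// Sketch of a parameterized SCK divider with edge pulses.
// SCK frequency = clk frequency / DIV, e.g. 50 MHz / 4 = 12.5 MHz.
module sck_divider #(
    parameter DIV = 4            // system clocks per SCK period (even, 2..256)
)(
    input  wire clk,             // system clock
    input  wire rst_n,           // synchronous reset, active low
    input  wire enable,          // high while a transfer is in progress
    output reg  sck,             // divided clock
    output wire half_tick,       // one-cycle pulse in the middle of each bit
    output wire bit_tick         // one-cycle pulse at the end of each bit
);
    reg [7:0] cnt;

    // Pulses are decoded from the counter, so they line up with SCK edges
    assign half_tick = enable & (cnt == DIV/2 - 1);
    assign bit_tick  = enable & (cnt == DIV - 1);

    // Count system clocks within one bit period
    always @(posedge clk) begin
        if (!rst_n)        cnt <= 0;
        else if (!enable)  cnt <= 0;
        else if (bit_tick) cnt <= 0;
        else               cnt <= cnt + 1;
    end

    // Toggle SCK twice per bit period
    always @(posedge clk) begin
        if (!rst_n)                    sck <= 1'b0;
        else if (half_tick | bit_tick) sck <= ~sck;
    end
endmodule
```

A bit counter can then advance on `bit_tick`, and the input line can be sampled on `half_tick`, which is the same division of labor the master below uses.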
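Everything above concerns the SPI master. On the slave side, the first things that come to mind are SPI flash, SPI displays and so on. Unlike I2C slaves, SPI slaves have no common implementation pattern, so it is hard to write a truly generic slave module. The one constant is that every slave has to receive data from the master. So for the slave I only implement a reasonably compatible module that can receive data and write data back; a specific function can then be built by modifying this module, which should still save a good deal of work.
The main requirement on the slave's parameters is compatibility with the master's four modes. SPI flash devices generally support a pair of modes at once, either modes 0 and 3 or modes 1 and 2. As noted earlier, modes 0 and 3 share rising-edge sampling, so the sampling edge becomes the parameter: with rising-edge sampling the slave is compatible with modes 0 and 3, and otherwise with modes 1 and 2. Different sampling edges clearly need two different sets of logic, so the generate construct is used to produce the right logic for each case.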
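As a minimal sketch of the generate-based selection just described, the following module picks one of two always blocks at elaboration time. The module name `edge_sampler` and the `shift` output are hypothetical; the `SAMPLE_EDGE` string parameter and the use of chip-select as an asynchronous reset mirror the slave shown later.

```verilog
// Sketch: choose the sampling edge with a string parameter and generate.
module edge_sampler #(
    parameter SAMPLE_EDGE = "rise"   // "rise": modes 0/3, "fall": modes 1/2
)(
    input  wire       sck,           // SPI clock from the master
    input  wire       cs_n,          // chip-select, also used as async reset
    input  wire       mosi,          // serial data in
    output reg  [7:0] shift          // last eight bits received, MSB first
);
    generate
        if (SAMPLE_EDGE == "rise") begin : G_RISE
            // Only this block exists in the elaborated design for "rise"
            always @(posedge sck or posedge cs_n) begin
                if (cs_n) shift <= 8'h00;
                else      shift <= {shift[6:0], mosi};
            end
        end else begin : G_FALL
            // Same logic, clocked on the opposite edge
            always @(negedge sck or posedge cs_n) begin
                if (cs_n) shift <= 8'h00;
                else      shift <= {shift[6:0], mosi};
            end
        end
    endgenerate
endmodule
```

Because the choice is made at elaboration time, only one of the two branches turns into hardware.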
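3. Verilog Implementation of the SPI Master
The main difficulty in the master is supporting all four modes. The modes are usually described in terms of what happens on the first clock transition and what happens on the second. I find that description too abstract: hard to remember and awkward to turn into logic. For modes 0 and 3, no matter which transition it is, data is always sampled on the rising edge and shifted out on the falling edge; that much is fixed. The idle level of the clock is also easy to pin down: it is the clock polarity defined by the protocol. The question of the first transition then reduces to the initial clock level during a transfer. Whether the clock idles high or low, the transfer starts from a low level. If the clock idles high, this naturally produces a falling edge first; if it idles low, the clock simply stays low. Either way, the first transition that matters is a rising edge. Modes 1 and 2 mirror this by starting the transfer from a high level, which is what the SCK_INIT parameter in the code encodes.
Putting this together, the SPI master code is as follows: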
`timescale 1ns / 1ps

module spi_master#(
    parameter CLK_PHA     = 0,  // SPI clock phase
    parameter CLK_POL     = 0,  // SPI clock polarity
    parameter SCK_DIV_CNT = 4   // SPI clock divider: system clocks per SCK period
)(
    input        clk,       // input clock
    input        rst_n,     // synchronous reset, active low
    input        op_start,  // operation start
    output       op_busy,   // operation busy
    input  [7:0] op_len,    // operation length in bytes
    input  [7:0] cs_ctrl,   // chip-select control
    output       txc,       // TX data request
    input  [7:0] txd,       // TX data in
    output       rxv,       // RX data valid
    output [7:0] rxd,       // RX data out
    output       sck,       // SPI clock
    output       mosi,      // SPI master out, slave in
    input        miso,      // SPI master in, slave out
    output [7:0] cs_n       // SPI chip selects
);

// Parameters
// SCK idle level
localparam [0:0] SCK_IDLE = CLK_POL;
// SCK initial level at the start of a transfer
localparam [0:0] SCK_INIT = CLK_PHA ? ~CLK_POL : CLK_POL;

// Registers
reg       spi_clk;
reg       master_out;
reg [3:0] clk_cnt;      // 4 bits wide, so SCK_DIV_CNT must not exceed 16
reg [3:0] bit_cnt;
reg [7:0] byte_cnt;
reg       spi_busy;
reg       data_req;
reg       data_valid;
reg [7:0] data_out;
reg       start_ff1;
reg       start_ff2;
reg       start_flag;
reg       sck_r;
reg       mosi_r;
reg [7:0] cs_n_r;

// Combinational logic
wire half_bit = (clk_cnt == SCK_DIV_CNT/2 - 1);
wire one_bit  = (clk_cnt == SCK_DIV_CNT - 1);
wire one_byte = (bit_cnt == 7);
wire one_op   = (byte_cnt == (op_len - 1)) & one_byte & one_bit;

// Module outputs
assign op_busy = spi_busy;
assign txc     = data_req;
assign rxv     = data_valid;
assign rxd     = data_out;
assign sck     = sck_r;
assign mosi    = mosi_r;
assign cs_n    = cs_n_r;

// Sequential logic
// SPI master interface outputs
always @(posedge clk) begin
    if(spi_busy) begin
        sck_r  <= spi_clk;
        mosi_r <= master_out;
        cs_n_r <= cs_ctrl;
    end else begin
        sck_r  <= SCK_IDLE;
        mosi_r <= 1'b0;
        cs_n_r <= 8'hff;
    end
end

// Two-stage synchronizer for the start signal
always @(posedge clk) begin
    start_ff1 <= op_start;
    start_ff2 <= start_ff1;
end

// Latch a rising edge of the start signal until the transfer begins
always @(posedge clk) begin
    if(start_ff1 & ~start_ff2)
        start_flag <= 1'b1;
    else if(spi_busy)
        start_flag <= 1'b0;
end

// Generate the SPI clock: while busy, toggle every half bit period
always @(posedge clk) begin
    if(!rst_n)
        spi_clk <= SCK_INIT;
    else if(spi_busy & (half_bit | one_bit))
        spi_clk <= ~spi_clk;
end

// Busy flag: set on start, cleared after op_len bytes have been transferred
always @(posedge clk) begin
    if(!rst_n)
        spi_busy <= 0;
    else if(start_flag)
        spi_busy <= 1;
    else if(one_op)
        spi_busy <= 0;
end

// System-clock counter within one bit: counts while busy, wraps after one bit
always @(posedge clk) begin
    if(!rst_n)
        clk_cnt <= 0;
    else if(spi_busy) begin
        if(one_bit)
            clk_cnt <= 0;
        else
            clk_cnt <= clk_cnt + 1;
    end
end

// Bit counter: wraps after each byte
always @(posedge clk) begin
    if(!rst_n)
        bit_cnt <= 0;
    else if(spi_busy & one_bit) begin
        if(one_byte)
            bit_cnt <= 0;
        else
            bit_cnt <= bit_cnt + 1;
    end
end

// Sample the input line in the middle of each bit, MSB first
always @(posedge clk) begin
    if(!rst_n)
        data_out <= 0;
    else if(spi_busy & half_bit) begin
        case (bit_cnt)
            0: data_out[7] <= miso;
            1: data_out[6] <= miso;
            2: data_out[5] <= miso;
            3: data_out[4] <= miso;
            4: data_out[3] <= miso;
            5: data_out[2] <= miso;
            6: data_out[1] <= miso;
            7: data_out[0] <= miso;
            default: data_out <= data_out;
        endcase
    end
end

// Drive each bit onto the output line in turn, MSB first
always @(posedge clk) begin
    if(!rst_n)
        master_out <= 0;
    else if(start_flag & !spi_busy)
        master_out <= txd[7];
    else if(spi_busy & one_bit) begin
        case (bit_cnt)
            0: master_out <= txd[6];
            1: master_out <= txd[5];
            2: master_out <= txd[4];
            3: master_out <= txd[3];
            4: master_out <= txd[2];
            5: master_out <= txd[1];
            6: master_out <= txd[0];
            7: master_out <= txd[7];   // first bit of the next byte
            default: master_out <= master_out;
        endcase
    end
end

// Pulse the data request one bit before the end of each byte
always @(posedge clk) begin
    if(!rst_n)
        data_req <= 0;
    else if((bit_cnt == 6) & one_bit)
        data_req <= 1;
    else
        data_req <= 0;
end

// Pulse data valid at the end of the last bit of each byte
always @(posedge clk) begin
    if(!rst_n)
        data_valid <= 0;
    else if(one_byte & one_bit)
        data_valid <= 1;
    else
        data_valid <= 0;
end

// Byte counter: increments after each byte, wraps when op_len is reached
always @(posedge clk) begin
    if(!rst_n)
        byte_cnt <= 0;
    else if(one_byte & one_bit) begin
        if(one_op)
            byte_cnt <= 0;
        else
            byte_cnt <= byte_cnt + 1;
    end
end

endmodule
This code closely resembles the UART implementation, with some register-configuration ports added. The SPI register code is as follows:
`timescale 1ns / 1ps

module spi_reg(
    input        clk,
    input        en,
    input        we,
    input  [7:0] din,
    output [7:0] dout,
    input  [7:0] addr,
    output       op_start,
    input        op_busy,
    output [7:0] op_len,
    output [7:0] cs_ctrl,
    input        txc,
    output [7:0] txd,
    input        rxv,
    input  [7:0] rxd
);

reg [7:0] r_data_out;

reg [7:0] r_tx_buffer [0:31];   // 0x00 - 0x1f, write only
reg [7:0] r_rx_buffer [0:31];   // 0x20 - 0x3f, read only

// Control registers, 0x40 - 0x4f
// bits 4-0: tx_buffer pointer
reg [7:0] r_tx_ctrl    = 0;
// bits 4-0: rx_buffer pointer
reg [7:0] r_rx_ctrl    = 0;
// bit 7: tx_buffer reset, self-clearing
reg [7:0] r_tx_rst     = 0;
// bit 7: rx_buffer reset, self-clearing
reg [7:0] r_rx_rst     = 0;
// bit 7: operation start, self-clearing
reg [7:0] r_op_startup = 0;
// bit 0: operation busy flag, read only
reg [7:0] r_op_status  = 0;
// bits 7-0: operation length
reg [7:0] r_op_length  = 0;
// bits 7-0: chip-select control
reg [7:0] r_cs_control = 0;
reg [7:0] r_reserve_0  = 0;
reg [7:0] r_reserve_1  = 0;
reg [7:0] r_reserve_2  = 0;
reg [7:0] r_reserve_3  = 0;
reg [7:0] r_reserve_4  = 0;
reg [7:0] r_reserve_5  = 0;
reg [7:0] r_reserve_6  = 0;
reg [7:0] r_reserve_7  = 0;

reg [7:0] start_cnt;
reg [7:0] txrst_cnt;
reg [7:0] rxrst_cnt;

assign dout     = r_data_out;
assign op_start = r_op_startup[7];
assign op_len   = r_op_length;
assign cs_ctrl  = r_cs_control;
assign txd      = r_tx_buffer[r_tx_ctrl[4:0]];

// Register read
always @(posedge clk) begin:READ_REGISTER
    if(en) begin
        if(addr[7:5] == 3'b001)
            // 0x20 - 0x3f: RX buffer
            r_data_out <= r_rx_buffer[addr[4:0]];
        else begin
            case (addr)
                8'h40: r_data_out <= r_tx_ctrl;
                8'h41: r_data_out <= r_rx_ctrl;
                8'h42: r_data_out <= r_tx_rst;
                8'h43: r_data_out <= r_rx_rst;
                8'h44: r_data_out <= r_op_startup;
                8'h45: r_data_out <= r_op_status;
                8'h46: r_data_out <= r_op_length;
                8'h47: r_data_out <= r_cs_control;
                8'h48: r_data_out <= r_reserve_0;
                8'h49: r_data_out <= r_reserve_1;
                8'h4a: r_data_out <= r_reserve_2;
                8'h4b: r_data_out <= r_reserve_3;
                8'h4c: r_data_out <= r_reserve_4;
                8'h4d: r_data_out <= r_reserve_5;
                8'h4e: r_data_out <= r_reserve_6;
                8'h4f: r_data_out <= r_reserve_7;
                default: r_data_out <= r_data_out;
            endcase
        end
    end
end

// TX buffer write. The address check keeps writes to the control
// registers (0x40 and up) from indexing outside the 32-entry buffer.
always @(posedge clk) begin:TX_BUFFER
    integer i;
    if(en & we & (addr < 8'h20)) begin
        r_tx_buffer[addr[4:0]] <= din;
    end else if(r_tx_rst[7]) begin
        for (i = 0;i < 32;i = i + 1) begin
            r_tx_buffer[i] <= 0;
        end
    end
end

// RX buffer: store each received byte at the RX pointer
always @(posedge clk) begin:RX_BUFFER
    integer j;
    if(rxv)
        r_rx_buffer[r_rx_ctrl[4:0]] <= rxd;
    else if(r_rx_rst[7]) begin
        for (j = 0;j < 32;j = j + 1) begin
            r_rx_buffer[j] <= 0;
        end
    end
end

// TX pointer: advances on each data request, wraps at 0x1f
always @(posedge clk) begin
    if(en & we & addr == 8'h40)
        r_tx_ctrl <= din;
    else if(r_tx_rst[7])
        r_tx_ctrl <= 0;
    else if(txc)
        r_tx_ctrl <= (r_tx_ctrl != 8'h1f) ? r_tx_ctrl + 1 : 0;
end

// RX pointer: advances on each received byte, wraps at 0x1f
always @(posedge clk) begin
    if(en & we & addr == 8'h41)
        r_rx_ctrl <= din;
    else if(r_rx_rst[7])
        r_rx_ctrl <= 0;
    else if(rxv)
        r_rx_ctrl <= (r_rx_ctrl != 8'h1f) ? r_rx_ctrl + 1 : 0;
end

// Self-clearing bits: bit 7 is cleared once its counter saturates
always @(posedge clk) begin
    if(en & we & addr == 8'h42)
        r_tx_rst <= din;
    else if(&txrst_cnt)
        r_tx_rst <= r_tx_rst & 8'b0111_1111;
end

always @(posedge clk) begin
    if(en & we & addr == 8'h43)
        r_rx_rst <= din;
    else if(&rxrst_cnt)
        r_rx_rst <= r_rx_rst & 8'b0111_1111;
end

always @(posedge clk) begin
    if(en & we & addr == 8'h44)
        r_op_startup <= din;
    else if(&start_cnt)
        r_op_startup <= r_op_startup & 8'b0111_1111;
end

always @(posedge clk) begin
    r_op_status <= {7'b0000000,op_busy};
end

// Plain read/write registers
always @(posedge clk) begin
    if(en & we) begin
        case(addr)
            8'h46: r_op_length  <= din;
            8'h47: r_cs_control <= din;
            8'h48: r_reserve_0  <= din;
            8'h49: r_reserve_1  <= din;
            8'h4a: r_reserve_2  <= din;
            8'h4b: r_reserve_3  <= din;
            8'h4c: r_reserve_4  <= din;
            8'h4d: r_reserve_5  <= din;
            8'h4e: r_reserve_6  <= din;
            8'h4f: r_reserve_7  <= din;
        endcase
    end
end

initial begin:TX_BUF_INIT
    integer n;
    for(n = 0;n < 32;n = n + 1) begin
        r_tx_buffer[n] = 0;
    end
end

initial begin:RX_BUF_INIT
    integer m;
    for(m = 0;m < 32;m = m + 1) begin
        r_rx_buffer[m] = 0;
    end
end

// Saturating counters that time the self-clearing bits
always @(posedge clk) begin
    if(r_op_startup[7])
        start_cnt <= (&start_cnt) ? start_cnt : start_cnt + 1;
    else
        start_cnt <= 0;
end

always @(posedge clk) begin
    if(r_tx_rst[7])
        txrst_cnt <= (&txrst_cnt) ? txrst_cnt : txrst_cnt + 1;
    else
        txrst_cnt <= 0;
end

always @(posedge clk) begin
    if(r_rx_rst[7])
        rxrst_cnt <= (&rxrst_cnt) ? rxrst_cnt : rxrst_cnt + 1;
    else
        rxrst_cnt <= 0;
end

endmodule
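The SPI master above is the most common kind, with one output data line and one input data line. Real projects also often use QSPI, which transfers data over four lines. A recent project of mine needed QSPI, so I derived a QSPI master from the SPI master above. This module is mainly intended for streaming writes, so it does not handle both directions in the multi-line modes; supporting four-wire reads as well would take a few small changes. The code is as follows: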
`timescale 1ns / 1ps

module qspi_master#(
    parameter CLK_PHA     = 0,  // SPI clock phase
    parameter CLK_POL     = 0,  // SPI clock polarity
    parameter SCK_DIV_CNT = 4   // SPI clock divider: system clocks per SCK period
)(
    input        clk,        // input clock
    input        rst_n,      // synchronous reset, active low
    input        empty_n,    // TX buffer empty flag, active low
    input  [1:0] wire_mode,  // SPI line mode, 0: single, 1: dual, 2: quad
    output       txc,        // TX data request
    input  [7:0] txd,        // TX data in
    output       rxv,        // RX data valid
    output [7:0] rxd,        // RX data out
    output       sck,        // SPI clock
    output       cs_n,       // SPI chip select
    output       sd_0,       // MOSI in single mode; output line in dual/quad mode
    inout        sd_1,       // MISO in single mode; output line in dual/quad mode
    output       sd_2,       // output line in quad mode
    output       sd_3        // output line in quad mode
);

// Parameters
// SCK idle level
localparam [0:0] SCK_IDLE = CLK_POL;
// SCK initial level at the start of a transfer
localparam [0:0] SCK_INIT = CLK_PHA ? ~CLK_POL : CLK_POL;

// Registers
reg       spi_clk;
reg [3:0] clk_cnt;
reg [3:0] bit_cnt;
reg       spi_busy;
reg       data_req;
reg       data_valid;
reg [7:0] data_out;
reg       start_ff1;
reg       start_ff2;
reg       start_flag;
reg       sck_r;
reg       cs_n_r;
reg       sd_0_r;
reg       sd_1_r;
reg       sd_2_r;
reg       sd_3_r;
reg       out_0;
reg       out_1;
reg       out_2;
reg       out_3;

// Combinational logic
wire miso     = sd_1;
wire half_bit = (clk_cnt == SCK_DIV_CNT/2 - 1);
wire one_bit  = (clk_cnt == SCK_DIV_CNT - 1);
// Last SCK cycle of a byte: 8, 4 or 2 cycles per byte depending on the mode
wire one_byte = (wire_mode == 0) ? (bit_cnt == 7) :
                (wire_mode == 1) ? (bit_cnt == 3) :
                (wire_mode == 2) ? (bit_cnt == 1) : 1'b0;
// Second-to-last SCK cycle of a byte
wire nxt_byte = (wire_mode == 0) ? (bit_cnt == 6) :
                (wire_mode == 1) ? (bit_cnt == 2) :
                (wire_mode == 2) ? (bit_cnt == 0) : 1'b0;

// Module outputs
assign txc  = data_req;
assign rxv  = data_valid;
assign rxd  = data_out;
assign sck  = sck_r;
assign cs_n = cs_n_r;
assign sd_0 = sd_0_r;
assign sd_1 = ((wire_mode == 1)|(wire_mode == 2)) ? sd_1_r : 1'bz;
assign sd_2 = sd_2_r;
assign sd_3 = sd_3_r;

// Sequential logic
// SPI master interface outputs
always @(posedge clk) begin
    if(spi_busy) begin
        sck_r  <= spi_clk;
        cs_n_r <= 1'b0;
        sd_0_r <= out_0;
        sd_1_r <= out_1;
        sd_2_r <= out_2;
        sd_3_r <= out_3;
    end else begin
        sck_r  <= SCK_IDLE;
        cs_n_r <= 1'b1;
        sd_0_r <= 1'b0;
        sd_1_r <= 1'b0;
        sd_2_r <= 1'b0;
        sd_3_r <= 1'b0;
    end
end

// Two-stage synchronizer for the start signal (TX buffer becomes non-empty)
always @(posedge clk) begin
    start_ff1 <= empty_n;
    start_ff2 <= start_ff1;
end

always @(posedge clk) begin
    if(start_ff1 & ~start_ff2)
        start_flag <= 1'b1;
    else if(spi_busy)
        start_flag <= 1'b0;
end

// Generate the SPI clock: while busy, toggle every half bit period
always @(posedge clk) begin
    if(!rst_n)
        spi_clk <= SCK_INIT;
    else if(spi_busy & (half_bit | one_bit))
        spi_clk <= ~spi_clk;
end

// Busy flag: set on start, cleared at a byte boundary once the TX buffer is empty
always @(posedge clk) begin
    if(!rst_n)
        spi_busy <= 0;
    else if(start_flag)
        spi_busy <= 1;
    else if(one_bit & one_byte & !empty_n)
        spi_busy <= 0;
end

// System-clock counter within one bit: counts while busy, wraps after one bit
always @(posedge clk) begin
    if(!rst_n)
        clk_cnt <= 0;
    else if(spi_busy) begin
        if(one_bit)
            clk_cnt <= 0;
        else
            clk_cnt <= clk_cnt + 1;
    end
end

// SCK cycle counter: wraps after each byte
always @(posedge clk) begin
    if(!rst_n)
        bit_cnt <= 0;
    else if(spi_busy & one_bit) begin
        if(one_byte)
            bit_cnt <= 0;
        else
            bit_cnt <= bit_cnt + 1;
    end
end

// Single mode only: sample the input line in the middle of each bit
always @(posedge clk) begin
    if(!rst_n)
        data_out <= 0;
    else if((wire_mode == 0) & spi_busy & half_bit) begin
        case (bit_cnt)
            0: data_out[7] <= miso;
            1: data_out[6] <= miso;
            2: data_out[5] <= miso;
            3: data_out[4] <= miso;
            4: data_out[3] <= miso;
            5: data_out[2] <= miso;
            6: data_out[1] <= miso;
            7: data_out[0] <= miso;
            default: data_out <= data_out;
        endcase
    end
end

// Drive the bits onto the output lines in turn
always @(posedge clk) begin
    if(!rst_n)
        out_0 <= 0;
    else if(start_flag & !spi_busy)
        out_0 <= txd[7];
    else if((wire_mode == 0) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_0 <= txd[6];
            1: out_0 <= txd[5];
            2: out_0 <= txd[4];
            3: out_0 <= txd[3];
            4: out_0 <= txd[2];
            5: out_0 <= txd[1];
            6: out_0 <= txd[0];
            7: out_0 <= txd[7];
            default: out_0 <= out_0;
        endcase
    end else if((wire_mode == 1) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_0 <= txd[5];
            1: out_0 <= txd[3];
            2: out_0 <= txd[1];
            3: out_0 <= txd[7];
            default: out_0 <= out_0;
        endcase
    end else if((wire_mode == 2) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_0 <= txd[3];
            1: out_0 <= txd[7];
            default: out_0 <= out_0;
        endcase
    end
end

always @(posedge clk) begin
    if(!rst_n)
        out_1 <= 0;
    else if((wire_mode != 0) & start_flag & !spi_busy)
        out_1 <= txd[6];
    else if((wire_mode == 1) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_1 <= txd[4];
            1: out_1 <= txd[2];
            2: out_1 <= txd[0];
            3: out_1 <= txd[6];
            default: out_1 <= out_1;
        endcase
    end else if((wire_mode == 2) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_1 <= txd[2];
            1: out_1 <= txd[6];
            default: out_1 <= out_1;
        endcase
    end
end

always @(posedge clk) begin
    if(!rst_n)
        out_2 <= 0;
    else if((wire_mode == 2) & start_flag & !spi_busy)
        out_2 <= txd[5];
    else if((wire_mode == 2) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_2 <= txd[1];
            1: out_2 <= txd[5];
            default: out_2 <= out_2;
        endcase
    end
end

always @(posedge clk) begin
    if(!rst_n)
        out_3 <= 0;
    else if((wire_mode == 2) & start_flag & !spi_busy)
        out_3 <= txd[4];
    else if((wire_mode == 2) & spi_busy & one_bit) begin
        case (bit_cnt)
            0: out_3 <= txd[0];
            1: out_3 <= txd[4];
            default: out_3 <= out_3;
        endcase
    end
end

// Pulse the data request one SCK cycle before the end of each byte
always @(posedge clk) begin
    if(!rst_n)
        data_req <= 0;
    else if(nxt_byte & one_bit)
        data_req <= 1;
    else
        data_req <= 0;
end

// Pulse data valid at the end of the last SCK cycle of each byte
always @(posedge clk) begin
    if(!rst_n)
        data_valid <= 0;
    else if(one_byte & one_bit)
        data_valid <= 1;
    else
        data_valid <= 0;
end

endmodule
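To make the quad-mode bit placement easier to see, here is a small combinational sketch of the same mapping the module above uses: a byte goes out in two SCK cycles, high nibble first, with the highest bit of each nibble on sd_0. The module name `quad_lane_map` and the `second` input are made up for illustration.

```verilog
// Sketch of the quad-mode lane mapping used by qspi_master above.
module quad_lane_map (
    input  wire [7:0] txd,     // byte to send
    input  wire       second,  // 0: first SCK cycle of the byte, 1: second
    output wire [3:0] sd       // {sd_3, sd_2, sd_1, sd_0}
);
    // High nibble goes out first, low nibble second
    wire [3:0] nib = second ? txd[3:0] : txd[7:4];

    // sd_0 carries the highest bit of the nibble, sd_3 the lowest:
    //   first cycle : sd_0 = txd[7], sd_1 = txd[6], sd_2 = txd[5], sd_3 = txd[4]
    //   second cycle: sd_0 = txd[3], sd_1 = txd[2], sd_2 = txd[1], sd_3 = txd[0]
    assign sd = {nib[0], nib[1], nib[2], nib[3]};
endmodule
```

Whether this ordering suits a given device is worth checking against its datasheet: QSPI flash parts such as the W25Q series put the highest bit on IO3 rather than IO0, in which case the lanes would need to be reversed.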
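That concludes the master. Next comes the slave implementation.
4. Verilog Implementation of the SPI Slave
There are two ways to implement the slave. The first uses SCK directly as the clock; the second oversamples SCK with a fast clock to detect its edges. The first gives a more compact design and the second a more robust one. For I2C, SCL runs at a few MHz at most, so the second approach is easy to apply. SPI is different: SCK may well run at tens of MHz or even around 100 MHz, in which case the sampling clock would need to be at least 200 MHz or more, which is already a high system frequency for an FPGA. That would be putting the cart before the horse, so the SPI slave here uses the first approach.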
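For comparison, the second approach (not the one used in this post) usually starts from an edge detector like the sketch below. The module and signal names are assumptions, and the fast clock is assumed to run several times faster than SCK.

```verilog
// Sketch of SCK edge detection by oversampling with a fast system clock.
module sck_edge_detect (
    input  wire clk,       // fast system clock
    input  wire sck,       // SPI clock, asynchronous to clk
    output wire sck_rise,  // one-clk-cycle pulse after a rising edge on sck
    output wire sck_fall   // one-clk-cycle pulse after a falling edge on sck
);
    // sync[0] and sync[1] synchronize sck into the clk domain;
    // sync[2] keeps the previous synchronized value for comparison
    reg [2:0] sync = 3'b000;

    always @(posedge clk) begin
        sync <= {sync[1:0], sck};
    end

    // sync[2] is the older sample, sync[1] the newer one
    assign sck_rise = (sync[2:1] == 2'b01);
    assign sck_fall = (sync[2:1] == 2'b10);
endmodule
```

All slave logic can then run on `clk` and use these pulses as enables, at the cost of the high sampling clock discussed above.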
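As for function, this slave mainly mimics a flash device: it receives a few bytes and then returns data. I also pre-initialized the first few bytes of its internal RAM so that the returned data can be checked. The code is as follows: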
module spi_slave#(
    parameter SAMPLE_EDGE = "rise"  // "rise" or "fall"; data is updated on the opposite edge
)(
    input  wire sck,   // SPI serial clock
    input  wire cs_n,  // SPI chip select
    input  wire mosi,  // SPI slave in
    output wire miso   // SPI slave out
);

localparam RX_BYTE_CNT = 4;  // bytes to receive, typically instruction + address

reg [7:0] ram [0:255];  // internal RAM
reg [7:0] addr;         // RAM address
reg       rx_valid;     // received byte valid
reg [7:0] rx_buffer;    // receive buffer
reg [7:0] des_reg;      // destination register for received data
reg [2:0] bit_cnt;      // bit counter
reg [3:0] byte_cnt;     // byte counter
reg       slave_out;    // slave output
reg       state;        // state, 0: receive, 1: transmit

// One full byte counted
wire one_byte = &bit_cnt;
// Last byte of the receive phase
wire rx_done  = byte_cnt == RX_BYTE_CNT - 1;
// Receiving
wire rx_busy  = state == 1'b0;
// Transmitting
wire tx_busy  = state == 1'b1;
// Transmit buffer, points into the internal RAM
wire [7:0] tx_buffer = ram[addr];

assign miso = slave_out;

// Start in the receive state; switch to transmit once reception is complete
always @(posedge sck or posedge cs_n) begin
    if(cs_n)
        state <= 1'b0;
    else if(rx_done & one_byte)
        state <= 1'b1;
end

generate
// Rising-edge sampling covers modes 0 and 3.
// The remaining logic is simple and left uncommented.
if(SAMPLE_EDGE == "rise") begin:MODE_0_3

    always @(posedge sck or posedge cs_n) begin
        if(cs_n)
            addr <= 1'b0;
        else if(tx_busy & one_byte)
            addr <= addr + 1;
    end

    always @(posedge sck or posedge cs_n) begin
        if(cs_n)
            des_reg <= 1'b0;
        else if(rx_valid)
            des_reg <= rx_buffer;
    end

    always @(posedge sck or posedge cs_n) begin
        if(cs_n)
            bit_cnt <= 0;
        else
            bit_cnt <= bit_cnt + 1;
    end

    always @(posedge sck or posedge cs_n) begin
        if(cs_n)
            byte_cnt <= 0;
        else if(one_byte)
            byte_cnt <= (rx_done) ? byte_cnt : byte_cnt + 1;
    end

    always @(posedge sck or posedge cs_n) begin
        if(cs_n) begin
            rx_valid  <= 0;
            rx_buffer <= 0;
        end else if(rx_busy) begin
            rx_valid <= one_byte;
            case (bit_cnt)
                0: rx_buffer[7] <= mosi;
                1: rx_buffer[6] <= mosi;
                2: rx_buffer[5] <= mosi;
                3: rx_buffer[4] <= mosi;
                4: rx_buffer[3] <= mosi;
                5: rx_buffer[2] <= mosi;
                6: rx_buffer[1] <= mosi;
                7: rx_buffer[0] <= mosi;
                default: rx_buffer <= rx_buffer;
            endcase
        end
    end

    always @(negedge sck or posedge cs_n) begin
        if(cs_n)
            slave_out <= 0;
        else if(tx_busy) begin
            case (bit_cnt)
                0: slave_out <= tx_buffer[7];
                1: slave_out <= tx_buffer[6];
                2: slave_out <= tx_buffer[5];
                3: slave_out <= tx_buffer[4];
                4: slave_out <= tx_buffer[3];
                5: slave_out <= tx_buffer[2];
                6: slave_out <= tx_buffer[1];
                7: slave_out <= tx_buffer[0];
                default: slave_out <= slave_out;
            endcase
        end
    end

// Falling-edge sampling covers modes 1 and 2
end else if(SAMPLE_EDGE == "fall") begin:MODE_1_2

    always @(negedge sck or posedge cs_n) begin
        if(cs_n)
            addr <= 1'b0;
        else if(tx_busy & one_byte)
            addr <= addr + 1;
    end

    always @(negedge sck or posedge cs_n) begin
        if(cs_n)
            des_reg <= 1'b0;
        else if(rx_busy & one_byte)
            des_reg <= rx_buffer;
    end

    always @(negedge sck or posedge cs_n) begin
        if(cs_n)
            bit_cnt <= 0;
        else
            bit_cnt <= bit_cnt + 1;
    end

    always @(negedge sck or posedge cs_n) begin
        if(cs_n)
            byte_cnt <= 0;
        else if(one_byte)
            byte_cnt <= (rx_done) ? byte_cnt : byte_cnt + 1;
    end

    always @(negedge sck or posedge cs_n) begin
        if(cs_n) begin
            rx_valid  <= 0;
            rx_buffer <= 0;
        end else if(rx_busy) begin
            rx_valid <= one_byte;
            case (bit_cnt)
                0: rx_buffer[7] <= mosi;
                1: rx_buffer[6] <= mosi;
                2: rx_buffer[5] <= mosi;
                3: rx_buffer[4] <= mosi;
                4: rx_buffer[3] <= mosi;
                5: rx_buffer[2] <= mosi;
                6: rx_buffer[1] <= mosi;
                7: rx_buffer[0] <= mosi;
                default: rx_buffer <= rx_buffer;
            endcase
        end
    end

    always @(posedge sck or posedge cs_n) begin
        if(cs_n)
            slave_out <= 0;
        else if(tx_busy) begin
            case (bit_cnt)
                0: slave_out <= tx_buffer[7];
                1: slave_out <= tx_buffer[6];
                2: slave_out <= tx_buffer[5];
                3: slave_out <= tx_buffer[4];
                4: slave_out <= tx_buffer[3];
                5: slave_out <= tx_buffer[2];
                6: slave_out <= tx_buffer[1];
                7: slave_out <= tx_buffer[0];
                default: slave_out <= slave_out;
            endcase
        end
    end

end
endgenerate

// Initialize the RAM contents
initial begin:ram_initialize
    integer i;
    ram[0] = 8'h53;
    ram[1] = 8'h8b;
    ram[2] = 8'h9c;
    ram[3] = 8'hea;
    for (i = 4;i < 256;i = i + 1) begin
        ram[i] = 0;
    end
end

endmodule
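All received data is written to des_reg. In a real design, replace des_reg with another register or a memory, chosen according to whatever condition applies, and the data will be stored correctly.
I normally prefer synchronous resets, but since this slave is clocked by SCK, a synchronous reset is not possible. Chip-select is used instead as an asynchronous reset for the whole module.
With both master and slave implemented, the next step is to simulate the two modules together.
5. Simulating the SPI Master and Slave
Like the I2C master in the previous post, the SPI master is controlled through registers. Reading and writing each register through VIO, as was done last time, would work, but it is tedious. Register access is normally left to the PS side, so instead we can write a simple MCU-like module that reads and writes registers under instruction control. Every time the MCU is reset it runs through the pre-written instructions from start to finish, and there is no need to operate the registers one by one.
The MCU code is as follows: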
`timescale 1ns / 1ps

module mcu(
    input         clk,
    input         rst_n,
    input  [23:0] ir,
    output [7:0]  pc,
    output        wr_en,
    output [7:0]  wr_data,
    output        rd_en,
    input  [7:0]  rd_data,
    output        en,
    output        we,
    output [7:0]  din,
    input  [7:0]  dout,
    output [7:0]  addr
);

localparam INITIAL = 8'b1111_0000;
localparam FIFO_WR = 8'b0000_0001;
localparam FIFO_RD = 8'b0000_0010;
localparam RAM_WR  = 8'b0000_1001;
localparam RAM_RD  = 8'b0000_1010;
localparam JUMP    = 8'b1000_0000;

reg        init;
reg        run;
// [23:16]:opcode [15:8]:data [7:0]:addr
reg [23:0] r_instr;
reg [7:0]  r_pcntr;
reg [7:0]  op_code;
reg [7:0]  op_data;
reg [7:0]  op_addr;
reg        fifo_we;
reg [7:0]  fifo_wd;
reg        fifo_re;
reg [7:0]  fifo_rd;
reg        ram_en;
reg        ram_we;
reg [7:0]  ram_di;
reg [7:0]  ram_ad;

always @(posedge clk) begin
    if(!rst_n) begin
        init <= 1'b1;
        run  <= 1'b0;
    end else if(&r_pcntr) begin
        init <= 1'b0;
        run  <= 1'b0;
    end else if(init) begin
        init <= 1'b0;
        run  <= 1'b1;
    end
end

// fetch
always @(posedge clk) begin
    if(init) begin
        r_instr <= 0;
        r_pcntr <= 0;
    end else if(run) begin
        r_instr <= ir;
        r_pcntr <= ir[23:16] == JUMP ? ir[7:0] : r_pcntr + 1;
    end
end

// decode
always @(posedge clk) begin
    if(init) begin
        op_code <= 0;
        op_data <= 0;
        op_addr <= 0;
    end else if(run) begin
        op_code <= r_instr[23:16];
        op_data <= r_instr[15:8];
        op_addr <= r_instr[7:0];
    end
end

// execute
always @(posedge clk) begin
    if(init) begin
        fifo_we <= 0;
        fifo_wd <= 0;
        fifo_re <= 0;
        fifo_rd <= 0;
        ram_en  <= 0;
        ram_we  <= 0;
        ram_di  <= 0;
        ram_ad  <= 0;
    end else if(run) begin
        case (op_code)
            INITIAL:begin
                fifo_we <= 0;
                fifo_wd <= 0;
                fifo_re <= 0;
                fifo_rd <= 0;
                ram_en  <= 0;
                ram_we  <= 0;
                ram_di  <= 0;
                ram_ad  <= 0;
            end
            FIFO_WR:begin
                fifo_we <= 1'b1;
                fifo_wd <= op_data;
            end
            FIFO_RD:begin
                fifo_re <= 1'b1;
            end
            RAM_WR:begin
                ram_en <= 1'b1;
                ram_we <= 1'b1;
                ram_di <= op_data;
                ram_ad <= op_addr;
            end
            RAM_RD:begin
                ram_en <= 1'b1;
                ram_ad <= op_addr;
            end
            default:begin
                fifo_we <= 1'b0;
                fifo_re <= 1'b0;
                ram_en  <= 1'b0;
                ram_we  <= 1'b0;
            end
        endcase
    end
end

assign pc      = r_pcntr;
assign wr_en   = fifo_we;
assign wr_data = fifo_wd;
assign rd_en   = fifo_re;
assign en      = ram_en;
assign we      = ram_we;
assign din     = ram_di;
assign addr    = ram_ad;

endmodule
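This is an extremely simple three-stage pipeline (fetch, decode, execute). It can only perform RAM and FIFO reads and writes, which is enough for driving the registers.
Vivado has a handy feature: it can synthesize initial blocks to initialize registers. We can use this to preload the MCU's instructions by writing a simple ROM and initializing the contents of each address: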
`timescale 1ns / 1ps

module irom(
    input         clk,
    input         en,
    input  [7:0]  addr,
    output [23:0] dout
);

reg [23:0] r_dout;

(* ROM_STYLE = "distributed" *) reg [23:0] rom [0:255];

assign dout = r_dout;

always @(posedge clk) begin
    if(en)
        r_dout <= rom[addr];
end

localparam INITIAL = 8'b1111_0000;
localparam FIFO_WR = 8'b0000_0001;
localparam FIFO_RD = 8'b0000_0010;
localparam RAM_WR  = 8'b0000_1001;
localparam RAM_RD  = 8'b0000_1010;
localparam JUMP    = 8'b1000_0000;

localparam OPCODE  = 8'b0000_0000;
localparam OPDATA  = 8'b0000_0000;
localparam OPADDR  = 8'b0000_0000;

integer k;

initial begin
    // Every address not listed below holds {OPCODE,OPDATA,OPADDR}
    for(k = 0;k < 256;k = k + 1) begin
        rom[k] = {OPCODE,OPDATA,OPADDR};
    end

    rom[0]  = {INITIAL,OPDATA,OPADDR};
    rom[2]  = {RAM_WR,8'h85,8'h00};
    rom[3]  = {RAM_WR,8'h90,8'h01};
    rom[4]  = {RAM_WR,8'h4a,8'h02};
    rom[5]  = {RAM_WR,8'h5c,8'h03};
    rom[6]  = {RAM_WR,8'hff,8'h04};
    rom[7]  = {RAM_WR,8'hff,8'h05};
    rom[8]  = {RAM_WR,8'hff,8'h06};
    rom[9]  = {RAM_WR,8'hff,8'h07};
    rom[11] = {RAM_WR,8'h08,8'h46};
    rom[13] = {RAM_WR,8'b11111110,8'h47};
    rom[15] = {RAM_WR,8'h00,8'h40};
    rom[17] = {RAM_WR,8'h00,8'h41};
    rom[19] = {RAM_WR,8'h80,8'h44};
end

endmodule
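This starts to feel like writing software, only with an instruction set of my own.
The expected result is that the master writes the four bytes 85 90 4a 5c to the slave, and then writes four ff bytes while reading back the four bytes 53 8b 9c ea. The simulation waveforms follow:
First, look at what the MCU does after power-on reset: it fetches instructions starting from address 0 and executes them in order, reading and writing the registers at the corresponding addresses. Once the registers have started the SPI operation, the SPI waveform is:
The result matches expectations.
6. Board-Level Verification of the SPI Master
The last step is verification on the board. The SPI master will be used to read the ID of a NOR flash, a W25Q64. According to the datasheet, the read-ID instruction is 0x90, followed by a 24-bit address of the form 0xXXXX00, after which the device returns the manufacturer ID and the device ID:
The datasheet gives the manufacturer ID and device ID as 0xEF and 0x16 respectively.
In irom, change the instructions to write 90 00 00 00 and leave everything else unchanged. Then instantiate irom, mcu and the SPI master in the top level (here through m_spi_top, a wrapper whose source is not listed in this post), run synthesis and implementation, and generate the bitstream.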
`timescale 1ns / 1ps

module top(
    input  sys_clk,
    input  sys_rst_n,
    output led,
    output sck,
    output mosi,
    input  miso,
    output ss
);

wire        clk_50M;
wire [23:0] ir;
wire [7:0]  pc;
wire        en;
wire        we;
wire [7:0]  din;
wire [7:0]  dout;
wire [7:0]  addr;
wire [7:0]  cs_n;

assign ss  = cs_n[0];
assign led = 1'b1;

BUFG BUFG_inst(
    .I(sys_clk),
    .O(clk_50M)
);

irom irom_inst(
    .clk(clk_50M),
    .en(1'b1),
    .addr(pc),
    .dout(ir)
);

mcu mcu_inst(
    .clk(clk_50M),
    .rst_n(sys_rst_n),
    .ir(ir),
    .pc(pc),
    .en(en),
    .we(we),
    .din(din),
    .dout(dout),
    .addr(addr)
);

m_spi_top m_spi_top_inst(
    .clk(clk_50M),
    .rst_n(sys_rst_n),
    .en(en),
    .we(we),
    .din(din),
    .dout(dout),
    .addr(addr),
    .sck(sck),
    .mosi(mosi),
    .miso(miso),
    .cs_n(cs_n)
);

ila_spi ila_inst(
    .clk(clk_50M),
    .probe0({sck,ss,mosi,miso})
);

endmodule
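The ROM change described above (write 90 00 00 00, everything else unchanged) might look like the sketch below. The module name `irom_read_id` and the `NOP` constant are hypothetical; the opcodes and register addresses are the ones used by the mcu and spi_reg modules in this post, and the program is written under the assumption that those modules are used as listed.

```verilog
`timescale 1ns / 1ps

// Sketch: instruction ROM variant that reads the flash ID with command 0x90.
module irom_read_id(
    input         clk,
    input         en,
    input  [7:0]  addr,
    output [23:0] dout
);
    localparam INITIAL = 8'b1111_0000;
    localparam RAM_WR  = 8'b0000_1001;
    localparam NOP     = 8'b0000_0000;   // any opcode the MCU does not decode

    reg [23:0] r_dout;
    reg [23:0] rom [0:255];
    integer    k;

    assign dout = r_dout;

    always @(posedge clk) begin
        if (en) r_dout <= rom[addr];
    end

    initial begin
        // Fill the whole ROM with no-ops, then place the program
        for (k = 0; k < 256; k = k + 1) rom[k] = {NOP, 8'h00, 8'h00};

        rom[0]  = {INITIAL, 8'h00, 8'h00};
        // TX buffer 0x00-0x03: command 0x90 followed by a 24-bit zero address
        rom[2]  = {RAM_WR, 8'h90, 8'h00};
        rom[3]  = {RAM_WR, 8'h00, 8'h01};
        rom[4]  = {RAM_WR, 8'h00, 8'h02};
        rom[5]  = {RAM_WR, 8'h00, 8'h03};
        // TX buffer 0x04-0x07: filler bytes clocked out while the ID comes back
        rom[6]  = {RAM_WR, 8'hff, 8'h04};
        rom[7]  = {RAM_WR, 8'hff, 8'h05};
        rom[8]  = {RAM_WR, 8'hff, 8'h06};
        rom[9]  = {RAM_WR, 8'hff, 8'h07};
        rom[11] = {RAM_WR, 8'h08, 8'h46};          // operation length: 8 bytes
        rom[13] = {RAM_WR, 8'b1111_1110, 8'h47};   // chip-select 0 active
        rom[15] = {RAM_WR, 8'h00, 8'h40};          // TX pointer = 0
        rom[17] = {RAM_WR, 8'h00, 8'h41};          // RX pointer = 0
        rom[19] = {RAM_WR, 8'h80, 8'h44};          // start the operation
    end
endmodule
```

Of the eight bytes transferred, the first four carry the command and address, so the ID bytes are expected in the last four received bytes.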
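After downloading the bitstream to the FPGA, the captured waveform is:
As shown, the ID is read correctly. Because the operation length is 8, the two ID bytes are read out a second time.
That finally wraps up the three common low-speed interfaces. Next I plan to cover video interfaces, including VGA timing, BT1120 timing, DVI, HDMI, SDI, MIPI CSI and DSI. Stay tuned!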