Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Customized IP Guidelines #3

Open
BernardoMadeira opened this issue Jul 20, 2023 · 3 comments
Open

Customized IP Guidelines #3

BernardoMadeira opened this issue Jul 20, 2023 · 3 comments

Comments

@BernardoMadeira
Copy link

BernardoMadeira commented Jul 20, 2023

Hello everyone.

I am a user of the OscimpDigital framework and I would like to start by thanking you guys all the good work.

I use a 14b RedPitaya to do some signal processing on my sensors and wanted to implement a biquad structured IIR lpf with coefficients that could be accessed externally by the user and injected through the add_const IP. There would be no need to add these to the libOscimpDig libraries since I could change the coefficients in such a simple way like in the FIR IP.

I saw an (unfinished?) implementation of the IIR lpf (https://github.com/oscimp/fpga_ip/blob/master/iir_lpf_complex/hdl/iir_lpf_complex.vhd). I have tried to adapt this script (annexed) for my implementation, but when I tried connecting it to the other IPs the connections did not match.


library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity iir_lpf_complex is 
	generic (
		DATA_WIDTH : natural := 16
	);
	port (
		data_i_i   : in std_logic_vector(DATA_WIDTH-1 downto 0);
		data_q_i   : in std_logic_vector(DATA_WIDTH-1 downto 0);
		data_en_i  : in std_logic;
		data_clk_i : in std_logic;
		data_rst_i : in std_logic;
		data_i_o   : out std_logic_vector(DATA_WIDTH-1 downto 0);
		data_q_o   : out std_logic_vector(DATA_WIDTH-1 downto 0);
		data_en_o  : out std_logic;
		data_clk_o : out std_logic;
		data_rst_o : out std_logic;
		a0         : in signed(DATA_WIDTH-1 downto 0); -- Biquad filter coefficients
		a1         : in signed(DATA_WIDTH-1 downto 0);
		b0         : in signed(DATA_WIDTH-1 downto 0);
		b1         : in signed(DATA_WIDTH-1 downto 0);
		b2         : in signed(DATA_WIDTH-1 downto 0)
	);
end entity;

architecture bhv of iir_lpf_complex is
	signal data_i_1, data_i_2, data_i_3 : signed(DATA_WIDTH+1 downto 0) := (others => '0');
	signal data_q_1, data_q_2, data_q_3 : signed(DATA_WIDTH+1 downto 0) := (others => '0');
	signal en_i_1 : std_logic := '0';
begin
	data_i_2 <= to_signed(0, DATA_WIDTH+2) when data_rst_i = '1' else data_i_1 - b1 * data_i_3 - b2 * data_i_2;
	data_q_2 <= to_signed(0, DATA_WIDTH+2) when data_rst_i = '1' else data_q_1 - b1 * data_q_3 - b2 * data_q_2;
	data_i_3 <= to_signed(0, DATA_WIDTH+2) when data_rst_i = '1' else data_i_1 - a1 * data_i_3 - a0 * data_i_2;
	data_q_3 <= to_signed(0, DATA_WIDTH+2) when data_rst_i = '1' else data_q_1 - a1 * data_q_3 - a0 * data_q_2;

	data_i_o <= std_logic_vector(data_i_1(DATA_WIDTH-1 downto 0));
	data_q_o <= std_logic_vector(data_q_1(DATA_WIDTH-1 downto 0));
	data_en_o <= en_i_1;
	data_clk_o <= data_clk_i;
	data_rst_o <= data_rst_i;

	process(data_clk_i) is
	begin 
		if rising_edge(data_clk_i) then
			en_i_1 <= data_en_i;
			if (data_rst_i = '1') then 
				en_i_1 <= '0';
				data_i_1 <= (others => '0');
				data_q_1 <= (others => '0');
				data_i_2 <= (others => '0');
				data_q_2 <= (others => '0');
				data_i_3 <= (others => '0');
				data_q_3 <= (others => '0');
			end if;

			if (data_en_i = '1') then
				data_i_1 <= signed(data_i_i);
				data_q_1 <= signed(data_q_i);
			else 
				data_i_1 <= data_i_1;
				data_q_1 <= data_q_1;
			end if;

			if (en_i_1 = '1') then
				data_i_1 <= data_i_3;
				data_q_1 <= data_q_3;
			else	
				data_i_1 <= data_i_1;
				data_q_1 <= data_q_1;
			end if;
		end if;
	end process;
end architecture bhv;

I used Vivado's IP Packager.

Are there guidelines to make customized IP compatible with the OscimpDigital library? Or could you point me in a direction where I could make this specific source compatible?

Thank you!

Bernardo

@mer0m
Copy link
Member

mer0m commented Jul 26, 2023

Hi Bernardo,

The iir_lpf_{real,complex} IP are not a implementation of a generic IIR filter (aka with a_i, b_i coefficients).
Here, it's a simplified transfer function H(z) = 2^(-N)*z^(-1)/(1-(1-2^-N)*z^(-1)) which reduce the ressources requirements of the ip.
If you already developed and if you want to add a IIR ip based on the FIR ip (with settable coefficients through the AXI bus), feel free to propose your ip :)

By the way, a generic IIR is really missing, let discuss about it here.

@BernardoMadeira
Copy link
Author

BernardoMadeira commented Jul 28, 2023

Because I haven't really understood the intricacies of how you configure the IP to be handled by the user and that would require a lot more digging through your code and try to emulate the structure - I have been avoiding that.
I developed the vhd code a bit in line with the existing iir_lpf IP, then I added a bus for each coefficients so that I could externally feed it from an add_const_real block.

So my system looks a bit like:
NCO >> MIXER (sine_out * axi2dac) >> iir_lpf_complex (my IP)
ADD_CONST{each coefficient} >> iir_lpf_complex{a1,a2,b0,b1,b2}
The IP has 6 bus interfaces, data_in, data_out, a1, a2, b0, b1 and b2. It correctly connects with other IPs in the oscimpDigital library.

Finally I connect it to a data_real_to_ram to see the transfer function, but it has been outputting zero - which means something is wrong either with the IP or the setup.

My inexperience with fpga's and vhdl really shines from here on out... would a s00_axi port be necessary to change dynamically the coefficients? I notice a pattern between the blocks that you can change through PS-PL communication (add const, nco_counter, data{real,complex}_to_ram) and others which are more "static" (dupp, mixer). I thought that by avoiding the need to communicate with the host computer I could avoid having to deal with the axi interface but maybe I am wrong here.

Anyway, here is the .vhdl source:
biquad_iir_lpf_complex_v1_0.zip

@BernardoMadeira
Copy link
Author

I kept working on this generalized IIR filter, and even if the design uses DSP48 slices there will still be negative slack of 0.2-0.4 ns bringing the total delay to 8.2-8.4. This doesn't respect the timing constraints and therefore always results in a timing failure. The logic takes most of this time, as expected (a1, a2 from feedback * input). I have tried all the Direct Forms, I, II, TFI, TFII. As expected, none obeys the timing constraints.

Making a pipelined IIR filter is out of my expertise at the moment, but could be an option to solve this problem. Calculation of the coefficients is rather convoluted and has been a mental block to invest more time exploring this option.

In order to respect timing I have been thinking of 3 solutions:
(1) I tried using clk_wizard to generate a 62.5 MHz clock and connect this to the clock of the filter and dataOUT but I have some trouble understanding how the constraints are set. Two clocks are always created when I use Oscimp IPs: clk_fpga_0 and adc_clk.
I tried creating a generated clock using either as a primary clock, but the resulting nets are always unconstrained instead of showing 62.5 MHz. Any experience with this?
(2) I tried using clk_prescaler from OscimpDigital, but I am unsure how to setup the constraints. Same reason as in (1), whenever I set the constraints the result loads are unconstrained or show 125 MHz.
(3) Using an enable decimator IP: basically a counter that opens every N clocks. But the same trouble appears. I think this approach should be avoided, since there is no reason to use a clock enable if you have still clock resources available in your fpga (which is the case). Since the clock will always be running at 125 MHz anyway, this leads to having to set up the route as a multi cycle path. Upon doing so, I notice that the path between my decimator and IIR filter IP has not been routed by Vivado.

I'd really love to contribute with a generalized IIR, but at the moment I am struggling a lot with timing closure when decimating (either by dividing the clock or by using a clock enable). Implementing and calculating the coefficients for a pipelined IIR filter is extremely convoluted. What approach do you think might be the best?

Best regards,
Bernardo

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants