// DESCRIPTION: Verilator: List of To Do issues.
//
// Copyright 2004-2009 by Wilson Snyder. This program is free software; you can
// redistribute it and/or modify it under the terms of either the GNU
// Lesser General Public License Version 3 or the Perl Artistic License
// Version 2.0.


Features:
	Finish 3.400 new ordering fixes
	Latch optimizations {Need here}
	Task I/Os connecting to non-simple variables.
	Fix nested casez statements expanding into to huge C++. [JeanPaul Vanitegem]
	Fix FileLine memory inefficiency
	Fix ordering of each bit separately in a signal (mips)
		assign b[3:0] = b[7:4];  assign b[7:4] = in;
	Support gate primitives/ cell libraries from xilinx, etc
	Assign dont_care value to an 1'bzzz assignment
	Function to eval combo logic after /*verilator public*/ functions [gwaters]
	Support generated clocks (correctness)
	?gcov coverage
	Selectable SystemC types based on widths (see notes below)
	Compile time
		Inline first level trace routines
	Coverage
		Points should be per-scope like everything else rather then per-module
		Expression coverage (see notes)
	Constant functions for widths, etc, IE   "input [log2(PARAM):0] xx;"
	More Verilog 2001 Support
		(* *) Attributes  (just ignore -- preprocessor?)
		Real numbers (NEVER)
		Recursive functions (NEVER)
		Verilog configuration files (NEVER)
	DPI to define C/C++ calls from Verilog

Long-term Features
	Assertions
	VHDL parser  [Philips]
	Tristate support
	SystemPerl integration
	Multithreaded execution
	Front end parser integration into Verilog-Perl like Netlist

Testing:
	Capture all inputs into global "rerun it" file
	Code to make wrapper that sets signals, so can do comparison checks
	New random program generator
	Better graph viewer with search and zoom
	Port and test against opencores.org code

Usability:
	Better reporting of unopt problems, including what lines of code
	Report more errors (all of them?) before exiting [Eugene Weber]

Internal Code:
	Eliminate the AstNUser* passed to all visitors; its only needed in V3Width,
	and removing it will speed up and simplify all the other code.
	V3Graph should be templated container type, taking in Vertex + Edge types

Performance:
	Constant propagation
		Extra cleaning AND:  1 & ((VARREF >> 1) | ((&VARREF >> 1) & VARREF))
		Extra shift (perhaps due to clean): if (1 & CAST (VARREF >> #))
	Gated clock and latch conversion to flops.  [JeanPaul Vanitegem]
		Could propagate the AND into pos/negedges and let domaining optimize.
	Negedge reset
		Switch to remove negedges that don't matter
		Can't remove async resets from control flops (like in syncronizers)
	If all references to array have a constant index, blow up into separate signals-per-index
	Multithreaded execution
	Bit-multiply for faster bit swapping and a=b[1,3,2] random bit reorderings.
	Move _last sets and all other combo logic inside master
		if() that triggers on all possible sense items
	Rewrite and combine V3Life, V3Subst
		If block temp only ever set in one place to constant, propagate it
			Used in t_mem for array delayed assignments
		Replace variables if set later in same cfunc branch
			See for example duplicate sets of _narrow in cycle 90/91 of t_select_plusloop
	Same assignment on both if branches
		"if (a) { ... b=2; } else { ... b=2;}" -> "b=2; if ..."
		Careful though, as b could appear in the statement or multiple times in statement
		(Could just require exatly two 'b's in statement)
	Simplify XOR/XNOR/AND/OR bit selection trees
		Foo = A[1] ^ A[2] ^ A[3] etc are better as ^ ( A & 32'b...1110 )
	Combine variables into wider elements
		Parallel statements on different bits should become single signal
		Variables that are always consumed in "parallel" can be joined
	Duplicate assignments in gate optimization
		Common to have many separate posedge blocks, each with identical
		reset_r <= rst_in
	*If signal is used only once (not counting trace), always gate substitute
		Don't merge if any combining would form circ logic (out goes back to in)
	Multiple assignments each bit can become single assign with concat
		Make sure a SEL of a CONCAT can get the single bit back.
	Usually blocks/values
		Enable only after certain time, so VL_TIME_I(32) > 0x1e gets eliminated out
	Better ordering of a<=b, b<=c, put all refs to 'b' next to each other to optimize caching
	Allow Split of case statements without a $display/$stop
	I-cache packing improvements (what/how?)
	Data cache organization (order of vars in class)
		First have clocks,
		then bools instead of uint32_t's
		then based on what sense list they come from, all outputs, then all inputs
		finally have any signals part of a "usually" block, or constant.
	Rather then tracking widths, have a MSB...LSB of this expression
		(or better, a bitmask of bits relevant in this expression)
	Track recirculation and convert into clock-enables
	Clock enables should become new clocking domains for speed
	If floped(a) & flopped(b) and no other a&b, then instead flop(a&b).
	Sort by output bitselects so can combine more assignments (see DDP example dx_dm signal)

	All of the temp vars that get set, exp pre_ vars and never feedback
	(not flops) don't need to be stored in the structs, but instead can
	be per-invocation, and even better register-colored-like to reuse
	the space.  This will greatly reduce the data footprint.


//**********************************************************************
//* Detailed notes on 'todo' features

Selectable SystemC types based on widths (see notes below)
	Statements:
		verilator sc_type    uint32_t  {override specific I/O signals}
		verilator sc_typedef width 1     bool
		verilator sc_typedef width 2-32  uint32_t
		verilator sc_typedef width 33-64 uint64_t
		verilator sc_typedef width 65+   sc_bv<width>	//<< programmable width
	Replace `systemc_clock, --pins64
	Have accessor function with type overloading to get or set the pin value.
	(Get rid of getdatap mess?)

//**********************************************************************
//* Eventual tristate bus Stuff allowed (old verilator)

 1) Tristate assignments must be continuous assignments
    The RHS of a tristate assignment can be the following
       a) a node (tristate or non-tristate)
       b) a constant (must be all or no z's)
	    x'b0, x'bz, x{x'bz}, x{x'b0} -> are allowed
       c) a conditional whose possible values are (a) or (b)

 2) One can lose that fact that a node is a tristate node.  This happens
    if a tristate node is assigned to a 'standard' node, or is used on
    RHS of a conditional. The following infer tristate signals:
       a) inout <SIGNAL>
       b) tri <SIGNAL>
       c) assigning to 'Z' (maybe through a conditional)
    Note: tristate-ness of an output port determined only by
          statements in the module (not the instances it calls)

 4) Tristate variables can't be multidimensional arrays
 5) Only check tristate contention between modules (not within!)
 6) Only simple compares with 'Z' are allowed (===)

