What is the computation cost when the circuit is large?
Does this method support gate-level SW prediction?
Why use an FC layer, and why raise the input feature dimension?
Why are there outliers for different circuits?
If two circuits are similar, can we trust the result?
How to debug outliers in the experiment?

We can prepare the zero-delay simulation first,
along with a set of SAIF/VCD analysis utilities.
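
As a starting point for the VCD side of that utility set, here is a minimal sketch that counts toggles per 1-bit signal in a VCD dump. The function name count_toggles and the sample dump are hypothetical. The sketch assumes one value change per line, skips vector signals, ignores scope hierarchy (so identically named signals in different scopes would collide), and counts only direct 0-to-1 and 1-to-0 changes, so a transition that passes through x or z is not counted.

```python
from collections import defaultdict

# Hypothetical sample dump, used only to illustrate the parser.
SAMPLE_VCD = """\
$timescale 1ns $end
$scope module top $end
$var wire 1 ! clk $end
$var wire 1 % q $end
$var wire 4 & bus $end
$upscope $end
$enddefinitions $end
#0
$dumpvars
0!
0%
b0000 &
$end
#5
1!
#10
0!
1%
#15
1!
b1010 &
#20
0!
"""


def count_toggles(vcd_text):
    """Return {signal_name: toggle_count} for 1-bit signals in a VCD dump."""
    id_to_name = {}              # VCD identifier code -> signal name
    last_value = {}              # identifier code -> last seen value
    toggles = defaultdict(int)   # signal name -> number of 0<->1 changes
    in_header = True

    for raw in vcd_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_header:
            tokens = line.split()
            # Declaration form: $var <type> <width> <id> <name> $end
            if tokens[0] == "$var" and len(tokens) >= 5:
                width, ident, name = tokens[2], tokens[3], tokens[4]
                if width == "1":
                    id_to_name[ident] = name
            elif tokens[0] == "$enddefinitions":
                in_header = False
            continue
        # Scalar value change: one value character followed by the identifier.
        # Timestamps (#...), keywords ($...) and vectors (b...) are skipped.
        if line[0] in "01xXzZ":
            value, ident = line[0].lower(), line[1:]
            if ident not in id_to_name:
                continue
            prev = last_value.get(ident)
            if prev in ("0", "1") and value in ("0", "1") and prev != value:
                toggles[id_to_name[ident]] += 1
            last_value[ident] = value

    return {name: toggles[name] for name in id_to_name.values()}


if __name__ == "__main__":
    for name, count in count_toggles(SAMPLE_VCD).items():
        print(name, count)
```

Dividing each count by the simulated duration gives a toggle rate that can be compared with the predicted values. A real utility would also need to handle vectors and hierarchical names.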

Then assign an intern to implement the idea in Python, possibly with GPU support.