At least for onert, it is preferable to use fullyconnected when the lhs is constant.
onert (and similarly tflite) does not allow a constant as the batchmatmul lhs.
See Samsung/ONE#16064.
In this case, fullyconnected also provides more efficient kernels.
I tried to emit fullyconnected instead of batchmatmul.
Perhaps in op_bmm.py:
def define_node(
    self,
    node: torch.fx.Node,
) -> circle.Operator.OperatorT:
    args = BmmArgs(*node.args, **node.kwargs)  # type: ignore[arg-type]
    input = args.input
    mat2 = args.mat2

    # TODO: check input is constant
    # Then create FULLY_CONNECTED instead of BATCH_MATMUL
    op_index = get_op_index(
        circle.BuiltinOperator.BuiltinOperator.BATCH_MATMUL, self._op_codes
    )
    inputs = [input, mat2]
    outputs = [node]
    operator = create_builtin_operator(self.graph, op_index, inputs, outputs)
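To make the TODO concrete, here is a rough pure-Python sketch of the identity I think the rewrite would rely on. It assumes FULLY_CONNECTED computes `input @ weights^T` (the tflite-style convention, with weights shaped `[out_features, in_features]`), that the constant lhs is 2-D, and that there is no bias. The helpers `transpose`, `matmul`, `fully_connected` and `bmm_const_lhs_as_fc` are hypothetical names for illustration only, not TICO or onert APIs.

```python
from typing import List

Matrix = List[List[float]]


def transpose(m: Matrix) -> Matrix:
    """Swap rows and columns of a 2-D matrix."""
    return [list(row) for row in zip(*m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain 2-D matrix product a @ b."""
    b_cols = transpose(b)  # rows of b_cols are the columns of b
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def fully_connected(x: Matrix, weights: Matrix) -> Matrix:
    """Assumed FULLY_CONNECTED semantics: x @ weights^T, without bias."""
    return matmul(x, transpose(weights))


def bmm_const_lhs_as_fc(lhs_const: Matrix, rhs: Matrix) -> Matrix:
    """Compute lhs_const @ rhs with the constant placed in the weights slot.

    Uses lhs @ rhs == (rhs^T @ lhs^T)^T == transpose(FC(rhs^T, weights=lhs)).
    """
    return transpose(fully_connected(transpose(rhs), lhs_const))


lhs = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]    # constant lhs, shape 2x3
rhs = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]  # non-constant rhs, shape 3x2

# Both paths should give the same 2x2 result.
assert bmm_const_lhs_as_fc(lhs, rhs) == matmul(lhs, rhs)
```

If this is right, the constant lhs could be passed as the FULLY_CONNECTED weights unchanged, at the cost of one transpose on the rhs and one on the output. Whether those transposes are acceptable, and how a batched (3-D) constant lhs should be handled, is something I am not sure about.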
However, I am not familiar with TICO yet.
Could anyone help me?