Hi, I want to see how peak memory consumption changes after applying multi-stream execution. I tried both the ASAP and wavefront stream schedules, but the peak memory consumption doesn't seem to change at all.
Here is my code.
@yaoyaoding Hello, could you please help me solve this problem? Thanks :)
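import raf
import tvm
from raf.ir import RAFSequential
from raf._core.device import Device
from raf._core.vm import VMCompiler
from raf._ffi.pass_ import (
    ASAPStreamSchedule,
    EstimateMemory,
    InferType,
    ToGraphNormalForm,
    WavefrontStreamSchedule,
)
# Note: `inception` is my model helper module; its import is omitted here.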
device = "cuda"
model, _ = inception.get_model()
# Generate a dummy input.
in_out, _ = inception.get_input(batch_size=32, device=device)
m_in = in_out[0]
m_out = in_out[1]
mod = model._internal(m_in, m_out).mod
'''
with raf.ir.PassContext(opt_level=2, config={"raf.stream_schedule.policy": "wavefront"}):
    mod = RAFSequential([ToGraphNormalForm(), WavefrontStreamSchedule()])(mod)
'''
mod = RAFSequential([ToGraphNormalForm(), ASAPStreamSchedule()])(mod)
compiler = VMCompiler()
with tvm.transform.PassContext(opt_level=3):
    mod, _ = compiler.optimize(mod, device)
mod = InferType()(mod)
trace = [(name, mem.value) for name, mem in EstimateMemory(mod, Device(device), True)]
print(max(trace, key=lambda x: x[1])[1])