Could be implemented in pytorch.
Testing will be tedious.
Problems:
They compare target and predicted surface if the distance between target predicted surface is above a certain threshold. This threshold value is
specific per dataset. We would also need to know the voxel spacing for each dataset.