Make Inference Faster: Efficient GPU Memory Management for Butterfly Sparse Matrix Multiplication

Submitted, 2024