Initial dummy draft of naive matmul kernels I’ve benchmarked on some old NVIDIA GPUs!

Thanks for reading and God Bless You!