<p>A Haiku library using the xmap operator in JAX for model parallelism of transformers. The parallelism scheme is similar to that of the original Megatron-LM, which is efficient on TPUs due to their high-speed 2D mesh network. This library is designed</p>

Breakdown

A Haiku library using the xmap operator in JAX for model parallelism of transformers.

The parallelism scheme is similar to that of the original Megatron-LM, which is efficient on TPUs due to their high-speed 2D mesh network.
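
As a rough illustration of what a Megatron-LM-style split looks like, the sketch below shards a single transformer feed-forward block: the first weight matrix is split by columns, the second by rows, and the partial outputs are summed across shards with one all-reduce. This is not the library's code. It assumes `jax.vmap` with a named axis as a single-device stand-in for xmap over a device mesh, and the axis name `"shard"`, the function names, and the layer sizes are all hypothetical.

```python
import jax
import jax.numpy as jnp


def mlp_shard(x, w_in_shard, w_out_shard):
    """Feed-forward block as seen by one model-parallel shard (hypothetical helper)."""
    # Column-split first matmul: each shard computes its own slice of the hidden units.
    h = jax.nn.gelu(x @ w_in_shard)
    # Row-split second matmul: each shard produces a partial sum of the full output.
    partial_out = h @ w_out_shard
    # All-reduce over the named axis combines the partial outputs.
    return jax.lax.psum(partial_out, "shard")


def mlp_reference(x, w_in, w_out):
    """The same block with unsharded weights, used only as a check."""
    return jax.nn.gelu(x @ w_in) @ w_out


# Illustrative sizes, not taken from the library.
n_shards, batch, d_model, d_ff = 4, 2, 8, 32
k_x, k_in, k_out = jax.random.split(jax.random.PRNGKey(0), 3)
x = jax.random.normal(k_x, (batch, d_model))
w_in = jax.random.normal(k_in, (d_model, d_ff)) / jnp.sqrt(d_model)
w_out = jax.random.normal(k_out, (d_ff, d_model)) / jnp.sqrt(d_ff)

# Stack the per-shard weight slices along a leading "shard" dimension.
w_in_shards = jnp.stack(jnp.split(w_in, n_shards, axis=1))    # (n_shards, d_model, d_ff / n_shards)
w_out_shards = jnp.stack(jnp.split(w_out, n_shards, axis=0))  # (n_shards, d_ff / n_shards, d_model)

# vmap with a named axis emulates the shards on one device; the input x is replicated.
sharded = jax.vmap(mlp_shard, in_axes=(None, 0, 0), axis_name="shard")(
    x, w_in_shards, w_out_shards
)
reference = mlp_reference(x, w_in, w_out)

# After the all-reduce, every shard holds the same full output.
print(bool(jnp.allclose(sharded[0], reference, atol=1e-4)))
```

The split works because GELU is applied elementwise, so each shard can process its own slice of the hidden layer independently, and only the final sum needs cross-shard communication. On real hardware that sum is the step that travels over the interconnect between devices.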

This library is designed

...