microsoft/phi-2 for creating a Mixture of Experts (MoE)

microsoft/phi-2 is a small language model with 2.7 billion parameters. Thanks to its small size, its open-source license, and fine-tuning techniques like QLoRA, one can fairly quickly fine-tune the base model for a downstream task and obtain an expert phi-2 model. It is then interesting to combine several individual experts into a Mixture of Experts (MoE) that can perform the tasks of all of them. Follow the steps below to create your own phi-2-based MoE.

Phi-2 MoE generation

mergekit in its original flavour does not support microsoft/phi-2 (at the time of writing) because of a mismatch in layer names. This fork was created to make mergekit work with microsoft/phi-2-based SLMs and build a "Phi2-Mixture of Experts" model. Follow the instructions below to create your own Mixture of Experts from multiple individual phi-2 experts, starting from the "phi2xtral" branch.

Instructions for creating Phi-2 MoE

Check out the phi2xtral branch of this repository.

Merging experts into MoE

  • Create a merge configuration like config_moe_phi2.yaml, in which you pass either the absolute path to an expert model's directory or the expert's Hugging Face repository ID (see the example configuration after this list).
  • Run phi2_moe.py by passing the following arguments to it
    • merge configuration: config_moe_phi2.yaml
    • path to the output folder: for example, you can use the output_phi2_moe folder from the repository, as it already contains the configuration files needed for running inference with the MoE model
    • load-in-4bit
    • trust-remote-code
    • the run command therefore looks like: python phi2_moe.py config_moe_phi2.yaml output_phi2_moe --load-in-4bit --trust-remote-code
  • This creates the Mixture of Experts model inside the output directory, combining your individual experts as specified in the merge configuration.
  • Note: If you are using your own custom phi-2 expert that was fine-tuned with adapter-based techniques like QLoRA, merge the adapter weights back into the base model before passing it to mergekit (a sketch of this step follows the example configuration below).
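
As a starting point, a merge configuration could look like the sketch below. The phi2xtral fork's exact keys are not documented here, so this follows the standard mergekit-moe layout; the expert paths, repository IDs, and prompts are placeholders you should replace with your own experts.

```yaml
# config_moe_phi2.yaml -- illustrative sketch, assuming the phi2xtral fork
# accepts the standard mergekit-moe configuration layout.
base_model: microsoft/phi-2                        # base model used for the shared weights and router
gate_mode: hidden                                  # routing initialization; adjust to what the fork supports
dtype: float16
experts:
  - source_model: /abs/path/to/phi2-expert-code    # local directory of a merged expert (placeholder)
    positive_prompts:
      - "Write a Python function"
  - source_model: your-hf-user/phi2-expert-math    # or a Hugging Face repo ID (placeholder)
    positive_prompts:
      - "Solve this math problem"
```

For the adapter-merge note above, a minimal sketch using the peft library is shown below; the adapter and output paths are placeholders, and the base model is loaded in half precision (not 4-bit) so the merge is lossless.

```python
# merge_adapter.py -- illustrative sketch: fold QLoRA adapter weights back
# into the phi-2 base model before using it as an expert in mergekit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_MODEL = "microsoft/phi-2"
ADAPTER_DIR = "path/to/your/qlora-adapter"    # placeholder
MERGED_DIR = "path/to/merged-phi2-expert"     # placeholder

# Load the base model and attach the trained adapter.
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.float16, trust_remote_code=True
)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)

# Fold the LoRA deltas into the base weights and drop the adapter wrappers,
# then save a plain checkpoint that mergekit can consume as an expert.
merged = model.merge_and_unload()
merged.save_pretrained(MERGED_DIR)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL, trust_remote_code=True)
tokenizer.save_pretrained(MERGED_DIR)
```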

Inference

  • You will find the customized configuration and modelling files inside the output_phi2_moe folder.
    • You will need configuration_phi_2.py and modelling_phi_2.py for inference with your Phi2MoE.
    • Create your inference script as you normally would with the Hugging Face transformers library and load the MoE you just created (a minimal sketch follows this list).
    • The model picks up the customized configuration files inside the output folder via the config.json that phi2_moe.py generated in the previous step.
  • Enjoy your Phi2MoE!
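
A minimal inference sketch is shown below; the output folder name matches the command above, while the prompt and generation settings are placeholders. Passing trust_remote_code=True lets transformers build the model from the customized configuration_phi_2.py and modelling_phi_2.py shipped in the output folder.

```python
# inference.py -- minimal sketch for loading and prompting the merged Phi2MoE.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MOE_DIR = "output_phi2_moe"  # folder produced by phi2_moe.py

# Use half precision on GPU; fall back to float32 on CPU.
dtype = torch.float16 if torch.cuda.is_available() else torch.float32
device = "cuda" if torch.cuda.is_available() else "cpu"

# trust_remote_code=True is required so the custom configuration and
# modelling files in the output folder are used to construct the model.
tokenizer = AutoTokenizer.from_pretrained(MOE_DIR, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MOE_DIR, torch_dtype=dtype, trust_remote_code=True
).to(device)

prompt = "Instruct: Explain mixture-of-experts routing in one sentence.\nOutput:"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```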