Fix LayerNorm, Multi-Query Attention, and Weight-Tying in OLMo to HF Conversion Script #820
Draft
ved1beta wants to merge 2 commits into