Commit

removes mmlu results cells
djliden committed Mar 11, 2024
1 parent 9554705 commit 0ef4aa6
Showing 1 changed file with 0 additions and 158 deletions.
158 changes: 0 additions & 158 deletions notebooks/4_olmo_1b_instruction_tune/4_olmo_instruction_tune.ipynb
@@ -1017,164 +1017,6 @@
"\n",
"In short...this is nothing to worry about. Why do we get the warning in the first place? It originates from the [Hugging Face Accelerate](https://huggingface.co/docs/accelerate/en/index) library, which checks for tied weights by looking for distinct modules with shared weights. But OLMo 1B's weight-tying approach uses `self.model.transformer.ff_out = self.model.transformer.wte`, so the modules themselves, and not just their weights, are the same. So the check from the Accelerate library fails to identify the tied weights and returns the warning. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|\n",
"|---------------------------------------|-------|------|-----:|------|-----:|---|-----:|\n",
"|mmlu |N/A |none | 0|acc |0.2611|± |0.0037|\n",
"| - humanities |N/A |none | 5|acc |0.2670|± |0.0064|\n",
"| - formal_logic | 0|none | 5|acc |0.1587|± |0.0327|\n",
"| - high_school_european_history | 0|none | 5|acc |0.2970|± |0.0357|\n",
"| - high_school_us_history | 0|none | 5|acc |0.2647|± |0.0310|\n",
"| - high_school_world_history | 0|none | 5|acc |0.2827|± |0.0293|\n",
"| - international_law | 0|none | 5|acc |0.3884|± |0.0445|\n",
"| - jurisprudence | 0|none | 5|acc |0.1944|± |0.0383|\n",
"| - logical_fallacies | 0|none | 5|acc |0.3006|± |0.0360|\n",
"| - moral_disputes | 0|none | 5|acc |0.2919|± |0.0245|\n",
"| - moral_scenarios | 0|none | 5|acc |0.2469|± |0.0144|\n",
"| - philosophy | 0|none | 5|acc |0.2926|± |0.0258|\n",
"| - prehistory | 0|none | 5|acc |0.2716|± |0.0247|\n",
"| - professional_law | 0|none | 5|acc |0.2588|± |0.0112|\n",
"| - world_religions | 0|none | 5|acc |0.2982|± |0.0351|\n",
"| - other |N/A |none | 5|acc |0.2613|± |0.0079|\n",
"| - business_ethics | 0|none | 5|acc |0.2300|± |0.0423|\n",
"| - clinical_knowledge | 0|none | 5|acc |0.2038|± |0.0248|\n",
"| - college_medicine | 0|none | 5|acc |0.2486|± |0.0330|\n",
"| - global_facts | 0|none | 5|acc |0.3600|± |0.0482|\n",
"| - human_aging | 0|none | 5|acc |0.2018|± |0.0269|\n",
"| - management | 0|none | 5|acc |0.2039|± |0.0399|\n",
"| - marketing | 0|none | 5|acc |0.2564|± |0.0286|\n",
"| - medical_genetics | 0|none | 5|acc |0.2100|± |0.0409|\n",
"| - miscellaneous | 0|none | 5|acc |0.2874|± |0.0162|\n",
"| - nutrition | 0|none | 5|acc |0.2582|± |0.0251|\n",
"| - professional_accounting | 0|none | 5|acc |0.2660|± |0.0264|\n",
"| - professional_medicine | 0|none | 5|acc |0.3272|± |0.0285|\n",
"| - virology | 0|none | 5|acc |0.2470|± |0.0336|\n",
"| - social_sciences |N/A |none | 5|acc |0.2454|± |0.0078|\n",
"| - econometrics | 0|none | 5|acc |0.2456|± |0.0405|\n",
"| - high_school_geography | 0|none | 5|acc |0.2323|± |0.0301|\n",
"| - high_school_government_and_politics| 0|none | 5|acc |0.2487|± |0.0312|\n",
"| - high_school_macroeconomics | 0|none | 5|acc |0.2615|± |0.0223|\n",
"| - high_school_microeconomics | 0|none | 5|acc |0.2143|± |0.0267|\n",
"| - high_school_psychology | 0|none | 5|acc |0.2275|± |0.0180|\n",
"| - human_sexuality | 0|none | 5|acc |0.2443|± |0.0377|\n",
"| - professional_psychology | 0|none | 5|acc |0.2663|± |0.0179|\n",
"| - public_relations | 0|none | 5|acc |0.2273|± |0.0401|\n",
"| - security_studies | 0|none | 5|acc |0.2571|± |0.0280|\n",
"| - sociology | 0|none | 5|acc |0.2289|± |0.0297|\n",
"| - us_foreign_policy | 0|none | 5|acc |0.2700|± |0.0446|\n",
"| - stem |N/A |none | 5|acc |0.2674|± |0.0079|\n",
"| - abstract_algebra | 0|none | 5|acc |0.2200|± |0.0416|\n",
"| - anatomy | 0|none | 5|acc |0.3259|± |0.0405|\n",
"| - astronomy | 0|none | 5|acc |0.3224|± |0.0380|\n",
"| - college_biology | 0|none | 5|acc |0.2569|± |0.0365|\n",
"| - college_chemistry | 0|none | 5|acc |0.1800|± |0.0386|\n",
"| - college_computer_science | 0|none | 5|acc |0.2700|± |0.0446|\n",
"| - college_mathematics | 0|none | 5|acc |0.2500|± |0.0435|\n",
"| - college_physics | 0|none | 5|acc |0.2157|± |0.0409|\n",
"| - computer_security | 0|none | 5|acc |0.3200|± |0.0469|\n",
"| - conceptual_physics | 0|none | 5|acc |0.2255|± |0.0273|\n",
"| - electrical_engineering | 0|none | 5|acc |0.2966|± |0.0381|\n",
"| - elementary_mathematics | 0|none | 5|acc |0.2593|± |0.0226|\n",
"| - high_school_biology | 0|none | 5|acc |0.2452|± |0.0245|\n",
"| - high_school_chemistry | 0|none | 5|acc |0.3054|± |0.0324|\n",
"| - high_school_computer_science | 0|none | 5|acc |0.3200|± |0.0469|\n",
"| - high_school_mathematics | 0|none | 5|acc |0.2630|± |0.0268|\n",
"| - high_school_physics | 0|none | 5|acc |0.2781|± |0.0366|\n",
"| - high_school_statistics | 0|none | 5|acc |0.2500|± |0.0295|\n",
"| - machine_learning | 0|none | 5|acc |0.3214|± |0.0443|\n",
"\n",
"| Groups |Version|Filter|n-shot|Metric|Value | |Stderr|\n",
"|------------------|-------|------|-----:|------|-----:|---|-----:|\n",
"|mmlu |N/A |none | 0|acc |0.2611|± |0.0037|\n",
"| - humanities |N/A |none | 5|acc |0.2670|± |0.0064|\n",
"| - other |N/A |none | 5|acc |0.2613|± |0.0079|\n",
"| - social_sciences|N/A |none | 5|acc |0.2454|± |0.0078|\n",
"| - stem |N/A |none | 5|acc |0.2674|± |0.0079|"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Compared to base...need to investigate further\n",
"\n",
"| Tasks |Version|Filter|n-shot|Metric|Value | |Stderr|\n",
"|---------------------------------------|-------|------|-----:|------|-----:|---|-----:|\n",
"|mmlu |N/A |none | 0|acc |0.2607|± |0.0037|\n",
"| - humanities |N/A |none | 5|acc |0.2546|± |0.0064|\n",
"| - formal_logic | 0|none | 5|acc |0.1905|± |0.0351|\n",
"| - high_school_european_history | 0|none | 5|acc |0.2424|± |0.0335|\n",
"| - high_school_us_history | 0|none | 5|acc |0.2892|± |0.0318|\n",
"| - high_school_world_history | 0|none | 5|acc |0.2827|± |0.0293|\n",
"| - international_law | 0|none | 5|acc |0.2645|± |0.0403|\n",
"| - jurisprudence | 0|none | 5|acc |0.2500|± |0.0419|\n",
"| - logical_fallacies | 0|none | 5|acc |0.2515|± |0.0341|\n",
"| - moral_disputes | 0|none | 5|acc |0.2543|± |0.0234|\n",
"| - moral_scenarios | 0|none | 5|acc |0.2380|± |0.0142|\n",
"| - philosophy | 0|none | 5|acc |0.2958|± |0.0259|\n",
"| - prehistory | 0|none | 5|acc |0.2469|± |0.0240|\n",
"| - professional_law | 0|none | 5|acc |0.2588|± |0.0112|\n",
"| - world_religions | 0|none | 5|acc |0.2222|± |0.0319|\n",
"| - other |N/A |none | 5|acc |0.2726|± |0.0079|\n",
"| - business_ethics | 0|none | 5|acc |0.2200|± |0.0416|\n",
"| - clinical_knowledge | 0|none | 5|acc |0.2075|± |0.0250|\n",
"| - college_medicine | 0|none | 5|acc |0.2543|± |0.0332|\n",
"| - global_facts | 0|none | 5|acc |0.3400|± |0.0476|\n",
"| - human_aging | 0|none | 5|acc |0.2287|± |0.0282|\n",
"| - management | 0|none | 5|acc |0.2233|± |0.0412|\n",
"| - marketing | 0|none | 5|acc |0.2607|± |0.0288|\n",
"| - medical_genetics | 0|none | 5|acc |0.2200|± |0.0416|\n",
"| - miscellaneous | 0|none | 5|acc |0.2848|± |0.0161|\n",
"| - nutrition | 0|none | 5|acc |0.2974|± |0.0262|\n",
"| - professional_accounting | 0|none | 5|acc |0.2340|± |0.0253|\n",
"| - professional_medicine | 0|none | 5|acc |0.4338|± |0.0301|\n",
"| - virology | 0|none | 5|acc |0.2229|± |0.0324|\n",
"| - social_sciences |N/A |none | 5|acc |0.2428|± |0.0077|\n",
"| - econometrics | 0|none | 5|acc |0.2456|± |0.0405|\n",
"| - high_school_geography | 0|none | 5|acc |0.1919|± |0.0281|\n",
"| - high_school_government_and_politics| 0|none | 5|acc |0.2591|± |0.0316|\n",
"| - high_school_macroeconomics | 0|none | 5|acc |0.2744|± |0.0226|\n",
"| - high_school_microeconomics | 0|none | 5|acc |0.2311|± |0.0274|\n",
"| - high_school_psychology | 0|none | 5|acc |0.2349|± |0.0182|\n",
"| - human_sexuality | 0|none | 5|acc |0.2061|± |0.0355|\n",
"| - professional_psychology | 0|none | 5|acc |0.2516|± |0.0176|\n",
"| - public_relations | 0|none | 5|acc |0.2545|± |0.0417|\n",
"| - security_studies | 0|none | 5|acc |0.2367|± |0.0272|\n",
"| - sociology | 0|none | 5|acc |0.2139|± |0.0290|\n",
"| - us_foreign_policy | 0|none | 5|acc |0.3100|± |0.0465|\n",
"| - stem |N/A |none | 5|acc |0.2756|± |0.0079|\n",
"| - abstract_algebra | 0|none | 5|acc |0.2700|± |0.0446|\n",
"| - anatomy | 0|none | 5|acc |0.3407|± |0.0409|\n",
"| - astronomy | 0|none | 5|acc |0.1974|± |0.0324|\n",
"| - college_biology | 0|none | 5|acc |0.2500|± |0.0362|\n",
"| - college_chemistry | 0|none | 5|acc |0.2000|± |0.0402|\n",
"| - college_computer_science | 0|none | 5|acc |0.3900|± |0.0490|\n",
"| - college_mathematics | 0|none | 5|acc |0.2700|± |0.0446|\n",
"| - college_physics | 0|none | 5|acc |0.2059|± |0.0402|\n",
"| - computer_security | 0|none | 5|acc |0.2700|± |0.0446|\n",
"| - conceptual_physics | 0|none | 5|acc |0.2213|± |0.0271|\n",
"| - electrical_engineering | 0|none | 5|acc |0.2966|± |0.0381|\n",
"| - elementary_mathematics | 0|none | 5|acc |0.2460|± |0.0222|\n",
"| - high_school_biology | 0|none | 5|acc |0.2710|± |0.0253|\n",
"| - high_school_chemistry | 0|none | 5|acc |0.2808|± |0.0316|\n",
"| - high_school_computer_science | 0|none | 5|acc |0.3000|± |0.0461|\n",
"| - high_school_mathematics | 0|none | 5|acc |0.2593|± |0.0267|\n",
"| - high_school_physics | 0|none | 5|acc |0.3046|± |0.0376|\n",
"| - high_school_statistics | 0|none | 5|acc |0.3981|± |0.0334|\n",
"| - machine_learning | 0|none | 5|acc |0.3125|± |0.0440|\n",
"\n",
"| Groups |Version|Filter|n-shot|Metric|Value | |Stderr|\n",
"|------------------|-------|------|-----:|------|-----:|---|-----:|\n",
"|mmlu |N/A |none | 0|acc |0.2607|± |0.0037|\n",
"| - humanities |N/A |none | 5|acc |0.2546|± |0.0064|\n",
"| - other |N/A |none | 5|acc |0.2726|± |0.0079|\n",
"| - social_sciences|N/A |none | 5|acc |0.2428|± |0.0077|\n",
"| - stem |N/A |none | 5|acc |0.2756|± |0.0079|\n"
]
}
],
"metadata": {