@relyt0925
Created August 18, 2024 22:02
mmlu_branch_eval
[root@tyler-a100-newimage-val root]# /root/bin/ilab.sh --config /var/mnt/inststg1/instructlab/config.yaml model evaluate --model /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ --base-model /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ --benchmark mmlu_branch --tasks-dir /var/mnt/inststg1/instructlab/generated/node_datasets_2024-08-18T15_57_14/
Using local safetensors found at '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/' for '--model'
INFO 2024-08-18 22:00:17,135 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-08-18 22:00:17,135 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-08-18 22:00:17,135 numexpr.utils:161: NumExpr defaulting to 16 threads.
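The NumExpr lines are informational: with `NUMEXPR_MAX_THREADS` unset, NumExpr falls back to its safe limit of 16 threads even though 80 virtual cores were detected (NumExpr itself caps out at 64 here). A minimal sketch of how to silence the note, assuming you control the environment the command runs in:

```python
# Sketch only: NUMEXPR_MAX_THREADS must be set before numexpr is first imported
# (ilab pulls it in indirectly via the pandas/datasets stack), so the practical
# fix is `export NUMEXPR_MAX_THREADS=64` in the shell that launches ilab.
import os

os.environ.setdefault("NUMEXPR_MAX_THREADS", "64")  # 64 is the cap NumExpr reports above

import numexpr

numexpr.set_num_threads(64)  # optional: raise the active thread count explicitly
```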
INFO 2024-08-18 22:00:17,797 datasets:58: PyTorch version 2.3.1 available.
INFO 2024-08-18 22:00:29,170 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
INFO 2024-08-18 22:00:29,171 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/', 'dtype': 'bfloat16'}
INFO 2024-08-18 22:00:29,356 lm-eval:170: Using device 'cuda'
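Under the hood, `ilab model evaluate --benchmark mmlu_branch` drives EleutherAI's lm-evaluation-harness against the custom task configs in `--tasks-dir`, once for the trained checkpoint and once for the base model. A rough hand-written equivalent of the run being set up here (the exact wiring inside ilab differs; paths, task name, and few-shot count are taken from this log):

```python
# Hedged sketch using lm-evaluation-harness' public API directly; ilab's
# internal plumbing is not shown and may differ.
import lm_eval
from lm_eval.models.huggingface import HFLM
from lm_eval.tasks import TaskManager

model = HFLM(
    pretrained="/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/",
    dtype="bfloat16",     # matches the 'dtype' argument logged above
    batch_size="auto",    # triggers the "Detecting largest batch size" step below
)

# --tasks-dir points at the generated task configs for the knowledge branch
task_manager = TaskManager(
    include_path="/var/mnt/inststg1/instructlab/generated/node_datasets_2024-08-18T15_57_14/"
)

results = lm_eval.simple_evaluate(
    model=model,
    tasks=["knowledge_compliance_personally-identifiable-information"],
    num_fewshot=5,        # the "Overwriting default num_fewshot ... to 5" line
    task_manager=task_manager,
)
print(results["results"])
```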
Generating test split: 108 examples [00:00, 4558.66 examples/s]
WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:00:34,808 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
INFO 2024-08-18 22:00:34,808 lm-eval:261: Setting fewshot random generator seed to 1234
INFO 2024-08-18 22:00:34,809 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
100%|██████████| 108/108 [00:01<00:00, 71.45it/s]
INFO 2024-08-18 22:00:36,335 lm-eval:438: Running loglikelihood requests
Passed argument batch_size = auto:1. Detecting largest batch size
We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
Determined largest batch size: 64
Running loglikelihood requests: 100%|██████████| 432/432 [00:32<00:00, 13.16it/s]
WARNING 2024-08-18 22:01:10,217 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/'. Use `repo_type` argument if needed.
fatal: not a git repository (or any of the parent directories): .git
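Two of the messages above are benign: the "Failed to get model SHA" / "not a git repository" pair just means lm-eval cannot record a revision for a local checkpoint directory, and the `past_key_values` notice is a Transformers deprecation warning about the legacy tuple-based KV cache. Only code that calls the model directly needs to care about the latter; a rough sketch of the newer Cache API it points to, using this run's checkpoint path and a made-up prompt:

```python
# Sketch of the Cache API mentioned in the deprecation warning (assumption:
# only relevant if you drive the model yourself; lm-eval handles this internally).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

path = "/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/"
tok = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16).to("cuda")

inputs = tok("Personally identifiable information includes", return_tensors="pt").to(model.device)
cache = DynamicCache()  # instead of the deprecated tuple of (key, value) pairs
out = model(**inputs, past_key_values=cache, use_cache=True)
# out.past_key_values is now a Cache object that can be reused on the next forward pass
```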
INFO 2024-08-18 22:01:19,669 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
INFO 2024-08-18 22:01:19,669 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/', 'dtype': 'bfloat16'}
INFO 2024-08-18 22:01:19,671 lm-eval:170: Using device 'cuda'
Loading checkpoint shards: 100%|██████████| 3/3 [00:06<00:00, 2.04s/it]
WARNING 2024-08-18 22:01:26,005 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:01:26,006 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:01:26,024 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
INFO 2024-08-18 22:01:26,024 lm-eval:261: Setting fewshot random generator seed to 1234
INFO 2024-08-18 22:01:26,024 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
100%|██████████| 108/108 [00:01<00:00, 70.26it/s]
INFO 2024-08-18 22:01:27,577 lm-eval:438: Running loglikelihood requests
Passed argument batch_size = auto:1. Detecting largest batch size
Determined largest batch size: 64
Running loglikelihood requests: 100%|██████████| 432/432 [00:31<00:00, 13.56it/s]
WARNING 2024-08-18 22:02:00,506 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/'. Use `repo_type` argument if needed.
fatal: not a git repository (or any of the parent directories): .git
# KNOWLEDGE EVALUATION REPORT
## BASE MODEL
/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/
## MODEL
/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/
### AVERAGE:
-0.02 (across 1)
### REGRESSIONS:
1. knowledge_compliance_personally-identifiable-information (-0.02)
[root@tyler-a100-newimage-val root]#
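Reading the report: mmlu_branch scores only the knowledge tasks generated from the taxonomy branch, and each number is read here as the trained checkpoint's score minus the base model's score (an assumption consistent with the REGRESSIONS listing), so a negative delta appears under REGRESSIONS and AVERAGE is the mean delta across the listed tasks (one here). A small sketch of that arithmetic, with hypothetical per-task scores chosen to reproduce the -0.02:

```python
# Hypothetical per-task scores; only the difference (-0.02) matches this run.
model_scores = {"knowledge_compliance_personally-identifiable-information": 0.70}
base_scores  = {"knowledge_compliance_personally-identifiable-information": 0.72}

deltas = {task: model_scores[task] - base_scores[task] for task in model_scores}
average = sum(deltas.values()) / len(deltas)            # -> -0.02 "(across 1)" task
regressions = {t: d for t, d in deltas.items() if d < 0}

print(f"AVERAGE: {average:.2f} (across {len(deltas)})")
for i, (task, d) in enumerate(sorted(regressions.items(), key=lambda kv: kv[1]), 1):
    print(f"{i}. {task} ({d:.2f})")
```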