2023-09-12 22:33:49 INFO [jondurbin_spicyboros-70b-2.2.GGUF] Making unquantised GGUF at /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf
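This log comes from llama.cpp's convert.py. The exact command line isn't captured in the gist; given the source and output paths in the line above, the invocation was presumably something like:

    python3 convert.py /workspace/process/jondurbin_spicyboros-70b-2.2/source \
        --outtype f16 \
        --outfile /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf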
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00001-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00001-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00002-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00003-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00004-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00005-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00006-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00007-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00008-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00009-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00010-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00011-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00012-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00013-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00014-of-00015.bin
Loading model file /workspace/process/jondurbin_spicyboros-70b-2.2/source/pytorch_model-00015-of-00015.bin
params = Params(n_vocab=32000, n_embd=8192, n_layer=80, n_ctx=4096, n_ff=28672, n_head=64, n_head_kv=8, f_norm_eps=1e-05, f_rope_freq_base=10000.0, f_rope_scale=None, ftype=<GGMLFileType.MostlyF16: 1>, path_model=PosixPath('/workspace/process/jondurbin_spicyboros-70b-2.2/source'))
Loading vocab file '/workspace/process/jondurbin_spicyboros-70b-2.2/source/tokenizer.model', type 'spm'
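A quick sanity check on the tensor shapes that follow: with n_embd=8192 and n_head=64 the head dimension is 128, and n_head_kv=8 means this is a grouped-query-attention model, so the K/V projections have only 8 x 128 = 1024 rows instead of 8192. A minimal sketch, using the values from the Params line above:

    # Values copied from the Params line above (Llama-2-70B / GQA layout).
    n_embd    = 8192    # model width
    n_head    = 64      # query heads
    n_head_kv = 8       # key/value heads (grouped-query attention)

    head_dim = n_embd // n_head        # 128
    kv_rows  = n_head_kv * head_dim    # 1024 -> explains the [1024, 8192] attn_k / attn_v shapes
    assert (head_dim, kv_rows) == (128, 1024)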
Permuting layer 0
Permuting layer 1
Permuting layer 2
Permuting layer 3
Permuting layer 4
Permuting layer 5
Permuting layer 6
Permuting layer 7
Permuting layer 8
Permuting layer 9
Permuting layer 10
Permuting layer 11
Permuting layer 12
Permuting layer 13
Permuting layer 14
Permuting layer 15
Permuting layer 16
Permuting layer 17
Permuting layer 18
Permuting layer 19
Permuting layer 20
Permuting layer 21
Permuting layer 22
Permuting layer 23
Permuting layer 24
Permuting layer 25
Permuting layer 26
Permuting layer 27
Permuting layer 28
Permuting layer 29
Permuting layer 30
Permuting layer 31
Permuting layer 32
Permuting layer 33
Permuting layer 34
Permuting layer 35
Permuting layer 36
Permuting layer 37
Permuting layer 38
Permuting layer 39
Permuting layer 40
Permuting layer 41
Permuting layer 42
Permuting layer 43
Permuting layer 44
Permuting layer 45
Permuting layer 46
Permuting layer 47
Permuting layer 48
Permuting layer 49
Permuting layer 50
Permuting layer 51
Permuting layer 52
Permuting layer 53
Permuting layer 54
Permuting layer 55
Permuting layer 56
Permuting layer 57
Permuting layer 58
Permuting layer 59
Permuting layer 60
Permuting layer 61
Permuting layer 62
Permuting layer 63
Permuting layer 64
Permuting layer 65
Permuting layer 66
Permuting layer 67
Permuting layer 68
Permuting layer 69
Permuting layer 70
Permuting layer 71
Permuting layer 72
Permuting layer 73
Permuting layer 74
Permuting layer 75
Permuting layer 76
Permuting layer 77
Permuting layer 78
Permuting layer 79
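"Permuting layer N" marks convert.py rearranging that layer's Q and K projection weights from the Hugging Face rotary-embedding layout to the interleaved layout GGML expects. A sketch of the permute helper as it appeared in convert.py around this time (treat the details as approximate):

    import numpy as np

    def permute(weights: np.ndarray, n_head: int, n_head_kv: int) -> np.ndarray:
        # K projections in a GQA model use the KV head count (8 here);
        # Q projections use the full head count (64).
        if n_head_kv is not None and n_head != n_head_kv:
            n_head = n_head_kv
        # Split each head's rows into two halves, then interleave them.
        return (weights.reshape(n_head, 2, weights.shape[0] // n_head // 2, *weights.shape[1:])
                       .swapaxes(1, 2)
                       .reshape(weights.shape))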
model.embed_tokens.weight -> token_embd.weight | BF16 | [32000, 8192]
model.layers.0.self_attn.q_proj.weight -> blk.0.attn_q.weight | BF16 | [8192, 8192]
model.layers.0.self_attn.k_proj.weight -> blk.0.attn_k.weight | BF16 | [1024, 8192]
model.layers.0.self_attn.v_proj.weight -> blk.0.attn_v.weight | BF16 | [1024, 8192]
model.layers.0.self_attn.o_proj.weight -> blk.0.attn_output.weight | BF16 | [8192, 8192]
model.layers.0.mlp.gate_proj.weight -> blk.0.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.0.mlp.up_proj.weight -> blk.0.ffn_up.weight | BF16 | [28672, 8192]
model.layers.0.mlp.down_proj.weight -> blk.0.ffn_down.weight | BF16 | [8192, 28672]
model.layers.0.input_layernorm.weight -> blk.0.attn_norm.weight | BF16 | [8192]
model.layers.0.post_attention_layernorm.weight -> blk.0.ffn_norm.weight | BF16 | [8192]
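The lines above show block 0's tensor-name translation from the Hugging Face checkpoint to GGUF; the same per-layer pattern repeats for all 80 layers below. A hypothetical sketch of the renaming rule (the real convert.py derives this from a generated tensor-name map, not this exact function):

    # Hypothetical helper illustrating the HF -> GGUF naming seen in this log.
    SUFFIX_MAP = {
        "self_attn.q_proj.weight":          "attn_q.weight",
        "self_attn.k_proj.weight":          "attn_k.weight",
        "self_attn.v_proj.weight":          "attn_v.weight",
        "self_attn.o_proj.weight":          "attn_output.weight",
        "mlp.gate_proj.weight":             "ffn_gate.weight",
        "mlp.up_proj.weight":               "ffn_up.weight",
        "mlp.down_proj.weight":             "ffn_down.weight",
        "input_layernorm.weight":           "attn_norm.weight",
        "post_attention_layernorm.weight":  "ffn_norm.weight",
    }

    def hf_to_gguf(name: str) -> str:
        if name == "model.embed_tokens.weight":
            return "token_embd.weight"
        _, _, layer, suffix = name.split(".", 3)   # "model.layers.<n>.<suffix>"
        return f"blk.{layer}.{SUFFIX_MAP[suffix]}"

    assert hf_to_gguf("model.layers.0.self_attn.q_proj.weight") == "blk.0.attn_q.weight"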
model.layers.1.self_attn.q_proj.weight -> blk.1.attn_q.weight | BF16 | [8192, 8192]
model.layers.1.self_attn.k_proj.weight -> blk.1.attn_k.weight | BF16 | [1024, 8192]
model.layers.1.self_attn.v_proj.weight -> blk.1.attn_v.weight | BF16 | [1024, 8192]
model.layers.1.self_attn.o_proj.weight -> blk.1.attn_output.weight | BF16 | [8192, 8192]
model.layers.1.mlp.gate_proj.weight -> blk.1.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.1.mlp.up_proj.weight -> blk.1.ffn_up.weight | BF16 | [28672, 8192]
model.layers.1.mlp.down_proj.weight -> blk.1.ffn_down.weight | BF16 | [8192, 28672]
model.layers.1.input_layernorm.weight -> blk.1.attn_norm.weight | BF16 | [8192]
model.layers.1.post_attention_layernorm.weight -> blk.1.ffn_norm.weight | BF16 | [8192]
model.layers.2.self_attn.q_proj.weight -> blk.2.attn_q.weight | BF16 | [8192, 8192]
model.layers.2.self_attn.k_proj.weight -> blk.2.attn_k.weight | BF16 | [1024, 8192]
model.layers.2.self_attn.v_proj.weight -> blk.2.attn_v.weight | BF16 | [1024, 8192]
model.layers.2.self_attn.o_proj.weight -> blk.2.attn_output.weight | BF16 | [8192, 8192]
model.layers.2.mlp.gate_proj.weight -> blk.2.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.2.mlp.up_proj.weight -> blk.2.ffn_up.weight | BF16 | [28672, 8192]
model.layers.2.mlp.down_proj.weight -> blk.2.ffn_down.weight | BF16 | [8192, 28672]
model.layers.2.input_layernorm.weight -> blk.2.attn_norm.weight | BF16 | [8192]
model.layers.2.post_attention_layernorm.weight -> blk.2.ffn_norm.weight | BF16 | [8192]
model.layers.3.self_attn.q_proj.weight -> blk.3.attn_q.weight | BF16 | [8192, 8192]
model.layers.3.self_attn.k_proj.weight -> blk.3.attn_k.weight | BF16 | [1024, 8192]
model.layers.3.self_attn.v_proj.weight -> blk.3.attn_v.weight | BF16 | [1024, 8192]
model.layers.3.self_attn.o_proj.weight -> blk.3.attn_output.weight | BF16 | [8192, 8192]
model.layers.3.mlp.gate_proj.weight -> blk.3.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.3.mlp.up_proj.weight -> blk.3.ffn_up.weight | BF16 | [28672, 8192]
model.layers.3.mlp.down_proj.weight -> blk.3.ffn_down.weight | BF16 | [8192, 28672]
model.layers.3.input_layernorm.weight -> blk.3.attn_norm.weight | BF16 | [8192]
model.layers.3.post_attention_layernorm.weight -> blk.3.ffn_norm.weight | BF16 | [8192]
model.layers.4.self_attn.q_proj.weight -> blk.4.attn_q.weight | BF16 | [8192, 8192]
model.layers.4.self_attn.k_proj.weight -> blk.4.attn_k.weight | BF16 | [1024, 8192]
model.layers.4.self_attn.v_proj.weight -> blk.4.attn_v.weight | BF16 | [1024, 8192]
model.layers.4.self_attn.o_proj.weight -> blk.4.attn_output.weight | BF16 | [8192, 8192]
model.layers.4.mlp.gate_proj.weight -> blk.4.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.4.mlp.up_proj.weight -> blk.4.ffn_up.weight | BF16 | [28672, 8192]
model.layers.4.mlp.down_proj.weight -> blk.4.ffn_down.weight | BF16 | [8192, 28672]
model.layers.4.input_layernorm.weight -> blk.4.attn_norm.weight | BF16 | [8192]
model.layers.4.post_attention_layernorm.weight -> blk.4.ffn_norm.weight | BF16 | [8192]
model.layers.5.self_attn.q_proj.weight -> blk.5.attn_q.weight | BF16 | [8192, 8192]
model.layers.5.self_attn.k_proj.weight -> blk.5.attn_k.weight | BF16 | [1024, 8192]
model.layers.5.self_attn.v_proj.weight -> blk.5.attn_v.weight | BF16 | [1024, 8192]
model.layers.5.self_attn.o_proj.weight -> blk.5.attn_output.weight | BF16 | [8192, 8192]
model.layers.5.mlp.gate_proj.weight -> blk.5.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.5.mlp.up_proj.weight -> blk.5.ffn_up.weight | BF16 | [28672, 8192]
model.layers.5.mlp.down_proj.weight -> blk.5.ffn_down.weight | BF16 | [8192, 28672]
model.layers.5.input_layernorm.weight -> blk.5.attn_norm.weight | BF16 | [8192]
model.layers.5.post_attention_layernorm.weight -> blk.5.ffn_norm.weight | BF16 | [8192]
model.layers.6.self_attn.q_proj.weight -> blk.6.attn_q.weight | BF16 | [8192, 8192]
model.layers.6.self_attn.k_proj.weight -> blk.6.attn_k.weight | BF16 | [1024, 8192]
model.layers.6.self_attn.v_proj.weight -> blk.6.attn_v.weight | BF16 | [1024, 8192]
model.layers.6.self_attn.o_proj.weight -> blk.6.attn_output.weight | BF16 | [8192, 8192]
model.layers.6.mlp.gate_proj.weight -> blk.6.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.6.mlp.up_proj.weight -> blk.6.ffn_up.weight | BF16 | [28672, 8192]
model.layers.6.mlp.down_proj.weight -> blk.6.ffn_down.weight | BF16 | [8192, 28672]
model.layers.6.input_layernorm.weight -> blk.6.attn_norm.weight | BF16 | [8192]
model.layers.6.post_attention_layernorm.weight -> blk.6.ffn_norm.weight | BF16 | [8192]
model.layers.7.self_attn.q_proj.weight -> blk.7.attn_q.weight | BF16 | [8192, 8192]
model.layers.7.self_attn.k_proj.weight -> blk.7.attn_k.weight | BF16 | [1024, 8192]
model.layers.7.self_attn.v_proj.weight -> blk.7.attn_v.weight | BF16 | [1024, 8192]
model.layers.7.self_attn.o_proj.weight -> blk.7.attn_output.weight | BF16 | [8192, 8192]
model.layers.7.mlp.gate_proj.weight -> blk.7.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.7.mlp.up_proj.weight -> blk.7.ffn_up.weight | BF16 | [28672, 8192]
model.layers.7.mlp.down_proj.weight -> blk.7.ffn_down.weight | BF16 | [8192, 28672]
model.layers.7.input_layernorm.weight -> blk.7.attn_norm.weight | BF16 | [8192]
model.layers.7.post_attention_layernorm.weight -> blk.7.ffn_norm.weight | BF16 | [8192]
model.layers.8.self_attn.q_proj.weight -> blk.8.attn_q.weight | BF16 | [8192, 8192]
model.layers.8.self_attn.k_proj.weight -> blk.8.attn_k.weight | BF16 | [1024, 8192]
model.layers.8.self_attn.v_proj.weight -> blk.8.attn_v.weight | BF16 | [1024, 8192]
model.layers.8.self_attn.o_proj.weight -> blk.8.attn_output.weight | BF16 | [8192, 8192]
model.layers.8.mlp.gate_proj.weight -> blk.8.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.8.mlp.up_proj.weight -> blk.8.ffn_up.weight | BF16 | [28672, 8192]
model.layers.8.mlp.down_proj.weight -> blk.8.ffn_down.weight | BF16 | [8192, 28672]
model.layers.8.input_layernorm.weight -> blk.8.attn_norm.weight | BF16 | [8192]
model.layers.8.post_attention_layernorm.weight -> blk.8.ffn_norm.weight | BF16 | [8192]
model.layers.9.self_attn.q_proj.weight -> blk.9.attn_q.weight | BF16 | [8192, 8192]
model.layers.9.self_attn.k_proj.weight -> blk.9.attn_k.weight | BF16 | [1024, 8192]
model.layers.9.self_attn.v_proj.weight -> blk.9.attn_v.weight | BF16 | [1024, 8192]
model.layers.9.self_attn.o_proj.weight -> blk.9.attn_output.weight | BF16 | [8192, 8192]
model.layers.9.mlp.gate_proj.weight -> blk.9.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.9.mlp.up_proj.weight -> blk.9.ffn_up.weight | BF16 | [28672, 8192]
model.layers.9.mlp.down_proj.weight -> blk.9.ffn_down.weight | BF16 | [8192, 28672]
model.layers.9.input_layernorm.weight -> blk.9.attn_norm.weight | BF16 | [8192]
model.layers.9.post_attention_layernorm.weight -> blk.9.ffn_norm.weight | BF16 | [8192]
model.layers.10.self_attn.q_proj.weight -> blk.10.attn_q.weight | BF16 | [8192, 8192]
model.layers.10.self_attn.k_proj.weight -> blk.10.attn_k.weight | BF16 | [1024, 8192]
model.layers.10.self_attn.v_proj.weight -> blk.10.attn_v.weight | BF16 | [1024, 8192]
model.layers.10.self_attn.o_proj.weight -> blk.10.attn_output.weight | BF16 | [8192, 8192]
model.layers.10.mlp.gate_proj.weight -> blk.10.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.10.mlp.up_proj.weight -> blk.10.ffn_up.weight | BF16 | [28672, 8192]
model.layers.10.mlp.down_proj.weight -> blk.10.ffn_down.weight | BF16 | [8192, 28672]
model.layers.10.input_layernorm.weight -> blk.10.attn_norm.weight | BF16 | [8192]
model.layers.10.post_attention_layernorm.weight -> blk.10.ffn_norm.weight | BF16 | [8192]
model.layers.11.self_attn.q_proj.weight -> blk.11.attn_q.weight | BF16 | [8192, 8192]
model.layers.11.self_attn.k_proj.weight -> blk.11.attn_k.weight | BF16 | [1024, 8192]
model.layers.11.self_attn.v_proj.weight -> blk.11.attn_v.weight | BF16 | [1024, 8192]
model.layers.11.self_attn.o_proj.weight -> blk.11.attn_output.weight | BF16 | [8192, 8192]
model.layers.11.mlp.gate_proj.weight -> blk.11.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.11.mlp.up_proj.weight -> blk.11.ffn_up.weight | BF16 | [28672, 8192]
model.layers.11.mlp.down_proj.weight -> blk.11.ffn_down.weight | BF16 | [8192, 28672]
model.layers.11.input_layernorm.weight -> blk.11.attn_norm.weight | BF16 | [8192]
model.layers.11.post_attention_layernorm.weight -> blk.11.ffn_norm.weight | BF16 | [8192]
model.layers.12.self_attn.q_proj.weight -> blk.12.attn_q.weight | BF16 | [8192, 8192]
model.layers.12.self_attn.k_proj.weight -> blk.12.attn_k.weight | BF16 | [1024, 8192]
model.layers.12.self_attn.v_proj.weight -> blk.12.attn_v.weight | BF16 | [1024, 8192]
model.layers.12.self_attn.o_proj.weight -> blk.12.attn_output.weight | BF16 | [8192, 8192]
model.layers.12.mlp.gate_proj.weight -> blk.12.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.12.mlp.up_proj.weight -> blk.12.ffn_up.weight | BF16 | [28672, 8192]
model.layers.12.mlp.down_proj.weight -> blk.12.ffn_down.weight | BF16 | [8192, 28672]
model.layers.12.input_layernorm.weight -> blk.12.attn_norm.weight | BF16 | [8192]
model.layers.12.post_attention_layernorm.weight -> blk.12.ffn_norm.weight | BF16 | [8192]
model.layers.13.self_attn.q_proj.weight -> blk.13.attn_q.weight | BF16 | [8192, 8192]
model.layers.13.self_attn.k_proj.weight -> blk.13.attn_k.weight | BF16 | [1024, 8192]
model.layers.13.self_attn.v_proj.weight -> blk.13.attn_v.weight | BF16 | [1024, 8192]
model.layers.13.self_attn.o_proj.weight -> blk.13.attn_output.weight | BF16 | [8192, 8192]
model.layers.13.mlp.gate_proj.weight -> blk.13.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.13.mlp.up_proj.weight -> blk.13.ffn_up.weight | BF16 | [28672, 8192]
model.layers.13.mlp.down_proj.weight -> blk.13.ffn_down.weight | BF16 | [8192, 28672]
model.layers.13.input_layernorm.weight -> blk.13.attn_norm.weight | BF16 | [8192]
model.layers.13.post_attention_layernorm.weight -> blk.13.ffn_norm.weight | BF16 | [8192]
model.layers.14.self_attn.q_proj.weight -> blk.14.attn_q.weight | BF16 | [8192, 8192]
model.layers.14.self_attn.k_proj.weight -> blk.14.attn_k.weight | BF16 | [1024, 8192]
model.layers.14.self_attn.v_proj.weight -> blk.14.attn_v.weight | BF16 | [1024, 8192]
model.layers.14.self_attn.o_proj.weight -> blk.14.attn_output.weight | BF16 | [8192, 8192]
model.layers.14.mlp.gate_proj.weight -> blk.14.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.14.mlp.up_proj.weight -> blk.14.ffn_up.weight | BF16 | [28672, 8192]
model.layers.14.mlp.down_proj.weight -> blk.14.ffn_down.weight | BF16 | [8192, 28672]
model.layers.14.input_layernorm.weight -> blk.14.attn_norm.weight | BF16 | [8192]
model.layers.14.post_attention_layernorm.weight -> blk.14.ffn_norm.weight | BF16 | [8192]
model.layers.15.self_attn.q_proj.weight -> blk.15.attn_q.weight | BF16 | [8192, 8192]
model.layers.15.self_attn.k_proj.weight -> blk.15.attn_k.weight | BF16 | [1024, 8192]
model.layers.15.self_attn.v_proj.weight -> blk.15.attn_v.weight | BF16 | [1024, 8192]
model.layers.15.self_attn.o_proj.weight -> blk.15.attn_output.weight | BF16 | [8192, 8192]
model.layers.15.mlp.gate_proj.weight -> blk.15.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.15.mlp.up_proj.weight -> blk.15.ffn_up.weight | BF16 | [28672, 8192]
model.layers.15.mlp.down_proj.weight -> blk.15.ffn_down.weight | BF16 | [8192, 28672]
model.layers.15.input_layernorm.weight -> blk.15.attn_norm.weight | BF16 | [8192]
model.layers.15.post_attention_layernorm.weight -> blk.15.ffn_norm.weight | BF16 | [8192]
model.layers.16.self_attn.q_proj.weight -> blk.16.attn_q.weight | BF16 | [8192, 8192]
model.layers.16.self_attn.k_proj.weight -> blk.16.attn_k.weight | BF16 | [1024, 8192]
model.layers.16.self_attn.v_proj.weight -> blk.16.attn_v.weight | BF16 | [1024, 8192]
model.layers.16.self_attn.o_proj.weight -> blk.16.attn_output.weight | BF16 | [8192, 8192]
model.layers.16.mlp.gate_proj.weight -> blk.16.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.16.mlp.up_proj.weight -> blk.16.ffn_up.weight | BF16 | [28672, 8192]
model.layers.16.mlp.down_proj.weight -> blk.16.ffn_down.weight | BF16 | [8192, 28672]
model.layers.16.input_layernorm.weight -> blk.16.attn_norm.weight | BF16 | [8192]
model.layers.16.post_attention_layernorm.weight -> blk.16.ffn_norm.weight | BF16 | [8192]
model.layers.17.self_attn.q_proj.weight -> blk.17.attn_q.weight | BF16 | [8192, 8192]
model.layers.17.self_attn.k_proj.weight -> blk.17.attn_k.weight | BF16 | [1024, 8192]
model.layers.17.self_attn.v_proj.weight -> blk.17.attn_v.weight | BF16 | [1024, 8192]
model.layers.17.self_attn.o_proj.weight -> blk.17.attn_output.weight | BF16 | [8192, 8192]
model.layers.17.mlp.gate_proj.weight -> blk.17.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.17.mlp.up_proj.weight -> blk.17.ffn_up.weight | BF16 | [28672, 8192]
model.layers.17.mlp.down_proj.weight -> blk.17.ffn_down.weight | BF16 | [8192, 28672]
model.layers.17.input_layernorm.weight -> blk.17.attn_norm.weight | BF16 | [8192]
model.layers.17.post_attention_layernorm.weight -> blk.17.ffn_norm.weight | BF16 | [8192]
model.layers.18.self_attn.q_proj.weight -> blk.18.attn_q.weight | BF16 | [8192, 8192]
model.layers.18.self_attn.k_proj.weight -> blk.18.attn_k.weight | BF16 | [1024, 8192]
model.layers.18.self_attn.v_proj.weight -> blk.18.attn_v.weight | BF16 | [1024, 8192]
model.layers.18.self_attn.o_proj.weight -> blk.18.attn_output.weight | BF16 | [8192, 8192]
model.layers.18.mlp.gate_proj.weight -> blk.18.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.18.mlp.up_proj.weight -> blk.18.ffn_up.weight | BF16 | [28672, 8192]
model.layers.18.mlp.down_proj.weight -> blk.18.ffn_down.weight | BF16 | [8192, 28672]
model.layers.18.input_layernorm.weight -> blk.18.attn_norm.weight | BF16 | [8192]
model.layers.18.post_attention_layernorm.weight -> blk.18.ffn_norm.weight | BF16 | [8192]
model.layers.19.self_attn.q_proj.weight -> blk.19.attn_q.weight | BF16 | [8192, 8192]
model.layers.19.self_attn.k_proj.weight -> blk.19.attn_k.weight | BF16 | [1024, 8192]
model.layers.19.self_attn.v_proj.weight -> blk.19.attn_v.weight | BF16 | [1024, 8192]
model.layers.19.self_attn.o_proj.weight -> blk.19.attn_output.weight | BF16 | [8192, 8192]
model.layers.19.mlp.gate_proj.weight -> blk.19.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.19.mlp.up_proj.weight -> blk.19.ffn_up.weight | BF16 | [28672, 8192]
model.layers.19.mlp.down_proj.weight -> blk.19.ffn_down.weight | BF16 | [8192, 28672]
model.layers.19.input_layernorm.weight -> blk.19.attn_norm.weight | BF16 | [8192]
model.layers.19.post_attention_layernorm.weight -> blk.19.ffn_norm.weight | BF16 | [8192]
model.layers.20.self_attn.q_proj.weight -> blk.20.attn_q.weight | BF16 | [8192, 8192]
model.layers.20.self_attn.k_proj.weight -> blk.20.attn_k.weight | BF16 | [1024, 8192]
model.layers.20.self_attn.v_proj.weight -> blk.20.attn_v.weight | BF16 | [1024, 8192]
model.layers.20.self_attn.o_proj.weight -> blk.20.attn_output.weight | BF16 | [8192, 8192]
model.layers.20.mlp.gate_proj.weight -> blk.20.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.20.mlp.up_proj.weight -> blk.20.ffn_up.weight | BF16 | [28672, 8192]
model.layers.20.mlp.down_proj.weight -> blk.20.ffn_down.weight | BF16 | [8192, 28672]
model.layers.20.input_layernorm.weight -> blk.20.attn_norm.weight | BF16 | [8192]
model.layers.20.post_attention_layernorm.weight -> blk.20.ffn_norm.weight | BF16 | [8192]
model.layers.21.self_attn.q_proj.weight -> blk.21.attn_q.weight | BF16 | [8192, 8192]
model.layers.21.self_attn.k_proj.weight -> blk.21.attn_k.weight | BF16 | [1024, 8192]
model.layers.21.self_attn.v_proj.weight -> blk.21.attn_v.weight | BF16 | [1024, 8192]
model.layers.21.self_attn.o_proj.weight -> blk.21.attn_output.weight | BF16 | [8192, 8192]
model.layers.21.mlp.gate_proj.weight -> blk.21.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.21.mlp.up_proj.weight -> blk.21.ffn_up.weight | BF16 | [28672, 8192]
model.layers.21.mlp.down_proj.weight -> blk.21.ffn_down.weight | BF16 | [8192, 28672]
model.layers.21.input_layernorm.weight -> blk.21.attn_norm.weight | BF16 | [8192]
model.layers.21.post_attention_layernorm.weight -> blk.21.ffn_norm.weight | BF16 | [8192]
model.layers.22.self_attn.q_proj.weight -> blk.22.attn_q.weight | BF16 | [8192, 8192]
model.layers.22.self_attn.k_proj.weight -> blk.22.attn_k.weight | BF16 | [1024, 8192]
model.layers.22.self_attn.v_proj.weight -> blk.22.attn_v.weight | BF16 | [1024, 8192]
model.layers.22.self_attn.o_proj.weight -> blk.22.attn_output.weight | BF16 | [8192, 8192]
model.layers.22.mlp.gate_proj.weight -> blk.22.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.22.mlp.up_proj.weight -> blk.22.ffn_up.weight | BF16 | [28672, 8192]
model.layers.22.mlp.down_proj.weight -> blk.22.ffn_down.weight | BF16 | [8192, 28672]
model.layers.22.input_layernorm.weight -> blk.22.attn_norm.weight | BF16 | [8192]
model.layers.22.post_attention_layernorm.weight -> blk.22.ffn_norm.weight | BF16 | [8192]
model.layers.23.self_attn.q_proj.weight -> blk.23.attn_q.weight | BF16 | [8192, 8192]
model.layers.23.self_attn.k_proj.weight -> blk.23.attn_k.weight | BF16 | [1024, 8192]
model.layers.23.self_attn.v_proj.weight -> blk.23.attn_v.weight | BF16 | [1024, 8192]
model.layers.23.self_attn.o_proj.weight -> blk.23.attn_output.weight | BF16 | [8192, 8192]
model.layers.23.mlp.gate_proj.weight -> blk.23.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.23.mlp.up_proj.weight -> blk.23.ffn_up.weight | BF16 | [28672, 8192]
model.layers.23.mlp.down_proj.weight -> blk.23.ffn_down.weight | BF16 | [8192, 28672]
model.layers.23.input_layernorm.weight -> blk.23.attn_norm.weight | BF16 | [8192]
model.layers.23.post_attention_layernorm.weight -> blk.23.ffn_norm.weight | BF16 | [8192]
model.layers.24.self_attn.q_proj.weight -> blk.24.attn_q.weight | BF16 | [8192, 8192]
model.layers.24.self_attn.k_proj.weight -> blk.24.attn_k.weight | BF16 | [1024, 8192]
model.layers.24.self_attn.v_proj.weight -> blk.24.attn_v.weight | BF16 | [1024, 8192]
model.layers.24.self_attn.o_proj.weight -> blk.24.attn_output.weight | BF16 | [8192, 8192]
model.layers.24.mlp.gate_proj.weight -> blk.24.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.24.mlp.up_proj.weight -> blk.24.ffn_up.weight | BF16 | [28672, 8192]
model.layers.24.mlp.down_proj.weight -> blk.24.ffn_down.weight | BF16 | [8192, 28672]
model.layers.24.input_layernorm.weight -> blk.24.attn_norm.weight | BF16 | [8192]
model.layers.24.post_attention_layernorm.weight -> blk.24.ffn_norm.weight | BF16 | [8192]
model.layers.25.self_attn.q_proj.weight -> blk.25.attn_q.weight | BF16 | [8192, 8192]
model.layers.25.self_attn.k_proj.weight -> blk.25.attn_k.weight | BF16 | [1024, 8192]
model.layers.25.self_attn.v_proj.weight -> blk.25.attn_v.weight | BF16 | [1024, 8192]
model.layers.25.self_attn.o_proj.weight -> blk.25.attn_output.weight | BF16 | [8192, 8192]
model.layers.25.mlp.gate_proj.weight -> blk.25.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.25.mlp.up_proj.weight -> blk.25.ffn_up.weight | BF16 | [28672, 8192]
model.layers.25.mlp.down_proj.weight -> blk.25.ffn_down.weight | BF16 | [8192, 28672]
model.layers.25.input_layernorm.weight -> blk.25.attn_norm.weight | BF16 | [8192]
model.layers.25.post_attention_layernorm.weight -> blk.25.ffn_norm.weight | BF16 | [8192]
model.layers.26.self_attn.q_proj.weight -> blk.26.attn_q.weight | BF16 | [8192, 8192]
model.layers.26.self_attn.k_proj.weight -> blk.26.attn_k.weight | BF16 | [1024, 8192]
model.layers.26.self_attn.v_proj.weight -> blk.26.attn_v.weight | BF16 | [1024, 8192]
model.layers.26.self_attn.o_proj.weight -> blk.26.attn_output.weight | BF16 | [8192, 8192]
model.layers.26.mlp.gate_proj.weight -> blk.26.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.26.mlp.up_proj.weight -> blk.26.ffn_up.weight | BF16 | [28672, 8192]
model.layers.26.mlp.down_proj.weight -> blk.26.ffn_down.weight | BF16 | [8192, 28672]
model.layers.26.input_layernorm.weight -> blk.26.attn_norm.weight | BF16 | [8192]
model.layers.26.post_attention_layernorm.weight -> blk.26.ffn_norm.weight | BF16 | [8192]
model.layers.27.self_attn.q_proj.weight -> blk.27.attn_q.weight | BF16 | [8192, 8192]
model.layers.27.self_attn.k_proj.weight -> blk.27.attn_k.weight | BF16 | [1024, 8192]
model.layers.27.self_attn.v_proj.weight -> blk.27.attn_v.weight | BF16 | [1024, 8192]
model.layers.27.self_attn.o_proj.weight -> blk.27.attn_output.weight | BF16 | [8192, 8192]
model.layers.27.mlp.gate_proj.weight -> blk.27.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.27.mlp.up_proj.weight -> blk.27.ffn_up.weight | BF16 | [28672, 8192]
model.layers.27.mlp.down_proj.weight -> blk.27.ffn_down.weight | BF16 | [8192, 28672]
model.layers.27.input_layernorm.weight -> blk.27.attn_norm.weight | BF16 | [8192]
model.layers.27.post_attention_layernorm.weight -> blk.27.ffn_norm.weight | BF16 | [8192]
model.layers.28.self_attn.q_proj.weight -> blk.28.attn_q.weight | BF16 | [8192, 8192]
model.layers.28.self_attn.k_proj.weight -> blk.28.attn_k.weight | BF16 | [1024, 8192]
model.layers.28.self_attn.v_proj.weight -> blk.28.attn_v.weight | BF16 | [1024, 8192]
model.layers.28.self_attn.o_proj.weight -> blk.28.attn_output.weight | BF16 | [8192, 8192]
model.layers.28.mlp.gate_proj.weight -> blk.28.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.28.mlp.up_proj.weight -> blk.28.ffn_up.weight | BF16 | [28672, 8192]
model.layers.28.mlp.down_proj.weight -> blk.28.ffn_down.weight | BF16 | [8192, 28672]
model.layers.28.input_layernorm.weight -> blk.28.attn_norm.weight | BF16 | [8192]
model.layers.28.post_attention_layernorm.weight -> blk.28.ffn_norm.weight | BF16 | [8192]
model.layers.29.self_attn.q_proj.weight -> blk.29.attn_q.weight | BF16 | [8192, 8192]
model.layers.29.self_attn.k_proj.weight -> blk.29.attn_k.weight | BF16 | [1024, 8192]
model.layers.29.self_attn.v_proj.weight -> blk.29.attn_v.weight | BF16 | [1024, 8192]
model.layers.29.self_attn.o_proj.weight -> blk.29.attn_output.weight | BF16 | [8192, 8192]
model.layers.29.mlp.gate_proj.weight -> blk.29.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.29.mlp.up_proj.weight -> blk.29.ffn_up.weight | BF16 | [28672, 8192]
model.layers.29.mlp.down_proj.weight -> blk.29.ffn_down.weight | BF16 | [8192, 28672]
model.layers.29.input_layernorm.weight -> blk.29.attn_norm.weight | BF16 | [8192]
model.layers.29.post_attention_layernorm.weight -> blk.29.ffn_norm.weight | BF16 | [8192]
model.layers.30.self_attn.q_proj.weight -> blk.30.attn_q.weight | BF16 | [8192, 8192]
model.layers.30.self_attn.k_proj.weight -> blk.30.attn_k.weight | BF16 | [1024, 8192]
model.layers.30.self_attn.v_proj.weight -> blk.30.attn_v.weight | BF16 | [1024, 8192]
model.layers.30.self_attn.o_proj.weight -> blk.30.attn_output.weight | BF16 | [8192, 8192]
model.layers.30.mlp.gate_proj.weight -> blk.30.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.30.mlp.up_proj.weight -> blk.30.ffn_up.weight | BF16 | [28672, 8192]
model.layers.30.mlp.down_proj.weight -> blk.30.ffn_down.weight | BF16 | [8192, 28672]
model.layers.30.input_layernorm.weight -> blk.30.attn_norm.weight | BF16 | [8192]
model.layers.30.post_attention_layernorm.weight -> blk.30.ffn_norm.weight | BF16 | [8192]
model.layers.31.self_attn.q_proj.weight -> blk.31.attn_q.weight | BF16 | [8192, 8192]
model.layers.31.self_attn.k_proj.weight -> blk.31.attn_k.weight | BF16 | [1024, 8192]
model.layers.31.self_attn.v_proj.weight -> blk.31.attn_v.weight | BF16 | [1024, 8192]
model.layers.31.self_attn.o_proj.weight -> blk.31.attn_output.weight | BF16 | [8192, 8192]
model.layers.31.mlp.gate_proj.weight -> blk.31.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.31.mlp.up_proj.weight -> blk.31.ffn_up.weight | BF16 | [28672, 8192]
model.layers.31.mlp.down_proj.weight -> blk.31.ffn_down.weight | BF16 | [8192, 28672]
model.layers.31.input_layernorm.weight -> blk.31.attn_norm.weight | BF16 | [8192]
model.layers.31.post_attention_layernorm.weight -> blk.31.ffn_norm.weight | BF16 | [8192]
model.layers.32.self_attn.q_proj.weight -> blk.32.attn_q.weight | BF16 | [8192, 8192]
model.layers.32.self_attn.k_proj.weight -> blk.32.attn_k.weight | BF16 | [1024, 8192]
model.layers.32.self_attn.v_proj.weight -> blk.32.attn_v.weight | BF16 | [1024, 8192]
model.layers.32.self_attn.o_proj.weight -> blk.32.attn_output.weight | BF16 | [8192, 8192]
model.layers.32.mlp.gate_proj.weight -> blk.32.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.32.mlp.up_proj.weight -> blk.32.ffn_up.weight | BF16 | [28672, 8192]
model.layers.32.mlp.down_proj.weight -> blk.32.ffn_down.weight | BF16 | [8192, 28672]
model.layers.32.input_layernorm.weight -> blk.32.attn_norm.weight | BF16 | [8192]
model.layers.32.post_attention_layernorm.weight -> blk.32.ffn_norm.weight | BF16 | [8192]
model.layers.33.self_attn.q_proj.weight -> blk.33.attn_q.weight | BF16 | [8192, 8192]
model.layers.33.self_attn.k_proj.weight -> blk.33.attn_k.weight | BF16 | [1024, 8192]
model.layers.33.self_attn.v_proj.weight -> blk.33.attn_v.weight | BF16 | [1024, 8192]
model.layers.33.self_attn.o_proj.weight -> blk.33.attn_output.weight | BF16 | [8192, 8192]
model.layers.33.mlp.gate_proj.weight -> blk.33.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.33.mlp.up_proj.weight -> blk.33.ffn_up.weight | BF16 | [28672, 8192]
model.layers.33.mlp.down_proj.weight -> blk.33.ffn_down.weight | BF16 | [8192, 28672]
model.layers.33.input_layernorm.weight -> blk.33.attn_norm.weight | BF16 | [8192]
model.layers.33.post_attention_layernorm.weight -> blk.33.ffn_norm.weight | BF16 | [8192]
model.layers.34.self_attn.q_proj.weight -> blk.34.attn_q.weight | BF16 | [8192, 8192]
model.layers.34.self_attn.k_proj.weight -> blk.34.attn_k.weight | BF16 | [1024, 8192]
model.layers.34.self_attn.v_proj.weight -> blk.34.attn_v.weight | BF16 | [1024, 8192]
model.layers.34.self_attn.o_proj.weight -> blk.34.attn_output.weight | BF16 | [8192, 8192]
model.layers.34.mlp.gate_proj.weight -> blk.34.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.34.mlp.up_proj.weight -> blk.34.ffn_up.weight | BF16 | [28672, 8192]
model.layers.34.mlp.down_proj.weight -> blk.34.ffn_down.weight | BF16 | [8192, 28672]
model.layers.34.input_layernorm.weight -> blk.34.attn_norm.weight | BF16 | [8192]
model.layers.34.post_attention_layernorm.weight -> blk.34.ffn_norm.weight | BF16 | [8192]
model.layers.35.self_attn.q_proj.weight -> blk.35.attn_q.weight | BF16 | [8192, 8192]
model.layers.35.self_attn.k_proj.weight -> blk.35.attn_k.weight | BF16 | [1024, 8192]
model.layers.35.self_attn.v_proj.weight -> blk.35.attn_v.weight | BF16 | [1024, 8192]
model.layers.35.self_attn.o_proj.weight -> blk.35.attn_output.weight | BF16 | [8192, 8192]
model.layers.35.mlp.gate_proj.weight -> blk.35.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.35.mlp.up_proj.weight -> blk.35.ffn_up.weight | BF16 | [28672, 8192]
model.layers.35.mlp.down_proj.weight -> blk.35.ffn_down.weight | BF16 | [8192, 28672]
model.layers.35.input_layernorm.weight -> blk.35.attn_norm.weight | BF16 | [8192]
model.layers.35.post_attention_layernorm.weight -> blk.35.ffn_norm.weight | BF16 | [8192]
model.layers.36.self_attn.q_proj.weight -> blk.36.attn_q.weight | BF16 | [8192, 8192]
model.layers.36.self_attn.k_proj.weight -> blk.36.attn_k.weight | BF16 | [1024, 8192]
model.layers.36.self_attn.v_proj.weight -> blk.36.attn_v.weight | BF16 | [1024, 8192]
model.layers.36.self_attn.o_proj.weight -> blk.36.attn_output.weight | BF16 | [8192, 8192]
model.layers.36.mlp.gate_proj.weight -> blk.36.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.36.mlp.up_proj.weight -> blk.36.ffn_up.weight | BF16 | [28672, 8192]
model.layers.36.mlp.down_proj.weight -> blk.36.ffn_down.weight | BF16 | [8192, 28672]
model.layers.36.input_layernorm.weight -> blk.36.attn_norm.weight | BF16 | [8192]
model.layers.36.post_attention_layernorm.weight -> blk.36.ffn_norm.weight | BF16 | [8192]
model.layers.37.self_attn.q_proj.weight -> blk.37.attn_q.weight | BF16 | [8192, 8192]
model.layers.37.self_attn.k_proj.weight -> blk.37.attn_k.weight | BF16 | [1024, 8192]
model.layers.37.self_attn.v_proj.weight -> blk.37.attn_v.weight | BF16 | [1024, 8192]
model.layers.37.self_attn.o_proj.weight -> blk.37.attn_output.weight | BF16 | [8192, 8192]
model.layers.37.mlp.gate_proj.weight -> blk.37.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.37.mlp.up_proj.weight -> blk.37.ffn_up.weight | BF16 | [28672, 8192]
model.layers.37.mlp.down_proj.weight -> blk.37.ffn_down.weight | BF16 | [8192, 28672]
model.layers.37.input_layernorm.weight -> blk.37.attn_norm.weight | BF16 | [8192]
model.layers.37.post_attention_layernorm.weight -> blk.37.ffn_norm.weight | BF16 | [8192]
model.layers.38.self_attn.q_proj.weight -> blk.38.attn_q.weight | BF16 | [8192, 8192]
model.layers.38.self_attn.k_proj.weight -> blk.38.attn_k.weight | BF16 | [1024, 8192]
model.layers.38.self_attn.v_proj.weight -> blk.38.attn_v.weight | BF16 | [1024, 8192]
model.layers.38.self_attn.o_proj.weight -> blk.38.attn_output.weight | BF16 | [8192, 8192]
model.layers.38.mlp.gate_proj.weight -> blk.38.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.38.mlp.up_proj.weight -> blk.38.ffn_up.weight | BF16 | [28672, 8192]
model.layers.38.mlp.down_proj.weight -> blk.38.ffn_down.weight | BF16 | [8192, 28672]
model.layers.38.input_layernorm.weight -> blk.38.attn_norm.weight | BF16 | [8192]
model.layers.38.post_attention_layernorm.weight -> blk.38.ffn_norm.weight | BF16 | [8192]
model.layers.39.self_attn.q_proj.weight -> blk.39.attn_q.weight | BF16 | [8192, 8192]
model.layers.39.self_attn.k_proj.weight -> blk.39.attn_k.weight | BF16 | [1024, 8192]
model.layers.39.self_attn.v_proj.weight -> blk.39.attn_v.weight | BF16 | [1024, 8192]
model.layers.39.self_attn.o_proj.weight -> blk.39.attn_output.weight | BF16 | [8192, 8192]
model.layers.39.mlp.gate_proj.weight -> blk.39.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.39.mlp.up_proj.weight -> blk.39.ffn_up.weight | BF16 | [28672, 8192]
model.layers.39.mlp.down_proj.weight -> blk.39.ffn_down.weight | BF16 | [8192, 28672]
model.layers.39.input_layernorm.weight -> blk.39.attn_norm.weight | BF16 | [8192]
model.layers.39.post_attention_layernorm.weight -> blk.39.ffn_norm.weight | BF16 | [8192]
model.layers.40.self_attn.q_proj.weight -> blk.40.attn_q.weight | BF16 | [8192, 8192]
model.layers.40.self_attn.k_proj.weight -> blk.40.attn_k.weight | BF16 | [1024, 8192]
model.layers.40.self_attn.v_proj.weight -> blk.40.attn_v.weight | BF16 | [1024, 8192]
model.layers.40.self_attn.o_proj.weight -> blk.40.attn_output.weight | BF16 | [8192, 8192]
model.layers.40.mlp.gate_proj.weight -> blk.40.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.40.mlp.up_proj.weight -> blk.40.ffn_up.weight | BF16 | [28672, 8192]
model.layers.40.mlp.down_proj.weight -> blk.40.ffn_down.weight | BF16 | [8192, 28672]
model.layers.40.input_layernorm.weight -> blk.40.attn_norm.weight | BF16 | [8192]
model.layers.40.post_attention_layernorm.weight -> blk.40.ffn_norm.weight | BF16 | [8192]
model.layers.41.self_attn.q_proj.weight -> blk.41.attn_q.weight | BF16 | [8192, 8192]
model.layers.41.self_attn.k_proj.weight -> blk.41.attn_k.weight | BF16 | [1024, 8192]
model.layers.41.self_attn.v_proj.weight -> blk.41.attn_v.weight | BF16 | [1024, 8192]
model.layers.41.self_attn.o_proj.weight -> blk.41.attn_output.weight | BF16 | [8192, 8192]
model.layers.41.mlp.gate_proj.weight -> blk.41.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.41.mlp.up_proj.weight -> blk.41.ffn_up.weight | BF16 | [28672, 8192]
model.layers.41.mlp.down_proj.weight -> blk.41.ffn_down.weight | BF16 | [8192, 28672]
model.layers.41.input_layernorm.weight -> blk.41.attn_norm.weight | BF16 | [8192]
model.layers.41.post_attention_layernorm.weight -> blk.41.ffn_norm.weight | BF16 | [8192]
model.layers.42.self_attn.q_proj.weight -> blk.42.attn_q.weight | BF16 | [8192, 8192]
model.layers.42.self_attn.k_proj.weight -> blk.42.attn_k.weight | BF16 | [1024, 8192]
model.layers.42.self_attn.v_proj.weight -> blk.42.attn_v.weight | BF16 | [1024, 8192]
model.layers.42.self_attn.o_proj.weight -> blk.42.attn_output.weight | BF16 | [8192, 8192]
model.layers.42.mlp.gate_proj.weight -> blk.42.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.42.mlp.up_proj.weight -> blk.42.ffn_up.weight | BF16 | [28672, 8192]
model.layers.42.mlp.down_proj.weight -> blk.42.ffn_down.weight | BF16 | [8192, 28672]
model.layers.42.input_layernorm.weight -> blk.42.attn_norm.weight | BF16 | [8192]
model.layers.42.post_attention_layernorm.weight -> blk.42.ffn_norm.weight | BF16 | [8192]
model.layers.43.self_attn.q_proj.weight -> blk.43.attn_q.weight | BF16 | [8192, 8192]
model.layers.43.self_attn.k_proj.weight -> blk.43.attn_k.weight | BF16 | [1024, 8192]
model.layers.43.self_attn.v_proj.weight -> blk.43.attn_v.weight | BF16 | [1024, 8192]
model.layers.43.self_attn.o_proj.weight -> blk.43.attn_output.weight | BF16 | [8192, 8192]
model.layers.43.mlp.gate_proj.weight -> blk.43.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.43.mlp.up_proj.weight -> blk.43.ffn_up.weight | BF16 | [28672, 8192]
model.layers.43.mlp.down_proj.weight -> blk.43.ffn_down.weight | BF16 | [8192, 28672]
model.layers.43.input_layernorm.weight -> blk.43.attn_norm.weight | BF16 | [8192]
model.layers.43.post_attention_layernorm.weight -> blk.43.ffn_norm.weight | BF16 | [8192]
model.layers.44.self_attn.q_proj.weight -> blk.44.attn_q.weight | BF16 | [8192, 8192]
model.layers.44.self_attn.k_proj.weight -> blk.44.attn_k.weight | BF16 | [1024, 8192]
model.layers.44.self_attn.v_proj.weight -> blk.44.attn_v.weight | BF16 | [1024, 8192]
model.layers.44.self_attn.o_proj.weight -> blk.44.attn_output.weight | BF16 | [8192, 8192]
model.layers.44.mlp.gate_proj.weight -> blk.44.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.44.mlp.up_proj.weight -> blk.44.ffn_up.weight | BF16 | [28672, 8192]
model.layers.44.mlp.down_proj.weight -> blk.44.ffn_down.weight | BF16 | [8192, 28672]
model.layers.44.input_layernorm.weight -> blk.44.attn_norm.weight | BF16 | [8192]
model.layers.44.post_attention_layernorm.weight -> blk.44.ffn_norm.weight | BF16 | [8192]
model.layers.45.self_attn.q_proj.weight -> blk.45.attn_q.weight | BF16 | [8192, 8192]
model.layers.45.self_attn.k_proj.weight -> blk.45.attn_k.weight | BF16 | [1024, 8192]
model.layers.45.self_attn.v_proj.weight -> blk.45.attn_v.weight | BF16 | [1024, 8192]
model.layers.45.self_attn.o_proj.weight -> blk.45.attn_output.weight | BF16 | [8192, 8192]
model.layers.45.mlp.gate_proj.weight -> blk.45.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.45.mlp.up_proj.weight -> blk.45.ffn_up.weight | BF16 | [28672, 8192]
model.layers.45.mlp.down_proj.weight -> blk.45.ffn_down.weight | BF16 | [8192, 28672]
model.layers.45.input_layernorm.weight -> blk.45.attn_norm.weight | BF16 | [8192]
model.layers.45.post_attention_layernorm.weight -> blk.45.ffn_norm.weight | BF16 | [8192]
model.layers.46.self_attn.q_proj.weight -> blk.46.attn_q.weight | BF16 | [8192, 8192]
model.layers.46.self_attn.k_proj.weight -> blk.46.attn_k.weight | BF16 | [1024, 8192]
model.layers.46.self_attn.v_proj.weight -> blk.46.attn_v.weight | BF16 | [1024, 8192]
model.layers.46.self_attn.o_proj.weight -> blk.46.attn_output.weight | BF16 | [8192, 8192]
model.layers.46.mlp.gate_proj.weight -> blk.46.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.46.mlp.up_proj.weight -> blk.46.ffn_up.weight | BF16 | [28672, 8192]
model.layers.46.mlp.down_proj.weight -> blk.46.ffn_down.weight | BF16 | [8192, 28672]
model.layers.46.input_layernorm.weight -> blk.46.attn_norm.weight | BF16 | [8192]
model.layers.46.post_attention_layernorm.weight -> blk.46.ffn_norm.weight | BF16 | [8192]
model.layers.47.self_attn.q_proj.weight -> blk.47.attn_q.weight | BF16 | [8192, 8192]
model.layers.47.self_attn.k_proj.weight -> blk.47.attn_k.weight | BF16 | [1024, 8192]
model.layers.47.self_attn.v_proj.weight -> blk.47.attn_v.weight | BF16 | [1024, 8192]
model.layers.47.self_attn.o_proj.weight -> blk.47.attn_output.weight | BF16 | [8192, 8192]
model.layers.47.mlp.gate_proj.weight -> blk.47.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.47.mlp.up_proj.weight -> blk.47.ffn_up.weight | BF16 | [28672, 8192]
model.layers.47.mlp.down_proj.weight -> blk.47.ffn_down.weight | BF16 | [8192, 28672]
model.layers.47.input_layernorm.weight -> blk.47.attn_norm.weight | BF16 | [8192]
model.layers.47.post_attention_layernorm.weight -> blk.47.ffn_norm.weight | BF16 | [8192]
model.layers.48.self_attn.q_proj.weight -> blk.48.attn_q.weight | BF16 | [8192, 8192]
model.layers.48.self_attn.k_proj.weight -> blk.48.attn_k.weight | BF16 | [1024, 8192]
model.layers.48.self_attn.v_proj.weight -> blk.48.attn_v.weight | BF16 | [1024, 8192]
model.layers.48.self_attn.o_proj.weight -> blk.48.attn_output.weight | BF16 | [8192, 8192]
model.layers.48.mlp.gate_proj.weight -> blk.48.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.48.mlp.up_proj.weight -> blk.48.ffn_up.weight | BF16 | [28672, 8192]
model.layers.48.mlp.down_proj.weight -> blk.48.ffn_down.weight | BF16 | [8192, 28672]
model.layers.48.input_layernorm.weight -> blk.48.attn_norm.weight | BF16 | [8192]
model.layers.48.post_attention_layernorm.weight -> blk.48.ffn_norm.weight | BF16 | [8192]
model.layers.49.self_attn.q_proj.weight -> blk.49.attn_q.weight | BF16 | [8192, 8192]
model.layers.49.self_attn.k_proj.weight -> blk.49.attn_k.weight | BF16 | [1024, 8192]
model.layers.49.self_attn.v_proj.weight -> blk.49.attn_v.weight | BF16 | [1024, 8192]
model.layers.49.self_attn.o_proj.weight -> blk.49.attn_output.weight | BF16 | [8192, 8192]
model.layers.49.mlp.gate_proj.weight -> blk.49.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.49.mlp.up_proj.weight -> blk.49.ffn_up.weight | BF16 | [28672, 8192]
model.layers.49.mlp.down_proj.weight -> blk.49.ffn_down.weight | BF16 | [8192, 28672]
model.layers.49.input_layernorm.weight -> blk.49.attn_norm.weight | BF16 | [8192]
model.layers.49.post_attention_layernorm.weight -> blk.49.ffn_norm.weight | BF16 | [8192]
model.layers.50.self_attn.q_proj.weight -> blk.50.attn_q.weight | BF16 | [8192, 8192]
model.layers.50.self_attn.k_proj.weight -> blk.50.attn_k.weight | BF16 | [1024, 8192]
model.layers.50.self_attn.v_proj.weight -> blk.50.attn_v.weight | BF16 | [1024, 8192]
model.layers.50.self_attn.o_proj.weight -> blk.50.attn_output.weight | BF16 | [8192, 8192]
model.layers.50.mlp.gate_proj.weight -> blk.50.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.50.mlp.up_proj.weight -> blk.50.ffn_up.weight | BF16 | [28672, 8192]
model.layers.50.mlp.down_proj.weight -> blk.50.ffn_down.weight | BF16 | [8192, 28672]
model.layers.50.input_layernorm.weight -> blk.50.attn_norm.weight | BF16 | [8192]
model.layers.50.post_attention_layernorm.weight -> blk.50.ffn_norm.weight | BF16 | [8192]
model.layers.51.self_attn.q_proj.weight -> blk.51.attn_q.weight | BF16 | [8192, 8192]
model.layers.51.self_attn.k_proj.weight -> blk.51.attn_k.weight | BF16 | [1024, 8192]
model.layers.51.self_attn.v_proj.weight -> blk.51.attn_v.weight | BF16 | [1024, 8192]
model.layers.51.self_attn.o_proj.weight -> blk.51.attn_output.weight | BF16 | [8192, 8192]
model.layers.51.mlp.gate_proj.weight -> blk.51.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.51.mlp.up_proj.weight -> blk.51.ffn_up.weight | BF16 | [28672, 8192]
model.layers.51.mlp.down_proj.weight -> blk.51.ffn_down.weight | BF16 | [8192, 28672]
model.layers.51.input_layernorm.weight -> blk.51.attn_norm.weight | BF16 | [8192]
model.layers.51.post_attention_layernorm.weight -> blk.51.ffn_norm.weight | BF16 | [8192]
model.layers.52.self_attn.q_proj.weight -> blk.52.attn_q.weight | BF16 | [8192, 8192]
model.layers.52.self_attn.k_proj.weight -> blk.52.attn_k.weight | BF16 | [1024, 8192]
model.layers.52.self_attn.v_proj.weight -> blk.52.attn_v.weight | BF16 | [1024, 8192]
model.layers.52.self_attn.o_proj.weight -> blk.52.attn_output.weight | BF16 | [8192, 8192]
model.layers.52.mlp.gate_proj.weight -> blk.52.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.52.mlp.up_proj.weight -> blk.52.ffn_up.weight | BF16 | [28672, 8192]
model.layers.52.mlp.down_proj.weight -> blk.52.ffn_down.weight | BF16 | [8192, 28672]
model.layers.52.input_layernorm.weight -> blk.52.attn_norm.weight | BF16 | [8192]
model.layers.52.post_attention_layernorm.weight -> blk.52.ffn_norm.weight | BF16 | [8192]
model.layers.53.self_attn.q_proj.weight -> blk.53.attn_q.weight | BF16 | [8192, 8192]
model.layers.53.self_attn.k_proj.weight -> blk.53.attn_k.weight | BF16 | [1024, 8192]
model.layers.53.self_attn.v_proj.weight -> blk.53.attn_v.weight | BF16 | [1024, 8192]
model.layers.53.self_attn.o_proj.weight -> blk.53.attn_output.weight | BF16 | [8192, 8192]
model.layers.53.mlp.gate_proj.weight -> blk.53.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.53.mlp.up_proj.weight -> blk.53.ffn_up.weight | BF16 | [28672, 8192]
model.layers.53.mlp.down_proj.weight -> blk.53.ffn_down.weight | BF16 | [8192, 28672]
model.layers.53.input_layernorm.weight -> blk.53.attn_norm.weight | BF16 | [8192]
model.layers.53.post_attention_layernorm.weight -> blk.53.ffn_norm.weight | BF16 | [8192]
model.layers.54.self_attn.q_proj.weight -> blk.54.attn_q.weight | BF16 | [8192, 8192]
model.layers.54.self_attn.k_proj.weight -> blk.54.attn_k.weight | BF16 | [1024, 8192]
model.layers.54.self_attn.v_proj.weight -> blk.54.attn_v.weight | BF16 | [1024, 8192]
model.layers.54.self_attn.o_proj.weight -> blk.54.attn_output.weight | BF16 | [8192, 8192]
model.layers.54.mlp.gate_proj.weight -> blk.54.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.54.mlp.up_proj.weight -> blk.54.ffn_up.weight | BF16 | [28672, 8192]
model.layers.54.mlp.down_proj.weight -> blk.54.ffn_down.weight | BF16 | [8192, 28672]
model.layers.54.input_layernorm.weight -> blk.54.attn_norm.weight | BF16 | [8192]
model.layers.54.post_attention_layernorm.weight -> blk.54.ffn_norm.weight | BF16 | [8192]
model.layers.55.self_attn.q_proj.weight -> blk.55.attn_q.weight | BF16 | [8192, 8192]
model.layers.55.self_attn.k_proj.weight -> blk.55.attn_k.weight | BF16 | [1024, 8192]
model.layers.55.self_attn.v_proj.weight -> blk.55.attn_v.weight | BF16 | [1024, 8192]
model.layers.55.self_attn.o_proj.weight -> blk.55.attn_output.weight | BF16 | [8192, 8192]
model.layers.55.mlp.gate_proj.weight -> blk.55.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.55.mlp.up_proj.weight -> blk.55.ffn_up.weight | BF16 | [28672, 8192]
model.layers.55.mlp.down_proj.weight -> blk.55.ffn_down.weight | BF16 | [8192, 28672]
model.layers.55.input_layernorm.weight -> blk.55.attn_norm.weight | BF16 | [8192]
model.layers.55.post_attention_layernorm.weight -> blk.55.ffn_norm.weight | BF16 | [8192]
model.layers.56.self_attn.q_proj.weight -> blk.56.attn_q.weight | BF16 | [8192, 8192]
model.layers.56.self_attn.k_proj.weight -> blk.56.attn_k.weight | BF16 | [1024, 8192]
model.layers.56.self_attn.v_proj.weight -> blk.56.attn_v.weight | BF16 | [1024, 8192]
model.layers.56.self_attn.o_proj.weight -> blk.56.attn_output.weight | BF16 | [8192, 8192]
model.layers.56.mlp.gate_proj.weight -> blk.56.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.56.mlp.up_proj.weight -> blk.56.ffn_up.weight | BF16 | [28672, 8192]
model.layers.56.mlp.down_proj.weight -> blk.56.ffn_down.weight | BF16 | [8192, 28672]
model.layers.56.input_layernorm.weight -> blk.56.attn_norm.weight | BF16 | [8192]
model.layers.56.post_attention_layernorm.weight -> blk.56.ffn_norm.weight | BF16 | [8192]
model.layers.57.self_attn.q_proj.weight -> blk.57.attn_q.weight | BF16 | [8192, 8192]
model.layers.57.self_attn.k_proj.weight -> blk.57.attn_k.weight | BF16 | [1024, 8192]
model.layers.57.self_attn.v_proj.weight -> blk.57.attn_v.weight | BF16 | [1024, 8192]
model.layers.57.self_attn.o_proj.weight -> blk.57.attn_output.weight | BF16 | [8192, 8192]
model.layers.57.mlp.gate_proj.weight -> blk.57.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.57.mlp.up_proj.weight -> blk.57.ffn_up.weight | BF16 | [28672, 8192]
model.layers.57.mlp.down_proj.weight -> blk.57.ffn_down.weight | BF16 | [8192, 28672]
model.layers.57.input_layernorm.weight -> blk.57.attn_norm.weight | BF16 | [8192]
model.layers.57.post_attention_layernorm.weight -> blk.57.ffn_norm.weight | BF16 | [8192]
model.layers.58.self_attn.q_proj.weight -> blk.58.attn_q.weight | BF16 | [8192, 8192]
model.layers.58.self_attn.k_proj.weight -> blk.58.attn_k.weight | BF16 | [1024, 8192]
model.layers.58.self_attn.v_proj.weight -> blk.58.attn_v.weight | BF16 | [1024, 8192]
model.layers.58.self_attn.o_proj.weight -> blk.58.attn_output.weight | BF16 | [8192, 8192]
model.layers.58.mlp.gate_proj.weight -> blk.58.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.58.mlp.up_proj.weight -> blk.58.ffn_up.weight | BF16 | [28672, 8192]
model.layers.58.mlp.down_proj.weight -> blk.58.ffn_down.weight | BF16 | [8192, 28672]
model.layers.58.input_layernorm.weight -> blk.58.attn_norm.weight | BF16 | [8192]
model.layers.58.post_attention_layernorm.weight -> blk.58.ffn_norm.weight | BF16 | [8192]
model.layers.59.self_attn.q_proj.weight -> blk.59.attn_q.weight | BF16 | [8192, 8192]
model.layers.59.self_attn.k_proj.weight -> blk.59.attn_k.weight | BF16 | [1024, 8192]
model.layers.59.self_attn.v_proj.weight -> blk.59.attn_v.weight | BF16 | [1024, 8192]
model.layers.59.self_attn.o_proj.weight -> blk.59.attn_output.weight | BF16 | [8192, 8192]
model.layers.59.mlp.gate_proj.weight -> blk.59.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.59.mlp.up_proj.weight -> blk.59.ffn_up.weight | BF16 | [28672, 8192]
model.layers.59.mlp.down_proj.weight -> blk.59.ffn_down.weight | BF16 | [8192, 28672]
model.layers.59.input_layernorm.weight -> blk.59.attn_norm.weight | BF16 | [8192]
model.layers.59.post_attention_layernorm.weight -> blk.59.ffn_norm.weight | BF16 | [8192]
model.layers.60.self_attn.q_proj.weight -> blk.60.attn_q.weight | BF16 | [8192, 8192]
model.layers.60.self_attn.k_proj.weight -> blk.60.attn_k.weight | BF16 | [1024, 8192]
model.layers.60.self_attn.v_proj.weight -> blk.60.attn_v.weight | BF16 | [1024, 8192]
model.layers.60.self_attn.o_proj.weight -> blk.60.attn_output.weight | BF16 | [8192, 8192]
model.layers.60.mlp.gate_proj.weight -> blk.60.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.60.mlp.up_proj.weight -> blk.60.ffn_up.weight | BF16 | [28672, 8192]
model.layers.60.mlp.down_proj.weight -> blk.60.ffn_down.weight | BF16 | [8192, 28672]
model.layers.60.input_layernorm.weight -> blk.60.attn_norm.weight | BF16 | [8192]
model.layers.60.post_attention_layernorm.weight -> blk.60.ffn_norm.weight | BF16 | [8192]
model.layers.61.self_attn.q_proj.weight -> blk.61.attn_q.weight | BF16 | [8192, 8192]
model.layers.61.self_attn.k_proj.weight -> blk.61.attn_k.weight | BF16 | [1024, 8192]
model.layers.61.self_attn.v_proj.weight -> blk.61.attn_v.weight | BF16 | [1024, 8192]
model.layers.61.self_attn.o_proj.weight -> blk.61.attn_output.weight | BF16 | [8192, 8192]
model.layers.61.mlp.gate_proj.weight -> blk.61.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.61.mlp.up_proj.weight -> blk.61.ffn_up.weight | BF16 | [28672, 8192]
model.layers.61.mlp.down_proj.weight -> blk.61.ffn_down.weight | BF16 | [8192, 28672]
model.layers.61.input_layernorm.weight -> blk.61.attn_norm.weight | BF16 | [8192]
model.layers.61.post_attention_layernorm.weight -> blk.61.ffn_norm.weight | BF16 | [8192]
model.layers.62.self_attn.q_proj.weight -> blk.62.attn_q.weight | BF16 | [8192, 8192]
model.layers.62.self_attn.k_proj.weight -> blk.62.attn_k.weight | BF16 | [1024, 8192]
model.layers.62.self_attn.v_proj.weight -> blk.62.attn_v.weight | BF16 | [1024, 8192]
model.layers.62.self_attn.o_proj.weight -> blk.62.attn_output.weight | BF16 | [8192, 8192]
model.layers.62.mlp.gate_proj.weight -> blk.62.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.62.mlp.up_proj.weight -> blk.62.ffn_up.weight | BF16 | [28672, 8192]
model.layers.62.mlp.down_proj.weight -> blk.62.ffn_down.weight | BF16 | [8192, 28672]
model.layers.62.input_layernorm.weight -> blk.62.attn_norm.weight | BF16 | [8192]
model.layers.62.post_attention_layernorm.weight -> blk.62.ffn_norm.weight | BF16 | [8192]
model.layers.63.self_attn.q_proj.weight -> blk.63.attn_q.weight | BF16 | [8192, 8192]
model.layers.63.self_attn.k_proj.weight -> blk.63.attn_k.weight | BF16 | [1024, 8192]
model.layers.63.self_attn.v_proj.weight -> blk.63.attn_v.weight | BF16 | [1024, 8192]
model.layers.63.self_attn.o_proj.weight -> blk.63.attn_output.weight | BF16 | [8192, 8192]
model.layers.63.mlp.gate_proj.weight -> blk.63.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.63.mlp.up_proj.weight -> blk.63.ffn_up.weight | BF16 | [28672, 8192]
model.layers.63.mlp.down_proj.weight -> blk.63.ffn_down.weight | BF16 | [8192, 28672]
model.layers.63.input_layernorm.weight -> blk.63.attn_norm.weight | BF16 | [8192]
model.layers.63.post_attention_layernorm.weight -> blk.63.ffn_norm.weight | BF16 | [8192]
model.layers.64.self_attn.q_proj.weight -> blk.64.attn_q.weight | BF16 | [8192, 8192]
model.layers.64.self_attn.k_proj.weight -> blk.64.attn_k.weight | BF16 | [1024, 8192]
model.layers.64.self_attn.v_proj.weight -> blk.64.attn_v.weight | BF16 | [1024, 8192]
model.layers.64.self_attn.o_proj.weight -> blk.64.attn_output.weight | BF16 | [8192, 8192]
model.layers.64.mlp.gate_proj.weight -> blk.64.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.64.mlp.up_proj.weight -> blk.64.ffn_up.weight | BF16 | [28672, 8192]
model.layers.64.mlp.down_proj.weight -> blk.64.ffn_down.weight | BF16 | [8192, 28672]
model.layers.64.input_layernorm.weight -> blk.64.attn_norm.weight | BF16 | [8192]
model.layers.64.post_attention_layernorm.weight -> blk.64.ffn_norm.weight | BF16 | [8192]
model.layers.65.self_attn.q_proj.weight -> blk.65.attn_q.weight | BF16 | [8192, 8192]
model.layers.65.self_attn.k_proj.weight -> blk.65.attn_k.weight | BF16 | [1024, 8192]
model.layers.65.self_attn.v_proj.weight -> blk.65.attn_v.weight | BF16 | [1024, 8192]
model.layers.65.self_attn.o_proj.weight -> blk.65.attn_output.weight | BF16 | [8192, 8192]
model.layers.65.mlp.gate_proj.weight -> blk.65.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.65.mlp.up_proj.weight -> blk.65.ffn_up.weight | BF16 | [28672, 8192]
model.layers.65.mlp.down_proj.weight -> blk.65.ffn_down.weight | BF16 | [8192, 28672]
model.layers.65.input_layernorm.weight -> blk.65.attn_norm.weight | BF16 | [8192]
model.layers.65.post_attention_layernorm.weight -> blk.65.ffn_norm.weight | BF16 | [8192]
model.layers.66.self_attn.q_proj.weight -> blk.66.attn_q.weight | BF16 | [8192, 8192]
model.layers.66.self_attn.k_proj.weight -> blk.66.attn_k.weight | BF16 | [1024, 8192]
model.layers.66.self_attn.v_proj.weight -> blk.66.attn_v.weight | BF16 | [1024, 8192]
model.layers.66.self_attn.o_proj.weight -> blk.66.attn_output.weight | BF16 | [8192, 8192]
model.layers.66.mlp.gate_proj.weight -> blk.66.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.66.mlp.up_proj.weight -> blk.66.ffn_up.weight | BF16 | [28672, 8192]
model.layers.66.mlp.down_proj.weight -> blk.66.ffn_down.weight | BF16 | [8192, 28672]
model.layers.66.input_layernorm.weight -> blk.66.attn_norm.weight | BF16 | [8192]
model.layers.66.post_attention_layernorm.weight -> blk.66.ffn_norm.weight | BF16 | [8192]
model.layers.67.self_attn.q_proj.weight -> blk.67.attn_q.weight | BF16 | [8192, 8192]
model.layers.67.self_attn.k_proj.weight -> blk.67.attn_k.weight | BF16 | [1024, 8192]
model.layers.67.self_attn.v_proj.weight -> blk.67.attn_v.weight | BF16 | [1024, 8192]
model.layers.67.self_attn.o_proj.weight -> blk.67.attn_output.weight | BF16 | [8192, 8192]
model.layers.67.mlp.gate_proj.weight -> blk.67.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.67.mlp.up_proj.weight -> blk.67.ffn_up.weight | BF16 | [28672, 8192]
model.layers.67.mlp.down_proj.weight -> blk.67.ffn_down.weight | BF16 | [8192, 28672]
model.layers.67.input_layernorm.weight -> blk.67.attn_norm.weight | BF16 | [8192]
model.layers.67.post_attention_layernorm.weight -> blk.67.ffn_norm.weight | BF16 | [8192]
model.layers.68.self_attn.q_proj.weight -> blk.68.attn_q.weight | BF16 | [8192, 8192]
model.layers.68.self_attn.k_proj.weight -> blk.68.attn_k.weight | BF16 | [1024, 8192]
model.layers.68.self_attn.v_proj.weight -> blk.68.attn_v.weight | BF16 | [1024, 8192]
model.layers.68.self_attn.o_proj.weight -> blk.68.attn_output.weight | BF16 | [8192, 8192]
model.layers.68.mlp.gate_proj.weight -> blk.68.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.68.mlp.up_proj.weight -> blk.68.ffn_up.weight | BF16 | [28672, 8192]
model.layers.68.mlp.down_proj.weight -> blk.68.ffn_down.weight | BF16 | [8192, 28672]
model.layers.68.input_layernorm.weight -> blk.68.attn_norm.weight | BF16 | [8192]
model.layers.68.post_attention_layernorm.weight -> blk.68.ffn_norm.weight | BF16 | [8192]
model.layers.69.self_attn.q_proj.weight -> blk.69.attn_q.weight | BF16 | [8192, 8192]
model.layers.69.self_attn.k_proj.weight -> blk.69.attn_k.weight | BF16 | [1024, 8192]
model.layers.69.self_attn.v_proj.weight -> blk.69.attn_v.weight | BF16 | [1024, 8192]
model.layers.69.self_attn.o_proj.weight -> blk.69.attn_output.weight | BF16 | [8192, 8192]
model.layers.69.mlp.gate_proj.weight -> blk.69.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.69.mlp.up_proj.weight -> blk.69.ffn_up.weight | BF16 | [28672, 8192]
model.layers.69.mlp.down_proj.weight -> blk.69.ffn_down.weight | BF16 | [8192, 28672]
model.layers.69.input_layernorm.weight -> blk.69.attn_norm.weight | BF16 | [8192]
model.layers.69.post_attention_layernorm.weight -> blk.69.ffn_norm.weight | BF16 | [8192]
model.layers.70.self_attn.q_proj.weight -> blk.70.attn_q.weight | BF16 | [8192, 8192]
model.layers.70.self_attn.k_proj.weight -> blk.70.attn_k.weight | BF16 | [1024, 8192]
model.layers.70.self_attn.v_proj.weight -> blk.70.attn_v.weight | BF16 | [1024, 8192]
model.layers.70.self_attn.o_proj.weight -> blk.70.attn_output.weight | BF16 | [8192, 8192]
model.layers.70.mlp.gate_proj.weight -> blk.70.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.70.mlp.up_proj.weight -> blk.70.ffn_up.weight | BF16 | [28672, 8192]
model.layers.70.mlp.down_proj.weight -> blk.70.ffn_down.weight | BF16 | [8192, 28672]
model.layers.70.input_layernorm.weight -> blk.70.attn_norm.weight | BF16 | [8192]
model.layers.70.post_attention_layernorm.weight -> blk.70.ffn_norm.weight | BF16 | [8192]
model.layers.71.self_attn.q_proj.weight -> blk.71.attn_q.weight | BF16 | [8192, 8192]
model.layers.71.self_attn.k_proj.weight -> blk.71.attn_k.weight | BF16 | [1024, 8192]
model.layers.71.self_attn.v_proj.weight -> blk.71.attn_v.weight | BF16 | [1024, 8192]
model.layers.71.self_attn.o_proj.weight -> blk.71.attn_output.weight | BF16 | [8192, 8192]
model.layers.71.mlp.gate_proj.weight -> blk.71.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.71.mlp.up_proj.weight -> blk.71.ffn_up.weight | BF16 | [28672, 8192]
model.layers.71.mlp.down_proj.weight -> blk.71.ffn_down.weight | BF16 | [8192, 28672]
model.layers.71.input_layernorm.weight -> blk.71.attn_norm.weight | BF16 | [8192]
model.layers.71.post_attention_layernorm.weight -> blk.71.ffn_norm.weight | BF16 | [8192]
model.layers.72.self_attn.q_proj.weight -> blk.72.attn_q.weight | BF16 | [8192, 8192]
model.layers.72.self_attn.k_proj.weight -> blk.72.attn_k.weight | BF16 | [1024, 8192]
model.layers.72.self_attn.v_proj.weight -> blk.72.attn_v.weight | BF16 | [1024, 8192]
model.layers.72.self_attn.o_proj.weight -> blk.72.attn_output.weight | BF16 | [8192, 8192]
model.layers.72.mlp.gate_proj.weight -> blk.72.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.72.mlp.up_proj.weight -> blk.72.ffn_up.weight | BF16 | [28672, 8192]
model.layers.72.mlp.down_proj.weight -> blk.72.ffn_down.weight | BF16 | [8192, 28672]
model.layers.72.input_layernorm.weight -> blk.72.attn_norm.weight | BF16 | [8192]
model.layers.72.post_attention_layernorm.weight -> blk.72.ffn_norm.weight | BF16 | [8192]
model.layers.73.self_attn.q_proj.weight -> blk.73.attn_q.weight | BF16 | [8192, 8192]
model.layers.73.self_attn.k_proj.weight -> blk.73.attn_k.weight | BF16 | [1024, 8192]
model.layers.73.self_attn.v_proj.weight -> blk.73.attn_v.weight | BF16 | [1024, 8192]
model.layers.73.self_attn.o_proj.weight -> blk.73.attn_output.weight | BF16 | [8192, 8192]
model.layers.73.mlp.gate_proj.weight -> blk.73.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.73.mlp.up_proj.weight -> blk.73.ffn_up.weight | BF16 | [28672, 8192]
model.layers.73.mlp.down_proj.weight -> blk.73.ffn_down.weight | BF16 | [8192, 28672]
model.layers.73.input_layernorm.weight -> blk.73.attn_norm.weight | BF16 | [8192]
model.layers.73.post_attention_layernorm.weight -> blk.73.ffn_norm.weight | BF16 | [8192]
model.layers.74.self_attn.q_proj.weight -> blk.74.attn_q.weight | BF16 | [8192, 8192]
model.layers.74.self_attn.k_proj.weight -> blk.74.attn_k.weight | BF16 | [1024, 8192]
model.layers.74.self_attn.v_proj.weight -> blk.74.attn_v.weight | BF16 | [1024, 8192]
model.layers.74.self_attn.o_proj.weight -> blk.74.attn_output.weight | BF16 | [8192, 8192]
model.layers.74.mlp.gate_proj.weight -> blk.74.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.74.mlp.up_proj.weight -> blk.74.ffn_up.weight | BF16 | [28672, 8192]
model.layers.74.mlp.down_proj.weight -> blk.74.ffn_down.weight | BF16 | [8192, 28672]
model.layers.74.input_layernorm.weight -> blk.74.attn_norm.weight | BF16 | [8192]
model.layers.74.post_attention_layernorm.weight -> blk.74.ffn_norm.weight | BF16 | [8192]
model.layers.75.self_attn.q_proj.weight -> blk.75.attn_q.weight | BF16 | [8192, 8192]
model.layers.75.self_attn.k_proj.weight -> blk.75.attn_k.weight | BF16 | [1024, 8192]
model.layers.75.self_attn.v_proj.weight -> blk.75.attn_v.weight | BF16 | [1024, 8192]
model.layers.75.self_attn.o_proj.weight -> blk.75.attn_output.weight | BF16 | [8192, 8192]
model.layers.75.mlp.gate_proj.weight -> blk.75.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.75.mlp.up_proj.weight -> blk.75.ffn_up.weight | BF16 | [28672, 8192]
model.layers.75.mlp.down_proj.weight -> blk.75.ffn_down.weight | BF16 | [8192, 28672]
model.layers.75.input_layernorm.weight -> blk.75.attn_norm.weight | BF16 | [8192]
model.layers.75.post_attention_layernorm.weight -> blk.75.ffn_norm.weight | BF16 | [8192]
model.layers.76.self_attn.q_proj.weight -> blk.76.attn_q.weight | BF16 | [8192, 8192]
model.layers.76.self_attn.k_proj.weight -> blk.76.attn_k.weight | BF16 | [1024, 8192]
model.layers.76.self_attn.v_proj.weight -> blk.76.attn_v.weight | BF16 | [1024, 8192]
model.layers.76.self_attn.o_proj.weight -> blk.76.attn_output.weight | BF16 | [8192, 8192]
model.layers.76.mlp.gate_proj.weight -> blk.76.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.76.mlp.up_proj.weight -> blk.76.ffn_up.weight | BF16 | [28672, 8192]
model.layers.76.mlp.down_proj.weight -> blk.76.ffn_down.weight | BF16 | [8192, 28672]
model.layers.76.input_layernorm.weight -> blk.76.attn_norm.weight | BF16 | [8192]
model.layers.76.post_attention_layernorm.weight -> blk.76.ffn_norm.weight | BF16 | [8192]
model.layers.77.self_attn.q_proj.weight -> blk.77.attn_q.weight | BF16 | [8192, 8192]
model.layers.77.self_attn.k_proj.weight -> blk.77.attn_k.weight | BF16 | [1024, 8192]
model.layers.77.self_attn.v_proj.weight -> blk.77.attn_v.weight | BF16 | [1024, 8192]
model.layers.77.self_attn.o_proj.weight -> blk.77.attn_output.weight | BF16 | [8192, 8192]
model.layers.77.mlp.gate_proj.weight -> blk.77.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.77.mlp.up_proj.weight -> blk.77.ffn_up.weight | BF16 | [28672, 8192]
model.layers.77.mlp.down_proj.weight -> blk.77.ffn_down.weight | BF16 | [8192, 28672]
model.layers.77.input_layernorm.weight -> blk.77.attn_norm.weight | BF16 | [8192]
model.layers.77.post_attention_layernorm.weight -> blk.77.ffn_norm.weight | BF16 | [8192]
model.layers.78.self_attn.q_proj.weight -> blk.78.attn_q.weight | BF16 | [8192, 8192]
model.layers.78.self_attn.k_proj.weight -> blk.78.attn_k.weight | BF16 | [1024, 8192]
model.layers.78.self_attn.v_proj.weight -> blk.78.attn_v.weight | BF16 | [1024, 8192]
model.layers.78.self_attn.o_proj.weight -> blk.78.attn_output.weight | BF16 | [8192, 8192]
model.layers.78.mlp.gate_proj.weight -> blk.78.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.78.mlp.up_proj.weight -> blk.78.ffn_up.weight | BF16 | [28672, 8192]
model.layers.78.mlp.down_proj.weight -> blk.78.ffn_down.weight | BF16 | [8192, 28672]
model.layers.78.input_layernorm.weight -> blk.78.attn_norm.weight | BF16 | [8192]
model.layers.78.post_attention_layernorm.weight -> blk.78.ffn_norm.weight | BF16 | [8192]
model.layers.79.self_attn.q_proj.weight -> blk.79.attn_q.weight | BF16 | [8192, 8192]
model.layers.79.self_attn.k_proj.weight -> blk.79.attn_k.weight | BF16 | [1024, 8192]
model.layers.79.self_attn.v_proj.weight -> blk.79.attn_v.weight | BF16 | [1024, 8192]
model.layers.79.self_attn.o_proj.weight -> blk.79.attn_output.weight | BF16 | [8192, 8192]
model.layers.79.mlp.gate_proj.weight -> blk.79.ffn_gate.weight | BF16 | [28672, 8192]
model.layers.79.mlp.up_proj.weight -> blk.79.ffn_up.weight | BF16 | [28672, 8192]
model.layers.79.mlp.down_proj.weight -> blk.79.ffn_down.weight | BF16 | [8192, 28672]
model.layers.79.input_layernorm.weight -> blk.79.attn_norm.weight | BF16 | [8192]
model.layers.79.post_attention_layernorm.weight -> blk.79.ffn_norm.weight | BF16 | [8192]
model.norm.weight -> output_norm.weight | BF16 | [8192]
lm_head.weight -> output.weight | BF16 | [32000, 8192]
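
The listing above is the complete HF-to-GGUF rename: nine per-layer tensors for each of the 80 blocks, plus the final norm and the LM head. The smaller K/V projections ([1024, 8192] against [8192, 8192] for Q) are the grouped-query-attention layout, in which the 64 query heads share 8 KV heads, so the K/V row count shrinks by a factor of 8. The rename itself is mechanical; below is a minimal Python sketch of the rule implied by the pairs printed above (the regex and the two lookup tables are reconstructions for illustration, not the converter's actual code):

    import re

    # Per-layer suffix table, inferred from the mapping lines above (HF -> GGUF).
    SUFFIX_MAP = {
        "self_attn.q_proj.weight": "attn_q.weight",
        "self_attn.k_proj.weight": "attn_k.weight",
        "self_attn.v_proj.weight": "attn_v.weight",
        "self_attn.o_proj.weight": "attn_output.weight",
        "mlp.gate_proj.weight": "ffn_gate.weight",
        "mlp.up_proj.weight": "ffn_up.weight",
        "mlp.down_proj.weight": "ffn_down.weight",
        "input_layernorm.weight": "attn_norm.weight",
        "post_attention_layernorm.weight": "ffn_norm.weight",
    }

    # Non-layer tensors; the norm and head pairs are in the log above, the
    # embedding pair is the standard Llama counterpart of token_embd.weight.
    TOP_LEVEL_MAP = {
        "model.embed_tokens.weight": "token_embd.weight",
        "model.norm.weight": "output_norm.weight",
        "lm_head.weight": "output.weight",
    }

    def hf_to_gguf(name: str) -> str:
        """Map a Hugging Face tensor name to its GGUF equivalent."""
        if name in TOP_LEVEL_MAP:
            return TOP_LEVEL_MAP[name]
        m = re.match(r"model\.layers\.(\d+)\.(.+)", name)
        if m and m.group(2) in SUFFIX_MAP:
            return f"blk.{m.group(1)}.{SUFFIX_MAP[m.group(2)]}"
        raise KeyError(f"no mapping for {name}")

    assert hf_to_gguf("model.layers.63.mlp.down_proj.weight") == "blk.63.ffn_down.weight"
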
Writing /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf, format 1
gguf: Setting special token type bos to 1
gguf: Setting special token type eos to 2
gguf: Setting special token type unk to 0
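
With the tensor map settled, the converter opens the output file ("format 1" is the GGUF container version) and records the tokenizer's special-token IDs as metadata before any tensor data is written. The three gguf: lines correspond to BOS=1, EOS=2, UNK=0, the standard SentencePiece/Llama assignments. As a rough sketch of that step using the gguf-py package that ships with llama.cpp (the writer methods below exist in gguf-py, but this is an illustrative reconstruction, not the converter's exact code path):

    from gguf import GGUFWriter  # gguf-py, bundled with llama.cpp

    # Illustrative path; the real run targets the fp16 GGUF named above.
    writer = GGUFWriter("spicyboros-70b-2.2.fp16.gguf", "llama")

    # IDs taken verbatim from the three log lines above.
    writer.add_bos_token_id(1)  # "Setting special token type bos to 1"
    writer.add_eos_token_id(2)  # "Setting special token type eos to 2"
    writer.add_unk_token_id(0)  # "Setting special token type unk to 0"
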
[ 1/723] Writing tensor token_embd.weight | size 32000 x 8192 | type F16 | T+ 2
[ 2/723] Writing tensor blk.0.attn_q.weight | size 8192 x 8192 | type F16 | T+ 2
[ 3/723] Writing tensor blk.0.attn_k.weight | size 1024 x 8192 | type F16 | T+ 2
[ 4/723] Writing tensor blk.0.attn_v.weight | size 1024 x 8192 | type F16 | T+ 2
[ 5/723] Writing tensor blk.0.attn_output.weight | size 8192 x 8192 | type F16 | T+ 2
[ 6/723] Writing tensor blk.0.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 3
[ 7/723] Writing tensor blk.0.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 3
[ 8/723] Writing tensor blk.0.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 3
[ 9/723] Writing tensor blk.0.attn_norm.weight | size 8192 | type F32 | T+ 3
[ 10/723] Writing tensor blk.0.ffn_norm.weight | size 8192 | type F32 | T+ 3
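
Each write line shows a running index out of 723 tensors (9 per block x 80 blocks, plus token_embd, output_norm and output = 723), the tensor's shape, the on-disk type, and "T+ N", which appears to be elapsed seconds since writing began. Note the type split visible in blk.0 above: the 2-D weight matrices go out as F16 (this is the MostlyF16 file type), while the 1-D norm vectors are kept at F32. A sketch of the rule the log implies (the real converter's type selection lives in convert.py and may handle more cases):

    import numpy as np

    def output_dtype(tensor: np.ndarray) -> str:
        # Rule implied by the log for the MostlyF16 ftype:
        # 1-D vectors (norms) stay F32, matrices are stored as F16.
        return "F32" if tensor.ndim == 1 else "F16"

    assert output_dtype(np.zeros((2, 2))) == "F16"  # any matrix, e.g. blk.0.ffn_down.weight
    assert output_dtype(np.zeros(2)) == "F32"       # any vector, e.g. blk.0.attn_norm.weight
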
[ 11/723] Writing tensor blk.1.attn_q.weight | size 8192 x 8192 | type F16 | T+ 3
[ 12/723] Writing tensor blk.1.attn_k.weight | size 1024 x 8192 | type F16 | T+ 3
[ 13/723] Writing tensor blk.1.attn_v.weight | size 1024 x 8192 | type F16 | T+ 3
[ 14/723] Writing tensor blk.1.attn_output.weight | size 8192 x 8192 | type F16 | T+ 3
[ 15/723] Writing tensor blk.1.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 5
[ 16/723] Writing tensor blk.1.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 5
[ 17/723] Writing tensor blk.1.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 5
[ 18/723] Writing tensor blk.1.attn_norm.weight | size 8192 | type F32 | T+ 6
[ 19/723] Writing tensor blk.1.ffn_norm.weight | size 8192 | type F32 | T+ 6
[ 20/723] Writing tensor blk.2.attn_q.weight | size 8192 x 8192 | type F16 | T+ 6
[ 21/723] Writing tensor blk.2.attn_k.weight | size 1024 x 8192 | type F16 | T+ 6
[ 22/723] Writing tensor blk.2.attn_v.weight | size 1024 x 8192 | type F16 | T+ 6
[ 23/723] Writing tensor blk.2.attn_output.weight | size 8192 x 8192 | type F16 | T+ 6
[ 24/723] Writing tensor blk.2.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 7
[ 25/723] Writing tensor blk.2.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 8
[ 26/723] Writing tensor blk.2.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 8
[ 27/723] Writing tensor blk.2.attn_norm.weight | size 8192 | type F32 | T+ 8
[ 28/723] Writing tensor blk.2.ffn_norm.weight | size 8192 | type F32 | T+ 8
[ 29/723] Writing tensor blk.3.attn_q.weight | size 8192 x 8192 | type F16 | T+ 8
[ 30/723] Writing tensor blk.3.attn_k.weight | size 1024 x 8192 | type F16 | T+ 8
[ 31/723] Writing tensor blk.3.attn_v.weight | size 1024 x 8192 | type F16 | T+ 8
[ 32/723] Writing tensor blk.3.attn_output.weight | size 8192 x 8192 | type F16 | T+ 8
[ 33/723] Writing tensor blk.3.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 10
[ 34/723] Writing tensor blk.3.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 10
[ 35/723] Writing tensor blk.3.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 11
[ 36/723] Writing tensor blk.3.attn_norm.weight | size 8192 | type F32 | T+ 11
[ 37/723] Writing tensor blk.3.ffn_norm.weight | size 8192 | type F32 | T+ 11
[ 38/723] Writing tensor blk.4.attn_q.weight | size 8192 x 8192 | type F16 | T+ 11
[ 39/723] Writing tensor blk.4.attn_k.weight | size 1024 x 8192 | type F16 | T+ 11
[ 40/723] Writing tensor blk.4.attn_v.weight | size 1024 x 8192 | type F16 | T+ 11
[ 41/723] Writing tensor blk.4.attn_output.weight | size 8192 x 8192 | type F16 | T+ 11
[ 42/723] Writing tensor blk.4.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 12
[ 43/723] Writing tensor blk.4.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 13
[ 44/723] Writing tensor blk.4.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 13
[ 45/723] Writing tensor blk.4.attn_norm.weight | size 8192 | type F32 | T+ 14
[ 46/723] Writing tensor blk.4.ffn_norm.weight | size 8192 | type F32 | T+ 14
[ 47/723] Writing tensor blk.5.attn_q.weight | size 8192 x 8192 | type F16 | T+ 14
[ 48/723] Writing tensor blk.5.attn_k.weight | size 1024 x 8192 | type F16 | T+ 14
[ 49/723] Writing tensor blk.5.attn_v.weight | size 1024 x 8192 | type F16 | T+ 14
[ 50/723] Writing tensor blk.5.attn_output.weight | size 8192 x 8192 | type F16 | T+ 14
[ 51/723] Writing tensor blk.5.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 15
[ 52/723] Writing tensor blk.5.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 15
[ 53/723] Writing tensor blk.5.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 16
[ 54/723] Writing tensor blk.5.attn_norm.weight | size 8192 | type F32 | T+ 16
[ 55/723] Writing tensor blk.5.ffn_norm.weight | size 8192 | type F32 | T+ 16
[ 56/723] Writing tensor blk.6.attn_q.weight | size 8192 x 8192 | type F16 | T+ 16
[ 57/723] Writing tensor blk.6.attn_k.weight | size 1024 x 8192 | type F16 | T+ 16
[ 58/723] Writing tensor blk.6.attn_v.weight | size 1024 x 8192 | type F16 | T+ 16
[ 59/723] Writing tensor blk.6.attn_output.weight | size 8192 x 8192 | type F16 | T+ 16
[ 60/723] Writing tensor blk.6.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 17
[ 61/723] Writing tensor blk.6.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 18
[ 62/723] Writing tensor blk.6.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 19
[ 63/723] Writing tensor blk.6.attn_norm.weight | size 8192 | type F32 | T+ 19
[ 64/723] Writing tensor blk.6.ffn_norm.weight | size 8192 | type F32 | T+ 19
[ 65/723] Writing tensor blk.7.attn_q.weight | size 8192 x 8192 | type F16 | T+ 19
[ 66/723] Writing tensor blk.7.attn_k.weight | size 1024 x 8192 | type F16 | T+ 19
[ 67/723] Writing tensor blk.7.attn_v.weight | size 1024 x 8192 | type F16 | T+ 19
[ 68/723] Writing tensor blk.7.attn_output.weight | size 8192 x 8192 | type F16 | T+ 19
[ 69/723] Writing tensor blk.7.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 20
[ 70/723] Writing tensor blk.7.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 21
[ 71/723] Writing tensor blk.7.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 21
[ 72/723] Writing tensor blk.7.attn_norm.weight | size 8192 | type F32 | T+ 21
[ 73/723] Writing tensor blk.7.ffn_norm.weight | size 8192 | type F32 | T+ 21
[ 74/723] Writing tensor blk.8.attn_q.weight | size 8192 x 8192 | type F16 | T+ 21
[ 75/723] Writing tensor blk.8.attn_k.weight | size 1024 x 8192 | type F16 | T+ 22
[ 76/723] Writing tensor blk.8.attn_v.weight | size 1024 x 8192 | type F16 | T+ 22
[ 77/723] Writing tensor blk.8.attn_output.weight | size 8192 x 8192 | type F16 | T+ 22
[ 78/723] Writing tensor blk.8.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 23
[ 79/723] Writing tensor blk.8.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 23
[ 80/723] Writing tensor blk.8.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 24
[ 81/723] Writing tensor blk.8.attn_norm.weight | size 8192 | type F32 | T+ 24
[ 82/723] Writing tensor blk.8.ffn_norm.weight | size 8192 | type F32 | T+ 24
[ 83/723] Writing tensor blk.9.attn_q.weight | size 8192 x 8192 | type F16 | T+ 24
[ 84/723] Writing tensor blk.9.attn_k.weight | size 1024 x 8192 | type F16 | T+ 24
[ 85/723] Writing tensor blk.9.attn_v.weight | size 1024 x 8192 | type F16 | T+ 24
[ 86/723] Writing tensor blk.9.attn_output.weight | size 8192 x 8192 | type F16 | T+ 24
[ 87/723] Writing tensor blk.9.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 25
[ 88/723] Writing tensor blk.9.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 26
[ 89/723] Writing tensor blk.9.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 26
[ 90/723] Writing tensor blk.9.attn_norm.weight | size 8192 | type F32 | T+ 27
[ 91/723] Writing tensor blk.9.ffn_norm.weight | size 8192 | type F32 | T+ 27
[ 92/723] Writing tensor blk.10.attn_q.weight | size 8192 x 8192 | type F16 | T+ 27
[ 93/723] Writing tensor blk.10.attn_k.weight | size 1024 x 8192 | type F16 | T+ 27
[ 94/723] Writing tensor blk.10.attn_v.weight | size 1024 x 8192 | type F16 | T+ 27
[ 95/723] Writing tensor blk.10.attn_output.weight | size 8192 x 8192 | type F16 | T+ 27
[ 96/723] Writing tensor blk.10.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 28
[ 97/723] Writing tensor blk.10.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 28
[ 98/723] Writing tensor blk.10.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 29
[ 99/723] Writing tensor blk.10.attn_norm.weight | size 8192 | type F32 | T+ 29
[100/723] Writing tensor blk.10.ffn_norm.weight | size 8192 | type F32 | T+ 29
[101/723] Writing tensor blk.11.attn_q.weight | size 8192 x 8192 | type F16 | T+ 29
[102/723] Writing tensor blk.11.attn_k.weight | size 1024 x 8192 | type F16 | T+ 29
[103/723] Writing tensor blk.11.attn_v.weight | size 1024 x 8192 | type F16 | T+ 29
[104/723] Writing tensor blk.11.attn_output.weight | size 8192 x 8192 | type F16 | T+ 29
[105/723] Writing tensor blk.11.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 30
[106/723] Writing tensor blk.11.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 31
[107/723] Writing tensor blk.11.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 32
[108/723] Writing tensor blk.11.attn_norm.weight | size 8192 | type F32 | T+ 32
[109/723] Writing tensor blk.11.ffn_norm.weight | size 8192 | type F32 | T+ 32
[110/723] Writing tensor blk.12.attn_q.weight | size 8192 x 8192 | type F16 | T+ 32
[111/723] Writing tensor blk.12.attn_k.weight | size 1024 x 8192 | type F16 | T+ 32
[112/723] Writing tensor blk.12.attn_v.weight | size 1024 x 8192 | type F16 | T+ 32
[113/723] Writing tensor blk.12.attn_output.weight | size 8192 x 8192 | type F16 | T+ 32
[114/723] Writing tensor blk.12.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 33
[115/723] Writing tensor blk.12.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 34
[116/723] Writing tensor blk.12.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 34
[117/723] Writing tensor blk.12.attn_norm.weight | size 8192 | type F32 | T+ 35
[118/723] Writing tensor blk.12.ffn_norm.weight | size 8192 | type F32 | T+ 35
[119/723] Writing tensor blk.13.attn_q.weight | size 8192 x 8192 | type F16 | T+ 35
[120/723] Writing tensor blk.13.attn_k.weight | size 1024 x 8192 | type F16 | T+ 35
[121/723] Writing tensor blk.13.attn_v.weight | size 1024 x 8192 | type F16 | T+ 35
[122/723] Writing tensor blk.13.attn_output.weight | size 8192 x 8192 | type F16 | T+ 35
[123/723] Writing tensor blk.13.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 36
[124/723] Writing tensor blk.13.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 36
[125/723] Writing tensor blk.13.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 37
[126/723] Writing tensor blk.13.attn_norm.weight | size 8192 | type F32 | T+ 37
[127/723] Writing tensor blk.13.ffn_norm.weight | size 8192 | type F32 | T+ 37
[128/723] Writing tensor blk.14.attn_q.weight | size 8192 x 8192 | type F16 | T+ 37
[129/723] Writing tensor blk.14.attn_k.weight | size 1024 x 8192 | type F16 | T+ 37
[130/723] Writing tensor blk.14.attn_v.weight | size 1024 x 8192 | type F16 | T+ 37
[131/723] Writing tensor blk.14.attn_output.weight | size 8192 x 8192 | type F16 | T+ 38
[132/723] Writing tensor blk.14.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 39
[133/723] Writing tensor blk.14.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 39
[134/723] Writing tensor blk.14.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 40
[135/723] Writing tensor blk.14.attn_norm.weight | size 8192 | type F32 | T+ 40
[136/723] Writing tensor blk.14.ffn_norm.weight | size 8192 | type F32 | T+ 40
[137/723] Writing tensor blk.15.attn_q.weight | size 8192 x 8192 | type F16 | T+ 40
[138/723] Writing tensor blk.15.attn_k.weight | size 1024 x 8192 | type F16 | T+ 40
[139/723] Writing tensor blk.15.attn_v.weight | size 1024 x 8192 | type F16 | T+ 40
[140/723] Writing tensor blk.15.attn_output.weight | size 8192 x 8192 | type F16 | T+ 40
[141/723] Writing tensor blk.15.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 41
[142/723] Writing tensor blk.15.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 42
[143/723] Writing tensor blk.15.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 43
[144/723] Writing tensor blk.15.attn_norm.weight | size 8192 | type F32 | T+ 43
[145/723] Writing tensor blk.15.ffn_norm.weight | size 8192 | type F32 | T+ 43
[146/723] Writing tensor blk.16.attn_q.weight | size 8192 x 8192 | type F16 | T+ 43
[147/723] Writing tensor blk.16.attn_k.weight | size 1024 x 8192 | type F16 | T+ 43
[148/723] Writing tensor blk.16.attn_v.weight | size 1024 x 8192 | type F16 | T+ 43
[149/723] Writing tensor blk.16.attn_output.weight | size 8192 x 8192 | type F16 | T+ 43
[150/723] Writing tensor blk.16.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 44
[151/723] Writing tensor blk.16.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 45
[152/723] Writing tensor blk.16.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 45
[153/723] Writing tensor blk.16.attn_norm.weight | size 8192 | type F32 | T+ 45
[154/723] Writing tensor blk.16.ffn_norm.weight | size 8192 | type F32 | T+ 45
[155/723] Writing tensor blk.17.attn_q.weight | size 8192 x 8192 | type F16 | T+ 45
[156/723] Writing tensor blk.17.attn_k.weight | size 1024 x 8192 | type F16 | T+ 46
[157/723] Writing tensor blk.17.attn_v.weight | size 1024 x 8192 | type F16 | T+ 46
[158/723] Writing tensor blk.17.attn_output.weight | size 8192 x 8192 | type F16 | T+ 46
[159/723] Writing tensor blk.17.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 47
[160/723] Writing tensor blk.17.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 47
[161/723] Writing tensor blk.17.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 48
[162/723] Writing tensor blk.17.attn_norm.weight | size 8192 | type F32 | T+ 48
[163/723] Writing tensor blk.17.ffn_norm.weight | size 8192 | type F32 | T+ 48
[164/723] Writing tensor blk.18.attn_q.weight | size 8192 x 8192 | type F16 | T+ 48
[165/723] Writing tensor blk.18.attn_k.weight | size 1024 x 8192 | type F16 | T+ 48
[166/723] Writing tensor blk.18.attn_v.weight | size 1024 x 8192 | type F16 | T+ 48
[167/723] Writing tensor blk.18.attn_output.weight | size 8192 x 8192 | type F16 | T+ 48
[168/723] Writing tensor blk.18.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 49
[169/723] Writing tensor blk.18.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 50
[170/723] Writing tensor blk.18.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 51
[171/723] Writing tensor blk.18.attn_norm.weight | size 8192 | type F32 | T+ 51
[172/723] Writing tensor blk.18.ffn_norm.weight | size 8192 | type F32 | T+ 51
[173/723] Writing tensor blk.19.attn_q.weight | size 8192 x 8192 | type F16 | T+ 51
[174/723] Writing tensor blk.19.attn_k.weight | size 1024 x 8192 | type F16 | T+ 51
[175/723] Writing tensor blk.19.attn_v.weight | size 1024 x 8192 | type F16 | T+ 51
[176/723] Writing tensor blk.19.attn_output.weight | size 8192 x 8192 | type F16 | T+ 51
[177/723] Writing tensor blk.19.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 52
[178/723] Writing tensor blk.19.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 53
[179/723] Writing tensor blk.19.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 53
[180/723] Writing tensor blk.19.attn_norm.weight | size 8192 | type F32 | T+ 54
[181/723] Writing tensor blk.19.ffn_norm.weight | size 8192 | type F32 | T+ 54
[182/723] Writing tensor blk.20.attn_q.weight | size 8192 x 8192 | type F16 | T+ 54
[183/723] Writing tensor blk.20.attn_k.weight | size 1024 x 8192 | type F16 | T+ 54
[184/723] Writing tensor blk.20.attn_v.weight | size 1024 x 8192 | type F16 | T+ 54
[185/723] Writing tensor blk.20.attn_output.weight | size 8192 x 8192 | type F16 | T+ 54
[186/723] Writing tensor blk.20.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 55
[187/723] Writing tensor blk.20.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 55
[188/723] Writing tensor blk.20.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 56
[189/723] Writing tensor blk.20.attn_norm.weight | size 8192 | type F32 | T+ 56
[190/723] Writing tensor blk.20.ffn_norm.weight | size 8192 | type F32 | T+ 56
[191/723] Writing tensor blk.21.attn_q.weight | size 8192 x 8192 | type F16 | T+ 56
[192/723] Writing tensor blk.21.attn_k.weight | size 1024 x 8192 | type F16 | T+ 56
[193/723] Writing tensor blk.21.attn_v.weight | size 1024 x 8192 | type F16 | T+ 56
[194/723] Writing tensor blk.21.attn_output.weight | size 8192 x 8192 | type F16 | T+ 56
[195/723] Writing tensor blk.21.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 57
[196/723] Writing tensor blk.21.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 58
[197/723] Writing tensor blk.21.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 59
[198/723] Writing tensor blk.21.attn_norm.weight | size 8192 | type F32 | T+ 59
[199/723] Writing tensor blk.21.ffn_norm.weight | size 8192 | type F32 | T+ 59
[200/723] Writing tensor blk.22.attn_q.weight | size 8192 x 8192 | type F16 | T+ 59
[201/723] Writing tensor blk.22.attn_k.weight | size 1024 x 8192 | type F16 | T+ 59
[202/723] Writing tensor blk.22.attn_v.weight | size 1024 x 8192 | type F16 | T+ 59
[203/723] Writing tensor blk.22.attn_output.weight | size 8192 x 8192 | type F16 | T+ 59
[204/723] Writing tensor blk.22.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 60
[205/723] Writing tensor blk.22.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 61
[206/723] Writing tensor blk.22.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 61
[207/723] Writing tensor blk.22.attn_norm.weight | size 8192 | type F32 | T+ 62
[208/723] Writing tensor blk.22.ffn_norm.weight | size 8192 | type F32 | T+ 62
[209/723] Writing tensor blk.23.attn_q.weight | size 8192 x 8192 | type F16 | T+ 62
[210/723] Writing tensor blk.23.attn_k.weight | size 1024 x 8192 | type F16 | T+ 62
[211/723] Writing tensor blk.23.attn_v.weight | size 1024 x 8192 | type F16 | T+ 62
[212/723] Writing tensor blk.23.attn_output.weight | size 8192 x 8192 | type F16 | T+ 62
[213/723] Writing tensor blk.23.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 63
[214/723] Writing tensor blk.23.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 63
[215/723] Writing tensor blk.23.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 64
[216/723] Writing tensor blk.23.attn_norm.weight | size 8192 | type F32 | T+ 64
[217/723] Writing tensor blk.23.ffn_norm.weight | size 8192 | type F32 | T+ 64
[218/723] Writing tensor blk.24.attn_q.weight | size 8192 x 8192 | type F16 | T+ 64
[219/723] Writing tensor blk.24.attn_k.weight | size 1024 x 8192 | type F16 | T+ 64
[220/723] Writing tensor blk.24.attn_v.weight | size 1024 x 8192 | type F16 | T+ 64
[221/723] Writing tensor blk.24.attn_output.weight | size 8192 x 8192 | type F16 | T+ 64
[222/723] Writing tensor blk.24.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 65
[223/723] Writing tensor blk.24.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 66
[224/723] Writing tensor blk.24.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 66
[225/723] Writing tensor blk.24.attn_norm.weight | size 8192 | type F32 | T+ 66
[226/723] Writing tensor blk.24.ffn_norm.weight | size 8192 | type F32 | T+ 66
[227/723] Writing tensor blk.25.attn_q.weight | size 8192 x 8192 | type F16 | T+ 66
[228/723] Writing tensor blk.25.attn_k.weight | size 1024 x 8192 | type F16 | T+ 67
[229/723] Writing tensor blk.25.attn_v.weight | size 1024 x 8192 | type F16 | T+ 67
[230/723] Writing tensor blk.25.attn_output.weight | size 8192 x 8192 | type F16 | T+ 67
[231/723] Writing tensor blk.25.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 68
[232/723] Writing tensor blk.25.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 68
[233/723] Writing tensor blk.25.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 69
[234/723] Writing tensor blk.25.attn_norm.weight | size 8192 | type F32 | T+ 69
[235/723] Writing tensor blk.25.ffn_norm.weight | size 8192 | type F32 | T+ 69
[236/723] Writing tensor blk.26.attn_q.weight | size 8192 x 8192 | type F16 | T+ 69
[237/723] Writing tensor blk.26.attn_k.weight | size 1024 x 8192 | type F16 | T+ 69
[238/723] Writing tensor blk.26.attn_v.weight | size 1024 x 8192 | type F16 | T+ 69
[239/723] Writing tensor blk.26.attn_output.weight | size 8192 x 8192 | type F16 | T+ 69
[240/723] Writing tensor blk.26.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 70
[241/723] Writing tensor blk.26.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 71
[242/723] Writing tensor blk.26.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 71
[243/723] Writing tensor blk.26.attn_norm.weight | size 8192 | type F32 | T+ 71
[244/723] Writing tensor blk.26.ffn_norm.weight | size 8192 | type F32 | T+ 71
[245/723] Writing tensor blk.27.attn_q.weight | size 8192 x 8192 | type F16 | T+ 71
[246/723] Writing tensor blk.27.attn_k.weight | size 1024 x 8192 | type F16 | T+ 72
[247/723] Writing tensor blk.27.attn_v.weight | size 1024 x 8192 | type F16 | T+ 72
[248/723] Writing tensor blk.27.attn_output.weight | size 8192 x 8192 | type F16 | T+ 72
[249/723] Writing tensor blk.27.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 73
[250/723] Writing tensor blk.27.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 73
[251/723] Writing tensor blk.27.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 74
[252/723] Writing tensor blk.27.attn_norm.weight | size 8192 | type F32 | T+ 74
[253/723] Writing tensor blk.27.ffn_norm.weight | size 8192 | type F32 | T+ 74
[254/723] Writing tensor blk.28.attn_q.weight | size 8192 x 8192 | type F16 | T+ 74
[255/723] Writing tensor blk.28.attn_k.weight | size 1024 x 8192 | type F16 | T+ 74
[256/723] Writing tensor blk.28.attn_v.weight | size 1024 x 8192 | type F16 | T+ 74
[257/723] Writing tensor blk.28.attn_output.weight | size 8192 x 8192 | type F16 | T+ 74
[258/723] Writing tensor blk.28.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 75
[259/723] Writing tensor blk.28.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 76
[260/723] Writing tensor blk.28.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 76
[261/723] Writing tensor blk.28.attn_norm.weight | size 8192 | type F32 | T+ 76
[262/723] Writing tensor blk.28.ffn_norm.weight | size 8192 | type F32 | T+ 76
[263/723] Writing tensor blk.29.attn_q.weight | size 8192 x 8192 | type F16 | T+ 76
[264/723] Writing tensor blk.29.attn_k.weight | size 1024 x 8192 | type F16 | T+ 77
[265/723] Writing tensor blk.29.attn_v.weight | size 1024 x 8192 | type F16 | T+ 77
[266/723] Writing tensor blk.29.attn_output.weight | size 8192 x 8192 | type F16 | T+ 77
[267/723] Writing tensor blk.29.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 78
[268/723] Writing tensor blk.29.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 78
[269/723] Writing tensor blk.29.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 79
[270/723] Writing tensor blk.29.attn_norm.weight | size 8192 | type F32 | T+ 79
[271/723] Writing tensor blk.29.ffn_norm.weight | size 8192 | type F32 | T+ 79
[272/723] Writing tensor blk.30.attn_q.weight | size 8192 x 8192 | type F16 | T+ 79
[273/723] Writing tensor blk.30.attn_k.weight | size 1024 x 8192 | type F16 | T+ 79
[274/723] Writing tensor blk.30.attn_v.weight | size 1024 x 8192 | type F16 | T+ 79
[275/723] Writing tensor blk.30.attn_output.weight | size 8192 x 8192 | type F16 | T+ 79
[276/723] Writing tensor blk.30.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 80
[277/723] Writing tensor blk.30.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 81
[278/723] Writing tensor blk.30.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 81
[279/723] Writing tensor blk.30.attn_norm.weight | size 8192 | type F32 | T+ 82
[280/723] Writing tensor blk.30.ffn_norm.weight | size 8192 | type F32 | T+ 82
[281/723] Writing tensor blk.31.attn_q.weight | size 8192 x 8192 | type F16 | T+ 82
[282/723] Writing tensor blk.31.attn_k.weight | size 1024 x 8192 | type F16 | T+ 82
[283/723] Writing tensor blk.31.attn_v.weight | size 1024 x 8192 | type F16 | T+ 82
[284/723] Writing tensor blk.31.attn_output.weight | size 8192 x 8192 | type F16 | T+ 82
[285/723] Writing tensor blk.31.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 83
[286/723] Writing tensor blk.31.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 83
[287/723] Writing tensor blk.31.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 84
[288/723] Writing tensor blk.31.attn_norm.weight | size 8192 | type F32 | T+ 84
[289/723] Writing tensor blk.31.ffn_norm.weight | size 8192 | type F32 | T+ 84
[290/723] Writing tensor blk.32.attn_q.weight | size 8192 x 8192 | type F16 | T+ 84
[291/723] Writing tensor blk.32.attn_k.weight | size 1024 x 8192 | type F16 | T+ 84
[292/723] Writing tensor blk.32.attn_v.weight | size 1024 x 8192 | type F16 | T+ 84
[293/723] Writing tensor blk.32.attn_output.weight | size 8192 x 8192 | type F16 | T+ 84
[294/723] Writing tensor blk.32.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 85
[295/723] Writing tensor blk.32.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 86
[296/723] Writing tensor blk.32.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 86
[297/723] Writing tensor blk.32.attn_norm.weight | size 8192 | type F32 | T+ 86
[298/723] Writing tensor blk.32.ffn_norm.weight | size 8192 | type F32 | T+ 86
[299/723] Writing tensor blk.33.attn_q.weight | size 8192 x 8192 | type F16 | T+ 86
[300/723] Writing tensor blk.33.attn_k.weight | size 1024 x 8192 | type F16 | T+ 87
[301/723] Writing tensor blk.33.attn_v.weight | size 1024 x 8192 | type F16 | T+ 87
[302/723] Writing tensor blk.33.attn_output.weight | size 8192 x 8192 | type F16 | T+ 87
[303/723] Writing tensor blk.33.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 88
[304/723] Writing tensor blk.33.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 89
[305/723] Writing tensor blk.33.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 89
[306/723] Writing tensor blk.33.attn_norm.weight | size 8192 | type F32 | T+ 89
[307/723] Writing tensor blk.33.ffn_norm.weight | size 8192 | type F32 | T+ 89
[308/723] Writing tensor blk.34.attn_q.weight | size 8192 x 8192 | type F16 | T+ 89
[309/723] Writing tensor blk.34.attn_k.weight | size 1024 x 8192 | type F16 | T+ 90
[310/723] Writing tensor blk.34.attn_v.weight | size 1024 x 8192 | type F16 | T+ 90
[311/723] Writing tensor blk.34.attn_output.weight | size 8192 x 8192 | type F16 | T+ 90
[312/723] Writing tensor blk.34.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 91
[313/723] Writing tensor blk.34.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 91
[314/723] Writing tensor blk.34.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 92
[315/723] Writing tensor blk.34.attn_norm.weight | size 8192 | type F32 | T+ 92
[316/723] Writing tensor blk.34.ffn_norm.weight | size 8192 | type F32 | T+ 92
[317/723] Writing tensor blk.35.attn_q.weight | size 8192 x 8192 | type F16 | T+ 92
[318/723] Writing tensor blk.35.attn_k.weight | size 1024 x 8192 | type F16 | T+ 92
[319/723] Writing tensor blk.35.attn_v.weight | size 1024 x 8192 | type F16 | T+ 92
[320/723] Writing tensor blk.35.attn_output.weight | size 8192 x 8192 | type F16 | T+ 92
[321/723] Writing tensor blk.35.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 94
[322/723] Writing tensor blk.35.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 94
[323/723] Writing tensor blk.35.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 95
[324/723] Writing tensor blk.35.attn_norm.weight | size 8192 | type F32 | T+ 95
[325/723] Writing tensor blk.35.ffn_norm.weight | size 8192 | type F32 | T+ 95
[326/723] Writing tensor blk.36.attn_q.weight | size 8192 x 8192 | type F16 | T+ 95
[327/723] Writing tensor blk.36.attn_k.weight | size 1024 x 8192 | type F16 | T+ 95
[328/723] Writing tensor blk.36.attn_v.weight | size 1024 x 8192 | type F16 | T+ 95
[329/723] Writing tensor blk.36.attn_output.weight | size 8192 x 8192 | type F16 | T+ 95
[330/723] Writing tensor blk.36.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 96
[331/723] Writing tensor blk.36.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 97
[332/723] Writing tensor blk.36.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 97
[333/723] Writing tensor blk.36.attn_norm.weight | size 8192 | type F32 | T+ 97
[334/723] Writing tensor blk.36.ffn_norm.weight | size 8192 | type F32 | T+ 97
[335/723] Writing tensor blk.37.attn_q.weight | size 8192 x 8192 | type F16 | T+ 97
[336/723] Writing tensor blk.37.attn_k.weight | size 1024 x 8192 | type F16 | T+ 97
[337/723] Writing tensor blk.37.attn_v.weight | size 1024 x 8192 | type F16 | T+ 97
[338/723] Writing tensor blk.37.attn_output.weight | size 8192 x 8192 | type F16 | T+ 97
[339/723] Writing tensor blk.37.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 99
[340/723] Writing tensor blk.37.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 99
[341/723] Writing tensor blk.37.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 100
[342/723] Writing tensor blk.37.attn_norm.weight | size 8192 | type F32 | T+ 100
[343/723] Writing tensor blk.37.ffn_norm.weight | size 8192 | type F32 | T+ 100
[344/723] Writing tensor blk.38.attn_q.weight | size 8192 x 8192 | type F16 | T+ 100
[345/723] Writing tensor blk.38.attn_k.weight | size 1024 x 8192 | type F16 | T+ 100
[346/723] Writing tensor blk.38.attn_v.weight | size 1024 x 8192 | type F16 | T+ 100
[347/723] Writing tensor blk.38.attn_output.weight | size 8192 x 8192 | type F16 | T+ 100
[348/723] Writing tensor blk.38.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 102
[349/723] Writing tensor blk.38.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 102
[350/723] Writing tensor blk.38.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 103
[351/723] Writing tensor blk.38.attn_norm.weight | size 8192 | type F32 | T+ 104
[352/723] Writing tensor blk.38.ffn_norm.weight | size 8192 | type F32 | T+ 104
[353/723] Writing tensor blk.39.attn_q.weight | size 8192 x 8192 | type F16 | T+ 104
[354/723] Writing tensor blk.39.attn_k.weight | size 1024 x 8192 | type F16 | T+ 104
[355/723] Writing tensor blk.39.attn_v.weight | size 1024 x 8192 | type F16 | T+ 104
[356/723] Writing tensor blk.39.attn_output.weight | size 8192 x 8192 | type F16 | T+ 104
[357/723] Writing tensor blk.39.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 104
[358/723] Writing tensor blk.39.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 105
[359/723] Writing tensor blk.39.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 106
[360/723] Writing tensor blk.39.attn_norm.weight | size 8192 | type F32 | T+ 106
[361/723] Writing tensor blk.39.ffn_norm.weight | size 8192 | type F32 | T+ 106
[362/723] Writing tensor blk.40.attn_q.weight | size 8192 x 8192 | type F16 | T+ 106
[363/723] Writing tensor blk.40.attn_k.weight | size 1024 x 8192 | type F16 | T+ 106
[364/723] Writing tensor blk.40.attn_v.weight | size 1024 x 8192 | type F16 | T+ 106
[365/723] Writing tensor blk.40.attn_output.weight | size 8192 x 8192 | type F16 | T+ 106
[366/723] Writing tensor blk.40.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 107
[367/723] Writing tensor blk.40.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 108
[368/723] Writing tensor blk.40.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 108
[369/723] Writing tensor blk.40.attn_norm.weight | size 8192 | type F32 | T+ 108
[370/723] Writing tensor blk.40.ffn_norm.weight | size 8192 | type F32 | T+ 108
[371/723] Writing tensor blk.41.attn_q.weight | size 8192 x 8192 | type F16 | T+ 108
[372/723] Writing tensor blk.41.attn_k.weight | size 1024 x 8192 | type F16 | T+ 108
[373/723] Writing tensor blk.41.attn_v.weight | size 1024 x 8192 | type F16 | T+ 109
[374/723] Writing tensor blk.41.attn_output.weight | size 8192 x 8192 | type F16 | T+ 109
[375/723] Writing tensor blk.41.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 110
[376/723] Writing tensor blk.41.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 110
[377/723] Writing tensor blk.41.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 111
[378/723] Writing tensor blk.41.attn_norm.weight | size 8192 | type F32 | T+ 111
[379/723] Writing tensor blk.41.ffn_norm.weight | size 8192 | type F32 | T+ 111
[380/723] Writing tensor blk.42.attn_q.weight | size 8192 x 8192 | type F16 | T+ 111
[381/723] Writing tensor blk.42.attn_k.weight | size 1024 x 8192 | type F16 | T+ 111
[382/723] Writing tensor blk.42.attn_v.weight | size 1024 x 8192 | type F16 | T+ 111
[383/723] Writing tensor blk.42.attn_output.weight | size 8192 x 8192 | type F16 | T+ 111
[384/723] Writing tensor blk.42.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 112
[385/723] Writing tensor blk.42.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 113
[386/723] Writing tensor blk.42.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 113
[387/723] Writing tensor blk.42.attn_norm.weight | size 8192 | type F32 | T+ 113
[388/723] Writing tensor blk.42.ffn_norm.weight | size 8192 | type F32 | T+ 113
[389/723] Writing tensor blk.43.attn_q.weight | size 8192 x 8192 | type F16 | T+ 113
[390/723] Writing tensor blk.43.attn_k.weight | size 1024 x 8192 | type F16 | T+ 113
[391/723] Writing tensor blk.43.attn_v.weight | size 1024 x 8192 | type F16 | T+ 113
[392/723] Writing tensor blk.43.attn_output.weight | size 8192 x 8192 | type F16 | T+ 114
[393/723] Writing tensor blk.43.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 116
[394/723] Writing tensor blk.43.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 116
[395/723] Writing tensor blk.43.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 117
[396/723] Writing tensor blk.43.attn_norm.weight | size 8192 | type F32 | T+ 117
[397/723] Writing tensor blk.43.ffn_norm.weight | size 8192 | type F32 | T+ 117
[398/723] Writing tensor blk.44.attn_q.weight | size 8192 x 8192 | type F16 | T+ 117
[399/723] Writing tensor blk.44.attn_k.weight | size 1024 x 8192 | type F16 | T+ 117
[400/723] Writing tensor blk.44.attn_v.weight | size 1024 x 8192 | type F16 | T+ 117
[401/723] Writing tensor blk.44.attn_output.weight | size 8192 x 8192 | type F16 | T+ 117
[402/723] Writing tensor blk.44.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 118
[403/723] Writing tensor blk.44.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 119
[404/723] Writing tensor blk.44.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 119
[405/723] Writing tensor blk.44.attn_norm.weight | size 8192 | type F32 | T+ 120
[406/723] Writing tensor blk.44.ffn_norm.weight | size 8192 | type F32 | T+ 120
[407/723] Writing tensor blk.45.attn_q.weight | size 8192 x 8192 | type F16 | T+ 120
[408/723] Writing tensor blk.45.attn_k.weight | size 1024 x 8192 | type F16 | T+ 120
[409/723] Writing tensor blk.45.attn_v.weight | size 1024 x 8192 | type F16 | T+ 120
[410/723] Writing tensor blk.45.attn_output.weight | size 8192 x 8192 | type F16 | T+ 120
[411/723] Writing tensor blk.45.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 121
[412/723] Writing tensor blk.45.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 121
[413/723] Writing tensor blk.45.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 122
[414/723] Writing tensor blk.45.attn_norm.weight | size 8192 | type F32 | T+ 122
[415/723] Writing tensor blk.45.ffn_norm.weight | size 8192 | type F32 | T+ 122
[416/723] Writing tensor blk.46.attn_q.weight | size 8192 x 8192 | type F16 | T+ 122
[417/723] Writing tensor blk.46.attn_k.weight | size 1024 x 8192 | type F16 | T+ 122
[418/723] Writing tensor blk.46.attn_v.weight | size 1024 x 8192 | type F16 | T+ 122
[419/723] Writing tensor blk.46.attn_output.weight | size 8192 x 8192 | type F16 | T+ 122
[420/723] Writing tensor blk.46.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 123
[421/723] Writing tensor blk.46.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 124
[422/723] Writing tensor blk.46.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 124
[423/723] Writing tensor blk.46.attn_norm.weight | size 8192 | type F32 | T+ 125
[424/723] Writing tensor blk.46.ffn_norm.weight | size 8192 | type F32 | T+ 125
[425/723] Writing tensor blk.47.attn_q.weight | size 8192 x 8192 | type F16 | T+ 125
[426/723] Writing tensor blk.47.attn_k.weight | size 1024 x 8192 | type F16 | T+ 125
[427/723] Writing tensor blk.47.attn_v.weight | size 1024 x 8192 | type F16 | T+ 125
[428/723] Writing tensor blk.47.attn_output.weight | size 8192 x 8192 | type F16 | T+ 125
[429/723] Writing tensor blk.47.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 126
[430/723] Writing tensor blk.47.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 126
[431/723] Writing tensor blk.47.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 127
[432/723] Writing tensor blk.47.attn_norm.weight | size 8192 | type F32 | T+ 127
[433/723] Writing tensor blk.47.ffn_norm.weight | size 8192 | type F32 | T+ 127
[434/723] Writing tensor blk.48.attn_q.weight | size 8192 x 8192 | type F16 | T+ 127
[435/723] Writing tensor blk.48.attn_k.weight | size 1024 x 8192 | type F16 | T+ 127
[436/723] Writing tensor blk.48.attn_v.weight | size 1024 x 8192 | type F16 | T+ 127
[437/723] Writing tensor blk.48.attn_output.weight | size 8192 x 8192 | type F16 | T+ 127
[438/723] Writing tensor blk.48.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 128
[439/723] Writing tensor blk.48.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 129
[440/723] Writing tensor blk.48.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 129
[441/723] Writing tensor blk.48.attn_norm.weight | size 8192 | type F32 | T+ 130
[442/723] Writing tensor blk.48.ffn_norm.weight | size 8192 | type F32 | T+ 130
[443/723] Writing tensor blk.49.attn_q.weight | size 8192 x 8192 | type F16 | T+ 130
[444/723] Writing tensor blk.49.attn_k.weight | size 1024 x 8192 | type F16 | T+ 130
[445/723] Writing tensor blk.49.attn_v.weight | size 1024 x 8192 | type F16 | T+ 130
[446/723] Writing tensor blk.49.attn_output.weight | size 8192 x 8192 | type F16 | T+ 130
[447/723] Writing tensor blk.49.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 131
[448/723] Writing tensor blk.49.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 131
[449/723] Writing tensor blk.49.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 132
[450/723] Writing tensor blk.49.attn_norm.weight | size 8192 | type F32 | T+ 132
[451/723] Writing tensor blk.49.ffn_norm.weight | size 8192 | type F32 | T+ 132
[452/723] Writing tensor blk.50.attn_q.weight | size 8192 x 8192 | type F16 | T+ 132
[453/723] Writing tensor blk.50.attn_k.weight | size 1024 x 8192 | type F16 | T+ 133
[454/723] Writing tensor blk.50.attn_v.weight | size 1024 x 8192 | type F16 | T+ 133
[455/723] Writing tensor blk.50.attn_output.weight | size 8192 x 8192 | type F16 | T+ 133
[456/723] Writing tensor blk.50.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 134
[457/723] Writing tensor blk.50.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 135
[458/723] Writing tensor blk.50.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 136
[459/723] Writing tensor blk.50.attn_norm.weight | size 8192 | type F32 | T+ 136
[460/723] Writing tensor blk.50.ffn_norm.weight | size 8192 | type F32 | T+ 136
[461/723] Writing tensor blk.51.attn_q.weight | size 8192 x 8192 | type F16 | T+ 136
[462/723] Writing tensor blk.51.attn_k.weight | size 1024 x 8192 | type F16 | T+ 136
[463/723] Writing tensor blk.51.attn_v.weight | size 1024 x 8192 | type F16 | T+ 136
[464/723] Writing tensor blk.51.attn_output.weight | size 8192 x 8192 | type F16 | T+ 136
[465/723] Writing tensor blk.51.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 137
[466/723] Writing tensor blk.51.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 138
[467/723] Writing tensor blk.51.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 138
[468/723] Writing tensor blk.51.attn_norm.weight | size 8192 | type F32 | T+ 138
[469/723] Writing tensor blk.51.ffn_norm.weight | size 8192 | type F32 | T+ 138
[470/723] Writing tensor blk.52.attn_q.weight | size 8192 x 8192 | type F16 | T+ 138
[471/723] Writing tensor blk.52.attn_k.weight | size 1024 x 8192 | type F16 | T+ 138
[472/723] Writing tensor blk.52.attn_v.weight | size 1024 x 8192 | type F16 | T+ 138
[473/723] Writing tensor blk.52.attn_output.weight | size 8192 x 8192 | type F16 | T+ 139
[474/723] Writing tensor blk.52.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 139
[475/723] Writing tensor blk.52.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 140
[476/723] Writing tensor blk.52.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 141
[477/723] Writing tensor blk.52.attn_norm.weight | size 8192 | type F32 | T+ 141
[478/723] Writing tensor blk.52.ffn_norm.weight | size 8192 | type F32 | T+ 141
[479/723] Writing tensor blk.53.attn_q.weight | size 8192 x 8192 | type F16 | T+ 141
[480/723] Writing tensor blk.53.attn_k.weight | size 1024 x 8192 | type F16 | T+ 141
[481/723] Writing tensor blk.53.attn_v.weight | size 1024 x 8192 | type F16 | T+ 141
[482/723] Writing tensor blk.53.attn_output.weight | size 8192 x 8192 | type F16 | T+ 141
[483/723] Writing tensor blk.53.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 142
[484/723] Writing tensor blk.53.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 143
[485/723] Writing tensor blk.53.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 143
[486/723] Writing tensor blk.53.attn_norm.weight | size 8192 | type F32 | T+ 144
[487/723] Writing tensor blk.53.ffn_norm.weight | size 8192 | type F32 | T+ 144
[488/723] Writing tensor blk.54.attn_q.weight | size 8192 x 8192 | type F16 | T+ 144
[489/723] Writing tensor blk.54.attn_k.weight | size 1024 x 8192 | type F16 | T+ 144
[490/723] Writing tensor blk.54.attn_v.weight | size 1024 x 8192 | type F16 | T+ 144
[491/723] Writing tensor blk.54.attn_output.weight | size 8192 x 8192 | type F16 | T+ 144
[492/723] Writing tensor blk.54.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 146
[493/723] Writing tensor blk.54.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 147
[494/723] Writing tensor blk.54.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 147
[495/723] Writing tensor blk.54.attn_norm.weight | size 8192 | type F32 | T+ 147
[496/723] Writing tensor blk.54.ffn_norm.weight | size 8192 | type F32 | T+ 147
[497/723] Writing tensor blk.55.attn_q.weight | size 8192 x 8192 | type F16 | T+ 147
[498/723] Writing tensor blk.55.attn_k.weight | size 1024 x 8192 | type F16 | T+ 148
[499/723] Writing tensor blk.55.attn_v.weight | size 1024 x 8192 | type F16 | T+ 148
[500/723] Writing tensor blk.55.attn_output.weight | size 8192 x 8192 | type F16 | T+ 148
[501/723] Writing tensor blk.55.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 148
[502/723] Writing tensor blk.55.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 149
[503/723] Writing tensor blk.55.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 150
[504/723] Writing tensor blk.55.attn_norm.weight | size 8192 | type F32 | T+ 150
[505/723] Writing tensor blk.55.ffn_norm.weight | size 8192 | type F32 | T+ 150
[506/723] Writing tensor blk.56.attn_q.weight | size 8192 x 8192 | type F16 | T+ 150
[507/723] Writing tensor blk.56.attn_k.weight | size 1024 x 8192 | type F16 | T+ 150
[508/723] Writing tensor blk.56.attn_v.weight | size 1024 x 8192 | type F16 | T+ 150
[509/723] Writing tensor blk.56.attn_output.weight | size 8192 x 8192 | type F16 | T+ 150
[510/723] Writing tensor blk.56.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 151
[511/723] Writing tensor blk.56.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 152
[512/723] Writing tensor blk.56.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 152
[513/723] Writing tensor blk.56.attn_norm.weight | size 8192 | type F32 | T+ 153
[514/723] Writing tensor blk.56.ffn_norm.weight | size 8192 | type F32 | T+ 153
[515/723] Writing tensor blk.57.attn_q.weight | size 8192 x 8192 | type F16 | T+ 153
[516/723] Writing tensor blk.57.attn_k.weight | size 1024 x 8192 | type F16 | T+ 153
[517/723] Writing tensor blk.57.attn_v.weight | size 1024 x 8192 | type F16 | T+ 153
[518/723] Writing tensor blk.57.attn_output.weight | size 8192 x 8192 | type F16 | T+ 153
[519/723] Writing tensor blk.57.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 154
[520/723] Writing tensor blk.57.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 154
[521/723] Writing tensor blk.57.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 155
[522/723] Writing tensor blk.57.attn_norm.weight | size 8192 | type F32 | T+ 155
[523/723] Writing tensor blk.57.ffn_norm.weight | size 8192 | type F32 | T+ 155
[524/723] Writing tensor blk.58.attn_q.weight | size 8192 x 8192 | type F16 | T+ 155
[525/723] Writing tensor blk.58.attn_k.weight | size 1024 x 8192 | type F16 | T+ 155
[526/723] Writing tensor blk.58.attn_v.weight | size 1024 x 8192 | type F16 | T+ 155
[527/723] Writing tensor blk.58.attn_output.weight | size 8192 x 8192 | type F16 | T+ 155
[528/723] Writing tensor blk.58.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 156
[529/723] Writing tensor blk.58.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 157
[530/723] Writing tensor blk.58.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 158
[531/723] Writing tensor blk.58.attn_norm.weight | size 8192 | type F32 | T+ 158
[532/723] Writing tensor blk.58.ffn_norm.weight | size 8192 | type F32 | T+ 158
[533/723] Writing tensor blk.59.attn_q.weight | size 8192 x 8192 | type F16 | T+ 158
[534/723] Writing tensor blk.59.attn_k.weight | size 1024 x 8192 | type F16 | T+ 158
[535/723] Writing tensor blk.59.attn_v.weight | size 1024 x 8192 | type F16 | T+ 158
[536/723] Writing tensor blk.59.attn_output.weight | size 8192 x 8192 | type F16 | T+ 158
[537/723] Writing tensor blk.59.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 159
[538/723] Writing tensor blk.59.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 160
[539/723] Writing tensor blk.59.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 161
[540/723] Writing tensor blk.59.attn_norm.weight | size 8192 | type F32 | T+ 161
[541/723] Writing tensor blk.59.ffn_norm.weight | size 8192 | type F32 | T+ 161
[542/723] Writing tensor blk.60.attn_q.weight | size 8192 x 8192 | type F16 | T+ 161
[543/723] Writing tensor blk.60.attn_k.weight | size 1024 x 8192 | type F16 | T+ 161
[544/723] Writing tensor blk.60.attn_v.weight | size 1024 x 8192 | type F16 | T+ 161
[545/723] Writing tensor blk.60.attn_output.weight | size 8192 x 8192 | type F16 | T+ 161
[546/723] Writing tensor blk.60.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 162
[547/723] Writing tensor blk.60.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 163
[548/723] Writing tensor blk.60.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 163
[549/723] Writing tensor blk.60.attn_norm.weight | size 8192 | type F32 | T+ 163
[550/723] Writing tensor blk.60.ffn_norm.weight | size 8192 | type F32 | T+ 163
[551/723] Writing tensor blk.61.attn_q.weight | size 8192 x 8192 | type F16 | T+ 163
[552/723] Writing tensor blk.61.attn_k.weight | size 1024 x 8192 | type F16 | T+ 164
[553/723] Writing tensor blk.61.attn_v.weight | size 1024 x 8192 | type F16 | T+ 164
[554/723] Writing tensor blk.61.attn_output.weight | size 8192 x 8192 | type F16 | T+ 164
[555/723] Writing tensor blk.61.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 164
[556/723] Writing tensor blk.61.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 165
[557/723] Writing tensor blk.61.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 166
[558/723] Writing tensor blk.61.attn_norm.weight | size 8192 | type F32 | T+ 166
[559/723] Writing tensor blk.61.ffn_norm.weight | size 8192 | type F32 | T+ 166
[560/723] Writing tensor blk.62.attn_q.weight | size 8192 x 8192 | type F16 | T+ 166
[561/723] Writing tensor blk.62.attn_k.weight | size 1024 x 8192 | type F16 | T+ 166
[562/723] Writing tensor blk.62.attn_v.weight | size 1024 x 8192 | type F16 | T+ 166
[563/723] Writing tensor blk.62.attn_output.weight | size 8192 x 8192 | type F16 | T+ 166
[564/723] Writing tensor blk.62.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 169
[565/723] Writing tensor blk.62.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 169
[566/723] Writing tensor blk.62.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 170
[567/723] Writing tensor blk.62.attn_norm.weight | size 8192 | type F32 | T+ 170
[568/723] Writing tensor blk.62.ffn_norm.weight | size 8192 | type F32 | T+ 170
[569/723] Writing tensor blk.63.attn_q.weight | size 8192 x 8192 | type F16 | T+ 170
[570/723] Writing tensor blk.63.attn_k.weight | size 1024 x 8192 | type F16 | T+ 170
[571/723] Writing tensor blk.63.attn_v.weight | size 1024 x 8192 | type F16 | T+ 170
[572/723] Writing tensor blk.63.attn_output.weight | size 8192 x 8192 | type F16 | T+ 170
[573/723] Writing tensor blk.63.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 171
[574/723] Writing tensor blk.63.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 172
[575/723] Writing tensor blk.63.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 173
[576/723] Writing tensor blk.63.attn_norm.weight | size 8192 | type F32 | T+ 173
[577/723] Writing tensor blk.63.ffn_norm.weight | size 8192 | type F32 | T+ 173
[578/723] Writing tensor blk.64.attn_q.weight | size 8192 x 8192 | type F16 | T+ 173
[579/723] Writing tensor blk.64.attn_k.weight | size 1024 x 8192 | type F16 | T+ 173
[580/723] Writing tensor blk.64.attn_v.weight | size 1024 x 8192 | type F16 | T+ 173
[581/723] Writing tensor blk.64.attn_output.weight | size 8192 x 8192 | type F16 | T+ 173
[582/723] Writing tensor blk.64.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 174
[583/723] Writing tensor blk.64.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 175
[584/723] Writing tensor blk.64.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 175
[585/723] Writing tensor blk.64.attn_norm.weight | size 8192 | type F32 | T+ 176
[586/723] Writing tensor blk.64.ffn_norm.weight | size 8192 | type F32 | T+ 176
[587/723] Writing tensor blk.65.attn_q.weight | size 8192 x 8192 | type F16 | T+ 176
[588/723] Writing tensor blk.65.attn_k.weight | size 1024 x 8192 | type F16 | T+ 176
[589/723] Writing tensor blk.65.attn_v.weight | size 1024 x 8192 | type F16 | T+ 176
[590/723] Writing tensor blk.65.attn_output.weight | size 8192 x 8192 | type F16 | T+ 176
[591/723] Writing tensor blk.65.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 177
[592/723] Writing tensor blk.65.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 177
[593/723] Writing tensor blk.65.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 178
[594/723] Writing tensor blk.65.attn_norm.weight | size 8192 | type F32 | T+ 178
[595/723] Writing tensor blk.65.ffn_norm.weight | size 8192 | type F32 | T+ 178
[596/723] Writing tensor blk.66.attn_q.weight | size 8192 x 8192 | type F16 | T+ 178
[597/723] Writing tensor blk.66.attn_k.weight | size 1024 x 8192 | type F16 | T+ 178
[598/723] Writing tensor blk.66.attn_v.weight | size 1024 x 8192 | type F16 | T+ 178
[599/723] Writing tensor blk.66.attn_output.weight | size 8192 x 8192 | type F16 | T+ 178
[600/723] Writing tensor blk.66.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 179
[601/723] Writing tensor blk.66.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 180
[602/723] Writing tensor blk.66.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 180
[603/723] Writing tensor blk.66.attn_norm.weight | size 8192 | type F32 | T+ 181
[604/723] Writing tensor blk.66.ffn_norm.weight | size 8192 | type F32 | T+ 181
[605/723] Writing tensor blk.67.attn_q.weight | size 8192 x 8192 | type F16 | T+ 181
[606/723] Writing tensor blk.67.attn_k.weight | size 1024 x 8192 | type F16 | T+ 181
[607/723] Writing tensor blk.67.attn_v.weight | size 1024 x 8192 | type F16 | T+ 181
[608/723] Writing tensor blk.67.attn_output.weight | size 8192 x 8192 | type F16 | T+ 181
[609/723] Writing tensor blk.67.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 182
[610/723] Writing tensor blk.67.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 182
[611/723] Writing tensor blk.67.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 183
[612/723] Writing tensor blk.67.attn_norm.weight | size 8192 | type F32 | T+ 183
[613/723] Writing tensor blk.67.ffn_norm.weight | size 8192 | type F32 | T+ 183
[614/723] Writing tensor blk.68.attn_q.weight | size 8192 x 8192 | type F16 | T+ 183
[615/723] Writing tensor blk.68.attn_k.weight | size 1024 x 8192 | type F16 | T+ 183
[616/723] Writing tensor blk.68.attn_v.weight | size 1024 x 8192 | type F16 | T+ 183
[617/723] Writing tensor blk.68.attn_output.weight | size 8192 x 8192 | type F16 | T+ 183
[618/723] Writing tensor blk.68.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 184
[619/723] Writing tensor blk.68.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 185
[620/723] Writing tensor blk.68.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 185
[621/723] Writing tensor blk.68.attn_norm.weight | size 8192 | type F32 | T+ 186
[622/723] Writing tensor blk.68.ffn_norm.weight | size 8192 | type F32 | T+ 186
[623/723] Writing tensor blk.69.attn_q.weight | size 8192 x 8192 | type F16 | T+ 186
[624/723] Writing tensor blk.69.attn_k.weight | size 1024 x 8192 | type F16 | T+ 186
[625/723] Writing tensor blk.69.attn_v.weight | size 1024 x 8192 | type F16 | T+ 186
[626/723] Writing tensor blk.69.attn_output.weight | size 8192 x 8192 | type F16 | T+ 186
[627/723] Writing tensor blk.69.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 187
[628/723] Writing tensor blk.69.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 187
[629/723] Writing tensor blk.69.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 188
[630/723] Writing tensor blk.69.attn_norm.weight | size 8192 | type F32 | T+ 188
[631/723] Writing tensor blk.69.ffn_norm.weight | size 8192 | type F32 | T+ 188
[632/723] Writing tensor blk.70.attn_q.weight | size 8192 x 8192 | type F16 | T+ 188
[633/723] Writing tensor blk.70.attn_k.weight | size 1024 x 8192 | type F16 | T+ 188
[634/723] Writing tensor blk.70.attn_v.weight | size 1024 x 8192 | type F16 | T+ 188
[635/723] Writing tensor blk.70.attn_output.weight | size 8192 x 8192 | type F16 | T+ 188
[636/723] Writing tensor blk.70.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 191
[637/723] Writing tensor blk.70.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 191
[638/723] Writing tensor blk.70.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 192
[639/723] Writing tensor blk.70.attn_norm.weight | size 8192 | type F32 | T+ 193
[640/723] Writing tensor blk.70.ffn_norm.weight | size 8192 | type F32 | T+ 193
[641/723] Writing tensor blk.71.attn_q.weight | size 8192 x 8192 | type F16 | T+ 193
[642/723] Writing tensor blk.71.attn_k.weight | size 1024 x 8192 | type F16 | T+ 193
[643/723] Writing tensor blk.71.attn_v.weight | size 1024 x 8192 | type F16 | T+ 193
[644/723] Writing tensor blk.71.attn_output.weight | size 8192 x 8192 | type F16 | T+ 193
[645/723] Writing tensor blk.71.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 193
[646/723] Writing tensor blk.71.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 194
[647/723] Writing tensor blk.71.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 195
[648/723] Writing tensor blk.71.attn_norm.weight | size 8192 | type F32 | T+ 195
[649/723] Writing tensor blk.71.ffn_norm.weight | size 8192 | type F32 | T+ 195
[650/723] Writing tensor blk.72.attn_q.weight | size 8192 x 8192 | type F16 | T+ 195
[651/723] Writing tensor blk.72.attn_k.weight | size 1024 x 8192 | type F16 | T+ 195
[652/723] Writing tensor blk.72.attn_v.weight | size 1024 x 8192 | type F16 | T+ 195
[653/723] Writing tensor blk.72.attn_output.weight | size 8192 x 8192 | type F16 | T+ 195
[654/723] Writing tensor blk.72.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 196
[655/723] Writing tensor blk.72.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 197
[656/723] Writing tensor blk.72.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 197
[657/723] Writing tensor blk.72.attn_norm.weight | size 8192 | type F32 | T+ 198
[658/723] Writing tensor blk.72.ffn_norm.weight | size 8192 | type F32 | T+ 198
[659/723] Writing tensor blk.73.attn_q.weight | size 8192 x 8192 | type F16 | T+ 198
[660/723] Writing tensor blk.73.attn_k.weight | size 1024 x 8192 | type F16 | T+ 198
[661/723] Writing tensor blk.73.attn_v.weight | size 1024 x 8192 | type F16 | T+ 198
[662/723] Writing tensor blk.73.attn_output.weight | size 8192 x 8192 | type F16 | T+ 198
[663/723] Writing tensor blk.73.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 199
[664/723] Writing tensor blk.73.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 199
[665/723] Writing tensor blk.73.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 200
[666/723] Writing tensor blk.73.attn_norm.weight | size 8192 | type F32 | T+ 200
[667/723] Writing tensor blk.73.ffn_norm.weight | size 8192 | type F32 | T+ 200
[668/723] Writing tensor blk.74.attn_q.weight | size 8192 x 8192 | type F16 | T+ 200
[669/723] Writing tensor blk.74.attn_k.weight | size 1024 x 8192 | type F16 | T+ 200
[670/723] Writing tensor blk.74.attn_v.weight | size 1024 x 8192 | type F16 | T+ 200
[671/723] Writing tensor blk.74.attn_output.weight | size 8192 x 8192 | type F16 | T+ 200
[672/723] Writing tensor blk.74.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 201
[673/723] Writing tensor blk.74.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 202
[674/723] Writing tensor blk.74.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 202
[675/723] Writing tensor blk.74.attn_norm.weight | size 8192 | type F32 | T+ 203
[676/723] Writing tensor blk.74.ffn_norm.weight | size 8192 | type F32 | T+ 203
[677/723] Writing tensor blk.75.attn_q.weight | size 8192 x 8192 | type F16 | T+ 203
[678/723] Writing tensor blk.75.attn_k.weight | size 1024 x 8192 | type F16 | T+ 203
[679/723] Writing tensor blk.75.attn_v.weight | size 1024 x 8192 | type F16 | T+ 203
[680/723] Writing tensor blk.75.attn_output.weight | size 8192 x 8192 | type F16 | T+ 203
[681/723] Writing tensor blk.75.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 204
[682/723] Writing tensor blk.75.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 204
[683/723] Writing tensor blk.75.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 205
[684/723] Writing tensor blk.75.attn_norm.weight | size 8192 | type F32 | T+ 205
[685/723] Writing tensor blk.75.ffn_norm.weight | size 8192 | type F32 | T+ 205
[686/723] Writing tensor blk.76.attn_q.weight | size 8192 x 8192 | type F16 | T+ 205
[687/723] Writing tensor blk.76.attn_k.weight | size 1024 x 8192 | type F16 | T+ 205
[688/723] Writing tensor blk.76.attn_v.weight | size 1024 x 8192 | type F16 | T+ 205
[689/723] Writing tensor blk.76.attn_output.weight | size 8192 x 8192 | type F16 | T+ 205
[690/723] Writing tensor blk.76.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 206
[691/723] Writing tensor blk.76.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 207
[692/723] Writing tensor blk.76.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 207
[693/723] Writing tensor blk.76.attn_norm.weight | size 8192 | type F32 | T+ 207
[694/723] Writing tensor blk.76.ffn_norm.weight | size 8192 | type F32 | T+ 207
[695/723] Writing tensor blk.77.attn_q.weight | size 8192 x 8192 | type F16 | T+ 207
[696/723] Writing tensor blk.77.attn_k.weight | size 1024 x 8192 | type F16 | T+ 207
[697/723] Writing tensor blk.77.attn_v.weight | size 1024 x 8192 | type F16 | T+ 207
[698/723] Writing tensor blk.77.attn_output.weight | size 8192 x 8192 | type F16 | T+ 207
[699/723] Writing tensor blk.77.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 209
[700/723] Writing tensor blk.77.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 209
[701/723] Writing tensor blk.77.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 210
[702/723] Writing tensor blk.77.attn_norm.weight | size 8192 | type F32 | T+ 210
[703/723] Writing tensor blk.77.ffn_norm.weight | size 8192 | type F32 | T+ 210
[704/723] Writing tensor blk.78.attn_q.weight | size 8192 x 8192 | type F16 | T+ 210
[705/723] Writing tensor blk.78.attn_k.weight | size 1024 x 8192 | type F16 | T+ 210
[706/723] Writing tensor blk.78.attn_v.weight | size 1024 x 8192 | type F16 | T+ 210
[707/723] Writing tensor blk.78.attn_output.weight | size 8192 x 8192 | type F16 | T+ 210
[708/723] Writing tensor blk.78.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 211
[709/723] Writing tensor blk.78.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 212
[710/723] Writing tensor blk.78.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 213
[711/723] Writing tensor blk.78.attn_norm.weight | size 8192 | type F32 | T+ 213
[712/723] Writing tensor blk.78.ffn_norm.weight | size 8192 | type F32 | T+ 213
[713/723] Writing tensor blk.79.attn_q.weight | size 8192 x 8192 | type F16 | T+ 213
[714/723] Writing tensor blk.79.attn_k.weight | size 1024 x 8192 | type F16 | T+ 213
[715/723] Writing tensor blk.79.attn_v.weight | size 1024 x 8192 | type F16 | T+ 213
[716/723] Writing tensor blk.79.attn_output.weight | size 8192 x 8192 | type F16 | T+ 213
[717/723] Writing tensor blk.79.ffn_gate.weight | size 28672 x 8192 | type F16 | T+ 214
[718/723] Writing tensor blk.79.ffn_up.weight | size 28672 x 8192 | type F16 | T+ 215
[719/723] Writing tensor blk.79.ffn_down.weight | size 8192 x 28672 | type F16 | T+ 215
[720/723] Writing tensor blk.79.attn_norm.weight | size 8192 | type F32 | T+ 216
[721/723] Writing tensor blk.79.ffn_norm.weight | size 8192 | type F32 | T+ 216
[722/723] Writing tensor output_norm.weight | size 8192 | type F32 | T+ 216
[723/723] Writing tensor output.weight | size 32000 x 8192 | type F16 | T+ 216
Wrote /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf
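
Note: the per-layer shapes in the write log above follow directly from the model geometry the log reports elsewhere (embedding width 8192, FFN width 28672, and a grouped-query attention KV width of 1024). The following is a minimal sketch, not part of the conversion tooling, that reproduces those shapes under exactly those assumed values:

    # Minimal sketch: derive the per-layer tensor shapes seen in the log above.
    # Assumed values (taken from the sizes printed in the log, not from the tool):
    # n_embd=8192, n_ff=28672, n_head=64, n_head_kv=8.
    n_embd, n_ff, n_head, n_head_kv = 8192, 28672, 64, 8
    head_dim = n_embd // n_head        # 128 per attention head
    kv_dim = head_dim * n_head_kv      # 1024 -> why attn_k/attn_v are 1024 x 8192

    shapes = {
        "attn_q":      (n_embd, n_embd),   # 8192 x 8192, F16
        "attn_k":      (kv_dim, n_embd),   # 1024 x 8192 (grouped-query attention)
        "attn_v":      (kv_dim, n_embd),   # 1024 x 8192
        "attn_output": (n_embd, n_embd),   # 8192 x 8192
        "ffn_gate":    (n_ff, n_embd),     # 28672 x 8192
        "ffn_up":      (n_ff, n_embd),     # 28672 x 8192
        "ffn_down":    (n_embd, n_ff),     # 8192 x 28672
        "attn_norm":   (n_embd,),          # 8192, stored as F32
        "ffn_norm":    (n_embd,),          # 8192, stored as F32
    }
    for name, shape in shapes.items():
        print(f"blk.N.{name}.weight", shape)
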
2023-09-12 22:37:28 INFO [jondurbin_spicyboros-70b-2.2.GGUF] Making Q4_0 : /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.Q4_0.gguf
2023-09-12 22:37:28 INFO [jondurbin_spicyboros-70b-2.2.GGUF] Quantizing with command: /workspace/git/gguf-llama/quantize /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.Q4_0.gguf Q4_0
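
Note: as the command line above shows, the pipeline shells out to llama.cpp's quantize binary with the pattern `quantize <fp16 input> <output> <quant type>`. A hypothetical wrapper for that same call pattern (illustrative only; the function name and signature are not from the pipeline) might look like:

    import subprocess

    def quantize_gguf(quantize_bin, fp16_path, out_path, quant_type="Q4_0"):
        # quantize_bin: path to llama.cpp's quantize executable,
        # e.g. /workspace/git/gguf-llama/quantize as in the log above.
        subprocess.run([quantize_bin, fp16_path, out_path, quant_type], check=True)
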
ggml_init_cublas: found 1 CUDA devices:
Device 0: NVIDIA A10, compute capability 8.6
main: build = 1216 (4f7cd6b)
main: quantizing '/workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf' to '/workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.Q4_0.gguf' as Q4_0
llama_model_loader: loaded meta data with 19 key-value pairs and 723 tensors from /workspace/process/jondurbin_spicyboros-70b-2.2/gguf/spicyboros-70b-2.2.fp16.gguf (version GGUF V2 (latest))
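
Note: the 723-tensor count reported here is consistent with the listing that follows: 80 blocks (blk.0 through blk.79) with 9 weight tensors each, plus token_embd.weight, output_norm.weight, and output.weight. A one-line arithmetic check, using only values visible in the log:

    # Sanity check on the reported tensor count.
    n_layers, per_layer, extras = 80, 9, 3   # extras: token_embd, output_norm, output
    assert n_layers * per_layer + extras == 723
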
llama_model_loader: - tensor 0: token_embd.weight f16 [ 8192, 32000, 1, 1 ]
llama_model_loader: - tensor 1: blk.0.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 2: blk.0.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 3: blk.0.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 4: blk.0.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 5: blk.0.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 6: blk.0.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 7: blk.0.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 8: blk.0.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 9: blk.0.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 10: blk.1.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 11: blk.1.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 12: blk.1.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 13: blk.1.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 14: blk.1.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 15: blk.1.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 16: blk.1.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 17: blk.1.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 18: blk.1.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 19: blk.2.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 20: blk.2.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 21: blk.2.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 22: blk.2.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 23: blk.2.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 24: blk.2.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 25: blk.2.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 26: blk.2.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 27: blk.2.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 28: blk.3.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 29: blk.3.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 30: blk.3.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 31: blk.3.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 32: blk.3.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 33: blk.3.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 34: blk.3.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 35: blk.3.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 36: blk.3.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 37: blk.4.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 38: blk.4.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 39: blk.4.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 40: blk.4.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 41: blk.4.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 42: blk.4.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 43: blk.4.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 44: blk.4.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 45: blk.4.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 46: blk.5.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 47: blk.5.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 48: blk.5.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 49: blk.5.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 50: blk.5.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 51: blk.5.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 52: blk.5.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 53: blk.5.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 54: blk.5.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 55: blk.6.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 56: blk.6.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 57: blk.6.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 58: blk.6.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 59: blk.6.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 60: blk.6.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 61: blk.6.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 62: blk.6.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 63: blk.6.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 64: blk.7.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 65: blk.7.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 66: blk.7.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 67: blk.7.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 68: blk.7.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 69: blk.7.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 70: blk.7.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 71: blk.7.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 72: blk.7.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 73: blk.8.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 74: blk.8.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 75: blk.8.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 76: blk.8.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 77: blk.8.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 78: blk.8.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 79: blk.8.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 80: blk.8.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 81: blk.8.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 82: blk.9.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 83: blk.9.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 84: blk.9.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 85: blk.9.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 86: blk.9.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 87: blk.9.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 88: blk.9.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 89: blk.9.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 90: blk.9.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 91: blk.10.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 92: blk.10.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 93: blk.10.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 94: blk.10.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 95: blk.10.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 96: blk.10.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 97: blk.10.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 98: blk.10.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 99: blk.10.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 100: blk.11.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 101: blk.11.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 102: blk.11.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 103: blk.11.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 104: blk.11.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 105: blk.11.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 106: blk.11.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 107: blk.11.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 108: blk.11.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 109: blk.12.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 110: blk.12.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 111: blk.12.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 112: blk.12.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 113: blk.12.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 114: blk.12.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 115: blk.12.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 116: blk.12.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 117: blk.12.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 118: blk.13.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 119: blk.13.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 120: blk.13.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 121: blk.13.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 122: blk.13.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 123: blk.13.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 124: blk.13.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 125: blk.13.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 126: blk.13.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 127: blk.14.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 128: blk.14.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 129: blk.14.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 130: blk.14.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 131: blk.14.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 132: blk.14.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 133: blk.14.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 134: blk.14.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 135: blk.14.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 136: blk.15.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 137: blk.15.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 138: blk.15.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 139: blk.15.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 140: blk.15.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 141: blk.15.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 142: blk.15.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 143: blk.15.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 144: blk.15.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 145: blk.16.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 146: blk.16.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 147: blk.16.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 148: blk.16.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 149: blk.16.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 150: blk.16.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 151: blk.16.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 152: blk.16.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 153: blk.16.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 154: blk.17.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 155: blk.17.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 156: blk.17.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 157: blk.17.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 158: blk.17.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 159: blk.17.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 160: blk.17.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 161: blk.17.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 162: blk.17.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 163: blk.18.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 164: blk.18.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 165: blk.18.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 166: blk.18.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 167: blk.18.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 168: blk.18.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 169: blk.18.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 170: blk.18.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 171: blk.18.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 172: blk.19.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 173: blk.19.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 174: blk.19.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 175: blk.19.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 176: blk.19.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 177: blk.19.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 178: blk.19.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 179: blk.19.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 180: blk.19.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 181: blk.20.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 182: blk.20.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 183: blk.20.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 184: blk.20.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 185: blk.20.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 186: blk.20.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 187: blk.20.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 188: blk.20.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 189: blk.20.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 190: blk.21.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 191: blk.21.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 192: blk.21.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 193: blk.21.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 194: blk.21.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 195: blk.21.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 196: blk.21.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 197: blk.21.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 198: blk.21.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 199: blk.22.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 200: blk.22.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 201: blk.22.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 202: blk.22.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 203: blk.22.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 204: blk.22.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 205: blk.22.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 206: blk.22.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 207: blk.22.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 208: blk.23.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 209: blk.23.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 210: blk.23.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 211: blk.23.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 212: blk.23.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 213: blk.23.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 214: blk.23.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 215: blk.23.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 216: blk.23.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 217: blk.24.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 218: blk.24.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 219: blk.24.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 220: blk.24.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 221: blk.24.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 222: blk.24.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 223: blk.24.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 224: blk.24.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 225: blk.24.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 226: blk.25.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 227: blk.25.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 228: blk.25.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 229: blk.25.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 230: blk.25.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 231: blk.25.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 232: blk.25.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 233: blk.25.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 234: blk.25.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 235: blk.26.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 236: blk.26.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 237: blk.26.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 238: blk.26.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 239: blk.26.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 240: blk.26.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 241: blk.26.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 242: blk.26.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 243: blk.26.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 244: blk.27.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 245: blk.27.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 246: blk.27.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 247: blk.27.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 248: blk.27.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 249: blk.27.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 250: blk.27.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 251: blk.27.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 252: blk.27.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 253: blk.28.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 254: blk.28.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 255: blk.28.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 256: blk.28.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 257: blk.28.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 258: blk.28.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 259: blk.28.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 260: blk.28.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 261: blk.28.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 262: blk.29.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 263: blk.29.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 264: blk.29.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 265: blk.29.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 266: blk.29.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 267: blk.29.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 268: blk.29.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 269: blk.29.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 270: blk.29.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 271: blk.30.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 272: blk.30.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 273: blk.30.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 274: blk.30.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 275: blk.30.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 276: blk.30.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 277: blk.30.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 278: blk.30.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 279: blk.30.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 280: blk.31.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 281: blk.31.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 282: blk.31.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 283: blk.31.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 284: blk.31.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 285: blk.31.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 286: blk.31.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 287: blk.31.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 288: blk.31.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 289: blk.32.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 290: blk.32.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 291: blk.32.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 292: blk.32.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 293: blk.32.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 294: blk.32.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 295: blk.32.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 296: blk.32.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 297: blk.32.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 298: blk.33.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 299: blk.33.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 300: blk.33.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 301: blk.33.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 302: blk.33.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 303: blk.33.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 304: blk.33.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 305: blk.33.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 306: blk.33.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 307: blk.34.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 308: blk.34.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 309: blk.34.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 310: blk.34.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 311: blk.34.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 312: blk.34.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 313: blk.34.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 314: blk.34.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 315: blk.34.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 316: blk.35.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 317: blk.35.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 318: blk.35.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 319: blk.35.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 320: blk.35.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 321: blk.35.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 322: blk.35.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 323: blk.35.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 324: blk.35.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 325: blk.36.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 326: blk.36.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 327: blk.36.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 328: blk.36.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 329: blk.36.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 330: blk.36.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 331: blk.36.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 332: blk.36.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 333: blk.36.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 334: blk.37.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 335: blk.37.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 336: blk.37.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 337: blk.37.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 338: blk.37.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 339: blk.37.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 340: blk.37.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 341: blk.37.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 342: blk.37.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 343: blk.38.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 344: blk.38.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 345: blk.38.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 346: blk.38.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 347: blk.38.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 348: blk.38.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 349: blk.38.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 350: blk.38.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 351: blk.38.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 352: blk.39.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 353: blk.39.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 354: blk.39.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 355: blk.39.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 356: blk.39.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 357: blk.39.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 358: blk.39.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 359: blk.39.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 360: blk.39.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 361: blk.40.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 362: blk.40.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 363: blk.40.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 364: blk.40.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 365: blk.40.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 366: blk.40.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 367: blk.40.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 368: blk.40.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 369: blk.40.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 370: blk.41.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 371: blk.41.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 372: blk.41.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 373: blk.41.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 374: blk.41.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 375: blk.41.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 376: blk.41.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 377: blk.41.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 378: blk.41.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 379: blk.42.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 380: blk.42.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 381: blk.42.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 382: blk.42.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 383: blk.42.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 384: blk.42.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 385: blk.42.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 386: blk.42.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 387: blk.42.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 388: blk.43.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 389: blk.43.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 390: blk.43.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 391: blk.43.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 392: blk.43.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 393: blk.43.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 394: blk.43.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 395: blk.43.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 396: blk.43.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 397: blk.44.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 398: blk.44.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 399: blk.44.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 400: blk.44.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 401: blk.44.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 402: blk.44.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 403: blk.44.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 404: blk.44.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 405: blk.44.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 406: blk.45.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 407: blk.45.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 408: blk.45.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 409: blk.45.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 410: blk.45.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 411: blk.45.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 412: blk.45.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 413: blk.45.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 414: blk.45.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 415: blk.46.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 416: blk.46.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 417: blk.46.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 418: blk.46.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 419: blk.46.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 420: blk.46.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 421: blk.46.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 422: blk.46.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 423: blk.46.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 424: blk.47.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 425: blk.47.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 426: blk.47.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 427: blk.47.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 428: blk.47.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 429: blk.47.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 430: blk.47.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 431: blk.47.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 432: blk.47.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 433: blk.48.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 434: blk.48.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 435: blk.48.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 436: blk.48.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 437: blk.48.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 438: blk.48.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 439: blk.48.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 440: blk.48.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 441: blk.48.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 442: blk.49.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 443: blk.49.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 444: blk.49.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 445: blk.49.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 446: blk.49.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 447: blk.49.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 448: blk.49.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 449: blk.49.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 450: blk.49.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 451: blk.50.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 452: blk.50.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 453: blk.50.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 454: blk.50.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 455: blk.50.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 456: blk.50.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 457: blk.50.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 458: blk.50.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 459: blk.50.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 460: blk.51.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 461: blk.51.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 462: blk.51.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 463: blk.51.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 464: blk.51.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 465: blk.51.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 466: blk.51.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 467: blk.51.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 468: blk.51.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 469: blk.52.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 470: blk.52.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 471: blk.52.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 472: blk.52.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 473: blk.52.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 474: blk.52.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 475: blk.52.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 476: blk.52.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 477: blk.52.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 478: blk.53.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 479: blk.53.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 480: blk.53.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 481: blk.53.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 482: blk.53.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 483: blk.53.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 484: blk.53.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 485: blk.53.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 486: blk.53.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 487: blk.54.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 488: blk.54.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 489: blk.54.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 490: blk.54.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 491: blk.54.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 492: blk.54.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 493: blk.54.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 494: blk.54.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 495: blk.54.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 496: blk.55.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 497: blk.55.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 498: blk.55.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 499: blk.55.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 500: blk.55.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 501: blk.55.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 502: blk.55.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 503: blk.55.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 504: blk.55.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 505: blk.56.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 506: blk.56.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 507: blk.56.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 508: blk.56.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 509: blk.56.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 510: blk.56.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 511: blk.56.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 512: blk.56.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 513: blk.56.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 514: blk.57.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 515: blk.57.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 516: blk.57.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 517: blk.57.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 518: blk.57.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 519: blk.57.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 520: blk.57.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 521: blk.57.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 522: blk.57.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 523: blk.58.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 524: blk.58.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 525: blk.58.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 526: blk.58.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 527: blk.58.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 528: blk.58.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 529: blk.58.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 530: blk.58.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 531: blk.58.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 532: blk.59.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 533: blk.59.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 534: blk.59.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 535: blk.59.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 536: blk.59.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 537: blk.59.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 538: blk.59.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 539: blk.59.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 540: blk.59.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 541: blk.60.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 542: blk.60.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 543: blk.60.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 544: blk.60.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 545: blk.60.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 546: blk.60.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 547: blk.60.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 548: blk.60.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 549: blk.60.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 550: blk.61.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 551: blk.61.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 552: blk.61.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 553: blk.61.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 554: blk.61.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 555: blk.61.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 556: blk.61.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 557: blk.61.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 558: blk.61.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 559: blk.62.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 560: blk.62.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 561: blk.62.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 562: blk.62.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 563: blk.62.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 564: blk.62.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 565: blk.62.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 566: blk.62.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 567: blk.62.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 568: blk.63.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 569: blk.63.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 570: blk.63.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 571: blk.63.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 572: blk.63.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 573: blk.63.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 574: blk.63.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 575: blk.63.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 576: blk.63.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 577: blk.64.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 578: blk.64.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 579: blk.64.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 580: blk.64.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 581: blk.64.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 582: blk.64.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 583: blk.64.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 584: blk.64.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 585: blk.64.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 586: blk.65.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 587: blk.65.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 588: blk.65.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 589: blk.65.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 590: blk.65.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 591: blk.65.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 592: blk.65.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 593: blk.65.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 594: blk.65.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 595: blk.66.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 596: blk.66.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 597: blk.66.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 598: blk.66.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 599: blk.66.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 600: blk.66.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 601: blk.66.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 602: blk.66.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 603: blk.66.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 604: blk.67.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 605: blk.67.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 606: blk.67.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 607: blk.67.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 608: blk.67.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 609: blk.67.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 610: blk.67.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 611: blk.67.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 612: blk.67.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 613: blk.68.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 614: blk.68.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 615: blk.68.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 616: blk.68.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 617: blk.68.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 618: blk.68.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 619: blk.68.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 620: blk.68.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 621: blk.68.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 622: blk.69.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 623: blk.69.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 624: blk.69.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 625: blk.69.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 626: blk.69.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 627: blk.69.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 628: blk.69.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 629: blk.69.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 630: blk.69.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 631: blk.70.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 632: blk.70.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 633: blk.70.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 634: blk.70.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 635: blk.70.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 636: blk.70.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 637: blk.70.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 638: blk.70.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 639: blk.70.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 640: blk.71.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 641: blk.71.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 642: blk.71.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 643: blk.71.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 644: blk.71.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 645: blk.71.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 646: blk.71.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 647: blk.71.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 648: blk.71.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 649: blk.72.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 650: blk.72.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 651: blk.72.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 652: blk.72.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 653: blk.72.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 654: blk.72.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 655: blk.72.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 656: blk.72.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 657: blk.72.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 658: blk.73.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 659: blk.73.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 660: blk.73.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 661: blk.73.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 662: blk.73.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 663: blk.73.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 664: blk.73.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 665: blk.73.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 666: blk.73.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 667: blk.74.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 668: blk.74.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 669: blk.74.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 670: blk.74.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 671: blk.74.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 672: blk.74.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 673: blk.74.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 674: blk.74.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 675: blk.74.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 676: blk.75.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 677: blk.75.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 678: blk.75.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 679: blk.75.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 680: blk.75.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 681: blk.75.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 682: blk.75.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 683: blk.75.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 684: blk.75.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 685: blk.76.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 686: blk.76.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 687: blk.76.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 688: blk.76.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 689: blk.76.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 690: blk.76.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 691: blk.76.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 692: blk.76.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 693: blk.76.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 694: blk.77.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 695: blk.77.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 696: blk.77.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 697: blk.77.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 698: blk.77.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 699: blk.77.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 700: blk.77.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 701: blk.77.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 702: blk.77.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 703: blk.78.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 704: blk.78.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 705: blk.78.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 706: blk.78.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 707: blk.78.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 708: blk.78.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 709: blk.78.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 710: blk.78.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 711: blk.78.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 712: blk.79.attn_q.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 713: blk.79.attn_k.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 714: blk.79.attn_v.weight f16 [ 8192, 1024, 1, 1 ]
llama_model_loader: - tensor 715: blk.79.attn_output.weight f16 [ 8192, 8192, 1, 1 ]
llama_model_loader: - tensor 716: blk.79.ffn_gate.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 717: blk.79.ffn_up.weight f16 [ 8192, 28672, 1, 1 ]
llama_model_loader: - tensor 718: blk.79.ffn_down.weight f16 [ 28672, 8192, 1, 1 ]
llama_model_loader: - tensor 719: blk.79.attn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 720: blk.79.ffn_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 721: output_norm.weight f32 [ 8192, 1, 1, 1 ]
llama_model_loader: - tensor 722: output.weight f16 [ 8192, 32000, 1, 1 ]
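The tensor listing ends at index 722, i.e. 723 tensors in total. That count follows from the 70B layout visible above: one token_embd.weight, nine weights per block (attn_q/k/v/output, ffn_gate/up/down, attn_norm, ffn_norm) across blk.0 .. blk.79, plus output_norm.weight and output.weight. A minimal sanity check in Python:

    n_blocks = 80                          # blk.0 .. blk.79 in the listing above
    per_block = 9                          # 7 matmul weights + 2 norm vectors
    total = 1 + n_blocks * per_block + 2   # token_embd + blocks + output_norm + output
    assert total == 723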
llama_model_loader: - kv 0: general.architecture str
llama_model_loader: - kv 1: general.name str
llama_model_loader: - kv 2: llama.context_length u32
llama_model_loader: - kv 3: llama.embedding_length u32
llama_model_loader: - kv 4: llama.block_count u32
llama_model_loader: - kv 5: llama.feed_forward_length u32
llama_model_loader: - kv 6: llama.rope.dimension_count u32
llama_model_loader: - kv 7: llama.attention.head_count u32
llama_model_loader: - kv 8: llama.attention.head_count_kv u32
llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32
llama_model_loader: - kv 10: llama.rope.freq_base f32
llama_model_loader: - kv 11: general.file_type u32
llama_model_loader: - kv 12: tokenizer.ggml.model str
llama_model_loader: - kv 13: tokenizer.ggml.tokens arr
llama_model_loader: - kv 14: tokenizer.ggml.scores arr
llama_model_loader: - kv 15: tokenizer.ggml.token_type arr
llama_model_loader: - kv 16: tokenizer.ggml.bos_token_id u32
llama_model_loader: - kv 17: tokenizer.ggml.eos_token_id u32
llama_model_loader: - kv 18: tokenizer.ggml.unknown_token_id u32
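The 19 kv fields above are the GGUF metadata for this file. A sketch of the values this particular fp16 GGUF should carry, reconstructed from the tensor shapes and the conversion parameters logged earlier; general.name is omitted because its exact string is not shown, and the tokenizer token ids are the conventional Llama SPM ids, stated here as an assumption rather than read from this log:

    kv = {
        "general.architecture": "llama",
        "llama.context_length": 4096,
        "llama.embedding_length": 8192,
        "llama.block_count": 80,
        "llama.feed_forward_length": 28672,
        "llama.rope.dimension_count": 128,            # embedding_length / head_count
        "llama.attention.head_count": 64,
        "llama.attention.head_count_kv": 8,           # grouped-query attention
        "llama.attention.layer_norm_rms_epsilon": 1e-05,
        "llama.rope.freq_base": 10000.0,
        "general.file_type": 1,                       # MostlyF16
        "tokenizer.ggml.model": "llama",              # SentencePiece vocab
        "tokenizer.ggml.bos_token_id": 1,             # assumption: standard Llama ids
        "tokenizer.ggml.eos_token_id": 2,
        "tokenizer.ggml.unknown_token_id": 0,
    }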
llama_model_loader: - type f32: 161 tensors
llama_model_loader: - type f16: 562 tensors
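The type split is consistent with the listing: the two f32 RMS-norm vectors per block plus output_norm give 161 f32 tensors, and the seven f16 matmul weights per block plus token_embd and output give 562:

    f32 = 80 * 2 + 1    # attn_norm + ffn_norm per block, plus output_norm
    f16 = 80 * 7 + 2    # q/k/v/output/gate/up/down per block, plus token_embd and output
    assert (f32, f16) == (161, 562) and f32 + f16 == 723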
llama_model_quantize_internal: meta size = 766912 bytes
[ 1/ 723] token_embd.weight - [ 8192, 32000, 1, 1], type = f16, quantizing to q4_0 .. size = 500.00 MB -> 140.62 MB | hist: 0.036 0.015 0.025 0.038 0.056 0.077 0.097 0.112 0.118 0.112 0.097 0.077 0.056 0.038 0.025 0.020
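Each quantization line reports the f16 -> q4_0 size change and a 16-bin histogram: the fraction of weights landing on each of the 16 possible 4-bit values in the quantized tensor. The sizes follow from the q4_0 layout, which packs 32 weights per block as one f16 scale plus 32 4-bit values, i.e. 18 bytes per 32 weights (4.5 bits per weight). A short sketch of that arithmetic for the line above:

    # q4_0: 32 weights/block, 2-byte f16 scale + 16 bytes of packed nibbles = 18 bytes
    def q4_0_mb(n):
        return n // 32 * 18 / (1024 * 1024)

    n = 8192 * 32000                 # token_embd.weight elements
    print(n * 2 / (1024 * 1024))     # f16 size: 500.00 MB
    print(q4_0_mb(n))                # q4_0 size: 140.625 MB, logged as 140.62 MB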
[ 2/ 723] blk.0.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.039 0.000 0.024 0.016 0.042 0.057 0.073 0.176 0.175 0.172 0.080 0.058 0.041 0.013 0.021 0.012
[ 3/ 723] blk.0.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.038 0.000 0.022 0.015 0.040 0.055 0.072 0.186 0.173 0.181 0.079 0.055 0.039 0.013 0.019 0.011
[ 4/ 723] blk.0.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.070 0.078 0.148 0.094 0.149 0.078 0.071 0.067 0.014 0.043 0.016
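The [ 8192, 1024 ] shapes for attn_k and attn_v, versus [ 8192, 8192 ] for attn_q, are the grouped-query-attention geometry implied by the kv fields above: 8 kv heads instead of 64 query heads, at the same 128-dim head size:

    head_dim = 8192 // 64     # embedding_length / head_count = 128
    kv_dim = 8 * head_dim     # head_count_kv * head_dim = 1024 -> attn_k / attn_v
    q_dim = 64 * head_dim     # 8192 -> attn_q / attn_output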
[ 5/ 723] blk.0.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.042 0.000 0.038 0.023 0.058 0.070 0.080 0.154 0.100 0.153 0.080 0.071 0.063 0.015 0.038 0.015
[ 6/ 723] blk.0.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 7/ 723] blk.0.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 8/ 723] blk.0.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.043 0.000 0.040 0.024 0.060 0.069 0.079 0.152 0.096 0.152 0.078 0.071 0.066 0.014 0.040 0.015
[ 9/ 723] blk.0.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 10/ 723] blk.0.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
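The per-block norm vectors stay in f32 with no "quantizing to" note: llama.cpp's quantizer skips 1-d tensors such as these norms, and at 8192 f32 elements each one is only 32 KiB anyway, matching the 0.031 MB shown:

    print(8192 * 4 / (1024 * 1024))   # 0.03125 MB, logged as 0.031 MB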
[ 11/ 723] blk.1.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.041 0.000 0.030 0.019 0.048 0.063 0.078 0.172 0.130 0.169 0.080 0.064 0.051 0.013 0.028 0.013
[ 12/ 723] blk.1.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.042 0.000 0.036 0.022 0.056 0.068 0.080 0.158 0.106 0.157 0.080 0.069 0.060 0.014 0.035 0.014
[ 13/ 723] blk.1.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.038 0.022 0.055 0.067 0.080 0.158 0.106 0.158 0.079 0.068 0.061 0.013 0.037 0.015
[ 14/ 723] blk.1.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.040 0.024 0.059 0.072 0.082 0.147 0.095 0.147 0.082 0.074 0.065 0.015 0.040 0.016
[ 15/ 723] blk.1.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.080 0.143 0.089 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 16/ 723] blk.1.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 17/ 723] blk.1.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 18/ 723] blk.1.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 19/ 723] blk.1.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 20/ 723] blk.2.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.039 0.023 0.059 0.070 0.079 0.153 0.099 0.153 0.080 0.071 0.064 0.015 0.038 0.015
[ 21/ 723] blk.2.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.040 0.024 0.061 0.069 0.078 0.151 0.098 0.151 0.079 0.071 0.065 0.015 0.040 0.015
[ 22/ 723] blk.2.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.060 0.070 0.079 0.150 0.095 0.150 0.079 0.071 0.066 0.014 0.041 0.015
[ 23/ 723] blk.2.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 24/ 723] blk.2.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 25/ 723] blk.2.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.080 0.143 0.089 0.143 0.079 0.073 0.069 0.015 0.044 0.016
[ 26/ 723] blk.2.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 27/ 723] blk.2.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 28/ 723] blk.2.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 29/ 723] blk.3.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.079 0.146 0.092 0.146 0.079 0.073 0.068 0.015 0.043 0.016
[ 30/ 723] blk.3.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.063 0.070 0.078 0.148 0.093 0.147 0.078 0.071 0.068 0.015 0.042 0.016
[ 31/ 723] blk.3.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.079 0.145 0.092 0.146 0.079 0.073 0.068 0.014 0.043 0.016
[ 32/ 723] blk.3.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.078 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 33/ 723] blk.3.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.089 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 34/ 723] blk.3.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 35/ 723] blk.3.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 36/ 723] blk.3.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 37/ 723] blk.3.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 38/ 723] blk.4.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.080 0.144 0.091 0.144 0.080 0.074 0.067 0.015 0.043 0.016
[ 39/ 723] blk.4.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.079 0.147 0.092 0.146 0.079 0.072 0.068 0.015 0.043 0.016
[ 40/ 723] blk.4.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.060 0.071 0.080 0.149 0.095 0.149 0.080 0.072 0.066 0.014 0.041 0.015
[ 41/ 723] blk.4.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.081 0.142 0.089 0.142 0.081 0.074 0.069 0.015 0.044 0.016
[ 42/ 723] blk.4.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.074 0.081 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.044 0.016
[ 43/ 723] blk.4.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 44/ 723] blk.4.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 45/ 723] blk.4.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 46/ 723] blk.4.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 47/ 723] blk.5.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.015 0.043 0.016
[ 48/ 723] blk.5.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.078 0.146 0.091 0.146 0.079 0.072 0.068 0.015 0.043 0.016
[ 49/ 723] blk.5.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.061 0.072 0.080 0.146 0.092 0.146 0.080 0.073 0.068 0.014 0.043 0.016
[ 50/ 723] blk.5.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.024 0.064 0.072 0.079 0.142 0.088 0.143 0.079 0.074 0.070 0.014 0.045 0.016
[ 51/ 723] blk.5.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.081 0.142 0.088 0.142 0.081 0.075 0.068 0.015 0.044 0.016
[ 52/ 723] blk.5.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.089 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 53/ 723] blk.5.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 54/ 723] blk.5.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 55/ 723] blk.5.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 56/ 723] blk.6.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.080 0.143 0.089 0.144 0.080 0.074 0.068 0.015 0.044 0.016
[ 57/ 723] blk.6.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.071 0.078 0.146 0.091 0.147 0.078 0.072 0.069 0.015 0.043 0.016
[ 58/ 723] blk.6.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.071 0.079 0.146 0.092 0.147 0.079 0.072 0.068 0.014 0.043 0.016
[ 59/ 723] blk.6.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 60/ 723] blk.6.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 61/ 723] blk.6.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 62/ 723] blk.6.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 63/ 723] blk.6.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 64/ 723] blk.6.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 65/ 723] blk.7.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 66/ 723] blk.7.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.145 0.090 0.145 0.079 0.072 0.069 0.015 0.044 0.016
[ 67/ 723] blk.7.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.090 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 68/ 723] blk.7.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 69/ 723] blk.7.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 70/ 723] blk.7.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 71/ 723] blk.7.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 72/ 723] blk.7.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 73/ 723] blk.7.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 74/ 723] blk.8.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 75/ 723] blk.8.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 76/ 723] blk.8.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.071 0.079 0.145 0.090 0.145 0.079 0.072 0.069 0.015 0.044 0.016
[ 77/ 723] blk.8.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 78/ 723] blk.8.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.069 0.015 0.045 0.016
[ 79/ 723] blk.8.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.088 0.144 0.079 0.073 0.069 0.015 0.045 0.016
[ 80/ 723] blk.8.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.089 0.144 0.078 0.073 0.069 0.015 0.044 0.016
[ 81/ 723] blk.8.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 82/ 723] blk.8.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 83/ 723] blk.9.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.081 0.142 0.089 0.143 0.081 0.074 0.068 0.015 0.043 0.016
[ 84/ 723] blk.9.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 85/ 723] blk.9.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.080 0.144 0.090 0.144 0.079 0.073 0.069 0.014 0.043 0.016
[ 86/ 723] blk.9.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 87/ 723] blk.9.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.088 0.143 0.079 0.074 0.069 0.015 0.044 0.016
[ 88/ 723] blk.9.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.079 0.073 0.069 0.015 0.045 0.016
[ 89/ 723] blk.9.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 90/ 723] blk.9.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 91/ 723] blk.9.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 92/ 723] blk.10.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.142 0.091 0.142 0.082 0.075 0.067 0.015 0.042 0.016
[ 93/ 723] blk.10.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.078 0.146 0.091 0.146 0.078 0.072 0.068 0.015 0.043 0.016
[ 94/ 723] blk.10.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.144 0.090 0.144 0.080 0.073 0.068 0.014 0.044 0.016
[ 95/ 723] blk.10.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.078 0.073 0.070 0.015 0.045 0.016
[ 96/ 723] blk.10.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 97/ 723] blk.10.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.088 0.144 0.078 0.073 0.069 0.015 0.045 0.016
[ 98/ 723] blk.10.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 99/ 723] blk.10.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 100/ 723] blk.10.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 101/ 723] blk.11.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.081 0.142 0.090 0.142 0.081 0.074 0.068 0.015 0.043 0.016
[ 102/ 723] blk.11.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 103/ 723] blk.11.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.090 0.143 0.080 0.074 0.068 0.014 0.044 0.016
[ 104/ 723] blk.11.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.079 0.142 0.088 0.143 0.079 0.074 0.070 0.015 0.045 0.016
[ 105/ 723] blk.11.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 106/ 723] blk.11.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.069 0.015 0.045 0.016
[ 107/ 723] blk.11.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 108/ 723] blk.11.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 109/ 723] blk.11.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 110/ 723] blk.12.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 111/ 723] blk.12.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 112/ 723] blk.12.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.014 0.044 0.016
[ 113/ 723] blk.12.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.078 0.073 0.070 0.015 0.045 0.016
[ 114/ 723] blk.12.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.044 0.016
[ 115/ 723] blk.12.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.088 0.144 0.078 0.073 0.069 0.015 0.045 0.016
[ 116/ 723] blk.12.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.044 0.016
[ 117/ 723] blk.12.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 118/ 723] blk.12.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 119/ 723] blk.13.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.081 0.142 0.090 0.142 0.081 0.074 0.068 0.015 0.043 0.016
[ 120/ 723] blk.13.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.144 0.091 0.144 0.079 0.073 0.068 0.015 0.043 0.016
[ 121/ 723] blk.13.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.090 0.145 0.079 0.073 0.069 0.014 0.044 0.016
[ 122/ 723] blk.13.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 123/ 723] blk.13.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.089 0.140 0.082 0.076 0.067 0.016 0.043 0.017
[ 124/ 723] blk.13.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 125/ 723] blk.13.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.070 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 126/ 723] blk.13.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 127/ 723] blk.13.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 128/ 723] blk.14.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.089 0.141 0.081 0.075 0.067 0.015 0.043 0.016
[ 129/ 723] blk.14.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 130/ 723] blk.14.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.062 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.014 0.044 0.016
[ 131/ 723] blk.14.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 132/ 723] blk.14.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.043 0.016
[ 133/ 723] blk.14.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 134/ 723] blk.14.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 135/ 723] blk.14.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 136/ 723] blk.14.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 137/ 723] blk.15.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.073 0.081 0.143 0.090 0.143 0.081 0.074 0.068 0.015 0.043 0.016
[ 138/ 723] blk.15.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.145 0.091 0.145 0.079 0.072 0.068 0.015 0.043 0.016
[ 139/ 723] blk.15.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.071 0.079 0.145 0.090 0.145 0.079 0.072 0.069 0.015 0.044 0.016
[ 140/ 723] blk.15.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 141/ 723] blk.15.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.074 0.081 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.044 0.016
[ 142/ 723] blk.15.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 143/ 723] blk.15.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 144/ 723] blk.15.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 145/ 723] blk.15.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 146/ 723] blk.16.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.081 0.142 0.090 0.142 0.081 0.075 0.067 0.015 0.043 0.016
[ 147/ 723] blk.16.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.072 0.079 0.145 0.091 0.145 0.079 0.073 0.068 0.015 0.043 0.016
[ 148/ 723] blk.16.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.143 0.090 0.144 0.080 0.074 0.069 0.014 0.044 0.016
[ 149/ 723] blk.16.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 150/ 723] blk.16.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.074 0.081 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.044 0.016
[ 151/ 723] blk.16.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.088 0.144 0.079 0.073 0.069 0.015 0.045 0.016
[ 152/ 723] blk.16.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 153/ 723] blk.16.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 154/ 723] blk.16.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 155/ 723] blk.17.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.073 0.080 0.143 0.089 0.143 0.080 0.074 0.068 0.015 0.043 0.016
[ 156/ 723] blk.17.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.079 0.145 0.091 0.145 0.079 0.073 0.069 0.015 0.043 0.016
[ 157/ 723] blk.17.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 158/ 723] blk.17.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 159/ 723] blk.17.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 160/ 723] blk.17.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 161/ 723] blk.17.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.146 0.077 0.072 0.070 0.015 0.045 0.016
[ 162/ 723] blk.17.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 163/ 723] blk.17.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 164/ 723] blk.18.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.090 0.142 0.082 0.075 0.067 0.015 0.043 0.016
[ 165/ 723] blk.18.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.071 0.078 0.146 0.091 0.146 0.078 0.072 0.069 0.015 0.043 0.016
[ 166/ 723] blk.18.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 167/ 723] blk.18.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 168/ 723] blk.18.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.066 0.016 0.042 0.017
[ 169/ 723] blk.18.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.069 0.015 0.044 0.016
[ 170/ 723] blk.18.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.089 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 171/ 723] blk.18.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 172/ 723] blk.18.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 173/ 723] blk.19.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.081 0.141 0.090 0.142 0.081 0.075 0.067 0.015 0.043 0.016
[ 174/ 723] blk.19.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.071 0.078 0.145 0.091 0.145 0.078 0.072 0.068 0.015 0.043 0.016
[ 175/ 723] blk.19.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.070 0.078 0.146 0.090 0.146 0.077 0.071 0.070 0.015 0.044 0.016
[ 176/ 723] blk.19.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 177/ 723] blk.19.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.088 0.140 0.082 0.076 0.067 0.016 0.043 0.017
[ 178/ 723] blk.19.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.074 0.069 0.015 0.044 0.016
[ 179/ 723] blk.19.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 180/ 723] blk.19.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 181/ 723] blk.19.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 182/ 723] blk.20.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.073 0.081 0.142 0.090 0.142 0.081 0.075 0.067 0.015 0.043 0.016
[ 183/ 723] blk.20.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.071 0.078 0.145 0.091 0.145 0.079 0.072 0.068 0.015 0.043 0.016
[ 184/ 723] blk.20.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 185/ 723] blk.20.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 186/ 723] blk.20.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.140 0.082 0.076 0.067 0.016 0.043 0.017
[ 187/ 723] blk.20.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 188/ 723] blk.20.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 189/ 723] blk.20.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 190/ 723] blk.20.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 191/ 723] blk.21.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.140 0.092 0.141 0.083 0.075 0.066 0.016 0.042 0.016
[ 192/ 723] blk.21.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.025 0.062 0.072 0.080 0.144 0.093 0.144 0.080 0.073 0.067 0.015 0.042 0.016
[ 193/ 723] blk.21.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.014 0.044 0.016
[ 194/ 723] blk.21.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.069 0.015 0.045 0.016
[ 195/ 723] blk.21.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.140 0.083 0.076 0.067 0.016 0.043 0.017
[ 196/ 723] blk.21.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 197/ 723] blk.21.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 198/ 723] blk.21.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 199/ 723] blk.21.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 200/ 723] blk.22.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.082 0.141 0.091 0.141 0.082 0.075 0.067 0.016 0.042 0.016
[ 201/ 723] blk.22.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.079 0.146 0.093 0.145 0.079 0.072 0.068 0.015 0.043 0.016
[ 202/ 723] blk.22.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.090 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 203/ 723] blk.22.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 204/ 723] blk.22.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 205/ 723] blk.22.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 206/ 723] blk.22.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 207/ 723] blk.22.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 208/ 723] blk.22.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 209/ 723] blk.23.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.141 0.090 0.141 0.083 0.075 0.067 0.016 0.042 0.016
[ 210/ 723] blk.23.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.071 0.079 0.145 0.092 0.145 0.079 0.073 0.068 0.015 0.043 0.016
[ 211/ 723] blk.23.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 212/ 723] blk.23.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 213/ 723] blk.23.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 214/ 723] blk.23.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 215/ 723] blk.23.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 216/ 723] blk.23.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 217/ 723] blk.23.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 218/ 723] blk.24.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.062 0.074 0.082 0.142 0.091 0.142 0.082 0.075 0.067 0.015 0.042 0.016
[ 219/ 723] blk.24.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.145 0.091 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 220/ 723] blk.24.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.073 0.081 0.142 0.089 0.142 0.081 0.074 0.068 0.015 0.044 0.016
[ 221/ 723] blk.24.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 222/ 723] blk.24.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 223/ 723] blk.24.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.068 0.015 0.044 0.016
[ 224/ 723] blk.24.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 225/ 723] blk.24.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 226/ 723] blk.24.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 227/ 723] blk.25.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.140 0.091 0.140 0.083 0.076 0.066 0.016 0.042 0.016
[ 228/ 723] blk.25.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.072 0.079 0.145 0.091 0.145 0.079 0.073 0.068 0.015 0.043 0.016
[ 229/ 723] blk.25.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.144 0.090 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 230/ 723] blk.25.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.024 0.064 0.073 0.080 0.142 0.088 0.143 0.080 0.074 0.070 0.014 0.045 0.016
[ 231/ 723] blk.25.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 232/ 723] blk.25.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.074 0.082 0.141 0.088 0.141 0.082 0.075 0.068 0.015 0.044 0.016
[ 233/ 723] blk.25.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 234/ 723] blk.25.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 235/ 723] blk.25.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 236/ 723] blk.26.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.089 0.141 0.082 0.075 0.067 0.015 0.043 0.016
[ 237/ 723] blk.26.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.073 0.080 0.144 0.091 0.144 0.080 0.074 0.068 0.015 0.043 0.016
[ 238/ 723] blk.26.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.143 0.089 0.143 0.080 0.074 0.069 0.015 0.044 0.016
[ 239/ 723] blk.26.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.142 0.088 0.143 0.079 0.074 0.069 0.015 0.045 0.016
[ 240/ 723] blk.26.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 241/ 723] blk.26.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.074 0.081 0.141 0.088 0.141 0.081 0.075 0.068 0.015 0.044 0.016
[ 242/ 723] blk.26.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.073 0.070 0.015 0.045 0.016
[ 243/ 723] blk.26.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 244/ 723] blk.26.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 245/ 723] blk.27.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.141 0.092 0.141 0.083 0.075 0.066 0.016 0.042 0.016
[ 246/ 723] blk.27.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.025 0.062 0.072 0.080 0.144 0.092 0.143 0.081 0.073 0.067 0.015 0.042 0.016
[ 247/ 723] blk.27.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.079 0.145 0.090 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 248/ 723] blk.27.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 249/ 723] blk.27.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 250/ 723] blk.27.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 251/ 723] blk.27.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 252/ 723] blk.27.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 253/ 723] blk.27.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 254/ 723] blk.28.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.084 0.139 0.091 0.139 0.083 0.076 0.066 0.016 0.042 0.016
[ 255/ 723] blk.28.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.072 0.079 0.144 0.091 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 256/ 723] blk.28.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.014 0.044 0.016
[ 257/ 723] blk.28.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 258/ 723] blk.28.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 259/ 723] blk.28.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 260/ 723] blk.28.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 261/ 723] blk.28.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 262/ 723] blk.28.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 263/ 723] blk.29.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.091 0.140 0.083 0.076 0.066 0.016 0.042 0.016
[ 264/ 723] blk.29.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.091 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 265/ 723] blk.29.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 266/ 723] blk.29.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 267/ 723] blk.29.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 268/ 723] blk.29.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 269/ 723] blk.29.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 270/ 723] blk.29.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 271/ 723] blk.29.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 272/ 723] blk.30.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.092 0.140 0.084 0.076 0.065 0.016 0.041 0.016
[ 273/ 723] blk.30.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.062 0.072 0.079 0.145 0.094 0.145 0.080 0.073 0.066 0.015 0.042 0.016
[ 274/ 723] blk.30.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 275/ 723] blk.30.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 276/ 723] blk.30.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 277/ 723] blk.30.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 278/ 723] blk.30.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 279/ 723] blk.30.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 280/ 723] blk.30.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 281/ 723] blk.31.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.090 0.140 0.083 0.076 0.066 0.016 0.042 0.016
[ 282/ 723] blk.31.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.080 0.144 0.091 0.144 0.080 0.074 0.067 0.015 0.043 0.016
[ 283/ 723] blk.31.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.062 0.073 0.081 0.142 0.089 0.142 0.081 0.074 0.068 0.015 0.044 0.016
[ 284/ 723] blk.31.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.078 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 285/ 723] blk.31.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 286/ 723] blk.31.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.140 0.088 0.141 0.082 0.075 0.068 0.015 0.044 0.017
[ 287/ 723] blk.31.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 288/ 723] blk.31.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 289/ 723] blk.31.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 290/ 723] blk.32.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.091 0.141 0.082 0.075 0.067 0.016 0.042 0.016
[ 291/ 723] blk.32.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.144 0.091 0.144 0.079 0.073 0.068 0.015 0.043 0.016
[ 292/ 723] blk.32.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.079 0.144 0.089 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 293/ 723] blk.32.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 294/ 723] blk.32.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 295/ 723] blk.32.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 296/ 723] blk.32.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 297/ 723] blk.32.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 298/ 723] blk.32.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 299/ 723] blk.33.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.090 0.141 0.082 0.075 0.067 0.015 0.043 0.016
[ 300/ 723] blk.33.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.144 0.090 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 301/ 723] blk.33.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.143 0.089 0.143 0.080 0.073 0.069 0.015 0.044 0.016
[ 302/ 723] blk.33.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 303/ 723] blk.33.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 304/ 723] blk.33.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.074 0.082 0.140 0.088 0.141 0.082 0.075 0.068 0.015 0.044 0.017
[ 305/ 723] blk.33.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 306/ 723] blk.33.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 307/ 723] blk.33.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 308/ 723] blk.34.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.091 0.143 0.081 0.074 0.067 0.015 0.043 0.016
[ 309/ 723] blk.34.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.071 0.079 0.146 0.092 0.145 0.079 0.072 0.068 0.015 0.043 0.016
[ 310/ 723] blk.34.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.090 0.144 0.079 0.073 0.068 0.015 0.044 0.016
[ 311/ 723] blk.34.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 312/ 723] blk.34.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.084 0.077 0.066 0.017 0.042 0.017
[ 313/ 723] blk.34.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 314/ 723] blk.34.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 315/ 723] blk.34.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 316/ 723] blk.34.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 317/ 723] blk.35.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.062 0.074 0.082 0.141 0.091 0.141 0.082 0.075 0.067 0.015 0.042 0.016
[ 318/ 723] blk.35.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.063 0.072 0.079 0.145 0.091 0.144 0.079 0.073 0.068 0.015 0.043 0.016
[ 319/ 723] blk.35.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.080 0.143 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 320/ 723] blk.35.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 321/ 723] blk.35.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 322/ 723] blk.35.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 323/ 723] blk.35.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 324/ 723] blk.35.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 325/ 723] blk.35.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 326/ 723] blk.36.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.090 0.142 0.082 0.075 0.067 0.015 0.043 0.016
[ 327/ 723] blk.36.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.092 0.144 0.080 0.073 0.067 0.015 0.043 0.016
[ 328/ 723] blk.36.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 329/ 723] blk.36.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.144 0.079 0.073 0.070 0.015 0.045 0.016
[ 330/ 723] blk.36.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.086 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.041 0.017
[ 331/ 723] blk.36.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.088 0.140 0.082 0.076 0.067 0.015 0.043 0.017
[ 332/ 723] blk.36.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 333/ 723] blk.36.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 334/ 723] blk.36.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 335/ 723] blk.37.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.083 0.140 0.090 0.141 0.082 0.075 0.067 0.016 0.042 0.016
[ 336/ 723] blk.37.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.072 0.080 0.143 0.091 0.143 0.080 0.074 0.068 0.015 0.043 0.016
[ 337/ 723] blk.37.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.144 0.089 0.144 0.078 0.073 0.069 0.015 0.044 0.016
[ 338/ 723] blk.37.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.071 0.015 0.045 0.016
[ 339/ 723] blk.37.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 340/ 723] blk.37.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 341/ 723] blk.37.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 342/ 723] blk.37.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 343/ 723] blk.37.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 344/ 723] blk.38.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.090 0.140 0.083 0.076 0.066 0.016 0.042 0.017
[ 345/ 723] blk.38.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.090 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 346/ 723] blk.38.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.074 0.069 0.015 0.044 0.016
[ 347/ 723] blk.38.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 348/ 723] blk.38.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 349/ 723] blk.38.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 350/ 723] blk.38.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.044 0.016
[ 351/ 723] blk.38.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 352/ 723] blk.38.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 353/ 723] blk.39.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.089 0.140 0.083 0.076 0.067 0.016 0.043 0.016
[ 354/ 723] blk.39.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 355/ 723] blk.39.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 356/ 723] blk.39.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 357/ 723] blk.39.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 358/ 723] blk.39.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 359/ 723] blk.39.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.044 0.016
[ 360/ 723] blk.39.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 361/ 723] blk.39.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 362/ 723] blk.40.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.140 0.090 0.140 0.083 0.076 0.067 0.016 0.043 0.016
[ 363/ 723] blk.40.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.073 0.080 0.143 0.090 0.143 0.080 0.074 0.068 0.015 0.043 0.016
[ 364/ 723] blk.40.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 365/ 723] blk.40.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 366/ 723] blk.40.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.060 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 367/ 723] blk.40.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 368/ 723] blk.40.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 369/ 723] blk.40.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 370/ 723] blk.40.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 371/ 723] blk.41.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.139 0.090 0.139 0.083 0.076 0.066 0.016 0.042 0.016
[ 372/ 723] blk.41.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.091 0.143 0.081 0.074 0.067 0.015 0.043 0.016
[ 373/ 723] blk.41.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 374/ 723] blk.41.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 375/ 723] blk.41.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 376/ 723] blk.41.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 377/ 723] blk.41.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 378/ 723] blk.41.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 379/ 723] blk.41.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 380/ 723] blk.42.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.074 0.083 0.140 0.090 0.140 0.083 0.076 0.067 0.016 0.042 0.016
[ 381/ 723] blk.42.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.090 0.143 0.081 0.074 0.068 0.015 0.043 0.016
[ 382/ 723] blk.42.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.145 0.090 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 383/ 723] blk.42.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 384/ 723] blk.42.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 385/ 723] blk.42.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.139 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 386/ 723] blk.42.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 387/ 723] blk.42.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 388/ 723] blk.42.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 389/ 723] blk.43.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.084 0.139 0.091 0.139 0.084 0.076 0.066 0.016 0.042 0.016
[ 390/ 723] blk.43.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.080 0.143 0.091 0.143 0.081 0.074 0.068 0.015 0.043 0.016
[ 391/ 723] blk.43.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 392/ 723] blk.43.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.078 0.143 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 393/ 723] blk.43.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 394/ 723] blk.43.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 395/ 723] blk.43.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.073 0.070 0.015 0.044 0.016
[ 396/ 723] blk.43.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 397/ 723] blk.43.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 398/ 723] blk.44.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.091 0.138 0.085 0.077 0.065 0.017 0.041 0.017
[ 399/ 723] blk.44.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.090 0.142 0.081 0.074 0.067 0.015 0.043 0.016
[ 400/ 723] blk.44.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.078 0.145 0.090 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 401/ 723] blk.44.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 402/ 723] blk.44.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 403/ 723] blk.44.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.140 0.083 0.076 0.067 0.015 0.043 0.017
[ 404/ 723] blk.44.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 405/ 723] blk.44.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 406/ 723] blk.44.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 407/ 723] blk.45.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.083 0.140 0.090 0.140 0.083 0.076 0.067 0.015 0.043 0.016
[ 408/ 723] blk.45.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.091 0.144 0.080 0.074 0.067 0.015 0.043 0.016
[ 409/ 723] blk.45.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.078 0.144 0.090 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 410/ 723] blk.45.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 411/ 723] blk.45.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 412/ 723] blk.45.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.139 0.083 0.076 0.067 0.015 0.043 0.017
[ 413/ 723] blk.45.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 414/ 723] blk.45.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 415/ 723] blk.45.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 416/ 723] blk.46.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.084 0.139 0.090 0.139 0.084 0.076 0.066 0.016 0.042 0.017
[ 417/ 723] blk.46.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.092 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 418/ 723] blk.46.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.072 0.079 0.144 0.090 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 419/ 723] blk.46.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 420/ 723] blk.46.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 421/ 723] blk.46.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.140 0.083 0.076 0.067 0.015 0.043 0.017
[ 422/ 723] blk.46.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 423/ 723] blk.46.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 424/ 723] blk.46.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 425/ 723] blk.47.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.084 0.138 0.091 0.139 0.084 0.077 0.066 0.016 0.042 0.017
[ 426/ 723] blk.47.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.025 0.062 0.073 0.082 0.142 0.091 0.142 0.082 0.075 0.067 0.015 0.042 0.016
[ 427/ 723] blk.47.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.073 0.080 0.142 0.090 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 428/ 723] blk.47.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 429/ 723] blk.47.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.086 0.078 0.065 0.017 0.041 0.017
[ 430/ 723] blk.47.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 431/ 723] blk.47.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 432/ 723] blk.47.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 433/ 723] blk.47.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 434/ 723] blk.48.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.085 0.138 0.092 0.138 0.085 0.077 0.065 0.017 0.040 0.017
[ 435/ 723] blk.48.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.074 0.083 0.142 0.094 0.141 0.083 0.075 0.065 0.016 0.041 0.016
[ 436/ 723] blk.48.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.073 0.081 0.142 0.089 0.142 0.081 0.074 0.068 0.015 0.044 0.016
[ 437/ 723] blk.48.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 438/ 723] blk.48.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 439/ 723] blk.48.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.084 0.077 0.066 0.016 0.043 0.017
[ 440/ 723] blk.48.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 441/ 723] blk.48.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 442/ 723] blk.48.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 443/ 723] blk.49.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.074 0.082 0.142 0.092 0.142 0.082 0.075 0.067 0.015 0.042 0.016
[ 444/ 723] blk.49.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.074 0.083 0.141 0.096 0.141 0.084 0.075 0.064 0.016 0.040 0.016
[ 445/ 723] blk.49.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.062 0.073 0.081 0.142 0.089 0.142 0.080 0.074 0.068 0.015 0.044 0.016
[ 446/ 723] blk.49.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.143 0.079 0.074 0.069 0.015 0.045 0.016
[ 447/ 723] blk.49.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 448/ 723] blk.49.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 449/ 723] blk.49.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 450/ 723] blk.49.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 451/ 723] blk.49.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 452/ 723] blk.50.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.075 0.085 0.138 0.092 0.139 0.085 0.077 0.065 0.017 0.041 0.017
[ 453/ 723] blk.50.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.072 0.080 0.144 0.096 0.144 0.081 0.073 0.066 0.015 0.041 0.016
[ 454/ 723] blk.50.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.144 0.089 0.145 0.078 0.073 0.069 0.015 0.045 0.016
[ 455/ 723] blk.50.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 456/ 723] blk.50.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 457/ 723] blk.50.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.066 0.016 0.043 0.017
[ 458/ 723] blk.50.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 459/ 723] blk.50.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 460/ 723] blk.50.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 461/ 723] blk.51.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.060 0.076 0.086 0.136 0.094 0.137 0.086 0.077 0.064 0.018 0.040 0.017
[ 462/ 723] blk.51.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.092 0.143 0.081 0.074 0.067 0.015 0.042 0.016
[ 463/ 723] blk.51.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.063 0.073 0.080 0.142 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 464/ 723] blk.51.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 465/ 723] blk.51.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 466/ 723] blk.51.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.084 0.139 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 467/ 723] blk.51.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 468/ 723] blk.51.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 469/ 723] blk.51.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 470/ 723] blk.52.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.001 0.040 0.026 0.060 0.076 0.086 0.136 0.093 0.136 0.086 0.077 0.064 0.018 0.040 0.017
[ 471/ 723] blk.52.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.074 0.082 0.142 0.092 0.142 0.082 0.075 0.066 0.015 0.042 0.016
[ 472/ 723] blk.52.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 473/ 723] blk.52.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 474/ 723] blk.52.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.061 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 475/ 723] blk.52.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 476/ 723] blk.52.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 477/ 723] blk.52.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 478/ 723] blk.52.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 479/ 723] blk.53.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.075 0.084 0.140 0.092 0.140 0.084 0.076 0.065 0.016 0.041 0.016
[ 480/ 723] blk.53.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.061 0.072 0.080 0.146 0.096 0.145 0.081 0.073 0.065 0.015 0.041 0.016
[ 481/ 723] blk.53.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.045 0.016
[ 482/ 723] blk.53.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.069 0.015 0.045 0.016
[ 483/ 723] blk.53.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.018 0.041 0.017
[ 484/ 723] blk.53.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 485/ 723] blk.53.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 486/ 723] blk.53.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 487/ 723] blk.53.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 488/ 723] blk.54.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.075 0.085 0.138 0.093 0.138 0.085 0.077 0.064 0.017 0.040 0.016
[ 489/ 723] blk.54.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.061 0.072 0.079 0.146 0.097 0.145 0.081 0.073 0.066 0.015 0.041 0.016
[ 490/ 723] blk.54.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 491/ 723] blk.54.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 492/ 723] blk.54.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.077 0.065 0.017 0.042 0.017
[ 493/ 723] blk.54.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 494/ 723] blk.54.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 495/ 723] blk.54.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 496/ 723] blk.54.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 497/ 723] blk.55.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.086 0.137 0.092 0.137 0.086 0.077 0.064 0.018 0.040 0.017
[ 498/ 723] blk.55.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.073 0.081 0.144 0.094 0.143 0.081 0.074 0.066 0.015 0.042 0.016
[ 499/ 723] blk.55.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.090 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 500/ 723] blk.55.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 501/ 723] blk.55.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 502/ 723] blk.55.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 503/ 723] blk.55.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 504/ 723] blk.55.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 505/ 723] blk.55.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 506/ 723] blk.56.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.060 0.076 0.086 0.137 0.094 0.137 0.086 0.077 0.064 0.018 0.039 0.016
[ 507/ 723] blk.56.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.041 0.024 0.060 0.073 0.081 0.144 0.095 0.143 0.082 0.074 0.065 0.016 0.041 0.016
[ 508/ 723] blk.56.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 509/ 723] blk.56.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 510/ 723] blk.56.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.066 0.017 0.042 0.017
[ 511/ 723] blk.56.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 512/ 723] blk.56.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 513/ 723] blk.56.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 514/ 723] blk.56.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 515/ 723] blk.57.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.085 0.138 0.092 0.138 0.085 0.077 0.065 0.017 0.040 0.016
[ 516/ 723] blk.57.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.061 0.072 0.080 0.146 0.096 0.145 0.081 0.073 0.065 0.015 0.041 0.016
[ 517/ 723] blk.57.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.143 0.089 0.143 0.079 0.074 0.068 0.015 0.044 0.016
[ 518/ 723] blk.57.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 519/ 723] blk.57.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.084 0.077 0.066 0.017 0.042 0.017
[ 520/ 723] blk.57.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 521/ 723] blk.57.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 522/ 723] blk.57.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 523/ 723] blk.57.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 524/ 723] blk.58.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.074 0.082 0.143 0.092 0.142 0.082 0.075 0.066 0.015 0.042 0.016
[ 525/ 723] blk.58.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.040 0.024 0.060 0.071 0.079 0.147 0.101 0.146 0.081 0.072 0.064 0.015 0.040 0.016
[ 526/ 723] blk.58.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 527/ 723] blk.58.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 528/ 723] blk.58.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 529/ 723] blk.58.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 530/ 723] blk.58.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 531/ 723] blk.58.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 532/ 723] blk.58.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 533/ 723] blk.59.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.059 0.075 0.086 0.138 0.094 0.138 0.086 0.077 0.064 0.017 0.040 0.016
[ 534/ 723] blk.59.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.040 0.024 0.060 0.072 0.081 0.146 0.099 0.145 0.082 0.073 0.064 0.015 0.040 0.016
[ 535/ 723] blk.59.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.080 0.143 0.089 0.143 0.079 0.074 0.069 0.015 0.044 0.016
[ 536/ 723] blk.59.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 537/ 723] blk.59.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 538/ 723] blk.59.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 539/ 723] blk.59.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 540/ 723] blk.59.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 541/ 723] blk.59.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 542/ 723] blk.60.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.040 0.024 0.059 0.073 0.083 0.143 0.097 0.143 0.084 0.075 0.064 0.016 0.039 0.016
[ 543/ 723] blk.60.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.042 0.000 0.036 0.022 0.057 0.068 0.078 0.152 0.118 0.151 0.081 0.070 0.059 0.015 0.035 0.015
[ 544/ 723] blk.60.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.044 0.016
[ 545/ 723] blk.60.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 546/ 723] blk.60.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.089 0.137 0.085 0.078 0.065 0.017 0.042 0.017
[ 547/ 723] blk.60.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.084 0.139 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 548/ 723] blk.60.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 549/ 723] blk.60.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 550/ 723] blk.60.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 551/ 723] blk.61.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.025 0.059 0.075 0.085 0.139 0.094 0.139 0.085 0.076 0.064 0.017 0.040 0.016
[ 552/ 723] blk.61.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.039 0.024 0.059 0.071 0.080 0.148 0.104 0.147 0.082 0.072 0.063 0.015 0.038 0.016
[ 553/ 723] blk.61.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.073 0.081 0.142 0.089 0.142 0.081 0.074 0.068 0.015 0.044 0.016
[ 554/ 723] blk.61.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.081 0.141 0.088 0.142 0.081 0.074 0.069 0.015 0.044 0.016
[ 555/ 723] blk.61.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.041 0.017
[ 556/ 723] blk.61.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.088 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 557/ 723] blk.61.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 558/ 723] blk.61.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 559/ 723] blk.61.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 560/ 723] blk.62.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.040 0.026 0.059 0.076 0.087 0.138 0.095 0.138 0.086 0.077 0.063 0.018 0.039 0.016
[ 561/ 723] blk.62.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.039 0.024 0.059 0.072 0.081 0.147 0.104 0.146 0.083 0.072 0.062 0.016 0.038 0.016
[ 562/ 723] blk.62.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.024 0.062 0.073 0.080 0.142 0.089 0.142 0.080 0.074 0.068 0.015 0.044 0.016
[ 563/ 723] blk.62.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.045 0.016
[ 564/ 723] blk.62.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.077 0.065 0.017 0.042 0.017
[ 565/ 723] blk.62.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 566/ 723] blk.62.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 567/ 723] blk.62.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 568/ 723] blk.62.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 569/ 723] blk.63.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.040 0.025 0.059 0.075 0.085 0.140 0.095 0.140 0.085 0.076 0.064 0.017 0.040 0.016
[ 570/ 723] blk.63.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.038 0.023 0.058 0.070 0.080 0.149 0.107 0.148 0.082 0.072 0.062 0.016 0.037 0.016
[ 571/ 723] blk.63.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.062 0.073 0.080 0.142 0.089 0.143 0.080 0.074 0.068 0.015 0.044 0.016
[ 572/ 723] blk.63.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 573/ 723] blk.63.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.077 0.086 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.041 0.017
[ 574/ 723] blk.63.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 575/ 723] blk.63.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 576/ 723] blk.63.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 577/ 723] blk.63.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 578/ 723] blk.64.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.082 0.142 0.092 0.142 0.083 0.075 0.066 0.015 0.042 0.016
[ 579/ 723] blk.64.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.041 0.024 0.061 0.071 0.079 0.148 0.099 0.147 0.080 0.072 0.065 0.015 0.040 0.016
[ 580/ 723] blk.64.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.144 0.079 0.073 0.069 0.015 0.045 0.016
[ 581/ 723] blk.64.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 582/ 723] blk.64.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.090 0.137 0.085 0.077 0.065 0.017 0.042 0.017
[ 583/ 723] blk.64.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 584/ 723] blk.64.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 585/ 723] blk.64.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 586/ 723] blk.64.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 587/ 723] blk.65.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.041 0.024 0.060 0.073 0.081 0.145 0.095 0.144 0.082 0.074 0.065 0.015 0.041 0.016
[ 588/ 723] blk.65.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.037 0.022 0.057 0.069 0.078 0.153 0.114 0.151 0.081 0.070 0.060 0.015 0.036 0.015
[ 589/ 723] blk.65.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.081 0.142 0.089 0.142 0.080 0.074 0.068 0.015 0.044 0.016
[ 590/ 723] blk.65.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 591/ 723] blk.65.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.065 0.017 0.042 0.017
[ 592/ 723] blk.65.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.067 0.016 0.043 0.017
[ 593/ 723] blk.65.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 594/ 723] blk.65.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 595/ 723] blk.65.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 596/ 723] blk.66.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.075 0.084 0.141 0.094 0.140 0.084 0.076 0.064 0.016 0.040 0.016
[ 597/ 723] blk.66.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.039 0.023 0.059 0.071 0.079 0.149 0.105 0.148 0.081 0.071 0.063 0.015 0.038 0.016
[ 598/ 723] blk.66.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 599/ 723] blk.66.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 600/ 723] blk.66.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.136 0.090 0.137 0.085 0.078 0.065 0.017 0.041 0.017
[ 601/ 723] blk.66.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.075 0.083 0.139 0.089 0.139 0.083 0.076 0.067 0.016 0.043 0.017
[ 602/ 723] blk.66.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.065 0.071 0.078 0.145 0.088 0.145 0.077 0.072 0.070 0.015 0.045 0.016
[ 603/ 723] blk.66.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 604/ 723] blk.66.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 605/ 723] blk.67.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.000 0.039 0.024 0.059 0.073 0.083 0.144 0.100 0.144 0.084 0.074 0.063 0.016 0.038 0.016
[ 606/ 723] blk.67.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.043 0.000 0.038 0.024 0.059 0.070 0.079 0.148 0.110 0.146 0.081 0.071 0.062 0.016 0.037 0.016
[ 607/ 723] blk.67.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 608/ 723] blk.67.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.142 0.088 0.143 0.079 0.074 0.069 0.015 0.044 0.016
[ 609/ 723] blk.67.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.085 0.137 0.089 0.137 0.085 0.077 0.065 0.017 0.042 0.017
[ 610/ 723] blk.67.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.139 0.089 0.139 0.083 0.077 0.066 0.016 0.043 0.017
[ 611/ 723] blk.67.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 612/ 723] blk.67.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 613/ 723] blk.67.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 614/ 723] blk.68.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.091 0.140 0.083 0.076 0.066 0.016 0.041 0.016
[ 615/ 723] blk.68.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.072 0.080 0.146 0.095 0.145 0.080 0.073 0.066 0.015 0.041 0.016
[ 616/ 723] blk.68.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.077 0.072 0.069 0.015 0.045 0.016
[ 617/ 723] blk.68.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 618/ 723] blk.68.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 619/ 723] blk.68.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.066 0.016 0.043 0.017
[ 620/ 723] blk.68.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 621/ 723] blk.68.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 622/ 723] blk.68.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 623/ 723] blk.69.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.084 0.140 0.092 0.140 0.084 0.076 0.066 0.016 0.041 0.016
[ 624/ 723] blk.69.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.062 0.071 0.079 0.146 0.093 0.146 0.079 0.072 0.068 0.015 0.042 0.016
[ 625/ 723] blk.69.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.070 0.077 0.145 0.089 0.146 0.077 0.072 0.069 0.016 0.045 0.016
[ 626/ 723] blk.69.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.072 0.070 0.015 0.045 0.016
[ 627/ 723] blk.69.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.061 0.076 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 628/ 723] blk.69.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.139 0.089 0.139 0.083 0.077 0.066 0.016 0.043 0.017
[ 629/ 723] blk.69.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 630/ 723] blk.69.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 631/ 723] blk.69.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 632/ 723] blk.70.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.025 0.060 0.075 0.084 0.140 0.093 0.140 0.084 0.076 0.065 0.016 0.041 0.016
[ 633/ 723] blk.70.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.024 0.061 0.073 0.081 0.144 0.094 0.144 0.081 0.074 0.066 0.015 0.041 0.016
[ 634/ 723] blk.70.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.070 0.015 0.044 0.016
[ 635/ 723] blk.70.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 636/ 723] blk.70.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 637/ 723] blk.70.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 638/ 723] blk.70.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.088 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 639/ 723] blk.70.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 640/ 723] blk.70.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 641/ 723] blk.71.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.060 0.076 0.086 0.136 0.093 0.137 0.086 0.077 0.064 0.018 0.040 0.017
[ 642/ 723] blk.71.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.081 0.143 0.092 0.143 0.081 0.074 0.067 0.015 0.042 0.016
[ 643/ 723] blk.71.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.078 0.144 0.089 0.145 0.078 0.073 0.069 0.015 0.044 0.016
[ 644/ 723] blk.71.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.045 0.016
[ 645/ 723] blk.71.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.090 0.136 0.085 0.078 0.065 0.017 0.041 0.017
[ 646/ 723] blk.71.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.138 0.089 0.139 0.083 0.077 0.066 0.016 0.043 0.017
[ 647/ 723] blk.71.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.069 0.015 0.045 0.016
[ 648/ 723] blk.71.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 649/ 723] blk.71.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 650/ 723] blk.72.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.060 0.076 0.086 0.136 0.092 0.136 0.086 0.077 0.064 0.018 0.040 0.017
[ 651/ 723] blk.72.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.073 0.080 0.144 0.091 0.144 0.080 0.074 0.067 0.015 0.043 0.016
[ 652/ 723] blk.72.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.090 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 653/ 723] blk.72.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.069 0.015 0.045 0.016
[ 654/ 723] blk.72.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.091 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 655/ 723] blk.72.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.076 0.084 0.138 0.089 0.139 0.084 0.077 0.066 0.016 0.042 0.017
[ 656/ 723] blk.72.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.088 0.145 0.078 0.072 0.069 0.015 0.045 0.016
[ 657/ 723] blk.72.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 658/ 723] blk.72.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 659/ 723] blk.73.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.076 0.086 0.137 0.092 0.137 0.085 0.077 0.065 0.017 0.040 0.017
[ 660/ 723] blk.73.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.091 0.144 0.080 0.073 0.067 0.015 0.043 0.016
[ 661/ 723] blk.73.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.144 0.089 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 662/ 723] blk.73.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.142 0.088 0.143 0.079 0.074 0.070 0.015 0.045 0.016
[ 663/ 723] blk.73.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.091 0.135 0.086 0.078 0.065 0.018 0.041 0.017
[ 664/ 723] blk.73.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.139 0.084 0.077 0.066 0.016 0.042 0.017
[ 665/ 723] blk.73.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.070 0.015 0.045 0.016
[ 666/ 723] blk.73.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 667/ 723] blk.73.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 668/ 723] blk.74.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.140 0.091 0.140 0.083 0.076 0.066 0.016 0.042 0.016
[ 669/ 723] blk.74.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.072 0.079 0.145 0.091 0.144 0.079 0.073 0.068 0.015 0.043 0.016
[ 670/ 723] blk.74.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 671/ 723] blk.74.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.142 0.087 0.142 0.079 0.074 0.070 0.015 0.045 0.016
[ 672/ 723] blk.74.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.135 0.090 0.135 0.086 0.078 0.065 0.018 0.041 0.017
[ 673/ 723] blk.74.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.085 0.137 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 674/ 723] blk.74.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.144 0.089 0.145 0.078 0.072 0.069 0.015 0.045 0.016
[ 675/ 723] blk.74.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 676/ 723] blk.74.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 677/ 723] blk.75.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.140 0.090 0.140 0.083 0.076 0.066 0.015 0.042 0.016
[ 678/ 723] blk.75.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.144 0.091 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 679/ 723] blk.75.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.072 0.079 0.145 0.090 0.145 0.079 0.073 0.069 0.015 0.044 0.016
[ 680/ 723] blk.75.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.143 0.088 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 681/ 723] blk.75.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.087 0.135 0.091 0.135 0.086 0.078 0.064 0.018 0.041 0.017
[ 682/ 723] blk.75.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 683/ 723] blk.75.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 684/ 723] blk.75.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 685/ 723] blk.75.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 686/ 723] blk.76.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.075 0.083 0.139 0.090 0.139 0.083 0.076 0.066 0.016 0.042 0.016
[ 687/ 723] blk.76.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.072 0.079 0.144 0.091 0.144 0.080 0.073 0.068 0.015 0.043 0.016
[ 688/ 723] blk.76.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.145 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 689/ 723] blk.76.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.065 0.071 0.078 0.144 0.088 0.144 0.078 0.073 0.070 0.015 0.045 0.016
[ 690/ 723] blk.76.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.087 0.135 0.091 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 691/ 723] blk.76.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.076 0.084 0.138 0.089 0.138 0.084 0.077 0.066 0.016 0.042 0.017
[ 692/ 723] blk.76.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.071 0.078 0.145 0.089 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 693/ 723] blk.76.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 694/ 723] blk.76.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 695/ 723] blk.77.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.040 0.026 0.060 0.076 0.086 0.136 0.093 0.137 0.086 0.077 0.064 0.018 0.040 0.017
[ 696/ 723] blk.77.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.083 0.141 0.092 0.141 0.082 0.075 0.066 0.016 0.042 0.016
[ 697/ 723] blk.77.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.146 0.090 0.147 0.078 0.072 0.069 0.015 0.044 0.016
[ 698/ 723] blk.77.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.045 0.025 0.064 0.072 0.079 0.142 0.087 0.143 0.079 0.073 0.070 0.015 0.045 0.016
[ 699/ 723] blk.77.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.041 0.026 0.060 0.077 0.086 0.136 0.091 0.136 0.086 0.078 0.065 0.018 0.041 0.017
[ 700/ 723] blk.77.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.139 0.089 0.139 0.083 0.076 0.066 0.016 0.043 0.017
[ 701/ 723] blk.77.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.078 0.145 0.090 0.145 0.078 0.072 0.069 0.015 0.044 0.016
[ 702/ 723] blk.77.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 703/ 723] blk.77.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 704/ 723] blk.78.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.043 0.001 0.039 0.027 0.059 0.076 0.088 0.133 0.096 0.134 0.088 0.078 0.063 0.020 0.038 0.017
[ 705/ 723] blk.78.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.042 0.025 0.061 0.074 0.082 0.142 0.092 0.142 0.082 0.075 0.066 0.015 0.042 0.016
[ 706/ 723] blk.78.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.071 0.079 0.145 0.090 0.146 0.078 0.073 0.069 0.015 0.044 0.016
[ 707/ 723] blk.78.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.064 0.073 0.080 0.142 0.088 0.142 0.080 0.074 0.069 0.015 0.044 0.016
[ 708/ 723] blk.78.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.026 0.060 0.076 0.085 0.137 0.090 0.137 0.085 0.077 0.065 0.017 0.041 0.017
[ 709/ 723] blk.78.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.084 0.139 0.089 0.139 0.083 0.076 0.066 0.016 0.042 0.017
[ 710/ 723] blk.78.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.063 0.071 0.078 0.146 0.090 0.146 0.078 0.072 0.069 0.015 0.044 0.016
[ 711/ 723] blk.78.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 712/ 723] blk.78.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 713/ 723] blk.79.attn_q.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.001 0.040 0.026 0.060 0.076 0.086 0.136 0.094 0.136 0.086 0.077 0.064 0.018 0.040 0.017
[ 714/ 723] blk.79.attn_k.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.141 0.091 0.141 0.082 0.075 0.067 0.016 0.042 0.016
[ 715/ 723] blk.79.attn_v.weight - [ 8192, 1024, 1, 1], type = f16, quantizing to q4_0 .. size = 16.00 MB -> 4.50 MB | hist: 0.044 0.000 0.043 0.024 0.062 0.072 0.080 0.145 0.090 0.145 0.079 0.073 0.068 0.015 0.043 0.016
[ 716/ 723] blk.79.attn_output.weight - [ 8192, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 128.00 MB -> 36.00 MB | hist: 0.044 0.000 0.044 0.025 0.063 0.074 0.082 0.139 0.088 0.140 0.082 0.076 0.068 0.015 0.044 0.017
[ 717/ 723] blk.79.ffn_gate.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.061 0.075 0.083 0.140 0.089 0.140 0.083 0.076 0.066 0.016 0.042 0.016
[ 718/ 723] blk.79.ffn_up.weight - [ 8192, 28672, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.043 0.025 0.062 0.074 0.082 0.142 0.090 0.142 0.082 0.075 0.067 0.015 0.043 0.016
[ 719/ 723] blk.79.ffn_down.weight - [28672, 8192, 1, 1], type = f16, quantizing to q4_0 .. size = 448.00 MB -> 126.00 MB | hist: 0.044 0.000 0.042 0.024 0.062 0.070 0.078 0.148 0.093 0.149 0.078 0.071 0.067 0.015 0.042 0.016
[ 720/ 723] blk.79.attn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 721/ 723] blk.79.ffn_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 722/ 723] output_norm.weight - [ 8192, 1, 1, 1], type = f32, size = 0.031 MB
[ 723/ 723] output.weight - [ 8192, 32000, 1, 1], type = f16, quantizing to q6_K .. size = 500.00 MB -> 205.08 MB | hist:
llama_model_quantize_internal: model size = 131565.03 MB
llama_model_quantize_internal: quant size = 37070.73 MB
llama_model_quantize_internal: hist: 0.044 0.000 0.043 0.025 0.063 0.073 0.081 0.141 0.089 0.141 0.081 0.075 0.068 0.016 0.043 0.016
main: quantize time = 177334.49 ms
main: total time = 177334.49 ms