Ivan Charapanau av

@av
av / tasks.html
Created September 15, 2024 15:51
misguidedbench - tasks
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Task Report</title>
<style>
body {
@av
av / misguidedbench.sh
Last active September 15, 2024 15:41
misguidedbench
#!/bin/bash
OPENROUTER_KEY="<your key here>"
TASKS=/path/to/misguided.yaml
NAME=misguided
# Common
h bench judge meta-llama/llama-3.1-70b-instruct
h bench judge_api https://openrouter.ai/api
h bench judge_key "$OPENROUTER_KEY"
h bench tasks "$TASKS"
@av
av / cheese.yaml
Last active September 12, 2024 21:13
CheeseBench
- tags: [cheese]
question: Which cheese is nicknamed "King of Cheeses" but paradoxically has a rind resembling concrete?
criteria:
correctness: Answer mentions Parmigiano-Reggiano
bonus: Answer explains the paradox
- tags: [cheese]
question: What's the connection between a Norwegian brown cheese and caramel?
criteria:
correctness: Answer mentions caramelized milk sugars in any form
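Each task in the preview above pairs a question with named criteria. A minimal Python sketch of that structure follows. The dictionaries mirror the YAML keys shown, while `judge_prompts` and its prompt wording are hypothetical and not taken from Harbor; they only illustrate how each criterion could become a separate yes/no check for a judge model.

```python
# In-memory form of the two tasks shown above; keys mirror the YAML.
tasks = [
    {
        "tags": ["cheese"],
        "question": 'Which cheese is nicknamed "King of Cheeses" but '
                    "paradoxically has a rind resembling concrete?",
        "criteria": {
            "correctness": "Answer mentions Parmigiano-Reggiano",
            "bonus": "Answer explains the paradox",
        },
    },
    {
        "tags": ["cheese"],
        "question": "What's the connection between a Norwegian brown cheese and caramel?",
        "criteria": {
            "correctness": "Answer mentions caramelized milk sugars in any form",
        },
    },
]


def judge_prompts(task):
    """Build one yes/no prompt per criterion (assumed format, not Harbor's)."""
    return {
        name: (
            f"Question: {task['question']}\n"
            f"Criterion: {text}\n"
            "Does the answer meet the criterion? Reply yes or no."
        )
        for name, text in task["criteria"].items()
    }


for task in tasks:
    for name, prompt in judge_prompts(task).items():
        print(f"[{name}]\n{prompt}\n")
```

Keeping criteria as separate named entries means a judge can score correctness and bonus points independently.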
@av
av / engbench.sh
Created September 12, 2024 16:22
Harbor bench - engines recipe
#!/bin/bash
# Note that you're not expected to run this
# file as is in one go
OPENROUTER_KEY="<your_openrouter_key>"
TASKS="<path_to_tasks_file>"
NAME=engbench
@av
av / mmlu_256.yaml
Created September 12, 2024 16:17
Harbor MMLU 256
- tags:
- ori_mmlu-global_facts
question: >-
<instructions>Carefully read the question and the options provided. Choose
the option that best answers the question.</instructions>
<question>As of 2017, the share of deaths in Greenland by suicide is
about</question>
<options><option>A: 3.60%</option>
@av
av / rml.md
Created September 6, 2024 21:43
RML - Reasoning Markup Language

Prompt

You are a helpful assistant. You're smart, clever, direct and pragmatic. You notice details that few people would. Be careful, as the questions might attempt to misguide and trick you. When answering the User, outline your thought process using these tags:

<thought> The root element that encapsulates an entire thought process.
<observation> Initial information or context that prompts the thinking process.
<question> The main query or problem to be addressed.
<hypothesis> An initial proposed explanation or solution.
<reasoning> Container for the logical steps of the thought process.
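Since the tags nest under a single root element, a response in this format can be read with an ordinary XML parser. The sketch below is an illustration only: the sample response is invented, it uses just the five tags listed above, and it assumes the model emits well-formed markup.

```python
import xml.etree.ElementTree as ET

# Invented sample response; only the tags listed above are used.
sample = """<thought>
  <observation>The jugs hold 1 and 2 liters.</observation>
  <question>How can exactly 3 liters be measured?</question>
  <hypothesis>Fill both jugs; together they hold 3 liters.</hypothesis>
  <reasoning>1 + 2 = 3, so no pouring between jugs is needed.</reasoning>
</thought>"""

root = ET.fromstring(sample)

# Map each child tag to its text, preserving document order.
sections = {child.tag: (child.text or "").strip() for child in root}

for tag, text in sections.items():
    print(f"{tag}: {text}")
```

A real model response may not be well-formed XML, in which case `ET.fromstring` raises `ParseError` and a more tolerant parser would be needed.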
@av
av / chat.md
Created September 6, 2024 20:57
Misguided Reflection

Problem 1 - Jugs

I have a 1- and a 2-liter jug. I want to measure exactly 3 liters.

Reflection 70B (Free)

<thinking>
Let's approach this problem step by step:
@av
av / bundle.js
Last active August 8, 2021 18:58
Custom Widget Bundle
parcelRequire=function(e,r,t,n){var i,o="function"==typeof parcelRequire&&parcelRequire,u="function"==typeof require&&require;function f(t,n){if(!r[t]){if(!e[t]){var i="function"==typeof parcelRequire&&parcelRequire;if(!n&&i)return i(t,!0);if(o)return o(t,!0);if(u&&"string"==typeof t)return u(t);var c=new Error("Cannot find module '"+t+"'");throw c.code="MODULE_NOT_FOUND",c}p.resolve=function(r){return e[t][1][r]||r},p.cache={};var l=r[t]=new f.Module(t);e[t][0].call(l.exports,p,l,l.exports,this)}return r[t].exports;function p(e){return f(p.resolve(e))}}f.isParcelRequire=!0,f.Module=function(e){this.id=e,this.bundle=f,this.exports={}},f.modules=e,f.cache=r,f.parent=o,f.register=function(r,t){e[r]=[function(e,r){r.exports=t},{}]};for(var c=0;c<t.length;c++)try{f(t[c])}catch(e){i||(i=e)}if(t.length){var l=f(t[t.length-1]);"object"==typeof exports&&"undefined"!=typeof module?module.exports=l:"function"==typeof define&&define.amd?define(function(){return l}):n&&(this[n]=l)}if(parcelRequire=f,i)throw i;return f}({
@av
av / lisp.dart
Created August 29, 2020 08:38
Example of running LISP in Dart
void main() {
// LISP Scope could be populated with required values
// to provide interop between Dart and LISP
final baseScope = LispScope({
'*': Multiplication('*'),
'+': Addition('+'),
'offset': OffsetContainer('offset'),
'print': Print('print'),
'call': CallMethod('call'),
});
@av
av / level.txt
Created August 28, 2020 19:29
Sample level MLLS
~~>Meta
## Cheesy onions
@@ 240 Wettotter Harbor, 53320
Local food supplier needs help unlocking a warehouse.
~~>Dialog
hello|Warehouse owner| Hey, you're here... You gotta help me! You gotta help me quick! 🙏
what|You| Any rush?
--
what|Warehouse owner| Actually... Yes. That boy did not get his raise... By... An occasion.