@hanxiao
hanxiao / testRegex.js
Last active September 20, 2024 18:18
Regex for chunking text using all semantic cues
// Updated: Aug. 20, 2024
// Run: node testRegex.js whatever.txt
// Live demo: https://jina.ai/tokenizer
// LICENSE: Apache-2.0 (https://www.apache.org/licenses/LICENSE-2.0)
// COPYRIGHT: Jina AI
const fs = require('fs');
const util = require('util');
// Named constants in place of magic numbers
const MAX_HEADING_LENGTH = 7;