@tombarys
Last active April 30, 2023 17:38
Cleans the focused Roam block after copy/pasting/importing from PDF
;; For everyone who pastes text from a PDF into Roam, either by copy-paste or by importing
;; via Readwise from a PDF in the Reader app.
;; The problem is that PDFs often contain hard-coded newline characters at the end of lines,
;; and words tend to be split by hyphens, which breaks up the text when it is pasted into Roam.
;; I've made a simple script that cleans up the text and merges the lines. A line shorter
;; than the preset threshold is treated as the end of a paragraph (so its newline is kept).
;; You can assign a keyboard shortcut to the command, so it then works instantly.
;; Installation:
;; 1) copy this code into a child code block (`clojure`) under a parent block containing
;;    {{[[roam/cljs]]}}, anywhere on the [[roam/cljs]] page
;; 2) confirm "Yes, I know what I am doing"
;;
;; Usage:
;; 1) focus the block containing the text pasted from a PDF
;; 2) press Cmd-P (Ctrl-P on Windows) to show the Command Palette
;; 3) search for "Clean PDF text"
;; 4) press Enter
;;
;; TIP: you can easily set up a keyboard shortcut for this command from the Command Palette
(ns clean-block-30-4-2023
  (:require [clojure.string :as str]
            [roam.datascript :as rd]
            [roam.block :as block]))
(def threshold 40) ;; lines shorter than this many characters are treated as the end
                   ;; of a paragraph, so their newline is kept
(defn block-content
  "Returns the text of the block with the given uid."
  [uid]
  (rd/q '[:find ?text .
          :in $ ?uid
          :where [?e :block/uid ?uid]
                 [?e :block/string ?text]]
        uid))
(defn clean
  "Rejoins hyphenated words and merges lines; a line shorter than `threshold`
  keeps its newline, any other line is followed by a space."
  [text]
  (str/join
   (map #(if (> threshold (count %)) (str % "\n") (str % " "))
        (str/split-lines
         (-> text
             (str/replace #"\-\s" "") ;; removes every hyphen that is followed by whitespace
             (str/trim))))))
(defn main []
  (js/window.roamAlphaAPI.ui.commandPalette.addCommand
   (clj->js
    {:label "Clean PDF text"
     :callback (fn [e]
                 (let [block-uid (:block-uid (js->clj (js/window.roamAlphaAPI.ui.getFocusedBlock)
                                                      :keywordize-keys true))
                       text (block-content block-uid)]
                   ;; skip when no block is focused (text is nil) or the block is blank
                   (when-not (str/blank? text)
                     (block/update {:block {:uid block-uid :string (clean text)}}))))})))
(main)
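To try the cleaning rule outside Roam, the sketch below reproduces the logic of the gist's `clean` in plain Clojure, with no Roam API calls. The names `clean-demo`, `clean-text`, `demo-threshold` and `sample` are hypothetical and not part of the gist; `clean-text` takes the limit as an argument instead of reading a global, which is an assumption made only to keep the example self-contained.

```clojure
(ns clean-demo
  (:require [clojure.string :as str]))

;; Hypothetical stand-in for the gist's threshold, using the same value.
(def demo-threshold 40)

(defn clean-text
  "Mirrors the gist's cleaning rule: removes each hyphen that is followed by
  whitespace, then joins the lines. A line shorter than `limit` keeps its
  newline; any other line is followed by a space."
  [text limit]
  (->> (-> text
           (str/replace #"-\s" "")
           str/trim
           str/split-lines)
       (map #(if (< (count %) limit) (str % "\n") (str % " ")))
       str/join))

;; Made-up sample text with two hyphenated line breaks and one short line.
(def sample
  (str "Hard-coded newlines break up sen-\n"
       "tences when text is pasted from a PDF into\n"
       "Roam.\n"
       "A short line like the one above ends a para-\n"
       "graph, so its newline is kept."))

(println (clean-text sample demo-threshold))
```

With this sample, "sen-" and "tences" are rejoined, the short line "Roam." keeps its newline, and the long lines are merged with spaces, so the result should be two paragraphs. The hyphen in "Hard-coded" survives because it is not followed by whitespace. The last line is always followed by a space or a newline, so the result carries one trailing whitespace character.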