🎸
Be well-grounded and speak up moderately.

Zhejun Zheng MarioZZJ

MarioZZJ / llm-api-demo-bigmodel4extraction.ipynb
Created September 4, 2024 08:35
Demo for extracting keywords via an LLM: BigModel API
@MarioZZJ
MarioZZJ / citing-edge-gen3.sql
Last active May 6, 2024 14:02
Get three generations of citation edges, publication years, and MeSH terms
WITH g1 AS (
    SELECT citing, referenced, 1 AS gen
    FROM userdb_mariozzj_SAO4D.dbo.sample_xyc AS s
    INNER JOIN pubmed_2024.dbo.open_citation_collection AS occ
        ON s.pmid = occ.citing
), g2 AS (
    SELECT citing, referenced, 2 AS gen
    FROM (SELECT referenced AS r1 FROM g1 GROUP BY referenced) AS rg1
    INNER JOIN pubmed_2024.dbo.open_citation_collection AS occ
        ON rg1.r1 = occ.citing
), g3 AS (
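The preview above is cut off at the third generation. As a rough illustration of the same idea (not the gist's actual SQL), the generation-by-generation expansion can be sketched in plain Python. The function name `citation_generations` and the in-memory edge list are hypothetical stand-ins for the `sample_xyc` and `open_citation_collection` tables.

```python
from collections import defaultdict


def citation_generations(seeds, edges, max_gen=3):
    """Return (citing, referenced, gen) tuples for up to max_gen generations.

    seeds: iterable of seed paper ids (hypothetical stand-in for sample_xyc).
    edges: iterable of (citing, referenced) pairs (stand-in for the citation table).
    """
    refs_by_citing = defaultdict(set)
    for citing, referenced in edges:
        refs_by_citing[citing].add(referenced)

    result = []
    frontier = set(seeds)  # papers whose reference lists are expanded next
    for gen in range(1, max_gen + 1):
        next_frontier = set()
        for citing in sorted(frontier):
            for referenced in sorted(refs_by_citing.get(citing, ())):
                result.append((citing, referenced, gen))
                next_frontier.add(referenced)
        # Like the GROUP BY in the SQL, the next frontier is the distinct
        # set of papers referenced in the current generation.
        frontier = next_frontier
    return result


if __name__ == "__main__":
    demo_edges = [(1, 2), (1, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
    for row in citation_generations([1], demo_edges):
        print(row)
```

As with the CTE chain, an edge can appear under more than one generation if a paper is reachable at different depths; deduplicating is left to the caller.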
@MarioZZJ
MarioZZJ / comesh-cocite.sql
Last active April 25, 2024 01:12
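Compute the coupling strength between each seed paper and papers that share a MeSH term with it and were published within one year before or after it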
DECLARE @PMID INT;
DECLARE @PYEAR INT;
DECLARE @THRESHOLD INT = 2;
DECLARE cur CURSOR FOR
SELECT py.pmid AS PMID, py.publish_year AS PYEAR FROM
userdb_mariozzj_SAO4D.dbo.sample_xyc AS s
INNER JOIN
pubmed_2024.dbo.pmid_py AS py
ON s.pmid = py.pmid;
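The cursor body is not shown in the preview, so the exact definition of coupling strength used by the gist is unknown. The sketch below is one possible reading, with every name hypothetical: strength is assumed to be the number of shared references, candidates must share at least one MeSH term with the seed and fall within one year of its publication year, and `threshold` is assumed to be a minimum strength (mirroring `@THRESHOLD = 2`).

```python
def coupling_strengths(seed, papers, threshold=2):
    """Return {pmid: strength} for papers coupled to `seed`.

    papers: dict mapping pmid -> {"year": int, "mesh": set, "refs": set}.
    Strength is ASSUMED to be the count of shared references; the gist's
    real definition is not visible in the truncated preview.
    """
    s = papers[seed]
    result = {}
    for pmid, p in papers.items():
        if pmid == seed:
            continue
        if abs(p["year"] - s["year"]) > 1:  # outside the +/- 1 year window
            continue
        if not (p["mesh"] & s["mesh"]):     # no MeSH term in common
            continue
        strength = len(p["refs"] & s["refs"])
        if strength >= threshold:
            result[pmid] = strength
    return result


if __name__ == "__main__":
    demo = {
        10: {"year": 2020, "mesh": {"Neoplasms"}, "refs": {1, 2, 3}},
        11: {"year": 2021, "mesh": {"Neoplasms"}, "refs": {2, 3, 4}},
        12: {"year": 2023, "mesh": {"Neoplasms"}, "refs": {1, 2, 3}},
    }
    print(coupling_strengths(10, demo))
```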
@MarioZZJ
MarioZZJ / biobert-embedding.py
Last active May 25, 2024 02:12
Represent paper title and abstract text using biobert-large-cased
from transformers import AutoConfig, AutoModel, AutoTokenizer
import pandas as pd
import numpy as np
import torch
import csv
import argparse
from tqdm import tqdm
if __name__ == '__main__':
@MarioZZJ
MarioZZJ / download_mesh_tree.py
Created March 15, 2023 06:52
Collect all MeSH terms in the MeSH tree and save them as JSON and CSV.
#!/usr/bin/env python
"""
Collect all MeSH terms in the MeSH tree and save them as JSON and CSV.
Author: MarioZZJ <zjzheng@smail.nju.edu.cn>
Usage:
python3 download_mesh_tree.py
"""
@MarioZZJ
MarioZZJ / flatten-openalex-jsonl.py
Last active November 12, 2022 07:30 — forked from richard-orr/flatten-openalex-jsonl.py
flatten openalex JSON Lines files to TSV readable by Spark
# python 3.8+ required
import csv
import glob
import gzip
import json
import os
SNAPSHOT_DIR = 'openalex-snapshot'
CSV_DIR = 'tsv-files'
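Only the imports and two directory constants of this gist are visible here. As a minimal sketch of the general technique (not the forked script itself), the function below reads a gzipped JSON Lines file and writes selected top-level fields to a tab-separated file. The function name `flatten_jsonl_to_tsv` and the choice to serialize nested values as JSON strings are assumptions made for illustration.

```python
import csv
import gzip
import json


def flatten_jsonl_to_tsv(jsonl_gz_path, tsv_path, columns):
    """Write the given top-level fields of each JSON line to a TSV file.

    Nested dicts and lists are stored as JSON strings (an assumption of this
    sketch); missing or null fields become empty cells.
    Returns the number of records written.
    """
    count = 0
    with gzip.open(jsonl_gz_path, "rt", encoding="utf-8") as src, \
            open(tsv_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        writer.writerow(columns)
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            row = []
            for col in columns:
                value = record.get(col)
                if isinstance(value, (dict, list)):
                    value = json.dumps(value, ensure_ascii=False)
                row.append("" if value is None else value)
            writer.writerow(row)
            count += 1
    return count
```

A call such as `flatten_jsonl_to_tsv("works.jsonl.gz", "works.tsv", ["id", "display_name", "publication_year"])` would produce one header row plus one row per record; the file names and column list here are placeholders.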
from glob import glob
from lxml import etree
import gc
import gzip
from dbutils.pooled_db import PooledDB
import pymysql
import threading
import logging
import time
import pandas as pd
@MarioZZJ
MarioZZJ / header.png
Last active September 24, 2024 01:35
metrics
header.png
@MarioZZJ
MarioZZJ / MarioZZJ's GitHub Stats
Last active September 26, 2022 01:10
github-stats
⭐ Total Stars: 13
➕ Yearly Commits: 108
🔀 Total PRs: 8
🚩 Total Issues: 52
📦 Contributed to: 6