Agnes en_US # Isn't it nice to have a computer that will talk to you?
Albert en_US # I have a frog in my throat. No, I mean a real frog!
Alex en_US # Most people recognize me by my voice.
Alice it_IT # Salve, mi chiamo Alice e sono una voce italiana.
Alva sv_SE # Hej, jag heter Alva. Jag är en svensk röst.
Amelie fr_CA # Bonjour, je m’appelle Amelie. Je suis une voix canadienne.
Anna de_DE # Hallo, ich heiße Anna und ich bin eine deutsche Stimme.
Bad News en_US # The light you see at the end of the tunnel is the headlamp of a fast approaching train.
Bahh en_US # Do not pull the wool over my eyes.
Bells en_US # Time flies when you are having fun.
import openai
import asyncio
from typing import Any

async def dispatch_openai_requests(
    messages_list: list[list[dict[str,Any]]],
    model: str,
    temperature: float,
    max_tokens: int,
    top_p: float,
https://www.coop.ch/de/navigation/meganav/get?categoryCode=m_0475&_=1629798238733
https://www.amazon.com/dp/B0009R5B3U?th=1
https://play.google.com/_/PlayStoreUi/data/batchexecute?rpcids=qnKhOb&bl=boq_playuiserver_20191117.08_p1&gl=my&hl=ms&authuser&soc-app=121&soc-platform=1&soc-device=1&rt=c
https://www.coop.ch/de/lebensmittel/vorraete/pastasaucen-warme-saucen/saucen-gemischt/c/m_0166
https://www.amazon.com.au/gp/aod/ajax/ref=aod_f_freeShipping?asin=B08CXNTJ89&pageno=1&pc=dp
https://losangeles.craigslist.org/lac/ofc/d/san-diego-market-research-project/7369959933.html
https://www.bestbuy.ca/api/offers/v1/products/10434538/offers
https://www.zillow.com/homes/5266-S-Umatilla-Ave-Boise-ID-83709_rb
https://www.yelp.com/not_recommended_reviews/dannys-rv-repair-williams?not_recommended_start=40
This is an introduction to managing infrastructure (Infra) and system administration operations (Ops), particularly for deep learning applications.
Q: I have my Jupyter notebooks and virtual machines that come with all batteries included. Why do I have to handle bare-metal machines?
A: You don't (if you are not interested). In most cases, the normal virtual machines with everything included, e.g. AWS SageMaker or Azure Machine Learning Studio, are enough. But these are technically "managed" virtual machines, and fixing something that goes wrong after a VM has been running for a couple of months often runs into the same issues that require infra-ops knowledge to handle.
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=4095)
def ld(s, t):
    """
    Levenshtein distance memoized implementation from Rosetta code:
    https://rosettacode.org/wiki/Levenshtein_distance#Python
    """
    if not s: return len(t)
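
The snippet above stops after its first base case. For orientation, here is a minimal, self-contained sketch of how a memoized recursive Levenshtein distance is commonly written. The function name `levenshtein` and everything after the first base case are assumptions for illustration, not necessarily what the original gist contains.

```python
from functools import lru_cache


@lru_cache(maxsize=4095)
def levenshtein(s: str, t: str) -> int:
    """Edit distance between s and t (hypothetical completion, for illustration)."""
    # Base cases: if one string is empty, the distance is the length of the other.
    if not s:
        return len(t)
    if not t:
        return len(s)
    # Matching first characters cost nothing; recurse on the remainders.
    if s[0] == t[0]:
        return levenshtein(s[1:], t[1:])
    # Otherwise pay 1 for the cheapest of the three edit operations.
    insertion = levenshtein(s, t[1:])
    deletion = levenshtein(s[1:], t)
    substitution = levenshtein(s[1:], t[1:])
    return 1 + min(insertion, deletion, substitution)


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Because the function slices off one character at a time, recursion depth grows with the string length, so this form is best kept to short strings; `lru_cache` only avoids recomputing repeated suffix pairs.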
import requests
from io import StringIO

response = requests.get('https://unicode.org/Public/emoji/13.0/emoji-sequences.txt')

with open('emoji.txt', 'w') as fout:
    with StringIO(response.content.decode('utf8')) as fin:
        for line in fin:
            if line.strip() and not line.startswith('#'):
                hexa = line.split(';')[0].split('..')
So in the midst of this era of "contextualized" language models, with all its Sesame Street characters and robots that transform into automobiles, there is this "Toronto Book Corpus" that points to this kinda recently influential paper:
Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. "Aligning books and movies: Towards story-like visual explanations by watching movies and reading books." In Proceedings of the IEEE international conference on computer vision, pp. 19-27.
Some might know my personal pet peeve about collecting translation datasets, but this BookCorpus has no translations, so why do I even care about it?