This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# from https://xlsxwriter.readthedocs.io/example_pandas_table.html | |
df.index.name = "org_id" | |
criteria = df["Include"]=="y" | |
sheets = { | |
"Included - active": (criteria & df["active"]), | |
"Included - inactive": (criteria & ~df["active"]), | |
"Excluded": ~criteria, | |
} | |
column_widths = df.apply(lambda x: x.astype(str).str.len().max(), axis=0).apply(lambda x: min(x, 50)).to_dict() |
We can't make this file beautiful and searchable because it's too large.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
"reg_number","name","constitution","geographical_spread","address","postcode","phone","active","date_registered","date_removed","web","email","company_number","objects","source","last_updated","income","spending","fye","dual_registered" | |
SC004401,University Of Glasgow Court,Statutory corporation (Royal Charter etc),UK and overseas,"Directorate, Finance Office, Gilbert Scott Building, University Avenue, Glasgow",G12 8QQ,,true,1992-01-01,,www.gla.ac.uk,,,"The advancement of education. The advancement of health. The advancement of, arts, heritage, culture, or science.",Office of Scottish Charity Regulator,2021-01-25 18:48:11,690648000,758804000,2019-07-31,false |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"id_and_name": "[\"Exeter City Council\", \"GB-LAE-EXE\"]", | |
"locationLookup": [ | |
{ | |
"ewcd": "K04000001", | |
"ewnm": "England and Wales", | |
"gbcd": "K03000001", | |
"gbnm": "Great Britain", | |
"ukcd": "K02000001", | |
"uknm": "United Kingdom", |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ladcd | ladnm | |
---|---|---|
E09000002 | Barking and Dagenham | |
E09000003 | Barnet | |
E09000004 | Bexley | |
E09000005 | Brent | |
E09000006 | Bromley | |
E09000008 | Croydon | |
E09000009 | Ealing | |
E09000010 | Enfield | |
E09000013 | Hammersmith and Fulham |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# Bins used for numeric fields | |
AMOUNT_BINS = [0, 500, 1000, 2000, 5000, 10000, 100000, 1000000, float("inf")] | |
AMOUNT_BIN_LABELS = ["Under £500", "£500 - £1k", "£1k - £2k", "£2k - £5k", "£5k - £10k", | |
"£10k - £100k", "£100k - £1m", "Over £1m"] | |
INCOME_BINS = [-1, 10000, 100000, 250000, 500000, 1000000, 10000000, float("inf")] | |
INCOME_BIN_LABELS = ["Under £10k", "£10k - £100k", "£100k - £250k", | |
"£250k - £500k", "£500k - £1m", "£1m - £10m", "Over £10m"] | |
AGE_BINS = pd.to_timedelta( | |
[x * 365 for x in [-1, 1, 2, 5, 10, 25, 200]], unit="D") | |
AGE_BIN_LABELS = ["Under 1 year", "1-2 years", "2-5 years", |
We can't make this file beautiful and searchable because it's too large.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
LSOA11CD,LSOA11NM,MSOA11CD,MSOA11NM,TTWA11CD,TTWA11NM,LAD11CD,LAD11NM,LAD11NMW,LAD20CD,LAD20NM,LAD20NMW,CTY20CD,CTY20NM,CTRYCD,CTRYNM | |
95DD04W1,Ballee,95DD04W1,Ballee,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland | |
95DD05W1,Ballykeel,95DD05W1,Ballykeel,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland | |
95DD06W1,Ballyloughan,95DD06W1,Ballyloughan,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland | |
95DD07S1,Broughshane 1,95DD07S1,Broughshane 1,N12000001,Ballymena,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,,N09000008,Mid and East Antrim,N92000002,Northern Ireland |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import io | |
from zipfile import ZipFile, ZIP_DEFLATED | |
import random | |
import bcp | |
import click | |
import tqdm | |
@click.command() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
# use with https://pypi.org/project/titlecase/ | |
# eg: titlecase.titlecase(x, title_exceptions) | |
def title_exceptions(word, **kwargs): | |
word_test = word.strip("(){}<>.") | |
# lowercase words | |
if word_test.lower() in ['a', 'an', 'of', 'the', 'is', 'or']: | |
return word.lower() |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
{ | |
"extra_context": { | |
"amountAwardedFixed_facet_size": 3, | |
"awardYear_facet_size": 3 | |
}, | |
"sort": { | |
"_score": { | |
"order": "desc" | |
} | |
}, |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
CompanyNumberPattern | CompanyCategory | companies | |
---|---|---|---|
00000000 | Community Interest Company | 18634 | |
00000000 | Converted/Closed | 11 | |
00000000 | Old Public Company | 35 | |
00000000 | Other Company Type | 3 | |
00000000 | PRI/LBG/NSC (Private, Limited by guarantee, no share capital, use of 'Limited' exemption) | 40062 | |
00000000 | PRI/LTD BY GUAR/NSC (Private, limited by guarantee, no share capital) | 103622 | |
00000000 | PRIV LTD SECT. 30 (Private limited company, section 30 of the Companies Act) | 18 | |
00000000 | Private Limited Company | 4526934 | |
00000000 | Private Unlimited | 250 |
NewerOlder