The current implementation (behind feature flag rqg=true
) imports
all the iterations of a group in a single JSON blob.
While this was a good experiment it leads to some complexities, e.g. question type handling.
Right now there is code that understand the value behind a question type.
Consider this 2 imports:
\d ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7
Table “public.ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7” | ||||
---|---|---|---|---|
Column | Type | Collation | Nullable | Default |
rnum | integer | not null | nextval(‘ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_rnum_seq’::regclass) | |
identifier | text | |||
instance_id | text | not null | ||
display_name | text | |||
submitter | text | |||
submitted_at | timestamp with time zone | |||
surveyal_time | double precision | |||
device_id | text | |||
c583119147 | text | |||
c594979148 | double precision | |||
c609479145 | text | |||
c617319146 | text | |||
c601879162 | text | |||
c601889144 | geometry(Point,4326) | |||
c601899165 | text | |||
c601879163 | text | |||
c601899166 | timestamp with time zone | |||
c599649166 | text | |||
c615289146 | geometry(Geometry,4326) | |||
c615289146_0 | text | |||
c597909145 | text | |||
c615289147 | text | |||
c617309149 | text | |||
c588869155 | geometry(Point,4326) | |||
Indexes: | ||||
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_pkey” PRIMARY KEY, btree (rnum) | ||||
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_instance_id_key” UNIQUE CONSTRAINT, btree (instance_id) | ||||
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c588869155_idx” gist (c588869155) | ||||
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c601889144_idx” gist (c601889144) | ||||
“ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7_c615289146_idx” gist (c615289146) |
And some of the values
select * from ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7 limit 5;
rnum | identifier | instance_id | display_name | submitter | submitted_at | surveyal_time | device_id | c583119147 | c594979148 | c609479145 | c617319146 | c601879162 | c601889144 | c601899165 | c601879163 | c601899166 | c599649166 | c615289146 | c615289146_0 | c597909145 | c615289147 | c617309149 | c588869155 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | y8r0-sj5w-fnga | 579079115 | Gombitova | Jana bluestacks | 2020-06-11 15:15:34+00 | 148 | jana bluestacks | Jana | 30 | 1:yes | yes | 2 | 0101000020E610000078EE3D5C72DC13405436ACA92C644C40 | https://akvoflow-uat1.s3.amazonaws.com/images/5e4305e9-57a9-46e8-b045-e0f37b2ff7c5.jpg | https://akvoflow-uat1.s3.amazonaws.com/images/90d5af97-a52a-4363-849f-c45f969be158.mp4 | 2020-06-11 15:10:56.09+00 | 1234567890 | 0103000020E6100000010000000D000000E646EC6EF9BB2640F3AD0FEB8DFE3F401A23C8F77B6C2340AB4203B16C343C403A32A0281A292D40CF0E5DF525BE374013E3EB21D0AB21409D1CF45B2AD936408F52094FE815FA3F6B0F7BA1802F3A4065DEAAEB504DBFBFDFFAB0DEA8044140AB49A6F9197510C0E148FB7A1915444098D41A947AC0F9BFC29B7F057A134840B350210D5D2F0E4051A1BAB9F8D24A40DF8C9AAF92732240FE7B952133814A4094F029A50C812040227E0FA8486C46406E4617E5E30C18407C702D4954AE4340E646EC6EF9BB2640F3AD0FEB8DFE3F40 | {“pointCount”:”12”,”length”:”7756726.5”,”area”:”2.7898780828596826E12”} | Gombitova | 0101000020E6100000274EEE7728CA13404EB9C2BB5C244C40 | |||
2 | v1ex-928r-bh9g | 587119117 | bla ble blu bli | Jana bluestacks | 2020-06-11 15:20:40+00 | 135 | jana bluestacks | bla bla | 25 | 1:yes | no | 1 | 0101000020E610000000000000000018400000000000004E40 | https://akvoflow-uat1.s3.amazonaws.com/images/af5a5426-94f0-4547-8409-9e648c42278a.jpg | https://akvoflow-uat1.s3.amazonaws.com/images/98eb72ad-3084-4062-bc8e-c0dadbd42230.mp4 | 2020-06-03 15:15:42.535+00 | 21212121212121 | 0103000020E610000001000000080000007080F4A8530122404497811486F64C40BB438A0112E13240C6B656C565D94D405A958E835C353C40E7A15B5430DB4E400A7CEAFDEBDD3C409412279C82954C4054B8D4F7D0A03A4062A1D634EFAB4A40C4DD6AE7455B304011FBA99619714840128F1F85463B27402F185C73476F4A407080F4A8530122404497811486F64C40 | {“pointCount”:”7”,”length”:”3542964.2”,”area”:”1.0478411372043157E12”} | bla ble blu bli | 0101000020E610000000000000000018400000000000004E40 | |||
3 | vbqk-k700-7gya | 597079115 | One Repetition | valeria_acer | 2020-06-09 15:10:54+00 | 213 | valeria_acer | Pete | 2 | 2:no | no | 2 | 0101000020E6100000000000000000F03F000000000000F03F | https://akvoflow-uat1.s3.amazonaws.com/images/00a2a5e7-03f4-493c-9711-202b72569be9.jpg | https://akvoflow-uat1.s3.amazonaws.com/images/26691c4c-c7b7-48f0-b8c6-415fa9d351ef.mp4 | 2014-06-09 15:09:28.062+00 | 1234567 | 0103000020E610000001000000070000009C1AC3F759643F4048F0D0C144FB4840521B8B5C26653F40D594641D8EF64840A4B6C31506713F407C58B96125F84840D305065ACB563F4001B8FE02E6F04840287513C660633F40C9C452C9A5EB48403B0D62B1974F3F4017EAFA4F48EA48409C1AC3F759643F4048F0D0C144FB4840 | {“pointCount”:”6”,”length”:”44719.12”,”area”:”2.7101229755859375E7”} | One Repetition | 0101000020E6100000000000000000F03F000000000000F03F | |||
4 | ryvu-0gxm-j69b | 601909115 | Poppins | valeria_acer | 2020-06-09 14:42:45+00 | 719 | valeria_acer | Mary | 36 | 1:yes | yes | no | 1 | 0101000020E6100000D090A88D3E5105400019A453C6CE4440 | https://akvoflow-uat1.s3.amazonaws.com/images/825057a7-42c8-490f-906b-8215e7666af1.jpg | https://akvoflow-uat1.s3.amazonaws.com/images/3978e56b-0b91-4d80-bd55-f68a49144214.mp4 | 1986-06-09 14:06:48.048+00 | 12345678 | 0103000020E6100000010000000600000004C1882249352840AB77B81D1A6D4540CE4B6A0D4A3528402F00E88B186D4540552E54FEB5342840982ABDDB616B454005E33B8CA42C2840B000A60C1C6C454083520F2CA23D2840D4957439CA6D454004C1882249352840AB77B81D1A6D4540 | {“pointCount”:”5”,”length”:”7469.8857”,”area”:”827588.20703125”} | {“app”:{“appVersion”:”1.0”,”language”:”en”},”device”:{“country”:”US”,”language”:”en”,”manufacturer”:”Acer”,”model”:”A1-840FHD”,”os”:”Android - 4.4.2 (19)”,”product”:”a1840fhd_ww_gen1”},”name”:”Water - Chlorine, pH”,”result”:[{“value”:”3.0”,”name”:”pH”,”id”:1},{“value”:”3.0”,”name”:”Chlorine”,”unit”:”mg/l”,”id”:2}],”testDate”:”2020-06-09 16:34”,”type”:”caddisfly”,”user”:{“language”:”en”,”isBackDropDetection”:true},”uuid”:”520ba67c-233f-4dc7-a2ad-17d86047d7c4”} | Poppins | 0101000020E61000004943A6E1CA50054079B8C312C0CE4440 |
You’ll find that we detect the question type and translate that to a
proper column type, e.g. timestamp with time zone
, geometry
,
double precision
, etc
select columns from dataset_version where table_name = 'ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7';
columns |
---|
[{“key”: false, “sort”: null, “type”: “text”, “title”: “Identifier”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “identifier”, “multipleId”: null, “multipleType”: null}, {“key”: true, “sort”: null, “type”: “text”, “title”: “Instance id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “instance_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Display name”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “display_name”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Submitter”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitter”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “Submitted at”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitted_at”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Surveyal time”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “surveyal_time”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Device Id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “device_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Name”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c583119147”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Age”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c594979148”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Likes Pizza”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c609479145”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Likes pasta”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c617319146”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Region”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601879162”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Current location”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601889144”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Photo”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601899165”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Video”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601879163”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “DOB”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c601899166”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Barcode”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c599649166”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geoshape”, “title”: “Shape”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289146”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “multiple”, “title”: “Shape Features”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289146_0”, “multipleId”: “615289146”, “multipleType”: “geo-shape-features”}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Signature”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c597909145”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “multiple”, “title”: “Caddisfly”, “hidden”: false, “groupId”: “597899156”, “metadata”: null, “direction”: null, “groupName”: “Repeated”, “columnName”: “c615289147”, “multipleId”: “520ba67c-233f-4dc7-a2ad-17d86047d7c4”, “multipleType”: “caddisfly”}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Family name”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c617309149”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Location”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c588869155”, “multipleId”: null, “multipleType”: null}] |
[
{
"multipleType": null,
"multipleId": null,
"columnName": "identifier",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Identifier",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "instance_id",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Instance id",
"type": "text",
"sort": null,
"key": true
},
{
"multipleType": null,
"multipleId": null,
"columnName": "display_name",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Display name",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "submitter",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Submitter",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "submitted_at",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Submitted at",
"type": "date",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "surveyal_time",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Surveyal time",
"type": "number",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "device_id",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Device Id",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c583119147",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Name",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c594979148",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Age",
"type": "number",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c609479145",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Likes Pizza",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c617319146",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Likes pasta",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c601879162",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Region",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c601889144",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Current location",
"type": "geopoint",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c601899165",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Photo",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c601879163",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Video",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c601899166",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "DOB",
"type": "date",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c599649166",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Barcode",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c615289146",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Shape",
"type": "geoshape",
"sort": null,
"key": false
},
{
"multipleType": "geo-shape-features",
"multipleId": "615289146",
"columnName": "c615289146_0",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Shape Features",
"type": "multiple",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c597909145",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Signature",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": "caddisfly",
"multipleId": "520ba67c-233f-4dc7-a2ad-17d86047d7c4",
"columnName": "c615289147",
"groupName": "Repeated",
"direction": null,
"metadata": null,
"groupId": "597899156",
"hidden": false,
"title": "Caddisfly",
"type": "multiple",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c617309149",
"groupName": "Non repeatable",
"direction": null,
"metadata": null,
"groupId": "617319144",
"hidden": false,
"title": "Family name",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c588869155",
"groupName": "Non repeatable",
"direction": null,
"metadata": null,
"groupId": "617319144",
"hidden": false,
"title": "Location",
"type": "geopoint",
"sort": null,
"key": false
}
]
The same dataset this time with the new import code
\d ds_92712882_698d_4fe6_ab20_98c5b9c253b3
Table “public.ds_92712882_698d_4fe6_ab20_98c5b9c253b3” | ||||
---|---|---|---|---|
Column | Type | Collation | Nullable | Default |
rnum | integer | not null | nextval(‘ds_92712882_698d_4fe6_ab20_98c5b9c253b3_rnum_seq’::regclass) | |
identifier | text | |||
instance_id | text | not null | ||
display_name | text | |||
submitter | text | |||
submitted_at | timestamp with time zone | |||
surveyal_time | double precision | |||
device_id | text | |||
c597899156 | text | |||
c617309149 | text | |||
c588869155 | geometry(Point,4326) | |||
Indexes: | ||||
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_pkey” PRIMARY KEY, btree (rnum) | ||||
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_instance_id_key” UNIQUE CONSTRAINT, btree (instance_id) | ||||
“ds_92712882_698d_4fe6_ab20_98c5b9c253b3_c588869155_idx” gist (c588869155) |
We have fewer columns because there is one RGQ that was transformed from several columns into a single one.
select columns from dataset_version where table_name = 'ds_92712882_698d_4fe6_ab20_98c5b9c253b3'
columns |
---|
[{“key”: false, “sort”: null, “type”: “text”, “title”: “Identifier”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “identifier”, “multipleId”: null, “multipleType”: null}, {“key”: true, “sort”: null, “type”: “text”, “title”: “Instance id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “instance_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Display name”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “display_name”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Submitter”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitter”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “date”, “title”: “Submitted at”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “submitted_at”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “number”, “title”: “Surveyal time”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “surveyal_time”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Device Id”, “hidden”: false, “groupId”: null, “metadata”: null, “direction”: null, “groupName”: null, “columnName”: “device_id”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “rqg”, “title”: “Repeated”, “hidden”: false, “groupId”: “597899156”, “metadata”: {“columns”: [{“id”: “c583119147”, “type”: “text”, “title”: “Name”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c594979148”, “type”: “number”, “title”: “Age”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c609479145”, “type”: “text”, “title”: “Likes Pizza”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c617319146”, “type”: “text”, “title”: “Likes pasta”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601879162”, “type”: “text”, “title”: “Region”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601889144”, “type”: “geopoint”, “title”: “Current location”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601899165”, “type”: “text”, “title”: “Photo”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601879163”, “type”: “text”, “title”: “Video”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c601899166”, “type”: “date”, “title”: “DOB”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c599649166”, “type”: “text”, “title”: “Barcode”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c615289146”, “type”: “geoshape”, “title”: “Shape”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c597909145”, “type”: “text”, “title”: “Signature”, “groupId”: null, “metadata”: null, “groupName”: null}, {“id”: “c615289147”, “type”: “multiple”, “title”: “Caddisfly”, “groupId”: null, “metadata”: null, “groupName”: null, “multipleId”: “520ba67c-233f-4dc7-a2ad-17d86047d7c4”, “multipleType”: “caddisfly”}]}, “direction”: null, “groupName”: “Repeated”, “columnName”: “c597899156”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “text”, “title”: “Family name”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c617309149”, “multipleId”: null, “multipleType”: null}, {“key”: false, “sort”: null, “type”: “geopoint”, “title”: “Location”, “hidden”: false, “groupId”: “617319144”, “metadata”: null, “direction”: null, “groupName”: “Non repeatable”, “columnName”: “c588869155”, “multipleId”: null, “multipleType”: null}] |
[
{
"multipleType": null,
"multipleId": null,
"columnName": "identifier",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Identifier",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "instance_id",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Instance id",
"type": "text",
"sort": null,
"key": true
},
{
"multipleType": null,
"multipleId": null,
"columnName": "display_name",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Display name",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "submitter",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Submitter",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "submitted_at",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Submitted at",
"type": "date",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "surveyal_time",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Surveyal time",
"type": "number",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "device_id",
"groupName": null,
"direction": null,
"metadata": null,
"groupId": null,
"hidden": false,
"title": "Device Id",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c597899156",
"groupName": "Repeated",
"direction": null,
"metadata": {
"columns": [
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Name",
"type": "text",
"id": "c583119147"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Age",
"type": "number",
"id": "c594979148"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Likes Pizza",
"type": "text",
"id": "c609479145"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Likes pasta",
"type": "text",
"id": "c617319146"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Region",
"type": "text",
"id": "c601879162"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Current location",
"type": "geopoint",
"id": "c601889144"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Photo",
"type": "text",
"id": "c601899165"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Video",
"type": "text",
"id": "c601879163"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "DOB",
"type": "date",
"id": "c601899166"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Barcode",
"type": "text",
"id": "c599649166"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Shape",
"type": "geoshape",
"id": "c615289146"
},
{
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Signature",
"type": "text",
"id": "c597909145"
},
{
"multipleType": "caddisfly",
"multipleId": "520ba67c-233f-4dc7-a2ad-17d86047d7c4",
"groupName": null,
"metadata": null,
"groupId": null,
"title": "Caddisfly",
"type": "multiple",
"id": "c615289147"
}
]
},
"groupId": "597899156",
"hidden": false,
"title": "Repeated",
"type": "rqg",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c617309149",
"groupName": "Non repeatable",
"direction": null,
"metadata": null,
"groupId": "617319144",
"hidden": false,
"title": "Family name",
"type": "text",
"sort": null,
"key": false
},
{
"multipleType": null,
"multipleId": null,
"columnName": "c588869155",
"groupName": "Non repeatable",
"direction": null,
"metadata": null,
"groupId": "617319144",
"hidden": false,
"title": "Location",
"type": "geopoint",
"sort": null,
"key": false
}
]
In the case of the RQG the definition of the columns gets embedded in
the metadata
key of the column holding all the group.
We want to reuse the existing infrastructure to deal with question types, transformation, versioning, etc.
It all starts with a data_source
that defines the the source of data
for a particular dataset.
A job_execution
keeps track of long running processes (import,
transformation) and links the data_source
with the resulting
dataset
and dataset_version
.
At the end the data is stored in a ds_*
table with the raw data in
a import_*
table, e.g.
select dataset_id, version, table_name, imported_table_name from dataset_version limit 1
dataset_id | version | table_name | imported_table_name |
---|---|---|---|
5ee6f025-e96b-4530-8d44-7f8edbd89b6c | 1 | ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7 | imported_e51ef10b_22e4_425f_a809_4b15f3234054 |
The table imported_e51ef10b_22e4_425f_a809_4b15f3234054
is always
used as starting point for running transformations, and producing a
fresh data table (e.g. ds_6d6b08ae_e38d_43ff_9437_0576e5afe6f7
)
The current hierarchy is
data_source
-> job_execution
-> dataset
-> dataset_version
In this model we have one of each (after import). One way of following
the same mental model is to define a new type of data_source
and
this spec
definition we store the required information of
parent/child relationship.
Current spec
for FLOW
source
select spec from data_source limit 1
spec |
---|
{“name”: “RQG - 1”, “source”: {“kind”: “AKVO_FLOW”, “email”: “***@akvo.org”, “formId”: “601879159”, “version”: 3, “instance”: “uat1”, “surveyId”: “615289140”}} |
{
"source": {
"surveyId": "615289140",
"instance": "uat1",
"version": 3,
"formId": "601879159",
"email": "***@akvo.org",
"kind": "AKVO_FLOW"
},
"name": "RQG - 1"
}
The spec
of a dataset for a RQG will use a different kind
and some
extra properties in source
{
"source": {
"surveyId": "615289140",
"instance": "uat1",
"version": 3,
"formId": "601879159",
"groupId": "597899156", // <-- RQG id
"email": "***@akvo.org",
"kind": "AKVO_FLOW_RQG" // <-- new kind
},
"name": "Repeated" // <- RQG name
}
With this approach we don’t need to touch too much the current data model. There are some open questions that need some thinking and decision.
- In order to filter datasets for library view, it will be easier to
just have a new flag in
dataset
that help us filter out RQG datasets. e.g.visible=false
Let’s remember that RQG as dataset is an implementation detail. - An import
job_execution
produces onedataset
and onedataset_version
. What happens now? Do we want a differentjob_execution
to produce the dataset for a RQG? Or change the behavior: Onejob_execution
produces 1 + ndatasets
anddataset_versions
(wheren
is the number of RQG).- The same applies for already imported datasets on Update and Transformation processes.
We don’t introduce a new data_source
. When importing we detect the
RQG and create one dataset
and dataset_version
for each RQG group.
The difference between this and the previous is:
- Don’t introduce a new
data_source
- A
dataset
can have multipledataset_version
withtable_name
pointing to the different parts of the dataset (main + RQGs)
select version, table_name, imported_table_name from dataset_version where dataset_id = '5ee6f658-5ca1-4b2a-bc4f-1b13cb095147' order by created desc;
version | table_name | imported_table_name |
---|---|---|
2 | ds_92712882_698d_4fe6_ab20_98c5b9c253b3 | imported_8597e45a_fbe7_42d7_bf6d_3f003bf97cc7 |
1 | imported_8597e45a_fbe7_42d7_bf6d_3f003bf97cc7 |
- The relationship parent/child is based on
dataset
->dataset_version
. We need a way to know that adataset_version
is a RQG so we can fetch the right data when requested. - We need to decide if we want one
job_execution
for alldataset_version
or one for each - Although this is an implementation detail it feels weird that the
distinction between a RQG and main dataset happens at the
dataset_version
level
- This is similar to the previous but we don’t break the current assumption that at one point in time a
dataset_version
only has one data table. - We need to introduce a flag at
dataset
level to define it as RQG - We need to introduce a parent/child relationship between
dataset
. This is to know to which main dataset a particular RQG belongs to.