Insights Builder is a module that allows to query different source of data in order to extract information and transform it in a format which can be used to generate dynamic charts.
In order to extract the information, the following steps are followed by the module:
- Set configuration
- Build query
- Get data (execute query)
- Transform data
The following are the available configurations that can be set:
- Source
- X Axis
- Inverval
- X Segment
- Y Axis
- Y Operation
- Filters
Configuration Name | Required |
---|---|
Source | true |
X Axis | true |
X Segment | false |
Y Axis | true |
Y Operation | true |
Filters | false |
The query is built depending on the configurations set. The following are the different ways in which a query is built:
{
aggs: {
by_main_grouping: {
terms: {
field: value
},
aggs: {
by_operation {
operation: {
field: value
}
}
}
}
}
}
{
query: {
bool: {
must: [
{terms: {field: value}}
]
}
},
aggs: {
by_main_grouping: {
terms: {
field: value
},
aggs: {
by_segment: {
terms: {
field: value
},
aggs: {
by_operation: {
operation: {
field: value
}
}
}
}
}
}
}
}
A query built with the following configurations results in the following responses:
Results in:
{
by_main_grouping: [
{
by_operation: {
value: value
}
}
]
}
Results in:
{
by_main_grouping: [
{
by_segment: [
{
by_operation: {
value: value
}
}
]
}
]
}
After getting a result, a series of transformations can be executed, the following are the supported transformations:
- Resolve location names (X Axis)
- Convert result values to the total percentage (Y Axis)
- Convert result values to percentage change (Y Axis)
- Convert result values to the corresponding value by 100k inhabitants (Y Axis)
The module has a set of rules and restrictions that applied to each configuration depending on the value which are listed below:
There are only two supported sources which are: internal and secretariat
The X Axis can be any of the following depending on the source selected:
- internal
- crime_category:
String
- loss_amount:
Number
- location (municipality, state):
String
- source_name:
String
- reporter_name:
String
- incident_date:
Date
- crime_category:
- secretariat
- crime_category:
String
- crime_type:
String
- modality:
String
- location (municipality, state, country):
String
- date:
Date
- value:
Number
- value_100k:
Number
- crime_category:
When a field of type date is selected, an interval configuration can be set in order to group the resulting data by the interval specified, the following values are supported:
interval String |
---|
month |
trimester |
year |
The X Segment can be any of the following depending on the X Axis selected (with the restriction that the selected X Axis cannot be set as X Segment):
- internal
- crime_category:
String
- location (municipality, state):
String
- loss_amount:
String
- source_name:
String
- reporter_name:
String
- crime_category:
- secretariat
- crime_category:
String
- crime_type:
String
- modality:
String
- location (municipality, state, country):
String
- crime_category:
The Y Axis can be any of the following depending on the following:
- internal
- crime_category:
String
- loss_amount:
Number
- source_name?:
String
- reporter_name?:
String
- crime_category:
- secretariat
- crime_category:
String
- crime_type:
String
- modality:
String
- value:
Number
- value_100k:
Number
- crime_category:
The Y Operation can be any of the following depending on the selected source and data type of the Y Axis:
- internal
- string
- count
- number
- sum
- average
- string
- secretariat
- string
- count (resolves to the sum of the value field)
- by 100k inhabitants (resolves to the sum of the value_100k field)
- total percentage (resolves to the sum of the value field)
- percentage change (resolves to the sum of the value field)
- string
The filters are a way to get only the data you want, these filters depends on the data type of the selected field used to filter, the following are the available fields:
- internal
- crime_category:
String
- location (municipality, state):
String
- incident_date:
Date
- crime_category:
- secretariat
- crime_category:
String
- crime_type:
String
- modality:
String
- location (municipality, state, country):
String
- date:
Date
- crime_category:
For string fields a terms query is used in order to apply the filter
{
terms: {
field: [values]
}
}
For date fields a range query is used in order to apply the filter
{
range: {
field: {
gte: start_date,
lte: end_date
}
}
}
In order to be able to generate charts that contain results with more than one of the different types of locations (i.e. municipalities, states, countries) is necessary to specify this information.
Discrete locations are a way to get the data of different types of locations in the same chart and also filter the data by the locations selected.
For example, you want to generate a chart that contains two municipalities from Aguascalientes, the state of Coahuila along with the National rate of homicides. In order to get this information a list of location IDs must be sent.
locs: ['ags_municipality1_id', 'ags_municipality2_id', 'coahuila_state_id', 'mexico_country_id']
Behind the scene the module will convert the list of locations into a dictionary where the keys are the different types of locations and the values are the list of IDs of those types, like the following:
{
'country_id': ['mexico_country_id'],
'state_id': ['coahuila_state_id'],
'municipality_id': ['ags_municipality1_id', 'ags_municipality2_id']
}
In order to be able to generate charts that contain results with more than one of the different types of crime levels (i.e. crime_category, crime_type, modality) is necessary to specify this information.
Discrete crimes are a way to get the data of different levels of crimes in the same chart and also filter the data by the crimes selected.
For example, you want to generate a chart that contains the categories Homicidio y Robo, the types Homicidio Doloso along with the modality Con arma de fuego. In order to get this information a list of crimes must be sent.
For each crime level a query will be built, executed and the final results will be merged resulting in the desired dataset.
# Compare the anual rate of "robo de coche de 4 ruedas sin violencia"
# of 5 municipalities from Aguascalientes with the rate of the
# Aguascalientes state and the National one.
gb = GraphicBuilderService(owner, GraphicBuilderService.SECRETARIAT)
gb.set_x_axis('location')
gb.set_y_axis('modality', 'count')
gb.set_date_filter('date', '2017-01:01 00:00:00', '2017-12-31 00:00:00')
gb.set_filter('modality', ['robo de coche de 4 ruedas sin violencia'])
gb.set_discrete_field('location', ['0924850f-abe4-407c-80d8-175416b169bb', 'c9b46887-24f2-414a-8f39-07f273a1de2a', '4aa2137a-c19a-4db8-8f4d-1409c0625b70', 'c68c4d2a-b34e-4a22-a2da-d71bceaa1944'])
data = gb.execute()
# Compare the percentage change of the first trimester in 2018
# with the same trimester of the former year of
# "robo comun con violencia a transeuntes"
# in five municipalities of Chihuahua with the state of Chihuahua,
# three municipalities of Nuevo León and the neighbor state of Nuevo León
gb = GraphicBuilderService(owner, GraphicBuilderService.SECRETARIAT)
gb.set_x_axis_date('date', 'trimester')
gb.set_x_segment('location')
gb.set_y_axis('modality', 'percentual change')
gb.set_date_filter('date', '2017', '2018', 'year')
gb.set_filter('modality', ['robo comun con violencia a transeuntes'])
gb.set_discrete_field('location', ['0924850f-abe4-407c-80d8-175416b169bb', 'c9b46887-24f2-414a-8f39-07f273a1de2a', '4aa2137a-c19a-4db8-8f4d-1409c0625b70', 'c68c4d2a-b34e-4a22-a2da-d71bceaa1944'])
data = gb.execute()
# Compare the 100k inhabitants rate from January of 2017 to February of 2017
# of "lesiones dolosas con arma de fuego" in one municipality of Aguascalientes,
# two of Zacatecas, one of Guanajuato, one of SLP and one of Tlaxcala
gb = GraphicBuilderService(owner, GraphicBuilderService.SECRETARIAT)
gb.set_x_axis_date('date', 'month')
gb.set_x_segment('location')
gb.set_y_axis('modality', '100k')
gb.set_date_filter('date', '2016-01:01 00:00:00', '2017-01-28 00:00:00')
gb.set_filter('modality', ['lesiones dolosas con arma de fuego'])
gb.set_discrete_field('location', ['0924850f-abe4-407c-80d8-175416b169bb', 'c9b46887-24f2-414a-8f39-07f273a1de2a', '4aa2137a-c19a-4db8-8f4d-1409c0625b70', 'c68c4d2a-b34e-4a22-a2da-d71bceaa1944'])
data = gb.execute()