@Crystalh
Last active December 19, 2015 08:39
Proposal for a script to merge regional meta and to manage CRUD functions for future component meta. It removes the need to hand-edit the master meta file.

Requirements & Features

  • provide a metadata template which meta producers must follow
  • easy to update: update entire components or specific properties of a component
  • Each region has a set of generic meta. New meta should include ONLY the unique keys/properties. For example, radio has the generic key name and the unique keys radio URL and sid, so a meta producer would provide only the URL and sid.
  • namespace the properties of a particular component, e.g. weatherLocations, radioSid, radioUrl, sportTeams, travelSearchTerm
  • Region URL is the key for each node
  • The script WILL NOT modify the metadata provided; it must already be in the correct format.
  • The script accepts either JSON or YAML as input; output format TBC.
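
As a rough illustration of the "only unique keys" and "must be in correct format" requirements above, here is a minimal sketch of a check that rejects a submission without altering it. The names GENERIC_KEYS, InvalidMetaError and validate_submission are hypothetical, and the list of generic keys is an assumption based on the "name" example.

```ruby
# Hypothetical list of generic keys that already exist for every region.
GENERIC_KEYS = %w[name].freeze

# Hypothetical custom error raised when a submission is rejected.
class InvalidMetaError < StandardError; end

# Checks a producer's submission (region URL => properties) without changing it.
# Raises InvalidMetaError if a region key is not a URL path, if the properties
# are not a mapping, or if a generic key has been supplied again.
def validate_submission(submission)
  submission.each do |region_url, properties|
    unless region_url.is_a?(String) && region_url.start_with?('/')
      raise InvalidMetaError, "region key must be a URL path: #{region_url.inspect}"
    end
    unless properties.is_a?(Hash)
      raise InvalidMetaError, "properties for #{region_url} must be a mapping"
    end
    duplicated = properties.keys & GENERIC_KEYS
    unless duplicated.empty?
      raise InvalidMetaError, "#{region_url} repeats generic keys: #{duplicated.join(', ')}"
    end
  end
  true
end
```

A submission such as { '/news/england/london' => { 'radioSid' => 'london' } } would pass, while one that repeats "name" would raise the custom error.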

Script functionality

  • view meta for a particular region, e.g. rake meta:view '/news/england/london'
  • add meta for a new feature/component, with basic data-format validation; exit with a custom error if the data is not valid
  • add meta properties for an existing component
  • warn when existing properties might be overwritten, and let the user continue or exit
  • delete all properties for a component, e.g. weather
  • delete specific properties for a component (passed in as an array), e.g. weatherLocations
  • delete generic properties e.g. name
  • rename an existing property, e.g. radioSid --> radioId
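
The operations listed above could be sketched as plain functions over a Hash keyed by region URL. This is only an illustration: every function name here is hypothetical, it assumes the flat, namespaced layout (radioSid, weatherLocations, ...), and the interactive "continue or exit" prompt is left to the caller.

```ruby
# Returns the keys in new_properties that already exist for the region,
# so the caller can warn before anything is overwritten.
def conflicting_keys(meta, region_url, new_properties)
  existing = meta.fetch(region_url, {})
  new_properties.keys.select { |key| existing.key?(key) }
end

# Returns a copy of meta with new_properties merged into the region.
def add_properties(meta, region_url, new_properties)
  meta.merge(region_url => meta.fetch(region_url, {}).merge(new_properties))
end

# Returns a copy of meta without the named properties for the region.
def delete_properties(meta, region_url, keys)
  return meta unless meta.key?(region_url)

  meta.merge(region_url => meta[region_url].reject { |key, _| keys.include?(key) })
end

# Returns a copy of meta without any property namespaced by the component,
# e.g. component 'weather' removes weatherLocations.
def delete_component(meta, region_url, component)
  return meta unless meta.key?(region_url)

  meta.merge(region_url => meta[region_url].reject { |key, _| key.start_with?(component) })
end

# Returns a copy of meta with old_key renamed to new_key in every region.
def rename_property(meta, old_key, new_key)
  meta.transform_values do |properties|
    properties.each_with_object({}) do |(key, value), renamed|
      renamed[key == old_key ? new_key : key] = value
    end
  end
end
```

Each function returns a new Hash rather than mutating its input, which keeps the master data untouched until the caller decides to write it out.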

Examples

Properties are nested by component type, e.g. radio:

 "/news/england/beds_bucks_and_herts": {
     "radio": {
         "name": "Beds, Herts & Bucks",
         "radioSid": "bedshertsandbucks"
     }
 },

Properties are namespaced by component type (unless they are generic properties, e.g. "name"), with no nesting:

 "/news/england/beds_bucks_and_herts": {
     "name": "Beds, Herts & Bucks",
     "radioSid": "threecounties",
     "travelSearchTerm": "bedshertsandbucks",
     "weatherLocations": {}
 }
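
To show how the two layouts above relate, here is a small sketch that converts the nested form into the flat, namespaced form. The function name flatten_region is hypothetical, and it assumes that property names inside each component are already namespaced (radioSid rather than sid), as they are in the nested example.

```ruby
# Converts one region's nested properties (component => properties)
# into a single flat Hash. Assumes property names are already namespaced,
# so merging the component Hashes cannot silently clash on unique keys.
def flatten_region(nested_properties)
  nested_properties.each_with_object({}) do |(_component, properties), flat|
    flat.merge!(properties)
  end
end

nested = {
  'radio' => {
    'name' => 'Beds, Herts & Bucks',
    'radioSid' => 'bedshertsandbucks'
  }
}

flat = flatten_region(nested)
# flat is { 'name' => 'Beds, Herts & Bucks', 'radioSid' => 'bedshertsandbucks' }
```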
@kkajero

kkajero commented Jul 4, 2013

I haven't read or thought about it in detail, but here are some of my thoughts:

  • Sounds good as a generic tool for CRUD operations on a regions meta "DB" :)
  • Need to focus on the current issue of multiple files.
  • Create a (ruby) script for the merge first, then deal with CRUD after that.
  • The merge script doesn't have to be a rake task. It'll probably be throwaway, but knowledge gained and the output will be of great benefit.
  • Good to stick with YAML for consistency with the recent move to YAML for application config.
  • Good and makes sense that the main keys of the data structure are News region index URLs.
  • When we get round to it, label the rake task clearly as something for Local regions, say rake update_regions_meta or simply rake regions_meta.
  • Probably just focus on operations that are really needed first. How likely are we to need all CRUD operations in the near future? Wait till we need an operation, before we add it?
  • If it's simple YAML, do you really need a "rake view"?
  • How do we know when the amount of data has become unmanageable? I.e., when does this become a crap solution?
  • Can we write some ruby tests (RSpec or something) to harness and tease out the script's implementation?
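
A throwaway merge along the lines suggested above might look something like this sketch. It assumes each regional file is a YAML mapping of region URL to properties; the function name merge_regional_meta and the meta/*.yml path in the usage comment are hypothetical.

```ruby
require 'yaml'

# Merges several YAML documents (given as strings) into one Hash keyed by
# region URL. When two documents describe the same region, their properties
# are combined, with later documents winning on duplicate keys.
def merge_regional_meta(yaml_documents)
  yaml_documents.each_with_object({}) do |document, master|
    (YAML.safe_load(document) || {}).each do |region_url, properties|
      master[region_url] = (master[region_url] || {}).merge(properties || {})
    end
  end
end

# Possible usage (hypothetical directory layout):
#   documents = Dir['meta/*.yml'].sort.map { |path| File.read(path) }
#   File.write('regions_meta.yml', merge_regional_meta(documents).to_yaml)
```

Because the function takes strings and returns a Hash, it is straightforward to exercise from a test without touching the file system.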

@kkajero

kkajero commented Jul 4, 2013

Also, I'm not sure there's a need to mandate another file or file snippet as the argument. The fact that we currently have multiple files doesn't mean we must put arguments into a file in a certain format in order to perform operations on the master file in the future.

@Crystalh
Author

Crystalh commented Jul 4, 2013

@kkajero Agreed that we'll first focus on the task of merging all of the meta files together. I also agree about adding features only as and when they are needed. It could be a lot of effort to add all of the CRUD features when they may rarely or never be used.

  • YAML will be the output format
  • we can use Sam's original Ruby script as a starting point
  • we'll write some Ruby tests, and will find out what works best for a data manipulation script

In answer to your question "How do we know when this becomes a bad solution?": this is tough to quantify.

From a performance PoV: it should be highly cached with APC. I had a chat with John and he suggested using a rudimentary tool such as ab (ApacheBench) on sandbox, first on the merged file and then on a potentially much larger file (faked by repeating the same data).

As a later optimisation, we could use something like Rake or Grunt to split out the meta based on region index. That way we can serve up e.g. england_derbyshire.yml (gzipped), which should make the payload a lot smaller than a massive merged file.
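
The split could be sketched roughly as below. The function names are hypothetical, and the file naming rule (drop the leading /news/ and join the remaining path segments with underscores) is an assumption inferred from the england_derbyshire.yml example; gzipping is left out.

```ruby
require 'yaml'

# Derives a file name from a region index URL,
# e.g. '/news/england/derbyshire' becomes 'england_derbyshire.yml'.
def region_filename(region_url)
  region_url.sub(%r{\A/news/}, '').tr('/', '_') + '.yml'
end

# Splits the master Hash (region URL => properties) into one YAML document
# per region. Returns file name => YAML string; writing to disk is up to
# the caller.
def split_by_region(master)
  master.each_with_object({}) do |(region_url, properties), files|
    files[region_filename(region_url)] = { region_url => properties }.to_yaml
  end
end
```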
