Skip to content

Instantly share code, notes, and snippets.

@milimetric
Created March 9, 2022 17:02
Show Gist options
  • Save milimetric/df0b4779bea4b1d0e8cdfb9bf867a328 to your computer and use it in GitHub Desktop.
Save milimetric/df0b4779bea4b1d0e8cdfb9bf867a328 to your computer and use it in GitHub Desktop.
Thoughts on DAG generation
projectview_ready = HiveTriggeredHQLTaskFactory(
'run_hql_and_arhive',
default_args=default_args,
...
)
archive = ArchiveTaskFactory(...)
projectview_ready.sensors() >> projectview_ready.etl() >> archive()
class HiveTriggeredHQLTaskFactory(...):
def __init__(sources, ...):
self.sources = ...
def sensors(self):
return self._build_sensors(self.sources, ...)
def etl(self):
# I looked and it seems that dag= is not required
# https://airflow.apache.org/docs/apache-airflow/1.10.6/_api/airflow/models/index.html#airflow.models.BaseOperator
return SparkSqlOperator(...)
cass ArchiveTaskFactory(...):
def __init__(...):
...
def __call__(...):
return ArchiveOperator(...)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment