In the course of working on pachyderm/pachyderm#2505, JD and I ran into a conflict between that design and our auth model:
If `PipelineInfo` documents are stored in output repos, then e.g. `ListPipeline` and `InspectPipeline` have no way to retrieve `PipelineInfo`s for users who don't have access to the pipeline's output repo.
This means that `ListPipeline` no longer returns all `PipelineInfo`s in a DAG (and may not even return most of them), which I believe breaks some of the assumptions in our dashboard rendering algorithm (see Alternatives Considered for some of the conceptual problems I ran into while trying to think of solutions).
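To make the fragmented-DAG problem concrete, here is a minimal sketch in Go. Everything in it is hypothetical: `Pipeline` is a simplified stand-in for `PipelineInfo`, and `visiblePipelines` and `danglingInputs` are illustrative helpers, not existing Pachyderm functions. It assumes `ListPipeline` can only return a pipeline when the caller can read that pipeline's output repo.

```go
package main

import "fmt"

// Pipeline is a simplified, hypothetical stand-in for PipelineInfo.
type Pipeline struct {
	Name   string
	Inputs []string // names of input repos
	Output string   // name of the output repo
}

// visiblePipelines mimics a ListPipeline that reads PipelineInfo out of
// output repos: a pipeline is returned only if the caller can read its output.
func visiblePipelines(all []Pipeline, canRead map[string]bool) []Pipeline {
	var out []Pipeline
	for _, p := range all {
		if canRead[p.Output] {
			out = append(out, p)
		}
	}
	return out
}

// danglingInputs returns the input repos of visible pipelines whose producing
// pipeline exists in the full DAG but is missing from the visible set.
func danglingInputs(all, visible []Pipeline) []string {
	producer := map[string]string{} // output repo -> pipeline name
	for _, p := range all {
		producer[p.Output] = p.Name
	}
	shown := map[string]bool{}
	for _, p := range visible {
		shown[p.Name] = true
	}
	seen := map[string]bool{}
	var out []string
	for _, p := range visible {
		for _, in := range p.Inputs {
			name, produced := producer[in]
			if produced && !shown[name] && !seen[in] {
				seen[in] = true
				out = append(out, in)
			}
		}
	}
	return out
}

func main() {
	// A three-stage DAG: raw -> clean -> features -> model.
	all := []Pipeline{
		{Name: "clean", Inputs: []string{"raw"}, Output: "clean"},
		{Name: "features", Inputs: []string{"clean"}, Output: "features"},
		{Name: "model", Inputs: []string{"features"}, Output: "model"},
	}
	// The caller can read only the final output repo.
	visible := visiblePipelines(all, map[string]bool{"model": true})
	fmt.Println("visible pipelines:", len(visible))
	fmt.Println("inputs with hidden producers:", danglingInputs(all, visible))
}
```

In this example the caller sees a single pipeline whose input is produced by a pipeline they cannot see, so anything that renders the DAG from these results has an edge with no source.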
While we could hack around this issue (again, see Alternatives Considered below), I think this may be an opportunity to move our auth system in the direction of a role-based auth system similar to the ones in GCP, AWS, and etcd.
- The most obvious alternative is to call the incomplete results of `ListPipeline` a feature: @gabrielgrant and @JoeyZwicker have been interested in restricting discoverability for a while (for large organizations with many independent teams managing roughly independent DAGs). Not returning all pipelines from `ListPipeline` seems like a good step in that direction.
  - The problem with this is that I couldn't come up with a definition of discoverability that made sense.
    - For example, if `ListPipeline` only returns a subset of pipelines, but `ListRepo` returns all repos, then users may see a large number of orphan repos, which is probably useless (e.g. in the context of a large organization with many independent teams). So `ListRepo` also needs to be restricted.
    - As well, if `ListPipeline` returns all pipelines such that the caller (`U`) has access to the pipeline's output repo, then `ListPipeline` may return pipelines where the caller can't list (or inspect) all of the pipeline's inputs, so whatever the discoverability criteria for repos are, `ListPipeline` must check that `U` can read the pipeline's output and discover all of the pipeline's inputs.
    - `ListPipeline` could return pipelines where `U` has `READER` access to all of the pipeline's inputs as well as its output, but if repos are only discoverable to users who already have `READER` access to them, then users have no way to ask for access to new repos.
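The last rule above can be sketched as follows. This is an illustration under stated assumptions, not a proposed implementation: `Scope`, `Pipeline`, `discoverable`, `listPipeline`, and `requestable` are all hypothetical local names rather than Pachyderm's auth API, and the sketch assumes a repo is discoverable only to callers who already have `READER` access to it.

```go
package main

import "fmt"

// Scope is a local stand-in for an access level; it is not Pachyderm's auth API.
type Scope int

const (
	None Scope = iota
	Reader
)

// Pipeline is a simplified, hypothetical stand-in for PipelineInfo.
type Pipeline struct {
	Name   string
	Inputs []string
	Output string
}

// discoverable encodes the candidate rule: a repo is discoverable only to
// callers who already have at least Reader access to it.
func discoverable(acl map[string]Scope, repo string) bool {
	return acl[repo] >= Reader
}

// listPipeline returns the pipelines whose output and every input the caller
// can read.
func listPipeline(all []Pipeline, acl map[string]Scope) []Pipeline {
	var out []Pipeline
	for _, p := range all {
		ok := acl[p.Output] >= Reader
		for _, in := range p.Inputs {
			ok = ok && acl[in] >= Reader
		}
		if ok {
			out = append(out, p)
		}
	}
	return out
}

// requestable returns the repos the caller can discover but cannot yet read,
// i.e. the repos they could meaningfully ask for access to.
func requestable(repos []string, acl map[string]Scope) []string {
	var out []string
	for _, r := range repos {
		if discoverable(acl, r) && acl[r] < Reader {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	all := []Pipeline{
		{Name: "clean", Inputs: []string{"raw"}, Output: "clean"},
		{Name: "model", Inputs: []string{"clean"}, Output: "model"},
	}
	// The caller can read "raw" and "clean" but has no access to "model".
	acl := map[string]Scope{"raw": Reader, "clean": Reader}
	fmt.Println("pipelines listed:", len(listPipeline(all, acl)))
	fmt.Println("repos the caller could request:",
		len(requestable([]string{"raw", "clean", "model"}, acl)))
}
```

Because `discoverable` and the read check use the same condition, `requestable` can never return anything under this rule, which is the dead end described in the last bullet: the set of repos a user can find is exactly the set they can already read.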