Node affinity problems on sysctl daemonset

I see there are some nodes (the aks-usermoose-* ones) that have this taint:

    "taints": [
      {
        "effect": "NoSchedule",
        "key": "kubernetes.azure.com/scalesetpriority",
        "value": "spot"
      }
    ],

But the sysctl daemonset should be allowed onto those nodes, since we specify this toleration on it:

    tolerations:
      - operator: Exists
        key: kubernetes.azure.com/scalesetpriority

It is even a wide rule, since we are just requiring that the key exists regardless of its value, so to some extent this is not a taints-and-tolerations problem. As you pointed out, the toleration applied to the daemonset should be enough for the pods to be scheduled, together with the nodeSelector, whose label is present on all nodes.
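
For comparison, a toleration scoped explicitly to the spot taint would look like the sketch below. The Exists form above already covers it, so this is only to make the intent explicit, not a required change:

    tolerations:
      - key: kubernetes.azure.com/scalesetpriority
        operator: Equal
        value: spot
        effect: NoSchedule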

So, some thoughts:

  • The weird thing I don't understand is why the pods are not being scheduled on those nodes, while one pod did get scheduled on a node with no taints (just because of the node selector?).

  • But the nodes with the taints also have the same label that matches the node selector.

  • The nodeSelector tells me the daemonset should be able to be placed on all nodes, since they all have that label (see the sketch after these notes for what I assume that selector looks like).

  • Also, the rolling update strategy seems fine to allow all pods to be deployed:

    updateStrategy:
      rollingUpdate:
        maxSurge: 0
        maxUnavailable: 100%
      type: RollingUpdate

and the nodes also have enough allocatable CPU and memory.
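
For context, the daemonset's nodeSelector is not shown in this gist. Assuming it keys on a label that every node carries, it would look roughly like the sketch below (the label is a hypothetical placeholder, not necessarily the one actually used):

    nodeSelector:
      kubernetes.io/os: linux   # hypothetical placeholder for the real label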

Since the error on the pending pods has to do with their node affinity:

    Warning  FailedScheduling  3m49s (x15 over 74m)  default-scheduler  0/23 nodes are available:
    1 node(s) didn't match Pod's node affinity/selector, 22 node is filtered out by the prefilter result.
    preemption: 0/23 nodes are available: 23 Preemption is not helpful for scheduling
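
One possible reading of that message: since Kubernetes 1.12 the DaemonSet controller adds a required node affinity to each of its pods that pins the pod to a single node by name. The prefilter therefore excludes the other 22 nodes, and the one remaining node is the pod's target node failing the affinity/selector check, which points back at the nodeSelector (or a label missing on that node) rather than at the taints. The injected affinity looks roughly like this sketch (the node name is a hypothetical example):

    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchFields:
            - key: metadata.name
              operator: In
              values:
              - aks-usermoose-12345678-vmss000000   # hypothetical node name

Inspecting a pending pod's YAML (kubectl get pod -o yaml) should show which node it is pinned to, and whether that node actually carries the label the nodeSelector expects.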

I also added a node affinity to the daemonset, like the following, so that spot instances are preferred rather than required:

    affinity:
      nodeAffinity:
        preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
            - key: kubernetes.azure.com/scalesetpriority
              operator: In
              values:
              - spot

So with this, not only the nodeSelector but also the node affinity should place the pods on matching nodes, but without luck so far.
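
For reference, the scheduling-related part of the pod template then combines all three pieces, as in the sketch below (reusing the hypothetical nodeSelector label from earlier). Note that a preferred affinity only ranks nodes that already pass the required filters (the nodeSelector and the node affinity the DaemonSet controller injects), so it cannot unblock pods that those filters reject:

    spec:
      nodeSelector:
        kubernetes.io/os: linux   # hypothetical placeholder for the real label
      tolerations:
      - operator: Exists
        key: kubernetes.azure.com/scalesetpriority
      affinity:
        nodeAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 1
            preference:
              matchExpressions:
              - key: kubernetes.azure.com/scalesetpriority
                operator: In
                values:
                - spot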
