EKS: Unable to Add a New Classic Load Balancer?

I was working on a project that required adding a custom webhook to a pre-existing service running in our EKS cluster. I needed the ingress nginx controller because the webhook required a custom server block, and since this particular cluster didn’t seem to have any externally facing nginx ingress controllers, I decided to add a new one for this use case.

Helm chart added, Argo app added, deployed OK. All good. A new nginx ingress controller pod spun up, and everything looked good for me to add the test ingress for my new webhook.

[~/temp/yaml]# kubectl apply -f test_webhook.yaml

Ingress deployed OK, so let’s see where it attaches and make sure the DNS records get added automatically via the handy external-dns controller. After a few minutes I check, and the webhook hostname resolves. Something is off though. I can’t seem to get through to the backend service. I simplify the webhook, spin up a plain nginx pod for testing and re-point the ingress to my test nginx service. Nothing. No ‘welcome to nginx’ page. Just an error code. I fiddle around for a while and then decide to look at the load balancer…

NLB. Huh. Why did this spin up an NLB? I have plenty of the same version of ingress nginx running in other clusters and those are using classic LBs. I notice that the cert is also attached to the NLB, which is new to me since I don’t normally use NLBs that much. After fiddling with it I start digging through the GitHub issues for the controller to see if I have run up against anything weird. No dice, but there are a number of comments about how classic LBs are deprecated anyway. Sure, I ‘know’ that, but I also know they aren’t going anywhere and my existing ones work just fine on the same controller elsewhere. What gives?

In Kubernetes it is actually the service resource for the nginx controller that causes the LB to get created, and the creation isn’t done by the ingress nginx controller at all. It’s built into EKS. Oddly, I notice that the nginx controller service definition has a field showing the load balancer type as ‘nlb’. Was this in the chart and I missed it? Some new values.yaml field that I needed to set for the load balancer type? Looking further, I found that the nlb field I could clearly see on the nginx service definition inside the running cluster wasn’t in the chart at all. More head scratching. Where was the .spec.loadBalancerClass field being set?
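For anyone chasing the same thing, here is a minimal sketch of the two checks that would have saved me some time: read the load balancer class off the running service, and list the mutating webhooks that could be rewriting it. The helper names are made up for illustration, and the namespace and service name you pass in depend entirely on how your chart was installed.

```shell
# Hypothetical helper: print the loadBalancerClass set on a Service.
# Empty output means the field is not set on that Service.
lb_class() {
  # $1 = namespace, $2 = service name (both depend on your install)
  kubectl -n "$1" get service "$2" -o jsonpath='{.spec.loadBalancerClass}'
}

# Hypothetical helper: list every mutating webhook configuration in the
# cluster, one per line, to spot anything that might be rewriting Services.
list_mutating_webhooks() {
  kubectl get mutatingwebhookconfigurations -o name
}
```

With a default-style install, something like `lb_class ingress-nginx ingress-nginx-controller` would be the call (those names are assumptions). As far as I can tell, a value of `service.k8s.aws/nlb` points at the AWS Load Balancer Controller rather than at anything in the nginx chart.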

After some digging I found the answer, elsewhere. This is one of those cases where reading all the release notes for everything you are running pays off, and I hadn’t done that. The AWS Load Balancer Controller (the ‘ALB controller’) was actually the culprit here, or rather it was the fact that I hadn’t read its recent release notes. Note that the ALB controller is its own thing completely. It never occurred to me that one service in Kubernetes might be designed to change a resource that was part of another service. Sure, I had seen some admission stuff in the past, but we didn’t have any of that in place, or so I thought.

https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.5.1

🚨 🚨 🚨 We have made the LBC the default controller for service type LoadBalancer by adding a mutating webhook. You can disable the feature by setting the helm chart value enableServiceMutatorWebhook to false. You will no longer be able to provision new Classic Load Balancer (CLB) from your kubernetes service unless you disable this feature.

Oh! So they created a mutating webhook that assumes no other chart in the cluster ships a service of type: LoadBalancer, and this webhook prevents classic LBs from being created from any other service resource. Sheesh. All I wanted was to start testing out an ingress, and somehow I came all the way around to having to make a change on a mostly unrelated service. I disabled the mutating webhook, redeployed the new nginx controller, this time verifying a classic LB was created, and finally got a basic ingress working. I know at some point I will have to get rid of my classic load balancers, but this wasn’t the day for that. Maybe soon I will get my new custom webhook working in my cluster, but today that ALB controller yak needed to be shaved.
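If you manage the controller with plain Helm rather than through Argo, the change might look roughly like the sketch below. The only thing taken from the release notes is the enableServiceMutatorWebhook value; the helper name is made up, the release name and namespace are whatever your install used, and the chart reference assumes the eks-charts repo was added under the alias `eks`.

```shell
# Hypothetical helper: turn off the LBC service mutator webhook via Helm.
# $1 = Helm release name, $2 = namespace (both depend on your install).
# Assumes the eks-charts repo is already added under the alias "eks".
disable_service_mutator() {
  helm upgrade "$1" eks/aws-load-balancer-controller \
    --namespace "$2" \
    --reuse-values \
    --set enableServiceMutatorWebhook=false
}
```

In an Argo setup the same value would go into the app's Helm values instead. Either way, a service that has already been mutated keeps its load balancer class, so it needs to be deleted and recreated after the webhook is off, which is why I redeployed the nginx controller.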

Onwards.