In Kubernetes, I am a little unclear on what criteria need to be met for OpenFaaS to scale a function's replicas up or down.
According to the documentation:
Auto-scaling in OpenFaaS allows a function to scale up or down depending on demand represented by different metrics.
So it sounds like, by default, scaling is triggered by a rise or fall in requests per second.
OpenFaaS ships with a single auto-scaling rule defined in the mounted configuration file for AlertManager. AlertManager reads usage (requests per second) metrics from Prometheus in order to know when to fire an alert to the API Gateway.
And this "alert" sent to the API Gateway would cause a function's replica count to scale up.
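To make my understanding concrete, here is a rough Python sketch of the flow as I read it from the documentation. This is not OpenFaaS code: `ASSUMED_RPS_THRESHOLD`, the `Function` class, the replica limits, and the one-replica step are all placeholders I made up, since the real values are exactly what I can't find. The scale-down branch in particular is a guess on my part.

```python
from dataclasses import dataclass

# Hypothetical placeholder: the real threshold is what I am asking about.
ASSUMED_RPS_THRESHOLD = 5.0


@dataclass
class Function:
    """Stand-in for a deployed function; the limits are assumed values."""
    name: str
    replicas: int
    min_replicas: int = 1
    max_replicas: int = 5


def alert_is_firing(requests_per_second: float,
                    threshold: float = ASSUMED_RPS_THRESHOLD) -> bool:
    """Mimics the alert rule: fire when the request rate exceeds the threshold."""
    return requests_per_second > threshold


def handle_alert(fn: Function, firing: bool, step: int = 1) -> int:
    """Mimics the gateway reacting to an alert by changing the replica count.

    Assumption: a firing alert adds `step` replicas (capped at max_replicas),
    and a non-firing alert removes `step` replicas (floored at min_replicas).
    """
    if firing:
        fn.replicas = min(fn.replicas + step, fn.max_replicas)
    else:
        fn.replicas = max(fn.replicas - step, fn.min_replicas)
    return fn.replicas


if __name__ == "__main__":
    fn = Function(name="echo", replicas=1)
    for rps in (2.0, 8.0, 9.0, 1.0):
        handle_alert(fn, alert_is_firing(rps))
        print(f"rps={rps} replicas={fn.replicas}")
```

If this model is roughly right, the parts I am missing are the actual threshold value, the step size, and whether scaling down really mirrors scaling up.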
I don't see, in either the documentation or the AlertManager configuration, where the requests/second threshold that triggers scaling up or down is set.
My overall questions: