We deployed around 60 pods on AKS that use Rebus with RabbitMQ. During initialization, around 15 of the pods restart several times before they reach the Running state. The components throw the following error:
*Unhandled Exception: Rebus.Injection.ResolutionException: Could not resolve Rebus.Bus.IBus with decorator depth 0 - registrations: Rebus.Injection.Injectionist+Handler ---> RabbitMQ.Client.Exceptions.BrokerUnreachableException: None of the specified endpoints were reachable ---> System.AggregateException: One or more errors occurred. ---> RabbitMQ.Client.Exceptions.ConnectFailureException: Connection failed ---> System.Net.Sockets.SocketException: No such host is known
at System.Net.Dns.HostResolutionEndHelper(IAsyncResult asyncResult)
at System.Net.Dns.EndGetHostAddresses(IAsyncResult asyncResult)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at RabbitMQ.Client.TcpClientAdapter.<ConnectAsync>d__2.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at RabbitMQ.Client.Impl.TaskExtensions.<TimeoutAfter>d__1.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at RabbitMQ.Client.Impl.SocketFrameHandler.ConnectOrFail(ITcpClient socket, AmqpTcpEndpoint endpoint, Int32 timeout)
--- End of inner exception stack trace ---
at RabbitMQ.Client.Impl.SocketFrameHandler.ConnectUsingAddressFamily(AmqpTcpEndpoint endpoint, Func`2 socketFactory, Int32 timeout, AddressFamily family)
at RabbitMQ.Client.Impl.SocketFrameHandler..ctor(AmqpTcpEndpoint endpoint, Func`2 socketFactory, Int32 connectionTimeout, Int32 readTimeout, Int32 writeTimeout)
at RabbitMQ.Client.ConnectionFactory.CreateFrameHandler(AmqpTcpEndpoint endpoint)
at RabbitMQ.Client.EndpointResolverExtensions.SelectOne[T](IEndpointResolver resolver, Func`2 selector)
--- End of inner exception stack trace ---
at RabbitMQ.Client.EndpointResolverExtensions.SelectOne[T](IEndpointResolver resolver, Func`2 selector)
at RabbitMQ.Client.Framing.Impl.AutorecoveringConnection.Init(IEndpointResolver endpoints)
at RabbitMQ.Client.ConnectionFactory.CreateConnection(IEndpointResolver endpointResolver, String clientProvidedName)
--- End of inner exception stack trace ---
at RabbitMQ.Client.ConnectionFactory.CreateConnection(IEndpointResolver endpointResolver, String clientProvidedName)
at Rebus.Internals.ConnectionManager.GetConnection()
at Rebus.RabbitMq.RabbitMqTransport.CreateQueue(String address)
at Rebus.Config.RebusConfigurer.<>c__DisplayClass12_0.<Start>b__26(IResolutionContext c)
at Rebus.Injection.Injectionist.ResolutionContext.Get[TService]()
--- End of inner exception stack trace ---
at Rebus.Injection.Injectionist.ResolutionContext.Get[TService]()
at Rebus.Injection.Injectionist.Get[TService]()
at Rebus.Config.RebusConfigurer.Start()
at Castle.Windsor.Installer.AssemblyInstaller.Install(IWindsorContainer container, IConfigurationStore store)
at Castle.Windsor.WindsorContainer.Install(IWindsorInstaller[] installers, DefaultComponentInstaller scope)
at Castle.Windsor.WindsorContainer.Install(IWindsorInstaller[] installers)
at RebusHost.Main(String[] args)*
The RabbitMQ server is reachable, yet some pods throw this error on startup and only reach the Running state after 3 to 5 restarts. I am not sure what prevents these pods from connecting on the first attempt. Any clue would be appreciated.
We are using Rebus 4.0 and RabbitMq 5.1.0.0. The components (pods) are deployed on a Windows node of AKS, and RabbitMQ runs on a Linux node of AKS from the "rabbitmq:3-management" Docker image.