Questions about the number of cores vs. the number of executors have been asked a number of times on SO:
Apache Spark: The number of cores vs. the number of executors
As each case is different, I'm asking a similar question again.
I'm running a CPU-intensive application with the same total number of cores but a different number of executors. Below are the observations.
Resource Manager: Kubernetes
Case 1: Executors - 6, Cores per executor - 2, Executor memory - 3g, Data processed ~ 10 GB, Partitions - 36, Job duration: 75 mins
Case 2: Executors - 4, Cores per executor - 3, Executor memory - 3g, Data processed ~ 10 GB, Partitions - 36, Job duration: 101 mins
As per the above link, anything less than 5 cores per executor is good for I/O throughput.
In both cases the total number of cores is the same (12), yet the two jobs took different amounts of time. Any thoughts?
Updated
Case 3: Executors - 12, Cores per executor - 1, Executor memory - 3g, Data processed ~ 10 GB, Partitions - 36, Job duration: 81 mins
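For reference, a run like Case 1 could be configured roughly as in the sketch below (the app name is a placeholder, and whether the 36 partitions come from the input or from the shuffle-partition setting is my assumption; the other numbers are taken from the cases above):

```scala
import org.apache.spark.sql.SparkSession

// Rough sketch of a Case 1 run: 6 executors, 2 cores each, 3g per executor.
// "cpu-intensive-job" is a placeholder name; the shuffle-partition setting is an
// assumption about where the 36 partitions come from.
val spark = SparkSession.builder()
  .appName("cpu-intensive-job")
  .config("spark.executor.instances", "6")      // Case 1: 6 executors
  .config("spark.executor.cores", "2")          // 2 cores per executor
  .config("spark.executor.memory", "3g")        // 3g per executor
  .config("spark.sql.shuffle.partitions", "36") // if the 36 partitions are shuffle partitions
  .getOrCreate()
```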
There are several possible explanations. First of all, not all nodes are born equal: it might be that one of the jobs got unlucky and landed on a slow node. Second, if you perform shuffle operations, having more nodes with the same total computation power will really slow your job down; after all, in a shuffle your data eventually has to be brought together on single nodes, and if such a node starts out with less of the data and less power, the operation is slower. But I suspect that even without shuffle operations, more nodes will be a bit slower, because there is a higher chance of a single node ending up with more work to do than the others.
Explanation:
Let's say I have a single node with 10 cores and 10 hours of work; I know it will take 1 hour. But if I have 2 nodes with 5 cores each, and the dataset happens to be partitioned so that one node gets 5.5 hours of work and the other 4.5 hours, the job will take 1.1 hours.
There is always an overhead price to pay for distributed computing, so it's usually faster to do the same work with the same resources on a single machine.
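The same arithmetic written out as a trivial sketch (the hours and core counts are just the hypothetical figures from the example above):

```scala
// Total work is fixed, but the job finishes only when the most-loaded node does,
// so skew stretches the wall-clock time.
val coresPerNode = 5
val workPerNode  = Seq(5.5, 4.5)                          // hours of work after a skewed split
val singleNode   = workPerNode.sum / 10                   // 10.0 hours / 10 cores = 1.0 hour
val twoNodes     = workPerNode.map(_ / coresPerNode).max  // max(1.1, 0.9) = 1.1 hours
println(f"single node: $singleNode%.1f h, two skewed nodes: $twoNodes%.1f h")
```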
Hope what I tried to say is clear.
In the first case you have 50% more memory to work with (3g * 6 = 18g vs. 3g * 4 = 12g) while having fewer locking issues (2 cores per executor instead of 3). Try dynamic allocation with 1 core per executor and see what happens.
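A minimal sketch of that suggestion, assuming Spark 3.x on Kubernetes (there is no external shuffle service there, so shuffle tracking is enabled instead; the min/max executor bounds and the app name are placeholders, not from the original post):

```scala
import org.apache.spark.sql.SparkSession

// Dynamic allocation with 1 core per executor; Spark scales the executor count
// up and down between the bounds below based on pending tasks.
val spark = SparkSession.builder()
  .appName("cpu-intensive-job")
  .config("spark.executor.cores", "1")
  .config("spark.executor.memory", "3g")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.dynamicAllocation.shuffleTracking.enabled", "true") // needed on K8s
  .config("spark.dynamicAllocation.minExecutors", "1")               // assumed bound
  .config("spark.dynamicAllocation.maxExecutors", "12")              // assumed bound
  .getOrCreate()
```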