Creating HA clusters with kubeadm

6/5/2018

I am building an HA cluster with kubeadm, following this guide:
https://kubernetes.io/docs/setup/independent/

The environment I use is Ubuntu server 16.04 on AWS.

I ran into a problem while building the environment: running kubeadm init --config=config.yaml fails with the following errors.

# kubeadm init --config=config.yaml
[init] Using Kubernetes version: v1.10.3
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
        [WARNING SystemVerification]: docker version is greater than the most recently validated version. Docker version: 18.03.1-ce. Max validated version: 17.03
        [WARNING FileExisting-crictl]: crictl not found in system path
Suggestion: go get github.com/kubernetes-incubator/cri-tools/cmd/crictl
[preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
        [ERROR ExternalEtcdVersion]: couldn't parse external etcd version "": Version string empty
        [ERROR ExternalEtcdVersion]: couldn't parse external etcd version "": Version string empty
        [ERROR ExternalEtcdVersion]: couldn't parse external etcd version "": Version string empty
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`

This is config.yaml (the IP addresses are dummy values):

apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
api:
  advertiseAddress: 192.168.0.10
etcd:
  endpoints:
  - https://192.168.0.10:2379
  - https://192.168.0.11:2379
  - https://192.168.0.12:2379
  caFile: /etc/kubernetes/pki/etcd/ca.pem
  certFile: /etc/kubernetes/pki/etcd/client.pem
  keyFile: /etc/kubernetes/pki/etcd/client-key.pem
networking:
  podSubnet: 10.244.0.0/16
apiServerCertSANs:
- <load-balancer-ip>
apiServerExtraArgs:
  apiserver-count: "3"

Is this a bug in kubeadm?
Please let me know how to resolve this error.

-- rootman
kubeadm
kubernetes

1 Answer

6/7/2018

The problem you are facing stems from the fact that kubeadm versions prior to v1.10.3 suppress connection errors. That is why you cannot see what is actually going wrong and may suspect a mistake in the configuration file.

Here is the issue related to your problem.

In version 1.10.3, a fix was introduced in PR #60585, so you should now see the connection errors themselves and be able to work out how to fix them.

In any case, your problem is caused by a failure to connect to the etcd cluster endpoints:

https://192.168.0.10:2379/version
https://192.168.0.11:2379/version
https://192.168.0.12:2379/version

You can try connecting to those endpoints with curl from the node where you run kubeadm init, using the certificates from your config file:

caFile: /etc/kubernetes/pki/etcd/ca.pem
certFile: /etc/kubernetes/pki/etcd/client.pem
keyFile: /etc/kubernetes/pki/etcd/client-key.pem

Here is an example:

curl --cacert /etc/kubernetes/pki/etcd/ca.pem --cert /etc/kubernetes/pki/etcd/client.pem --key /etc/kubernetes/pki/etcd/client-key.pem   -L https://192.168.0.10:2379/version
{"etcdserver":"3.3.2","etcdcluster":"3.3.0"}

If you get a connection error, fix it before initializing the cluster.
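
The same check can be scripted for all three endpoints at once. The Go program below is only a minimal sketch, not kubeadm's own code: it assumes the certificate paths and the dummy endpoint addresses from the question's config.yaml, and the helper names newEtcdClient and fetchEtcdVersion are made up for illustration. For each endpoint it prints either the etcd server version or the underlying connection/TLS error.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"time"
)

// versionResponse mirrors the JSON returned by etcd's /version endpoint,
// e.g. {"etcdserver":"3.3.2","etcdcluster":"3.3.0"}.
type versionResponse struct {
	Etcdserver  string `json:"etcdserver"`
	Etcdcluster string `json:"etcdcluster"`
}

// newEtcdClient (hypothetical helper) builds an HTTP client that trusts caFile
// and presents the client certificate and key, like curl's --cacert/--cert/--key.
func newEtcdClient(caFile, certFile, keyFile string) (*http.Client, error) {
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, fmt.Errorf("reading CA file: %w", err)
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, fmt.Errorf("no certificates found in %s", caFile)
	}
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, fmt.Errorf("loading client key pair: %w", err)
	}
	return &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			TLSClientConfig: &tls.Config{
				RootCAs:      pool,
				Certificates: []tls.Certificate{cert},
			},
		},
	}, nil
}

// fetchEtcdVersion (hypothetical helper) requests <endpoint>/version and
// returns the reported server version, or the underlying error.
func fetchEtcdVersion(client *http.Client, endpoint string) (string, error) {
	resp, err := client.Get(endpoint + "/version")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var v versionResponse
	if err := json.NewDecoder(resp.Body).Decode(&v); err != nil {
		return "", err
	}
	return v.Etcdserver, nil
}

func main() {
	// Paths and addresses are assumptions taken from the question's config.yaml.
	client, err := newEtcdClient(
		"/etc/kubernetes/pki/etcd/ca.pem",
		"/etc/kubernetes/pki/etcd/client.pem",
		"/etc/kubernetes/pki/etcd/client-key.pem",
	)
	if err != nil {
		fmt.Println("cannot build TLS client:", err)
		return
	}
	endpoints := []string{
		"https://192.168.0.10:2379",
		"https://192.168.0.11:2379",
		"https://192.168.0.12:2379",
	}
	for _, ep := range endpoints {
		version, err := fetchEtcdVersion(client, ep)
		if err != nil {
			fmt.Printf("%s: FAILED: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: etcd %s\n", ep, version)
	}
}
```

Any endpoint reported as FAILED here is one that kubeadm's preflight check will not be able to reach either.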

This is the part of the code related to checking the external etcd server version. It was copied from the master branch:

// Check validates external etcd version
// TODO: Use the official etcd Golang client for this instead?
func (evc ExternalEtcdVersionCheck) Check() (warnings, errors []error) {
    glog.V(1).Infoln("validating the external etcd version")

    // Return quickly if the user isn't using external etcd
    if evc.Etcd.External.Endpoints == nil {
        return nil, nil
    }

    var config *tls.Config
    var err error
    if config, err = evc.configRootCAs(config); err != nil {
        errors = append(errors, err)
        return nil, errors
    }
    if config, err = evc.configCertAndKey(config); err != nil {
        errors = append(errors, err)
        return nil, errors
    }

    client := evc.getHTTPClient(config)
    for _, endpoint := range evc.Etcd.External.Endpoints {
        if _, err := url.Parse(endpoint); err != nil {
            errors = append(errors, fmt.Errorf("failed to parse external etcd endpoint %s : %v", endpoint, err))
            continue
        }
        resp := etcdVersionResponse{}
        var err error
        versionURL := fmt.Sprintf("%s/%s", endpoint, "version")
        if tmpVersionURL, err := purell.NormalizeURLString(versionURL, purell.FlagRemoveDuplicateSlashes); err != nil {
            errors = append(errors, fmt.Errorf("failed to normalize external etcd version url %s : %v", versionURL, err))
            continue
        } else {
            versionURL = tmpVersionURL
        }

        // Here we connect to the endpoint and request the version info
        if err = getEtcdVersionResponse(client, versionURL, &resp); err != nil {
            errors = append(errors, err)
            continue
        }
        // If the request error was suppressed in the previous step, resp.Etcdserver
        // is still empty here, so this is where "Version string empty" is reported
        etcdVersion, err := semver.Parse(resp.Etcdserver)
        if err != nil {
            errors = append(errors, fmt.Errorf("couldn't parse external etcd version %q: %v", resp.Etcdserver, err))
            continue
        }
        if etcdVersion.LT(minExternalEtcdVersion) {
            errors = append(errors, fmt.Errorf("this version of kubeadm only supports external etcd version >= %s. Current version: %s", kubeadmconstants.MinExternalEtcdVersion, resp.Etcdserver))
            continue
        }
    }

    return nil, errors
}

....

func getEtcdVersionResponse(client *http.Client, url string, target interface{}) error {
    loopCount := externalEtcdRequestRetries + 1
    var err error
    var stopRetry bool
    for loopCount > 0 {
        if loopCount <= externalEtcdRequestRetries {
            time.Sleep(externalEtcdRequestInterval)
        }
        stopRetry, err = func() (stopRetry bool, err error) {
            r, err := client.Get(url)
            if err != nil {
                loopCount--
                return false, err // the fix: this line used to be "return false, nil"
            }
            defer r.Body.Close()

            if r != nil && r.StatusCode >= 500 && r.StatusCode <= 599 {
                loopCount--
                return false, fmt.Errorf("server responded with non-successful status: %s", r.Status)
            }
            return true, json.NewDecoder(r.Body).Decode(target)

        }()
        if stopRetry {
            break
        }
    }
    return err
}
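
To see why a suppressed connection error turns into "Version string empty", here is a small self-contained reproduction. It is a sketch under stated assumptions, not the kubeadm code: failingTransport simulates an unreachable etcd endpoint without touching the network, parseVersion is a stand-in for semver.Parse that only rejects the empty string, and the swallow flag switches between the old behavior (return nil on a request error) and the fixed one.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// etcdVersionResponse holds the field the preflight check reads.
type etcdVersionResponse struct {
	Etcdserver string `json:"etcdserver"`
}

// failingTransport (hypothetical) simulates an unreachable endpoint:
// every request fails before any response is received.
type failingTransport struct{}

func (failingTransport) RoundTrip(*http.Request) (*http.Response, error) {
	return nil, errors.New("connection refused")
}

// getVersion makes a single request. With swallow=true it imitates the old
// behavior and hides the request error from the caller.
func getVersion(client *http.Client, url string, target interface{}, swallow bool) error {
	r, err := client.Get(url)
	if err != nil {
		if swallow {
			return nil // old behavior: error lost, target left untouched
		}
		return err // fixed behavior: caller sees the real cause
	}
	defer r.Body.Close()
	return json.NewDecoder(r.Body).Decode(target)
}

// parseVersion is a stand-in for semver.Parse; it only rejects empty input.
func parseVersion(s string) error {
	if s == "" {
		return errors.New("Version string empty")
	}
	return nil
}

// check returns the message a user would end up seeing.
func check(swallow bool) string {
	client := &http.Client{Transport: failingTransport{}}
	resp := etcdVersionResponse{}
	if err := getVersion(client, "https://192.168.0.10:2379/version", &resp, swallow); err != nil {
		return err.Error()
	}
	if err := parseVersion(resp.Etcdserver); err != nil {
		return fmt.Sprintf("couldn't parse external etcd version %q: %v", resp.Etcdserver, err)
	}
	return "ok"
}

func main() {
	fmt.Println("error swallowed:", check(true))
	fmt.Println("error returned: ", check(false))
}
```

With the error swallowed, the message is the same misleading couldn't parse external etcd version "": Version string empty shown in the question; with the error returned, the message names the connection failure instead.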
-- VAS
Source: StackOverflow