Getting error on microk8s with Kubeflow Pipelines SDK and Jupyter Notebook

4/15/2020

I have set up a local cluster using microk8s and Kubeflow on my local machine. I followed these installation instructions to get my cluster up and running. I have started a Jupyter Server and coded a Kubeflow Pipeline.

The YAML file I used to define my component is shown below:

name: beat_the_market - Preprocess
description:  Preprocesses market data and loads into GCS bucket.
inputs:
- {name: project, type: String, description: GCP Project ID}
- {name: bucket, type: GCSPath, description: GCS bucket path}
- {name: ticker, type: String, description: Ticker symbol for selected stock}

outputs:
- {name: Trained model, type: Tensorflow model}

implementation:
    container:
        image: us.gcr.io/manceps-labs/beat_the_market:latest
        command: [
            python3, /opt/preprocess.py,
            --project, {inputValue: project},
            --bucket, {inputValue: bucket},
            --ticker, {inputValue: ticker}
        ]

Unfortunately, when I try to create an experiment using the Kubeflow Pipelines SDK, I get the following error:

2020-04-15 23:03:25,135 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments
2020-04-15 23:03:25,135 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments

---------------------------------------------------------------------------
gaierror                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
    158             conn = connection.create_connection(
--> 159                 (self._dns_host, self.port), self.timeout, **extra_kw)
    160 

/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     56 
---> 57     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
     58         af, socktype, proto, canonname, sa = res

/usr/lib/python3.6/socket.py in getaddrinfo(host, port, family, type, proto, flags)
    744     addrlist = []
--> 745     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    746         af, socktype, proto, canonname, sa = res

gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

/usr/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1238         """Send a complete request to the server."""
-> 1239         self._send_request(method, url, body, headers, encode_chunked)
   1240 

/usr/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1284             body = _encode(body, 'body')
-> 1285         self.endheaders(body, encode_chunked=encode_chunked)
   1286 

/usr/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
   1233             raise CannotSendHeader()
-> 1234         self._send_output(message_body, encode_chunked=encode_chunked)
   1235 

/usr/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
   1025         del self._buffer[:]
-> 1026         self.send(msg)
   1027 

/usr/lib/python3.6/http/client.py in send(self, data)
    963             if self.auto_open:
--> 964                 self.connect()
    965             else:

/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in connect(self)
    180     def connect(self):
--> 181         conn = self._new_conn()
    182         self._prepare_conn(conn)

/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
    167             raise NewConnectionError(
--> 168                 self, "Failed to establish a new connection: %s" % e)
    169 

NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1cc8b3e860>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
<ipython-input-325-c8d6a70afd2d> in <module>
      9 try:
---> 10     experiment = client.get_experiment(experiment_name=experiment_name)
     11 except:

/usr/local/lib/python3.6/dist-packages/kfp/_client.py in get_experiment(self, experiment_id, experiment_name)
    213     while next_page_token is not None:
--> 214       list_experiments_response = self.list_experiments(page_size=100, page_token=next_page_token)
    215       next_page_token = list_experiments_response.next_page_token

/usr/local/lib/python3.6/dist-packages/kfp/_client.py in list_experiments(self, page_token, page_size, sort_by)
    193     response = self._experiment_api.list_experiment(
--> 194         page_token=page_token, page_size=page_size, sort_by=sort_by)
    195     return response

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in list_experiment(self, **kwargs)
    347         else:
--> 348             (data) = self.list_experiment_with_http_info(**kwargs)  # noqa: E501
    349             return data

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in list_experiment_with_http_info(self, **kwargs)
    429             _request_timeout=params.get('_request_timeout'),
--> 430             collection_formats=collection_formats)

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    329                                    _return_http_data_only, collection_formats,
--> 330                                    _preload_content, _request_timeout)
    331         else:

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in __call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    160             _preload_content=_preload_content,
--> 161             _request_timeout=_request_timeout)
    162 

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
    350                                         _request_timeout=_request_timeout,
--> 351                                         headers=headers)
    352         elif method == "HEAD":

/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in GET(self, url, headers, query_params, _preload_content, _request_timeout)
    237                             _request_timeout=_request_timeout,
--> 238                             query_params=query_params)
    239 

/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    210                                               timeout=timeout,
--> 211                                               headers=headers)
    212         except urllib3.exceptions.SSLError as e:

/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request(self, method, url, fields, headers, **urlopen_kw)
     67                                            headers=headers,
---> 68                                            **urlopen_kw)
     69         else:

/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request_encode_url(self, method, url, fields, headers, **urlopen_kw)
     88 
---> 89         return self.urlopen(method, url, **extra_kw)
     90 

/usr/local/lib/python3.6/dist-packages/urllib3/poolmanager.py in urlopen(self, method, url, redirect, **kw)
    323         else:
--> 324             response = conn.urlopen(method, u.request_uri, **kw)
    325 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    637             retries = retries.increment(method, url, error=e, _pool=self,
--> 638                                         _stacktrace=sys.exc_info()[2])
    639             retries.sleep()

/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    398         if new_retry.is_exhausted():
--> 399             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    400 

MaxRetryError: HTTPConnectionPool(host='ml-pipeline.kubeflow.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments?page_token=&page_size=100&sort_by= (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8b3e860>: Failed to establish a new connection: [Errno -2] Name or service not known',))

During handling of the above exception, another exception occurred:

gaierror                                  Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
    158             conn = connection.create_connection(
--> 159                 (self._dns_host, self.port), self.timeout, **extra_kw)
    160 

/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
     56 
---> 57     for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
     58         af, socktype, proto, canonname, sa = res

/usr/lib/python3.6/socket.py in getaddrinfo(host, port, family, type, proto, flags)
    744     addrlist = []
--> 745     for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
    746         af, socktype, proto, canonname, sa = res

gaierror: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

NewConnectionError                        Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    599                                                   body=body, headers=headers,
--> 600                                                   chunked=chunked)
    601 

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
    353         else:
--> 354             conn.request(method, url, **httplib_request_kw)
    355 

/usr/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
   1238         """Send a complete request to the server."""
-> 1239         self._send_request(method, url, body, headers, encode_chunked)
   1240 

/usr/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
   1284             body = _encode(body, 'body')
-> 1285         self.endheaders(body, encode_chunked=encode_chunked)
   1286 

/usr/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
   1233             raise CannotSendHeader()
-> 1234         self._send_output(message_body, encode_chunked=encode_chunked)
   1235 

/usr/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
   1025         del self._buffer[:]
-> 1026         self.send(msg)
   1027 

/usr/lib/python3.6/http/client.py in send(self, data)
    963             if self.auto_open:
--> 964                 self.connect()
    965             else:

/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in connect(self)
    180     def connect(self):
--> 181         conn = self._new_conn()
    182         self._prepare_conn(conn)

/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
    167             raise NewConnectionError(
--> 168                 self, "Failed to establish a new connection: %s" % e)
    169 

NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1cc8a4c5f8>: Failed to establish a new connection: [Errno -2] Name or service not known

During handling of the above exception, another exception occurred:

MaxRetryError                             Traceback (most recent call last)
<ipython-input-325-c8d6a70afd2d> in <module>
     10     experiment = client.get_experiment(experiment_name=experiment_name)
     11 except:
---> 12     experiment = client.create_experiment(experiment_name)
     13 
     14 print(experiment)

/usr/local/lib/python3.6/dist-packages/kfp/_client.py in create_experiment(self, name)
    172       logging.info('Creating experiment {}.'.format(name))
    173       experiment = kfp_server_api.models.ApiExperiment(name=name)
--> 174       experiment = self._experiment_api.create_experiment(body=experiment)
    175 
    176     if self._is_ipython():

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in create_experiment(self, body, **kwargs)
     52             return self.create_experiment_with_http_info(body, **kwargs)  # noqa: E501
     53         else:
---> 54             (data) = self.create_experiment_with_http_info(body, **kwargs)  # noqa: E501
     55             return data
     56 

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in create_experiment_with_http_info(self, body, **kwargs)
    129             _preload_content=params.get('_preload_content', True),
    130             _request_timeout=params.get('_request_timeout'),
--> 131             collection_formats=collection_formats)
    132 
    133     def delete_experiment(self, id, **kwargs):  # noqa: E501

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    328                                    response_type, auth_settings,
    329                                    _return_http_data_only, collection_formats,
--> 330                                    _preload_content, _request_timeout)
    331         else:
    332             thread = self.pool.apply_async(self.__call_api, (resource_path,

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in __call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
    159             post_params=post_params, body=body,
    160             _preload_content=_preload_content,
--> 161             _request_timeout=_request_timeout)
    162 
    163         self.last_response = response_data

/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
    371                                          _preload_content=_preload_content,
    372                                          _request_timeout=_request_timeout,
--> 373                                          body=body)
    374         elif method == "PUT":
    375             return self.rest_client.PUT(url,

/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in POST(self, url, headers, query_params, post_params, body, _preload_content, _request_timeout)
    273                             _preload_content=_preload_content,
    274                             _request_timeout=_request_timeout,
--> 275                             body=body)
    276 
    277     def PUT(self, url, headers=None, query_params=None, post_params=None,

/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
    165                         preload_content=_preload_content,
    166                         timeout=timeout,
--> 167                         headers=headers)
    168                 elif headers['Content-Type'] == 'application/x-www-form-urlencoded':  # noqa: E501
    169                     r = self.pool_manager.request(

/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request(self, method, url, fields, headers, **urlopen_kw)
     70             return self.request_encode_body(method, url, fields=fields,
     71                                             headers=headers,
---> 72                                             **urlopen_kw)
     73 
     74     def request_encode_url(self, method, url, fields=None, headers=None,

/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request_encode_body(self, method, url, fields, headers, encode_multipart, multipart_boundary, **urlopen_kw)
    148         extra_kw.update(urlopen_kw)
    149 
--> 150         return self.urlopen(method, url, **extra_kw)

/usr/local/lib/python3.6/dist-packages/urllib3/poolmanager.py in urlopen(self, method, url, redirect, **kw)
    322             response = conn.urlopen(method, url, **kw)
    323         else:
--> 324             response = conn.urlopen(method, u.request_uri, **kw)
    325 
    326         redirect_location = redirect and response.get_redirect_location()

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    665                                 timeout=timeout, pool_timeout=pool_timeout,
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 
    669         def drain_and_release_conn(response):

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    665                                 timeout=timeout, pool_timeout=pool_timeout,
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 
    669         def drain_and_release_conn(response):

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    665                                 timeout=timeout, pool_timeout=pool_timeout,
    666                                 release_conn=release_conn, body_pos=body_pos,
--> 667                                 **response_kw)
    668 
    669         def drain_and_release_conn(response):

/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
    636 
    637             retries = retries.increment(method, url, error=e, _pool=self,
--> 638                                         _stacktrace=sys.exc_info()[2])
    639             retries.sleep()
    640 

/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
    397 
    398         if new_retry.is_exhausted():
--> 399             raise MaxRetryError(_pool, url, error or ResponseError(cause))
    400 
    401         log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)

MaxRetryError: HTTPConnectionPool(host='ml-pipeline.kubeflow.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c5f8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
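Both MaxRetryError messages name the host ml-pipeline.kubeflow.svc.cluster.local on port 8888, and each chain starts with a gaierror raised from socket.getaddrinfo. A name of the form service.namespace.svc.cluster.local is a Kubernetes service DNS name, which normally resolves only through the cluster's own DNS. As a minimal sketch (the helper name resolves is hypothetical, and whether the notebook process runs inside the cluster is not stated in the post), the same lookup can be repeated on its own to confirm that name resolution is what fails, independent of the SDK:

```python
import socket


def resolves(host: str, port: int = 8888) -> bool:
    """Return True if `host` can be resolved from the machine running this code."""
    try:
        # Same standard-library call that fails at the bottom of the traceback.
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved (e.g. "[Errno -2] Name or service not known").
        return False


if __name__ == "__main__":
    # Host and port copied from the MaxRetryError message.
    host = "ml-pipeline.kubeflow.svc.cluster.local"
    print(host, "resolves:", resolves(host))
```

If this prints False from the notebook, the client's default host is simply not reachable by name from where the code runs, and a host has to be passed to kfp.Client explicitly.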

Note that I did not include all of the retries, but I think you get the point. I also tried passing the IP provided by microk8s.enable as the host. That call completed without raising an error, but every field of the returned experiment was None, so it is still not what I want.

import kfp

client = kfp.Client(host='http://xx.xx.xx.xx.xip.io')
experiment = client.create_experiment('test')
Experiment link here

{'created_at': None, 'description': None, 'id': None, 'name': None}
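An experiment whose fields are all None suggests that the host answered the request, but possibly not with the JSON the SDK expects. That is only a guess, and the post does not confirm it. A minimal sketch for looking at the raw response follows; it uses only the standard library, the helper names experiments_url and probe are hypothetical, and the path is the one that appears in the traceback:

```python
import json
import urllib.error
import urllib.request


def experiments_url(host: str) -> str:
    """Join a host URL with the experiments path seen in the traceback."""
    return host.rstrip("/") + "/apis/v1beta1/experiments"


def probe(url: str, timeout: float = 5.0) -> dict:
    """Fetch `url` and summarize the response without raising on HTTP errors."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            content_type = resp.headers.get("Content-Type", "")
            body = resp.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status, headers, and a body.
        status = err.code
        content_type = err.headers.get("Content-Type", "")
        body = err.read()
    except (urllib.error.URLError, OSError) as err:
        # DNS failure, refused connection, timeout, and similar.
        return {"ok": False, "error": str(err)}

    try:
        json.loads(body)
        is_json = True
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        is_json = False

    return {"ok": True, "status": status,
            "content_type": content_type, "is_json": is_json}


if __name__ == "__main__":
    # Placeholder host exactly as written in the post; substitute the real address.
    print(probe(experiments_url("http://xx.xx.xx.xx.xip.io")))
```

A result with is_json set to False, or an HTML content type, would indicate that the address reaches some other page (for example a dashboard or login page) rather than the Pipelines API itself.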

Any help would be much appreciated. Let me know what other output you need to assess this properly. I am still learning Kubeflow, so I am not sure how to debug this, and I couldn't find much on it in the Kubeflow docs, the microk8s docs, or other threads. I am currently working from these two examples:

https://github.com/kubeflow/examples/blob/master/named_entity_recognition/notebooks/Pipeline.ipynb

https://github.com/kubeflow/pipelines/blob/master/samples/tutorials/mnist/03_Reusable_Components.ipynb

-- Christopher Thompson
kubeflow-pipelines
kubernetes
microk8s
