I have set up a local cluster using microk8s and Kubeflow on my local machine. I followed these installation instructions to get the cluster up and running. I then started a Jupyter Server and coded a Kubeflow Pipeline.
The YAML file I used to define my component is shown below:
name: beat_the_market - Preprocess
description: Preprocesses market data and loads into GCS bucket.
inputs:
- {name: project, type: String, description: GCP Project ID}
- {name: bucket, type: GCSPath, description: GCS bucket path}
- {name: ticker, type: String, description: Ticker symbol for selected stock}
outputs:
- {name: Trained model, type: Tensorflow model}
implementation:
  container:
    image: us.gcr.io/manceps-labs/beat_the_market:latest
    command: [python3, /opt/preprocess.py,
      --project, {inputValue: project},
      --bucket, {inputValue: bucket},
      --ticker, {inputValue: ticker}
    ]
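To make the intent of the `command` entry above concrete, here is a minimal sketch of how the `{inputValue: ...}` placeholders are meant to turn into plain command-line arguments. It is an illustration only, not the Kubeflow Pipelines SDK's own code; the function name `resolve_command` and the example argument values are hypothetical.

```python
from typing import Dict, List, Union

# A command item is either a literal string or a placeholder such as
# {"inputValue": "project"}, mirroring the YAML component spec above.
CommandItem = Union[str, Dict[str, str]]


def resolve_command(command: List[CommandItem], arguments: Dict[str, str]) -> List[str]:
    """Replace each {"inputValue": name} placeholder with arguments[name]."""
    resolved = []
    for item in command:
        if isinstance(item, dict) and "inputValue" in item:
            name = item["inputValue"]
            if name not in arguments:
                raise KeyError(f"missing argument for input {name!r}")
            resolved.append(str(arguments[name]))
        else:
            resolved.append(str(item))
    return resolved


# The command list from the component spec, written as Python data.
command = [
    "python3", "/opt/preprocess.py",
    "--project", {"inputValue": "project"},
    "--bucket", {"inputValue": "bucket"},
    "--ticker", {"inputValue": "ticker"},
]

# Hypothetical argument values, chosen only for this example.
example_args = {"project": "my-project", "bucket": "gs://my-bucket", "ticker": "GOOG"}

print(resolve_command(command, example_args))
```

Each declared input that appears in `command` needs a matching argument when the pipeline is run; the sketch raises `KeyError` for a missing one.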
Unfortunately, when I try to create an experiment using the Kubeflow Pipelines SDK, I get the following error:
2020-04-15 23:03:25,135 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments
2020-04-15 23:03:25,135 WARNING Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments
WARNING:urllib3.connectionpool:Retrying (Retry(total=0, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c358>: Failed to establish a new connection: [Errno -2] Name or service not known',)': /apis/v1beta1/experiments
---------------------------------------------------------------------------
gaierror Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
158 conn = connection.create_connection(
--> 159 (self._dns_host, self.port), self.timeout, **extra_kw)
160
/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
56
---> 57 for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
58 af, socktype, proto, canonname, sa = res
/usr/lib/python3.6/socket.py in getaddrinfo(host, port, family, type, proto, flags)
744 addrlist = []
--> 745 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
746 af, socktype, proto, canonname, sa = res
gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
NewConnectionError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
599 body=body, headers=headers,
--> 600 chunked=chunked)
601
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
353 else:
--> 354 conn.request(method, url, **httplib_request_kw)
355
/usr/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
1238 """Send a complete request to the server."""
-> 1239 self._send_request(method, url, body, headers, encode_chunked)
1240
/usr/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
1284 body = _encode(body, 'body')
-> 1285 self.endheaders(body, encode_chunked=encode_chunked)
1286
/usr/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
1233 raise CannotSendHeader()
-> 1234 self._send_output(message_body, encode_chunked=encode_chunked)
1235
/usr/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
1025 del self._buffer[:]
-> 1026 self.send(msg)
1027
/usr/lib/python3.6/http/client.py in send(self, data)
963 if self.auto_open:
--> 964 self.connect()
965 else:
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in connect(self)
180 def connect(self):
--> 181 conn = self._new_conn()
182 self._prepare_conn(conn)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
167 raise NewConnectionError(
--> 168 self, "Failed to establish a new connection: %s" % e)
169
NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1cc8b3e860>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
MaxRetryError Traceback (most recent call last)
<ipython-input-325-c8d6a70afd2d> in <module>
9 try:
---> 10 experiment = client.get_experiment(experiment_name=experiment_name)
11 except:
/usr/local/lib/python3.6/dist-packages/kfp/_client.py in get_experiment(self, experiment_id, experiment_name)
213 while next_page_token is not None:
--> 214 list_experiments_response = self.list_experiments(page_size=100, page_token=next_page_token)
215 next_page_token = list_experiments_response.next_page_token
/usr/local/lib/python3.6/dist-packages/kfp/_client.py in list_experiments(self, page_token, page_size, sort_by)
193 response = self._experiment_api.list_experiment(
--> 194 page_token=page_token, page_size=page_size, sort_by=sort_by)
195 return response
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in list_experiment(self, **kwargs)
347 else:
--> 348 (data) = self.list_experiment_with_http_info(**kwargs) # noqa: E501
349 return data
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in list_experiment_with_http_info(self, **kwargs)
429 _request_timeout=params.get('_request_timeout'),
--> 430 collection_formats=collection_formats)
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
329 _return_http_data_only, collection_formats,
--> 330 _preload_content, _request_timeout)
331 else:
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in __call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
160 _preload_content=_preload_content,
--> 161 _request_timeout=_request_timeout)
162
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
350 _request_timeout=_request_timeout,
--> 351 headers=headers)
352 elif method == "HEAD":
/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in GET(self, url, headers, query_params, _preload_content, _request_timeout)
237 _request_timeout=_request_timeout,
--> 238 query_params=query_params)
239
/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
210 timeout=timeout,
--> 211 headers=headers)
212 except urllib3.exceptions.SSLError as e:
/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request(self, method, url, fields, headers, **urlopen_kw)
67 headers=headers,
---> 68 **urlopen_kw)
69 else:
/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request_encode_url(self, method, url, fields, headers, **urlopen_kw)
88
---> 89 return self.urlopen(method, url, **extra_kw)
90
/usr/local/lib/python3.6/dist-packages/urllib3/poolmanager.py in urlopen(self, method, url, redirect, **kw)
323 else:
--> 324 response = conn.urlopen(method, u.request_uri, **kw)
325
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
637 retries = retries.increment(method, url, error=e, _pool=self,
--> 638 _stacktrace=sys.exc_info()[2])
639 retries.sleep()
/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
398 if new_retry.is_exhausted():
--> 399 raise MaxRetryError(_pool, url, error or ResponseError(cause))
400
MaxRetryError: HTTPConnectionPool(host='ml-pipeline.kubeflow.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments?page_token=&page_size=100&sort_by= (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8b3e860>: Failed to establish a new connection: [Errno -2] Name or service not known',))
During handling of the above exception, another exception occurred:
gaierror Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
158 conn = connection.create_connection(
--> 159 (self._dns_host, self.port), self.timeout, **extra_kw)
160
/usr/local/lib/python3.6/dist-packages/urllib3/util/connection.py in create_connection(address, timeout, source_address, socket_options)
56
---> 57 for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
58 af, socktype, proto, canonname, sa = res
/usr/lib/python3.6/socket.py in getaddrinfo(host, port, family, type, proto, flags)
744 addrlist = []
--> 745 for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
746 af, socktype, proto, canonname, sa = res
gaierror: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
NewConnectionError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
599 body=body, headers=headers,
--> 600 chunked=chunked)
601
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in _make_request(self, conn, method, url, timeout, chunked, **httplib_request_kw)
353 else:
--> 354 conn.request(method, url, **httplib_request_kw)
355
/usr/lib/python3.6/http/client.py in request(self, method, url, body, headers, encode_chunked)
1238 """Send a complete request to the server."""
-> 1239 self._send_request(method, url, body, headers, encode_chunked)
1240
/usr/lib/python3.6/http/client.py in _send_request(self, method, url, body, headers, encode_chunked)
1284 body = _encode(body, 'body')
-> 1285 self.endheaders(body, encode_chunked=encode_chunked)
1286
/usr/lib/python3.6/http/client.py in endheaders(self, message_body, encode_chunked)
1233 raise CannotSendHeader()
-> 1234 self._send_output(message_body, encode_chunked=encode_chunked)
1235
/usr/lib/python3.6/http/client.py in _send_output(self, message_body, encode_chunked)
1025 del self._buffer[:]
-> 1026 self.send(msg)
1027
/usr/lib/python3.6/http/client.py in send(self, data)
963 if self.auto_open:
--> 964 self.connect()
965 else:
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in connect(self)
180 def connect(self):
--> 181 conn = self._new_conn()
182 self._prepare_conn(conn)
/usr/local/lib/python3.6/dist-packages/urllib3/connection.py in _new_conn(self)
167 raise NewConnectionError(
--> 168 self, "Failed to establish a new connection: %s" % e)
169
NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f1cc8a4c5f8>: Failed to establish a new connection: [Errno -2] Name or service not known
During handling of the above exception, another exception occurred:
MaxRetryError Traceback (most recent call last)
<ipython-input-325-c8d6a70afd2d> in <module>
10 experiment = client.get_experiment(experiment_name=experiment_name)
11 except:
---> 12 experiment = client.create_experiment(experiment_name)
13
14 print(experiment)
/usr/local/lib/python3.6/dist-packages/kfp/_client.py in create_experiment(self, name)
172 logging.info('Creating experiment {}.'.format(name))
173 experiment = kfp_server_api.models.ApiExperiment(name=name)
--> 174 experiment = self._experiment_api.create_experiment(body=experiment)
175
176 if self._is_ipython():
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in create_experiment(self, body, **kwargs)
52 return self.create_experiment_with_http_info(body, **kwargs) # noqa: E501
53 else:
---> 54 (data) = self.create_experiment_with_http_info(body, **kwargs) # noqa: E501
55 return data
56
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api/experiment_service_api.py in create_experiment_with_http_info(self, body, **kwargs)
129 _preload_content=params.get('_preload_content', True),
130 _request_timeout=params.get('_request_timeout'),
--> 131 collection_formats=collection_formats)
132
133 def delete_experiment(self, id, **kwargs): # noqa: E501
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, async_req, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
328 response_type, auth_settings,
329 _return_http_data_only, collection_formats,
--> 330 _preload_content, _request_timeout)
331 else:
332 thread = self.pool.apply_async(self.__call_api, (resource_path,
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in __call_api(self, resource_path, method, path_params, query_params, header_params, body, post_params, files, response_type, auth_settings, _return_http_data_only, collection_formats, _preload_content, _request_timeout)
159 post_params=post_params, body=body,
160 _preload_content=_preload_content,
--> 161 _request_timeout=_request_timeout)
162
163 self.last_response = response_data
/usr/local/lib/python3.6/dist-packages/kfp_server_api/api_client.py in request(self, method, url, query_params, headers, post_params, body, _preload_content, _request_timeout)
371 _preload_content=_preload_content,
372 _request_timeout=_request_timeout,
--> 373 body=body)
374 elif method == "PUT":
375 return self.rest_client.PUT(url,
/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in POST(self, url, headers, query_params, post_params, body, _preload_content, _request_timeout)
273 _preload_content=_preload_content,
274 _request_timeout=_request_timeout,
--> 275 body=body)
276
277 def PUT(self, url, headers=None, query_params=None, post_params=None,
/usr/local/lib/python3.6/dist-packages/kfp_server_api/rest.py in request(self, method, url, query_params, headers, body, post_params, _preload_content, _request_timeout)
165 preload_content=_preload_content,
166 timeout=timeout,
--> 167 headers=headers)
168 elif headers['Content-Type'] == 'application/x-www-form-urlencoded': # noqa: E501
169 r = self.pool_manager.request(
/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request(self, method, url, fields, headers, **urlopen_kw)
70 return self.request_encode_body(method, url, fields=fields,
71 headers=headers,
---> 72 **urlopen_kw)
73
74 def request_encode_url(self, method, url, fields=None, headers=None,
/usr/local/lib/python3.6/dist-packages/urllib3/request.py in request_encode_body(self, method, url, fields, headers, encode_multipart, multipart_boundary, **urlopen_kw)
148 extra_kw.update(urlopen_kw)
149
--> 150 return self.urlopen(method, url, **extra_kw)
/usr/local/lib/python3.6/dist-packages/urllib3/poolmanager.py in urlopen(self, method, url, redirect, **kw)
322 response = conn.urlopen(method, url, **kw)
323 else:
--> 324 response = conn.urlopen(method, u.request_uri, **kw)
325
326 redirect_location = redirect and response.get_redirect_location()
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
665 timeout=timeout, pool_timeout=pool_timeout,
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
669 def drain_and_release_conn(response):
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
665 timeout=timeout, pool_timeout=pool_timeout,
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
669 def drain_and_release_conn(response):
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
665 timeout=timeout, pool_timeout=pool_timeout,
666 release_conn=release_conn, body_pos=body_pos,
--> 667 **response_kw)
668
669 def drain_and_release_conn(response):
/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py in urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, **response_kw)
636
637 retries = retries.increment(method, url, error=e, _pool=self,
--> 638 _stacktrace=sys.exc_info()[2])
639 retries.sleep()
640
/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py in increment(self, method, url, response, error, _pool, _stacktrace)
397
398 if new_retry.is_exhausted():
--> 399 raise MaxRetryError(_pool, url, error or ResponseError(cause))
400
401 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)
MaxRetryError: HTTPConnectionPool(host='ml-pipeline.kubeflow.svc.cluster.local', port=8888): Max retries exceeded with url: /apis/v1beta1/experiments (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1cc8a4c5f8>: Failed to establish a new connection: [Errno -2] Name or service not known',))
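The root of the traceback above is a `gaierror` raised by `socket.getaddrinfo`, meaning the hostname `ml-pipeline.kubeflow.svc.cluster.local` could not be resolved from wherever the client code ran. Names ending in `.svc.cluster.local` are normally resolvable only through the cluster's DNS, so this would be consistent with the client running outside a pod or with cluster DNS not being available, though that is a guess. A minimal sketch of a check that isolates the DNS step (the helper name `can_resolve` is hypothetical; the host and port are copied from the error message):

```python
import socket


def can_resolve(host: str, port: int) -> bool:
    """Return True if the host resolves to at least one address, else False."""
    try:
        return len(socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)) > 0
    except socket.gaierror:
        # Same exception type as at the bottom of the traceback above.
        return False


# Host and port copied from the MaxRetryError message.
KFP_HOST = "ml-pipeline.kubeflow.svc.cluster.local"
KFP_PORT = 8888

if __name__ == "__main__":
    print(KFP_HOST, "resolves:", can_resolve(KFP_HOST, KFP_PORT))
```

Running this in the same notebook as the failing call shows whether the problem is name resolution alone, before any HTTP request is made.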
Note that I have not included all of the retries, but I think you get the point. I also tried using the IP provided by microk8s.enable, and it gave me a sort-of successful output, but all values were None, so this is still not what I want.
client = kfp.Client(host='http://xx.xx.xx.xx.xip.io')
experiment = client.create_experiment('test')
Experiment link here
{'created_at': None, 'description': None, 'id': None, 'name': None}
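One possible explanation for the all-`None` result, which is an assumption and not confirmed by anything above, is that this address answers with something other than the Pipelines REST API at that path (for example an HTML page), and the generated client then builds an experiment object with no fields set. A minimal standard-library sketch for checking what the endpoint actually returns; the helper names `classify_body` and `probe` are hypothetical, and the URL in the comment reuses the placeholder host above with the path from the traceback:

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def classify_body(content_type: str, body: str) -> str:
    """Roughly classify an HTTP response body as 'json', 'html', or 'other'."""
    ctype = content_type.lower()
    if "json" in ctype:
        try:
            json.loads(body)
            return "json"
        except ValueError:
            return "other"
    if "html" in ctype or body.lstrip().lower().startswith(("<!doctype", "<html")):
        return "html"
    return "other"


def probe(url: str, timeout: float = 5.0) -> Tuple[int, str]:
    """GET the URL and return (status code, classification of the body)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return resp.status, classify_body(resp.headers.get("Content-Type", ""), body)
    except urllib.error.HTTPError as err:
        body = err.read().decode("utf-8", errors="replace")
        return err.code, classify_body(err.headers.get("Content-Type", ""), body)


# Example call (not executed here; the host is the placeholder from above):
# print(probe("http://xx.xx.xx.xx.xip.io/apis/v1beta1/experiments"))
```

If the result is `html` rather than `json`, the host value passed to `kfp.Client` is probably not pointing at the API itself. Some Kubeflow deployments expose the Pipelines API under a path prefix such as `/pipeline`; whether that applies to this setup would need to be verified.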
Any help would be much appreciated. Let me know what other output you need to assess the problem properly. I am still learning Kubeflow, so I am not sure how to debug this, and I couldn't find much on it in the Kubeflow docs, the microk8s docs, or other threads. I am currently working off of these 2 examples.
https://github.com/kubeflow/examples/blob/master/named_entity_recognition/notebooks/Pipeline.ipynb