Presto error 400 when connecting to an S3 bucket

9/29/2020

OVERVIEW

I am deploying a Presto cluster on Kubernetes and trying to connect it to an S3 bucket through an S3-compatible API.

HOW TO REPRODUCE

I use the Presto Operator (by Starburst) and configure my Presto resource with the following properties:

  hive:
    additionalProperties: |
      connector.name=hive-hadoop2
      hive.metastore.uri=thrift://hive-metastore-presto-cluster-name.default.svc.cluster.local:9083
      hive.s3.endpoint=https://<s3-endpoint>

I also create the s3-secret with data such as AWS_ACCESS_KEY_ID: ** and AWS_SECRET_ACCESS_KEY: ** (values redacted).
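
For reference, the secret looks roughly like this (a sketch: I assume the default namespace, and the key values are redacted):

  apiVersion: v1
  kind: Secret
  metadata:
    name: s3-secret                # referenced by the Presto resource
  type: Opaque
  stringData:                      # stringData avoids manual base64 encoding
    AWS_ACCESS_KEY_ID: "**"
    AWS_SECRET_ACCESS_KEY: "**"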

Then I use presto-cli to create a schema with the following command:

CREATE SCHEMA ban WITH (LOCATION = 's3a://path/to/schema');
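
For completeness, the CLI session looks roughly like this (the coordinator address is a placeholder for my actual service):

  presto --server presto-coordinator:8080 --catalog hive
  presto> CREATE SCHEMA ban WITH (LOCATION = 's3a://path/to/schema');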

ERROR

This throws an error in the Hive metastore like:

Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request

PROBLEM IDENTIFICATION

I found in the Hadoop S3A troubleshooting documentation that "This happens when trying to work with any S3 service which only supports the 'V4' signing API, but the client is configured to use the default S3 service endpoint."

The documented solution is: "The S3A client needs to be given the endpoint to use via the fs.s3a.endpoint property."

But the fs.s3a.endpoint property is not recognized in Presto's Hive connector configuration, which uses its own hive.s3.* keys (such as the hive.s3.endpoint I already set). Moreover, the error above comes from the Hive metastore's S3A client, so the endpoint would presumably have to reach the metastore's own Hadoop configuration.
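
For context, here is roughly what the Hadoop documentation describes, as it would appear in a Hadoop core-site.xml (a sketch: the endpoint is a placeholder, and fs.s3a.path.style.access is my own guess for S3-compatible stores, not part of the quoted advice). How, or whether, such a file can be injected into the metastore pods through the operator is exactly what I cannot figure out:

  <!-- core-site.xml (sketch): point the S3A client at the S3-compatible endpoint -->
  <property>
    <name>fs.s3a.endpoint</name>
    <value>https://<s3-endpoint></value>
  </property>
  <!-- Assumption: many S3-compatible stores also require path-style access -->
  <property>
    <name>fs.s3a.path.style.access</name>
    <value>true</value>
  </property>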

Has anyone run into this problem before?

USEFUL DETAILS

  • Kubernetes version: 1.18
  • Presto Operator image : starburstdata/presto-operator:341-e-k8s-0.35

--- EDIT ---

FULL STACKTRACE

2020-09-29T12:20:34,120 ERROR [pool-7-thread-2] utils.MetaStoreUtils: Got exception: org.apache.hadoop.fs.s3a.AWSBadRequestException doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
org.apache.hadoop.fs.s3a.AWSBadRequestException: doesBucketExist on presto: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692), S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692:400 Bad Request: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
        at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:212) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:111) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:260) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:317) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:256) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:231) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:372) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:308) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3303) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3352) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3320) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:479) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361) ~[hadoop-common-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.hive.metastore.Warehouse.getFs(Warehouse.java:115) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:141) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.getDnsPath(Warehouse.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.Warehouse.determineDatabasePath(Warehouse.java:190) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database_core(HiveMetaStore.java:1265) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_database(HiveMetaStore.java:1425) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_232]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_232]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_232]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_232]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at com.sun.proxy.$Proxy26.create_database(Unknown Source) [?:?]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14861) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_database.getResult(ThriftHiveMetastore.java:14845) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:104) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_232]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_232]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_232]
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: txc7f0218a02574ba59ec91-005f732692; S3 Extended Request ID: txc7f0218a02574ba59ec91-005f732692)
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1639) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1304) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1056) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:743) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:717) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:699) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:667) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:649) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:513) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4325) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4272) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1337) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:1277) ~[aws-java-sdk-bundle-1.11.271.jar:?]
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$1(S3AFileSystem.java:373) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109) ~[hadoop-aws-3.1.1.3.1.0.0-78.jar:?]
        ... 33 more
-- Amojow
Tags: amazon-s3, hive, kubernetes, presto, starburst
