BigQuery integration with GKE hosted PostgreSQL

10/15/2021

Did some research but I can't find anything on this topic. So here goes:

What I'm trying to do is to add a k8s (on GKE) hosted TimescaleDB (so basically PostgreSQL) as an external datasource on BigQuery. However, as BQ's docs state:

BigQuery supports the following external data sources:

  • Bigtable
  • Cloud Spanner
  • Cloud SQL
  • Cloud Storage
  • Drive

I find it hard to believe that Google doesn't support connecting your GKE database with Big Query.
Anyone that has any experience with this? Is it really not possible? And if not, can I use some other solution to let BigQuery query my k8s database?

-- Giannis Tampakakis
google-bigquery
google-kubernetes-engine
kubernetes
postgresql
timescaledb

1 Answer

10/15/2021

It really is not possible to directly access your own database running on GKE.

The only option which will definitely work is to replicate the data from your TimeScaleDB instance to BigQuery tables. Google Cloud provides Cloud Data Fusion, but there are other change data capture / replication products as well.

One other option would be to replicate TimeScaleDB to a CloudSQL PostgreSQL instance, but I'm not sure this would work as I don't know what extension TimescaleDB actually installs/requires on top of PostgreSQL. I doubt this will work, but might be worth asking the Timescale folks.

-- Gari Singh
Source: StackOverflow