Google Cloud Storage
Batch process all your records to store structured outputs in Google Cloud Storage.
The requirements are as follows.
- A Google Cloud service account. Create a service account.
- A service account key for the service account. See Create a service account key in Create and delete service account keys.
  To ensure maximum compatibility across Unstructured service offerings, give the service account key information to Unstructured as a single-line string that contains the contents of the downloaded service account key file, not the key file itself. To print this single-line string without line breaks, suitable for copying, run one of the following commands from your Terminal or Command Prompt. In the command, replace <path-to-downloaded-key-file> with the path to the service account key file that you downloaded by following the preceding instructions.
  - For macOS or Linux:
  - For Windows:
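As a platform-independent alternative, the single-line string can also be produced with a few lines of Python. This is a minimal sketch, assuming the key file was downloaded in JSON format. The function name `key_file_to_single_line` is hypothetical and is not part of any Unstructured or Google library.

```python
import json
from pathlib import Path


def key_file_to_single_line(path: str) -> str:
    """Return the contents of a JSON service account key file as one line."""
    # Parsing and re-serializing drops the pretty-printing line breaks.
    # Newlines inside string values (such as the private key) remain
    # escaped as \n, so the result contains no literal line breaks.
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return json.dumps(data, separators=(",", ":"))


# Example call (replace the placeholder with the real path first):
# print(key_file_to_single_line("<path-to-downloaded-key-file>"))
```

The printed string can then be copied wherever the single-line key is needed. Because the file is parsed rather than stripped of newline characters, a malformed key file fails immediately with a JSON error instead of producing an unusable string.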
- The URI for a Google Cloud Storage bucket. This URI consists of the target bucket name, plus any target folder within the bucket, expressed as gs://<bucket-name>[/folder-name]. Create a bucket.
  This bucket must have, at minimum, one of the following roles applied to the target Google Cloud service account:
  - Storage Object Viewer for bucket read access.
  - Storage Object Creator for bucket write access.
  - The Storage Object Admin role provides read and write access, plus access to additional bucket operations.
  To apply one of these roles to a service account for a bucket, see Add a principal to a bucket-level policy in Set and manage IAM policies on buckets.
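To make the gs://<bucket-name>[/folder-name] format concrete, the following sketch splits such a URI into its bucket name and optional folder. The helper `parse_gcs_uri` is a hypothetical name used only for illustration, and the bucket and folder names in the usage lines are made-up examples.

```python
def parse_gcs_uri(uri: str) -> tuple[str, str]:
    """Split gs://<bucket-name>[/folder-name] into (bucket, folder)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"expected a URI starting with {prefix!r}, got {uri!r}")
    # Everything up to the first "/" is the bucket; the rest is the folder.
    bucket, _, folder = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError("the URI does not contain a bucket name")
    # The folder part is optional, so it may be an empty string.
    return bucket, folder.strip("/")


print(parse_gcs_uri("gs://my-bucket/outputs"))  # ('my-bucket', 'outputs')
print(parse_gcs_uri("gs://my-bucket"))          # ('my-bucket', '')
```

Checking the URI this way before a run can catch a missing gs:// prefix or an empty bucket name early.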
The Google Cloud Storage connector dependencies:
You might also need to install additional dependencies, depending on your needs. Learn more.
The following environment variables:
- GCS_SERVICE_ACCOUNT_KEY - The Google Cloud service account key for Google Cloud Storage, represented by --service-account-key (CLI) or service_account_key (Python).
- GCS_REMOTE_URL - The Google Cloud Storage bucket URL, represented by --remote-url (CLI) or remote_url (Python).
These environment variables:
- UNSTRUCTURED_API_KEY - Your Unstructured API key value.
- UNSTRUCTURED_API_URL - Your Unstructured API URL.
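Before starting a run, it can help to confirm that all four environment variables are set and well formed. The sketch below is one possible pre-flight check, not part of the Unstructured SDK. The function `load_settings` and the dictionary keys `api_key` and `api_url` are hypothetical names, and the JSON check assumes the service account key was downloaded in JSON format.

```python
import json
import os

REQUIRED_VARS = (
    "GCS_SERVICE_ACCOUNT_KEY",
    "GCS_REMOTE_URL",
    "UNSTRUCTURED_API_KEY",
    "UNSTRUCTURED_API_URL",
)


def load_settings(env=None) -> dict:
    """Collect the required variables, failing early if any are missing or malformed."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))

    key = env["GCS_SERVICE_ACCOUNT_KEY"]
    # The key is expected as a single-line string, not a path to the key file.
    if "\n" in key:
        raise ValueError("GCS_SERVICE_ACCOUNT_KEY must not contain line breaks")
    json.loads(key)  # raises ValueError if the string is not valid JSON

    remote_url = env["GCS_REMOTE_URL"]
    if not remote_url.startswith("gs://"):
        raise ValueError("GCS_REMOTE_URL must look like gs://<bucket-name>[/folder-name]")

    return {
        "service_account_key": key,            # Python parameter name from this page
        "remote_url": remote_url,              # Python parameter name from this page
        "api_key": env["UNSTRUCTURED_API_KEY"],  # hypothetical key name
        "api_url": env["UNSTRUCTURED_API_URL"],  # hypothetical key name
    }
```

Calling `load_settings()` with no arguments reads from the current process environment. Passing a plain dictionary instead makes the check easy to exercise in tests.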
Now call the Unstructured CLI or Python SDK. The source connector can be any supported source connector. This example uses the local source connector: