Process files in batches by using the Unstructured Ingest CLI
The Unstructured Ingest CLI enables you to use command-line scripts to send files in batches to Unstructured API services for processing, and to tell Unstructured API services where to deliver the processed data. Learn more.
Installation
One approach to get started quickly with the Unstructured Ingest CLI is to install Python and then run the following command:
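As a minimal sketch of that command, assuming the CLI is published on PyPI under the package name unstructured-ingest (an assumption, since this section does not state the package name), the install step could look like this. The snippet prints the command rather than running it, so that you can review it first.

```shell
# Assumption: the Ingest CLI is published on PyPI as "unstructured-ingest".
PACKAGE="unstructured-ingest"
INSTALL_CMD="python3 -m pip install $PACKAGE"

# Print the command for review; copy and run it once you have confirmed the package name.
echo "$INSTALL_CMD"
```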
This default installation option enables the ingestion of plain text files, HTML, XML, JSON, and emails, none of which require extra dependencies. It also enables you to specify local source and destination locations.
You might also need to install additional dependencies, depending on your needs. Learn more.
For additional installation options, see Unstructured Ingest CLI in the Ingest section.
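Additional dependencies are commonly installed as pip extras. The following sketch builds such a requirement string, assuming the package name unstructured-ingest and extras named after the s3 and azure connectors. Both the package name and the extras names are assumptions here, so check the installation options referenced above before relying on them.

```shell
# Assumptions: the package name and the extras names ("s3", "azure") are
# illustrative; confirm the available extras in the installation docs.
PACKAGE="unstructured-ingest"
EXTRAS="s3,azure"

# pip's extras syntax is package[extra1,extra2]; quote it so the shell
# does not treat the square brackets as a glob pattern.
REQUIREMENT="${PACKAGE}[${EXTRAS}]"

# Print the command for review instead of running it.
echo "python3 -m pip install \"$REQUIREMENT\""
```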
To migrate from an earlier installation that used pip install unstructured, see the migration guide.
Usage
To call the Unstructured Ingest CLI, follow this calling pattern, where:
<source> is the command name for one of the available source (input) connectors, such as local for a local source location, azure for an Azure Storage account source, s3 for an Amazon S3 bucket source, and so on.
<destination> is the command name for one of the available destination (output) connectors, such as local for a local destination, azure for an Azure Storage account destination, s3 for an Amazon S3 bucket destination, and so on.
<setting> is one or more command-line options for specifying how and where Unstructured API services will ingest the files from the <source>, or how and where to deliver the processed data to the <destination>.
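The calling pattern described above can be sketched as follows for a local source and a local destination. The executable name unstructured-ingest, the option names (--input-path, --partition-by-api, --api-key, --partition-endpoint, --output-dir), and the environment variable names are assumptions for illustration and are not confirmed by this section. The script only assembles the call unless you set RUN=1.

```shell
# Connector command names from the list above; both are "local" here.
SOURCE="local"
DESTINATION="local"

# Hypothetical directories and credentials; replace with your own values.
INPUT_DIR="${LOCAL_FILE_INPUT_DIR:-./input}"
OUTPUT_DIR="${LOCAL_FILE_OUTPUT_DIR:-./output}"
API_KEY="${UNSTRUCTURED_API_KEY:-}"
API_URL="${UNSTRUCTURED_API_URL:-}"

# Assemble the call: <source> and its settings first, then <destination>
# and its settings. The option names are assumptions; confirm them in the
# CLI script examples for your connectors.
set -- unstructured-ingest \
  "$SOURCE" \
    --input-path "$INPUT_DIR" \
    --partition-by-api \
    --api-key "$API_KEY" \
    --partition-endpoint "$API_URL" \
  "$DESTINATION" \
    --output-dir "$OUTPUT_DIR"

# Dry run by default, without echoing the API key; set RUN=1 to execute.
echo "Prepared $1: $SOURCE -> $DESTINATION ($# arguments)"
if [ "${RUN:-0}" = "1" ]; then
  "$@"
fi
```

Keeping the arguments in the positional parameter list and invoking them with "$@" preserves quoting, so directory paths that contain spaces are passed through intact.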
To learn how to use the Unstructured Ingest CLI to work with a specific source (input) and destination (output) location, see the CLI script examples for the source and destination connectors that are available for you to choose from.
See also the ingest configuration settings for command-line options that enable you to further control how batches are sent and processed.