Workflow
Update workflow
Update an existing workflow’s name, connectors, schedule, or workflow type.
PUT
Only long-lived workflows (workflows that persist until explicitly deleted) can be updated. Workflows created for the duration of a single on-demand job run cannot be updated.
Path parameters
The unique identifier of the workflow to update.
Body
Updated workflow name.
Updated source connector ID.
Updated destination connector ID.
Updated execution mode: auto or custom.
Repeating run schedule. If omitted, the workflow does not automatically run on a repeating schedule. Workflows with a local source cannot be set to run on a repeating schedule. Valid values and their cron equivalents:
| Value | cron | Description |
|---|---|---|
| every 15 minutes | */15 * * * * | Every 15 minutes. |
| every hour | 0 * * * * | At the first minute of every hour. |
| every 2 hours | 0 */2 * * * | At the first minute of every second hour. |
| every 4 hours | 0 */4 * * * | At the first minute of every fourth hour. |
| every 6 hours | 0 */6 * * * | At the first minute of every sixth hour. |
| every 8 hours | 0 */8 * * * | At the first minute of every eighth hour. |
| every 10 hours | 0 */10 * * * | At the first minute of every tenth hour. |
| every 12 hours | 0 */12 * * * | At the first minute of every twelfth hour. |
| daily | 0 0 * * * | At the first minute of every day. |
| weekly | 0 0 * * 0 | At the first minute of every Sunday. |
| monthly | 0 0 1 * * | At the first minute of the first day of every month. |
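
The schedule table above can be mirrored client-side to validate a value before sending an update. The following is a minimal sketch, not part of any Unstructured SDK: the names SCHEDULE_TO_CRON and cron_for_schedule are hypothetical, and only the values and cron strings are taken from the table.

```python
from typing import Optional

# Schedule values and their cron equivalents, copied from the table above.
SCHEDULE_TO_CRON = {
    "every 15 minutes": "*/15 * * * *",
    "every hour": "0 * * * *",
    "every 2 hours": "0 */2 * * *",
    "every 4 hours": "0 */4 * * *",
    "every 6 hours": "0 */6 * * *",
    "every 8 hours": "0 */8 * * *",
    "every 10 hours": "0 */10 * * *",
    "every 12 hours": "0 */12 * * *",
    "daily": "0 0 * * *",
    "weekly": "0 0 * * 0",
    "monthly": "0 0 1 * *",
}


def cron_for_schedule(schedule: Optional[str]) -> Optional[str]:
    """Return the cron equivalent of a schedule value.

    None stands for an omitted schedule (no repeating run) and maps to None.
    Any value not listed in the table raises ValueError.
    """
    if schedule is None:
        return None
    try:
        return SCHEDULE_TO_CRON[schedule]
    except KeyError:
        raise ValueError(f"Unsupported schedule value: {schedule!r}") from None
```

A check like this only catches values missing from the table; it cannot detect the local-source restriction, which depends on the workflow's source connector.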
Updated processing pipeline stages. Each node requires id (string, UUID) and node_type (string), and supports optional node_subtype (string), config (object), and params (object). For more information on workflow nodes, see Workflow nodes.
Default: false. If true, reprocesses all documents in the source location on every run. If false, the workflow excludes from future processing any files Unstructured determines are unchanged since the last time the workflow ran.

Unstructured determines if a document has changed based on the document version. For each workflow, Unstructured maintains a record of documents (and their versions, if present) processed by that workflow. Each document record consists of:

- A record_id derived from the document name and path.
- A record_version derived from either the document ETag (if the source provider generates one) or the source provider’s native version identifier.

When reprocess_all is set to false for a source connector that supports reprocess_all, Unstructured uses this list of records to determine whether or not to process each document:

- If the record_id does not exist in the workflow records, Unstructured processes the document.
- If the record_id exists, but the record_version has changed, or there is no record_version, Unstructured processes the document.

The following table lists the record_id and record_version combinations, and the action Unstructured takes in each case:

| record_id | record_version | Action |
|---|---|---|
| Exists | Unchanged | Do not process file |
| Exists | Changed | Process file |
| Exists | (none) | Process file |
| New | (Does not apply) | Process file |

Renaming a document results in a new record_id; Unstructured will then reprocess the renamed document when the workflow runs.

The following table lists the source connectors that support the reprocess_all setting. The record_version base column specifies the versioning information Unstructured uses to generate the corresponding record version for each processed document. Source connectors that do not support reprocess_all reprocess every document in the source location each time the workflow runs.

| Connector | record_version base |
|---|---|
| Amazon S3 | ETag |
| Azure Blob Storage | ETag |
| Box | Provider version ID |
| Dropbox | Provider version ID |
| Elasticsearch | Provider version ID |
| Google Cloud Storage | ETag |
| Google Drive | Provider version ID |
| Microsoft OneDrive | Provider version ID |
| Microsoft SharePoint | Provider version ID |

Additional considerations to take into account when setting reprocess_all to false:

- Unstructured only adds document records for documents that it successfully processes. Documents that failed to process will be reprocessed the next time the workflow is run.
- Because S3 ETags are content-based, changing the metadata on an S3 object will not result in it being reprocessed.
- For source providers that support the S3 protocol, be aware that deleting an object and then reuploading it to the source location will maintain the same record_id, but may result in a different record_version being generated, especially with multipart uploads. In that case, Unstructured reprocesses the document.
- For source providers that offer Key Management Services (KMS), be aware that server-side encryption can change document ETags. This results in the record_version of a document changing, and Unstructured reprocessing the document.
- If you clone or recreate a source connector, the resulting connector does not include the document processing history of the previous connector.
- Changing a workflow’s configuration does not automatically result in Unstructured reprocessing all documents. For example, changing chunker, embedder, enrichment, or partitioner settings may not result in reprocessing all documents. To reprocess all documents using new workflow settings, set reprocess_all to true for at least the next workflow run.
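
The record_id and record_version rules described above can be expressed as a small decision function. This is only an illustrative sketch of the documented behavior, not Unstructured's actual implementation: the function name should_process and the representation of the workflow's records as a dictionary mapping record_id to the last stored record_version are assumptions made for the example.

```python
from typing import Dict, Optional


def should_process(
    records: Dict[str, Optional[str]],
    record_id: str,
    record_version: Optional[str],
    reprocess_all: bool = False,
) -> bool:
    """Return True if a document would be processed on this run.

    `records` is a hypothetical stand-in for the workflow's stored document
    records: record_id -> last processed record_version (None if absent).
    """
    if reprocess_all:
        # reprocess_all = true: every document is processed on every run.
        return True
    if record_id not in records:
        # New record_id (including renamed documents): process.
        return True
    if record_version is None:
        # record_id exists but there is no record_version: process.
        return True
    # record_id exists: process only if the record_version has changed.
    return records[record_id] != record_version


# Example records after a previous successful run (hypothetical values).
previous = {"docs/report.pdf": "etag-1"}
unchanged = should_process(previous, "docs/report.pdf", "etag-1")
changed = should_process(previous, "docs/report.pdf", "etag-2")
renamed = should_process(previous, "docs/report-final.pdf", "etag-1")
```

Here unchanged is False, while changed and renamed are both True, matching the rows of the combinations table. Documents that failed to process never enter the records, so they fall under the new record_id case on the next run.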
Response
Unique identifier for the workflow.
Workflow name.
Workflow type: custom or auto.
Workflow state: active, inactive, or paused.
ISO 8601 timestamp when the workflow was created.
Source connector ID.
Destination connector ID.
Repeating run schedule.
Workflow processing pipeline nodes.
ISO 8601 timestamp when the workflow was last updated.
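
As a rough illustration of how an update request might be assembled, the sketch below builds (but does not send) a PUT request using only the Python standard library. Several details are assumptions that this page does not state: the URL path /workflows/{workflow_id}, the unstructured-api-key header name, and every body field name except reprocess_all. The helper build_update_request and the example.invalid base URL are hypothetical; check the API reference for the actual values before relying on any of them.

```python
import json
import urllib.request

# Assumption: body field names. Only reprocess_all is named on this page.
ALLOWED_FIELDS = {
    "name",
    "source_id",
    "destination_id",
    "workflow_type",
    "schedule",
    "workflow_nodes",
    "reprocess_all",
}


def build_update_request(base_url, workflow_id, api_key, **fields):
    """Build a PUT request that updates the given workflow.

    Only the fields passed as keyword arguments are included in the body.
    """
    unknown = set(fields) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"Unknown fields: {sorted(unknown)}")
    return urllib.request.Request(
        # Assumption: the workflow ID is the last path segment.
        url=f"{base_url.rstrip('/')}/workflows/{workflow_id}",
        data=json.dumps(fields).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            # Assumption: name of the API key header.
            "unstructured-api-key": api_key,
        },
    )


# Placeholder base URL, workflow ID, and key; none of these are real values.
request = build_update_request(
    "https://example.invalid/api/v1",
    "00000000-0000-0000-0000-000000000000",
    "YOUR_API_KEY",
    name="nightly-ingest",
    schedule="daily",
    reprocess_all=True,
)
```

Passing the result to urllib.request.urlopen would send the request; the JSON response could then be parsed with json.loads to read the fields listed under Response.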

