Install and run Docker
Pull the Unstructured Docker image
The image to pull is downloads.unstructured.io/unstructured-io/unstructured:latest.
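A minimal sketch of this step, assuming the standard docker pull command and a POSIX shell. The IMAGE variable name is illustrative, and the docker info check is only a convenience for confirming that the Docker daemon is reachable before pulling.

```shell
# Image reference quoted in the step above.
IMAGE="downloads.unstructured.io/unstructured-io/unstructured:latest"

# Pull only if the Docker daemon answers; otherwise print a hint.
if docker info >/dev/null 2>&1; then
  docker pull "$IMAGE"
else
  echo "Docker does not appear to be running; start it and retry." >&2
fi
```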
Create and run a container from the image
Give the container a name, such as unstructured.
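One way to do this, sketched under the assumption that the image's default command keeps running once it has a pseudo-terminal. The flags shown (-d, -t, --name) are standard docker run options; the variable names are illustrative.

```shell
IMAGE="downloads.unstructured.io/unstructured-io/unstructured:latest"
CONTAINER_NAME="unstructured"

if docker info >/dev/null 2>&1; then
  # -d runs the container in the background; -t allocates a pseudo-terminal.
  docker run -dt --name "$CONTAINER_NAME" "$IMAGE"
  # List the container to confirm that it was created.
  docker ps --filter "name=$CONTAINER_NAME"
else
  echo "Docker does not appear to be running; start it and retry." >&2
fi
```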
Interact with the Unstructured open source library by running code inside the container
Open a shell session in the container, using the name of your container, such as unstructured, in place of <container-name>.
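A hedged sketch of opening that session with docker exec, assuming the image provides bash. The running-state check through docker inspect is an added convenience, not something this step requires.

```shell
CONTAINER_NAME="unstructured"

# Only attach if the container reports that it is running.
if [ "$(docker inspect -f '{{.State.Running}}' "$CONTAINER_NAME" 2>/dev/null)" = "true" ]; then
  # -i keeps stdin open and -t allocates a terminal for the interactive shell.
  docker exec -it "$CONTAINER_NAME" bash
else
  echo "Container $CONTAINER_NAME is not running." >&2
fi
```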
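Start Python in the container and, at the >>> prompt, run code that processes the file named layout-parser-paper.pdf in the container's /app/example-docs/pdf directory. The processed data is written as a JSON file named layout-parser-paper-output.json in that same directory.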
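The sketch below is a non-interactive variation on this step: it sends a short Python script to the container from the host instead of typing at the >>> prompt. It assumes python3 is on the container's PATH and uses partition_pdf and elements_to_json from the Unstructured library; it is an illustrative equivalent, not necessarily the exact commands this step originally showed.

```shell
CONTAINER_NAME="unstructured"
INPUT_PDF="/app/example-docs/pdf/layout-parser-paper.pdf"
OUTPUT_JSON="/app/example-docs/pdf/layout-parser-paper-output.json"

if [ "$(docker inspect -f '{{.State.Running}}' "$CONTAINER_NAME" 2>/dev/null)" = "true" ]; then
  # "python3 -" reads the script from stdin; the two paths arrive in sys.argv.
  docker exec -i "$CONTAINER_NAME" python3 - "$INPUT_PDF" "$OUTPUT_JSON" <<'PY'
import sys

from unstructured.partition.pdf import partition_pdf
from unstructured.staging.base import elements_to_json

input_pdf, output_json = sys.argv[1], sys.argv[2]
elements = partition_pdf(filename=input_pdf)      # parse the PDF into elements
elements_to_json(elements, filename=output_json)  # write the elements as JSON
PY
else
  echo "Container $CONTAINER_NAME is not running." >&2
fi
```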
To exit the session, press Ctrl+D.
Interact with the Unstructured open source library by running code outside the container
Create and run a container that mounts a directory from your machine, replacing the following placeholders:
<host-path> with the path to the directory containing your code, for example /Users/<username>/my_example_code/.
<container-path> with the path to some directory within the container to mount <host-path> into, for example /app/my_example_code/. If <container-path> does not already exist, it is created at the same time that the container is created.
<container-name> with some name for your container, such as unstructured_mount.
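A sketch of such a command, assuming a bind mount through docker run's --mount option. The variable names and the $HOME-based host path are illustrative stand-ins for the placeholders above. The directory check is there because a bind mount created with --mount expects the host directory to exist already.

```shell
IMAGE="downloads.unstructured.io/unstructured-io/unstructured:latest"
CONTAINER_NAME="unstructured_mount"
HOST_PATH="$HOME/my_example_code"      # stands in for <host-path>
CONTAINER_PATH="/app/my_example_code"  # stands in for <container-path>

if [ -d "$HOST_PATH" ] && docker info >/dev/null 2>&1; then
  # Bind-mount the host directory into the container at CONTAINER_PATH.
  docker run -dt --name "$CONTAINER_NAME" \
    --mount "type=bind,source=$HOST_PATH,target=$CONTAINER_PATH" \
    "$IMAGE"
else
  echo "Check that $HOST_PATH exists and that Docker is running." >&2
fi
```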
Open a shell session in the container, using the name of your container, such as unstructured_mount, in place of <container-name>.
Add <container-path> to the PYTHONPATH environment variable within the container by running the following commands, replacing <container-path> with the path to the target directory within the container:
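A minimal sketch of that change, meant to be run in the container's shell session. CONTAINER_PATH is an illustrative variable standing in for <container-path>. The export affects only the current session.

```shell
CONTAINER_PATH="/app/my_example_code"  # stands in for <container-path>

# Append the directory, keeping any entries already in PYTHONPATH.
export PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$CONTAINER_PATH"

# Print the result to confirm the change.
echo "$PYTHONPATH"
```

The ${PYTHONPATH:+...} expansion avoids a leading colon when the variable starts out empty, since Python treats an empty PYTHONPATH entry as the current directory.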
Your code in <host-path> is now available within the container at <container-path>.
For example, if you have a file named main.py in <host-path> that contains the four commands following >>> from the previous step, you can run it as follows, replacing <container-path> with the path to the target directory within the container:
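A hedged sketch of running the file from the container's shell session, assuming python3 is the interpreter on the container's PATH. The file check is an added convenience that reports a missing mount instead of a Python error.

```shell
CONTAINER_PATH="/app/my_example_code"  # stands in for <container-path>
SCRIPT="$CONTAINER_PATH/main.py"

if [ -f "$SCRIPT" ]; then
  python3 "$SCRIPT"
else
  echo "No main.py found at $SCRIPT; check the mount." >&2
fi
```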
To exit the session, press Ctrl+D.
Stop running the container
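A minimal sketch using the standard docker stop command, run from the host. The container name is illustrative; use whichever name you gave your container, such as unstructured or unstructured_mount.

```shell
CONTAINER_NAME="unstructured"

# Stop the container only if it reports that it is running.
if [ "$(docker inspect -f '{{.State.Running}}' "$CONTAINER_NAME" 2>/dev/null)" = "true" ]; then
  docker stop "$CONTAINER_NAME"
else
  echo "Container $CONTAINER_NAME is not running." >&2
fi
```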