First time creating a connector? Read this first.

Requirements

You will need the following information to configure the connector. To find it:
  1. Log into your account.
  2. Choose the Navigator tile. The IBM Navigator launches in a separate browser window, and provides a view of your object stores and content. You can use the IBM Navigator views to find the information necessary to create a connection to Unstructured.
    For the URL of your IBM FileNet server:
    • The server URL displays in the browser address bar. You only need the base URL that specifies the company and domain. For example, https://<company-name>.automationcloud.ibm.com.
    For object store names and folder paths:
    • Select the folder in the left pane. The full folder path is displayed at the top of the main detail pane, in the following format: <object-store>/<folder>/etc.
    For the document class:
    • Right-click the document and select Properties.
    For the account username:
    • Right-click the profile icon on the upper right in the top menu.
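
The base URL described in the steps above can be derived from whatever appears in the browser address bar. The following is a minimal sketch using only the Python standard library. The helper name `base_url` and the sample path and query string are hypothetical and are used here only for illustration.

```python
from urllib.parse import urlsplit


def base_url(address_bar_url: str) -> str:
    """Return only the scheme and host portion of a full browser URL."""
    parts = urlsplit(address_bar_url)
    # Reject relative or malformed input rather than guessing a host.
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"Not an absolute URL: {address_bar_url!r}")
    return f"{parts.scheme}://{parts.netloc}"


if __name__ == "__main__":
    # The path and query below are made up; only the host pattern comes from this page.
    print(base_url("https://example-co.automationcloud.ibm.com/navigator/?desktop=demo"))
```

The result is the kind of value expected for the connector's server URL setting; anything after the host is discarded.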

Document permissions metadata

The source connector outputs any permissions information that it can find in the source location about the processed source files, and associates that information with each corresponding element that is generated. This permissions information is output into the permissions_data field, which is within the data_source field under the element’s metadata field (metadata.data_source.permissions_data). This information lists the users or groups, if any, that have permissions to read, update, or delete the element’s associated source document. It also lists any users or groups that are explicitly denied those permissions. For more information on how IBM FileNet uses Access Control Lists (ACLs) and Access Control Entries (ACEs) to manage permissions, see About access rights in the IBM FileNet Platform documentation.
The permissions metadata that Unstructured outputs should not be used for runtime authorization or access control enforcement. Unstructured outputs document permissions metadata that is accurate only at the point in time when Unstructured ingested the corresponding document. Because this metadata is a point-in-time copy of the permissions in the source location, the metadata sent to your destination location is not guaranteed to match the current permissions in the source location. Also, be aware that Unstructured updates permissions metadata for a document only when the document’s content has changed. This is because Unstructured performs incremental processing only when documents’ content has changed, not when only their permissions have changed. Whenever Unstructured performs incremental processing of documents for a workflow (in other words, if Reprocess All Files is turned off or set to false for the workflow), that workflow will not output metadata for any document permissions that have been added, changed, or removed since the previous workflow run, unless the corresponding documents’ content has also changed since the previous workflow run.
Always take into account the deny_users and deny_groups fields when determining the effective permissions a user or group has. Not doing so may result in over-granting permissions.
Unstructured does not resolve groups into users when outputting permissions metadata.
Unstructured derives permissions metadata from the document’s ACL, returned inline with the document’s metadata via the IBM FileNet GraphQL API. The document ACL returned by IBM FileNet includes the full effective ACL, including inherited permissions. Because of this, Unstructured does not further query for inherited permissions. For more information, see ACE source: Default, Direct, Inherited, Template in the IBM FileNet Platform documentation.
Unstructured does not include the following permission values in the permissions_data field:
  • MARKING: Marking is a security classification system layered on top of standard ACL permissions. For more information, see Markings overview in the IBM FileNet Platform documentation.
  • PROXY: Permissions granted to a principal to act on behalf of another principal.
Unstructured writes a maximum of 1000 permission entries (ALLOW and DENY) per file.

Permissions evaluation

IBM FileNet supports explicit DENY ACEs and evaluates them with DENY-wins semantics. This means that if a user has both an ALLOW and a DENY ACE for the same action, whether directly or through group membership, the DENY takes precedence. For more information, see Allow or Deny and order of evaluation in the IBM FileNet Platform documentation.
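
A downstream filter that mirrors the DENY-wins rule described above might look like the following minimal sketch. It is not Unstructured's or IBM FileNet's implementation. The function name `is_allowed` is hypothetical, identifiers are compared as exact strings (an assumption; your directory may treat names case-insensitively), and the caller must supply the user's group memberships because Unstructured does not resolve groups into users.

```python
def is_allowed(permissions_data, action, user, user_groups):
    """Return True only if `user` is allowed `action` and no DENY entry applies.

    `permissions_data` is the list found at metadata.data_source.permissions_data.
    `user_groups` is supplied by the caller, since groups are not expanded in the output.
    """
    groups = set(user_groups)
    allowed = False
    for entry in permissions_data:
        perms = entry.get(action)
        if not perms:
            continue
        # DENY wins: any matching deny entry ends the check immediately.
        if user in perms.get("deny_users", []) or groups & set(perms.get("deny_groups", [])):
            return False
        if user in perms.get("users", []) or groups & set(perms.get("groups", [])):
            allowed = True
    return allowed


if __name__ == "__main__":
    sample = [
        {"read": {
            "users": ["uid=alice,cn=users,O=IBM,C=US"],
            "groups": ["cn=Finance,cn=groups,O=IBM,C=US"],
            "deny_users": ["uid=contractor,cn=users,O=IBM,C=US"],
            "deny_groups": [],
        }},
    ]
    # The contractor is in Finance (ALLOW) but is also denied directly, so DENY wins.
    print(is_allowed(sample, "read", "uid=contractor,cn=users,O=IBM,C=US",
                     ["cn=Finance,cn=groups,O=IBM,C=US"]))
```

Because the metadata is a point-in-time copy, a check like this is suited to search-time filtering hints rather than access control enforcement.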

Identifier formats

The connector takes the granteeName value that IBM FileNet returns and writes it directly into users, groups, deny_users, or deny_groups without any modification. The format of that identifier depends entirely on how your specific IBM FileNet instance is connected to its directory service.
  • IBM FileNet SaaS (IBM Cloud Identity): LDAP distinguished names are used. For example:
    • Users: uid=alice.smith,cn=users,O=IBM,C=US
    • Groups: cn=Finance,cn=groups,O=IBM,C=US
  • On-premises with Active Directory: Several formats are possible. For example:
    • A CN=…/DC=… distinguished name: CN=Alice Smith,OU=Staff,DC=contoso,DC=com
    • A DOMAIN\user short name: CONTOSO\alice.smith
    • A Windows SID: S-1-5-21-3623811015-3361044348-30300820-1013
    • A userPrincipalName: alice.smith@contoso.com
If you’re building a downstream system that filters on these values, you will need to verify which format your instance uses before writing that logic. We recommend running a sample query against your instance to check.
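
One way to check which format an instance uses is to classify a sample of the identifiers found in the output. The sketch below is a rough heuristic based only on the example formats listed above. The function name `identifier_format` and the returned labels are hypothetical, and real directories may produce values that this does not recognize.

```python
import re

# Windows SIDs look like S-1-<authority>-<sub-authority>-...
_SID = re.compile(r"^S-1-\d+(-\d+)+$")
# Distinguished names start with an attribute type such as uid=, cn=, or CN=.
_DN_START = re.compile(r"^[A-Za-z]+=.+")


def identifier_format(grantee: str) -> str:
    """Guess the format of a grantee identifier using simple pattern checks."""
    if _SID.match(grantee):
        return "sid"
    if _DN_START.match(grantee):
        return "distinguished_name"
    if "\\" in grantee:
        return "domain_short_name"
    if "@" in grantee:
        return "user_principal_name"
    return "unknown"


if __name__ == "__main__":
    for value in [
        "uid=alice.smith,cn=users,O=IBM,C=US",
        "CONTOSO\\alice.smith",
        "S-1-5-21-3623811015-3361044348-30300820-1013",
        "alice.smith@contoso.com",
    ]:
        print(value, "->", identifier_format(value))
```

The distinguished-name check runs before the backslash check because distinguished names can contain escaped characters.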

Metadata output example

The following example shows what the output looks like. Ellipses indicate content that has been omitted from this example for brevity. This example uses IBM FileNet SaaS (IBM Cloud Identity) with LDAP distinguished names.
[
    {
        "...": "...",
        "metadata": {
            "...": "...",
            "data_source": {
                "...": "...",
                "permissions_data": [
                    {
                        "read": {
                            "users": ["uid=alice,cn=users,O=IBM,C=US"],
                            "groups": ["cn=Finance,cn=groups,O=IBM,C=US"],
                            "deny_users": ["uid=contractor,cn=users,O=IBM,C=US"],
                            "deny_groups": []
                        }
                    },
                    {
                        "update": {
                            "users": ["uid=alice,cn=users,O=IBM,C=US"],
                            "groups": [],
                            "deny_users": [],
                            "deny_groups": []
                        }
                    },
                    {
                        "delete": {
                            "users": [],
                            "groups": [],
                            "deny_users": [],
                            "deny_groups": []
                        }
                    }
                ],
                "...": "..."
            }
        }
    }
]
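
Because permissions_data is a list of single-key objects, downstream code may find it easier to work with a mapping keyed by action. The following minimal sketch reshapes one element's output that way. The helper name `permissions_by_action` is hypothetical, and the sketch assumes each entry has the shape shown in the example above.

```python
FIELDS = ("users", "groups", "deny_users", "deny_groups")


def permissions_by_action(element: dict) -> dict:
    """Map each action (for example "read") to its merged permission lists."""
    entries = (
        element.get("metadata", {})
        .get("data_source", {})
        .get("permissions_data")
        or []
    )
    merged = {}
    for entry in entries:
        for action, perms in entry.items():
            # Start each action with empty lists so missing fields are safe to read.
            slot = merged.setdefault(action, {field: [] for field in FIELDS})
            for field in FIELDS:
                slot[field].extend(perms.get(field, []))
    return merged


if __name__ == "__main__":
    element = {
        "metadata": {
            "data_source": {
                "permissions_data": [
                    {"read": {
                        "users": ["uid=alice,cn=users,O=IBM,C=US"],
                        "groups": ["cn=Finance,cn=groups,O=IBM,C=US"],
                        "deny_users": ["uid=contractor,cn=users,O=IBM,C=US"],
                        "deny_groups": [],
                    }},
                ]
            }
        }
    }
    print(permissions_by_action(element)["read"]["deny_users"])
```

Elements with no permissions information yield an empty mapping rather than raising an error.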

Examples

To create an IBM FileNet source connector, see the following examples. For more information on working with source connectors using the Unstructured API, see Source endpoints.
import os

from unstructured_client import UnstructuredClient
from unstructured_client.models.operations import CreateSourceRequest
from unstructured_client.models.shared import CreateSourceConnector

with UnstructuredClient(api_key_auth=os.getenv("UNSTRUCTURED_API_KEY")) as client:
    response = client.sources.create_source(
        request=CreateSourceRequest(
            create_source_connector=CreateSourceConnector(
                name="<name>",
                type="filenet",
                config={
                    "server_url": "<server-url>",
                    "object_store": "<object-store>",
                    "folder_path": "<folder-path>",
                    "document_class": "<document-class>",
                    "recursive": False,  # Placeholder: set to True to include subfolders.
                    "username": "<username>",
                    "password": "<password>"
                }
            )
        )
    )

    print(response.source_connector_information)

Configuration settings

Replace the preceding placeholders as follows:
  • name (string, required): A unique name for this connector.
  • server_url (string, required): The base URL of your Content Platform Engine, containing both the IBM domain and your company’s subdomain. For example, https://<company-name>.automationcloud.ibm.com.
  • object_store (string, required): The name of the object store to connect to within the Content Platform Engine.
  • folder_path (string, required): Source connector only. The path of the folder within the object store to use as the source.
  • target_folder (string, required): Destination connector only. The path of the folder within the object store to use as the upload destination.
  • document_class (string, default: "Document"): The class of documents to include.
  • recursive (boolean, default: false): Source connector only. Set to true to include documents contained in any subfolders.
  • username (string, required): The username of the IBM Cloud Pak for Business Automation as a Service account to use.
  • password (string, required): The password for the corresponding username.
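
To catch missing settings before calling the API, the source connector settings listed above can be assembled and checked locally. The following is a minimal sketch. The helper name `build_filenet_source_config` is hypothetical and is not part of the Unstructured SDK, and the validation rules are assumptions drawn only from the required and default markers in this section.

```python
def build_filenet_source_config(
    *,
    server_url: str,
    object_store: str,
    folder_path: str,
    username: str,
    password: str,
    document_class: str = "Document",
    recursive: bool = False,
) -> dict:
    """Return a config dict for a source connector, applying the documented defaults."""
    required = {
        "server_url": server_url,
        "object_store": object_store,
        "folder_path": folder_path,
        "username": username,
        "password": password,
    }
    # Treat empty or whitespace-only strings as missing.
    missing = [key for key, value in required.items()
               if not isinstance(value, str) or not value.strip()]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    if not isinstance(recursive, bool):
        raise TypeError("recursive must be a bool")
    return {
        "server_url": server_url,
        "object_store": object_store,
        "folder_path": folder_path,
        "document_class": document_class,
        "recursive": recursive,
        "username": username,
        "password": password,
    }
```

The returned dictionary has the same keys as the config argument in the SDK example earlier in this section, so it could be passed there in place of the literal dictionary.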