Transformation Examples

This page provides practical examples of transforming documents using the Graphora client library.

Basic Document Transformation

The following example demonstrates how to transform a set of documents using a registered ontology:

from graphora import GraphoraClient

# Initialize the client with user ID (required)
client = GraphoraClient(
    base_url="https://api.graphora.io",
    user_id="your-user-id",  # Required for all API calls
    api_key="your-api-key"
)

# Use a previously registered ontology
ontology_id = "ont_123456789"

# Prepare a list of document files to transform
document_paths = [
    "document1.pdf",
    "document2.docx",
    "document3.txt"
]

# Transform the documents
response = client.transform(ontology_id, document_paths)
print(f"Transformation started with ID: {response.id}")

# Check the status of the transformation
status = client.get_transform_status(response.id)
print(f"Transformation status: {status.status}")
print(f"Progress: {status.progress:.2%}")
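Before submitting, it can help to confirm that every path in the list points to a real file. The following is a minimal sketch using only the standard library; `find_missing_paths` is a hypothetical helper, not part of the Graphora client, and how the client itself reacts to a missing file is not covered here.

```python
from pathlib import Path


def find_missing_paths(paths):
    """Return the entries of `paths` that do not point to an existing file."""
    return [p for p in paths if not Path(p).is_file()]


# Same file names as in the example above; they are placeholders.
document_paths = ["document1.pdf", "document2.docx", "document3.txt"]

missing = find_missing_paths(document_paths)
if missing:
    print(f"Not submitting, files not found: {missing}")
else:
    print("All documents found, ready to call client.transform(...)")
```

Running the check first keeps a typo in a file name from surfacing only after the request has been sent.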

Transforming Documents with Metadata

You can also provide metadata for each document to enrich the transformation process:

from graphora import GraphoraClient, DocumentMetadata, DocumentType

# Initialize the client with user ID (required)
client = GraphoraClient(
    base_url="https://api.graphora.io",
    user_id="your-user-id",  # Required for all API calls
    api_key="your-api-key"
)

# Use a previously registered ontology
ontology_id = "ont_123456789"

# Prepare document paths and metadata
document_paths = [
    "document1.pdf",
    "document2.docx"
]

metadata = [
    DocumentMetadata(
        source="Annual Report 2024",
        document_type=DocumentType.PDF,
        tags=["finance", "annual", "report"],
        custom_metadata={
            "department": "Finance",
            "confidentiality": "Internal"
        }
    ),
    DocumentMetadata(
        source="Product Specifications",
        document_type=DocumentType.DOCX,
        tags=["product", "specifications", "technical"],
        custom_metadata={
            "department": "Engineering",
            "product_id": "PRD-2024-001"
        }
    )
]

# Transform the documents with metadata
response = client.transform(ontology_id, document_paths, metadata=metadata)
print(f"Transformation started with ID: {response.id}")
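The example above passes two paths and two metadata entries, which suggests that entries are matched to documents by position. Assuming that is the case, a small guard can catch a length mismatch before the call. `pair_documents_with_metadata` is a hypothetical helper, and plain dictionaries stand in for `DocumentMetadata` objects so the sketch runs without the library.

```python
def pair_documents_with_metadata(paths, metadata):
    """Pair each path with its metadata entry, assuming positional alignment."""
    if len(paths) != len(metadata):
        raise ValueError(
            f"Expected {len(paths)} metadata entries, got {len(metadata)}"
        )
    return list(zip(paths, metadata))


# Plain dicts stand in for DocumentMetadata objects in this sketch.
pairs = pair_documents_with_metadata(
    ["document1.pdf", "document2.docx"],
    [{"source": "Annual Report 2024"}, {"source": "Product Specifications"}],
)
for path, meta in pairs:
    print(f"{path} -> {meta['source']}")
```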

Monitoring Transformation Progress

For long-running transformations, you can poll the status until it reaches COMPLETED or FAILED:

from graphora import GraphoraClient
import time

# Initialize the client with user ID (required)
client = GraphoraClient(
    base_url="https://api.graphora.io",
    user_id="your-user-id",  # Required for all API calls
    api_key="your-api-key"
)

# Start a transformation using a previously registered ontology
ontology_id = "ont_123456789"
document_paths = ["document1.pdf", "document2.docx"]
transform_response = client.transform(ontology_id, document_paths)
transform_id = transform_response.id

# Monitor the transformation progress
completed = False
while not completed:
    status = client.get_transform_status(transform_id)
    print(f"Status: {status.status}, Progress: {status.progress:.2%}")
    
    if status.status in ["COMPLETED", "FAILED"]:
        completed = True
        if status.status == "COMPLETED":
            print("Transformation completed successfully!")
        else:
            print(f"Transformation failed: {status.error}")
    else:
        # Wait for 5 seconds before checking again
        time.sleep(5)

# If transformation was successful, you can proceed to merging
if status.status == "COMPLETED":
    print("Ready to merge the transformed data into a graph")
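The loop above polls indefinitely at a fixed five-second interval. One possible variation adds an overall deadline and a growing interval. The sketch below is written against a generic status callable rather than the Graphora client, so it runs on its own. `poll_until_done` is a hypothetical helper, and the `PENDING` and `RUNNING` values are made up for the demo; only `COMPLETED` and `FAILED` come from the example above.

```python
import time


def poll_until_done(get_status, interval=5.0, max_interval=60.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal status or the deadline passes.

    get_status is any zero-argument callable that returns a status string.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Still {status!r} after {timeout} seconds")
        sleep(interval)
        # Double the wait each round, capped, so long jobs are polled less often.
        interval = min(interval * 2, max_interval)


# Demo with a fake status source. With the real client, the callable could be
# lambda: client.get_transform_status(transform_id).status
fake_statuses = iter(["PENDING", "RUNNING", "COMPLETED"])
result = poll_until_done(lambda: next(fake_statuses), interval=0.01)
print(result)
```

Passing `sleep` and `clock` as parameters is a design choice that lets the helper be tested without real waiting.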

Using the Wait Helper Method

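Instead of writing the polling loop yourself, you can call the built-in wait_for_transform method, which waits until the transformation finishes or the timeout expires: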

from graphora import GraphoraClient

# Initialize the client with user ID (required)
client = GraphoraClient(
    base_url="https://api.graphora.io",
    user_id="your-user-id",  # Required for all API calls
    api_key="your-api-key"
)

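# Start a transformation using a previously registered ontology
ontology_id = "ont_123456789"
document_paths = ["document1.pdf", "document2.docx"]
transform_response = client.transform(ontology_id, document_paths)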
transform_id = transform_response.id

# Wait for completion (with timeout)
try:
    final_status = client.wait_for_transform(transform_id, timeout=300)  # 5 minutes
    print(f"Transformation completed with status: {final_status.status}")
except Exception as e:
    print(f"Transformation failed or timed out: {e}")

Handling Different Document Types

Graphora supports various document types. Here’s how to transform a mix of document types:

from graphora import GraphoraClient

# Initialize the client with user ID (required)
client = GraphoraClient(
    base_url="https://api.graphora.io",
    user_id="your-user-id",  # Required for all API calls
    api_key="your-api-key"
)

# Use a previously registered ontology
ontology_id = "ont_123456789"

# Prepare a list of different document types
documents = [
    "text_document.txt",      # Plain text
    "word_document.docx",     # Microsoft Word
    "pdf_document.pdf",       # PDF
    "data.csv",               # CSV
    "config.json",            # JSON
    "schema.yaml"             # YAML
]

# Transform the documents
response = client.transform(ontology_id, documents)
print(f"Transformation started with ID: {response.id}")
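When the input comes from a directory listing or user upload, it may be useful to separate files with a recognised extension from the rest before submitting. The sketch below uses only the extensions that appear in the example above; this set is an assumption drawn from that example, not an official list of supported formats, and `split_by_extension` is a hypothetical helper.

```python
from pathlib import Path

# Extensions taken from the example above; not an official list of supported formats.
EXAMPLE_EXTENSIONS = {".txt", ".docx", ".pdf", ".csv", ".json", ".yaml"}


def split_by_extension(paths, allowed=EXAMPLE_EXTENSIONS):
    """Split paths into (accepted, rejected) by lowercase file extension."""
    accepted, rejected = [], []
    for p in paths:
        if Path(p).suffix.lower() in allowed:
            accepted.append(p)
        else:
            rejected.append(p)
    return accepted, rejected


accepted, rejected = split_by_extension(
    ["report.PDF", "data.csv", "photo.png", "notes"]
)
print("accepted:", accepted)
print("rejected:", rejected)
```

The accepted list can then be passed to `client.transform` as in the example above, while the rejected list is logged or reported back to the user.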

These examples demonstrate the basic operations for transforming documents with the Graphora client library. For more detailed information about transformation concepts and options, see the Transformation Concepts page.
