Turn Messy Web Content Into Clean, Useful Data

Tired of copy-pasting web content? Frustrated with messy HTML and inconsistent formats? Ulfom helps you automatically collect, clean, and organize web content exactly how you need it. Perfect for developers and AI teams who need quality data without the headache.

Key Features

Structured Content Extraction

Convert web content into structured data using custom JSON schemas and AI processing.

Async Processing

Handle large-scale content processing with our efficient async task system.

Smart Crawling

Intelligent sitemap and recursive crawling with built-in rate limiting and error handling.

Use Cases

Markdown Processing

Convert web content into clean, readable Markdown. Perfect for:

  • Content management systems
  • Documentation generators
  • AI training data preparation
  • Knowledge base creation

AI Content Analysis

Extract structured information using custom schemas. Ideal for:

  • Market research automation
  • Competitive analysis
  • Content categorization
  • Automated data extraction

Intelligent Web Crawling

Smart, efficient web content collection. Perfect for:

  • Content aggregation
  • Website migration
  • Documentation archival
  • SEO analysis

Integration Scenarios

Flexible integration options for various use cases:

  • LLM training data preparation
  • Knowledge base enrichment
  • Content recommendation systems
  • Research and analysis platforms

API Reference

Structured Content Processing

curl -X POST "https://www.ulfom.com/api/v1/task/structured_task" \
    -H "Content-Type: application/json"
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{
        "url": "https://example.com/article",
        "parameters": {
            "prompt": "Extract key points and conclusions",
            "json_schema": {
                "type": "object",
                "properties": {
                    "key_points": {"type": "array", "items": {"type": "string"}},
                    "conclusion": {"type": "string"}
                }
            }
        }
    }'
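
The same request can be sent from any HTTP client. The sketch below uses Python's requests library and assumes only what the curl example shows (endpoint, headers, request body); the exact shape of the JSON response, such as a task identifier, may differ in practice.

import requests

API_KEY = "YOUR_API_KEY"

# Same request as the curl example above, sent from Python.
response = requests.post(
    "https://www.ulfom.com/api/v1/task/structured_task",
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
    json={
        "url": "https://example.com/article",
        "parameters": {
            "prompt": "Extract key points and conclusions",
            "json_schema": {
                "type": "object",
                "properties": {
                    "key_points": {"type": "array", "items": {"type": "string"}},
                    "conclusion": {"type": "string"},
                },
            },
        },
    },
    timeout=30,
)
response.raise_for_status()
# Response body; the exact fields (e.g. a task id) depend on the API.
print(response.json())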

Sitemap Crawling

curl -X POST "https://www.ulfom.com/api/v1/task/sitemap_crawl" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -d '{
        "url": "https://example.com",
        "parameters": {
            "concurrent_requests": 5,
            "max_pages": 1000
        }
    }'
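
The same crawl can also be started through the Python client shown in the Code Examples section below. A minimal sketch, assuming the client accepts the service name "sitemap_crawl" (matching the endpoint above) and the same parameter names:

import asyncio
from ulfom import UlfomClient

async def crawl_site():
    client = UlfomClient(api_key="YOUR_API_KEY")

    # Assumes the service name mirrors the endpoint path above.
    task_id = await client.create_task(
        service="sitemap_crawl",
        url="https://example.com",
        parameters={
            "concurrent_requests": 5,
            "max_pages": 1000,
        },
    )

    # Wait for the crawl to finish and print the collected results.
    result = await client.wait_for_result(task_id)
    print(result)

asyncio.run(crawl_site())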

Code Examples

Python Integration

import asyncio
from ulfom import UlfomClient

async def process_content():
    client = UlfomClient(api_key="YOUR_API_KEY")
    
    # Create a structured processing task
    task_id = await client.create_task(
        service="structured_task",
        url="https://example.com/article",
        parameters={
            "prompt": "Extract key information",
            "json_schema": {...}
        }
    )
    
    # Wait for results
    result = await client.wait_for_result(task_id)
    print(result)

asyncio.run(process_content())
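
Because the client is async, several pages can be processed at once. A minimal sketch, assuming a single UlfomClient instance can safely issue multiple tasks concurrently; the URLs are placeholders:

import asyncio
from ulfom import UlfomClient

async def process_many(urls):
    client = UlfomClient(api_key="YOUR_API_KEY")

    async def process_one(url):
        # Queue a structured processing task for a single URL.
        task_id = await client.create_task(
            service="structured_task",
            url=url,
            parameters={
                "prompt": "Extract key information",
                "json_schema": {
                    "type": "object",
                    "properties": {
                        "key_points": {"type": "array", "items": {"type": "string"}},
                        "conclusion": {"type": "string"}
                    }
                }
            }
        )
        # Wait for the task to finish and return its result.
        return await client.wait_for_result(task_id)

    # Run the tasks concurrently and collect the results in order.
    return await asyncio.gather(*(process_one(u) for u in urls))

results = asyncio.run(process_many([
    "https://example.com/article-1",
    "https://example.com/article-2",
]))
print(results)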