Zerox
OCR & Document Extraction using vision models.
A dead simple way of OCR-ing a document for AI ingestion. Documents are, after all, a visual medium: weird layouts, tables, charts, and figures are exactly what vision models handle well. Zerox is available as both a Node and a Python package.

The Node.js SDK supports vision models from providers such as OpenAI, Azure OpenAI, Anthropic, AWS Bedrock, and Google Gemini. The maintainFormat option tries to return markdown in a consistent format by passing the output of the prior page in as additional context for the next page. This forces the requests to run sequentially, so it is considerably slower, but it is valuable when your documents contain a lot of tabular data or tables that cross page boundaries.

Zerox also supports structured data extraction: given a schema, it pulls specific fields out of a document in a structured format instead of returning the full markdown conversion. Use extractPerPage to extract data per page rather than from the whole document at once. Zerox supports a wide range of models across different providers.

The Python SDK likewise supports vision models from providers such as OpenAI, Azure OpenAI, Anthropic, and AWS Bedrock. The pyzerox.zerox function is an asynchronous API that performs OCR (Optical Character Recognition) to markdown using vision models; it processes PDF files and converts them into markdown format. Make sure to set up the environment variables for the model and the model provider before using this API, and refer to the LiteLLM documentation for setting up the environment and passing the correct model name.

This project is licensed under the MIT License.
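As a minimal sketch of the Python API described above (assuming `pip install py-zerox` and provider credentials such as OPENAI_API_KEY in the environment; the model name and result fields follow the pyzerox README but are not verified here):

```python
# Sketch of pyzerox's async OCR-to-markdown API with maintain_format enabled.
import asyncio

async def pdf_to_markdown(file_path: str) -> str:
    # Imported lazily so the sketch can be read without pyzerox installed.
    from pyzerox import zerox

    result = await zerox(
        file_path=file_path,
        model="gpt-4o-mini",   # any model name LiteLLM recognizes
        maintain_format=True,  # feed each page's output into the next page's prompt
    )
    # result.pages holds one markdown entry per processed page.
    return "\n\n".join(page.content for page in result.pages)

# asyncio.run(pdf_to_markdown("invoice.pdf"))
```

Because maintain_format passes prior-page output forward, pages are processed sequentially rather than concurrently, which is the slowdown the README warns about.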
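The schema passed for structured extraction is JSON-Schema-style. As a purely illustrative sketch (the field names below are hypothetical, not from the Zerox docs), an invoice-extraction schema might look like:

```python
# Hypothetical extraction schema for an invoice document; the schema and
# extractPerPage options themselves are documented on the Node SDK.
invoice_schema = {
    "type": "object",
    "properties": {
        "vendor": {"type": "string"},
        "invoice_date": {"type": "string"},
        "total": {"type": "number"},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
            },
        },
    },
}
```

With a schema like this, Zerox returns the matching fields instead of the full markdown conversion; with extractPerPage, it returns one such record per page.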
Ragie
Meet Ragie.
Powered by the most advanced RAG pipeline, Ragie uses context engineering to deliver fast, accurate, context-rich retrieval through structured chunking, multi-layered indexing, and LLM-aware optimizations, built for production-grade generative AI. Ragie is built for enterprise-scale workloads with multi-tenant architecture, SOC 2-compliant security, and seamless performance at any scale.

Built to handle any data you throw at it, Ragie's multimodal ingest pipeline processes text, PDFs, images, audio, video, tables, and more. It parses, enriches, and structures diverse content into a unified format ready for chunking, indexing, and retrieval. Ragie offers out-of-the-box features that accelerate your application development, built to meet the security, scale, and reliability requirements of production AI.

Seamless data ingest with built-in authentication and authorization
Ragie's fully managed connectors handle authentication and authorization to securely access data from popular data sources, freeing up precious engineering time and resources.

Automatic syncing keeps data up to date
Automatic syncing keeps your RAG pipeline current, ensuring your application delivers accurate and reliable information around the clock.

Growing library of native integrations
Purpose-built for AI applications, Ragie's growing list of native connectors allows seamless integration with the most popular data sources. Connect your data (or your customers') to your app, no matter where it lives. With Ragie Connect, your customers can securely connect and manage their own data directly from your application. For a white-label version, chat with sales.

Ragie is a fully managed RAG-as-a-Service designed for developers to streamline the ingestion, chunking, and multimodal indexing of structured and unstructured data. It offers simple APIs and SDKs, seamless integration with sources like Google Drive, Notion, and Confluence, and built-in capabilities like summary indexing, chunk reranking, flexible vector filtering, and hybrid semantic-keyword search. With agentic retrieval for multi-step reasoning and a context-aware MCP Server that enables intelligent tool use, Ragie helps your applications deliver state-of-the-art, agent-ready generative AI.

Building production applications using RAG can be very tedious. Developers must connect and sync multiple data sources, extract meaningful data from various file formats, implement evolving techniques for chunking and retrieval, build a scalable and resilient data processing pipeline, avoid hallucinations, and ensure content accuracy. Using open-source frameworks can be time-consuming and often results in brittle applications. Originally developed for Glue, Ragie solves this by providing a fully managed RAG-as-a-Service platform. Ragie is ideal for developers who want to build AI applications that leverage their own data for accurate and relevant outputs. Whether you're working on internal chatbots, enterprise SaaS p
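As a hedged sketch of what a retrieval call against Ragie's API might look like (the endpoint URL, payload fields, and response shape below are assumptions based on Ragie's public docs, not verified here):

```python
# Builds a POST request for a Ragie retrieval; assumed endpoint and payload.
import json
import urllib.request

API_URL = "https://api.ragie.ai/retrievals"  # assumed endpoint

def build_retrieval_request(query: str, api_key: str, top_k: int = 8) -> urllib.request.Request:
    payload = {"query": query, "top_k": top_k}  # assumed field names
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# resp = urllib.request.urlopen(build_retrieval_request("refund policy", API_KEY))
# chunks = json.loads(resp.read())["scored_chunks"]  # assumed response shape
```

In practice you would use Ragie's official SDKs rather than raw HTTP; this only illustrates the request/response shape of a managed retrieval service.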
Zerox pricing found: $50.10, $48.71, $48.71, $48.71, $9.74
Ragie pricing found: $100 / month, $500 / month, $500 / month, $0.02 / page, $0.02 / page