🔎 DocSeek

Search inside your documents by meaning, not just by words.

DocSeek lets you load a PDF, Word, HTML, Markdown or text file and find passages by what they mean — so you can locate the right paragraph even when you don't know the exact wording used.

A small AI model runs inside your browser to power the search. Nothing is ever uploaded.

Free  ·  Private  ·  Local AI  ·  No sign-up  ·  Works offline*

User Manual

Overview

DocSeek answers a very specific question: “Is what I'm looking for in this document, and where?” It reads your file, breaks it into short passages, and turns each passage into a numerical fingerprint (an embedding) using a compact AI model that runs entirely in your browser. When you type a query, it is fingerprinted the same way, and DocSeek ranks the passages by how close their meaning is to your query. Because it compares meaning rather than letters, it can surface a relevant paragraph even when it uses completely different words than you did. Everything — the file, the index, and your searches — stays on your device. No server, no cloud, no upload.
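The ranking step described above can be sketched in a few lines. This is only an illustration in Python, not DocSeek's actual browser code: it assumes closeness of meaning is measured with cosine similarity (a common choice for sentence embeddings, though the manual does not name the measure), and the function names and toy vectors are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs, top_k=3):
    """Return (passage_index, similarity) pairs, best match first."""
    scored = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy 3-dimensional "fingerprints"; all-MiniLM-L6-v2 produces 384 dimensions.
passages = [[1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0]]
query = [1.0, 0.0, 0.0]
ranking = rank_passages(query, passages)
```

Here the first passage points in the same direction as the query and ranks first, the second overlaps partly, and the third shares nothing with it.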

Reliable by design. DocSeek always shows you the actual passages from your document and highlights the most relevant sentence. It never rewrites or invents text, so you can trust what you read.

Getting Started

1. Load a document

DocSeek extracts the text and reports how many sections and passages it found.

2. Build the search index

Click ⚙️ Build search index. On first use, a ~25 MB AI model (all-MiniLM-L6-v2) is downloaded from a CDN and cached by your browser; after that it works offline. Indexing then fingerprints every passage, and a progress bar shows how far along it is. The index is cached locally, so re-opening the same file is instant.
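One way an index cache like this could recognise "the same file" is to key it by a hash of the file's content. The sketch below is an assumption, not DocSeek's documented behaviour: the key format, the `IndexCache` class, and the placeholder build function are all hypothetical, and a plain dictionary stands in for browser storage.

```python
import hashlib

def index_cache_key(file_bytes, passage_size, overlap):
    """Derive a cache key from the file's content and the chunking settings."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return f"{digest}:{passage_size}:{overlap}"

class IndexCache:
    """In-memory stand-in for a persistent store such as IndexedDB."""

    def __init__(self):
        self._store = {}
        self.builds = 0  # counts how often the expensive build step ran

    def get_or_build(self, file_bytes, passage_size, overlap, build_index):
        key = index_cache_key(file_bytes, passage_size, overlap)
        if key not in self._store:
            self._store[key] = build_index(file_bytes)
            self.builds += 1
        return self._store[key]

cache = IndexCache()
doc = b"Example document text."
fake_build = lambda data: {"length": len(data)}  # placeholder for real indexing

cache.get_or_build(doc, 500, 100, fake_build)  # first time: builds the index
cache.get_or_build(doc, 500, 100, fake_build)  # same file and settings: cache hit
cache.get_or_build(doc, 300, 100, fake_build)  # different passage size: rebuild
```

Including the passage size and overlap in the key means a change to either setting naturally misses the cache and triggers a rebuild.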

3. Search

Type what you're looking for and press Search. Results appear as passages, each tagged with its location (page number for PDFs, paragraph number otherwise), a match score, and the most relevant sentence highlighted.

Search Modes

Mode | What it does | When to use it
Meaning search | Ranks passages by conceptual similarity using the AI index. | When you don't know the exact wording, or want everything about a topic.
Exact / keyword | Finds literal text, with an optional regex toggle. No AI index required. | When you know the precise term, or want a quick literal cross-check.

Use the two modes together: if meaning search surfaces a concept, an exact search confirms whether a specific word truly appears. This is the best way to be sure something is, or isn't, in the document.
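The exact / keyword mode, with its regex toggle, can be illustrated with a short Python sketch. The function name, the case-insensitive default, and the return shape are assumptions made for the example; the manual does not specify them.

```python
import re

def keyword_search(passages, query, use_regex=False, ignore_case=True):
    """Return (passage_index, [(start, end), ...]) for every passage with a match.

    With use_regex off, the query is escaped so characters such as '.' or '('
    are matched literally.
    """
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile(query if use_regex else re.escape(query), flags)
    hits = []
    for i, text in enumerate(passages):
        # Skip zero-length matches, which patterns like 'a*' can produce.
        spans = [m.span() for m in pattern.finditer(text) if m.end() > m.start()]
        if spans:
            hits.append((i, spans))
    return hits

passages = ["The fee is 3.5% per year.", "No charge applies.", "Fees: 3x5 units"]
literal_hits = keyword_search(passages, "3.5")
regex_hits = keyword_search(passages, "3.5", use_regex=True)
```

In literal mode the query `3.5` matches only the first passage. With the regex toggle on, the dot matches any character, so `3x5` in the third passage is found as well.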

Reading the Results

Element | Meaning
Location chip (p. 4, ¶ 12) | Where the passage sits in the document.
Match score (72% match) | How close the passage's meaning is to your query. Higher is stronger; green ≥ 50%, amber ≥ 35%.
Highlight | In meaning mode, the single most relevant sentence; in keyword mode, every literal match.

A modest top score (say 30–45%) doesn't mean failure — it often means the topic is only touched on lightly. If even the best matches look unrelated, that's good evidence the document doesn't cover what you asked.
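The colour thresholds for the match score can be read as a simple banding rule. The sketch below uses the two thresholds stated in this manual; the label for scores under 35% is an assumption, since the manual does not say how those are shown.

```python
def score_band(percent):
    """Map a match score in percent to a colour band."""
    if percent >= 50:
        return "green"
    if percent >= 35:
        return "amber"
    return "neutral"  # assumed label; not specified in the manual

bands = [score_band(p) for p in (72, 50, 49, 35, 30)]
```

Read this way, amber covers scores from 35% up to just under 50%.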

Document Overview

Open 📋 Document overview and click Generate overview to get an extractive summary: DocSeek picks the most central real sentences from the document and lists them in reading order. Nothing is paraphrased or generated, so the overview is always faithful to the source. Choose how many sentences you'd like (3–20).
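An extractive overview of this kind can be sketched as "score every sentence for centrality, keep the top few, then restore reading order". The Python below is a minimal illustration, assuming centrality means similarity to the average of all sentence vectors; DocSeek's actual scoring method is not described in this manual, and the names and toy vectors are hypothetical.

```python
import math

def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def extractive_overview(sentences, vectors, n=3):
    """Pick the n sentences closest to the document centroid, in reading order."""
    if not sentences:
        return []
    dims = len(vectors[0])
    centroid = [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]
    by_centrality = sorted(
        range(len(sentences)),
        key=lambda i: _cosine(vectors[i], centroid),
        reverse=True,
    )
    chosen = sorted(by_centrality[:n])  # back to original document order
    return [sentences[i] for i in chosen]  # sentences are returned verbatim

sentences = ["A", "B", "C", "D"]
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]]
overview = extractive_overview(sentences, vectors, n=2)
```

Because the function only selects and reorders existing sentences, the output can never contain text that is not in the source.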

Settings

Setting | Effect
Passage size | How many characters each passage holds. Smaller = more precise locations; larger = more context per result.
Overlap | How much text is shared between neighbouring passages, so ideas spanning a boundary aren't missed.
Results shown | How many passages to display per search.

Changing passage size or overlap requires rebuilding the index (DocSeek will prompt you).
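To see how passage size and overlap interact, here is a minimal sliding-window sketch in Python. It is an assumption-laden illustration: it cuts purely by character count, whereas a real splitter would likely respect sentence or paragraph boundaries, and the default values are hypothetical rather than DocSeek's.

```python
def split_into_passages(text, passage_size=500, overlap=100):
    """Split text into (start_offset, passage) pairs that share `overlap` characters."""
    if passage_size <= 0:
        raise ValueError("passage_size must be positive")
    if not 0 <= overlap < passage_size:
        raise ValueError("overlap must be at least 0 and smaller than passage_size")
    step = passage_size - overlap  # how far the window advances each time
    passages = []
    start = 0
    while start < len(text):
        passages.append((start, text[start:start + passage_size]))
        if start + passage_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return passages

chunks = split_into_passages("abcdefghij", passage_size=4, overlap=1)
```

Each window starts `passage_size - overlap` characters after the previous one, so every boundary is covered by two passages. Since the passages themselves change with either setting, the stored fingerprints no longer match, which is why the index has to be rebuilt.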

Privacy

DocSeek processes everything locally in your browser. Neither the text of your documents nor your searches are ever uploaded. The AI model and the supporting libraries are downloaded once from public CDNs and then cached; from then on the tool runs offline. Your settings live in localStorage and the search index is cached in IndexedDB, both on your own device.

* About “offline”: the very first run needs internet to download the AI model and libraries. After your browser has cached them, DocSeek works with no connection.

Tips & Limitations

Supported File Types

.pdf  ·  .docx  ·  .html  ·  .md  ·  .txt

About this tool

Technologies used:

Transformers.js (all-MiniLM-L6-v2)  ·  WebAssembly  ·  pdf.js  ·  mammoth.js  ·  marked.js  ·  IndexedDB  ·  localStorage  ·  File & Drag-and-Drop API  ·  Web Crypto API  ·  Vanilla JS

Built with:

Claude Cowork (Opus 4.8)
