# imgPub OCR to EPUB – Technical Overview

This document describes how the imgPub application converts page photos into a final EPUB book. It covers the frontend wizard, the Node.js backend that performs EPUB generation, and the most important implementation details so new contributors can extend the project confidently.

## 1. Stack Summary

| Layer | Technology | Notes |
| --- | --- | --- |
| Frontend build | **Vite + React 18** | SPA wizard experience |
| UI | **Material UI v6** | Theme, Stepper, responsive grid |
| Routing | **react-router-dom v7** | Each wizard step is a route |
| State | **Zustand** | `useAppStore` centralises workflow data |
| Upload | **react-dropzone** | Drag & drop multiple images |
| Crop | **Custom overlay** | iOS-style handles, percentage-based selection |
| OCR | **tesseract.js 5** | Uses local worker/core/lang assets (eng + tur) |
| EPUB generation | **Node.js + express + epub-gen** | Backend service builds the EPUB and returns it base64-encoded |

## 2. Folder Layout Highlights

```
src/
  components/          (UploadStep, CropStep, BulkCropStep, OcrStep, EpubStep, DownloadStep)
  store/useAppStore.js (global state)
  utils/
    cropUtils.js       (canvas cropping)
    ocrUtils.js        (date extraction, sorting)
    epubUtils.js       (API client for EPUB)
    fileUtils.js       (download helper)
server/
  index.js             (Express app with /generate-epub)
  package.json         (epub-gen, express, cors, uuid)
public/
  tesseract/           (worker.min.js, tesseract-core-simd-lstm.wasm(.js), eng.traineddata, tur.traineddata)
```

## 3. Frontend Wizard Flow

1. **Upload** – Drag and drop `.png`, `.jpg`, or `.jpeg` files. Previews are created with `URL.createObjectURL` and revoked during resets.
2. **Crop** – Choose a reference image, adjust the selection handles, and optionally enter numeric offsets. The crop config is saved in the store as relative ratios.
3. **Bulk Crop** – Applies the saved ratios to every upload via canvas cropping (`utils/cropUtils.js`), storing the cropped blobs and their URLs.
4. **OCR** – A single Tesseract worker processes the images sequentially (`tur` language, fallback `eng`).
   Each cropped image is processed in upload order, and the cleaned text is appended to a single in-memory string (with a single-space separator). Only that cumulative string is persisted in the store to keep CPU/RAM usage minimal.
5. **EPUB** – After OCR, the frontend sends the full concatenated string to the backend, waits for the resulting EPUB blob, and stores it for download.
6. **Download** – Displays the EPUB metadata and lets the user download the file or restart the process.

## 4. EPUB Backend Service

- Located in `/server`. Run with `cd server && npm install && npm run dev` (defaults to port **4000**).
- Exposes `POST /generate-epub` accepting `{ text, meta }`, where `text` is the single, concatenated OCR output.
- Uses [`epub-gen`](https://www.npmjs.com/package/epub-gen) to build one chapter containing the entire text and writes it to a temporary file.
- Returns `{ filename, data }`, where `data` is the base64-encoded EPUB bytes. The frontend decodes it to a `Blob` and stores it in Zustand (`generatedEpub`).
- The CORS origin defaults to `http://localhost:5173` and can be overridden via the `CLIENT_ORIGIN` env var.

## 5. Tesseract Assets

All heavy OCR assets are served locally to avoid CDN issues:

- `public/tesseract/worker.min.js`
- `public/tesseract/tesseract-core-simd-lstm.wasm(.js)`
- `public/tesseract/eng.traineddata`
- `public/tesseract/tur.traineddata`

The OCR step creates a single worker and reuses it for every cropped image to keep CPU usage predictable.

## 6. State Management & Cleanup

`useAppStore` tracks:

- `uploadedImages`, `cropConfig`, `croppedImages`, `ocrResults`
- `generatedEpub` (blob, URL, filename)
- `error`

`resetFromStep(step)` clears downstream data and revokes blob URLs (uploads, crops, EPUB) so memory usage stays bounded even after long sessions.

## 7. Running the Project

```bash
# Frontend
npm install
npm run dev

# Backend in another terminal
cd server
npm install   # only needs to run once
npm run dev   # starts on http://localhost:4000
```

Set `VITE_API_BASE_URL` in `.env` if the server runs on a different host/port. `npm run build` still targets Vite’s static output (`dist/`). Chunk size warnings are suppressed by raising `chunkSizeWarningLimit` (see `vite.config.js`).

## 8. Potential Enhancements

- Allow users to edit OCR text before sending it to the EPUB service.
- Add cover image generation per session.
- Persist workflow state (e.g. IndexedDB) so refreshes are less disruptive.
- Stream the EPUB as soon as chapters are processed for better perceived speed.

The current setup keeps PDF logic out of the client entirely. Turkish characters render consistently thanks to EPUB readers’ native font stacks, or bundled fonts if those are added later.
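
### Sketch: applying relative crop ratios

Section 3 says the crop config is stored as relative ratios and then applied to every upload. The snippet below is a minimal sketch of that conversion, not the code in `cropUtils.js`. The function name `ratiosToPixelRect` and the `{ left, top, width, height }` ratio shape are assumptions made for illustration.

```javascript
// Hypothetical helper: the name and the ratio shape are assumptions,
// not taken from cropUtils.js.
// ratios: { left, top, width, height }, each a fraction (0..1) of the image.
function ratiosToPixelRect(ratios, imageWidth, imageHeight) {
  const clamp01 = (v) => Math.min(1, Math.max(0, v));
  const left = clamp01(ratios.left);
  const top = clamp01(ratios.top);
  // Keep the rectangle inside the image even if left + width exceeds 1.
  const width = Math.min(clamp01(ratios.width), 1 - left);
  const height = Math.min(clamp01(ratios.height), 1 - top);
  return {
    x: Math.round(left * imageWidth),
    y: Math.round(top * imageHeight),
    width: Math.round(width * imageWidth),
    height: Math.round(height * imageHeight),
  };
}

// The same ratios give proportionally equivalent rectangles on differently sized photos.
const ratios = { left: 0.1, top: 0.2, width: 0.5, height: 0.25 };
console.log(ratiosToPixelRect(ratios, 1000, 2000)); // { x: 100, y: 400, width: 500, height: 500 }
console.log(ratiosToPixelRect(ratios, 640, 480));   // { x: 64, y: 96, width: 320, height: 120 }
```

The resulting rectangle could then be passed as the source region of a canvas `drawImage` call. Storing ratios rather than pixels is what lets one reference crop apply to uploads of different resolutions.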
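
### Sketch: accumulating OCR text

The OCR step in section 3 appends each page's cleaned text to one string with a single-space separator. A minimal sketch of that accumulation follows. The helpers `cleanOcrText` and `appendPageText` are hypothetical, and the whitespace-collapsing cleanup is an assumption, since this document does not specify the app's cleaning rules.

```javascript
// Hypothetical cleanup: collapses whitespace runs and trims the ends.
// The app's real cleaning rules may differ.
function cleanOcrText(raw) {
  return raw.replace(/\s+/g, " ").trim();
}

// Appends one page's text to the running string, separated by a single space.
// Pages that are empty after cleanup are skipped so no double spaces appear.
function appendPageText(accumulated, pageText) {
  const cleaned = cleanOcrText(pageText);
  if (cleaned === "") return accumulated;
  return accumulated === "" ? cleaned : `${accumulated} ${cleaned}`;
}

// Pages are folded in upload order into one cumulative string.
const pages = ["Birinci sayfa\nmetni. ", "", "  İkinci   sayfa."];
const bookText = pages.reduce(appendPageText, "");
console.log(bookText); // "Birinci sayfa metni. İkinci sayfa."
```

Only the cumulative string needs to be kept between pages, which matches the goal of persisting a single value in the store.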
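
### Sketch: decoding the EPUB response

Section 4 states that the backend returns `{ filename, data }` with base64-encoded bytes, which the frontend decodes to a `Blob`. The sketch below shows one way to do that decoding with the standard `atob` and `Blob` globals. The helper name `base64ToBlob` and the sample payload are assumptions, and the actual code in `epubUtils.js` may be structured differently.

```javascript
// Hypothetical helper; the real decoding code in the app may differ.
function base64ToBlob(base64, mimeType = "application/epub+zip") {
  const binary = atob(base64); // yields one character per decoded byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: mimeType });
}

// Sample payload shaped like the `{ filename, data }` response.
// "UEsDBA==" is only the 4-byte ZIP signature, not a full EPUB.
const response = { filename: "book.epub", data: "UEsDBA==" };
const blob = base64ToBlob(response.data);
console.log(response.filename, blob.size, blob.type); // book.epub 4 application/epub+zip
```

In the browser, the resulting blob can be handed to `URL.createObjectURL` for the download step, and that URL should be revoked on reset like the other blob URLs.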