imgPub OCR to EPUB Technical Overview

This document describes how the imgPub application converts page photos into a final EPUB book. It covers the frontend wizard, the Node.js backend that performs EPUB generation, and the most important implementation details so new contributors can extend the project confidently.

1. Stack Summary

Layer            Technology                     Notes
Frontend build   Vite + React 18                SPA wizard experience
UI               Material UI v6                 Theme, Stepper, responsive grid
Routing          react-router-dom v7            Each wizard step is a route
State            Zustand                        useAppStore centralises workflow data
Upload           react-dropzone                 Drag & drop multiple images
Crop             Custom overlay                 iOS-style handles, percentage-based selection
OCR              tesseract.js 5                 Uses local worker/core/lang assets (eng + tur)
EPUB generation  Node.js + express + epub-gen   Backend service builds the EPUB and returns it base64-encoded

2. Folder Layout Highlights

src/
  components/ (UploadStep, CropStep, BulkCropStep, OcrStep, EpubStep, DownloadStep)
  store/useAppStore.js (global state)
  utils/
    cropUtils.js (canvas cropping)
    ocrUtils.js (date extraction, sorting)
    epubUtils.js (API client for EPUB)
    fileUtils.js (download helper)
server/
  index.js (Express app with /generate-epub)
  package.json (epub-gen, express, cors, uuid)
public/
  tesseract/ (worker.min.js, tesseract-core-simd-lstm.wasm(.js), eng.traineddata, tur.traineddata)

3. Frontend Wizard Flow

  1. Upload: Drag and drop .png/.jpg/.jpeg files. Previews are created with URL.createObjectURL and revoked during resets.
  2. Crop: Choose a reference image, adjust the selection handles, and optionally enter numeric offsets. The crop config is saved in the store as relative ratios.
  3. Bulk Crop: Applies the saved ratios to every upload via <canvas>, storing the cropped blobs and their URLs.
  4. OCR: A sequential Tesseract worker (tur language, with eng as fallback) processes each cropped image in upload order. The cleaned text is appended to a single in-memory string with a single-space separator. Only that cumulative string is persisted in the store, which keeps CPU and RAM usage low.
  5. EPUB: After OCR, the frontend sends the full concatenated string to the backend, waits for the resulting EPUB blob, and stores it for download.
  6. Download: Displays the EPUB metadata and lets the user download the file or restart the process.
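
The ratio-based crop from steps 2 and 3 can be sketched as below. This is a minimal illustration, not the contents of cropUtils.js: the helper name ratiosToPixelRect and the cropConfig field names (x, y, width, height as fractions between 0 and 1) are assumptions made for the example.

```javascript
// Hypothetical helper: converts a relative crop selection (fractions of the
// reference image, 0..1) into a pixel rectangle for an image of any size.
// The field names x, y, width, height are assumed, not taken from the project.
function ratiosToPixelRect(cropConfig, imageWidth, imageHeight) {
  const clamp01 = (v) => Math.min(1, Math.max(0, v));
  const x = clamp01(cropConfig.x);
  const y = clamp01(cropConfig.y);
  // Keep the selection inside the image even if x + width exceeds 1.
  const width = Math.min(clamp01(cropConfig.width), 1 - x);
  const height = Math.min(clamp01(cropConfig.height), 1 - y);
  return {
    sx: Math.round(x * imageWidth),
    sy: Math.round(y * imageHeight),
    sw: Math.round(width * imageWidth),
    sh: Math.round(height * imageHeight),
  };
}

// In the browser, the rectangle would feed the standard canvas call:
//   ctx.drawImage(img, sx, sy, sw, sh, 0, 0, sw, sh);
const rect = ratiosToPixelRect({ x: 0.1, y: 0.2, width: 0.5, height: 0.25 }, 2000, 3000);
console.log(rect); // { sx: 200, sy: 600, sw: 1000, sh: 750 }
```

Storing fractions rather than pixels is what lets one reference selection apply to uploads of different resolutions.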

4. EPUB Backend Service

  • Located in /server. Run with cd server && npm install && npm run dev (defaults to port 4000).
  • Exposes POST /generate-epub accepting { text, meta } where text is the single, concatenated OCR output.
  • Uses epub-gen to build one chapter containing the entire text and writes it to a temporary file.
  • Returns { filename, data } where data is base64-encoded EPUB bytes. Frontend decodes to Blob and stores in Zustand (generatedEpub).
  • CORS origin defaults to http://localhost:5173 and can be overridden via CLIENT_ORIGIN env var.
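
A client for this endpoint might look like the sketch below. It is not the code in epubUtils.js: the function names, the error handling, and the application/epub+zip Blob type are assumptions, and the shape of meta is left to the caller because it is not specified here.

```javascript
// Decode a base64 string into raw bytes (atob is available in browsers
// and in recent Node.js versions).
function base64ToBytes(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical client: POSTs { text, meta } and turns the { filename, data }
// response into a Blob. The default base URL matches the documented port;
// the function name and error handling are assumptions.
async function requestEpub(text, meta, baseUrl = 'http://localhost:4000') {
  const response = await fetch(`${baseUrl}/generate-epub`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text, meta }),
  });
  if (!response.ok) {
    throw new Error(`EPUB generation failed with status ${response.status}`);
  }
  const { filename, data } = await response.json();
  const blob = new Blob([base64ToBytes(data)], { type: 'application/epub+zip' });
  return { filename, blob };
}
```

The returned blob can then be stored in generatedEpub and exposed through URL.createObjectURL for the download step.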

5. Tesseract Assets

All heavy OCR assets are served locally to avoid CDN issues:

  • public/tesseract/worker.min.js
  • public/tesseract/tesseract-core-simd-lstm.wasm(.js)
  • public/tesseract/eng.traineddata
  • public/tesseract/tur.traineddata

The OCR step creates a single worker and reuses it for every cropped image to keep CPU usage predictable.
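
The single-worker, append-as-you-go loop can be sketched as follows. The helper names and the whitespace-collapsing cleanup are assumptions for illustration; the project's actual cleaning rules may differ. The recognizer is injected so the loop stays independent of how the worker is created.

```javascript
// Hypothetical cleanup: collapse all whitespace runs (including newlines)
// into single spaces. The project's actual cleaning rules may differ.
function cleanOcrText(raw) {
  return raw.replace(/\s+/g, ' ').trim();
}

// Append one page's cleaned text to the cumulative string, using a
// single-space separator and skipping pages that produced no text.
function appendOcrText(accumulated, raw) {
  const cleaned = cleanOcrText(raw);
  if (!cleaned) return accumulated;
  return accumulated ? `${accumulated} ${cleaned}` : cleaned;
}

// Process images strictly one after another with a single recognizer.
// `recognize` is injected: it takes an image and resolves to raw text.
async function runOcrSequentially(images, recognize) {
  let text = '';
  for (const image of images) {
    text = appendOcrText(text, await recognize(image));
  }
  return text;
}

// With tesseract.js 5 the recognizer could wrap one shared worker, e.g.
//   const worker = await createWorker('tur');
//   const recognize = async (img) => (await worker.recognize(img)).data.text;
//   const text = await runOcrSequentially(croppedImages, recognize);
//   await worker.terminate();
```

Awaiting each page inside the loop, rather than using Promise.all, is what keeps only one recognition job active at a time.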

6. State Management & Cleanup

useAppStore tracks:

  • uploadedImages, cropConfig, croppedImages, ocrResults
  • generatedEpub (blob, URL, filename)
  • error

resetFromStep(step) clears downstream data and revokes blob URLs (uploads, crops, EPUB) so memory usage stays bounded even after long sessions.
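
One way to picture resetFromStep is the pure function below. It is a sketch under stated assumptions, not the store's implementation: the step names, the per-item url field, and the choice to clear data produced at the given step as well as everything after it are all illustrative. In the real store this logic would live inside a Zustand action.

```javascript
// Assumed wizard order; the step names are illustrative.
const STEPS = ['upload', 'crop', 'bulkCrop', 'ocr', 'epub', 'download'];

// Returns a new state with everything produced at or after `step` cleared.
// `revoke` is injectable so the logic can run outside a browser.
function resetFromStep(state, step, revoke = (url) => URL.revokeObjectURL(url)) {
  const from = STEPS.indexOf(step);
  if (from === -1) throw new Error(`Unknown step: ${step}`);
  const next = { ...state, error: null };

  if (from <= STEPS.indexOf('upload')) {
    state.uploadedImages.forEach((img) => revoke(img.url));
    next.uploadedImages = [];
  }
  if (from <= STEPS.indexOf('crop')) next.cropConfig = null;
  if (from <= STEPS.indexOf('bulkCrop')) {
    state.croppedImages.forEach((img) => revoke(img.url));
    next.croppedImages = [];
  }
  if (from <= STEPS.indexOf('ocr')) next.ocrResults = '';
  if (from <= STEPS.indexOf('epub')) {
    if (state.generatedEpub) revoke(state.generatedEpub.url);
    next.generatedEpub = null;
  }
  return next;
}
```

Revoking each blob URL before dropping its reference is the part that matters for memory: an object URL keeps its Blob alive until it is revoked or the page unloads.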

7. Running the Project

# Frontend
npm install
npm run dev

# Backend in another terminal
cd server
npm install   # already included, run once
npm run dev   # starts on http://localhost:4000

Set VITE_API_BASE_URL in .env if the server runs on a different host/port.

npm run build still targets Vite's static output (dist/). Chunk size warnings are suppressed by raising chunkSizeWarningLimit (see vite.config.js).

8. Potential Enhancements

  • Allow users to edit OCR text before sending it to the EPUB service.
  • Add cover image generation per session.
  • Persist workflow state (e.g. IndexedDB) so refreshes are less disruptive.
  • Stream EPUB as soon as chapters are processed for better perceived speed.

The current setup keeps PDF logic out of the client entirely, and Turkish characters render consistently thanks to EPUB readers' native font stacks or bundled fonts (if added later).
