Technical Infrastructure & Services Documentation
Project Overview
imgPub is a modern web application that converts image documents into EPUB format through OCR processing. The application provides a wizard-style interface for step-by-step document processing including image upload, cropping, OCR, translation, and EPUB generation.
Technology Stack
Frontend
- Framework: React 18.3.1
- Build Tool: Vite 5.4.10
- UI Library: Material-UI (MUI) v6.1.1
- State Management: Zustand 5.0.2
- Routing: React Router DOM v7.0.2
- HTTP Client: Axios/Fetch API
- Authentication: Supabase Auth
Backend Services
- Authentication: Supabase (PostgreSQL database)
- EPUB Generation: Node.js + Express + epub-gen
- OCR Processing: Tesseract.js 5.1.1 (client-side)
- Translation: GLM-4.6 API integration
Development Tools
- Containerization: Docker & Docker Compose
- Package Manager: npm
- Code Style: ESLint (ES6+)
Core Services Architecture
1. Frontend Application (Port 5173)
- Single Page Application (SPA) with wizard-style navigation
- Client-side OCR processing using Tesseract.js
- Real-time image cropping with canvas manipulation
- State management through Zustand store
- Authentication integration with Supabase
2. Backend EPUB Service (Port 4000)
- Express.js server running in the /server directory
- Single endpoint: POST /generate-epub
- Uses epub-gen library for EPUB generation
- Returns base64-encoded EPUB files
- CORS enabled for frontend communication
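As an illustration of the "base64-encoded EPUB" response, the sketch below wraps generated EPUB bytes in a JSON-friendly payload. The helper name `buildEpubResponse` and the filename rule are hypothetical; the actual code in server/index.js may differ.

```javascript
// Hypothetical helper (not taken from server/index.js): wraps generated
// EPUB bytes in the { filename, data } shape the service returns.
function buildEpubResponse(title, epubBytes) {
  // Assumed naming rule: lowercase the title and replace runs of
  // non-alphanumeric characters with a single hyphen.
  const slug =
    String(title || "document")
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-+|-+$/g, "") || "document";

  return {
    filename: `${slug}.epub`,
    // Buffer is a Node.js global; base64 keeps the binary file JSON-safe.
    data: Buffer.from(epubBytes).toString("base64"),
  };
}
```

Inside an Express handler, the result could be sent with `res.json(buildEpubResponse(title, bytes))`, assuming the title is taken from the request's `meta` object.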
3. Authentication Service (Supabase)
- User registration and login
- Google OAuth integration
- Session management
- User data persistence
Key Components
Image Processing Pipeline
- Upload: Multi-file drag-drop interface with preview
- Crop: Interactive cropping with visual selection handles
- Bulk Processing: Batch crop application to multiple images
- OCR: Text extraction from cropped images (Turkish + English)
- Translation: Optional text translation using GLM-4.6
- EPUB Generation: Final document creation and download
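One way the crop and bulk-processing steps could share a single selection is to store it as fractions of the reference image and scale it to each target image. This is a minimal sketch under that assumption; `normalizeCrop` and `applyCrop` are hypothetical names, not functions from cropUtils.js.

```javascript
// Convert a pixel selection on the reference image into fractions of its size.
function normalizeCrop(rect, refWidth, refHeight) {
  return {
    x: rect.x / refWidth,
    y: rect.y / refHeight,
    width: rect.width / refWidth,
    height: rect.height / refHeight,
  };
}

// Map the normalized selection back to pixels for an image of any size,
// clamped so the rectangle never extends past the image bounds.
function applyCrop(norm, imgWidth, imgHeight) {
  const x = Math.max(0, Math.round(norm.x * imgWidth));
  const y = Math.max(0, Math.round(norm.y * imgHeight));
  const width = Math.min(imgWidth - x, Math.round(norm.width * imgWidth));
  const height = Math.min(imgHeight - y, Math.round(norm.height * imgHeight));
  return { x, y, width, height };
}
```

The resulting rectangle can be passed to the canvas call `ctx.drawImage(img, x, y, width, height, 0, 0, width, height)` to render the cropped region.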
State Management Structure
Core State:
- uploadedImages: File array with preview URLs
- cropConfig: Cropping selection data
- croppedImages: Processed image blobs
- ocrText: Extracted text content
- translatedText: Optional translated content
- generatedEpub: Final EPUB blob and metadata
- authToken/currentUser: Authentication state
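The fields above could be initialized and reset roughly as follows. The field names come from the list; the value shapes and the helper names `createInitialState` and `resetWorkflow` are assumptions, and the sketch is written without Zustand so it runs on its own.

```javascript
// Initial values for the core state fields (value shapes are assumed).
function createInitialState() {
  return {
    uploadedImages: [], // files with preview URLs
    cropConfig: null, // selection rectangle, once defined
    croppedImages: [], // processed image blobs
    ocrText: "",
    translatedText: "",
    generatedEpub: null, // EPUB blob and metadata
    authToken: null,
    currentUser: null,
  };
}

// Clear the workflow fields but keep the signed-in user.
function resetWorkflow(state) {
  const { authToken, currentUser } = state;
  return { ...createInitialState(), authToken, currentUser };
}
```

In a Zustand store these could be wired as `create((set) => ({ ...createInitialState(), reset: () => set(resetWorkflow) }))`, though the actual useAppStore.js may be organized differently.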
File Organization
src/
├── components/ # React components for wizard steps
│ ├── UploadStep.jsx # File upload interface
│ ├── CropStep.jsx # Image cropping
│ ├── BulkCropStep.jsx # Batch processing
│ ├── OcrStep.jsx # OCR processing
│ ├── TranslationStep.jsx # Text translation
│ ├── EpubStep.jsx # EPUB metadata
│ └── DownloadStep.jsx # Final download
├── pages/auth/ # Authentication pages
├── store/
│ └── useAppStore.js # Zustand state management
├── utils/
│ ├── fileUtils.js # File handling utilities
│ ├── ocrUtils.js # OCR processing
│ ├── cropUtils.js # Image cropping
│ ├── epubUtils.js # EPUB API client
│ ├── authApi.js # Authentication API
│ └── translationUtils.js # Translation handling
└── lib/
    └── supabaseClient.js # Supabase configuration
server/
├── index.js # Express server
├── package.json # Backend dependencies
└── node_modules/ # Backend packages
public/
├── tesseract/ # OCR engine assets
│ ├── worker.min.js
│ ├── tesseract-core-simd-lstm.wasm(.js)
│ ├── eng.traineddata
│ └── tur.traineddata
└── fonts/ # Font assets for UI
Data Flow
User Workflow
- Authentication: Login via Supabase (Google OAuth or email/password)
- Image Upload: Multiple files selected, preview URLs generated
- Reference Selection: Choose reference image for crop configuration
- Crop Configuration: Visual selection area definition
- Batch Processing: Apply crop to all uploaded images
- OCR Processing: Sequential text extraction from cropped images
- Optional Translation: Text translation via GLM-4.6 API
- EPUB Generation: Send text to backend service
- Download: Retrieve and download generated EPUB
API Endpoints
Frontend ↔ Backend
- POST /generate-epub (EPUB service)
  - Request: { text: string, meta: object }
  - Response: { filename: string, data: base64-string }
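A client for the EPUB endpoint might look like the sketch below: it posts `{ text, meta }` and decodes the base64 `data` field into a Blob for download. The function names are hypothetical and the real epubUtils.js may handle errors differently.

```javascript
// POST the OCR text and metadata to the EPUB service.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function requestEpub(baseUrl, text, meta, fetchImpl = fetch) {
  const res = await fetchImpl(`${baseUrl}/generate-epub`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, meta }),
  });
  if (!res.ok) {
    throw new Error(`EPUB service returned ${res.status}`);
  }
  return res.json(); // expected shape: { filename, data }
}

// Decode the base64 `data` field into a Blob suitable for download.
function epubResponseToBlob(response) {
  const binary = atob(response.data);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: "application/epub+zip" });
}
```

In the browser, the Blob can then be turned into a download link with `URL.createObjectURL(blob)` and the returned `filename`.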
Frontend ↔ Supabase
- Authentication endpoints (handled by Supabase client)
- User profile management
- Session validation
Environment Configuration
Frontend (.env)
- VITE_API_BASE_URL: Backend service URL
- VITE_SUPABASE_URL: Supabase project URL
- VITE_SUPABASE_ANON_KEY: Supabase anonymous key
Backend (.env)
- CLIENT_ORIGIN: Frontend URL for CORS
- PORT: Server port (default: 4000)
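The backend variables could be read and validated along these lines. `readServerConfig` is a hypothetical helper, and the fallback origin is an assumption based on the frontend's development port (5173).

```javascript
// Read backend settings from an environment object such as process.env.
function readServerConfig(env) {
  // PORT defaults to 4000, as documented above.
  const port = Number.parseInt(env.PORT ?? "4000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    // Assumed fallback: the Vite dev server origin used by the frontend.
    clientOrigin: env.CLIENT_ORIGIN ?? "http://localhost:5173",
  };
}
```

On the server this would be called as `readServerConfig(process.env)`, with `clientOrigin` passed to the CORS configuration.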
Security Considerations
Authentication
- JWT tokens stored in localStorage
- Session management via Supabase
- Google OAuth integration with proper token handling
Data Handling
- Client-side OCR processing (no server upload of images)
- Temporary blob URLs automatically revoked
- No persistent storage of user images
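Blob URL revocation can be handled by tracking every preview URL that is created and revoking them together on reset. The registry below is a sketch of that idea, not the project's actual mechanism; `createPreviewRegistry` is a hypothetical name.

```javascript
// Track object URLs so they can all be revoked when the workflow resets.
// `urlApi` defaults to the global URL object and is injectable for testing.
function createPreviewRegistry(urlApi = URL) {
  const urls = new Set();
  return {
    // Create a preview URL for a blob and remember it.
    add(blob) {
      const url = urlApi.createObjectURL(blob);
      urls.add(url);
      return url;
    },
    // Revoke every tracked URL and release the underlying memory.
    revokeAll() {
      for (const url of urls) {
        urlApi.revokeObjectURL(url);
      }
      urls.clear();
    },
    get size() {
      return urls.size;
    },
  };
}
```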
API Security
- CORS configuration for cross-origin requests
- Environment variables for API keys
- No sensitive data in frontend code
Performance Optimizations
OCR Processing
- Single Tesseract worker reused across images
- Local asset serving (no CDN dependencies)
- Sequential processing to manage CPU usage
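The single-worker, sequential approach can be sketched as a loop that awaits each recognition before starting the next. The function below only assumes a worker object with an async `recognize(image)` method resolving to `{ data: { text } }`, which matches the Tesseract.js worker API; `recognizeSequentially` itself is a hypothetical helper.

```javascript
// Run OCR on images one at a time with a single shared worker,
// so only one recognition job uses the CPU at any moment.
async function recognizeSequentially(worker, images, onProgress = () => {}) {
  const pages = [];
  for (let i = 0; i < images.length; i += 1) {
    const { data } = await worker.recognize(images[i]);
    pages.push(data.text.trim());
    onProgress(i + 1, images.length); // e.g. to update a progress bar
  }
  // Join pages with a blank line between them.
  return pages.join("\n\n");
}
```

With Tesseract.js v5, the worker would typically come from `createWorker(["eng", "tur"], 1, { workerPath, corePath, langPath })` pointing at the assets under public/tesseract/, and be released with `worker.terminate()` once all images are processed. The exact options used by the project are not documented here.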
Memory Management
- Automatic blob URL revocation
- State cleanup on workflow reset
- Efficient state management with Zustand
Build Optimization
- Vite's optimized bundling
- Asset chunking for large libraries
- Hot module replacement (HMR) during development
Deployment
Development
# Frontend
npm install
npm run dev
# Backend
cd server
npm install
npm run dev
# Combined development
npm run dev:all
Production Build
npm run build # Frontend build to dist/
Containerization
- Dockerfile for frontend containerization
- docker-compose.yml for multi-service deployment
- Production-ready configuration included