imgPub/doc/technical-infrastructure.md
2025-11-17 11:39:53 +03:00

# Technical Infrastructure & Services Documentation
## Project Overview
**imgPub** is a modern web application that converts image documents into EPUB format through OCR processing. The application provides a wizard-style interface for step-by-step document processing including image upload, cropping, OCR, translation, and EPUB generation.
## Technology Stack
### Frontend
- **Framework**: React 18.3.1
- **Build Tool**: Vite 5.4.10
- **UI Library**: Material-UI (MUI) v6.1.1
- **State Management**: Zustand 5.0.2
- **Routing**: React Router DOM v7.0.2
- **HTTP Client**: Axios/Fetch API
- **Authentication**: Supabase Auth
### Backend Services
- **Authentication**: Supabase (PostgreSQL database)
- **EPUB Generation**: Node.js + Express + epub-gen
- **OCR Processing**: Tesseract.js 5.1.1 (client-side)
- **Translation**: GLM-4.6 API integration
### Development Tools
- **Containerization**: Docker & Docker Compose
- **Package Manager**: npm
- **Code Style**: ESLint (ES6+)
## Core Services Architecture
### 1. Frontend Application (Port 5173)
- Single Page Application (SPA) with wizard-style navigation
- Client-side OCR processing using Tesseract.js
- Real-time image cropping with canvas manipulation
- State management through Zustand store
- Authentication integration with Supabase
### 2. Backend EPUB Service (Port 4000)
- Express.js server running in `/server` directory
- Single endpoint: `POST /generate-epub`
- Uses epub-gen library for EPUB generation
- Returns base64-encoded EPUB files
- CORS enabled for frontend communication
### 3. Authentication Service (Supabase)
- User registration and login
- Google OAuth integration
- Session management
- User data persistence
## Key Components
### Image Processing Pipeline
1. **Upload**: Multi-file drag-drop interface with preview
2. **Crop**: Interactive cropping with visual selection handles
3. **Bulk Processing**: Batch crop application to multiple images
4. **OCR**: Text extraction from cropped images (Turkish + English)
5. **Translation**: Optional text translation using GLM-4.6
6. **EPUB Generation**: Final document creation and download
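
Steps 2 and 3 reuse one selection across many images. A minimal sketch of how that could work, assuming the selection is stored as fractions of the reference image's width and height. The field names `x`, `y`, `width`, `height` and the helper `toPixelRect` are hypothetical and not taken from `cropUtils.js`:

```javascript
// Hypothetical helper: convert a relative crop selection (fractions 0..1)
// into a pixel rectangle for an image of arbitrary size.
function toPixelRect(cropConfig, imageWidth, imageHeight) {
  const clamp = (value) => Math.min(1, Math.max(0, value));
  const left = clamp(cropConfig.x);
  const top = clamp(cropConfig.y);
  // Keep the selection inside the image even if it overshoots the edge.
  const right = clamp(cropConfig.x + cropConfig.width);
  const bottom = clamp(cropConfig.y + cropConfig.height);
  return {
    sx: Math.round(left * imageWidth),
    sy: Math.round(top * imageHeight),
    sw: Math.round((right - left) * imageWidth),
    sh: Math.round((bottom - top) * imageHeight),
  };
}

// The same selection applied to two differently sized pages.
const selection = { x: 0.1, y: 0.2, width: 0.5, height: 0.5 };
const rects = [
  toPixelRect(selection, 1000, 2000),
  toPixelRect(selection, 500, 800),
];
console.log(rects);
```

The four values correspond to the source-rectangle arguments of the canvas call `drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh)`, so each page could be cropped with the same relative selection regardless of its resolution.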
### State Management Structure
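```javascript
// Core state held in the Zustand store.
// Shape only: the initial values shown here are illustrative assumptions.
const coreState = {
  uploadedImages: [],   // File array with preview URLs
  cropConfig: null,     // Cropping selection data
  croppedImages: [],    // Processed image blobs
  ocrText: '',          // Extracted text content
  translatedText: '',   // Optional translated content
  generatedEpub: null,  // Final EPUB blob and metadata
  authToken: null,      // Authentication state
  currentUser: null,    // Authentication state
};
```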
## File Organization
```
src/
├── components/               # React components for wizard steps
│   ├── UploadStep.jsx        # File upload interface
│   ├── CropStep.jsx          # Image cropping
│   ├── BulkCropStep.jsx      # Batch processing
│   ├── OcrStep.jsx           # OCR processing
│   ├── TranslationStep.jsx   # Text translation
│   ├── EpubStep.jsx          # EPUB metadata
│   └── DownloadStep.jsx      # Final download
├── pages/auth/               # Authentication pages
├── store/
│   └── useAppStore.js        # Zustand state management
├── utils/
│   ├── fileUtils.js          # File handling utilities
│   ├── ocrUtils.js           # OCR processing
│   ├── cropUtils.js          # Image cropping
│   ├── epubUtils.js          # EPUB API client
│   ├── authApi.js            # Authentication API
│   └── translationUtils.js   # Translation handling
└── lib/
    └── supabaseClient.js     # Supabase configuration
server/
├── index.js                  # Express server
├── package.json              # Backend dependencies
└── node_modules/             # Backend packages
public/
├── tesseract/                # OCR engine assets
│   ├── worker.min.js
│   ├── tesseract-core-simd-lstm.wasm(.js)
│   ├── eng.traineddata
│   └── tur.traineddata
└── fonts/                    # Font assets for UI
```
## Data Flow
### User Workflow
1. **Authentication**: Login via Supabase (Google OAuth or email/password)
2. **Image Upload**: Multiple files selected, preview URLs generated
3. **Reference Selection**: Choose reference image for crop configuration
4. **Crop Configuration**: Visual selection area definition
5. **Batch Processing**: Apply crop to all uploaded images
6. **OCR Processing**: Sequential text extraction from cropped images
7. **Optional Translation**: Text translation via GLM-4.6 API
8. **EPUB Generation**: Send text to backend service
9. **Download**: Retrieve and download generated EPUB
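
Each step in this workflow depends on data produced by the one before it. The sketch below shows one way a wizard could decide how far a user may advance, using the store fields listed under "State Management Structure". The step ids, the `STEPS` table, and `furthestReachableStep` are hypothetical, and authentication is assumed to have happened already:

```javascript
// Hypothetical ordering of wizard steps, each with a precondition on the
// store fields listed under "State Management Structure".
const STEPS = [
  { id: 'upload', ready: () => true },
  { id: 'crop', ready: (s) => s.uploadedImages.length > 0 },
  { id: 'bulk-crop', ready: (s) => s.cropConfig !== null },
  { id: 'ocr', ready: (s) => s.croppedImages.length > 0 },
  { id: 'translation', ready: (s) => s.ocrText.length > 0 },
  { id: 'epub', ready: (s) => s.ocrText.length > 0 },
  { id: 'download', ready: (s) => s.generatedEpub !== null },
];

// Return the id of the furthest step the user may open. Translation is
// optional, so it shares its precondition with EPUB generation.
function furthestReachableStep(state) {
  let reachable = STEPS[0].id;
  for (const step of STEPS) {
    if (!step.ready(state)) break;
    reachable = step.id;
  }
  return reachable;
}

const afterUpload = {
  uploadedImages: [{ name: 'page-1.png' }],
  cropConfig: null,
  croppedImages: [],
  ocrText: '',
  generatedEpub: null,
};
console.log(furthestReachableStep(afterUpload));
```

Keeping the preconditions in one table means the navigation logic does not need to be repeated inside each step component.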
### API Endpoints
#### Frontend ↔ Backend
- `POST /generate-epub` (EPUB service)
  - Request: `{ text: string, meta: object }`
  - Response: `{ filename: string, data: base64-string }`
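
A client for this contract has to serialize the request and turn the base64 `data` field back into binary before offering a download. The following is a sketch under stated assumptions rather than the contents of `epubUtils.js`: the function names are hypothetical, `baseUrl` stands in for `VITE_API_BASE_URL`, and `fetchImpl` is an injectable parameter added here for testability.

```javascript
// Build the JSON body for POST /generate-epub as documented above.
function buildEpubRequest(text, meta) {
  return JSON.stringify({ text, meta });
}

// Decode the base64 `data` field of the response into raw bytes.
function decodeEpubData(base64) {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i += 1) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Call the service and return a downloadable Blob plus the file name.
async function generateEpub(baseUrl, text, meta, fetchImpl = fetch) {
  const response = await fetchImpl(`${baseUrl}/generate-epub`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: buildEpubRequest(text, meta),
  });
  if (!response.ok) {
    throw new Error(`EPUB service responded with HTTP ${response.status}`);
  }
  const { filename, data } = await response.json();
  const blob = new Blob([decodeEpubData(data)], { type: 'application/epub+zip' });
  return { filename, blob };
}
```

Because the file travels as base64 inside JSON, the payload is roughly a third larger than the binary EPUB; decoding it into a `Blob` on the client restores the original bytes.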
#### Frontend ↔ Supabase
- Authentication endpoints (handled by Supabase client)
- User profile management
- Session validation
## Environment Configuration
### Frontend (.env)
- `VITE_API_BASE_URL`: Backend service URL
- `VITE_SUPABASE_URL`: Supabase project URL
- `VITE_SUPABASE_ANON_KEY`: Supabase anonymous key
### Backend (.env)
- `CLIENT_ORIGIN`: Frontend URL for CORS
- `PORT`: Server port (default: 4000)
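
The backend variables above could be read and validated at startup along these lines. This is a sketch, not the code in `server/index.js`: `loadServerConfig` is a hypothetical helper, and the fallback origin is an assumption based on the frontend development port (5173).

```javascript
// Read the backend settings described above from an environment object.
// The fallback origin is an assumption: the frontend dev server on port 5173.
function loadServerConfig(env) {
  const port = Number.parseInt(env.PORT ?? '4000', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${env.PORT}`);
  }
  return {
    port,
    clientOrigin: env.CLIENT_ORIGIN ?? 'http://localhost:5173',
  };
}

const config = loadServerConfig({ CLIENT_ORIGIN: 'http://localhost:5173' });
console.log(config);
```

On the server, the function would be called with `process.env`; `clientOrigin` is the value a CORS middleware would then allow as the request origin.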
## Security Considerations
### Authentication
- JWT tokens stored in localStorage
- Session management via Supabase
- Google OAuth integration with proper token handling
### Data Handling
- Client-side OCR processing (no server upload of images)
- Temporary blob URLs automatically revoked
- No persistent storage of user images
### API Security
- CORS configuration for cross-origin requests
- API keys supplied through environment variables
- No secrets hard-coded in frontend code
## Performance Optimizations
### OCR Processing
- Single Tesseract worker reused across images
- Local asset serving (no CDN dependencies)
- Sequential processing to manage CPU usage
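
The combination of one reused worker and sequential processing can be sketched as a plain loop. This is an illustration, not the contents of `ocrUtils.js`: `recognizeSequentially` and the `onProgress` callback are hypothetical, and the worker is passed in so that only its `recognize` method is assumed.

```javascript
// Run OCR over images one at a time with a single, already-created worker.
// `worker` only needs a Tesseract.js-style `recognize(image)` method that
// resolves to `{ data: { text } }`; `onProgress` is a hypothetical callback.
async function recognizeSequentially(worker, images, onProgress = () => {}) {
  const pages = [];
  for (let index = 0; index < images.length; index += 1) {
    // Awaiting inside the loop keeps only one recognition job active.
    const { data } = await worker.recognize(images[index]);
    pages.push(data.text.trim());
    onProgress(index + 1, images.length);
  }
  // Separate pages with a blank line in the combined text.
  return pages.join('\n\n');
}
```

With Tesseract.js 5, such a worker would typically come from `createWorker(['tur', 'eng'], 1, { workerPath, corePath, langPath })` with the paths pointing at the files under `public/tesseract/`, and be released with `worker.terminate()` once the last image is done.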
### Memory Management
- Automatic blob URL revocation
- State cleanup on workflow reset
- Efficient state management with Zustand
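
Blob URL revocation and state cleanup can be handled together when the workflow is reset. A minimal sketch, assuming each image entry carries its URL in a `previewUrl` field (a hypothetical name) and that authentication state should survive a reset; `resetWorkflow` and the injectable `revoke` parameter are also illustrative:

```javascript
// Revoke every preview URL held in state, then return a fresh workflow state.
// `previewUrl` is a hypothetical field name; `revoke` is injectable so the
// function can be exercised outside a browser.
function resetWorkflow(state, revoke = (url) => URL.revokeObjectURL(url)) {
  for (const image of [...state.uploadedImages, ...state.croppedImages]) {
    if (image.previewUrl) revoke(image.previewUrl);
  }
  // Spread first so authToken/currentUser and any other fields are kept.
  return {
    ...state,
    uploadedImages: [],
    cropConfig: null,
    croppedImages: [],
    ocrText: '',
    translatedText: '',
    generatedEpub: null,
  };
}
```

Revoking before clearing matters: once the array entries are dropped, the URLs can no longer be looked up, and the blobs they reference stay in memory until the page is closed.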
### Build Optimization
- Vite's optimized bundling
- Asset chunking for large libraries
- Hot Module Replacement (HMR) during development
## Deployment
### Development
```bash
# Frontend
npm install
npm run dev
# Backend
cd server
npm install
npm run dev
# Combined development
npm run dev:all
```
### Production Build
```bash
npm run build # Frontend build to dist/
```
### Containerization
- Dockerfile for frontend containerization
- docker-compose.yml for multi-service deployment
- Production-ready configuration included