first commit

2025-11-23 14:25:09 +03:00
commit 46d75b64d5
18 changed files with 4749 additions and 0 deletions

doc/DEPLOYMENT.md Normal file

@@ -0,0 +1,663 @@
# MetaScraper Deployment Guide
## 📦 Package Publishing
### Preparation Checklist
Before publishing, ensure:
- [ ] All tests pass: `npm test`
- [ ] Code is properly documented
- [ ] Version number follows semantic versioning
- [ ] CHANGELOG.md is updated
- [ ] `package.json` is complete and accurate
- [ ] License file is present
- [ ] README.md is up to date
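
The "complete and accurate" item can be partly automated. The following is a minimal sketch, not part of the package: `missingPackageFields` and the `REQUIRED_FIELDS` list are hypothetical, and the set of fields considered required is an assumption to adapt to the project.

```javascript
// Hypothetical pre-publish check: lists required fields that are absent
// from an already-parsed package.json object.
// The field list is an assumption, not something npm itself enforces.
const REQUIRED_FIELDS = ['name', 'version', 'description', 'main', 'license', 'repository'];

function missingPackageFields(pkg, required = REQUIRED_FIELDS) {
  return required.filter((field) => {
    const value = pkg[field];
    return value === undefined || value === null || value === '';
  });
}

// Example with a deliberately incomplete manifest
const pkg = { name: 'flixscaper', version: '1.0.0', main: 'src/index.js' };
console.log(missingPackageFields(pkg)); // [ 'description', 'license', 'repository' ]
```

An empty array means every listed field is present; accuracy of the values still needs a manual review.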
### Version Management
#### Semantic Versioning
```bash
# Patch version (bug fixes)
npm version patch
# Minor version (new features, backward compatible)
npm version minor
# Major version (breaking changes)
npm version major
```
#### Version Numbering Rules
- **MAJOR**: Breaking changes (API changes, Node.js version requirements)
- **MINOR**: New features (new Turkish patterns, performance improvements)
- **PATCH**: Bug fixes (error handling, small fixes)
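
To make the rules concrete, here is a small sketch of how each bump type changes the number. `bumpVersion` is a hypothetical helper written for illustration; it only handles plain `MAJOR.MINOR.PATCH` strings and is not how `npm version` is implemented.

```javascript
// Hypothetical helper: computes the next version for a given bump type.
// Only plain MAJOR.MINOR.PATCH strings are handled (no prerelease suffixes).
function bumpVersion(version, type) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) throw new Error(`Not a plain MAJOR.MINOR.PATCH version: ${version}`);
  const [major, minor, patch] = match.slice(1).map(Number);
  switch (type) {
    case 'major': return `${major + 1}.0.0`;               // breaking change resets minor and patch
    case 'minor': return `${major}.${minor + 1}.0`;        // new feature resets patch
    case 'patch': return `${major}.${minor}.${patch + 1}`; // bug fix
    default: throw new Error(`Unknown bump type: ${type}`);
  }
}

console.log(bumpVersion('1.0.0', 'patch')); // 1.0.1
console.log(bumpVersion('1.0.1', 'minor')); // 1.1.0
console.log(bumpVersion('1.1.0', 'major')); // 2.0.0
```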
### Package.json Configuration
```json
{
  "name": "flixscaper",
  "version": "1.0.0",
  "description": "Netflix metadata scraper.",
  "type": "module",
  "main": "src/index.js",
  "exports": {
    ".": "./src/index.js",
    "./parser": "./src/parser.js",
    "./headless": "./src/headless.js"
  },
  "files": [
    "src/",
    "README.md",
    "LICENSE",
    "CHANGELOG.md"
  ],
  "engines": {
    "node": ">=18"
  },
  "keywords": [
    "netflix",
    "scraper",
    "metadata",
    "turkish",
    "flixscaper"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/username/flixscaper.git"
  }
}
```
### Publishing Process
#### 1. Local Testing
```bash
# Test package locally
npm pack
# Install in test project
npm install ./flixscaper-1.0.0.tgz
# Test functionality
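# --input-type=module is needed because static import is not allowed in a plain `node -e` script
node --input-type=module -e "import { scraperNetflix } from 'flixscaper'; console.log('Import successful')"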
```
#### 2. NPM Registry Publishing
```bash
# Login to npm
npm login
# Publish to public registry
npm publish
# Publish with beta tag
npm publish --tag beta
# Publish dry run
npm publish --dry-run
```
#### 3. Private Registry Publishing
```bash
# Publish to private registry
npm publish --registry https://registry.yourcompany.com
# Configure default registry
npm config set registry https://registry.yourcompany.com
```
## 🏗️ Build & Distribution
### Source Distribution
MetaScraper is distributed as source code with minimal processing:
```text
# Files included in distribution
src/
├── index.js      # Main entry point
├── parser.js     # HTML parsing logic
├── headless.js   # Playwright integration
└── polyfill.js   # Node.js compatibility
# Documentation files
README.md
LICENSE
CHANGELOG.md
# Configuration files
package.json
```
### Browser/Node.js Compatibility
#### Node.js Support Matrix
| Node.js Version | Support Status | Notes |
|-----------------|----------------|-------|
| 18.x | ✅ Full Support | Requires polyfill |
| 20.x | ✅ Full Support | Polyfill optional |
| 22.x | ✅ Full Support | Native support |
| 16.x | ❌ Not Supported | Use older version or upgrade |
| <16.x | ❌ Not Supported | Major compatibility issues |
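
The matrix can also be enforced at startup. This is a sketch under the assumption that the minimum major version mirrors the `"engines": { "node": ">=18" }` entry in `package.json`; `isSupportedNode` and `MIN_NODE_MAJOR` are hypothetical names, not part of the package.

```javascript
// Minimum major version, assumed to mirror "engines": { "node": ">=18" }.
const MIN_NODE_MAJOR = 18;

// Accepts "v20.11.0" (process.version) or "20.11.0" (process.versions.node).
function isSupportedNode(version) {
  const major = Number.parseInt(String(version).replace(/^v/, ''), 10);
  return Number.isInteger(major) && major >= MIN_NODE_MAJOR;
}

// Warn early and set a failing exit code instead of crashing later on a missing API.
if (!isSupportedNode(process.version)) {
  console.error(
    `Node.js ${process.version} is not supported; version ${MIN_NODE_MAJOR} or later is required.`
  );
  process.exitCode = 1;
}
```

npm only warns about an `engines` mismatch by default, so a runtime check like this gives users a clearer message.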
#### Compatibility Layer
```javascript
// src/polyfill.js - Automatic compatibility handling
import { Blob } from 'node:buffer';
// Only apply polyfill if needed
if (typeof globalThis.File === 'undefined') {
  class PolyfillFile extends Blob {
    constructor(parts, name, options = {}) {
      super(parts, options);
      this.name = String(name);
      this.lastModified = options.lastModified ?? Date.now();
    }
  }
  globalThis.File = PolyfillFile;
}
globalThis.Blob = globalThis.Blob || Blob;
```
## 🔄 Continuous Integration/Deployment
### GitHub Actions Workflow
```yaml
# .github/workflows/deploy.yml
name: Deploy Package
on:
  push:
    tags:
      - 'v*'
  release:
    types: [published]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18.x, 20.x, 22.x]
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
          registry-url: 'https://registry.npmjs.org'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Run linting
        run: npm run lint
      - name: Check build
        run: npm pack
  publish:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'release' || startsWith(github.ref, 'refs/tags/')
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '20.x'
          cache: 'npm'
          registry-url: 'https://registry.npmjs.org'
      - name: Install dependencies
        run: npm ci
      - name: Build package
        run: npm pack
      - name: Publish to NPM
        run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
```
### Automated Testing Pipeline
```yaml
# .github/workflows/test.yml
name: Test Suite
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    # Use the OS from the matrix; a fixed ubuntu-latest would ignore the os entries below
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        node-version: [18.x, 20.x, 22.x]
        os: [ubuntu-latest, windows-latest, macos-latest]
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright (if needed)
        run: npx playwright install chromium
      - name: Run tests
        run: npm test -- --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v3
```
## 🐳 Docker Deployment
### Dockerfile
```dockerfile
# Dockerfile
FROM node:18-alpine
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev
# Copy source code
COPY src/ ./src/
# Create non-root user and add it to the nodejs group
RUN addgroup -g 1001 -S nodejs
RUN adduser -S flixscaper -u 1001 -G nodejs
# Change ownership
RUN chown -R flixscaper:nodejs /app
USER flixscaper
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "import('flixscaper').then(() => process.exit(0)).catch(() => process.exit(1))"
EXPOSE 3000
CMD ["node", "-e", "import('flixscaper').then(m => console.log('MetaScraper ready'))"]
```
### Docker Compose
```yaml
# docker-compose.yml
version: '3.8'
services:
  flixscaper:
    build: .
    container_name: flixscaper
    environment:
      - NODE_ENV=production
    volumes:
      - ./logs:/app/logs
    restart: unless-stopped
  flixscaper-test:
    build: .
    container_name: flixscaper-test
    command: npm test
    environment:
      - NODE_ENV=test
    volumes:
      - .:/app
      - /app/node_modules
```
### Building Docker Images
```bash
# Build image
docker build -t flixscaper:latest .
# Build with specific version
docker build -t flixscaper:1.0.0 .
# Run container
docker run --rm flixscaper:latest node -e "
import('flixscaper').then(async (m) => {
const result = await m.scraperNetflix('https://www.netflix.com/title/80189685');
console.log(result);
})
"
```
## 🔒 Security Considerations
### Package Security
#### Dependency Scanning
```bash
# Audit dependencies for vulnerabilities
npm audit
# Fix vulnerabilities
npm audit fix
# Generate security report
npm audit --json > security-report.json
```
#### Secure Publishing
```bash
# Use 2FA for npm account
npm profile enable-2fa
# Check package contents before publishing (lists files without creating a tarball)
npm pack --dry-run
# Create the tarball, then scan its file names for likely secrets
npm pack
tar -tzf flixscaper-*.tgz | grep -iE "(key|secret|password|token)" || echo "No suspicious file names found"
```
### Runtime Security
#### Input Validation
```javascript
// Ensure all inputs are validated
function validateInput(url, options = {}) {
  if (!url || typeof url !== 'string') {
    throw new Error('Invalid URL provided');
  }
  // Validate URL format
  try {
    new URL(url);
  } catch {
    throw new Error('Invalid URL format');
  }
  // Sanitize options
  const safeOptions = {
    headless: Boolean(options.headless),
    timeoutMs: Math.max(1000, Math.min(60000, Number(options.timeoutMs) || 15000)),
    userAgent: typeof options.userAgent === 'string' ? options.userAgent : undefined
  };
  return safeOptions;
}
```
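
The check above accepts any well-formed URL. A stricter variant could also restrict the scheme, host, and path. The sketch below is an assumption about what a Netflix title URL looks like (an optional locale segment such as `/tr/` followed by `/title/<digits>`); `isNetflixTitleUrl` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical stricter check: accept only https Netflix title URLs.
// The accepted path shape (optional locale, then /title/<digits>) is an assumption.
function isNetflixTitleUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable URL
  }
  // Exact host or a subdomain; a suffix check on ".netflix.com" rejects
  // look-alikes such as netflix.com.evil.example
  const hostOk = parsed.hostname === 'netflix.com' || parsed.hostname.endsWith('.netflix.com');
  const pathOk = /^\/(?:[a-z]{2}(?:-[a-z]{2})?\/)?title\/\d+\/?$/i.test(parsed.pathname);
  return parsed.protocol === 'https:' && hostOk && pathOk;
}

console.log(isNetflixTitleUrl('https://www.netflix.com/title/80189685'));  // true
console.log(isNetflixTitleUrl('https://netflix.com.evil.example/title/1')); // false
```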
#### Network Security
```javascript
// Secure request configuration
// Note: userAgent and DEFAULT_USER_AGENT are assumed to be defined by the surrounding module
const secureHeaders = {
  'User-Agent': userAgent || DEFAULT_USER_AGENT,
  'Accept': 'text/html,application/xhtml+xml',
  'Accept-Language': 'en-US,en;q=0.9',
  'Cache-Control': 'no-cache',
  'Pragma': 'no-cache'
};
// Rate limiting consideration
const requestDelay = 1000; // 1 second between requests
```
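
One way to apply a fixed delay between requests is to run the tasks sequentially and sleep between them. This is a minimal sketch, not the package's own mechanism; `sleep` and `runWithDelay` are hypothetical helpers, and the tasks are placeholders for functions that would each scrape one URL.

```javascript
// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run async tasks one at a time, waiting `delayMs` between consecutive tasks.
// Each task is a function returning a promise, so nothing starts early.
async function runWithDelay(tasks, delayMs = 1000) {
  const results = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) await sleep(delayMs); // no wait before the first task
    results.push(await tasks[i]());
  }
  return results;
}

// Usage with placeholder tasks and a short delay
runWithDelay([() => Promise.resolve('a'), () => Promise.resolve('b')], 10)
  .then((results) => console.log(results)); // [ 'a', 'b' ]
```

Passing functions rather than promises matters here: a promise starts its work as soon as it is created, which would defeat the delay.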
## 📊 Monitoring & Analytics
### Usage Analytics
#### Basic Metrics Collection
```javascript
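import { createRequire } from 'node:module';
// The package is an ES module ("type": "module"), so `require` must be created explicitly
const require = createRequire(import.meta.url);
// Optional analytics (user consent required)
function trackUsage(url, options, success, duration) {
  if (!options.analytics) return;
  const metrics = {
    timestamp: Date.now(),
    url: url.replace(/\/title\/\d+/, '/title/XXXXXX'), // Anonymize
    headless: options.headless,
    success: success,
    duration: duration,
    nodeVersion: process.version,
    // Resolved relative to this file; adjust the path if the file lives in src/
    version: require('./package.json').version
  };
  // Send to analytics service (optional)
  // analytics.track('flixscaper_usage', metrics);
}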
```
#### Error Tracking
```javascript
function trackError(error, context) {
  const errorInfo = {
    message: error.message,
    stack: error.stack,
    context: context,
    timestamp: Date.now(),
    nodeVersion: process.version
  };
  // Log for debugging
  console.error('MetaScraper Error:', errorInfo);
  // Optional: Send to error tracking service
  // errorTracker.captureException(error, { extra: context });
}
```
### Performance Monitoring
```javascript
// Performance metrics
class PerformanceMonitor {
  constructor() {
    this.metrics = {
      totalRequests: 0,
      successfulRequests: 0,
      averageResponseTime: 0,
      errorCounts: {}
    };
  }
  recordRequest(duration, success, error = null) {
    this.metrics.totalRequests++;
    if (success) {
      this.metrics.successfulRequests++;
    } else {
      // Group failures by message; use a fixed key when no error object is passed
      const key = error?.message ?? 'unknown';
      this.metrics.errorCounts[key] = (this.metrics.errorCounts[key] || 0) + 1;
    }
    // Update average response time (running mean)
    this.metrics.averageResponseTime =
      (this.metrics.averageResponseTime * (this.metrics.totalRequests - 1) + duration)
      / this.metrics.totalRequests;
  }
  getMetrics() {
    const { totalRequests, successfulRequests } = this.metrics;
    return {
      ...this.metrics,
      // Avoid NaN (0 / 0) before any request has been recorded
      successRate: totalRequests === 0 ? 0 : (successfulRequests / totalRequests) * 100
    };
  }
}
```
## 🔄 Version Management
### Release Process
#### 1. Development Release
```bash
# Create feature branch
git checkout -b feature/new-patterns
# Implement changes
# Add tests
# Update documentation
# Create development release
npm version prerelease --preid=dev
git push --tags
npm publish --tag dev
```
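
The dist-tag used above (`dev`) matches the prerelease identifier in the version. A release script could derive it instead of hard-coding it. This is a sketch under the assumption that every prerelease uses a named identifier such as `dev` or `beta`; `distTagFor` is a hypothetical helper.

```javascript
// Hypothetical helper: choose the npm dist-tag from the version string.
// Assumes prereleases carry a named identifier (1.1.0-dev.0, 2.0.0-beta.3).
function distTagFor(version) {
  // The identifier is the first dot-separated part after the hyphen.
  const match = /^\d+\.\d+\.\d+-([0-9A-Za-z-]+)/.exec(version);
  return match ? match[1] : 'latest';
}

console.log(distTagFor('1.1.0'));        // latest
console.log(distTagFor('1.1.0-dev.0'));  // dev
console.log(distTagFor('2.0.0-beta.3')); // beta
```

Publishing prereleases under their own tag keeps `npm install flixscaper` pointing at the last stable version.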
#### 2. Production Release
```bash
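# Merge to main
git checkout main
git merge develop
# Update version (creates a version commit and a git tag)
npm version minor # or patch/major
# Push the version commit and its tag so the release can reference it
git push --follow-tags
# Create GitHub release
gh release create v1.1.0 --generate-notes
# Publish to npm
npm publish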
```
#### 3. Hotfix Release
```bash
# Create hotfix branch from main
git checkout -b hotfix/critical-bug main
# Fix issue, then bump the patch version
npm version patch
# Merge the hotfix into main
git checkout main
git merge hotfix/critical-bug
git push --follow-tags
# Publish immediately
npm publish --tag latest
# Merge back to develop
git checkout develop
git merge main
### Changelog Management
```markdown
# CHANGELOG.md
## [1.1.0] - 2025-11-23
### Added
- New Turkish UI pattern: "yeni başlık"
- Performance monitoring API
- Docker support
### Fixed
- Memory leak in Playwright cleanup
- URL validation for Turkish Netflix domains
### Changed
- Improved error messages in Turkish
- Updated Node.js compatibility matrix
### Deprecated
- Support for Node.js 16.x (will be removed in 2.0.0)
## [1.0.1] - 2025-11-20
### Fixed
- Critical bug in title cleaning
- Missing year extraction for movies
```
## 🌐 Distribution Channels
### NPM Registry
Publishing configuration in `package.json`:
```json
{
  "publishConfig": {
    "access": "public",
    "registry": "https://registry.npmjs.org"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/username/flixscaper.git"
  },
  "bugs": {
    "url": "https://github.com/username/flixscaper/issues"
  },
  "homepage": "https://github.com/username/flixscaper#readme"
}
```
### CDN Distribution
```javascript
// Browser usage is a future enhancement and is not available yet:
// the package currently ships only src/ and has no dist/ bundle.
// Planned CDN URL once a browser build exists:
// https://cdn.jsdelivr.net/npm/flixscaper/dist/flixscaper.min.js
import('https://cdn.jsdelivr.net/npm/flixscaper@latest/dist/flixscaper.min.js')
  .then(module => {
    const { scraperNetflix } = module;
    // Use in browser
  });
```
### Private Distribution
```bash
# For enterprise/internal distribution
npm config set @company:registry https://npm.company.com
# Publish to private registry
npm publish --registry https://npm.company.com
# Install from private registry
npm install @company/flixscaper
```
---
*Deployment guide last updated: 2025-11-23*