# MetaScraper Deployment Guide
## 📦 Package Publishing
### Preparation Checklist
Before publishing, ensure:
- [ ] All tests pass: `npm test`
- [ ] Code is properly documented
- [ ] Version number follows semantic versioning
- [ ] CHANGELOG.md is updated
- [ ] Package.json is complete and accurate
- [ ] License file is present
- [ ] README.md is up to date
### Version Management
#### Semantic Versioning
```bash
# Patch version (bug fixes)
npm version patch
# Minor version (new features, backward compatible)
npm version minor
# Major version (breaking changes)
npm version major
```
#### Version Numbering Rules
- **MAJOR**: Breaking changes (API changes, Node.js version requirements)
- **MINOR**: New features (new Turkish patterns, performance improvements)
- **PATCH**: Bug fixes (error handling, small fixes)
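The bump rules above can be illustrated with a small sketch. `bumpVersion` is a hypothetical helper, not part of MetaScraper or npm; it handles only plain `MAJOR.MINOR.PATCH` strings and ignores prerelease tags.

```javascript
// Hypothetical helper illustrating the semantic-versioning rules above.
// Handles plain "MAJOR.MINOR.PATCH" strings only (no prerelease or build tags).
function bumpVersion(version, release) {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`Unsupported version string: ${version}`);
  }
  const [major, minor, patch] = match.slice(1).map(Number);

  switch (release) {
    case 'major': // breaking change: reset minor and patch
      return `${major + 1}.0.0`;
    case 'minor': // backward-compatible feature: reset patch
      return `${major}.${minor + 1}.0`;
    case 'patch': // bug fix
      return `${major}.${minor}.${patch + 1}`;
    default:
      throw new Error(`Unknown release type: ${release}`);
  }
}

console.log(bumpVersion('1.0.1', 'minor')); // 1.1.0
```

In practice `npm version` performs the bump and also creates the git commit and tag; the sketch only shows which components change.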
### Package.json Configuration
```json
{
  "name": "flixscaper",
  "version": "1.0.0",
  "description": "Netflix meta veri scraper.",
  "type": "module",
  "main": "src/index.js",
  "exports": {
    ".": "./src/index.js",
    "./parser": "./src/parser.js",
    "./headless": "./src/headless.js"
  },
  "files": [
    "src/",
    "README.md",
    "LICENSE",
    "CHANGELOG.md"
  ],
  "engines": {
    "node": ">=18"
  },
  "keywords": [
    "netflix",
    "scraper",
    "metadata",
    "turkish",
    "flixscaper"
  ],
  "repository": {
    "type": "git",
    "url": "https://github.com/username/flixscaper.git"
  }
}
```
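To illustrate how the `exports` map above exposes subpaths to consumers, here is a simplified sketch. `resolveExport` is a hypothetical function that handles only exact string targets; Node.js's real resolver also supports conditions and patterns, which are omitted here.

```javascript
// Simplified illustration of subpath lookup in an "exports" map.
// Not Node's actual resolver: conditional exports and "*" patterns are ignored.
const exportsMap = {
  '.': './src/index.js',
  './parser': './src/parser.js',
  './headless': './src/headless.js'
};

function resolveExport(packageName, specifier, map) {
  if (specifier !== packageName && !specifier.startsWith(`${packageName}/`)) {
    throw new Error(`Specifier does not belong to ${packageName}: ${specifier}`);
  }
  // "flixscaper" -> ".", "flixscaper/parser" -> "./parser"
  const subpath = '.' + specifier.slice(packageName.length);
  const target = map[subpath];
  if (typeof target !== 'string') {
    // Paths not listed in "exports" are not importable by consumers
    throw new Error(`Subpath not exported: ${subpath}`);
  }
  return target;
}

console.log(resolveExport('flixscaper', 'flixscaper/parser', exportsMap)); // ./src/parser.js
```

One consequence worth noting: a file such as `src/polyfill.js` ships in the package but is not listed in `exports`, so consumers cannot import it directly.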
### Publishing Process
#### 1. Local Testing
```bash
# Test package locally
npm pack
# Install in test project
npm install ./flixscaper-1.0.0.tgz
# Test functionality
node --input-type=module -e "import { scraperNetflix } from 'flixscaper'; console.log('Import successful')"
```
#### 2. NPM Registry Publishing
```bash
# Login to npm
npm login
# Publish to public registry
npm publish
# Publish with beta tag
npm publish --tag beta
# Publish dry run
npm publish --dry-run
```
#### 3. Private Registry Publishing
```bash
# Publish to private registry
npm publish --registry https://registry.yourcompany.com
# Configure default registry
npm config set registry https://registry.yourcompany.com
```
## 🏗️ Build & Distribution
### Source Distribution
MetaScraper is distributed as source code with minimal processing:
```bash
# Files included in distribution
src/
├── index.js # Main entry point
├── parser.js # HTML parsing logic
├── headless.js # Playwright integration
└── polyfill.js # Node.js compatibility
# Documentation files
README.md
LICENSE
CHANGELOG.md
# Configuration files
package.json
```
### Browser/Node.js Compatibility
#### Node.js Support Matrix
| Node.js Version | Support Status | Notes |
|-----------------|----------------|-------|
| 18.x | ✅ Full Support | Requires polyfill |
| 20.x | ✅ Full Support | Polyfill optional |
| 22.x | ✅ Full Support | Native support |
| 16.x | ❌ Not Supported | Use older version or upgrade |
| <16.x | ❌ Not Supported | Major compatibility issues |
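A package can also fail fast at startup when it runs on an unsupported runtime. The sketch below is an assumption about how such a guard might look; `isSupportedNode` and `MIN_NODE_MAJOR` are illustrative names, and the threshold mirrors the `>=18` requirement in the matrix.

```javascript
// Illustrative runtime guard mirroring the support matrix (Node.js >= 18).
const MIN_NODE_MAJOR = 18;

function isSupportedNode(versionString) {
  // process.version looks like "v20.11.1"; tolerate a missing "v" prefix
  const major = Number.parseInt(versionString.replace(/^v/, ''), 10);
  return Number.isInteger(major) && major >= MIN_NODE_MAJOR;
}

if (!isSupportedNode(process.version)) {
  console.warn(`Node.js ${process.version} is not supported; use ${MIN_NODE_MAJOR}.x or newer.`);
}
```

The `engines` field in package.json only produces an install-time warning by default, so a runtime check like this is a complementary safeguard rather than a replacement.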
#### Compatibility Layer
```javascript
// src/polyfill.js - Automatic compatibility handling
import { Blob } from 'node:buffer';
// Only apply polyfill if needed
if (typeof globalThis.File === 'undefined') {
  class PolyfillFile extends Blob {
    constructor(parts, name, options = {}) {
      super(parts, options);
      this.name = String(name);
      this.lastModified = options.lastModified ?? Date.now();
    }
  }
  globalThis.File = PolyfillFile;
}
globalThis.Blob = globalThis.Blob || Blob;
```
## 🔄 Continuous Integration/Deployment
### GitHub Actions Workflow
```yaml
# .github/workflows/deploy.yml
name: Deploy Package
on:
  push:
    tags:
      - 'v*'
  release:
    types: [published]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18.x, 20.x, 22.x]
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
          registry-url: 'https://registry.npmjs.org'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Run linting
        run: npm run lint
      - name: Check build
        run: npm pack
  publish:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'release' || startsWith(github.ref, 'refs/tags/')
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '20.x'
          cache: 'npm'
          registry-url: 'https://registry.npmjs.org'
      - name: Install dependencies
        run: npm ci
      - name: Build package
        run: npm pack
      - name: Publish to NPM
        run: npm publish
        env:
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
```
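Because the workflow publishes on `v*` tags, it can help to confirm that the tag matches the `version` in package.json before `npm publish` runs. The sketch below is one possible check; `tagMatchesVersion` is a hypothetical helper, and wiring it into the workflow (for example, reading `GITHUB_REF` and package.json in a script step) is left as an assumption.

```javascript
// Hypothetical pre-publish check: does the pushed tag match package.json's version?
function tagMatchesVersion(gitRef, packageVersion) {
  // GitHub Actions exposes tag pushes as "refs/tags/<tag>"
  const prefix = 'refs/tags/';
  if (!gitRef.startsWith(prefix)) {
    return false; // not a tag ref (e.g. a branch push)
  }
  const tag = gitRef.slice(prefix.length);
  // `npm version` creates tags of the form "v<version>" by default
  return tag === `v${packageVersion}`;
}

console.log(tagMatchesVersion('refs/tags/v1.1.0', '1.1.0')); // true
console.log(tagMatchesVersion('refs/tags/v1.1.0', '1.0.0')); // false
```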
### Automated Testing Pipeline
```yaml
# .github/workflows/test.yml
name: Test Suite
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    # Use the matrix OS so the Windows and macOS entries actually run
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        node-version: [18.x, 20.x, 22.x]
        os: [ubuntu-latest, windows-latest, macos-latest]
    steps:
      - uses: actions/checkout@v3
      - name: Setup Node.js ${{ matrix.node-version }}
        uses: actions/setup-node@v3
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright (if needed)
        run: npx playwright install chromium
      - name: Run tests
        run: npm test -- --coverage
      - name: Upload coverage
        uses: codecov/codecov-action@v3
```
## 🐳 Docker Deployment
### Dockerfile
```dockerfile
# Dockerfile
FROM node:18-alpine
WORKDIR /app
# Copy package files
COPY package*.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev
# Copy source code
COPY src/ ./src/
# Create non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S flixscaper -u 1001
# Change ownership
RUN chown -R flixscaper:nodejs /app
USER flixscaper
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD node -e "import('flixscaper').then(() => process.exit(0)).catch(() => process.exit(1))"
EXPOSE 3000
CMD ["node", "-e", "import('flixscaper').then(m => console.log('MetaScraper ready'))"]
```
### Docker Compose
```yaml
# docker-compose.yml
version: '3.8'
services:
  flixscaper:
    build: .
    container_name: flixscaper
    environment:
      - NODE_ENV=production
    volumes:
      - ./logs:/app/logs
    restart: unless-stopped
  flixscaper-test:
    build: .
    container_name: flixscaper-test
    command: npm test
    environment:
      - NODE_ENV=test
    volumes:
      - .:/app
      - /app/node_modules
```
### Building Docker Images
```bash
# Build image
docker build -t flixscaper:latest .
# Build with specific version
docker build -t flixscaper:1.0.0 .
# Run container
docker run --rm flixscaper:latest node -e "
  import('flixscaper').then(async (m) => {
    const result = await m.scraperNetflix('https://www.netflix.com/title/80189685');
    console.log(result);
  })
"
```
## 🔒 Security Considerations
### Package Security
#### Dependency Scanning
```bash
# Audit dependencies for vulnerabilities
npm audit
# Fix vulnerabilities
npm audit fix
# Generate security report
npm audit --json > security-report.json
```
#### Secure Publishing
```bash
# Use 2FA for npm account
npm profile enable-2fa
# Preview package contents without writing a tarball
npm pack --dry-run
# Build the tarball, then check its file list for sensitive-looking names
npm pack
tar -tzf flixscaper-*.tgz | grep -E "(key|secret|password|token)" || echo "No sensitive files found"
```
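The same name-based check can be expressed in JavaScript, for instance inside a pre-publish script. This is a sketch only: `findSensitiveFiles` is a hypothetical helper, the pattern reuses the words from the `grep` expression above (matched case-insensitively here), and a filename match is a hint rather than proof that a file holds secrets.

```javascript
// Hypothetical helper mirroring the grep check above.
const SENSITIVE_NAME = /(key|secret|password|token)/i;

function findSensitiveFiles(paths) {
  // Return every path whose name contains one of the suspicious words
  return paths.filter((path) => SENSITIVE_NAME.test(path));
}

const packed = [
  'package/src/index.js',
  'package/README.md',
  'package/config/api-token.txt'
];

console.log(findSensitiveFiles(packed)); // [ 'package/config/api-token.txt' ]
```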
### Runtime Security
#### Input Validation
```javascript
// Ensure all inputs are validated
function validateInput(url, options = {}) {
  if (!url || typeof url !== 'string') {
    throw new Error('Invalid URL provided');
  }
  // Validate URL format
  try {
    new URL(url);
  } catch {
    throw new Error('Invalid URL format');
  }
  // Sanitize options
  const safeOptions = {
    headless: Boolean(options.headless),
    timeoutMs: Math.max(1000, Math.min(60000, Number(options.timeoutMs) || 15000)),
    userAgent: typeof options.userAgent === 'string' ? options.userAgent : undefined
  };
  return safeOptions;
}
```
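`validateInput` accepts any syntactically valid URL. If the scraper should only ever contact Netflix, a stricter check is possible. The sketch below is an assumption rather than MetaScraper's actual behavior: `isNetflixTitleUrl` is a hypothetical helper, and the host rule and `/title/<id>` path shape (with an optional locale prefix) are inferred from the example URL used elsewhere in this guide.

```javascript
// Hypothetical stricter check; the allowed host and path shape are assumptions.
function isNetflixTitleUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable URL
  }
  const host = parsed.hostname.toLowerCase();
  const isNetflixHost = host === 'netflix.com' || host.endsWith('.netflix.com');
  // Accept paths such as /title/80189685 or /tr/title/80189685
  const isTitlePath = /^\/(?:[a-z]{2}(?:-[a-z]{2})?\/)?title\/\d+\/?$/i.test(parsed.pathname);
  return parsed.protocol === 'https:' && isNetflixHost && isTitlePath;
}

console.log(isNetflixTitleUrl('https://www.netflix.com/title/80189685')); // true
console.log(isNetflixTitleUrl('https://netflix.com.evil.example/title/1')); // false
```

Matching on the parsed hostname rather than on the raw string is the important design choice here: a substring test would accept look-alike hosts such as the second example.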
#### Network Security
```javascript
// Secure request configuration
// (`userAgent` and `DEFAULT_USER_AGENT` are assumed to be defined by the surrounding module)
const secureHeaders = {
  'User-Agent': userAgent || DEFAULT_USER_AGENT,
  'Accept': 'text/html,application/xhtml+xml',
  'Accept-Language': 'en-US,en;q=0.9',
  'Cache-Control': 'no-cache',
  'Pragma': 'no-cache'
};
// Rate limiting consideration
const requestDelay = 1000; // 1 second between requests
```
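The one-second spacing above can be enforced with a small wrapper. This is a minimal sketch, not the library's implementation: `computeWaitMs` and `createThrottle` are hypothetical names, and the wrapper serializes calls through a single promise chain.

```javascript
// Minimal request-spacing sketch; names are illustrative, not part of the library.
function computeWaitMs(lastStartedAt, now, delayMs) {
  if (lastStartedAt === null) return 0; // first request starts immediately
  return Math.max(0, lastStartedAt + delayMs - now);
}

function createThrottle(delayMs) {
  let lastStartedAt = null;
  let queue = Promise.resolve();

  // Returns a function that runs `task` no sooner than delayMs after the previous start
  return function throttle(task) {
    const run = queue.then(async () => {
      const waitMs = computeWaitMs(lastStartedAt, Date.now(), delayMs);
      if (waitMs > 0) {
        await new Promise((resolve) => setTimeout(resolve, waitMs));
      }
      lastStartedAt = Date.now();
      return task();
    });
    // Keep the chain alive even if a task rejects
    queue = run.catch(() => {});
    return run;
  };
}

// Usage: space calls at least 1000 ms apart
const throttle = createThrottle(1000);
throttle(() => console.log('first request'));
throttle(() => console.log('second request, about 1 s later'));
```

Each call returns the task's own promise, so callers can still `await` results and handle rejections individually while the spacing is applied behind the scenes.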
## 📊 Monitoring & Analytics
### Usage Analytics
#### Basic Metrics Collection
```javascript
// Optional analytics (user consent required)
import { createRequire } from 'node:module';
// The package is an ES module ("type": "module"), so `require` must be created explicitly
const require = createRequire(import.meta.url);
function trackUsage(url, options, success, duration) {
  if (!options.analytics) return;
  const metrics = {
    timestamp: Date.now(),
    url: url.replace(/\/title\/\d+/, '/title/XXXXXX'), // Anonymize
    headless: options.headless,
    success: success,
    duration: duration,
    nodeVersion: process.version,
    version: require('./package.json').version
  };
  // Send to analytics service (optional)
  // analytics.track('flixscaper_usage', metrics);
}
```
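The anonymization step in `trackUsage` can be tried on its own. The sketch below extracts the same regular expression into a hypothetical helper, `anonymizeTitleUrl`, to show what is and is not removed.

```javascript
// Same replacement as in trackUsage, isolated for inspection.
function anonymizeTitleUrl(url) {
  // Replaces the numeric title ID; query strings and other path parts are left untouched
  return url.replace(/\/title\/\d+/, '/title/XXXXXX');
}

console.log(anonymizeTitleUrl('https://www.netflix.com/title/80189685'));
// https://www.netflix.com/title/XXXXXX
```

Because only the title ID is masked, any query parameters on the URL would still reach the metrics object and may need separate handling.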
#### Error Tracking
```javascript
function trackError(error, context) {
  const errorInfo = {
    message: error.message,
    stack: error.stack,
    context: context,
    timestamp: Date.now(),
    nodeVersion: process.version
  };
  // Log for debugging
  console.error('MetaScraper Error:', errorInfo);
  // Optional: Send to error tracking service
  // errorTracker.captureException(error, { extra: context });
}
```
### Performance Monitoring
```javascript
// Performance metrics
class PerformanceMonitor {
  constructor() {
    this.metrics = {
      totalRequests: 0,
      successfulRequests: 0,
      averageResponseTime: 0,
      errorCounts: {}
    };
  }
  recordRequest(duration, success, error = null) {
    this.metrics.totalRequests++;
    if (success) {
      this.metrics.successfulRequests++;
    } else {
      // Group failures by message; fall back to a fixed key when no error is supplied
      const key = error?.message ?? 'unknown';
      this.metrics.errorCounts[key] = (this.metrics.errorCounts[key] || 0) + 1;
    }
    // Update average response time
    this.metrics.averageResponseTime =
      (this.metrics.averageResponseTime * (this.metrics.totalRequests - 1) + duration)
      / this.metrics.totalRequests;
  }
  getMetrics() {
    const { totalRequests, successfulRequests } = this.metrics;
    return {
      ...this.metrics,
      // Avoid NaN (0 / 0) before any request has been recorded
      successRate: totalRequests === 0 ? 0 : (successfulRequests / totalRequests) * 100
    };
  }
}
```
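The running-average update in `recordRequest` avoids storing every duration. The standalone sketch below isolates that formula so it can be checked by hand; `updateMean` is an illustrative name, not part of the class.

```javascript
// Incremental mean: newMean = (oldMean * (n - 1) + x) / n, where n counts x.
function updateMean(previousMean, count, value) {
  // `count` is the number of samples including `value`
  return (previousMean * (count - 1) + value) / count;
}

const durations = [100, 200, 600];
let mean = 0;
durations.forEach((duration, index) => {
  mean = updateMean(mean, index + 1, duration);
});

console.log(mean); // 300
```

The intermediate values are 100, 150, and 300, which match the ordinary averages of the first one, two, and three samples.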
## 🔄 Version Management
### Release Process
#### 1. Development Release
```bash
# Create feature branch
git checkout -b feature/new-patterns
# Implement changes
# Add tests
# Update documentation
# Create development release
npm version prerelease --preid=dev
git push --tags
npm publish --tag dev
```
#### 2. Production Release
```bash
# Merge to main
git checkout main
git merge develop
# Update version (creates a commit and a v* tag)
npm version minor # or patch/major
# Push the commit and the new tag
git push --follow-tags
# Create GitHub release
gh release create v1.1.0 --generate-notes
# Publish to npm
npm publish
```
#### 3. Hotfix Release
```bash
# Create hotfix branch from main
git checkout -b hotfix/critical-bug main
# Fix issue
npm version patch
# Merge the fix into main
git checkout main
git merge hotfix/critical-bug
# Publish immediately
npm publish --tag latest
# Merge back to develop (main now contains the hotfix)
git checkout develop
git merge main
```
### Changelog Management
```markdown
# CHANGELOG.md
## [1.1.0] - 2025-11-23
### Added
- New Turkish UI pattern: "yeni başlık"
- Performance monitoring API
- Docker support
### Fixed
- Memory leak in Playwright cleanup
- URL validation for Turkish Netflix domains
### Changed
- Improved error messages in Turkish
- Updated Node.js compatibility matrix
### Deprecated
- Support for Node.js 16.x (will be removed in 2.0.0)
## [1.0.1] - 2025-11-20
### Fixed
- Critical bug in title cleaning
- Missing year extraction for movies
```
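Entries in this format can also be generated from structured data, which keeps headings consistent between releases. The sketch below is hypothetical: `formatChangelogEntry` and its input shape are assumptions, not an existing MetaScraper tool.

```javascript
// Hypothetical generator for a changelog entry in the format shown above.
function formatChangelogEntry({ version, date, sections }) {
  const lines = [`## [${version}] - ${date}`];
  for (const [heading, items] of Object.entries(sections)) {
    if (items.length === 0) continue; // skip empty sections
    lines.push(`### ${heading}`);
    for (const item of items) {
      lines.push(`- ${item}`);
    }
  }
  return lines.join('\n');
}

console.log(formatChangelogEntry({
  version: '1.0.1',
  date: '2025-11-20',
  sections: {
    Fixed: ['Critical bug in title cleaning', 'Missing year extraction for movies']
  }
}));
```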
## 🌐 Distribution Channels
### NPM Registry
```json
// package.json - publishing configuration
{
  "publishConfig": {
    "access": "public",
    "registry": "https://registry.npmjs.org"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/username/flixscaper.git"
  },
  "bugs": {
    "url": "https://github.com/username/flixscaper/issues"
  },
  "homepage": "https://github.com/username/flixscaper#readme"
}
```
### CDN Distribution
```javascript
// For browser usage (future enhancement, not yet available)
// Planned CDN URL:
// https://cdn.jsdelivr.net/npm/flixscaper/dist/flixscaper.min.js
import('https://cdn.jsdelivr.net/npm/flixscaper@latest/dist/flixscaper.min.js')
.then(module => {
const { scraperNetflix } = module;
// Use in browser
});
```
### Private Distribution
```bash
# For enterprise/internal distribution
npm config set @company:registry https://npm.company.com
# Publish to private registry
npm publish --registry https://npm.company.com
# Install from private registry
npm install @company/flixscaper
```
---
*Deployment guide last updated: 2025-11-23*