181 lines
6.0 KiB
Markdown
181 lines
6.0 KiB
Markdown
# Changelog
|
|
|
|
All notable changes to MetaScraper will be documented in this file.
|
|
|
|
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
|
|
## [Unreleased]
|
|
|
|
### Planned
|
|
- Multi-language UI pattern support
|
|
- Browser performance optimizations
|
|
- API rate limiting built-in
|
|
- WebSocket streaming support
|
|
|
|
## [1.0.0] - 2025-11-23
|
|
|
|
### Added
|
|
- 🎯 Core Netflix metadata scraping functionality
|
|
- 🌍 Turkish UI text pattern removal
|
|
- 📦 Dual-mode operation: Static HTML + Playwright fallback
|
|
- 🏗️ Modular architecture with separate parser, headless, and polyfill modules
|
|
- 🔧 Comprehensive API with `scraperNetflix` main function
|
|
- 📚 Complete documentation suite in `/doc` directory
|
|
- 🧪 Integration tests with real Netflix URLs
|
|
- 🔍 JSON-LD structured data extraction
|
|
- ⚡ Performance-optimized static parsing
|
|
- 🛡️ Error handling with Turkish error messages
|
|
- 📊 URL normalization for various Netflix formats
|
|
- 🎨 Clean title extraction with Netflix suffix removal
|
|
- 📝 Node.js 18+ compatibility with minimal polyfills
|
|
|
|
### Technical Features
|
|
- **HTML Parser**: Cheerio-based static HTML parsing
|
|
- **Title Cleaning**: Turkish and English UI pattern removal
|
|
- **Browser Automation**: Optional Playwright integration
|
|
- **URL Processing**: Netflix URL normalization and validation
|
|
- **Metadata Extraction**: Year, title, and season information
|
|
- **Error Recovery**: Automatic fallback strategies
|
|
- **Memory Management**: Proper browser resource cleanup
|
|
- **Network Handling**: Configurable timeouts and User-Agents
|
|
|
|
### Supported Content Types
|
|
- ✅ Movies with year extraction
|
|
- ✅ TV series with season information
|
|
- ✅ Turkish Netflix interface optimization
|
|
- ✅ Various Netflix URL formats
|
|
- ✅ Region-agnostic content extraction
|
|
|
|
### Turkish Localization
|
|
- Removes UI text: "izlemenizi bekliyor", "izleyin", "devam et", "başla"
|
|
- Handles season-specific text: "Sezon X izlemeye devam"
|
|
- Netflix suffix cleaning: " | Netflix" removal
|
|
- Turkish error messages for better UX
|
|
|
|
### Performance Characteristics
|
|
- Static mode: 200-500ms response time
|
|
- Headless mode: 2-5 seconds (when needed)
|
|
- Memory usage: <50MB (static), 100-200MB (headless)
|
|
- Success rate: ~95% with headless fallback
|
|
|
|
### Documentation
|
|
- 📖 **API Reference**: Complete function documentation with examples
|
|
- 🏗️ **Architecture Guide**: System design and technical decisions
|
|
- 👨💻 **Development Guide**: Setup, conventions, and contribution process
|
|
- 🧪 **Testing Guide**: Test patterns and procedures
|
|
- 🔧 **Troubleshooting**: Common issues and solutions
|
|
- ❓ **FAQ**: Frequently asked questions
|
|
- 📦 **Deployment Guide**: Packaging and publishing instructions
|
|
|
|
### Dependencies
|
|
- **cheerio** (^1.0.0-rc.12) - HTML parsing
|
|
- **playwright** (^1.41.2) - Optional browser automation
|
|
- **vitest** (^1.1.3) - Testing framework
|
|
- Node.js 18+ compatibility with minimal polyfills
|
|
|
|
### Quality Assurance
|
|
- ✅ Integration tests with live Netflix URLs
|
|
- ✅ Turkish UI text pattern testing
|
|
- ✅ Error handling validation
|
|
- ✅ Performance benchmarking
|
|
- ✅ Node.js version compatibility testing
|
|
|
|
---
|
|
|
|
## Version History
|
|
|
|
### Development Phase (Pre-1.0)
|
|
|
|
The project evolved through several iterations:
|
|
|
|
1. **Initial Concept**: Basic Netflix HTML parsing
|
|
2. **Turkish Localization**: Added Turkish UI text removal
|
|
3. **Dual-Mode Architecture**: Implemented static + headless fallback
|
|
4. **Modular Design**: Separated concerns into dedicated modules
|
|
5. **Production Ready**: Comprehensive testing and documentation
|
|
|
|
### Key Technical Decisions
|
|
|
|
- **ES6+ Modules**: Modern JavaScript with import/export
|
|
- **Static-First Strategy**: Prioritize performance over completeness
|
|
- **Graceful Degradation**: Continue operation when optional deps fail
|
|
- **Minimal Polyfills**: Targeted compatibility layer for Node.js
|
|
- **Comprehensive Testing**: Live data testing with real Netflix pages
|
|
- **Documentation-First**: Extensive documentation for future maintainers
|
|
|
|
### Breaking Changes from Development
|
|
|
|
- Function renamed from `fetchNetflixMeta` → `scraperNetflix`
|
|
- `normalizeNetflixUrl` integrated into main function
|
|
- Polyfill approach simplified for Node.js 24+ compatibility
|
|
- Error messages localized to Turkish
|
|
- Module structure reorganized for better maintainability
|
|
|
|
---
|
|
|
|
## Migration Guide
|
|
|
|
### For Users Upgrading from Development Versions
|
|
|
|
If you were using early development versions:
|
|
|
|
```javascript
|
|
// Old API (development)
|
|
import { fetchNetflixMeta, normalizeNetflixUrl } from 'flixscaper';
|
|
|
|
const normalized = normalizeNetflixUrl(url);
|
|
const result = await fetchNetflixMeta(normalized);
|
|
|
|
// New API (1.0.0)
|
|
import { scraperNetflix } from 'flixscaper';
|
|
|
|
const result = await scraperNetflix(url);
|
|
```
|
|
|
|
### Key Changes
|
|
1. **Single Function**: `scraperNetflix` handles everything
|
|
2. **Integrated Normalization**: No separate URL normalization function
|
|
3. **Better Error Messages**: Turkish error messages for Turkish users
|
|
4. **Improved Performance**: Optimized static parsing
|
|
5. **Better Documentation**: Complete API and architectural documentation
|
|
|
|
---
|
|
|
|
## Roadmap
|
|
|
|
### Version 1.1 (Planned)
|
|
- [ ] Additional Turkish UI patterns
|
|
- [ ] Performance optimizations
|
|
- [ ] Better error recovery
|
|
- [ ] Request caching support
|
|
- [ ] Batch processing utilities
|
|
|
|
### Version 1.2 (Planned)
|
|
- [ ] Multi-language support
|
|
- [ ] Rate limiting built-in
|
|
- [ ] Retry logic improvements
|
|
- [ ] Metrics and monitoring
|
|
- [ ] Browser pool optimization
|
|
|
|
### Version 2.0 (Future)
|
|
- [ ] Multi-platform support (YouTube, etc.)
|
|
- [ ] REST API server version
|
|
- [ ] Browser extension
|
|
- [ ] GraphQL API
|
|
- [ ] Real-time scraping
|
|
|
|
---
|
|
|
|
## Support
|
|
|
|
For questions, issues, or contributions:
|
|
|
|
- **Documentation**: See `/doc` directory for comprehensive guides
|
|
- **Issues**: [GitHub Issues](https://github.com/username/flixscaper/issues)
|
|
- **Examples**: Check `local-demo.js` for usage patterns
|
|
- **Testing**: Run `npm test` to verify functionality
|
|
|
|
---
|
|
|
|
*Changelog format based on [Keep a Changelog](https://keepachangelog.com/)* |