A web application that lets users upload JPEG and HEIC images and converts them to HTML using OCR (Optical Character Recognition), preserving mathematical formulas in the output. This version adds enhanced HEIC support and improved error handling.
Features
Enhanced HEIC Support: Robust handling of HEIC files with automatic conversion to JPEG
Drag and Drop Interface: Easy file upload with visual feedback for JPEG and HEIC formats
Multiple File Upload: Upload up to 60 files at once; files are handled with batch processing
Advanced OCR: Text recognition using Tesseract.js with mathematical formula detection
Mathematical Formula Recognition: Specialized detection and preservation of mathematical expressions
LaTeX Math Rendering: Mathematical formulas are preserved as LaTeX and rendered with MathJax in the HTML output
Real-time Progress Tracking: Detailed progress monitoring for each file including conversion status
Error Recovery: Comprehensive error handling with detailed error messages
HTML Output Generation: Clean, styled HTML with mathematical notation properly rendered
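The 60-file limit and batch processing listed above could be enforced along the following lines. This is a minimal sketch, not the project's actual code: `MAX_FILES`, `toBatches`, `planUpload`, and the default batch size of 5 are assumptions introduced for illustration.

```typescript
// Assumed limit, taken from the feature list: at most 60 files per upload.
const MAX_FILES = 60;

/** Split items into consecutive batches of at most `size` elements. */
function toBatches<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("batch size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

/** Reject uploads over the limit, then group the files into batches. */
function planUpload<T>(files: readonly T[], batchSize: number = 5): T[][] {
  if (files.length > MAX_FILES) {
    throw new Error(`Too many files: ${files.length} (limit ${MAX_FILES})`);
  }
  return toBatches(files, batchSize);
}
```

Checking the limit before any work starts lets the interface report the problem immediately instead of failing partway through a run.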
Recent Improvements
HEIC Processing Enhancements
Improved Library Loading: Better handling of heic2any library initialization with immediate loading
Enhanced File Detection: More robust HEIC file format detection using both MIME types and file extensions
Better Error Handling: Detailed error messages for conversion failures
Progress Tracking: Real-time status updates during HEIC conversion process
Smart UI Feedback: Visual indicators prevent premature file uploads before libraries are ready
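Detection by both MIME type and file extension, as described above, might look like the sketch below. The `FileLike` interface and `isHeicFile` function are hypothetical names used only for illustration. The extension fallback is useful because a browser may report an empty `type` for HEIC files.

```typescript
// Minimal stand-in for the browser File object (only the fields used here).
interface FileLike {
  name: string;
  type: string;
}

// Registered MIME types for HEIC/HEIF still images and image sequences.
const HEIC_MIME_TYPES = [
  "image/heic",
  "image/heif",
  "image/heic-sequence",
  "image/heif-sequence",
];
const HEIC_EXTENSIONS = [".heic", ".heif"];

/** True if the file looks like HEIC/HEIF by MIME type or by extension. */
function isHeicFile(file: FileLike): boolean {
  const mime = file.type.toLowerCase();
  if (HEIC_MIME_TYPES.indexOf(mime) !== -1) {
    return true;
  }
  // Fall back to the extension when the MIME type is missing or generic.
  const name = file.name.toLowerCase();
  return HEIC_EXTENSIONS.some((ext) => name.slice(-ext.length) === ext);
}
```

A file that passes this check would then be handed to the HEIC-to-JPEG conversion step before OCR.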
User Experience Improvements
Loading Status Indicators: Real-time visual feedback showing library loading progress
Smart Upload Area: Disabled upload area until all libraries are loaded
No Premature Warnings: Removed confusing warning messages during library loading
Animated Status Dots: Pulsing indicators show when libraries are still loading
Helpful Loading Messages: Clear guidance on when the system is ready for use
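One way to derive the upload area's state from the library loading status is sketched below. The `LoadState` type, the `uploadAreaState` function, and the message wording are assumptions for illustration, not the application's actual implementation.

```typescript
type LoadState = "loading" | "ready" | "failed";

interface UploadAreaState {
  enabled: boolean; // whether files may be dropped or selected
  message: string;  // status text shown to the user
}

/** Enable the upload area only once every library reports "ready". */
function uploadAreaState(libs: Record<string, LoadState>): UploadAreaState {
  const names = Object.keys(libs);
  const failed = names.filter((n) => libs[n] === "failed");
  if (failed.length > 0) {
    return { enabled: false, message: `Failed to load: ${failed.join(", ")}` };
  }
  const pending = names.filter((n) => libs[n] === "loading");
  if (pending.length > 0) {
    return { enabled: false, message: `Loading ${pending.join(", ")}...` };
  }
  return { enabled: true, message: "Ready. Drop JPEG or HEIC files here." };
}
```

Keeping this a pure function of the library states makes the rule easy to test and keeps the status indicators and the disabled upload area in agreement.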
OCR Processing Improvements
Enhanced Character Recognition: Expanded character whitelist for better mathematical symbol detection
Improved Engine Configuration: Optimized Tesseract.js settings for mathematical content
Better Progress Reporting: More granular progress updates during OCR processing
Extended Library Waiting: Longer timeout periods for reliable library loading
Library Availability Checks: Robust checking for library availability before processing
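The expanded character whitelist mentioned above could be assembled as in the following sketch. The character groups and the `buildWhitelist` helper are hypothetical; the real whitelist used by the application is not shown in this document. `tessedit_char_whitelist` is a Tesseract parameter name, and how strictly the engine honours it may depend on the engine mode and version.

```typescript
// Illustrative character groups; the application's actual set may differ.
const LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
const DIGITS = "0123456789";
const PUNCTUATION = " .,;:!?'\"";
const MATH_SYMBOLS = "+-*/=<>()[]{}^_|%";

/** Concatenate character groups, dropping duplicates, keeping first-seen order. */
function buildWhitelist(...groups: string[]): string {
  let result = "";
  for (const ch of groups.join("")) {
    if (result.indexOf(ch) === -1) {
      result += ch;
    }
  }
  return result;
}

// Parameter object in the shape Tesseract expects for a character whitelist.
const ocrParameters = {
  tessedit_char_whitelist: buildWhitelist(LETTERS, DIGITS, PUNCTUATION, MATH_SYMBOLS),
};
```

With Tesseract.js, an object like `ocrParameters` would typically be passed to a worker's `setParameters` method before recognition starts.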
HTML Output Enhancements
HEIC File Indicators: Clear badges showing which files were converted from HEIC
Enhanced Statistics: Detailed processing summary including conversion counts
Better Visual Design: Improved styling with file type indicators and processing notes
Conversion Tracking: Clear indication of which files underwent HEIC conversion
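The HEIC badges and processing summary described above might be produced as in the sketch below. The `ProcessedFile` shape, the function names, and the `badge badge-heic` class names are assumptions made for illustration.

```typescript
interface ProcessedFile {
  name: string;
  convertedFromHeic: boolean; // true if the file went through HEIC conversion
  ok: boolean;                // true if OCR finished without error
}

/** Escape text for safe insertion into HTML element content or attributes. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

/** Count totals for the processing summary. */
function summarize(files: readonly ProcessedFile[]) {
  return {
    total: files.length,
    succeeded: files.filter((f) => f.ok).length,
    failed: files.filter((f) => !f.ok).length,
    heicConverted: files.filter((f) => f.convertedFromHeic).length,
  };
}

/** Render a per-file heading, with a badge for files converted from HEIC. */
function renderFileHeading(file: ProcessedFile): string {
  const badge = file.convertedFromHeic
    ? ' <span class="badge badge-heic">Converted from HEIC</span>'
    : "";
  return `<h2>${escapeHtml(file.name)}${badge}</h2>`;
}
```

Escaping the file name matters here because it comes from the user's upload and is written directly into the generated HTML.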
Project Structure
├── backend/
│   └── index.ts      # Enhanced API server with robust HEIC support and improved error handling
├── frontend/
│   ├── index.html    # Updated HTML template with improved library loading
│   ├── index.tsx     # Enhanced React application with better HEIC processing
│   └── style.css     # Custom styles with math formula styling
└── README.md
Advanced Features
Enhanced HEIC Support
Automatic Format Detection: Robust detection of HEIC files using both MIME types and file extensions
Improved Library Loading: Better handling of the heic2any library, with retry mechanisms