Game-Changing Translator User Manual

Copyright © 2025 Tomasz Kamiński
Last Updated: 15 September 2025

Table of Contents

  1. Introduction
  2. Getting Started
  3. Main Interface
  4. Setting Up Translation Areas
  5. Settings Configuration
  6. AI-Powered Translation and OCR
  7. API Usage Monitoring
  8. Other Translation Methods
  9. Keyboard Shortcuts
  10. Troubleshooting
  11. Tips and Best Practices

Introduction

Game-Changing Translator is a desktop application that automatically captures text from any area of your screen, performs optical character recognition (OCR), and translates the text in real time. Its floating overlay windows let you position the translation anywhere on your screen, which makes it well suited to translating games, videos, PDFs, or any application with text that you can't easily copy and paste.

Getting Started

Requirements

Before using Game-Changing Translator, ensure you have:

First Run

  1. Launch Game-Changing Translator by running the main.py script or the executable if you're using a compiled version.
  2. On first startup, the application loads with default settings and both source and target areas are hidden.
  3. Before starting translation, you need to:
  4. When you click Start, the translation window will automatically be shown.
  5. When you click Stop, the translation window will automatically be hidden.
  6. You can manually toggle the visibility of the translation window using Alt+2 at any time.

Main Interface

The main interface is organised into five tabs:

Home Tab

  1. Select Source Area (OCR) – Define the area where text will be captured.
  2. Select Target Area (Translation) – Define where the translated text will appear.
  3. Start/Stop – Toggle the translation process on/off.
  4. Hide/Show Source Window – Toggle visibility of the source capture area.
  5. Hide/Show Target Window – Toggle visibility of the translation target area.
  6. Clear Translation Cache – Clear translations stored in memory to force retranslation.
  7. Clear Debug Log – Clear the application log.
  8. Enable/Disable Debug Log – Toggle debug logging; disabling it improves performance.
  9. Keyboard Shortcuts – List of available hotkeys.
  10. Status – Current application status.

Settings Tab

Here you can configure:

  1. Translation Model – Select between different translation providers (Gemini, OpenAI, MarianMT, DeepL, etc.).
  2. OCR Model – Choose between Tesseract (offline), Gemini API (online), or OpenAI API (online).
  3. Source Language – The language to detect with OCR and translate from (DeepL and Google Translate only).
  4. Target Language – The language to translate into (DeepL and Google Translate only).
  5. API Key – For DeepL, Google Translate, Gemini, or OpenAI.
  6. Quality – Choose between Classic (faster) or Next-gen (potentially better quality) models (DeepL only).
  7. MarianMT Options – For offline neural translation (MarianMT-specific).
  8. Gemini/OpenAI Options – For cost-effective AI translation with context awareness.
  9. Tesseract Path – Path to the Tesseract OCR executable (Tesseract only).
  10. Scan Interval (ms) – How frequently to capture the screen.
  11. Clear Translation Timeout (s) – Time before clearing translations when source text disappears.
  12. Text Stability Threshold – How many consistent readings needed before translation (Tesseract only).
  13. OCR Confidence Threshold – Minimum confidence for OCR text detection (Tesseract only).
  14. Image Preprocessing Mode – How to process images for OCR (Tesseract only).
  15. OCR Debugging – Option to show debug images and text in the Debugging tab (Tesseract only).
  16. Preview button – Opens OCR Preview window (Tesseract only).
  17. Remove Trailing Garbage – Option to remove text after the last punctuation mark (Tesseract only).
  18. Appearance Options – Colours and font sizes for overlays.
  19. File Caching Options – Settings to enable/disable caching for API services.
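The Text Stability Threshold setting above can be pictured as a small filter that holds back OCR text until it has been read identically several times in a row. The following is a minimal sketch of that idea only; the class name `StabilityFilter` and the exact counting rule are assumptions for illustration and are not taken from the application's source.

```python
class StabilityFilter:
    """Release OCR text only after N identical consecutive readings (hypothetical helper)."""

    def __init__(self, threshold: int) -> None:
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self._last = None   # most recent OCR reading
        self._count = 0     # how many times in a row it has been seen

    def feed(self, text: str) -> bool:
        """Return True exactly once, when `text` has just become stable."""
        if text == self._last:
            self._count += 1
        else:
            self._last = text
            self._count = 1
        return self._count == self.threshold


# A flickering first reading ("Helo") never reaches the threshold of 2.
stability = StabilityFilter(threshold=2)
for reading in ["Helo", "Hello", "Hello", "Hello"]:
    if stability.feed(reading):
        print("translate:", reading)
```

A higher threshold suppresses more one-off misreadings but delays the translation by additional scan intervals.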

OCR Preview (Tesseract only)

Clicking the Preview button on the Settings tab opens a separate OCR Preview window when using Tesseract OCR. This window displays:

  1. Processed Image (1:1 scale) – The preprocessed image being used for OCR recognition.
  2. Recognized Text – The text currently being recognised by the OCR engine.

This preview window is particularly useful for fine-tuning OCR settings and understanding why certain text might not be recognised properly. It can be moved and resized independently of the main application window.

API Usage Tab

This tab provides comprehensive monitoring and analysis of your API usage for services like Gemini and OpenAI, including:

  1. Gemini/OpenAI OCR Statistics – Cost tracking and performance metrics for OCR operations.
  2. Gemini/OpenAI Translation Statistics – Word counts, costs, and efficiency metrics for translations.
  3. Combined API Statistics – Overall cost analysis and projections for each provider.
  4. DeepL Usage Tracker – Monitor free monthly limits for DeepL API.
  5. Export and Management Tools – Export statistics to CSV/text and copy to clipboard.

For detailed information about all available statistics and cost tracking features, see the API Usage Monitoring section.

Debugging Tab

This tab shows:

  1. Original Image – The raw captured image.
  2. Processed Image – The image after preprocessing for OCR.
  3. OCR Results – Text detected by OCR.
  4. Application Log – Running log of application events.
  5. Save OCR Images and Refresh Log – Buttons to save debug images and refresh the log.

About Tab

This tab provides basic information about the application and includes a Check for Updates button that updates the application to the latest version in one click.

Setting Up Translation Areas

Selecting the Source Area

  1. Click the Select Source Area (OCR) button.
  2. Your screen will dim, and you'll see a black cross.
  3. Click and drag to select the area containing text you want to translate.
  4. After selection, a semi-transparent overlay window appears at the selected location.
  5. This overlay is hidden by default when the application starts.
  6. This overlay can be:

Selecting the Target Area

  1. Click the Select Target Area (Translation) button.
  2. Your screen will dim, and you'll see a black cross.
  3. Click and drag to select where you want translations to appear.
  4. After selection, a semi-transparent overlay window appears at the selected location.
  5. This overlay is hidden by default when the application starts.
  6. This overlay can be:

Settings Configuration

Translation Configuration

  1. Translation Model:

    A variety of AI and traditional translation services are available:

  2. Source Language:

  3. Target Language:

OCR Configuration

  1. OCR Model:

    Choose between online AI-powered models and a traditional offline engine:

  2. Tesseract Path (Tesseract only):

  3. Image Preprocessing Mode (Tesseract only):

  4. Adaptive Mode (Tesseract only):

    When you select Adaptive preprocessing mode, the system unlocks sophisticated adaptive thresholding capabilities that excel in challenging visual environments. This mode is particularly valuable when dealing with difficult conditions such as small subtitle text overlaid on dynamic, flickering backgrounds with constantly changing colours and lighting.

    Unlike the three standard preprocessing modes, Adaptive mode provides two adjustable parameters that allow you to fine-tune the OCR recognition process:

    This mode proves invaluable when standard preprocessing fails to produce reliable results. By experimenting with these two parameters, you can often achieve superior OCR recognition compared to the ready-to-use modes, particularly in scenarios where backgrounds contain moving elements, varying illumination, or complex visual patterns that would otherwise interfere with text detection.

    For optimal results, start with moderate values (Block Size: 11, C Value: 2) and adjust based on your specific content. Increase Block Size for larger text or gradual lighting changes, and adjust C Value to balance between capturing all text and reducing false positives.
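To make the roles of Block Size and C Value concrete, here is a minimal pure-Python sketch of mean-based adaptive thresholding. It is an illustration of the general technique, not the application's code: the manual does not say which library or variant the application uses, and the function name `adaptive_threshold` and the border handling are assumptions.

```python
def adaptive_threshold(gray, block_size=11, c=2):
    """Binarise a greyscale image given as a list of rows of 0-255 integers.

    A pixel becomes white (255) when it is brighter than the mean of its
    block_size x block_size neighbourhood minus c, otherwise black (0).
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    height, width = len(gray), len(gray[0])
    half = block_size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Assumption: the window is simply clipped at the image border.
            ys = range(max(0, y - half), min(height, y + half + 1))
            xs = range(max(0, x - half), min(width, x + half + 1))
            mean = sum(gray[j][i] for j in ys for i in xs) / (len(ys) * len(xs))
            row.append(255 if gray[y][x] > mean - c else 0)
        result.append(row)
    return result


# A 5x5 grey patch with one darker "text" pixel in the middle.
patch = [[100] * 5 for _ in range(5)]
patch[2][2] = 50
binary = adaptive_threshold(patch, block_size=3, c=2)
```

Because each pixel is compared with its own neighbourhood rather than one global cut-off, a larger Block Size averages over a wider area (useful for larger text or gradual lighting changes), while a larger C Value lowers the local cut-off so that fewer pixels are classified as dark.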

  5. OCR Confidence Threshold (Tesseract only):

  6. Text Stability Threshold (Tesseract only):

  7. OCR Debugging (Tesseract only):

  8. Remove Trailing Garbage (Tesseract only):

Performance Settings

  1. Scan Interval (ms):

  2. Clear Translation Timeout (s):

  3. Clear Translation Cache:

  4. File Caching Options (API Translation Services Only):

  5. Debug Logging:

Appearance Settings

  1. Source Area Colour – Background colour of the source capture overlay (customisable).
  2. Target Area Colour – Background colour of the translation overlay (customisable).
  3. Target Text Colour – Colour of the translated text (customisable).
  4. Target Window Font Size – Size of the translated text.

AI-Powered Translation and OCR

Game-Changing Translator integrates state-of-the-art Large Language Models (LLMs) from Google (Gemini) and OpenAI (GPT) to provide advanced capabilities for both OCR and translation. This gives you the choice between the highly recommended, cost-effective Gemini models and the flexible, powerful alternatives from OpenAI.

Gemini OCR - Premium Text Recognition

Gemini OCR represents a revolutionary advancement in text recognition technology, providing superior accuracy for challenging subtitle scenarios where traditional OCR engines like Tesseract struggle. This premium feature leverages Google's advanced Gemini models to deliver exceptional OCR results with flexible model selection for optimal performance and cost efficiency.

Intelligent Model Selection

The application features flexible model selection for both OCR and translation operations, allowing you to optimise performance based on your specific use case:

Recommended Model Selection:

Advanced Configuration: Model availability and costs can be customised by editing the gemini_models.csv file in the resources directory. This allows you to add new models, update pricing, or modify which models are available for OCR versus translation operations as Google releases new Gemini models.

Challenging Subtitle Scenarios

Gemini OCR excels in scenarios where subtitles are difficult to recognise due to:

OCR Comparison Examples

Challenging subtitle example 1

Tesseract OCR Result: ~ Trust me, OD tite WE loca mS
Gemini OCR Result: Trust me, Oakmonters know a newcomer when they see one. We locals can tell.

Challenging subtitle example 2

Tesseract OCR Result: ' Paulie: Driv: show, Tom. Next stop's Bi the motel. 7 jj ie
Gemini OCR Result: Paulie: Drive before the cops show, Tom. Next stop's Bill at the motel.

Superior Premium Feature with Multiple Models

Gemini OCR is a premium feature that significantly outperforms traditional OCR methods through intelligent model selection. The application provides access to multiple Gemini models, each optimised for different scenarios:

Gemini 2.0 Models - Superior OCR Accuracy and Translation Quality:

Gemini 2.5 Models - Speed Optimised:

Performance and Cost:

Outstanding Cost-to-Quality Ratio

The available Gemini models deliver exceptionally fast and accurate OCR results that significantly surpass Tesseract or Paddle OCR. With intelligent model selection, you can optimise the cost-to-quality ratio for your specific use case whilst maintaining superior performance compared to both free and paid OCR solutions.

Cost Comparison (using Gemini 2.5 Flash-Lite pricing):

Best Practices with Gemini OCR

Gemini API - Cost-Effective and Context-Aware Translation

Google's advanced Gemini models represent a breakthrough in AI translation technology, offering premium-quality translations with unprecedented cost-effectiveness. These advanced models combine intelligent context awareness with remarkable affordability, making it possible to translate massive gaming projects for a fraction of traditional costs.

Superior Translation Quality

Context Window Technology

Unlike traditional translation services that process each subtitle in isolation, Gemini API features a configurable sliding context window that maintains awareness of previous translations. This revolutionary approach ensures narrative coherence, improves grammar flow, and delivers translations that understand the broader context of conversations and storylines.

The context window can be configured to include 0-5 previous subtitles, allowing the AI to:

Example: Context-Aware Translation

This example demonstrates how context awareness helps maintain proper grammar when translating Czech to Polish:

Czech Original | DeepL (No Context) | Gemini (With Context) | English Translation
A vodkaď se podle tebe teda známe? | A skąd się znamy, według ciebie? | A skąd niby się znamy? | And how do we supposedly know each other?
Viděli jsme se přece u toho rybníka! | Widzieliśmy się przecież nad stawem! | Widzieliśmy się przecież nad tamtym stawem! | We saw each other at that pond!
Jakýho rybníka? Já u žádnýho rybníka nebyla! | Jakiego stawu? Nie byłam przy żadnym stawie! | Nad jakim stawem? Ja nad żadnym stawem nie byłam! | What pond? I wasn't at any pond!
Ale jo, byla! | Ale tak, była! | Ale tak, byłaś! | But yes, you were!

Key Improvements with Context:

These examples clearly demonstrate how Gemini's context window helps maintain grammatical consistency and dialogue flow that would be impossible with sentence-by-sentence translation.
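A sliding context window of this kind can be sketched in a few lines. The message layout below is modelled on the API call log reproduced later in this manual (an instruction line, the source-language lines, then the target-language lines with the last one left empty for the model to complete). The class name `ContextWindow`, the ordinal wording for window sizes other than two, and the method names are assumptions for illustration only.

```python
from collections import deque

ORDINALS = ["first", "second", "third", "fourth", "fifth", "sixth"]


class ContextWindow:
    """Keep the last `size` (subtitle, translation) pairs and build a prompt (hypothetical helper)."""

    def __init__(self, size: int, source_lang: str, target_lang: str) -> None:
        if not 0 <= size <= 5:
            raise ValueError("context window size must be between 0 and 5")
        self.source_lang = source_lang
        self.target_lang = target_lang
        # A bounded deque drops the oldest pair automatically.
        self.history = deque(maxlen=size)

    def build_prompt(self, subtitle: str) -> str:
        ordinal = ORDINALS[len(self.history)]
        src, tgt = self.source_lang.upper(), self.target_lang.upper()
        lines = [
            f"<Translate idiomatically the {ordinal} subtitle from "
            f"{self.source_lang} to {self.target_lang}. Return translation only.>"
        ]
        lines += [f"{src}: {original}" for original, _ in self.history]
        lines.append(f"{src}: {subtitle}")
        lines += [f"{tgt}: {translated}" for _, translated in self.history]
        lines.append(f"{tgt}:")  # left open for the model to fill in
        return "\n".join(lines)

    def remember(self, subtitle: str, translation: str) -> None:
        self.history.append((subtitle, translation))


window = ContextWindow(2, "French", "English")
window.remember("Me voici. Que voulez-vous ?", "Here I am. What do you want?")
print(window.build_prompt("Vous allez me dire pourquoi."))
```

With a window size of 0 the history stays empty, so each subtitle is sent on its own, which corresponds to translating in isolation.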

OCR Error Intelligence

One of Gemini's most impressive capabilities is its ability to interpret and correct OCR imperfections automatically. When text recognition produces garbled or incomplete results, Gemini's advanced language understanding can often deduce the intended meaning and provide clean, accurate translations without replicating OCR errors in the output.

Flexible Model Configuration

The application supports multiple Gemini models for both OCR and translation operations. You can select different models based on your specific needs: Gemini 2.0 models offer superior OCR accuracy for longer subtitles, whilst Gemini 2.5 models provide speed-optimised performance for rapidly changing content. Model selection and pricing can be customised by editing the gemini_models.csv file in the resources directory.

Example: OCR Error Correction

Here's a real-world example showing how Gemini handles OCR errors compared to DeepL when translating French to English:

OCR Input | DeepL Output | Gemini Output | Analysis
"Vraiment ?" | "Really?" | "Really?" | Clean OCR, both work well
"| Vraiment ?" | "| Really?" | "Really?" | Gemini removes OCR artifact "|", DeepL replicates it

Exceptional Cost-Effectiveness

Real-World Cost Analysis

Gemini API offers extraordinary value for large translation projects. Even massive games like The Witcher 3, with hundreds of hours of dialogue and subtitles, can be translated for under $5 total cost. This remains true even when accounting for:

Cost Estimate: The Witcher 3 Translation

Here is a detailed cost analysis for translating The Witcher 3 subtitles using DeepL and Gemini 2.5 Flash-Lite:

Assumptions:

Cost Breakdown:

DeepL:

Gemini 2.5 Flash-Lite:

Service | Estimated Cost (EUR) | Estimated Cost (USD)
DeepL | €135.00 | $145.80
Gemini 2.5 Flash-Lite | – | $2.16

Note: These are rough estimates. Actual costs depend on language pair, OCR accuracy, context settings, and cache effectiveness.

Disclaimer: Cost tracking is provided for reference purposes only. This is free software with no guarantees regarding cost accuracy. Users are responsible for monitoring their own API usage and costs through their provider's billing dashboard.

Built-in Cost Tracking

Game-Changing Translator includes comprehensive cost monitoring specifically designed for Gemini API usage:

Detailed API Call Example

Here's a real example of how the API call logging works, showing the complete translation process:

=== GEMINI API CALL LOG ===
Timestamp: 2025-07-06 17:19:03
Language Pair: fr -> en
Original Text: Vous avez manipulé des civilisations entières, provoqué des décennies de guerre, détruit Ziost... et pris la fuite.
Vous allez me dire pourquoi.

CALL DETAILS:
- Message Length: 695 characters
- Word Count: 119 words
- Line Count: 9 lines

COMPLETE MESSAGE CONTENT SENT TO GEMINI:
---BEGIN MESSAGE---
<Translate idiomatically the third subtitle from French to English. Return translation only.>
FRENCH: C'était mon objectif. Le reste... n'était qu'un moyen de parvenir à mes fins.
FRENCH: Vous dites que vous avez fait tout ce chemin pour me trouver. Me voici. Que voulez-vous ?
FRENCH: Vous avez manipulé des civilisations entières, provoqué des décennies de guerre, détruit Ziost... et pris la fuite.
Vous allez me dire pourquoi.
ENGLISH: That was my goal. The rest... was merely a means to an end.
ENGLISH: You say you came all this way to find me. Here I am. What do you want?
ENGLISH:
---END MESSAGE---

RESPONSE RECEIVED:
Timestamp: 2025-07-06 17:19:03
Call Duration: 0.385 seconds
---BEGIN RESPONSE---
You manipulated entire civilizations, caused decades of war, destroyed Ziost... and fled. You're going to tell me why.
---END RESPONSE---

TOKEN & COST ANALYSIS (CURRENT CALL):
- Translated Words: 22
- Exact Input Tokens: 173
- Exact Output Tokens: 26
- Input Cost: $0.00001730
- Output Cost: $0.00001040
- Total Cost for this Call: $0.00002770

CUMULATIVE TOTALS (INCLUDING THIS CALL, FROM LOG START):
- Total Translated Words (so far): 18460
- Total Input Tokens (so far): 213723
- Total Output Tokens (so far): 30987
- Total Input Cost (so far): $0.02137230
- Total Output Cost (so far): $0.01239480
- Cumulative Log Cost: $0.03376710
========================================

This detailed logging is saved in the Gemini_API_call_logs.txt file. In the Settings tab, you'll find Total Words and Total Cost fields that display cumulative figures based solely on this log file. If the file is cleared or deleted, these totals will reset accordingly.
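The cost figures in the log follow directly from the token counts. As a worked check, the sketch below recomputes them in Python. The per-million-token rates are not stated in the manual; they are inferred here from the log itself ($0.00001730 for 173 input tokens and $0.00001040 for 26 output tokens) and should be treated as an assumption that will change whenever the provider's pricing does. The function name `call_cost` is likewise hypothetical.

```python
# Rates inferred from the log above (assumption, not an official price list).
INPUT_USD_PER_MILLION = 0.10   # $0.00001730 / 173 input tokens
OUTPUT_USD_PER_MILLION = 0.40  # $0.00001040 / 26 output tokens


def call_cost(input_tokens: int, output_tokens: int):
    """Return (input cost, output cost, total cost) in USD for one API call."""
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost, output_cost, input_cost + output_cost


# Current call from the log: 173 input tokens, 26 output tokens.
for label, value in zip(("Input", "Output", "Total"), call_cost(173, 26)):
    print(f"{label} cost: ${value:.8f}")

# Cumulative totals from the log: 213723 input tokens, 30987 output tokens.
print(f"Cumulative: ${call_cost(213723, 30987)[2]:.8f}")
```

At these rates the same function reproduces both the per-call total and the cumulative log cost shown above.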

Configuration and Setup

  1. API Key Setup – Requires a Google AI Studio or Google Cloud account with Gemini API access. Go to Google AI Studio and click the "Get API key" button to set up an API key for Gemini models.
  2. Model Selection – Uses Gemini 2.5 Flash-Lite for optimal cost-quality balance.
  3. Context Window – Choose between:
  4. Enable API Log – Optional detailed logging for cost analysis and debugging (API calls are saved in Gemini_API_call_logs.txt).
  5. Enable Gemini file cache – Enable to reduce API calls for repeated content (translations are saved in gemini_cache.txt).
  6. Temperature Setting – This setting can only be changed manually in the ocr_translator_config.ini file. It is set at 0.0 by default (gemini_model_temp = 0.0), which is the recommended setting for consistent, deterministic translations.
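If you want to check the temperature value programmatically rather than opening the file by hand, Python's standard `configparser` module can read it. This is a small sketch under stated assumptions: the manual names the key (`gemini_model_temp`) but not the section it lives in, so the helper below searches every section, and the `[Settings]` header in the sample text is purely hypothetical.

```python
import configparser


def read_gemini_temperature(ini_text: str, default: float = 0.0) -> float:
    """Return gemini_model_temp from INI text, or `default` if the key is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # The section name is not documented, so look in all of them.
    for section in parser.sections():
        if parser.has_option(section, "gemini_model_temp"):
            return parser.getfloat(section, "gemini_model_temp")
    return default


# Hypothetical file content; only the key name comes from the manual.
sample = """
[Settings]
gemini_model_temp = 0.0
"""
print(read_gemini_temperature(sample))
```

To inspect the real file, read its contents (for example with `pathlib.Path("ocr_translator_config.ini").read_text()`) and pass the text to the function.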

Performance Optimisation

Intelligent Caching System

Gemini API benefits from the same file caching system as DeepL and Google Translate. When caching is enabled, identical text segments are stored locally and retrieved without additional API calls. However, cache effectiveness depends on OCR consistency: even small recognition variations will trigger new API requests.

Cost Optimisation Strategies:

Gaming and Large Project Applications

Ideal for Gaming Translation

Gemini API excels in gaming scenarios where context and narrative flow are crucial:

OpenAI API - A Flexible Alternative for Translation and OCR

💡
More Choice, More Power: The addition of OpenAI support provides users with greater flexibility. While Gemini models are recommended for their exceptional balance of cost and performance, OpenAI offers a powerful alternative, particularly for users who may already be invested in the OpenAI ecosystem or have specific use cases that benefit from its models.

The integration of OpenAI's models gives you another tier of advanced AI for your translation and OCR needs. This makes the application more versatile, allowing you to choose the provider that best fits your requirements.

Supported OpenAI Models & Recommendations

Three main OpenAI models are supported, each with distinct strengths:

Model Name | Best For | Translation Quality | OCR Quality | Cost
GPT 4.1 Mini | Translation | Excellent (handles context well) | N/A (too expensive) | High
GPT 4.1 Nano | OCR | Average (can struggle with context) | Very Good | High
GPT 5 Nano | Balanced Use | Average (can struggle with context) | Good | Low

For the best possible results with OpenAI, the recommended approach is to use a combination of models: use GPT 4.1 Nano for OCR and GPT 4.1 Mini for Translation. This leverages the strengths of each model, providing high-quality text recognition and superior, context-aware translation.

Performance and Cost Comparison: OpenAI vs. Gemini

While OpenAI models are highly capable, it's important to note that in testing, Gemini models—specifically Gemini 2.0 Flash—demonstrated better performance at a lower cost for both translation and OCR tasks.

The total cost for the Gemini setup was less than half that of the OpenAI combination. The higher cost for OpenAI comes from two areas. Firstly, the OCR model (GPT 4.1 Nano) uses more than twice the number of tokens to process the same images as Gemini 2.0 Flash. Secondly, the translation cost is also significantly higher because the only OpenAI model that offers excellent, context-aware translation quality, GPT 4.1 Mini, has a price per token that is four times that of Gemini 2.0 Flash.

As for the GPT 5 Nano model, its reasoning capabilities are excellent. However, with the effort parameter set to minimal, the quality of its translation and OCR is merely average, and raising the effort by even one step, to low, pushes token usage, cost, and latency too high for real-time translation. All of this makes Gemini the more cost-effective choice for sustained use.

Here are the results from a test translating the same game content with a 500 ms scan interval:

Metric | Gemini 2.0 Flash (OCR & Translation) | OpenAI Combo (GPT 4.1 Nano OCR + GPT 4.1 Mini Translation)
OCR Cost per Hour | ~$0.29 / hr | ~$0.60 / hr
Translation Cost per Hour | ~$0.02 / hr | ~$0.07 / hr
Combined Cost per Hour | ~$0.31 / hr | ~$0.68 / hr

Overall, OpenAI models should be considered a strong fallback or alternative if for any reason Gemini models cannot be used. All OpenAI API usage can be tracked in the API Usage tab.

API Usage Monitoring

The API Usage tab provides comprehensive monitoring and cost analysis for both Gemini and OpenAI API usage, helping you track expenses and optimise your API consumption for both OCR and translation services.

This tab displays detailed statistics across several categories for each provider:

📊 OCR Statistics (Gemini & OpenAI)

🔄 Translation Statistics (Gemini & OpenAI)

💰 Combined API Statistics

📈 DeepL Usage Tracker

Statistics Management

The tab includes several management options:

Important Note: Statistics are based on files like GEMINI_API_OCR_short_log.txt and OpenAI_API_TRA_short_log.txt. Data will reset if these files are deleted.

Important: Cost tracking is provided for reference purposes only. You remain responsible for monitoring your own API usage and costs through your provider's official billing dashboard.

Other Translation Methods

MarianMT (offline and free)

  1. No API key required – completely free to use.
  2. Works entirely offline once models are downloaded.
  3. Models are downloaded automatically when first used (~500MB per language pair).
  4. Configure by:

MarianMT models are open-source neural machine translation systems that offer quite good translation quality. Whilst not quite reaching the standards of premium services like DeepL, they provide remarkably solid translations without any cost or internet requirement after the initial model download.

These models were originally designed for translating short, single sentences and would typically truncate longer passages. Game-Changing Translator works around this limitation: it automatically splits longer texts into individual sentences, translates them together as a batch in a single model inference call, and then stitches the results back together, so you receive complete translations regardless of text length.

This approach offers several practical advantages:

The Translation Beam Size (MarianMT) setting allows you to balance between speed and quality. Higher values (8–12) produce more refined translations but require more processing time, whilst lower values (1–4) prioritise speed over perfect phrasing.
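The split, batch and stitch workaround described above can be sketched as follows. This is an outline of the idea rather than the application's implementation: the sentence-splitting rule, the function names, and the `translate_batch` callable (which stands in for a single MarianMT inference call and would be where a beam size is passed) are all assumptions.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Split after '.', '!' or '?' followed by whitespace, keeping the punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_long_text(text: str, translate_batch: Callable[[List[str]], List[str]]) -> str:
    """Translate text of any length with one batched call, then rejoin the sentences."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    translated = translate_batch(sentences)  # one call for the whole batch
    return " ".join(translated)


# Stand-in for the model: upper-cases each sentence so the flow is visible.
def fake_batch(sentences: List[str]) -> List[str]:
    return [sentence.upper() for sentence in sentences]


print(translate_long_text("Drive before the cops show, Tom. Next stop's Bill at the motel.", fake_batch))
```

Sending all sentences in one batch keeps each input short enough to avoid truncation while paying the model's start-up overhead only once per subtitle.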

⚠️
NOTE: The English-to-Polish model takes a bit longer to install when first selected, as it needs to be downloaded and converted from a different source than the other MarianMT models.

DeepL API

  1. Requires a DeepL account and API key.
  2. Offers premium-quality translations but supports fewer languages.
  3. Regarded by many as the industry leader in translation quality.
  4. The DeepL API Free plan allows for the translation of 500,000 characters per month free of charge (as of May 2025).
  5. DeepL usage is tracked in the API Usage tab for comprehensive monitoring alongside other API services.
  6. Configure by:

Quality Options

DeepL offers two quality modes to suit different needs. The Classic model provides fast, high-quality translations that work with all supported language pairs. The Next-gen model uses DeepL's latest translation technology, which can deliver even better results for certain types of content, though it processes slightly slower and may not support all language pairs.

If you select Next-gen and your chosen language pair isn't supported, the application will automatically fall back to Classic mode so that your translation continues working. Both options deliver the top-quality results that DeepL is renowned for.

DeepL File Caching System

Game-Changing Translator implements a caching system for DeepL translations. Once a text segment has been translated, it's stored in the application's local cache (deepl_cache.txt). When the same text appears again, the application retrieves the translation from the cache instead of sending another API request.

It's important to understand that this caching mechanism relies entirely on OCR quality. For a cache match to occur, the OCR'd text must be identical – down to the last character – with what is stored in the cache. Even a single character difference will result in a new API call and translation. This means the actual efficiency of the cache depends heavily on consistent OCR results.
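The exact-match behaviour described above is easy to see in a small sketch. The example keeps the cache in memory only and keys it on the language pair plus the literal OCR text; the on-disk format of deepl_cache.txt is not documented here, so the class name `TranslationCache`, the key layout, and the `call_api` callable are assumptions for illustration.

```python
from typing import Callable, Dict, Tuple


class TranslationCache:
    """Exact-match translation cache (hypothetical, in-memory sketch)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str, str], str] = {}
        self.api_calls = 0

    def translate(self, text: str, source: str, target: str,
                  call_api: Callable[[str], str]) -> str:
        key = (source, target, text)  # no normalisation: every character counts
        if key not in self._entries:
            self.api_calls += 1
            self._entries[key] = call_api(text)
        return self._entries[key]


cache = TranslationCache()
fake_api = lambda text: f"<translation of {text!r}>"  # stand-in for a real API request

cache.translate("Vraiment ?", "fr", "en", fake_api)    # miss: calls the API
cache.translate("Vraiment ?", "fr", "en", fake_api)    # hit: served from the cache
cache.translate("| Vraiment ?", "fr", "en", fake_api)  # miss: one stray OCR character
print(cache.api_calls)
```

The third lookup differs from the first only by a stray OCR character, yet it counts as a new entry, which is why consistent OCR output matters more to the cache than how often a line of dialogue repeats.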

The cache can be helpful for gamers in specific scenarios. For instance, if you're playing a game where static menu options or repeated dialogue appear in exactly the same font, size, and screen position, the OCR is more likely to produce identical results each time. However, if text appears with different backgrounds, lighting, or slight position shifts, OCR variations will likely trigger new translations.

The cache persists between application sessions, but its practical benefit should be viewed as a helpful bonus rather than a major API-saving feature. The more consistent and clear the text presentation, the more likely you are to benefit from the caching system.

Google Translate API

  1. Requires a Google Cloud account and API key.
  2. Supports the widest range of languages.
  3. Good for general purpose translation with broad language coverage.
  4. Configure by:

Google Translate uses the same file caching system as DeepL. Please refer to the DeepL API section above for a detailed explanation of how the caching mechanism works, its dependencies on OCR quality, and its practical benefits and limitations. All the same considerations and notes apply to Google Translate's caching functionality.

Keyboard Shortcuts

These keyboard shortcuts are available:

Shortcut | Function
~ (tilde) | Start/Stop Translation
Alt+1 | Toggle Source Window Visibility
Alt+2 | Toggle Translation Window Visibility
Alt+S | Save Settings
Alt+C | Clear Translation Cache
Alt+L | Clear Debug Log

Note: When the application is stopped (translation inactive), the translation window will be hidden automatically. When the application is started, the translation window will appear automatically. You can manually override this behaviour using the Alt+2 shortcut at any time.

Troubleshooting

If you encounter issues:

  1. Check the Debugging tab for error messages and the application log.
  2. Enable OCR Debugging in Settings to see what's being captured and recognised in the Debugging tab. Also view the OCR Preview window (accessible in the Settings tab via the Preview button).
  3. Adjust settings as needed:
  4. Consult the Troubleshooting Guide for common issues and solutions.

Tips and Best Practices

OCR Accuracy

For best OCR results:

  1. Capture clean, high-contrast text.
  2. Select appropriate source language.
  3. Adjust preprocessing mode to match text appearance – try Adaptive mode for difficult backgrounds.
  4. Resize the capture area to frame text closely but completely.
  5. Use a larger source area for more context if OCR is struggling.
  6. Enable Remove Trailing Garbage to clean up recognition artefacts.
  7. Adjust the confidence threshold to balance between capturing all text (lower values) and reducing errors (higher values).
  8. For small subtitles on changing backgrounds, experiment with Adaptive mode's Block Size and C Value parameters.
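As an illustration of what Remove Trailing Garbage does, the sketch below cuts everything after the last sentence-ending punctuation mark. Which characters the application treats as punctuation is not specified in this manual, so the set used here (full stop, exclamation mark, question mark and ellipsis) and the function name are assumptions.

```python
import re


def remove_trailing_garbage(text: str) -> str:
    """Drop whatever follows the last '.', '!', '?' or '…' in the text."""
    matches = list(re.finditer(r"[.!?…]", text))
    if not matches:
        return text  # no punctuation found: leave the text untouched
    return text[: matches[-1].end()]


# Trailing OCR noise like that in the Tesseract example earlier in this manual.
print(remove_trailing_garbage("Next stop's Bill at the motel. 7 jj ie"))
```

Note that this kind of filter also removes genuine text when a subtitle does not end with punctuation after its last sentence, so it is best suited to content with consistently punctuated lines.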

Performance Optimisation

  1. Use a smaller source capture area.
  2. Click Disable Debug Log in the Home tab.
  3. Increase Scan Interval in the Settings tab to reduce CPU usage.
  4. Disable OCR Debugging in the Settings tab.
  5. Set the Image Preprocessing Mode to None in the Settings tab.
  6. For MarianMT:
  7. For API models:

Practical Applications

  1. Games:

    🎮
    NOTE: Game-Changing Translator may not work correctly with some games in fullscreen mode.
    We recommend using borderless windowed mode, which is supported by most modern games.
  2. Videos: