📖 User Manual · v4.1.1

Game-Changing
Translator

Next-generation AI translation for game and movie subtitles.
Powered by state-of-the-art OCR and superior context-aware engines, providing high-quality translation in over 100 languages.

Copyright © 2025–2026 Tomasz Kamiński  ·  Released 3 June 2026  ·  Built with ❤️ for gamers and language learners

Everything you need – nothing you don't

A focused, professional tool that does one thing exceptionally well: translate anything on your screen in real time using state-of-the-art AI.

🕹️

Simple & Custom Modes Free New v4

✨ The defining feature of version 4 ✨
Simple mode is designed for immediate operation – just pick your languages, position your overlays, and start translating.
Custom mode provides granular control over every aspect of the tool, allowing power users to fine-tune quality, performance, and cost.

🤖

AI-Powered OCR (Gemini & Gemma) Free

Industry-first AI text recognition. Use Gemini for ultimate precision or Gemma 4 (DeepInfra) for a high-performance budget option – delivering near-Gemini quality at 4x lower cost (~$0.16/hour for OCR).

🌍

AI Translation (Gemini & Gemma) Free

Multiple models for context-aware translation in 100+ languages. Choose Gemini 3 Flash for best results, or Gemma 4 (DeepInfra) for a fast, cost-effective alternative.

🎯

DeepL Translation PRO

High-quality, context-aware translation for over 100 languages. Delivers elite precision for Japanese, Chinese, and European scripts, as well as minority languages such as Welsh, Icelandic, Maori, and Burmese. Context subtitles are provided free and don't count towards your quota.

🧠

Sliding Context Window Free

Send up to 5 previous subtitles with every request. Maintains character names, grammatical flow, and narrative coherence across dialogue.

📊

Cost Monitoring Free

Real-time token-level analytics: cost per call, per minute, per hour, and cumulative cost for Google, DeepInfra and DeepL APIs.

Two-Tier Caching Free

In-memory LRU cache + optional file cache. Repeated phrases cost zero API credits – they are retrieved instantly from memory or, if file caching is enabled, from disk.

🔍

Find Subtitles PRO New v4

Automatically scans the full screen for a set period to detect where subtitles appear, then locks the capture area.

↔️

Scan Wider PRO New v4

Dynamically expands the OCR capture area to prevent edge-of-frame word truncation and AI hallucinations from tight crops.

🪟

Target on Source PRO New v4

Automatically overlays the translation directly onto the original subtitle area for seamless, immersive reading. Even with this option disabled, a PRO user can drag the target window manually over the subtitles.

📝

Translation Prompt Free

Inject a custom instruction into every translation request. Define the tone, style, or game-specific context to ensure consistent character names and immersive dialogue.

✍️

OCR Prompt PRO New v4

Inject a custom instruction into every AI OCR call – ignore HUD elements, strip speaker names, focus on dialogue only.

🖥️

New Redesigned GUI New v4

Fully redesigned interface with Simple and Custom modes. Clean, responsive, and self-contained. Built-in High-DPI scaling ensures a perfectly crisp and consistent interface across all screen resolutions.

🔡

Native RTL Support Free

An advanced native RTL engine powered by PySide6. It provides flawless character shaping, cursive joining, and bidirectional rendering for Arabic, Hebrew, Pashto, and Persian. Punctuation and numbering are handled with pixel-perfect accuracy.

📜

Translation API Logs Free

Unparalleled transparency with dual-layer logging. Short logs provide a quick overview of costs, while Long logs capture the entire API exchange – including system prompts, context subtitles, and raw model responses.

🖼️

OCR API Logs Free

Comprehensive audit trail for every vision-based request. Track image metadata, token efficiency, and model latencies. Every recognition event is logged with its exact prompt, enabling precise debugging and cost monitoring.

⚠️

Licensing & API Costs – Important Note

Please distinguish between the GCT Software Licence and Third-Party API Costs:

  • Free Features: These are unlocked in the GCT software for everyone. However, using them requires a connection to the Google Gemini or DeepInfra (Gemma) API, which carries its own usage costs.
  • PRO Features: These require a one-time purchase of a GCT PRO Licence to unlock advanced functionality within the program. This fee covers only software access and does not include or cover any API costs.
  • Independent API Services: GCT is a professional interface for AI services provided by Google, DeepInfra, and DeepL. These are independent commercial entities. You are responsible for all costs incurred through their respective APIs.
  • No Affiliation: Game-Changing Translator and its author are entirely independent and have no affiliation with Google, DeepInfra, or DeepL. GCT is a tool designed to facilitate the use of these third-party paid services.
⚠️

Compatibility & Version 4 Requirements

Please review these technical requirements before proceeding:

  • API-Only Architecture: Version 4 is built entirely around third-party AI APIs. Operation requires a valid API key from Google or DeepInfra.
  • Stay on v3.9.6: If you do not have (and do not plan to obtain) an API key, you should not update to version 4. Please remain on version 3.9.6, which is the final release supporting free offline OCR (Tesseract) and offline translation models (MarianMT).
  • Try Before You Buy: Do not purchase the GCT PRO Licence before thoroughly testing the FREE version. Ensure the software works correctly on your system and that the AI-powered OCR and translation meet your expectations.

Table of Contents

  1. Quick Start
  2. Interface Overview
  3. Settings: Models & Context
  4. Translation Costs (💡 Best Value for Money)
  5. Settings: App Behaviour
  6. PRO Features In Depth
  7. Settings: Appearance
  8. Settings: Caching & Debugging
  9. Settings: Interface Language
  10. Custom Prompts
  11. API Usage Statistics
  12. Keyboard Shortcuts
  13. About & Updates
Getting Started

Up and running in 3 steps

The Simple mode gives you everything you need to start translating immediately.

Simple mode – main interface

The Simple mode – your everyday starting point

Japanese OCR example
1
Set your target language

Enter the Target Language (your preferred output language). The Source Language is auto-detected in Simple mode.

2
Enter your API keys

Paste your Gemini or DeepInfra API Key and, optionally, your DeepL API Key. The fields are masked – click Show to reveal them. Keys are saved with Save Settings or Alt+S.

3
Select areas and start

Use the Source button to position and resize the capture window, and the Target button to place the translation overlay. Press the green Start (~) button or hit ~.

💡 To get a Gemini API key, sign in to Google AI Studio and click the Get API key button.

To get a DeepInfra API key, sign up at DeepInfra.com, navigate to the Dashboard, and copy your API token from the API Keys tab.

To get a DeepL API key, set up your DeepL account, navigate to the API keys & limits tab, and click the Create key button.
💡 Both overlay windows can be hidden before translation starts; once the translation process begins, the target overlay will appear automatically.
Interface

Two modes – Simple and Custom

Switch between modes using the top tab bar. Simple mode is designed for hassle-free operation: settings tabs are hidden by default and all options use predefined, locked values optimised for performance – you only need to set your languages and keys. In Custom mode, the settings tabs are shown by default and all options are unlocked for full manual configuration. The application is fully High-DPI aware and scales its interface automatically, so it stays crisp and consistent at any monitor resolution or DPI setting.

💡 In both Simple and Custom modes, you can toggle Settings on or off according to your preference.
💡 Click the show/hide icon to hide or show the language and API key fields.

Custom mode – full settings panel

Custom mode – Settings sub-tab active

1
Top Tab Bar

Source and Target open transparent overlay windows that you can move and resize to define your capture and translation areas directly on screen.

2
Language & API Key Fields

Shared between Simple and Custom modes. Select your source and target languages from the searchable dropdown menus. You must provide at least one API key (Gemini or DeepInfra) to power the OCR and translation engines. The DeepL API key is optional and only needed if you choose DeepL for translations.

3
Start / Stop Button

Large green button (or ~ key). Launches the full OCR → translate → display pipeline. Press again to stop.

4
Settings Sub-tabs

Settings · Custom Prompt · API Usage · Shortcuts · About – these panels are detailed below.
💡 Available in both modes:
In Simple, options are predefined and locked.
In Custom, they are fully configurable.

Settings Tab

Models & Context

Choose AI models for OCR and translation independently, and set how much
previous dialogue the AI remembers between subtitle lines.

Models and Context settings

Models and Context – first group in the Settings tab

1
Translation Model Free

Selects the model used for translation. Gemini 3.1 Flash-Lite is the fastest Google option, while Gemini 3 Flash offers the most idiomatic results. Alternatively, Gemma 4 (DeepInfra) provides a highly capable and cost-effective choice for general gameplay.

1
DeepL PRO

To select the DeepL model, a PRO licence is required.
DeepL delivers high-quality, natural-sounding translations.
Its free tier offers a quota of 500,000 characters/month. Context subtitles sent to DeepL are free and don't count against your monthly limit.

2
OCR Model Free

Selects the model for text recognition. Gemini 3.1 Flash-Lite (Low) is a solid choice, but Gemma 4 (DeepInfra) is the ultimate budget alternative – it is 4x cheaper than Gemini 3.1 Flash-Lite while maintaining near-perfect accuracy (minor typos may occur with very ornate fonts).

3
Context Window Free

Number of previous subtitles sent with every translation request (0–5). A window of 5 gives the AI full narrative context, dramatically improving pronoun consistency and idiomatic flow.
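
One way to picture the Context Window is as a fixed-size queue of recent subtitles. The following is a minimal Python sketch under that assumption, not GCT's actual code: the class name ContextWindow and its methods are hypothetical, and how the context is really packaged into the request is not described in this manual.

```python
from collections import deque
from typing import List


class ContextWindow:
    """Keeps the last `size` subtitles (hypothetical helper, not GCT's code)."""

    def __init__(self, size: int = 5) -> None:
        if not 0 <= size <= 5:
            raise ValueError("size must be between 0 and 5")
        # deque(maxlen=0) always stays empty, which models a disabled window.
        self._lines = deque(maxlen=size)

    def add(self, subtitle: str) -> None:
        """Remember a subtitle once it has been translated."""
        self._lines.append(subtitle)

    def context(self) -> List[str]:
        """Return the remembered subtitles, oldest first."""
        return list(self._lines)


window = ContextWindow(size=2)
for line in [
    "Are you... immortal?",
    "Depends on what you mean by that.",
    "Yes, I cannot be killed.",
]:
    window.add(line)

recent = window.context()  # only the two most recent lines remain
```

A bounded queue drops the oldest line automatically, so the amount of extra text sent with each request never grows beyond the configured window.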

💡 Pro Tip: Best Value for Money

For the optimal balance of quality and cost, we recommend using Gemini 3 Flash or Gemini 3.1 Flash-Lite for translation and Gemma 4 for OCR. Since OCR is the primary cost driver (5–6x more expensive than translation), switching to Gemma for screen scanning cuts your total hourly cost to roughly one third.

Recommended (Best Value)

• Translation: Gemini 3 Flash
• OCR: Gemma 4 (DeepInfra)
• Estimated cost: ~$0.29 / hour

Maximum Savings

• Translation: Gemma 4
• OCR: Gemma 4 (DeepInfra)
• Estimated cost: ~$0.18 / hour

Disclaimer: Costs are estimates based on a 750ms Scan Interval and are provided for illustrative purposes only. Actual costs depend on screen content, token density, and current API pricing. You are solely responsible for monitoring and managing your own API usage and billing through your official Google or DeepInfra provider accounts.

💰 Model Pricing Comparison (per 1M tokens)

Model (Provider)                  Input (Image / Text)    Output (OCR / Translation)
Gemma 4 26B A4B (DeepInfra)       $0.07                   $0.34
Gemini 3.1 Flash-Lite (Google)    $0.25                   $1.50
Gemini 3 Flash (Google)           $0.50                   $3.00

⚠️ Note on OCR Resolution (LOW vs MEDIUM):
Although the price per token is identical, the MEDIUM resolution setting captures more pixels and sends a larger data payload per image compared to LOW. Consequently, each screenshot in MEDIUM mode consumes more tokens, resulting in higher actual hourly expenses. Stick to LOW resolution unless you encounter extremely complex or tiny fonts.

*Pricing data verified as of May 2026.
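
The per-token prices above can be turned into a rough hourly figure with simple arithmetic. The sketch below is illustrative only: the prices are copied from the table, but the token counts per call (300 input, 20 output) are made-up placeholders, the model keys are hypothetical labels, and it assumes every capture triggers an API call, which is an upper bound. Check the API Usage tab for your real numbers.

```python
# Prices per 1M tokens in USD, copied from the pricing table: (input, output).
PRICES = {
    "gemma-4": (0.07, 0.34),
    "gemini-3.1-flash-lite": (0.25, 1.50),
    "gemini-3-flash": (0.50, 3.00),
}


def hourly_cost(model, calls_per_hour, input_tokens, output_tokens):
    """Estimated USD per hour for one pipeline stage (OCR or translation)."""
    price_in, price_out = PRICES[model]
    per_call = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
    return calls_per_hour * per_call


# A 750 ms Scan Interval gives at most 3600 / 0.75 = 4800 captures per hour.
calls = 3600 / 0.75

# ASSUMPTION: 300 input and 20 output tokens per OCR call are placeholders,
# not measured values. Real counts depend on resolution and screen content.
ocr_gemma = hourly_cost("gemma-4", calls, 300, 20)
ocr_lite = hourly_cost("gemini-3.1-flash-lite", calls, 300, 20)
ratio = ocr_lite / ocr_gemma
```

With these placeholder token counts the Flash-Lite OCR figure comes out a little under four times the Gemma figure, which simply reflects the ratio of the table prices.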

Settings Tab

App Behaviour

Control how the application captures, processes, and displays content.
PRO features are marked with PRO.

App Behaviour settings

App Behaviour – second group in the Settings tab

1
Find Subtitles PRO

Automatically scans the full screen for a set period to detect where subtitles appear, then locks the capture area. During the scan window (default 120 seconds), the frame continuously expands to fit the widest subtitle seen. Once the period ends, the frame stabilises at those maximum dimensions. You can adjust the scan time with the − / + buttons next to the checkbox.

2
Scan Wider PRO

Expands the captured area beyond its visible frame boundary before sending to the OCR model – by a configurable percentage (default 100%). This prevents edge words from being clipped and eliminates hallucinations caused by tight crops. The expansion is clamped to the actual screen dimensions.

3
Target on Source PRO

Automatically positions the translation overlay directly over the source capture window, so the translation appears exactly where the original text is.
💡 Even when this option is disabled, a PRO user can drag the target window manually to any position – including directly on top of the original text.

4
Keep Linebreaks Free

Preserves the original line structure of multi-line subtitles in the translated output – essential for cinematic subtitles where dialogue from two different characters is displayed simultaneously using dashes.

5
Scan Interval & Clear Timeout Free

Scan Interval (ms) – how often the screen is captured (default 500 ms). Clear Translation Timeout (s) – seconds before the overlay clears when source text disappears (default 3 s).

See the difference

Three screenshots from The Witcher 3 show how Find Subtitles progressively adapts the capture frame to fit the widest subtitle seen during the scan period, while Target on Source overlays the translation directly on the subtitle bar.

Find Subtitles – Frame Grows to Fit the Longest Subtitle

Short subtitle – frame is small
Short subtitle: "Are you... immortal?" – the frame is compact; translation overlay covers the text exactly
Longer subtitle – frame has not yet expanded
Subtitle grows longer: "Depends on what you mean by that. Yes, I cannot be killed." – the subtitle now extends beyond the previous frame; the overlay hasn't adjusted yet
Longest subtitle – frame stabilises
Longest subtitle yet: the frame has grown to its maximum width and stabilised – all future subtitles fit within it

🔍 Find Subtitles – how the scan period works

When enabled, Find Subtitles runs an active detection phase for a configurable duration (default 120 seconds). During this window the application continuously scans the source area for subtitle text and expands the capture frame to the largest bounding box seen so far. After the scan period ends, the frame is locked at those dimensions – large enough to contain the widest subtitle encountered, ensuring that every subsequent line is fully captured without manual adjustment.
💡 All PRO features can be used in any combination or entirely independently.
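
The grow-then-lock behaviour of Find Subtitles can be sketched as taking the union of every subtitle bounding box seen during the scan period. This is a simplified model, not GCT's implementation: the names Box and SubtitleFinder are hypothetical, and the way subtitle boxes are actually detected is outside the scope of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    left: int
    top: int
    right: int
    bottom: int

    def union(self, other: "Box") -> "Box":
        """Smallest box that contains both boxes."""
        return Box(
            min(self.left, other.left),
            min(self.top, other.top),
            max(self.right, other.right),
            max(self.bottom, other.bottom),
        )


class SubtitleFinder:
    """Grows a capture frame during the scan period, then locks it."""

    def __init__(self, scan_seconds: float = 120.0) -> None:
        self.scan_seconds = scan_seconds
        self.frame: Optional[Box] = None

    def observe(self, detected: Box, elapsed: float) -> Optional[Box]:
        """Feed one detected subtitle box; `elapsed` is seconds since the scan began."""
        if elapsed < self.scan_seconds:
            self.frame = detected if self.frame is None else self.frame.union(detected)
        # Once the scan period is over, the frame is left unchanged (locked).
        return self.frame


finder = SubtitleFinder(scan_seconds=120)
finder.observe(Box(800, 950, 1120, 990), elapsed=5)    # short subtitle
finder.observe(Box(600, 950, 1320, 990), elapsed=40)   # longer subtitle widens the frame
locked = finder.observe(Box(100, 900, 1800, 1000), elapsed=130)  # too late, ignored
```

Because the frame only ever grows during the scan, the locked result is large enough for the widest subtitle seen in that period, which matches the behaviour described above.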

↔️ Scan Wider

Subtitles that extend close to the edge of the source rectangle can get clipped – causing the OCR model to see an incomplete word and hallucinate the rest. Scan Wider solves this by temporarily expanding the captured area beyond the visible frame boundary before sending the image to the OCR model. The percentage is configurable (default 100%). The expansion is clamped to the actual screen dimensions to avoid capturing outside the display.
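
The expand-and-clamp step can be sketched in a few lines of Python. This is an illustration under a stated assumption: the percentage is applied to each dimension of the frame and split evenly between opposite edges. The manual does not say exactly how GCT interprets the percentage, and the function name scan_wider is hypothetical.

```python
def scan_wider(left, top, right, bottom, percent, screen_w, screen_h):
    """Expand a capture rectangle by `percent` of its size, clamped to the screen.

    ASSUMPTION: the growth is split evenly between opposite edges.
    """
    grow_x = (right - left) * percent / 100 / 2
    grow_y = (bottom - top) * percent / 100 / 2
    return (
        max(0, round(left - grow_x)),
        max(0, round(top - grow_y)),
        min(screen_w, round(right + grow_x)),
        min(screen_h, round(bottom + grow_y)),
    )


# A 720 x 100 frame near the bottom of a 1920 x 1080 screen, expanded by 100%.
expanded = scan_wider(600, 950, 1320, 1050, percent=100, screen_w=1920, screen_h=1080)
# -> (240, 900, 1680, 1080): the bottom edge is clamped to the screen height
```

The clamp matters most for subtitle bars near the bottom of the screen, where an unclamped expansion would request pixels that do not exist.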

🪟 Target on Source

Normally, the translation overlay appears in a separate window that you position manually. With Target on Source enabled, the translation overlay is automatically placed over the source capture area – so the translated text appears directly on top of the original subtitle bar. Note: even with this option disabled, PRO users can drag the target window manually to any position, including directly over the subtitle area. The automatic mode simply makes this effortless – no dragging required.

Settings Tab

Appearance

Customise the look of both overlay windows to blend with any game. Colour pickers are PRO-only – fonts and opacity are available to all users.

Appearance settings

Appearance – third group in the Settings tab

1
Source Area Colour PRO

Background colour of the source capture window. Click Choose Colour to open the native colour picker.

2
Target Area Colour PRO

The background colour of the translation overlay window.

3
Target Text Colour PRO

Font colour of the translated text in the overlay.

4
Font Type & Size Free

Enter any font installed on your system (e.g. Arial, Calibri). Adjust the size with the − / + buttons.

5
Opacity Background / Text Free

Set the transparency of the overlay background and text independently (0.0 = fully transparent, 1.0 = fully opaque). Ideal for overlaying text on dark subtitle bars.

Settings Tab

Caching & Debugging

File caching saves translations to disk – repeated phrases cost zero API credits. Debug logging captures the full OCR and translation pipeline for troubleshooting.

Caching and Debugging settings

Caching and Debugging – bottom of the Settings tab

1
Enable DeepL file cache Free

Saves all DeepL translations to deepl_cache.txt. Cache hits don't count against the monthly character quota.

2
Enable Gemini file cache Free

Saves all Gemini translations to gemini_cache.txt. When identical OCR text appears again, the cached translation is returned instantly – no API call made.

3
Enable DeepInfra file cache Free

Saves all DeepInfra translations to deepinfra_cache.txt. Works identically to the Gemini cache, ensuring zero API costs for repeated phrases.

4
Enable Debug Log Free

Writes application status messages, thread events, API connection info, and error reports to translator_debug.log. This file captures operational diagnostics – not API call details (those are in the dedicated log files described in the API Usage section). Disable during normal gameplay for maximum performance.

5
Clear Buttons Free

Clear File Caches – wipes all file caches.
Clear Translation Cache – clears in-memory LRU only.
Clear Debug Log – empties the log file.
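
The two cache tiers described in this section can be sketched as an in-memory LRU sitting in front of an append-only text file. This is a rough model only: the class TwoTierCache, the tab-separated line format, and the 1000-entry limit are assumptions made for the example, and the real layout of the GCT cache files is not documented here.

```python
import os
import tempfile
from collections import OrderedDict


class TwoTierCache:
    """In-memory LRU in front of an optional file cache (illustrative sketch).

    ASSUMPTION: one 'source<TAB>translation' pair per line; text containing
    tabs or newlines would need escaping in a real format.
    """

    def __init__(self, path=None, max_items=1000):
        self.path = path
        self.max_items = max_items
        self.memory = OrderedDict()
        self.disk = {}
        if path and os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                for line in fh:
                    source, sep, translated = line.rstrip("\n").partition("\t")
                    if sep:
                        self.disk[source] = translated

    def get(self, source):
        if source in self.memory:
            self.memory.move_to_end(source)  # mark as most recently used
            return self.memory[source]
        if source in self.disk:
            self._remember(source, self.disk[source])  # promote to memory
            return self.disk[source]
        return None  # miss: the caller would make an API call

    def put(self, source, translated):
        self._remember(source, translated)
        if self.path and source not in self.disk:
            self.disk[source] = translated
            with open(self.path, "a", encoding="utf-8") as fh:
                fh.write(f"{source}\t{translated}\n")

    def clear_memory(self):
        """Drops the LRU tier only, like Clear Translation Cache."""
        self.memory.clear()

    def _remember(self, source, translated):
        self.memory[source] = translated
        self.memory.move_to_end(source)
        while len(self.memory) > self.max_items:
            self.memory.popitem(last=False)  # evict least recently used


path = os.path.join(tempfile.mkdtemp(), "demo_cache.txt")
cache = TwoTierCache(path)
cache.put("Hello", "Cześć")
cache.clear_memory()
hit = cache.get("Hello")                    # served from the file tier
reloaded = TwoTierCache(path).get("Hello")  # still there after a restart
```

The memory tier answers repeated phrases within a session, while the file tier is what lets a phrase stay free of API calls across sessions.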

Settings Tab

Interface Language

Interface Language setting

Interface Language – bottom of the Settings tab

1
Interface Language Free

Switch the application UI between English and Polish. Changes take effect immediately without restarting.

Custom Prompt Tab

Custom Prompts

Inject custom instructions into every API call – for both translation and OCR. Prompts are saved to the disk and applied automatically once enabled.

Translation Prompt Free

Translation Prompt

Translation Prompt – Custom Prompt tab

1
Translation Prompt Free

Prepended to every translation instruction sent to Gemini and DeepL. Use it to define tone, style, or game-specific context – e.g. "You are translating a Star Wars game."

2
Enabled / Save / Reload

Toggle the setting without losing your saved text. Save writes to custom_prompt.txt. Reload re-reads from disk – useful if you edit the file externally.

⚠️ Sent with every API call. Keep it as short as possible to minimise costs. Write in English for best results.

OCR Prompt PRO

OCR Prompt

OCR Prompt – Custom Prompt tab

1
OCR Prompt PRO

A custom instruction injected into every AI OCR request – separate from the translation prompt. Use it to filter noise, ignore HUD elements, or enforce correct reading of ambiguous characters.

OCR Prompt in depth

The OCR Prompt gives you direct control over what the AI model reads from the screen. The example below shows it filtering a complex GTA-style game scene: despite a busy HUD with ammo counters, a minimap, and on-screen labels, the AI returns only the dialogue subtitle – and strips the speaker’s name as instructed.

OCR Prompt example – GTA-style game with complex HUD
Result: "Bang. Brain all up the walls." – the speaker name "Vincenzo:" is stripped, and all HUD elements are ignored. The debug logs confirm the exact OCR output.

✍️ How it works

The OCR Prompt is injected into every AI OCR request as an additional instruction, on top of the standard transcription command. It is completely separate from the Translation Prompt. You can describe in plain English what the AI should focus on or ignore – but results depend on the model's ability to follow instructions and will vary. This is a prompt-engineering exercise, not a deterministic setting: expect to experiment, refine, and test across different scenes before finding what works for your game. The prompt is saved to disk and reloaded automatically on startup.

💡 Example prompt used above: "CRITICAL: Ignore all UI elements, status lines, HUD, headers, inventory text, or player choice menus. Ignore ALL text in the image that is not a dialogue subtitle. Return the subtitles only. Ignore the speaker’s name displayed in CAPITALS, e.g.: For ‘ENZO: Let’s go’, return only: ‘Let’s go’."
API Usage Tab

API Usage Statistics

Real-time, per-call cost tracking for every AI and DeepL request. The API Usage tab contains four panels – scroll through them in the order shown below.

⚠️ Important Note: Statistics are based on the short log files (e.g., Gemini_OCR_Short_Log.txt, DeepInfra_OCR_Short_Log.txt, etc.). Data will reset if these files are deleted. Cost tracking is provided for reference purposes only. You remain solely responsible for monitoring your own API usage and costs through your own official billing accounts (Google, DeepInfra, or DeepL).
Gemini Statistics

Panel 1 – Gemini Statistics

1
Gemini calls & cost

Total calls, token usage, and cumulative cost for the Gemini API. Tracks both translation and OCR tasks performed by Google models.

DeepInfra Statistics

Panel 2 – DeepInfra Statistics

1
DeepInfra calls & cost

Statistics for DeepInfra models (such as Gemma). This panel tracks the significantly lower costs associated with the budget-friendly DeepInfra pipeline.

Combined Statistics

Panel 3 – Combined Statistics

1
Total API cost

Combines Google and DeepInfra pipelines into a single cost figure, providing an overview of total costs across different configurations.

DeepL Usage Tracker

Panel 4 – DeepL Usage Tracker

1
Monthly character quota

DeepL's free tier allows 500,000 characters/month. The tracker queries your account balance live from the DeepL API. Context subtitles are free and don't count against this quota.

Detailed Log Files

In addition to the on-screen statistics panels, the application writes detailed log files to the installation directory. These files contain the complete record of every API call made across all sessions. Depending on the model you use, entries are written to the Gemini, DeepInfra, or DeepL log files.

S
OCR Short Logs

Gemini_OCR_Short_Log.txt
DeepInfra_OCR_Short_Log.txt
One entry per call: model, cost, duration, and text result.

L
OCR Long Logs

Gemini_OCR_Long_Log.txt
DeepInfra_OCR_Long_Log.txt
Full record including the complete prompt and raw API response.

S
Translation Short Logs

Gemini_..., DeepInfra_..., or DeepL_...
Detailed statistics and the final translated text result.

L
Translation Long Logs

Full record of the translation pipeline including system prompts and context.

OCR API call log entry

Example OCR log entry – Gemini_OCR_Short_Log.txt

Translation API call log entry

Example Translation log entry – Gemini_Translation_Short_Log.txt

Shortcuts Tab

Keyboard Shortcuts

To improve workflow, the application supports both global and local keyboard shortcuts. Shortcuts marked with (G) are global – they remain active even when the application is minimised or you are playing a full-screen game.

Keyboard shortcuts

Shortcuts – visible in the Shortcuts sub-tab

1
Customising Shortcuts

To modify a shortcut, click its button in the Shortcuts tab and press your desired key combination. The system will automatically capture the input. If a shortcut is already assigned to another function, the button will turn red and display an "Already used" message. Note that certain system-critical combinations may be forbidden to prevent conflicts with Windows operations.

2
Global vs Local (G)

Shortcuts marked with (G) are Global (Start/Stop, Window Visibility, Take Screenshot). They work from anywhere, even during gameplay. All other shortcuts (Save, Clear Cache, Reset, etc.) are Local and only work when the translator window is active.

💡 Global shortcuts work most reliably when the application is run as an administrator. This allows the system to capture inputs even when high-privilege windows (such as Task Manager or certain games) are in focus.
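
The "Already used" check described under Customising Shortcuts amounts to refusing a key combination that another action already owns. A minimal sketch of that idea follows; the class ShortcutRegistry and the action names are hypothetical, and GCT's real key handling (including which system combinations are forbidden) is not shown.

```python
class ShortcutRegistry:
    """Maps actions to key combinations and rejects duplicates (sketch)."""

    def __init__(self):
        self._by_action = {}

    @staticmethod
    def _normalise(combo):
        # "Alt+S" and "s + alt" should count as the same combination.
        return "+".join(sorted(part.strip().lower() for part in combo.split("+")))

    def assign(self, action, combo):
        """Return True if assigned, False if another action already uses it."""
        wanted = self._normalise(combo)
        for other, existing in self._by_action.items():
            if other != action and existing == wanted:
                return False  # the GUI would show "Already used" here
        self._by_action[action] = wanted
        return True


registry = ShortcutRegistry()
first = registry.assign("save_settings", "Alt+S")
clash = registry.assign("clear_cache", "s + alt")  # same keys, different spelling
retry = registry.assign("clear_cache", "Alt+C")
```

Normalising the combination before comparing it avoids treating the same keys pressed in a different order as a different shortcut.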
About Tab

About & Updates

Version information, one-click update checking, and PRO licence activation – all in the About sub-tab.

Version info Updates and PRO licence

About tab – version, updates, and PRO activation

1
Version & Release Date

Displays the installed version for quick reference.

2
Check for Updates

Queries the GitHub API for the latest release. If a newer version is available, it is downloaded to a staging directory and applied automatically.

👑
PRO Licence Activation

Enter your PRO licence key and click Activate PRO Key to unlock all PRO features: DeepL, Find Subtitles, Scan Wider, Target on Source, colour pickers, and OCR Prompt.
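
For readers curious about the update check: GitHub's REST API exposes the latest release of a repository at /repos/{owner}/{repo}/releases/latest, and the response includes a tag_name field. The sketch below covers only the comparison step and assumes tags of the form v4.1.1; the function names are hypothetical, the repository path is deliberately left out, and GCT's own update logic may differ.

```python
def parse_version(tag):
    """Turn 'v4.1.1' or '4.1.1' into the tuple (4, 1, 1)."""
    return tuple(int(part) for part in tag.lstrip("vV").split("."))


def is_newer(latest_tag, installed):
    """True if the release tag is newer than the installed version."""
    return parse_version(latest_tag) > parse_version(installed)


update_available = is_newer("v4.2.0", "4.1.1")
same_version = is_newer("v4.1.1", "4.1.1")
numeric_compare = is_newer("v4.10.0", "v4.9.6")  # compared as numbers, not text
```

Comparing tuples of integers rather than raw strings avoids the classic mistake of ranking 4.9 above 4.10.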