Next-generation AI translation for game and movie subtitles.
Powered by state-of-the-art OCR and superior context-aware engines, providing high-quality translation in over 100 languages.
A focused, professional tool that does one thing exceptionally well: translate anything on your screen in real time using state-of-the-art AI.
✨ The defining feature of version 4 ✨
Simple mode is designed for immediate operation – just pick your languages, position your overlays, and start translating.
Custom mode provides granular control over every aspect of the tool, allowing power users to fine-tune quality, performance, and cost.
Industry-first AI text recognition. Use Gemini for ultimate precision or Gemma 4 (DeepInfra) for a high-performance budget option – delivering near-Gemini quality at 4x lower cost (~$0.16/hour for OCR).
Multiple models for context-aware translation in 100+ languages. Choose Gemini 3 Flash for best results, or Gemma 4 (DeepInfra) for a fast, cost-effective alternative.
High-quality, context-aware translation for over 100 languages. Delivers elite precision for Japanese, Chinese, and European scripts, as well as minority languages such as Welsh, Icelandic, Maori, and Burmese. Context subtitles are provided free and don't count towards your quota.
Send up to 5 previous subtitles with every request. Maintains character names, grammatical flow, and narrative coherence across dialogue.
Real-time token-level analytics: cost per call, per minute, per hour, and cumulative cost for Google, DeepInfra and DeepL APIs.
In-memory LRU cache + optional file cache. Repeated phrases cost zero API credits – they are retrieved instantly from memory or, when file caching is enabled, from disk.
Automatically scans the full screen for a set period to detect where subtitles appear, then locks the capture area.
Dynamically expands the OCR capture area to prevent edge-of-frame word truncation and AI hallucinations from tight crops.
Automatically overlays the translation directly onto the original subtitle area for seamless, immersive reading. Even with this option disabled, a PRO user can drag the target window manually over the subtitles.
Inject a custom instruction into every translation request. Define the tone, style, or game-specific context to ensure consistent character names and immersive dialogue.
Inject a custom instruction into every AI OCR call – ignore HUD elements, strip speaker names, focus on dialogue only.
Fully redesigned interface with Simple and Custom modes. Clean, responsive, and self-contained. Built-in High-DPI scaling ensures a perfectly crisp and consistent interface across all screen resolutions.
The advanced native RTL engine powered by PySide6. Flawless character shaping, cursive joining, and bidirectional rendering for Arabic, Hebrew, Pashto, and Persian. Punctuation and numbering are handled with pixel-perfect accuracy.
Unparalleled transparency with dual-layer logging. Short logs provide a quick overview of costs, while Long logs capture the entire API exchange – including system prompts, context subtitles, and raw model responses.
Comprehensive audit trail for every vision-based request. Track image metadata, token efficiency, and model latencies. Every recognition event is logged with its exact prompt, enabling precise debugging and cost monitoring.
Please distinguish between the GCT Software Licence and Third-Party API Costs:
Please review these technical requirements before proceeding:
Simple mode gives you everything you need to start translating immediately.
Enter the Target Language (your preferred output language). The Source Language is auto-detected in Simple mode.
Paste your Gemini or DeepInfra API Key and/or DeepL API Key (optional). The fields are masked – click Show to reveal them. Save your keys with Save Settings or Alt+S.
Use the Source button to position and resize the capture window, and the Target button to place the translation overlay. Press the green Start (~) button or hit ~.
Switch between modes using the top tab bar. Simple mode is designed for hassle-free operation: settings tabs are hidden by default and all options use predefined, locked values optimized for performance – you only need to set your languages and keys. In Custom mode, the settings tabs are shown by default and all options are unlocked for full manual configuration. The application features full High-DPI awareness, automatically scaling its interface to remain perfectly crisp and consistent across any monitor resolution or DPI setting.
💡 In both Simple and Custom modes, you can toggle Settings on or off according to your preference.
💡 By clicking the / icon, you can hide or show the language and API key fields.
Custom mode – Settings sub-tab active
Source and Target open transparent overlay windows that you can move and resize to define your capture and translation areas directly on screen.
Shared between Simple and Custom modes. Select your source and target languages from the searchable dropdown menus. You must provide at least one API key (Gemini or DeepInfra) to power the OCR and translation engines. The DeepL API key is optional and only needed if you choose DeepL for translations.
Large green button (or ~ key). Launches the full OCR → translate → display pipeline. Press again to stop.
Settings · Custom Prompt · API Usage · Shortcuts · About – these panels are detailed below.
💡 Available in both modes:
In Simple, options are predefined and locked.
In Custom, they are fully configurable.
Choose AI models for OCR and translation independently, and set how much previous dialogue the AI remembers between subtitle lines.
Models and Context – first group in the Settings tab
Selects the model used for translation. Gemini 3.1 Flash-Lite is the fastest Google option, while Gemini 3 Flash offers the most idiomatic results. Alternatively, Gemma 4 (DeepInfra) provides a highly capable and cost-effective choice for general gameplay.
To select the DeepL model, a PRO licence is required.
DeepL delivers high-quality, natural-sounding translations.
Its free tier offers a quota of 500,000 characters/month. Context subtitles sent to DeepL are free and don't count against your monthly limit.
Selects the model for text recognition. Gemini 3.1 Flash-Lite (Low) is a solid choice, but Gemma 4 (DeepInfra) is the ultimate budget alternative – it is 4x cheaper than Gemini 3.1 Flash-Lite while maintaining near-perfect accuracy (minor typos may occur with very ornate fonts).
Number of previous subtitles sent with every translation request (0–5). A window of 5 gives the AI full narrative context, dramatically improving pronoun consistency and idiomatic flow.
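The sliding window of previous subtitles can be pictured as a small bounded queue. The sketch below is only an illustration of that idea, not the application's actual code; the class name `ContextWindow` and its methods are hypothetical.

```python
from collections import deque

MAX_CONTEXT = 5  # upper bound of the 0-5 range described above


class ContextWindow:
    """Keeps the most recent subtitles to send alongside a new request."""

    def __init__(self, size=MAX_CONTEXT):
        if not 0 <= size <= MAX_CONTEXT:
            raise ValueError(f"size must be between 0 and {MAX_CONTEXT}")
        # A deque with maxlen drops the oldest entry automatically.
        # maxlen=0 is valid and simply discards everything appended.
        self._items = deque(maxlen=size)

    def add(self, subtitle):
        """Store a subtitle after it has been translated."""
        self._items.append(subtitle)

    def snapshot(self):
        """Return the stored subtitles, oldest first."""
        return list(self._items)
```

With a size of 2, adding three subtitles leaves only the last two in the snapshot, which is the behaviour a fixed-length context window needs.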
For the optimal balance of quality and cost, we recommend using Gemini 3 Flash or Gemini 3.1 Flash-Lite for translation and Gemma 4 for OCR. Since OCR is the primary cost driver (5-6x more expensive than translation), switching to Gemma for screen scanning reduces your total hourly cost by approximately 3 times.
• Translation: Gemini 3 Flash
• OCR: Gemma 4 (DeepInfra)
• Estimated cost: ~$0.29 / hour
• Translation: Gemma 4
• OCR: Gemma 4 (DeepInfra)
• Estimated cost: ~$0.18 / hour
Disclaimer: Costs are estimates based on a 750ms Scan Interval and are provided for illustrative purposes only. Actual costs depend on screen content, token density, and current API pricing. You are solely responsible for monitoring and managing your own API usage and billing through your official Google or DeepInfra provider accounts.
| Model (Provider) | Input price (image / text), USD per 1M tokens | Output price (OCR / translation), USD per 1M tokens |
|---|---|---|
| Gemma 4 26B A4B (DeepInfra) | $0.07 | $0.34 |
| Gemini 3.1 Flash-Lite (Google) | $0.25 | $1.50 |
| Gemini 3 Flash (Google) | $0.50 | $3.00 |
⚠️ Note on OCR Resolution (LOW vs MEDIUM):
Although the price per token is identical, the MEDIUM resolution setting captures more pixels and sends a larger data payload per image compared to LOW. Consequently, each screenshot in MEDIUM mode consumes more tokens, resulting in higher actual hourly expenses. Stick to LOW resolution unless you encounter extremely complex or tiny fonts.
*Pricing data verified as of May 2026.
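As a rough illustration of how an hourly figure could be derived from the table, the sketch below multiplies token counts by the listed prices. It rests on several assumptions: the prices are taken to be USD per 1M tokens, the model keys are hypothetical identifiers, the token counts in any call are placeholders you would replace with values from your own logs, and every scan is assumed to trigger an API call, which makes the result an upper bound.

```python
# (input price, output price), assumed to be USD per 1M tokens.
# Values are copied from the pricing table; the dictionary keys are hypothetical.
PRICES = {
    "gemma-4-26b-a4b": (0.07, 0.34),
    "gemini-3.1-flash-lite": (0.25, 1.50),
    "gemini-3-flash": (0.50, 3.00),
}


def call_cost(model, input_tokens, output_tokens):
    """Cost in USD of a single API call."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def hourly_cost(model, input_tokens, output_tokens, scan_interval_ms):
    """Upper-bound hourly cost, assuming one call per scan."""
    calls_per_hour = 3_600_000 / scan_interval_ms
    return call_cost(model, input_tokens, output_tokens) * calls_per_hour
```

Because cached or unchanged frames would not produce a call, real costs would normally be lower than this estimate.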
Control how the application captures, processes, and displays content.
PRO features are marked with PRO.
App Behaviour – second group in the Settings tab
Automatically scans the full screen for a set period to detect where subtitles appear, then locks the capture area. During the scan window (default 120 seconds), the frame continuously expands to fit the widest subtitle seen. Once the period ends, the frame stabilises at those maximum dimensions. You can adjust the scan time with the – / + buttons next to the checkbox.
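One way to picture the expanding frame is as a running union of the bounding boxes detected during the scan window. The sketch below is a minimal illustration under that assumption; `Rect`, `union`, and `locked_area` are hypothetical names, and how the application actually detects subtitle boxes is not shown here.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def union(a: Optional[Rect], b: Rect) -> Rect:
    """Smallest rectangle that contains both a and b."""
    if a is None:
        return b
    return Rect(
        min(a.left, b.left),
        min(a.top, b.top),
        max(a.right, b.right),
        max(a.bottom, b.bottom),
    )


def locked_area(detections) -> Optional[Rect]:
    """Fold all boxes seen during the scan window into one capture area."""
    area = None
    for rect in detections:
        area = union(area, rect)
    return area
```

A union can only grow, which matches the described behaviour of a frame that expands to the widest subtitle seen and then stays at those maximum dimensions.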
Expands the captured area beyond its visible frame boundary before sending to the OCR model – by a configurable percentage (default 100%). This prevents edge words from being clipped and eliminates hallucinations caused by tight crops. The expansion is clamped to the actual screen dimensions.
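The expand-then-clamp step can be sketched as follows. The exact meaning of the percentage is an assumption here: the sketch treats it as extra width and height relative to the original frame, split evenly between opposite sides. The function name `expand_rect` is hypothetical.

```python
def expand_rect(left, top, width, height, percent, screen_w, screen_h):
    """Grow a capture rectangle by `percent` and clamp it to the screen.

    Assumption: percent=100 adds 100% of the original width and height,
    half on each side. Returns (left, top, width, height).
    """
    dx = width * percent / 100 / 2
    dy = height * percent / 100 / 2
    # Clamp every edge so the expanded area never leaves the screen.
    new_left = max(0, round(left - dx))
    new_top = max(0, round(top - dy))
    new_right = min(screen_w, round(left + width + dx))
    new_bottom = min(screen_h, round(top + height + dy))
    return new_left, new_top, new_right - new_left, new_bottom - new_top
```

For a frame near the bottom of a 1920x1080 screen, the lower edge stops at 1080 instead of extending past it, so the OCR model always receives pixels that exist.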
Automatically positions the translation overlay directly over the source capture window, so the translation appears exactly where the original text is.
💡 Even when this option is disabled, a PRO user can drag the target window manually to any position – including directly on top of the original text.
Preserves the original line structure of multi-line subtitles in the translated output – essential for cinematic subtitles where dialogue from two different characters is displayed simultaneously using dashes.
Scan Interval (ms) – how often the screen is captured (default 500 ms). Clear Translation Timeout (s) – seconds before the overlay clears when source text disappears (default 3 s).
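The interaction between the two timing settings can be sketched with a small helper that is called once per scan and reports when the overlay should be cleared. This is an illustrative assumption about the logic, not the application's implementation; `OverlayTimer` and its `clock` parameter are hypothetical.

```python
import time


class OverlayTimer:
    """Decides when to clear the overlay after source text disappears."""

    def __init__(self, clear_timeout_s=3.0, clock=time.monotonic):
        self.clear_timeout_s = clear_timeout_s
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = None       # time of the last scan that found text

    def update(self, text_found):
        """Record one scan result; return True if the overlay should clear now."""
        now = self._clock()
        if text_found:
            self._last_seen = now
            return False
        if self._last_seen is None:
            return False             # nothing is displayed, nothing to clear
        if now - self._last_seen >= self.clear_timeout_s:
            self._last_seen = None   # clear once, then stay idle
            return True
        return False
```

Calling `update` at every scan interval means a shorter interval detects the disappearance sooner, while the timeout controls how long the last translation lingers.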
Customise the look of both overlay windows to blend with any game. Colour pickers are PRO-only – fonts and opacity are available to all users.
Appearance – third group in the Settings tab
Background colour of the source capture window. Click Choose Colour to open the native colour picker.
The background colour of the translation overlay window.
Font colour of the translated text in the overlay.
Enter any font installed on your system (e.g. Arial, Calibri). Adjust the size with – / +.
Set the transparency of the overlay background and text independently (0.0 = fully transparent, 1.0 = fully opaque). Ideal for overlaying text on dark subtitle bars.
File caching saves translations to disk – repeated phrases cost zero API credits. Debug logging captures the full OCR and translation pipeline for troubleshooting.
Caching and Debugging – bottom of the Settings tab
Saves all DeepL translations to deepl_cache.txt. Cache hits don't count against the monthly character quota.
Saves all Gemini translations to gemini_cache.txt. When identical OCR text appears again, the cached translation is returned instantly – no API call made.
Saves all DeepInfra translations to deepinfra_cache.txt. Works identically to the Gemini cache, ensuring zero API costs for repeated phrases.
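A two-tier cache of this kind can be sketched as an in-memory LRU in front of an append-only file. The on-disk format of the real cache files is not documented here, so the sketch uses JSON lines in a hypothetical file purely for illustration; `TwoTierCache` and its fields are assumed names.

```python
import json
from collections import OrderedDict
from pathlib import Path


class TwoTierCache:
    """In-memory LRU backed by an append-only JSON-lines file (assumed format)."""

    def __init__(self, path, capacity=1000):
        self.path = Path(path)
        self.capacity = capacity
        self._memory = OrderedDict()   # most recently used entries
        self._disk = {}                # index of everything stored on disk
        if self.path.exists():
            with self.path.open(encoding="utf-8") as fh:
                for line in fh:
                    if line.strip():
                        entry = json.loads(line)
                        self._disk[entry["source"]] = entry["translation"]

    def get(self, source):
        """Return a cached translation, or None if an API call is needed."""
        if source in self._memory:
            self._memory.move_to_end(source)
            return self._memory[source]
        if source in self._disk:
            self._remember(source, self._disk[source])
            return self._disk[source]
        return None

    def put(self, source, translation):
        self._remember(source, translation)
        if source not in self._disk:
            self._disk[source] = translation
            record = {"source": source, "translation": translation}
            with self.path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(record, ensure_ascii=False) + "\n")

    def _remember(self, source, translation):
        self._memory[source] = translation
        self._memory.move_to_end(source)
        while len(self._memory) > self.capacity:
            self._memory.popitem(last=False)   # evict least recently used
```

Keying on the exact OCR text means only identical strings hit the cache, which is consistent with the description of identical OCR text returning the cached translation.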
Writes application status messages, thread events, API connection info, and error reports to translator_debug.log. This file captures operational diagnostics – not API call details (those are in the dedicated log files described in the API Usage section). Disable during normal gameplay for maximum performance.
Clear File Caches – wipes all cache files on disk.
Clear Translation Cache – clears in-memory LRU only.
Clear Debug Log – empties the log file.
Interface Language – bottom of the Settings tab
Switch the application UI between English and Polish. Changes take effect immediately without restarting.
Inject custom instructions into every API call – for both translation and OCR. Prompts are saved to disk and applied automatically once enabled.
Translation Prompt – Custom Prompt tab
Prepended to every translation instruction sent to Gemini and DeepL. Use it to define tone, style, or game-specific context – e.g. "You are translating a Star Wars game."
Toggle the setting without losing your saved text. Save writes to custom_prompt.txt. Reload re-reads from disk – useful if you edit the file externally.
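The prepend-and-toggle behaviour might look roughly like the sketch below. The file name `custom_prompt.txt` comes from the text above; the function names, the blank-line separator, and the shape of the base instruction are assumptions made for illustration.

```python
from pathlib import Path


def load_prompt(path="custom_prompt.txt"):
    """Re-read the saved prompt from disk; empty string if the file is missing."""
    p = Path(path)
    return p.read_text(encoding="utf-8").strip() if p.exists() else ""


def build_instruction(base_instruction, custom_prompt, enabled):
    """Prepend the custom prompt when the toggle is on and the prompt is non-empty."""
    if enabled and custom_prompt:
        return f"{custom_prompt}\n\n{base_instruction}"
    return base_instruction
```

Keeping the toggle separate from the stored text is what allows the prompt to be switched off without deleting it.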
OCR Prompt – Custom Prompt tab
A custom instruction injected into every AI OCR request – separate from the translation prompt. Use it to filter noise, ignore HUD elements, or enforce correct reading of ambiguous characters.
The OCR Prompt gives you direct control over what the AI model reads from the screen. The example below shows it filtering a complex GTA-style game scene: despite a busy HUD with ammo counters, a minimap, and on-screen labels, the AI returns only the dialogue subtitle – and strips the speaker’s name as instructed.
The OCR Prompt is injected into every AI OCR request as an additional instruction, on top of the standard transcription command. It is completely separate from the Translation Prompt. You can describe in plain English what the AI should focus on or ignore – but results depend on the model's ability to follow instructions and will vary. This is a prompt-engineering exercise, not a deterministic setting: expect to experiment, refine, and test across different scenes before finding what works for your game. The prompt is saved to disk and reloaded automatically on startup.
Real-time, per-call cost tracking for every AI and DeepL request. The API Usage tab contains four panels – scroll through them in the order shown below.
Statistics are read from the short log files (Gemini_OCR_Short_Log.txt, DeepInfra_OCR_Short_Log.txt, etc.). Data will reset if these files are deleted. Cost tracking is provided for reference purposes only. You remain solely responsible for monitoring your own API usage and costs through your own official billing accounts (Google, DeepInfra, or DeepL).
Panel 1 – Gemini Statistics
Total calls, token usage, and cumulative cost for the Gemini API. Tracks both translation and OCR tasks performed by Google models.
Panel 2 – DeepInfra Statistics
Statistics for DeepInfra models (such as Gemma). This panel tracks the significantly lower costs associated with the budget-friendly DeepInfra pipeline.
Panel 3 – Combined Statistics
Combines Google and DeepInfra pipelines into a single cost figure, providing an overview of total costs across different configurations.
Panel 4 – DeepL Usage Tracker
DeepL's free tier allows 500,000 characters/month. The tracker queries your account balance live from the DeepL API. Context subtitles are free and don't count against this quota.
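The per-minute, per-hour, and cumulative figures shown in the statistics panels can be derived from a list of per-call records. The sketch below shows one plausible way to do that arithmetic; the `CallRecord` structure and the `summarize` function are hypothetical, and the real log format is not assumed.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    timestamp_s: float   # when the call was made, in seconds
    cost_usd: float      # cost of that single call


def summarize(records):
    """Return call count, cumulative cost, and average cost per minute and hour."""
    if not records:
        return {"calls": 0, "total": 0.0, "per_minute": 0.0, "per_hour": 0.0}
    total = sum(r.cost_usd for r in records)
    times = [r.timestamp_s for r in records]
    elapsed_s = max(times) - min(times)
    # With a single call (or identical timestamps) no rate can be computed.
    per_minute = total / (elapsed_s / 60) if elapsed_s > 0 else 0.0
    return {
        "calls": len(records),
        "total": total,
        "per_minute": per_minute,
        "per_hour": per_minute * 60,
    }
```

A combined figure across providers would simply be the result of summarizing the concatenated records from each pipeline.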
In addition to the on-screen statistics panels, the application writes detailed log files to the installation directory. These files contain the complete record of every API call made during all sessions. Depending on the model you use, entries are written to either Gemini, DeepInfra, or DeepL log files.
Gemini_OCR_Short_Log.txt / DeepInfra_OCR_Short_Log.txt
One entry per call: model, cost, duration, and text result.
Gemini_OCR_Long_Log.txt / DeepInfra_OCR_Long_Log.txt
Full record including the complete prompt and raw API response.
Gemini_..., DeepInfra_..., or DeepL_...
Detailed statistics and the final translated text result.
Full record of the translation pipeline including system prompts and context.
Example OCR log entry – Gemini_OCR_Short_Log.txt
Example Translation log entry – Gemini_Translation_Short_Log.txt
To improve workflow, the application supports both global and local keyboard shortcuts. Shortcuts marked with (G) are global – they remain active even when the application is minimised or you are playing a full-screen game.
Shortcuts – visible in the Shortcuts sub-tab
To modify a shortcut, click its button in the Shortcuts tab and press your desired key combination. The system will automatically capture the input. If a shortcut is already assigned to another function, the button will turn red and display an "Already used" message. Note that certain system-critical combinations may be forbidden to prevent conflicts with Windows operations.
The global shortcuts are Start/Stop, Window Visibility, and Take Screenshot. All other shortcuts (Save, Clear Cache, Reset, etc.) are local and only work when the translator window is active.
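The conflict check performed when rebinding a shortcut can be sketched as a small registry. This is an assumption about the logic rather than the application's code; `ShortcutRegistry`, the action names, and the returned status strings other than "Already used" are hypothetical.

```python
class ShortcutRegistry:
    """Tracks key combinations and rejects duplicates or forbidden combos."""

    def __init__(self, forbidden=()):
        self._by_action = {}
        self._forbidden = {self._norm(combo) for combo in forbidden}

    @staticmethod
    def _norm(combo):
        # Case-insensitive and order-insensitive: "Alt+S" equals "s+alt".
        return "+".join(sorted(part.strip().lower() for part in combo.split("+")))

    def assign(self, action, combo):
        """Try to bind combo to action; return a status string for the UI."""
        key = self._norm(combo)
        if key in self._forbidden:
            return "Forbidden"
        for other_action, existing in self._by_action.items():
            if existing == key and other_action != action:
                return "Already used"
        self._by_action[action] = key
        return "OK"
```

Normalizing the combination before comparing avoids treating "Alt+S" and "alt+s" as different bindings.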
Version information, one-click update checking, and PRO licence activation – all in the About sub-tab.
About tab – version, updates, and PRO activation
Displays the installed version for quick reference.
Queries the GitHub API for the latest release. If a newer version is available, it is downloaded to a staging directory and applied automatically.
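An update check of this kind usually boils down to fetching the latest release tag and comparing it with the installed version. The sketch below uses GitHub's public "latest release" REST endpoint; the `owner` and `repo` arguments are placeholders, the tag format (for example `v4.1.0`) is an assumption, and the download and staging steps are not shown.

```python
import json
import urllib.request


def parse_version(tag):
    """Turn a tag such as 'v4.1.0' into a comparable tuple (4, 1, 0)."""
    return tuple(int(part) for part in tag.lstrip("vV").split("."))


def is_newer(latest_tag, installed_tag):
    """True if the latest release is strictly newer than the installed one."""
    return parse_version(latest_tag) > parse_version(installed_tag)


def fetch_latest_tag(owner, repo, timeout=10):
    """Ask the GitHub REST API for the tag name of the latest release."""
    url = f"https://api.github.com/repos/{owner}/{repo}/releases/latest"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["tag_name"]
```

Comparing numeric tuples rather than raw strings matters because, as strings, "4.10.0" would sort before "4.9.3". Tags with non-numeric suffixes would need extra handling that this sketch omits.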
Enter your PRO licence key and click Activate PRO Key to unlock all PRO features: DeepL, Find Subtitles, Scan Wider, Target on Source, colour pickers, and OCR Prompt.