Releases: machinewrapped/gpt-subtrans
Fixed Edit Instructions
Fixed a regression where the "Edit Instructions" button would reset the instructions to default.
- Gemini can return multiple response candidates, so the client now tries to choose one that actually has content. Responses where no candidate has content are handled more gracefully; this usually means Gemini has censored the request for that batch.
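For illustration, a minimal sketch of the candidate-selection idea, assuming the response/candidate structure of Google's Gemini SDK; the helper name is hypothetical and this is not the actual gpt-subtrans code:

```python
def choose_candidate_text(response):
    """Return text from the first candidate that actually has content.

    Returns None if no candidate has content, which usually means the
    request was censored for this batch.
    """
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        parts = getattr(content, "parts", None) or []
        text = "".join(getattr(part, "text", "") or "" for part in parts)
        if text.strip():
            return text
    return None
```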
Changed "Local Server" to "Custom Server"
NEW: repackaged the Windows zip because of a false positive on some antivirus software. Includes some tweaks to the themes.
Renamed "Local Server" provider to "Custom Server" - it was never a requirement that the server be local, so this makes it clearer that the provider can be used with any OpenAI-compatible API.
Added a max_completion_tokens option for Custom Server, since OpenAI are no longer accepting max_tokens for some of their own models. You should probably set one or the other or neither, not both.
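As a rough sketch of how the two options map onto a request (parameter names as in the OpenAI Python client; the base URL and model name are placeholders for whatever your server exposes):

```python
from openai import OpenAI

# Any OpenAI-compatible endpoint can be used as a "Custom Server"
client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")

params = {
    "model": "my-local-model",
    "messages": [{"role": "user", "content": "Translate these subtitle lines..."}],
}

# Set max_completion_tokens OR max_tokens (or neither), depending on what the
# server accepts - newer OpenAI models reject max_tokens, while some other
# servers do not recognise max_completion_tokens.
params["max_completion_tokens"] = 4096
# params["max_tokens"] = 4096

response = client.chat.completions.create(**params)
```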
Plus several arguably more important fixes for non-Windows platforms:
- Fixed the MacOS package builder
- Updated to latest PySide6 GUI modules
- Force the Qt theme to Fusion Light, for cross-platform compatibility with app themes
- Added a light Large theme
Removed boto3 from the packaged build to reduce the size - Bedrock is pretty niche, and if you can handle setting up AWS then you can definitely handle installing gpt-subtrans from source!
Fixed retries, updated instructions
Some of the APIs are unreliable at the moment so requests quite often need to be retried. I've cleaned up the retry mechanism for OpenAI-based clients (which includes DeepSeek) and fixed the Gemini retry so that both will retry in the event of the common API failures I am seeing.
I also added a reuse_client option for DeepSeek that defaults to False, meaning that a new connection will be established for each translation request. I noticed that the first request succeeds much more often than subsequent ones, and creating a new client for each request seems to improve the odds of success.
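The idea, sketched with the OpenAI client that DeepSeek's API accepts; the retry count, backoff and function name are illustrative rather than the project's actual implementation:

```python
import time
from openai import OpenAI, APIError, APITimeoutError

def request_translation(messages, api_key, max_retries=3):
    for attempt in range(max_retries + 1):
        # reuse_client=False behaviour: create a fresh client (and connection)
        # for every request, which seems to succeed more often than reusing one
        client = OpenAI(api_key=api_key, base_url="https://api.deepseek.com")
        try:
            return client.chat.completions.create(model="deepseek-chat", messages=messages)
        except (APIError, APITimeoutError):
            if attempt == max_retries:
                raise
            time.sleep(2 ** attempt)  # back off before retrying
```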
I also updated the custom instructions for OCR Errors and Whisper to match the format of the newer default instructions, as in my experience it seems to produce better results than the older instructions.
A MacOS version will be provided if I can get it to build, otherwise check previous versions to find one or install from source.
Support for reasoning models
- Enables OpenAI's reasoning models, o1 and o3, for translation.
- Switched to Google's newer genai SDK to enable support for Flash Thinking models
- Extracts reasoning content from the deepseek-reasoner model.
- Adds a new option under the OpenAI provider settings, reasoning effort - options are low, medium and high (see the sketch after this list).
- Adds a max_tokens setting for DeepSeek, enabling generation of the full 8192 output tokens the model supports, rather than the default 4096.
- Changed the default temperature for DeepSeek to 1.3, which is what they recommend for translation (not sure why, seems high to me!)
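A rough sketch of how those settings surface as API parameters, using the OpenAI Python client for both providers (model names and the prompt are illustrative; reasoning_effort requires a recent client version):

```python
from openai import OpenAI

messages = [{"role": "user", "content": "Translate these subtitle lines into English..."}]

# OpenAI reasoning models: no custom temperature, but reasoning effort is configurable
openai_client = OpenAI()
response = openai_client.chat.completions.create(
    model="o1",
    reasoning_effort="medium",   # low | medium | high
    messages=messages,
)

# DeepSeek via its OpenAI-compatible endpoint: larger output window and the
# temperature they recommend for translation
deepseek_client = OpenAI(api_key="...", base_url="https://api.deepseek.com")
response = deepseek_client.chat.completions.create(
    model="deepseek-chat",
    max_tokens=8192,     # full output window instead of the default 4096
    temperature=1.3,
    messages=messages,
)
```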
Since reasoning models are tuned for reasoning they may score lower for "creative" tasks like translation. However, they may perform better with subtitles that contain OCR or transcription errors.
Note that there are additional costs with reasoning models in the form of reasoning tokens that do not form part of the final translations. The number of reasoning tokens generated is extracted from the response when available and can be viewed by double-clicking a translated batch and selecting the "Response" tab.
If the reasoning content is included in the response there will also be a Reasoning tab so you can see how the model thought about translating the subtitles with the provided context. This is currently only available for DeepSeek as the other providers do not expose the model's reasoning in the API. For example, an excerpt of the reasoning for one batch:
Line 35: "一向是穿寬鞋賣大布" → "have always walked tall and proud," The idiom here is tricky. "穿寬鞋賣大布" literally is "wear wide shoes and sell big cloth," but idiomatically, it might mean acting confidently or with swagger.
Line 59: "2." – probably a scene number or a typo. Since subtitles don't usually have just numbers, maybe it's a misplaced line or part of a previous sentence. Maybe it's a misread of "二" (two), but in context, perhaps it's a card in a game, like "Two." But the next lines are about gambling ("梭" is "all-in" in poker). So maybe line #59 is "2." referring to a card, so translated as "Two."
Line 66: "今晚可以財色兼收了" – "財色兼收" means to obtain both wealth and beauty. So "Tonight I'll get both wealth and women!" fits the character's boastful tone.
o1 and o3 support a reasoning effort setting (available in the GUI settings when a reasoning model is selected). Higher effort implies higher hidden costs in the form of reasoning tokens.
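For reference, a sketch of where that information appears in a chat completion response - reasoning_content is DeepSeek's field for deepseek-reasoner, and reasoning token counts come from OpenAI's completion_tokens_details; the helper itself is illustrative:

```python
def extract_reasoning_details(response):
    """Pull optional reasoning information out of a chat completion response."""
    message = response.choices[0].message

    # deepseek-reasoner returns its chain of thought alongside the translation
    reasoning_text = getattr(message, "reasoning_content", None)

    # OpenAI reports hidden reasoning tokens in the usage details
    reasoning_tokens = None
    usage = getattr(response, "usage", None)
    details = getattr(usage, "completion_tokens_details", None) if usage else None
    if details is not None:
        reasoning_tokens = getattr(details, "reasoning_tokens", None)

    return reasoning_text, reasoning_tokens
```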
NOTE: No MacOS release for this version due to PyInstaller issues again. Use version 1.0.4 or install from source.
Fixed project model selection reset
tl;dr - changing the model in project settings should work correctly now.
Fixed selected model being reset when opening project settings.
Fixed selected model not being updated in project settings immediately, causing the selection to be ignored if it was reselected.
Fixed provider/model changes not being committed when using "Translate Selection" with project settings open.
Fixed invalid index being set if the model saved in a project is no longer available from the provider.
Added Amazon Bedrock (AWS) as a provider
Made Amazon Bedrock available as a translation provider. Note that this is NOT recommended for most users. The AWS setup process is a lot more complicated than getting an API key for other providers, and the available models (which depend on the AWS region) may or may not be able to handle translation tasks well or at all.
However, if you are used to using AWS and Bedrock, the option is now there. Let us know in the discussions section if you find any of the models it offers particularly good for translation.
It won't be receiving much official support, so any fixes or improvements might be best handled by the community of people who actually use it (if any!).
Added DeepSeek and Mistral providers
Support added for:
DeepSeek
Uses the OpenAI client library, but making it a separate provider allows saving a separate API key etc.
Only two models are currently exposed via the API, and deepseek-chat is probably the only one that's suitable for translation. It seems a little bit limited as a translator, with awkward translations and some mangled lines, but it is at least very cheap!
Mistral
Uses the MistralAI Python library.
Mistral offer a range of models, most of which are very specialised and presumably not suitable for translation (e.g. the code models).
open-mistral-nemo is the default if none is specified but it seems quite unreliable, and does not do well with larger batch sizes. mistral-large-latest may be a better choice.
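A minimal usage sketch with the MistralAI Python library (v1-style API; the API key and prompt are placeholders):

```python
from mistralai import Mistral

client = Mistral(api_key="YOUR_MISTRAL_API_KEY")

# mistral-large-latest copes better with larger batches than open-mistral-nemo
response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "Translate these subtitle lines into English..."}],
)

print(response.choices[0].message.content)
```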
Additionally:
OpenAI
The OpenAI provider now filters out vision, audio and realtime models since these are not suitable for translation.
Conversely, if no GPT models are found, the model list is returned unfiltered, so the provider can be used with any service that offers an OpenAI-compatible API by setting the base URL.
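The filtering logic is roughly this shape (a sketch, not the actual implementation - the substring checks are illustrative):

```python
def filter_translation_models(model_ids):
    """Drop models that cannot be used for text translation.

    If no GPT models are present, assume a third-party OpenAI-compatible
    endpoint and return the list unfiltered.
    """
    gpt_models = [m for m in model_ids if "gpt" in m.lower()]
    if not gpt_models:
        return model_ids

    excluded = ("vision", "audio", "realtime")
    return [m for m in gpt_models if not any(word in m.lower() for word in excluded)]
```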
Claude
Model list has been updated to the latest available models.
v1.0.1
No major changes in this release, just a bug fix, an update to the default instructions and the latest versions of dependent libraries.
Since the program has been stable for quite some time and has quite good test coverage now, I've decided it's time to come out of beta and make this the official v1.0 release for gpt-subtrans!
Fixed: Stop On Error failing to abort the translation when there are errors
Full Changelog: v0.8.4...v1.0.1
Option to add RTL markers to translated output
Optionally adds Unicode RTL markers to lines which contain primarily Right-To-Left script, which can improve the display and formatting of subtitles in e.g. Arabic. Has no effect if less than 50% of the line is in an RTL script.
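Roughly how the check behaves, sketched with Python's unicodedata module (the Right-To-Left Mark is one possible marker, the helper name is illustrative, and the 50% threshold is as described above):

```python
import unicodedata

RLM = "\u200f"  # Unicode RIGHT-TO-LEFT MARK

def add_rtl_marker(line: str) -> str:
    """Prefix the line with an RTL mark if most of its letters are right-to-left."""
    letters = [ch for ch in line if ch.isalpha()]
    if not letters:
        return line
    rtl = sum(1 for ch in letters if unicodedata.bidirectional(ch) in ("R", "AL"))
    return RLM + line if rtl / len(letters) >= 0.5 else line
```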
Also adds proxy support for Claude API calls, and adds the latest Claude models to the supported model list.
All libraries and modules updated to the latest version except for PySide6, which has seemingly added support for detecting system light/dark themes - in a way that wreaks total havoc on the app's built-in themes 🙄
Load/save instructions from appdata folder
Minor update:
- Fix for subtitles that start at 0:00:00,000 being lost
- Load/Save instructions files in application data folder so that they survive update/reinstall
- Support for Claude 3.5 Sonnet, and an option to manually specify future Anthropic models
Full Changelog: v0.8.2...v0.8.3