Intro
Although AI models now dominate our digital ecosystems and help localization folks move faster between production workflows, noble people in the software engineering space continue to develop handy tools that take little effort to master, cost zero dollars, and need very little computing power.
Okapi Framework’s Rainbow is a great example. Briefly, Rainbow runs chains of batch tasks on text-based files. In other words, you get a convenient UI to mix and match Java-based text manipulation steps.
Although its usage documentation is comprehensive (I love it), I realized it would be nice to share some concrete use cases from my personal experience, particularly for file preparation and troubleshooting.
Find and Replace Corrupted Encodings
You are working with a big company whose Android and iOS app has a monthly release cadence. This app is used by more than 5 million customers. One day, the Sr. Product Manager, in a panic, shares a spreadsheet exported from the string repository. All characters outside the ANSI character set are corrupted.
The next release is in 15 days, and you have about 3,500 strings to localize and update. So, what can you do?
- Create a mapping of the corrupted characters and their correct counterparts in a text file. To this end, you will need translator and copywriter input. The key here is that every search-replace pair you add gets the next sequential index. Use the template below.
#v1
regEx.b=false
dotAll.b=false
ignoreCase.b=false
multiLine.b=false
target.b=true
source.b=false
replaceALL.b=true
replacementsPath=
logPath=${rootDir}/replacementsLog.txt
saveLog.b=false
count.i=2 # Update this number to match the total number of replacement entries (two in this template). #
use0=true # This activates the entry for replacement. #
search0=something
replace0=something
use1=true # Every entry index grows sequentially (1, 2, 3, 4). #
search1=hola
replace1=hello
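If your mapping has dozens of pairs, numbering them by hand gets tedious. Here is a minimal Python sketch that builds the same text file from a list of pairs. The key names are copied from the template above, and the function name build_replacement_params and the sample pairs are my own placeholders, so treat it as a starting point rather than an official Okapi utility.

```python
# Fixed settings copied from the template above.
HEADER = """#v1
regEx.b=false
dotAll.b=false
ignoreCase.b=false
multiLine.b=false
target.b=true
source.b=false
replaceALL.b=true
replacementsPath=
logPath=${rootDir}/replacementsLog.txt
saveLog.b=false"""


def build_replacement_params(pairs):
    """Return the parameter file text for a list of (search, replace) pairs."""
    # count.i must equal the number of entries that follow.
    lines = [HEADER, f"count.i={len(pairs)}"]
    for index, (search, replace) in enumerate(pairs):
        # Entry indexes start at 0 and grow sequentially.
        lines.append(f"use{index}=true")
        lines.append(f"search{index}={search}")
        lines.append(f"replace{index}={replace}")
    return "\n".join(lines) + "\n"


# Hypothetical sample pairs: corrupted sequence first, correct character second.
sample_pairs = [("Ã©", "é"), ("Ã±", "ñ")]
print(build_replacement_params(sample_pairs))
```

Save the printed text as a UTF-8 file so the corrected characters do not get corrupted again on the way into Rainbow.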
In Rainbow, drag and drop your CSV, XLSX, or XLIFF file. Do not edit the file parser/filter because the defaults will suffice for this operation. Make sure to set the encoding to UTF-8 and select your target language(s).
Under Utilities, use the Search Replace With Filter option to create a quick pipeline:
- Raw file to filter events: This makes your file temporarily manageable for text operations.
- Search and Replace: Where you will import your text file.
- Filter events to raw file: Assembles the source file with the modifications.
Open the Search and Replace step and import your text file. This will be your replace list.
Execute the pipeline and you will have successfully replaced every corrupted character in one batch.
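Before handing the file back, it is worth checking that no corrupted sequence survived. The pipeline above can be followed by a quick scan like this one, a small Python sketch that assumes you have the output as plain text and reuses the search strings from your mapping. The function name find_leftovers and the sample text are hypothetical.

```python
def find_leftovers(text, corrupted_sequences):
    """Return (line_number, sequence) tuples for sequences still in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for sequence in corrupted_sequences:
            if sequence in line:
                hits.append((line_number, sequence))
    return hits


# Hypothetical output where the second line was not fixed.
sample_output = "Café\nNiÃ±o\n"
print(find_leftovers(sample_output, ["Ã©", "Ã±"]))  # [(2, 'Ã±')]
```

An empty list means none of the sequences in your mapping are left. Anything else tells you which line to look at and which pair to review.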
Back to Basics
See? You didn’t have to fight with your favorite LLM or RLM to accurately execute your prompt. Text manipulation scripts have been with us for decades, and they require <1% of CPU capacity to run.
I hope you have found what you were looking for, and please feel free to send me a DM via LinkedIn or a quick email if you are struggling with a task and would like to find a quick and cheap solution.
As this article grows, I will place a handy table of contents at the top for quick navigation.
🙋🏻‍♂️👋🏻