How to import bilingual source text into CAT tool
Thread poster: David Oliver
David Oliver
David Oliver  Identity Verified
United States
Local time: 07:02
Spanish to English
Oct 24, 2022

I have been asked to translate an expanded new version of a document I translated before. The author has added sentences in Spanish throughout the English translation. I would rather not translate it in Word. I'd prefer to import it into my CAT tool (DVX3) to ensure consistent terminology.

What's the best way to approach this import?

Any suggestions are much appreciated.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 17:02
English to Russian
Bilingual source text Oct 24, 2022

David Oliver wrote:
a document I translated before
Did you translate it with a CAT tool? And what is "bilingual source text"? Do you have a 2-column table in source and target languages? If yes, you can use Heartsome TMX Editor (freeware) to convert the table into a tmx file that you can then import into DVX3.
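If you do end up with a 2-column table, the conversion to TMX can also be scripted. Below is a minimal sketch of the idea, not what Heartsome TMX Editor does internally. It assumes the rows have already been read into (source, target) pairs, and the language codes, the function name `rows_to_tmx` and the `creationtool` value are placeholders of mine.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the built-in XML namespace
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rows_to_tmx(rows, src_lang="es-ES", tgt_lang="en-US"):
    """Build a TMX 1.4 document from (source, target) text pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "table2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "table",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in rows:
        if not source.strip() or not target.strip():
            continue  # skip rows with an empty cell
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text.strip()
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    print(rows_to_tmx([("Hola mundo.", "Hello world.")]))
```

Whether DVX3 is happy with the language codes used here is something to check against your own project settings.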


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:02
Member (2006)
English to Afrikaans
Mark as hidden? Oct 24, 2022

In many CAT tools, you can hide text of a Word file from the CAT tool by marking the text as hidden. So, if you were to mark all English text as hidden, and leave the Spanish text as non-hidden, the CAT tool might display only the Spanish text for you to translate.

(I assume you did the previous translation in the CAT tool and that you have a TM for it.)


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Lucky Joe Oct 24, 2022

If you're lucky and you have a version of MS Word where automatic language detection still works (it is broken in the Mac version), you can have Word hide all English table cells:

[Screenshot]


 
David Oliver
David Oliver  Identity Verified
United States
Local time: 07:02
Spanish to English
TOPIC STARTER
Trying to get the game plan... Oct 24, 2022

Thanks for the replies!

I have DVX3, which is what I used to translate the first version of this text a couple years ago.

This hidden text option is intriguing. Let me see if I understand this: Would the process be as follows?

1) take the new bilingual file (with Spanish sentences mixed in with the English document)
2) import it into DVX3
3) insert the source segments into the target column
4) export it so that I have the full text in DVX3-formatted columns
5) open the exported 2-column doc in Word & do the Find/Replace to hide English text
6) re-import the document into DVX3

...and then, will that skip (i.e., not import) the English text that was just hidden? If so, would that give me a disjointed Spanish-only text of sentences that aren't necessarily contiguous in the bilingual document?

Sorry for the questions, just not sure how the whole process should go.

Thanks again for your expertise!


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Trick Oct 25, 2022

Can you create an en-US_en-US TMX file?

Either from the original project or by running an SQL command on the DVMDB? Or by editing a bilingual export from the original project (replace source language column with a copy of the target language column, adjust the column header).

Import the modified bilingual RTF file to an empty DVMDB. Use this to populate the English segments.

So, change this:

[Screenshot]

To this:

[Screenshot]
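For the TMX route, the same "replace source with a copy of the target" idea can be sketched in a few lines of Python. This is only an illustration under assumptions: it expects a TMX 1.4 file that marks languages with `xml:lang`, the function name `make_same_language_tmx` is made up, and whether DVX3 accepts a TMX with the same language on both sides is for you to verify.

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_same_language_tmx(tmx_text, keep_lang="en-US"):
    """Return a TMX in which every unit pairs the keep_lang segment with itself."""
    root = ET.fromstring(tmx_text)
    root.find("header").set("srclang", keep_lang)
    body = root.find("body")
    for tu in list(body):
        kept = [tuv for tuv in tu.findall("tuv")
                if tuv.get(XML_LANG, "").lower() == keep_lang.lower()]
        if not kept:
            body.remove(tu)  # no variant in keep_lang, nothing to reuse
            continue
        for tuv in tu.findall("tuv"):
            tu.remove(tuv)  # drop all language variants...
        tu.append(kept[0])  # ...then put the kept one back as "source"
        tu.append(copy.deepcopy(kept[0]))  # and a copy of it as "target"
    return ET.tostring(root, encoding="unicode")
```

Pretranslating the new file against such a TM should then fill the unchanged English sentences with exact matches, leaving only the Spanish additions open.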


[Edited at 2022-10-25 06:41 GMT]


 
David Oliver
David Oliver  Identity Verified
United States
Local time: 07:02
Spanish to English
TOPIC STARTER
Thanks for the suggestions! Oct 25, 2022

Thank you so much for the instructions. I will try this method. Sounds like a good option.

Cheers!


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:02
Member (2006)
English to Afrikaans
@David Oct 25, 2022

David Oliver wrote:
Would that give me a disjointed Spanish-only text of sentences that aren't necessarily contiguous in the bilingual document?

Yes, that is the downside of this approach.

In most CAT tools, excluded text will not be visible to you in the CAT tool (the one exception that I know of is WFC). It seems such a logical thing for CAT tools to be able to do (i.e. import hidden text but show it as locked segments), but AFAIK no CAT tool can actually do that, so you're stuck with only seeing the non-hidden text in the CAT tool.

David Oliver wrote:
1) take the new bilingual file (with Spanish sentences mixed in with the English document)
2) import it into DVX3
3) insert the source segments into the target column
4) export it so that I have the full text in DVX3-formatted columns
5) open the exported 2-column doc in Word & do the Find/Replace to hide English text
6) re-import the document into DVX3

I don't know DVX well enough to know what your steps will do, but it sounds to me like... no, your steps will not accomplish anything.

What I meant was: in the source file (i.e. the DOCX that you received from the client), hide the text that you want to be excluded. (This includes all paragraphs that don't have any Spanish text in them.) Then open the DOCX file in the CAT tool. Then translate the Spanish text, and recreate the final DOCX file, and then unhide all hidden text again.

Of course, this all depends on whether DVX excludes hidden text (though most CAT tools do). And you need to figure out how to hide text, how to view hidden text, and how to unhide text in Microsoft Word.
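For the curious, this is roughly what "hidden" means inside a DOCX: Word adds a `w:vanish` element to the run properties of the hidden text. The sketch below does that for runs carrying an explicit English `w:lang` tag, and removes it again afterwards. It is an illustration under assumptions, not a replacement for Word's own Find and Replace: it only sees runs with an explicit language tag (runs that inherit their language from a style are skipped), the function names are mine, and re-serialising a real `word/document.xml` with ElementTree can disturb its other namespace declarations, so try it on a copy only.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
ET.register_namespace("w", W_NS)  # keep the usual "w:" prefix on output


def w(tag):
    """Qualify a tag or attribute name with the WordprocessingML namespace."""
    return f"{{{W_NS}}}{tag}"


def hide_runs_by_language(document_xml, lang_prefix="en"):
    """Add <w:vanish/> to runs whose explicit w:lang starts with lang_prefix."""
    root = ET.fromstring(document_xml)
    hidden = 0
    for run in root.iter(w("r")):
        rpr = run.find(w("rPr"))
        if rpr is None:
            continue  # no direct formatting, so no explicit language tag
        lang = rpr.find(w("lang"))
        if lang is None:
            continue
        if not lang.get(w("val"), "").lower().startswith(lang_prefix):
            continue
        if rpr.find(w("vanish")) is None:
            # Appended at the end; element order inside w:rPr is not checked.
            ET.SubElement(rpr, w("vanish"))
            hidden += 1
    return ET.tostring(root, encoding="unicode"), hidden


def unhide_all_runs(document_xml):
    """Remove every <w:vanish/> so that all text is visible again."""
    root = ET.fromstring(document_xml)
    for rpr in list(root.iter(w("rPr"))):
        for vanish in rpr.findall(w("vanish")):
            rpr.remove(vanish)
    return ET.tostring(root, encoding="unicode")
```

In Word itself the equivalent is a Find and Replace on formatting (language as the search criterion, Font > Hidden as the replacement), which is the safer route for a real client file.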

[Edited at 2022-10-25 07:47 GMT]


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 17:02
English to Russian
DVX3 Oct 25, 2022

[Struck through:] If you have the previous text translated with DVX3, then I suppose the translation is stored in your TM. Am I right? Then you can simply open the new file and pretranslate it with DVX3. All 100% matches will be there, and you only have to translate the new content.
Ah, shame on me. I misunderstood it all. I am sorry.
Here is a short video on how to hide text in a specific language.

[Edited at 2022-10-25 09:40 GMT]


 
David Oliver
David Oliver  Identity Verified
United States
Local time: 07:02
Spanish to English
TOPIC STARTER
Hidden text method worked for this one Oct 26, 2022

Thanks for your helpful suggestions and support, Samuel, Hans, Stepan.

The hidden text method was something I hadn't thought of. Turns out it worked well on this one, because the original had been changed by the author (without telling me), and I have to fix a lot of that, so just translating the new Spanish additions turned out to be easier.

I'm just doing those sections in DVX3 and pasting them into the Word doc. I think this document would have been pretty messy if I had tried to break it up in different pieces and import/export all the pieces. I'll get through it without too much trouble now.

Much appreciated, guys!

BTW, what CAT tools do you all use these days? It looks like Atril scrapped their DVX4 release, which I heard was coming out in 2020...


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
CafeTran Espresso: the OPEN tool Oct 26, 2022

David Oliver wrote:

BTW, what CAT tools do you all use these days? It looks like Atril scrapped their DVX4 release, which I heard was coming out in 2020...


I left Déjà Vu when .doc was still dominant. Atril handled .doc via .rtf, which led to gigantic files and lots of trouble. The .docx filter still had to be created.

I went for Transit NXT, which had superb filters for Ms Office. I left Transit because of the buggy implementation of the SQL termbase driver.

Then came CafeTran Espresso, an open tool that runs on Mac and Linux too. By 'open' I mean:

  • terminology stored in plain-text tab-delimited files that you can edit directly with a spreadsheet program or text editor
  • XLIFF (like Trados) instead of a database (like memoQ, Déjà Vu) for projects
  • TMX (like OmegaT) instead of a database (like memoQ, Déjà Vu, Trados) for TMs


My data are safe and directly accessible. No risk of database corruption or need for re-indexation.
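To illustrate what "directly accessible" buys you: a plain-text tab-delimited glossary can be read with a few lines of any scripting language. The sketch below assumes the simplest possible layout (source term in the first column, target term in the second, anything further ignored); that layout and the name `load_glossary` are assumptions for the example, not a description of CafeTran's actual file format.

```python
import csv
import io


def load_glossary(text):
    """Parse tab-delimited term pairs into a dict; extra columns are ignored."""
    glossary = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank lines and rows without a target term
        glossary[row[0].strip()] = row[1].strip()
    return glossary


if __name__ == "__main__":
    print(load_glossary("casa\thouse\nperro\tdog\n"))
```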

I'm looking forward to the announced cross-platform preview that works wherever CafeTran is installed!


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 16:02
Member (2006)
English to Afrikaans
@David, off-topic Oct 26, 2022

David Oliver wrote:
BTW, what CAT tools do you all use these days?

For 99% of my work, I use WFC 6. However, I do have licenses for MemoQ, Trados and WFP, and I regularly do "CAT hopping" between the different tools. Recent developments in CAT tools have made CAT hopping easier (and safer!) in many cases.

A few years ago, if I wanted to translate an SDLXLIFF file that was fully untranslated, I would open it in Trados, copy all source to target, run an external script that added "[[" and "]]" on either side of the target text, and then open it in Word as plain text, run a macro that marked untranslatable text as tw4winExternal, and then translate it in WFC. Then save the result as plain text again, and open it in Trados just to check that there are no technical glitches.

These days, for the same kind of file, I open it in Trados, copy all source to target, then open the SDLXLIFF file in WFP3, export a bilingual table, and translate the second column in WFC (and then reverse the steps in the end). Much faster and much safer. If the SDLXLIFF file is partially translated, then I do the same thing, except with MemoQ instead of WFP3.

MemoQ's bilingual export feature works reasonably well (it only fails about 1-2% of the time), but WFP3's export/import is both faster and safer than MemoQ's, except that WFP3 can't handle non-source=target files properly. Trados' own bilingual export feature is abysmal (it regularly fails to import whole documents after translation, which leads to a last-minute struggle to get the text back into it via some other hack).


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Pre-segmentation Oct 27, 2022

Samuel Murray wrote:

These days, for the same kind of file, I open it in Trados, copy all source to target, then open the SDLXLIFF file in WFP3, export a bilingual table, and translate the second column in WFC (and then reverse the steps in the end).


My CAT tool needs this kind of pre-segmentation too.

It would be nice if this could be done without using Trados, preferably from the command line and on macOS too.
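Just to show the shape of what such a command-line step would do, here is a deliberately naive sketch: split plain text into sentences and emit a two-column table with the source copied to the target. It works on plain text only, not on SDLXLIFF, and the regex is a crude stand-in for real segmentation rules (no abbreviation handling, no SRX), so treat the function names and the splitting rule as assumptions for illustration.

```python
import re

# Break after ., ! or ? when followed by whitespace and an uppercase letter
# or an opening Spanish mark. Abbreviations will be split wrongly.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+(?=[A-ZÁÉÍÓÚÑ¿¡])")


def presegment(text):
    """Return (source, target) rows with the source copied to the target."""
    rows = []
    for paragraph in text.splitlines():
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        for sentence in _SENTENCE_END.split(paragraph):
            rows.append((sentence, sentence))
    return rows


def to_table(rows):
    """Serialise the rows as a tab-delimited two-column table."""
    return "\n".join(f"{src}\t{tgt}" for src, tgt in rows)


if __name__ == "__main__":
    import sys
    print(to_table(presegment(sys.stdin.read())))
```

Saved as a script, it could be run from a terminal on macOS with the text piped in on standard input.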


 

