Translating HTML files
Thread poster: Wolfgang Schoene
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 18:10
Member (2007)
English to German
Nov 7, 2021

Hi, I don't know if this is the right forum to post my question.
I will probably have to translate a number of HTML files, I've never dealt with this file type so I'm asking which CAT tool is the "best" to deal with this file type.


 
Thomas T. Frost
Thomas T. Frost  Identity Verified
Portugal
Local time: 17:10
Danish to English
Source layout Nov 7, 2021

Wolfgang Schoene wrote:

Hi, I don't know if this is the right forum to post my question.
I will probably have to translate a number of HTML files, I've never dealt with this file type so I'm asking which CAT tool is the "best" to deal with this file type.


Some CAT tools mess around with the layout of the source and join several lines, which is something, as an old programmer, I abhor.

A memoQ developer sent me an HTML filter to avoid that, but I don't think it has been integrated in the product. I can send it to anyone who wants it, though.

I think some of the free CAT tools can do it too, but I've forgotten which ones. I think it was OmegaT or Cafetran.


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Quick test with CafeTran Espresso Nov 7, 2021

Saved the Proz page with your post as an HTML file and imported it in CafeTran Espresso. Locked all segments with " (for a quick test) and replaced some letters with an "m".

The exported result looks good:

https://www.dropbox.com/s/l4ha315k3081w64/354223-translating_html_files_nl-NL.html?dl=0

Perhaps the filter could be tweaked to hide all refs (or one could probably define a hidden non-translatable).

One button label was missed:

Screen Shot 2021-11-07 at 21.23.30

Probably other CAT tools will perform well too, since HTML is such an important file format.


[Edited at 2021-11-07 20:30 GMT]


 
Milan Condak
Milan Condak  Identity Verified
Local time: 18:10
English to Czech
OmegaT Nov 7, 2021

Thomas T. Frost wrote:
I think it was OmegaT or Cafetran.


I use OmegaT for individual HTML files or for the entire downloaded website.

Milan


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 19:10
English to Russian
Trados 2021 Nov 7, 2021

As translated by OPUS MT:
2021-11-08_021837


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
@Stepan Nov 8, 2021

Is the "Tell a friend" button label presented for translation, in the Studio project?

Screenshot 2021-11-08 at 07.36.47

[Edited at 2021-11-08 06:37 GMT]


 
Milan Condak
Milan Condak  Identity Verified
Local time: 18:10
English to Czech
OmegaT and MT on premise Nov 8, 2021

Thomas T. Frost wrote:

I think some of the free CAT tools can do it too, ....


@Wolfgang

OmegaT can use "OPUS MT", too.

@Stepan

Wolfgang's Working languages:
English to German
French to German
Italian to German

MT is another cup of tea

Downloadable online models from MT OPUS for Russian as target language: there are 5 models, two are for EN-RU pair.

Source language   Target language   Model name

English           Russian           opus-2020-02-11
French            Russian           opus-2020-01-24
English           Russian           opus+bt-2021-04-14
Armenian          Russian           opus-2021-02-23
Slovenian         Russian           opus-2021-02-18

For Czech as target language I can download 16 models.

Downloadable online models from Fiskmo MT: with Russian as target language there are 7 models, two of which are for the EN-RU pair.

en-ru opus-2020-01-16
en-ru opus-2020-02-11 (the same as from MT OPUS)
es-ru opus-2020-01-20
fi-ru opus-2020-01-26
fi-ru opus-2020-04-12
fr-ru opus-2020-01-24 (the same as from MT OPUS)
sv-ru opus-2020-01-16

The models for Fiskmo and MT OPUS were created by the same tool.

Milan


[Edited at 2021-11-08 07:32 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 18:10
Member (2006)
English to Afrikaans
Most of them Nov 8, 2021

Wolfgang Schoene wrote:
I will probably have to translate a number of HTML files, I've never dealt with this file type so I'm asking which CAT tool is the "best" to deal with this file type.


HTML is one of the first formats to be supported beyond plain text by any CAT tool. Most CAT tools can handle it, and can handle it well.

One problem is when the HTML contains code from another language (this is perfectly acceptable in HTML). The CAT tool's HTML filter may not be able to handle that other language, and should therefore treat it as untranslatable, which may mean that parts of the file that should be translated remain untranslated. If the CAT tool's HTML filter is good, not translating such content should not break the file, but the text in that other language will remain untranslatable.
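To see whether a file contains such embedded code before importing it, the script and style blocks can be listed up front. The following is a minimal sketch using Python's standard `html.parser`; the class and function names are made up for illustration, and it only collects the blocks so that a human can check them for translatable strings.

```python
from html.parser import HTMLParser


class EmbeddedCodeFinder(HTMLParser):
    """Collects the contents of <script> and <style> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # list of (tag, content) tuples
        self._current = None  # tag name while inside a script/style element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        # Only keep text that sits inside a script/style element.
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            self.blocks.append((tag, "".join(self._buffer).strip()))
            self._current = None


def find_embedded_code(html_text):
    """Return the embedded script/style blocks found in html_text."""
    finder = EmbeddedCodeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.blocks


if __name__ == "__main__":
    sample = '<p>Hi</p><script>setLabel("Send");</script>'
    for tag, content in find_embedded_code(sample):
        print(tag, "->", content)
```

Any user-visible string that shows up in this list (such as the label in the sample) is a candidate for text that an HTML filter may not offer for translation.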

Will the client be sending you the HTML files, or will you have to download the files from some web site yourself?


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Language Nov 8, 2021

Samuel Murray wrote:

One problem is when the HTML contains code from another language (this is perfectly acceptable in HTML).


Language as in 'markup language' or 'scripting language'.


 
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 18:10
Member (2007)
English to German
TOPIC STARTER
Translating HTML files Nov 8, 2021

Samuel Murray wrote:

Wolfgang Schoene wrote:
I will probably have to translate a number of HTML files, I've never dealt with this file type so I'm asking which CAT tool is the "best" to deal with this file type.


HTML is one of the first formats to be supported beyond plain text by any CAT tool. Most CAT tools can handle it, and can handle it well.

One problem is when the HTML contains code from another language (this is perfectly acceptable in HTML). The CAT tool's HTML filter may not be able to handle that other language, and should therefore treat it as untranslatable, which may mean that parts of the file that should be translated remain untranslated. If the CAT tool's HTML filter is good, not translating such content should not break the file, but the text in that other language will remain untranslatable.

Will the client be sending you the HTML files, or will you have to download the files from some web site yourself?


Thanks to all those who replied to my question.
@Samuel: The client will send me about 25 files, so I don't have to download them from the web.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 19:10
English to Russian
Tell a friend Nov 8, 2021

Milan Condak wrote:
@Stepan
Wolfgang's Working languages:
English to German
French to German
Italian to German
MT is another cup of tea
Downloadable online models from MT OPUS for Russian as target language: there are 5 models, two are for EN-RU pair.
@Milan Condak
I didn't catch why you mentioned that. I just meant that I quickly ran an MT engine and that the translation is not mine (i.e. quality not guaranteed).

@German Dutch Engineering Translation
German Dutch Engineering Translation wrote:
Is the "Tell a friend" button label presented for translation, in the Studio project?
No, this phrase is not there.

2021-11-08_163751

[Edited at 2021-11-08 13:38 GMT]


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Transit Nov 8, 2021

Transit does show the Tell a friend label:

Screen Shot 2021-11-08 at 21.27.10


 
Sara Massons
Sara Massons  Identity Verified
France
Local time: 18:10
Member (2016)
English to French
Why HTML files? Nov 8, 2021

Hello,

Sorry if I drift a little bit from the initial question here but I'm really wondering why you should deal with HTML files at all.

I have some programming and web development knowledge myself and I often translate content for the web, but nobody has ever asked me to deal directly with HTML files. Modern websites are supposed to be based on CMS tools which allow you or your client to input only text in the editable part of the site, and usually the navigation and decorative parts can be managed with independent resource files that can often be edited as simple spreadsheets. It is the same for software and mobile apps.

My clients either use an online localization tool which allows localization into multiple (human) languages at the same time and instant publication by the PM once validated, or they send me text or spreadsheets that I can easily import into my CAT tool. I guess they probably have a few style adjustments to make afterwards when importing this into their CMS, but it surely takes less time than implementing whole HTML files... and possibly dealing with code disruptions accidentally made during the translation process.

What does your "HTML files" client do when they want to change only a few things somewhere? Do they send you the whole file again? Or maybe I'm completely wrong and these HTML files are used for another purpose...


 
John Di Rico
John Di Rico  Identity Verified
France
Local time: 18:10
Member (2006)
French to English
Pseudotranslate Nov 8, 2021

Hi Wolfgang,

I suggest pseudo-translating these files using different tools if you have access to them, then open them in a browser and evaluate the results.
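If none of the tools at hand has a pseudo-translation feature, the idea can be tried with a few lines of Python. This is only a rough sketch using the standard `html.parser` module, not what any CAT tool actually does: it swaps vowels for accented vowels in visible text, leaves tags, entities and script/style content alone, and keeps the original line breaks. The names `PseudoTranslator` and `pseudo_translate_html` are invented for this example. Attribute values (for instance `alt` or meta `content`) are deliberately not touched here.

```python
from html.parser import HTMLParser

# Replace plain vowels with accented ones so "translated" text is easy to spot.
_ACCENTS = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


class PseudoTranslator(HTMLParser):
    """Re-emits the HTML it parses, altering only visible text (illustrative sketch)."""

    def __init__(self):
        # Keep entities such as &amp; as separate events so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.out = []
        self._raw = None  # set to "script" or "style" while inside such an element

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # original tag text, verbatim
        if tag in ("script", "style"):
            self._raw = tag

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())  # e.g. <br/>

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")  # note: the tag name is lower-cased by the parser
        if tag == self._raw:
            self._raw = None

    def handle_data(self, data):
        self.out.append(data if self._raw else data.translate(_ACCENTS))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def pseudo_translate_html(html_text):
    parser = PseudoTranslator()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(pseudo_translate_html("<p>Hello &amp; bye<br/></p>"))
```

Opening the output in a browser then shows at a glance which strings were reached and which were not.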

Make sure you verify filter settings beforehand (what encoding do they use?) and double check the final files to make sure that meta tags were extracted for translation (open them with a text editor and you will see meta tags in the header).
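As a quick aid for that check, the declared encoding and the meta tags that usually carry translatable text can be listed with a short script. The sketch below uses Python's standard `html.parser`; the function name `collect_meta` is invented here, and treating only `description` and `keywords` as translatable is an assumption that may not match a given project.

```python
from html.parser import HTMLParser

# Assumption: these are the meta names whose content is normally translated.
TRANSLATABLE_META = ("description", "keywords")


class MetaCollector(HTMLParser):
    """Collects the declared charset and translatable meta content (illustrative)."""

    def __init__(self):
        super().__init__()
        self.charset = None
        self.translatable = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if "charset" in attr:
            # HTML5 style: <meta charset="utf-8">
            self.charset = attr["charset"]
        elif attr.get("http-equiv", "").lower() == "content-type":
            # Older style: <meta http-equiv="Content-Type" content="text/html; charset=...">
            content = attr.get("content", "")
            if "charset=" in content.lower():
                self.charset = content.lower().split("charset=", 1)[1].strip()
        name = attr.get("name", "").lower()
        if name in TRANSLATABLE_META:
            self.translatable[name] = attr.get("content", "")


def collect_meta(html_text):
    collector = MetaCollector()
    collector.feed(html_text)
    collector.close()
    return {"charset": collector.charset, "translatable": collector.translatable}


if __name__ == "__main__":
    page = '<head><meta charset="utf-8"><meta name="description" content="A test page"></head>'
    print(collect_meta(page))
```

Running it on both the source and the translated file makes it easy to see whether the description and keywords still hold source-language text.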

Best,
John

PS: @Sara makes a good point. Once upon a time I translated HTML files whose content should have been extracted from the CMS into an Excel, CSV, XML, or XLIFF file.


 
Thomas T. Frost
Thomas T. Frost  Identity Verified
Portugal
Local time: 17:10
Danish to English
Also open the translated HTML file in a plain-text editor Nov 9, 2021

John Di Rico wrote:

open them in a browser and evaluate the results.



It's also a good idea to open the HTML files in a plain-text editor such as Notepad++ to check how the HTML code has been treated.

Look at the example below.

Screenshot 2021-11-09 080755

The source file is easy to edit manually, thanks to the line breaks. The target file is not, as the CAT tool (memoQ in this case) has joined several lines (due to the long line length, you can't see the end of it, as it's off the screen). This can be very annoying when one maintains the HTML manually. It's so annoying, in fact, that I decided not to use memoQ for my own website. And no, I don't have a CMS or any other advanced software for a small website. I just maintain it with a simple HTML editor.
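When there are many files, the same check can be done without opening each one: comparing line counts and the longest line of source and target is usually enough to reveal joined lines. A minimal sketch follows; the function name `layout_report` is hypothetical, and fewer lines in the target is only a hint, not proof, that the tool joined lines.

```python
def layout_report(source_text, target_text):
    """Compare the line layout of a source HTML file and its translated export."""
    src_lines = source_text.splitlines()
    tgt_lines = target_text.splitlines()
    return {
        "source_lines": len(src_lines),
        "target_lines": len(tgt_lines),
        "longest_source_line": max((len(line) for line in src_lines), default=0),
        "longest_target_line": max((len(line) for line in tgt_lines), default=0),
        # Heuristic: a target with fewer lines than the source suggests joined lines.
        "lines_joined": len(tgt_lines) < len(src_lines),
    }


if __name__ == "__main__":
    source = "<p>\n  One\n</p>\n<p>\n  Two\n</p>\n"
    target = "<p> Eins </p><p> Zwei </p>\n"
    print(layout_report(source, target))
```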

Trados does something similar, I believe.

memoQ recently provided me with an import filter to avoid this problem, but I don't think it's generally available.

In any case, if you intend to use a CAT tool that 'mangles' the HTML layout, it would be wise to ask the client first whether they mind. I would complain if I received a translated file back with joined lines, and fixing that manually in several files would be very time-consuming.


 