AutoIt script: extract TM from Memsource Editor for Web
Thread poster: Samuel Murray
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:13
Member (2006)
English to Afrikaans
+ ...
Feb 25, 2020

Hello everyone

I have a lovely agency client whose own client is forcing us to use Memsource Editor for Web and give them fuzzy match discounts, but we don't have access to the TM from which these fuzzy matches are drawn. I mean, we can see the fuzzy matches in the "CAT pane" and we can insert the match's target text into the target field, but we can't get to the TM match's source text easily, for use in off-line translation. Fortunately, the end-client did not disable the option to use DOCX export and import, so we're not entirely all the way up the creek.

In my usual browser, Ctrl+A does not select the match text in the CAT pane, but I discovered that if I use the Otter Browser, then Ctrl+A *does* select it. Great, so we can script the extraction. This script is largely untested but so far it works for me. It's an AutoIt script, so you need AutoIt.

http://www.leuce.com/autoit/memsource_web_extract_tm.zip

Basically, you press the shortcut key, and the script does "copy all" on the screen and then saves the highest match in the CAT pane to a text file. It's not super, super fast, but it's fast enough to be useful. Also, you don't need to extract every single little segment -- jump to the biggest ones and press the shortcut key.

Samuel


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 15:13
English to Russian
Why not just pretranslate it? Feb 25, 2020

What is the purpose of this action? You can just pretranslate the files to populate them with fuzzy matches. Also, when using a 3rd-party tool, you cannot use the concordance feature, glossaries, etc. Seems useless to me unless I'm missing something.

 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:13
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
@Stepan Feb 25, 2020

Stepan Konev wrote:
You can just pretranslate the files to populate them with fuzzy matches.


Yes, but then you can't see what the fuzzy match's original source text was. For high fuzzy matches, this really matters.

If you have a 30-word segment in which only one or two words changed, then it's a high fuzzy match, i.e. low payment per segment, but if you only have the match's target text and the new source (not the match's original source, to compare it to the new source), then it's going to take a lot longer to figure out which one or two of those words had changed.

(Yes, I know Memsource shows a DIFF-like display in the bottom right corner of the screen, but that's far, far away from the active segment.)
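To illustrate why the match's original source text matters, here is a minimal sketch in Python (not AutoIt, and not part of the script in this thread) that compares a TM match's source with the new source word by word, using the standard `difflib` module. The two sentences and the function name `changed_words` are made up for the example.

```python
import difflib


def changed_words(match_source: str, new_source: str) -> list[tuple[str, str, str]]:
    """Return (operation, old words, new words) for each run of words that differs."""
    old = match_source.split()
    new = new_source.split()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((tag, " ".join(old[i1:i2]), " ".join(new[j1:j2])))
    return changes


# Made-up example: a high fuzzy match in which only two words changed.
match_source = "Press the green button to start the pump before opening the valve."
new_source = "Press the red button to start the pump before closing the valve."

for change in changed_words(match_source, new_source):
    print(change)
```

Without the match's source text there is nothing to compare the new source against, so the changed words have to be found by reading the old target against the new source.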

When using a 3rd-party tool, you cannot use the concordance feature, glossaries, etc.


True, but concordance in Memsource Editor for the Web is just a click away: Ctrl+K. Most third-party CAT tools can do searches in other programs (e.g. in a web browser, or in a dictionary program) or if not, a simple script can do it. [However, the types of jobs I get for clients who work with Memsource aren't the types of jobs where doing concordance searches is typically useful.] As for glossaries, updating the script to extract the glossary matches is reasonably simple. But yes, I agree, glossaries are a concern if the job you have is the type of job where following a specific glossary is part of the task. [However, with some types of jobs it is quicker and less effort to simply fix glossary errors afterwards when the Xbench report rolls in.]

At the low rates I'm being paid for Memsource work, I can't afford to work inside Memsource. If a client chooses Memsource and is willing to pay their translators top dollar for suffering through it, then I'll bite, but most clients who choose tools like Memsource believe that the tool makes translators more productive (and thus capable of working for lower rates), which is only true if the only other tool you've ever used is Notepad.



[Edited at 2020-02-25 18:19 GMT]


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 15:13
English to Russian
Over my 18+ years with CAT tools, Feb 26, 2020

I think you need more experience with Memsource. If you really believe Memsource is as efficient as Notepad, you definitely lack knowledge of it. All the things you describe are there. You just don't know how to use them.

 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:13
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
@Stepan Feb 27, 2020

Stepan Konev wrote:
If you really believe Memsource is as efficient as Notepad...


I think you misunderstood me. I meant that Memsource *is* more efficient than Notepad.

Any user who is not yet a user of CAT tools will be amazed and astonished by the boost in productivity offered by Memsource (and the same goes for clients for whom Memsource is the first CAT tool they have ever seen), but users who are already operating at the peak efficiency provided by their own CAT tools will not experience any additional boost in productivity from using Memsource (in fact, they are likely to see a drop).

I think you need more experience with Memsource. ... All the things you describe are all there. You just don't know how to use them.


I'm sure that if I got more jobs with Memsource, I would get more experience with it, yes. And perhaps if I was paid more for those jobs, I would have less incentive to rush, and would be willing to work along with the inefficiencies.

It's often not a matter of not knowing how to use the features of other CAT tools (Memsource is reasonably intuitive), but rather of not being used to using them. Any tool that is not your usual tool or that does things in a way that seems counter-productive is going to be slower for you. There are some online CAT tools that I absolutely loathed when I started using them, but after getting more and more jobs in them, I got used to their idiosyncrasies.

My 18+ years with CAT tools


I have a mere 16.  

Look, Memsource is not bad. It can be a bit slow to confirm segments (which is why I have learnt to close the file and re-open it to see if all segments are still all confirmed, before delivery). The CAT pane shows content fairly instantaneously, though concordance searches are slower (not a deal-breaker). The fact that one can copy/paste tags and not be forced to use a keyboard shortcut to place them is very nice. The DOCX export and import works fine and is quite forgiving (in some other CAT tools' bilingual review files I have to tip-toe quite a bit not to "break" the file, but Memsource's file is robust, so hats off to the developers). So, a lot of good things can be said.


[Edited at 2020-02-27 11:50 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 13:13
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Updated version Feb 27, 2020

I've uploaded a new version (same URL):
http://www.leuce.com/autoit/memsource_web_extract_tm.zip

The new version extracts all fuzzy matches, MT matches, glossary matches and subsegment matches. It exports all of it to a single file, but doesn't do any other kind of sorting -- you must sort the file yourself (by column 2), and remove duplicates yourself using your favourite duplicate line removal utility, and (of course) convert it to your chosen TM format manually.
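The manual clean-up could be done along these lines. This is only a sketch in Python, not part of the AutoIt script, and it assumes the exported file is tab-delimited with one match per line; the actual delimiter and column contents are not stated here, so the sample rows below are invented.

```python
def dedupe_and_sort(lines, sort_column=1):
    """Drop blank and exact duplicate lines, then sort by a zero-based column.

    sort_column=1 means "column 2". Tab-delimited input is an assumption.
    """
    seen = set()
    rows = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line in seen:
            continue
        seen.add(line)
        rows.append(line.split("\t"))
    # Rows with too few columns sort first instead of raising an error.
    rows.sort(key=lambda row: row[sort_column] if len(row) > sort_column else "")
    return ["\t".join(row) for row in rows]


# Invented sample rows; the real column layout may differ.
sample = [
    "fuzzy\tb source\tb target\n",
    "mt\ta source\ta target\n",
    "fuzzy\tb source\tb target\n",
]
for row in dedupe_and_sort(sample):
    print(row)
```

For a real file, pass the open file object instead of `sample` and write the returned lines back out. Converting to a TM format such as TMX would still be a separate step.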

My current method for putting the cursor back into the target field is to press Shift+Ctrl+space (which communicates to the server to insert a non-breaking space) and then press Ctrl+Z. This is necessary because after selecting all text, the cursor is no longer in the target field, and you can't move to the next segment unless the editor has the focus. This typing of a space is one of the script's biggest bottlenecks.

Also, the script needs to wait about 0.3 seconds before grabbing the clipboard, but if your computer has a faster clipboard, you can shave perhaps 0.1 or 0.2 seconds off each segment's processing. These numbers may seem silly, until you remember that there may be hundreds of segments to visit.
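To put rough numbers on that, and to show one possible alternative to a fixed wait, here is a small Python sketch. It is not how the AutoIt script works: the polling helper is a swapped-in technique, and `read_clipboard` is a hypothetical function you would have to supply yourself, since Python's standard library has no clipboard access.

```python
import time


def total_seconds_saved(segments, seconds_saved_per_segment):
    """Total time saved over a job when each segment is processed faster."""
    return segments * seconds_saved_per_segment


def wait_for_change(read_clipboard, previous, timeout=0.3, interval=0.02):
    """Poll read_clipboard() until its value differs from previous.

    Returns the new value, or None if nothing changed within timeout seconds.
    read_clipboard is a caller-supplied (hypothetical) function.
    """
    deadline = time.monotonic() + timeout
    while True:
        current = read_clipboard()
        if current != previous:
            return current
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# 300 segments at 0.2 seconds saved each comes to about a minute.
print(round(total_seconds_saved(300, 0.2)), "seconds")
```

Polling returns as soon as the clipboard content changes, so a fast machine does not pay the full 0.3 seconds, while a slow one still gets the whole timeout. The catch is that it cannot tell a slow copy from a copy that produced the same text as before.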



[Edited at 2020-02-27 11:52 GMT]


 

