Find/Replace across multiple files, multiple directories
Thread poster: Samuel Murray
Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Dec 8, 2009

G'day everyone

Can you please give me your recommendation of a program or programs that can do the following? It needn't be freeware, but the cheaper the better, obviously.

I need to make edits on multiple plain text files located in multiple directories (but usually only one tree, i.e. only one top-level or ancestor directory). I'm using Windows XP Pro. The files are usually in UTF8 format (but if your tool can handle other formats too, so much the better). The files often do not have UTF8 byte order marks, but sometimes they do. The files usually have LF (Unix) line endings but may also have CRLF (Dos/Windows) line endings.

What I need to do, is this:

1. Find CRLF (carriage return/line feed, aka Dos/Windows line endings) and replace them with LF (line feed, aka Unix line endings). (must have)

2. Find a UTF8 byte order mark and remove it (or, optionally, add it). This can usually be done if the program is capable of hex editing, because the byte order mark is nothing more than three specific bytes (EF BB BF) at the start of a file. (must have)

3. If at all possible, the tool should have regex support built in. (optional, but nice)

4. Do find/replace operations in UTF8, even if the file has no byte order mark (or, alternatively, even if the file does have one). (kinda non-optional, but depends on the tool)

So, what can you recommend?

Samuel
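
As a point of reference for requirements 1 and 2 above, here is a minimal Python sketch that works on raw bytes, so it does not matter how a tool would guess the encoding. The function names (`to_lf`, `strip_bom`, `add_bom`) are made up for illustration and do not come from any tool mentioned in this thread.

```python
import codecs

# The UTF-8 byte order mark is the three bytes EF BB BF.
BOM = codecs.BOM_UTF8


def to_lf(data: bytes) -> bytes:
    """Replace CRLF (Dos/Windows) line endings with LF (Unix)."""
    return data.replace(b"\r\n", b"\n")


def strip_bom(data: bytes) -> bytes:
    """Remove a UTF-8 BOM, but only if it is at the very start."""
    return data[len(BOM):] if data.startswith(BOM) else data


def add_bom(data: bytes) -> bytes:
    """Add a UTF-8 BOM unless the data already starts with one."""
    return data if data.startswith(BOM) else BOM + data


sample = b"\xef\xbb\xbfline one\r\nline two\r\n"
print(to_lf(strip_bom(sample)))  # b'line one\nline two\n'
```

Working on bytes keeps the two must-have operations independent of requirement 4: nothing is decoded, so non-ASCII text passes through unchanged.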


 
Adam Łobatiuk
Poland
Member (2009)
English to Polish
2 apps Dec 8, 2009

Rainbow (from Okapi Tools), which is free, and Ultra Edit 32 (commercial, but probably available as a time-limited demo). You might want to try both, since they have different features.

Good luck.


 
Robert Tucker (X)
United Kingdom
German to English
Perl Dec 8, 2009

Regarding UltraEdit and UEStudio "There are multiple configuration options which allow you to target specific file types, project files, directories of files..." according to:

http://www.ultraedit.com/support/tutorials_power_tips/ultraedit/find_replace.html

so probably they can do the job.

But command line Perl probably can too. Unix to DOS text conversion with Perl is shown here:

http://sial.org/howto/perl/one-liner/

Not quite sure if it will work recursively though.

Came across XReplace-32:

http://xreplace.vestris.com/

"XReplace-32 is the tool you need for massive search-and-replace operations among all of your text files, including html web documents and source code."


 
Samuel Murray
TOPIC STARTER
Using Perl Dec 8, 2009

Robert Tucker wrote:
But command line Perl probably can too. Unix to DOS text conversion with Perl is shown here:

http://sial.org/howto/perl/one-liner/

Not quite sure if it will work recursively though.


I'll gladly use Perl, if I can use it recursively and not have to specify the individual file names. I know that Perl can change line endings. Can Perl find/replace or find/remove hex characters like \xEF\xBB\xBF ?


 
Robert Tucker (X)
Perl Hex and Recursive Dec 8, 2009

Perl should handle hex according to this web page:

http://www.netadmintools.com/art415.html

and recursive according to this web page:

http://joseph.randomnetworks.com/archives/2005/08/18/perl-oneliner-recursive-search-and-replace/

(I think the "find" command works on both Windows and Linux/Unix and there is a Grep for Windows)


 
Kevin Lossner
Portugal
German to English
You could treat it as a "translation" project Dec 8, 2009

Both MemoQ & DVX will enable you to do this. Copy the source text to the target by the usual methods and do your search/replace. The results will be exported in the same directory structure.

 
Samuel Murray
TOPIC STARTER
Don't use XReplace-32! Dec 9, 2009

Robert Tucker wrote:
Came across XReplace-32:

http://xreplace.vestris.com/

"XReplace-32 is the tool you need for massive search-and-replace operations among all of your text files, including html web documents and source code."


I tried to test this program on a single file in a single directory, but when I pressed "Go!" it started processing all files (all non-binary files) on my Desktop and in all subdirectories of all folders on the Desktop! And its progress report indicated that it was making changes to all of the files. I tried to cancel the replacement process by clicking on the "x", but the program refused to close. There is no "Stop" button in the program either, as far as I can see. Luckily I have a process killer utility on my desktop (Taskill, which I normally use for programs that hang) and I was able to kill XReplace-32 before it damaged all files on my computer.

Luckily the program makes backups of all changed files, so I was able to revert the changes by deleting the changed files and renaming the backups back.


 
Samuel Murray
TOPIC STARTER
Comment on UltraEdit 32 Dec 9, 2009

Adam Łobatiuk wrote:
....and Ultra Edit 32 (commercial, but is probably available as a time-limited demo). You might want to try both for the different features.


I tried UltraEdit 32, thanks. It can find/replace CRLF and LF, using Perl regular expressions, if you specify the hexadecimal values:

Find: \x0D\x0A
Replace: \x0A

But it can't find the UTF8 BOM. I tried to search for \xEF\xBB\xBF but it could not find it. It did open a BOM'ed and a BOM-less UTF8 file both as UTF8, which is nice at least. But it once opened a BOM-less UTF8 file as ISO-8859-1 (see next paragraph).

One thing that is somewhat disconcerting is that if the file is UTF8 but is also valid ISO-8859-1, it sometimes opens it as ISO-8859-1 without asking, but sometimes it asks if I want it to "convert the file to DOS format" (and if I answer "no", then it opens the file as UTF8). In one case it opened a UTF8 file without asking if it should convert it, but when I saved the file, it was in ISO-8859-1, not UTF8.

I haven't checked extensively but I could not find any option to tell it to always assume that UTF8 files are in UTF8 and to always save such files as UTF8, even if it can be saved as ISO-8859-1 without data loss.
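
The ambiguity described above is inherent in the encodings rather than specific to one editor: every byte sequence is valid ISO-8859-1, so a tool can only test whether a file is also valid UTF-8. A rough Python sketch of that test follows; the function name `guess_encoding` and the labels it returns are hypothetical, chosen for illustration only.

```python
def guess_encoding(data: bytes) -> str:
    """Classify raw bytes as 'ascii', 'utf-8' or 'not utf-8'."""
    try:
        data.decode("ascii")
        # Pure ASCII is byte-for-byte identical in UTF-8 and ISO-8859-1.
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")  # strict mode: raises on malformed sequences
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8"


print(guess_encoding(b"plain text"))                       # ascii
print(guess_encoding("m\u00f4re".encode("utf-8")))         # utf-8
print(guess_encoding("m\u00f4re".encode("iso-8859-1")))    # not utf-8
```

A file that comes back as "ascii" gives an editor nothing to go on, which may be why the behaviour looks inconsistent from one file to the next.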


 
Samuel Murray
TOPIC STARTER
Perl, hex and Windows Dec 9, 2009

Robert Tucker wrote:
Perl should handle hex according to this web page:
http://www.netadmintools.com/art415.html


Hmm, it doesn't seem to work. I tried this line:

perl -pe 's/\xEF\xBB\xBF//g' file1.txt > file1.txt

on a file with a UTF8 byte order mark (EF BB BF) but it fails to remove it.

Another problem is that the "-i" switch, which means "process the file itself, in place", doesn't seem to work on Windows. That means I have to write the result to a new file, and I would then have to find a way to ensure that files that weren't modified are also copied to the new location. And because the target location must be mentioned, and must typically be a full path in quotes, using Perl would involve writing a large BAT file with each replacement on a new line. I've been down that road... it ain't pretty.
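
One likely reason the redirect in the command above misbehaves is that shells normally empty the target of `>` before the command starts reading, so reading from and writing to `file1.txt` in one command is unsafe. A hedged Python sketch of an in-place rewrite that avoids this: read everything first, write to a temporary file next to the original, then swap it in. The helper name `rewrite_in_place` and the `.tmp` suffix are assumptions for illustration.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Apply transform(bytes) -> bytes to a file; return True if it changed."""
    with open(path, "rb") as f:
        original = f.read()
    updated = transform(original)
    if updated == original:
        return False  # untouched files are left exactly as they were
    # Write next to the original so the final rename stays on one volume.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        f.write(updated)
    os.replace(tmp_path, path)  # replaces the target on Windows as well
    return True


def drop_bom(data):
    """Remove the UTF-8 BOM (EF BB BF) if the data starts with it."""
    return data[3:] if data.startswith(b"\xef\xbb\xbf") else data


# Demonstration on a throwaway file.
demo = os.path.join(tempfile.mkdtemp(), "file1.txt")
with open(demo, "wb") as f:
    f.write(b"\xef\xbb\xbfhello\r\n")

changed = rewrite_in_place(demo, drop_bom)
with open(demo, "rb") as f:
    print(changed, f.read())  # True b'hello\r\n'
```

Because unchanged files are never rewritten, there is no need to copy them to a second location or to list each file in a BAT file.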

Robert Tucker wrote:
and recursive according to this web page:

http://joseph.randomnetworks.com/archives/2005/08/18/perl-oneliner-recursive-search-and-replace/

(I think the "find" command works on both Windows and Linux/Unix and there is a Grep for Windows).


The lines mentioned on that page do not work, but then, they look rather weird anyway. Besides, the "find" command in Windows does not search for files (as it seems to do in Linux) but for lines in files. The relevant command in Windows may be "dir", and you may have to pipe the commands, but I have been unsuccessful in tinkering with it.
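
For comparison, Python's standard library can do the directory walking on both Windows and Linux without relying on "find" or "dir". The sketch below collects files under one top-level directory; the name `iter_text_files` and the extension filter are assumptions for illustration, not something taken from the linked page.

```python
import os
import tempfile


def iter_text_files(root, extensions=(".txt",)):
    """Yield paths of files under root (all subdirectories) by extension."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            # str.endswith accepts a tuple of suffixes.
            if name.lower().endswith(extensions):
                yield os.path.join(dirpath, name)


# Demonstration on a small throwaway tree.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub", "deeper"))
for rel in ("a.txt",
            os.path.join("sub", "b.txt"),
            os.path.join("sub", "deeper", "c.TXT"),
            os.path.join("sub", "skip.bin")):
    with open(os.path.join(root, rel), "wb") as f:
        f.write(b"x\r\n")

for path in iter_text_files(root):
    print(os.path.relpath(path, root))
```

Each yielded path could then be handed to whatever per-file edit is needed, so no file names have to be typed out by hand.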





[Edited at 2009-12-09 10:46 GMT]


 
Samuel Murray
TOPIC STARTER
Rainbow does it Dec 9, 2009

Adam Łobatiuk wrote:
Rainbow (from Okapi Tools)...


I installed Rainbow-R00003-v5.0.1 (found on my computer somewhere) and it works. It can process multiple files in multiple directories, and you can drop a directory tree into Rainbow as-is. The relevant options are on the toolbar menu:

Utilities > Line-break conversion
Utilities > Byte-order-mark conversion

I'm not sure if it is required to specify the input and output encoding in the "Options" tab, but it can't hurt to do so. One can replace files or create new files with a regular type of name. Rainbow doesn't do find/replace, though.

Added: I see the latest version of Rainbow is 5.0.20, here:
http://okapi.sourceforge.net/downloads.html



[Edited at 2009-12-09 11:02 GMT]


 
Robert Tucker (X)
Perl – Remove BOM Dec 9, 2009

Samuel Murray wrote:
Hmm, it doesn't seem to work. I tried this line:

perl -pe 's/\xEF\xBB\xBF//g' file1.txt > file1.txt

on a file with a UTF8 byte order mark (EF BB BF) but it fails to remove it.


Try:

perl -CD -pe 'tr/\x{feff}//d' bom.txt > nobom.txt

It seemed to work on Linux. (I opened Bluefish, typed Ctrl+Shift+U EF, Ctrl+Shift+U BB, Ctrl+Shift+U BF so that I got  and then just added some text; ran the above command and in the new file the  was absent.)

http://www.perlmonks.org/?node_id=724474
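
The reason `\x{feff}` can match where a search for `\xEF\xBB\xBF` does not is presumably that `-CD` makes Perl decode the file as UTF-8 first, and the three BOM bytes decode to the single character U+FEFF. The same distinction in a short Python sketch (the sample text is made up):

```python
raw = b"\xef\xbb\xbfGoeie m\xc3\xb4re\n"  # UTF-8 bytes with a BOM

# Decoded as UTF-8, the three BOM bytes become one character, U+FEFF.
text = raw.decode("utf-8")
print(text.startswith("\ufeff"))   # True
print("\xef\xbb\xbf" in text)      # False: those are three other characters

# Remove the BOM only where it counts, at the very start of the text.
without_bom = text[1:] if text.startswith("\ufeff") else text

# The "utf-8-sig" codec does the same: it drops one leading BOM if present.
assert raw.decode("utf-8-sig") == without_bom
print(without_bom.encode("utf-8"))  # b'Goeie m\xc3\xb4re\n'
```

Note that the `tr` command above deletes U+FEFF wherever it occurs, whereas this sketch only touches the start of the text.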



[Edited at 2009-12-09 12:26 GMT]


 
Robert Tucker (X)
Windows Grep Dec 9, 2009

Samuel Murray wrote:
and recursive according to this web page:

http://joseph.randomnetworks.com/archives/2005/08/18/perl-oneliner-recursive-search-and-replace/

(I think the "find" command works on both Windows and Linux/Unix and there is a Grep for Windows).


The lines mentioned on that page do not work, but then, they look rather weird anyway. Besides, the "find" command in Windows does not search for files (as it seems to do in Linux) but for lines in files. The relevant command in Windows may be "dir", and you may have to pipe the commands, but I have been unsuccessful in tinkering with it.


I tried them on Linux. The one with the "find" command seemed to only edit one file in a directory (when in fact I had two files which could have been edited). The command with grep seemed to work.

Don't know if this Windows grep would be up to the job:

http://pages.interlog.com/~tcharron/grep.html


 

