Urgent!!!! Pdf convert to word or txt. Chinese character support?
Thread poster: rachel_xiao
rachel_xiao  Identity Verified
United Kingdom
Local time: 21:49
English to Chinese
Apr 10, 2007

I downloaded ABBYY FineReader, hoping it could easily convert my scanned PDF files into Word or a similar format.
But it turned out that I need the extended package for Chinese language recognition, and ABBYY only supplies the extended package to company users. Obviously I am not a company user.
Does anyone have any suggestions, or know of other tools that support Chinese?
Thanks in advance.
Rach


 
Mulyadi Subali  Identity Verified
Indonesia
Local time: 03:49
Member
English to Indonesian
Easy PDF to Text Converter Apr 11, 2007

I don't know whether this supports Chinese characters or not, but it's free and so far it's been pretty good. The info page is here: http://www.pdf-to-html-word.com/pdf-to-text/

 
Katrin Hollberg  Identity Verified
Germany
Local time: 22:49
Japanese to German
Check out this software (works fine with Japanese documents) Apr 11, 2007

Hello,

I can really recommend this solution.
I just checked out the trial version and it works fine with PDFs created in Japanese. As far as I can tell, this software also supports Chinese.

(You can download it from their site, use it for several pages, and then decide whether it suits you.)

http://www.solidpdf.com/

Give it a try... it is really nice and cheap.

Regards,

Katrin


 
rachel_xiao  Identity Verified
United Kingdom
Local time: 21:49
English to Chinese
TOPIC STARTER
Thanks all! Apr 12, 2007

I finally sorted it out yesterday using a Chinese tool for scanned PDF recognition. I tried the two tools recommended by Katrin and Mulyadi first, but it turned out they do not work on scanned files.
Anyway, thanks to both of you.


 
Katrin Hollberg  Identity Verified
Germany
Local time: 22:49
Japanese to German
What a pity, Rachel... Apr 12, 2007

Sorry for not being helpful, but of course it does not work with PDFs that were not converted from a standard text file (e.g. Word...).

I also tried a PDF file the other day which turned out to have originally been a BMP file converted into PDF. In that case I guess no conversion tool can retrieve any text from the picture, because the text information is lost when the page is turned into an image.

In those cases probably only good OCR software will help, to save you from typing the document manually before translating it with Trados and other tools.

I just found one for Japanese characters ("YONDE! Koko") and intend to buy it very soon. I am sure there must be a similar one for Chinese characters. After all, the characters are of the same nature.

I would try asking the customer whether they could send you an original text file. If I run into trouble with those "originally produced" files, I can also use my Japanese OS + Office and fix compatibility problems in most cases. But not every agency is aware of these possibilities, depending on their general experience with Chinese/Japanese fonts.

Nevertheless, good luck with your project.

Katrin

[Edited at 2007-04-12 08:27]
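
The difference described in this post, between a PDF exported from a text document and a PDF that only wraps a scanned picture, can be checked roughly before choosing a tool. The sketch below is only a heuristic, and the function name `looks_image_only` is made up for illustration: it scans the raw bytes of the file for font and image markers. It assumes those markers are stored uncompressed, which is not guaranteed (newer PDFs can pack them into compressed object streams), and a scanned PDF that already carries a hidden OCR text layer will contain fonts as well.

```python
import re


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Rough guess: True when the file references images but no fonts.

    Heuristic only. It inspects the raw bytes, so it can miss markers
    that sit inside compressed object streams.
    """
    # A PDF with real text normally declares at least one font resource.
    has_font = re.search(rb"/Font\b", pdf_bytes) is not None
    # Scanned pages are normally stored as image XObjects.
    has_image = re.search(rb"/Subtype\s*/Image\b", pdf_bytes) is not None
    return has_image and not has_font


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py some_file.pdf
    with open(sys.argv[1], "rb") as handle:
        if looks_image_only(handle.read()):
            print("Probably a scanned PDF: an OCR tool is needed.")
        else:
            print("Fonts found (or no images): a plain converter may work.")
```

If the check reports a scanned file, converters that only extract existing text will have nothing to extract, which matches what Rachel saw.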


 
Jan Sundström  Identity Verified
Sweden
Local time: 22:49
English to Swedish
Other OCR programs Apr 16, 2007

Hi all,

The sticking point seems to be finding an OCR program that has all these features:
- can OCR a flattened PDF or an unlayered image file such as JPG or TIFF
- preferably comes with a Western UI (for easy handling by non-Chinese users)
- is available unbundled for individual users at a reasonable price

Search the forum for previous suggestions. The consensus seems to be that the best OCR for Chinese is Hanwang, but I still don't know whether it has an English UI!
Product page here (it seems that it's now available standalone too, not just bundled):
http://www.hanwang.com.cn/english/products0303.asp

Mentioned here:
http://www.proz.com/post/343923#343923

"Hanwang (汉王), Shangshu (尚书) and 清华紫光 are top ones in the industry. Where Hanwang is bundled solely with Hanwang Scanners, whereas Shangshu is not scanner-dependant, but sold as a bundle with Microtek series. 清华紫光s are for Qinghua series.

Amongst them I like Hanwang best. Their website:
Hanwang: http://www.hw99.com/

Shangshu is also part of Hanwang, with limited features."

Other suggestions:
http://www.proz.com/topic/65264
http://www.proz.com/post/506370#506370
http://bbs.betabbs.com/index.php?showtopic=45198&mode=linear
http://www.cyberway.com.sg/~computek/


 
chloee
English
convert pdf (chinese words) into text Apr 29, 2007

Hello to Rachel and all,

Which Chinese tool can be used for scanned PDF recognition, and how do you use it? Could you explain? I scanned my Chinese document but can't convert it into text; it appears as an image, which means I would need to retype the whole document! Can anyone help, please? Thanks in advance.


 
Angeline PhD  Identity Verified
China
Local time: 04:49
English to Chinese
Try this. May 27, 2007

"慧视小灵鼠"

For Chinese characters, it is better than other OCR software.


 
display output error Aug 25, 2012

Hi, I tried Hanwang. It was able to recognise the text, but when I convert the result to text or Word, it shows symbols instead of Chinese characters. Has anybody encountered this? Which settings should I change?
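
One possible cause, and this is only an assumption because the post does not say how the output was saved or opened, is an encoding mismatch: the text is written in a Chinese encoding such as GBK but opened as a Western one, so every Chinese character turns into a pair of symbols. The sketch below shows the effect and a way to undo it. The helper names `decode_chinese_bytes` and `repair_mojibake` are hypothetical, and the candidate encodings are guesses, not documented Hanwang behaviour.

```python
def decode_chinese_bytes(data, encodings=("utf-8", "gb18030")):
    """Return (text, encoding) for the first candidate that decodes cleanly."""
    for encoding in encodings:
        try:
            return data.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


def repair_mojibake(garbled, wrong="latin-1", right="gb18030"):
    """Undo a wrong decode: turn the symbols back into bytes, then decode again."""
    return garbled.encode(wrong).decode(right)


if __name__ == "__main__":
    original = "汉王文字识别"
    raw = original.encode("gb18030")   # bytes as a Chinese tool might save them
    garbled = raw.decode("latin-1")    # what a Western-encoding viewer shows
    print(garbled)                     # symbols instead of Chinese characters
    print(repair_mojibake(garbled))    # the Chinese text again
    print(decode_chinese_bytes(raw))   # text plus the encoding that worked
```

If the saved file really is in a Chinese encoding, opening it in an editor that lets you choose the encoding (GB2312, GBK or GB18030), or re-saving it as UTF-8, should have the same effect as the repair step here.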

 

