How to search very large TMX files on a Mac? Thread poster: wilhelm_zwo (X)
wilhelm_zwo (X) Netherlands Local time: 12:16 German to Dutch
What is a good way to search very large TMX files (3 GB and up) on a Mac, for concordance purposes? WII
wilhelm_zwo wrote: What is a good way to search very large TMX files (3 GB and up) on a Mac, for concordance purposes? WII
I use TextWrangler for this, but I think any editor will do it.
Fernando Toledo wrote: [I use] for this TextWrangler
No you don't. TextWrangler can't open files larger than around 350 MB.
Fernando Toledo wrote: but I think any Editor will do it.
Nope. As far as I can see, only Java-based text editors can open such large files.
Cheers, Hans
wilhelm_zwo (X) Netherlands Local time: 12:16 German to Dutch TOPIC STARTER UltraEdit vs. TextWrangler | Jul 19, 2013 |
Fernando Toledo wrote: wilhelm_zwo wrote: What is a good way to search very large TMX files (3 GB and up) on a Mac, for concordance purposes? for this TextWrangler but I think any Editor will do it.
I'm sorry, but these DGT TMs are way too large for good old TextWrangler. Luckily, the new UltraEdit 4.1 can handle them. As a matter of fact, its Find in Files function can search very quickly through all TMs in a folder, even when they are as gigantic as the DGT.
Still looking for the optimal solution, though (since UE isn't integrated in my CAT tool CafeTran on my Mac).
Joakim, when will your revamped TMX searcher be relaunched?
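For readers who would rather script a folder-wide search than rely on an editor, here is a minimal Python sketch of the same idea. It is not how UltraEdit's Find in Files works internally; it only assumes the files are well-formed TMX with the usual tu/tuv/seg elements and no XML namespace. The names iter_units and search_folder are made up for illustration. The file is streamed one translation unit at a time, so memory use should stay small however large the TM is.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree's expanded name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def iter_units(tmx_path):
    """Yield one {language: text} dict per <tu>, streaming the file."""
    with open(tmx_path, "rb") as fh:
        body = None
        for event, elem in ET.iterparse(fh, events=("start", "end")):
            if event == "start":
                if elem.tag == "body":
                    body = elem  # remember <body> so it can be emptied later
                continue
            if elem.tag != "tu":
                continue
            unit = {}
            for tuv in elem.iter("tuv"):
                # TMX 1.4 uses xml:lang; older files use a plain lang attribute.
                lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
                seg = tuv.find("seg")
                if seg is not None:
                    # itertext() also collects text inside inline tags.
                    unit[lang.lower()] = "".join(seg.itertext())
            yield unit
            if body is not None:
                body.clear()  # drop units already processed to free memory


def search_folder(folder, term, max_hits=20):
    """Return up to max_hits (file name, unit) pairs containing term."""
    needle = term.casefold()
    hits = []
    for path in sorted(Path(folder).glob("*.tmx")):
        for unit in iter_units(path):
            if any(needle in text.casefold() for text in unit.values()):
                hits.append((path.name, unit))
                if len(hits) >= max_hits:
                    return hits
    return hits
```

Calling search_folder with a folder path and a search term returns a list of matches, each with the file name and all language versions of the segment. The match is a plain case-insensitive substring test, so there is no lemmatisation or fuzzy matching.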
MacVim should be able to open large files, but I think it requires increasing the Java heap. I can't. It's not in a *.plist or *.config file (as far as I can see), and I stay away from the Terminal if I don't know what I'm doing, which is most of the time. Advice welcome. Martin?
Cheers, Hans
John Moran Ireland Local time: 11:16 German to English + ...
Assuming you have more than 4 GB of RAM, OmegaT has no problems with 3 GB TMs, but you have to tell the Java Virtual Machine to make enough space for the file, as the default is too small. To do this, go to Applications/Utilities and open Terminal. Then type:
cd /Applications/OmegaT.app/Contents/
and then:
open .
Drag and drop the file "Info.plist" into a text editor (I use TextWrangler). Look for the VMOptions and change the -Xmx value to something above 3 GB. I have 8 GB of RAM, so I use:
VMOptions -Xmx6024M
Then create a project with a small dummy file (a docx with one dummy sentence) and place the TMX file in the /tm directory. You can then use Ctrl+F to search the TMX file, and it also uses lemmatisation, so "dog" will find "dogs".
It's a worry | Jul 20, 2013 |
John Moran wrote: Assuming you have more than 4GB RAM OmegaT has no problems with 3GB TM's
I'm pretty sure der Wilhelm is well aware of that solution. Like me, he uses CafeTran. Loading and searching large files in CT is no problem, and you can even run two instances of CT at the same time. And if that isn't enough, you can load a huge TMX file as an "external" database, in which case it uses very little RAM.
So let me rephrase Wilhelm's question: How can I search - and index - large TMX (and other) files on a Mac, outside my CAT tool?
There are two problems with that:
1. You can't open documents (not files in general) exceeding around 350 MB on a Mac with apps that don't run under Java (I don't know if there are other solutions, but I doubt it).
2. Spotlight/SpotInside cannot search TMX files.
So to search those files, you'll need a Java application, or you (still) need a Java application to open the TMX file, convert it to TXT, and split it into files OS X can handle, i.e. smaller than 300 MB, to make them searchable in Spotlight/SpotInside.
I still don't know how to do it. I tried Martin's solution (above), but a 1.5 GB TMX file didn't open in MacVim. I tried to increase the Java heap for MacVim, to no avail, mainly because MacVim isn't a Java app.
Der Wilhelm suggested UltraEdit (Java). The new beta can split files, it seems, so that could be a solution. I downloaded the latest build, which can't split files...
I have spent so many hours trying to solve the issue that I could have learned the contents of those databases by heart. I'm sick of it.
But I'm sure everybody knows we're talking about the EU files (DGT and Eurobook), and I happen to translate EU notifications. What's worse, from two source languages - ENG and GER - into DUT. I need those big files. Searching them, and even auto-assembling from them in CafeTran, is not a problem, but I want to be able to search the DGT/Eurobook files of the other source language.
And for non-EU texts, I want to be able to search them without attaching them to my current project. Cheers, Hans
[Edited at 2013-07-20 00:44 GMT]
[Edited at 2013-07-21 04:42 GMT] | |
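The convert-to-TXT-and-split step described above can be sketched in Python rather than with a Java application. This is a rough sketch under stated assumptions: the TMX is well-formed with tu/tuv/seg elements and no namespace, and the 250 MB default is simply a guess that stays under the 300 MB figure mentioned in the post. The function names tmx_lines and split_to_text are hypothetical, and whether Spotlight then indexes the resulting .txt parts is not something this code can guarantee.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# ElementTree's expanded name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_lines(tmx_path):
    """Yield one tab-separated text line per <tu>, streaming the file."""
    with open(tmx_path, "rb") as fh:
        body = None
        for event, elem in ET.iterparse(fh, events=("start", "end")):
            if event == "start":
                if elem.tag == "body":
                    body = elem
                continue
            if elem.tag != "tu":
                continue
            cells = []
            for tuv in elem.iter("tuv"):
                seg = tuv.find("seg")
                if seg is None:
                    continue
                lang = tuv.get(XML_LANG) or tuv.get("lang") or "?"
                # Collapse whitespace so each unit stays on a single line.
                text = " ".join("".join(seg.itertext()).split())
                cells.append(f"{lang}={text}")
            yield "\t".join(cells)
            if body is not None:
                body.clear()  # free units that were already written


def split_to_text(tmx_path, out_dir, max_bytes=250 * 1024 * 1024):
    """Write the TMX as numbered UTF-8 .txt parts of at most max_bytes each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(tmx_path).stem
    parts = []
    out = None
    written = 0
    try:
        for line in tmx_lines(tmx_path):
            data = (line + "\n").encode("utf-8")
            # Start a new part when the current one would exceed the limit.
            if out is None or (written and written + len(data) > max_bytes):
                if out is not None:
                    out.close()
                part = out_dir / f"{stem}_{len(parts) + 1:03d}.txt"
                parts.append(part)
                out = open(part, "wb")
                written = 0
            out.write(data)
            written += len(data)
    finally:
        if out is not None:
            out.close()
    return parts
```

Parts are cut only between translation units, so a source segment and its translation always end up on the same line of the same file. A single unit longer than max_bytes would still be written whole, into a part of its own.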
Well, integrate it then | Jul 20, 2013 |
wilhelm_zwo wrote: Since UE isn't integrated in my CAT tool CafeTran on my Mac.
Write an Automator Service so you can search from within CafeTran (or any other app), or ask the UE developer to write it.
Cheers, Hans
Import server-based database | Aug 30, 2013 |
This TMX is too big to open even with a text editor. Theoretically, you can import the TMX into a server-based database supported by Heartsome, such as MySQL, PostgreSQL or Oracle, and search it on a Mac. In that case you have to translate your files in Heartsome, because Heartsome does not provide a standalone TM program.
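The database idea can also be tried without a CAT tool or a database server. The sketch below is not Heartsome's import. It swaps in SQLite, which ships with Python, for MySQL/PostgreSQL/Oracle, and it assumes the source/target segment pairs have already been extracted from the TMX by some streaming reader. The table name units and the functions build_db and concordance are invented for the example.

```python
import sqlite3


def build_db(db_path, pairs):
    """Store (source, target) segment pairs in an SQLite file and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS units ("
        "source TEXT NOT NULL, target TEXT NOT NULL)"
    )
    with conn:  # commit once after all rows are inserted
        # executemany accepts any iterable, so pairs can be a generator
        # that streams units from a very large file.
        conn.executemany(
            "INSERT INTO units (source, target) VALUES (?, ?)", pairs
        )
    return conn


def concordance(conn, term, limit=20):
    """Return up to limit pairs whose source or target contains term."""
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT source, target FROM units "
        "WHERE source LIKE ? OR target LIKE ? LIMIT ?",
        (pattern, pattern, limit),
    )
    return cur.fetchall()
```

SQLite's LIKE is case-insensitive for ASCII letters only, and % or _ inside the search term act as wildcards. A substring search like this scans the whole table, so on a TM the size of the DGT it may be slow; a full-text index would be the next thing to look at.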