Why is my parsing job killed?

When I parse an XML file, the process is killed. From some research, I found that minidom.py is resource intensive, and the message may be triggered because the process is hitting the memory threshold. Can you give me some options?

It's certainly possible that it's hitting a memory limit, though you'd have to be using a very large amount of memory to hit that -- at least a gigabyte. Where are you running it? Is it inside a web app, or on a console, or in a scheduled task?

Thanks Gilies - I ran it from a Bash console. I have an API script that successfully downloads the data into an XML file and then begins the parsing job; after a short time I get a message saying it's been killed.

That does sound like it might be killed because it's using a lot of memory. How big is the XML file? minidom builds an entire DOM in memory from the parsed XML, and that is very memory intensive for large files. You could switch to a streaming parser, which never keeps the entire DOM in memory. There's some good discussion here and here about streaming parsers. xml.sax is probably the way to go since it's built into Python.
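
For what it's worth, here is a rough sketch of what the xml.sax approach could look like. I don't know the structure of your file, so the element names `record` and `name` are placeholders I made up, and `count_records` and `on_name` are hypothetical helpers rather than anything from your code. The point is only that the handler sees one event at a time and nothing like a full DOM is ever built.

```python
import io
import xml.sax


class RecordHandler(xml.sax.ContentHandler):
    """Counts <record> elements and passes each <name> text to a callback.

    "record" and "name" are assumed element names for illustration only.
    """

    def __init__(self, on_name):
        super().__init__()
        self.on_name = on_name   # called once per completed <name> element
        self.count = 0
        self._in_name = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "record":
            self.count += 1
        elif name == "name":
            self._in_name = True
            self._buffer = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so accumulate them.
        if self._in_name:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "name":
            self.on_name("".join(self._buffer))
            self._in_name = False
            self._buffer = []


def count_records(source, on_name):
    """Stream-parse `source` (a filename or binary file object)."""
    handler = RecordHandler(on_name)
    xml.sax.parse(source, handler)
    return handler.count


if __name__ == "__main__":
    sample = b"<records><record><name>a</name></record></records>"
    print(count_records(io.BytesIO(sample), print))
```

In practice you would pass the path of the downloaded file instead of the in-memory sample, and have the callback write each value out (to a database or a CSV, say) rather than accumulate everything in a list, otherwise the memory use just moves somewhere else.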