Scheduled Task Error

I've got a scheduled daily task that's been largely working fine for about two weeks now, but for one error, the log of which seems to be as follows:

/bin/bash: line 0: cd: /home/Johz: Stale NFS file handle
python: can't open file 'nsdb/nsdb/unravel_dump.py': [Errno 2] No such file or directory

2013-08-13 13:41:03 -- Completed task, took 0.00 seconds, return code was 2.

My understanding of Linux systems isn't great enough to entirely understand what's happening here, but as far as I can tell the scheduler couldn't cd into my home directory for some reason. Can anyone advise what's happened, if it's likely to happen again, and how I might deal with it next time?

Hm, that looks most likely to be caused by an issue on the server. A stale NFS handle occurs when the client is holding a reference to something that's whisked away from under it by another client. The most common cause is when a client changes into a working directory which is then deleted - when the client then tries to access something relative to its current directory (including . and ..) then it gets the stale handle error. In this case the "client" is in fact part of the environment that's running your script, so this isn't related directly to some issue in your code.

I suspect it's most likely that your script was kicked off just as some sort of server maintenance was going on which caused this issue, but the devs can probably shed more light on probable causes. My advice to try and minimise the problem is to always work with absolute path names, never relative to the current directory (i.e. all path names should start with /home/Johz for files inside your home directory). Without knowing the exact cause I don't know how much this will help, but it's generally good practice anyway.

To resolve a relative path name (e.g. ./mydir/file.txt) the current working directory must be accessed, and if the current directory has become stale at any point since the script was started then this will fail. Absolute path names, however, involve following the path from / (always accessible) down to the directory each time it's accessed. So, if the directory were to briefly disappear and reappear under the same name (but a different underlying object), an absolute access wouldn't be affected but a relative access would become stale.

I probably haven't explained that very well, but I did my best. I managed to avoid using the term inode, at least. Damn, I was doing so well... (^_^)
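For what it's worth, here's a minimal sketch of the absolute-path approach in Python. The home directory and script location are taken from the log in the first post; the in_home() helper is just a name made up for illustration, not anything the scheduler provides.

```python
import os

# Home directory as it appears in the error log above.
HOME = "/home/Johz"

def in_home(*parts):
    """Build an absolute path to something inside the home directory.

    Hypothetical helper: it only joins path components, so it never
    needs to look at the current working directory.
    """
    return os.path.join(HOME, *parts)

# The script path from the log, expressed absolutely rather than relative to cwd.
script = in_home("nsdb", "nsdb", "unravel_dump.py")
print(script)
```

The same idea applies to the scheduled command itself: giving it the full path to the script means it doesn't depend on a cd having succeeded first.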

Aside from that, occasional glitches are just a hazard of running 24/7 on a managed service which must periodically undergo maintenance and repair - as long as you write your script to be tolerant of failures at any point (i.e. don't leave anything lying around which will prevent it running next time) then it shouldn't be more than a minor annoyance unless it happens frequently.
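One way to be tolerant of failures is to make sure a run that dies half way through doesn't leave a partial output file behind. The following is only a sketch of that idea, assuming the script writes a single output file. The write_atomically() name and the demo file name are invented for the example.

```python
import os
import tempfile

def write_atomically(path, text):
    """Write text to path without ever exposing a half-written file.

    The data goes to a temporary file in the same directory first and is
    then renamed over the target, which is normally atomic on POSIX systems.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file so nothing is left lying around
        # to confuse the next run, then re-raise the original error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Demo in a throwaway directory (hypothetical file name).
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "dump.txt")
write_atomically(target, "finished output\n")
with open(target) as handle:
    print(handle.read())
```

If the task fails before the rename, the previous output (if any) is still intact and the next scheduled run can simply start again.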

You can read this page for a little more background on the error if you're interested, but it's not going to be of any actual help because it's aimed at operators/admins.

Cartroo's advice, as usual, is excellent. I learned something new. Absolute paths and glitch-tolerant scripts are the way to go...

Another thing to consider is that relative paths are always relative to something, typically the current working directory of the process. So, using relative paths can alter the execution of your script if you execute it from different locations. Scheduled tasks should always execute from the same location, but it's conceivable (though highly unlikely) that the devs might change where this is. In general, you should avoid making assumptions about the working directory (and PATH, come to that) wherever possible.
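A common way to avoid depending on the working directory is to anchor paths on the location of the script file itself. Here's a rough sketch of that. The beside() helper and the "output/dump.txt" names are made up for illustration, and in a real script you would normally pass __file__ rather than a literal path.

```python
import os

def beside(script_file, *parts):
    """Return an absolute path to something located next to script_file."""
    base = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(base, *parts)

# A literal path is used here so the example does not depend on where it runs;
# inside unravel_dump.py itself this would be beside(__file__, "output", "dump.txt").
print(beside("/home/Johz/nsdb/nsdb/unravel_dump.py", "output", "dump.txt"))
```

Because the base directory is derived from the script's own path, the result is the same whichever directory the scheduler happens to start the process in.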

As an aside, if you're stuck with relative paths (typically due to user configuration or similar) then you can easily convert them to absolute paths with Python's os.path.abspath() - this function takes a single argument of a string containing a path (which may be absolute or relative to the current working directory) and always returns the equivalent absolute path. This shouldn't be confused with the similar-sounding os.path.normpath(), which just attempts to "simplify" the path by removing redundant items, but doesn't change whether the path is relative or absolute.

For example:

>>> import os
>>> os.getcwd()
'/home/myuser'
>>> os.path.abspath("../user123/../user456/file.txt")
'/home/user456/file.txt'
>>> os.path.normpath("../user123/../user456/file.txt")
'../user456/file.txt'

Sorry Johz, you probably never expected your post to provoke such a deluge of verbiage! (^_^)

Well I didn't expect it, but I enjoyed it, and I've learned something new today, which is always good, right? I'll change the way I get to files definitely, although the actual cause seems to be related to this problem. However, good habits, eh?

Yup, the actual cause in this case almost certainly was the same as the one you just linked to. But also, yup, good habits are good :-)