At work, I deal with a Windows environment.  Rebooting is a fact of life.  I try not to get too upset.  I try not to resist.  Like I said, it’s a fact of life.

We have a collection of worker computers that we use to run jobs.  Pretty regularly we have found that those computers start having problems after being up for 20 days.  They’re all running 24×7 and they keep mapping network drives and then unmapping, then mapping, then unmapping, and so on.  Those network connections start failing.  Then the jobs start failing.  Then the operators get errors and everyone complains.  So we have a regular routine to reboot those worker computers every couple of weeks.

Since that’s the routine I want to make sure that we properly check the uptime of the worker computers.  I decided to write a Python program that queries the computers and calculates how long they’ve been up (there’s a link to the full source below).  This program demonstrates how to go about querying the computers and formatting some output.

In my particular case, my job scheduling system captures the output and emails it to me daily.  That’s what works for me. Now let’s get this program working for you…

First, we do a little parameter (aka argument) checking.  The Pythonic thing to do is use the built-in argparse module.

 

We create an instance of the ArgumentParser class.  Then we tell the parser that we have an argument to specify the name of a remote computer that we’ll be checking.  By specifying ‘-r’ and ‘--remote’ the user can use either the short or the long name of the parameter.  A nice little touch.  Another advantage of using the parser is that it will verify all of the arguments and store them in a variable for you.  Additionally, you can use parser.print_help() to display a help message to your user (who may be your forgetful self a year from now!).
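To make that concrete, here is a minimal sketch of the parser setup. It is not the original listing: the description and help text are my own wording, and `worker01` is a made-up computer name used only for the demonstration.

```python
import argparse


def build_parser():
    """Build the command-line parser (help text here is assumed wording)."""
    parser = argparse.ArgumentParser(
        description="Report how long one or more computers have been up."
    )
    # -r and --remote are two spellings of the same optional argument.
    parser.add_argument(
        "-r", "--remote",
        default=None,
        help="name of a remote computer to check",
    )
    return parser


# Parse an explicit list so the example behaves the same everywhere;
# calling parse_args() with no arguments reads sys.argv instead.
args = build_parser().parse_args(["--remote", "worker01"])
print(args.remote)
```

If no `-r` is given, `args.remote` is simply `None`, which makes the "was a computer named on the command line?" test easy later on.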

This version of the program is written, rather selfishly, to run on Windows.  I’m using the systeminfo command to find out from Windows the “System Boot Time”. So we’ve got a little safety check:
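A safety check along these lines could look like the sketch below. The function names and the exit message are hypothetical; the only library fact it leans on is that `platform.system()` returns `"Windows"` on Windows.

```python
import platform
import sys


def is_windows(system_name=None):
    """Return True when the given (or current) platform name is Windows."""
    if system_name is None:
        system_name = platform.system()
    return system_name == "Windows"


def require_windows():
    """Exit with a message unless we are on Windows, where systeminfo exists."""
    if not is_windows():
        sys.exit("This program relies on the Windows systeminfo command.")
```

Calling `require_windows()` near the top of the program stops everything early on other platforms, before we try to run a command that isn’t there.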

 

Program execution jumps to dealing with our “database” where we keep the list of computers to check. The database is actually just a text file with a list of computer names. What’s the name of the file? You guessed it! It’s the same as the name of the program, but with the file extension changed from .py to .txt.
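One way to derive that file name is sketched here. The helper name `database_path` is my own invention; in the real program the script path would most likely come from `sys.argv[0]`.

```python
import os


def database_path(script_path):
    """Swap the script's extension for .txt (get_uptime.py -> get_uptime.txt)."""
    root, _extension = os.path.splitext(script_path)
    return root + ".txt"


print(database_path("get_uptime.py"))
```

Because only the extension changes, the database sits right next to the script, wherever the script happens to live.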

 

If the file doesn’t exist yet then we’re going to put an entry in the file. So we open it for writing. If there isn’t any incoming argument then we’ll write localhost, otherwise, we’ll write the name of the computer from the argument.
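Here is a rough sketch of that first-run step. The function name and its True/False return value are assumptions made for illustration, not part of the original program.

```python
import os


def seed_database(db_path, remote=None):
    """Create the database file with one entry if it does not exist yet.

    Returns True when the file was created, False when it was already there.
    """
    if os.path.exists(db_path):
        return False
    # No argument means we start with the local machine.
    first_entry = remote if remote else "localhost"
    with open(db_path, "w") as db_file:
        db_file.write(first_entry + "\n")
    return True
```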

 

Slick. Now our “database” has at least one entry in it. So now we can safely go rolling through the file and check the uptime for every entry in the database. As we’re rolling through the file we determine whether we need to save the command-line argument to our database file. Here’s how we roll through the file:
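The loop can be sketched roughly like this. It is a stand-in, not the original listing: the `check` parameter takes the place of the real per-computer work so the sketch runs anywhere, and comparing names without regard to case is my assumption.

```python
def scan_database(db_path, remote=None, check=print):
    """Check every computer listed in the file.

    Returns True if `remote` was already one of the entries.
    `check` stands in for the real per-computer work; it defaults to print.
    """
    seen_remote = False
    with open(db_path) as db_file:
        for line in db_file:
            computer = line.strip()
            if not computer:
                continue  # skip blank lines
            # Assumption: computer names are compared ignoring case.
            if remote and computer.lower() == remote.lower():
                seen_remote = True
            check(computer)
    return seen_remote
```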

Calling get_uptime(computer) does the real work for each computer. We’ll come back to that in a sec. After we roll through the file, we add our command-line argument to the database if we didn’t see it there. The net effect is that we keep adding any new computer names to the database. Each time we examine a computer we save its name and then examine all of the entries in the database.
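The "add it if we didn’t see it" step might look like the sketch below. The function name and the `already_listed` flag are hypothetical; the flag is whatever the file-reading loop concluded about the command-line name.

```python
def remember_computer(db_path, remote, already_listed):
    """Append `remote` to the database unless it is missing or already there.

    Returns True when a new line was written.
    """
    if not remote or already_listed:
        return False
    # Open in append mode so existing entries are left alone.
    with open(db_path, "a") as db_file:
        db_file.write(remote + "\n")
    return True
```

Appending rather than rewriting keeps the file in the order the computers were first mentioned.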

 


Now let’s get back to the meat on this stick. The heart of the action. The guts. The grit.

The program puts together our fancy systeminfo command and then shells out to run the command. It reads the output of the command (child_stdout.readlines()).
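As a rough modern equivalent, here is a sketch that uses the `subprocess` module instead of the `child_stdout` pipe mentioned above, so it is a swapped-in technique rather than the original code. The function names are my own; `/s` is the systeminfo switch for naming a remote system.

```python
import subprocess


def build_command(computer):
    """Return the systeminfo command line for one computer."""
    command = ["systeminfo"]
    if computer.lower() != "localhost":
        # /s tells systeminfo which remote system to query.
        command += ["/s", computer]
    return command


def read_systeminfo(computer):
    """Run systeminfo (Windows only) and return its output as a list of lines."""
    completed = subprocess.run(
        build_command(computer),
        capture_output=True,  # needs Python 3.7 or later
        text=True,
        check=True,
    )
    return completed.stdout.splitlines()
```

Keeping the command building separate from the running makes it easy to inspect what would be executed without actually shelling out.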

 

Once we have the output of the command we have to go looking for the one specific line of output we want. We actually want a line that contains the text “System Boot Time:”. That line contains the date and time that the computer was last rebooted. When we find that line then we can extract the elements we want. So we split the line into the date and the time elements. Then we split the date into its components and the time into its components.
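The splitting described above could go something like this. A big assumption is baked in: the value is taken to look like `1/15/2024, 8:30:12 AM` (US-style date, 12-hour clock). systeminfo formats this line according to the machine’s regional settings, so other locales would need different splitting.

```python
from datetime import datetime

BOOT_LABEL = "System Boot Time:"


def parse_boot_time(lines):
    """Return the boot time as a datetime, or None if the line is missing.

    Assumes a US-style value such as '1/15/2024, 8:30:12 AM'.
    """
    for line in lines:
        if BOOT_LABEL not in line:
            continue
        value = line.split(BOOT_LABEL, 1)[1].strip()
        # Split into the date and the time, then each into its components.
        date_part, time_part = value.split(", ")
        month, day, year = (int(piece) for piece in date_part.split("/"))
        clock, meridiem = time_part.split(" ")
        hour, minute, second = (int(piece) for piece in clock.split(":"))
        # Convert the 12-hour clock to 24-hour.
        if meridiem.upper() == "PM" and hour != 12:
            hour += 12
        elif meridiem.upper() == "AM" and hour == 12:
            hour = 0
        return datetime(year, month, day, hour, minute, second)
    return None
```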

 


Once we have those boot date+time elements then we get the current date+time. Happily, we can do some simple subtraction and get a datetime.timedelta object which makes it easy to get the varying timespans. Then we do a little programmatic grunt work to create a string that contains an English explanation of the timedelta.
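Here is a small sketch of that grunt work. The function name and the exact wording of the output are my own choices, not necessarily what the original program prints, and the two dates are fixed so the example gives the same answer every time.

```python
from datetime import datetime, timedelta


def describe_uptime(delta):
    """Turn a timedelta into text such as '20 days, 3 hours, 1 minute'."""
    total_seconds = int(delta.total_seconds())
    days, remainder = divmod(total_seconds, 86400)
    hours, remainder = divmod(remainder, 3600)
    minutes, _seconds = divmod(remainder, 60)

    parts = []
    for amount, unit in ((days, "day"), (hours, "hour"), (minutes, "minute")):
        if amount:
            # Add an "s" unless the amount is exactly one.
            parts.append(f"{amount} {unit}" + ("" if amount == 1 else "s"))
    return ", ".join(parts) if parts else "less than a minute"


boot_time = datetime(2024, 1, 1, 8, 0, 0)
now = datetime(2024, 1, 21, 11, 1, 30)  # fixed instead of datetime.now()
print(describe_uptime(now - boot_time))  # prints "20 days, 3 hours, 1 minute"
```

In the real program `now` would come from `datetime.now()`, and anything past 20 days is a hint that a reboot is due.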

It’s a thing of beauty now. The get_uptime program source code is ready to do some work for us!

Do you have computers that need occasional reboots? Is there something here that needs elaboration? Let us know in the comments below.
