Problem Solving
We have explored various parts of the Python language and now we will take a look at how all these parts fit together, by designing and writing a program which does something useful. The idea is to learn how to write a Python script on your own.
The Problem
The problem we want to solve is:
Although, this is a simple problem, there is not enough information for us to get started with the solution. A little more analysis is required. For example, how do we specify which files are to be backed up? How are they stored? Where are they stored?
After analyzing the problem properly, we design our program. We make a list of things about how our program should work. In this case, I have created the following list on how I want it to work. If you do the design, you may not come up with the same kind of analysis since every person has their own way of doing things, so that is perfectly okay.
- The files and directories to be backed up are specified in a list.
- The backup must be stored in a main backup directory.
- The files are backed up into a zip file.
- The name of the zip archive is the current date and time.
- We use the standard 'zip' command available by default in any standard GNU/Linux or Unix distribution. Note that you can use any archiving command you want as long as it has a command line interface.
The Solution
As the design of our program is now reasonably stable, we can write the code which is an implementation of our solution.
Save as
import os
import time
# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we have to use double quotes inside a string
# for names with spaces in it. We could have also used
# a raw string by writing [r'C:\My Documents'].
# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to which folder you will be using
# 3. The files are backed up into a zip file.
# 4. The name of the zip archive is the current date and time
target = target_dir + os.sep + \
time.strftime('%Y%m%d%H%M%S') + '.zip'
# Create target directory if it is not present
if not os.path.exists(target_dir):
os.mkdir(target_dir) # make directory
# 5. We use the zip command to put the files in a zip archive
zip_command = 'zip -r {0} {1}'.format(target,
' '.join(source))
# Run the backup
print('Zip command is:')
print(zip_command)
print('Running:')
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')Output:
Now, we are in the testing phase where we test that our program works properly. If it doesn't behave as expected, then we have to debug our program i.e. remove the bugs (errors) from the program.Zip command is:zip -r /Users/swa/backup/20140328084844.zip /Users/swa/notes
Running:
adding: Users/swa/notes/ (stored 0%)
adding: Users/swa/notes/blah1.txt (stored 0%)
adding: Users/swa/notes/blah2.txt (stored 0%)
adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140328084844.zip
If the above program does not work for you, copy the line printed after the
How It Works
You will notice how we have converted our design into code in a step-by-step manner.
We make use of the
Notice the use of the
The
We create the name of the target zip file using the addition operator which concatenates the strings i.e. it joins the two strings together and returns a new one. Then, we create a string zip_command which contains the command that we are going to execute. You can check if this command works by running it in the shell (GNU/Linux terminal or DOS prompt).
The
Then, we finally run the command using the
Depending on the outcome of the command, we print the appropriate message that the backup has failed or succeeded.
That's it, we have created a script to take a backup of our important files!
Now that we have a working backup script, we can use it whenever we want to take a backup of the files. This is called the operation phase or the deployment phase of the software.
The above program works properly, but (usually) first programs do not work exactly as you expect. For example, there might be problems if you have not designed the program properly or if you have made a mistake when typing the code, etc. Appropriately, you will have to go back to the design phase or you will have to debug your program.
Second Version
The first version of our script works. However, we can make some refinements to it so that it can work better on a daily basis. This is called the maintenance phase of the software.
One of the refinements I felt was useful is a better file-naming mechanism - using the time as the name of the file within a directory with the current date as a directory within the main backup directory. The first advantage is that your backups are stored in a hierarchical manner and therefore it is much easier to manage. The second advantage is that the filenames are much shorter. The third advantage is that separate directories will help you check if you have made a backup for each day since the directory would be created only if you have made a backup for that day.
Save as 'backup_ver2.py':
import os
import time
# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.
# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to which folder you will be using
# Create target directory if it is not present
if not os.path.exists(target_dir):
os.mkdir(target_dir) # make directory
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')
# The name of the zip file
target = today + os.sep + now + '.zip'
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today)
print('Successfully created directory', today)
# 5. We use the zip command to put the files in a zip archive
zip_command = 'zip -r {0} {1}'.format(target,
' '.join(source))
# Run the backup
print('Zip command is:')
print(zip_command)
print('Running:')
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')Output:
Successfully created directory /Users/swa/backup/20140329
Zip command is:
zip -r /Users/swa/backup/20140329/073201.zip /Users/swa/notes
Running:
adding: Users/swa/notes/ (stored 0%)
adding: Users/swa/notes/blah1.txt (stored 0%)
adding: Users/swa/notes/blah2.txt (stored 0%)
adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140329/073201.zip
How It Works
Most of the program remains the same. The changes are that we check if there is a directory with the current day as its name inside the main backup directory using the 'os.path.exists' function. If it doesn't exist, we create it using the 'os.mkdir' function.
Third Version
The second version works fine when I do many backups, but when there are lots of backups, I am finding it hard to differentiate what the backups were for! For example, I might have made some major changes to a program or presentation, then I want to associate what those changes are with the name of the zip archive. This can be easily achieved by attaching a user-supplied comment to the name of the zip archive.
WARNING: The following program does not work, so do not be alarmed, please follow along because there's a lesson in here.
Save as 'backup_ver3.py':
import os
import time
# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.
# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to which folder you will be using
# Create target directory if it is not present
if not os.path.exists(target_dir):
os.mkdir(target_dir) # make directory
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')
# Take a comment from the user to
# create the name of the zip file
comment = input('Enter a comment --> ')
# Check if a comment was entered
if len(comment) == 0:
target = today + os.sep + now + '.zip'
else:
target = today + os.sep + now + '_' +
comment.replace(' ', '_') + '.zip'
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today)
print('Successfully created directory', today)
# 5. We use the zip command to put the files in a zip archive
zip_command = "zip -r {0} {1}".format(target,
' '.join(source))
# Run the backup
print('Zip command is:')
print(zip_command)
print('Running:')
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')Output:
File "backup_ver3.py", line 39
target = today + os.sep + now + '_' +
^
SyntaxError: invalid syntaxHow This (does not) Work
This program does not work! Python says there is a syntax error which means that the script does not satisfy the structure that Python expects to see. When we observe the error given by Python, it also tells us the place where it detected the error as well. So we start debugging our program from that line.
On careful observation, we see that the single logical line has been split into two physical lines but we have not specified that these two physical lines belong together. Basically, Python has found the addition operator ('
Fourth Version
Save as 'backup_ver4.py':
import os
import time
# 1. The files and directories to be backed up are
# specified in a list.
# Example on Windows:
# source = ['"C:\\My Documents"', 'C:\\Code']
# Example on Mac OS X and Linux:
source = ['/Users/swa/notes']
# Notice we had to use double quotes inside the string
# for names with spaces in it.
# 2. The backup must be stored in a
# main backup directory
# Example on Windows:
# target_dir = 'E:\\Backup'
# Example on Mac OS X and Linux:
target_dir = '/Users/swa/backup'
# Remember to change this to which folder you will be using
# Create target directory if it is not present
if not os.path.exists(target_dir):
os.mkdir(target_dir) # make directory
# 3. The files are backed up into a zip file.
# 4. The current day is the name of the subdirectory
# in the main directory.
today = target_dir + os.sep + time.strftime('%Y%m%d')
# The current time is the name of the zip archive.
now = time.strftime('%H%M%S')
# Take a comment from the user to
# create the name of the zip file
comment = input('Enter a comment --> ')
# Check if a comment was entered
if len(comment) == 0:
target = today + os.sep + now + '.zip'
else:
target = today + os.sep + now + '_' + \
comment.replace(' ', '_') + '.zip'
# Create the subdirectory if it isn't already there
if not os.path.exists(today):
os.mkdir(today)
print('Successfully created directory', today)
# 5. We use the zip command to put the files in a zip archive
zip_command = 'zip -r {0} {1}'.format(target,
' '.join(source))
# Run the backup
print('Zip command is:')
print(zip_command)
print('Running:')
if os.system(zip_command) == 0:
print('Successful backup to', target)
else:
print('Backup FAILED')Output:
Enter a comment --> added new examples
Zip command is:
zip -r /Users/swa/backup/20140329/074122_added_new_examples.zip /Users/swa/notes
Running:
adding: Users/swa/notes/ (stored 0%)
adding: Users/swa/notes/blah1.txt (stored 0%)
adding: Users/swa/notes/blah2.txt (stored 0%)
adding: Users/swa/notes/blah3.txt (stored 0%)
Successful backup to /Users/swa/backup/20140329/074122_added_new_examples.zip
How It Works
This program now works! Let us go through the actual enhancements that we had made in version 3. We take in the user's comments using the '
However, if a comment was supplied, then this is attached to the name of the zip archive just before the '.zip' extension. Notice that we are replacing spaces in the comment with underscores - this is because managing filenames without spaces is much easier.
More Refinements
The fourth version is a satisfactorily working script for most users, but there is always room for improvement. For example, you can include a verbosity level for the zip command by specifying a '-v' option to make your program become more talkative or a '-q' option to make it quiet.
Another possible enhancement would be to allow extra files and directories to be passed to the script at the command line. We can get these names from the 'sys.argv' list and we can add them to our 'source' list using the 'extend' method provided by the 'list' class.
The most important refinement would be to not use the 'os.system' way of creating archives and instead using the zipfile or tarfile built-in modules to create these archives. They are part of the standard library and available already for you to use without external dependencies on the zip program to be available on your computer.
However, I have been using the 'os.system' way of creating a backup in the above examples purely for pedagogical purposes, so that the example is simple enough to be understood by everybody but real enough to be useful.
Can you try writing the fifth version that uses the zipfile module instead of the 'os.system' call?
The Software Development Process
We have now gone through the various phases in the process of writing a software. These phases can be summarised as follows:
- What (Analysis)
- How (Design)
- Do It (Implementation)
- Test (Testing and Debugging)
- Use (Operation or Deployment)
- Maintain (Refinement)
A recommended way of writing programs is the procedure we have followed in creating the backup script: Do the analysis and design. Start implementing with a simple version. Test and debug it. Use it to ensure that it works as expected. Now, add any features that you want and continue to repeat the Do It-Test-Use cycle as many times as required.
Remember:
Summary
We have seen how to create our own Python programs/scripts and the various stages involved in writing such programs. You may find it useful to create your own program just like we did in this chapter so that you become comfortable with Python as well as problem-solving.
Next, we will discuss object-oriented programming.
Final Review: