Data File Handling

Data stored in variables and lists is temporary — it’s lost when the program terminates. Python allows a program to read data from a file or write data to a file. Once the data is saved in a file on computer disk, it will remain there after the program stops running. The data can then be retrieved and used at a later time.

There are two types of files in Python - text files and binary files.

A text file is processed as a sequence of characters. A special end-of-line character (\n in Python) divides the file into lines. You can also think of the file as ending with a special end-of-file marker that follows the last character. A big advantage of text files is that they can be opened and viewed in a text editor such as Notepad.

A binary file stores data that has not been translated into character form. Binary files typically use the same bit patterns to represent data as those used to represent the data in the computer's main memory. These files are called binary files because the only thing they have in common is that they store the data as sequences of zeros and ones.
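
The difference between the two file types can be seen by storing the same number both ways. The following is only a sketch: the file names number.txt and number.bin, the temporary folder, and the use of the standard struct module to produce a 4-byte integer are assumptions made for illustration.

```python
import os
import shutil
import struct
import tempfile

# Work in a temporary folder so no real files are touched (assumption of this sketch).
folder = tempfile.mkdtemp()
text_path = os.path.join(folder, 'number.txt')
bin_path = os.path.join(folder, 'number.bin')

number = 1234567

# Text file: the number is translated into the characters '1', '2', ..., '7'.
outfile = open(text_path, 'w')
outfile.write(str(number))
outfile.close()

# Binary file: the number is stored as a 4-byte integer, not as characters.
outfile = open(bin_path, 'wb')
outfile.write(struct.pack('<i', number))
outfile.close()

text_size = os.path.getsize(text_path)   # one byte per digit
bin_size = os.path.getsize(bin_path)     # always 4 bytes for this format

# Reading the binary file back gives the original number.
infile = open(bin_path, 'rb')
restored = struct.unpack('<i', infile.read())[0]
infile.close()

print(text_size, bin_size, restored)
shutil.rmtree(folder)   # remove the temporary folder
```

The text version can be read in Notepad, while the binary version would show up there as unreadable symbols.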

Steps to Process File Input/Output in Python

Step 1. Open the file — Opening a file creates a connection between the file and the program.
Step 2. Process the file — In this step data is either written to the file (if it is an output file) or read from the file (if it is an input file).
Step 3. Close the file — When the program is finished using the file, the file must be closed. Closing a file disconnects the file from the program.

open() Function

The open function is used in Python to open a file. The open function returns a file object and associates it with a file on the disk. Here is the general format:

variable = open(filename, mode)

File Mode   Description
'r'         Opens the file in read-only mode. The file must already exist. This is the default mode.
'rb'        Opens the file in binary and read-only mode.
'r+'        Opens the file in both read and write mode. The file must already exist; its contents are not erased.
'w'         Opens the file in write mode. If the file already exists, its contents are erased. If the file doesn’t exist, a new file is created.
'wb+'       Opens the file in read, write and binary mode. If the file already exists, its contents are erased. If the file doesn’t exist, a new file is created.
'a'         Opens the file in append mode. New data is written at the end of the file. If the file doesn’t exist, a new file is created.
'a+'        Opens the file in append and read mode. If the file doesn’t exist, a new file is created.
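
The effect of the most common modes can be checked with a small experiment. This is a sketch under assumptions: the file names modes_demo.txt and missing.txt and the temporary folder are made up for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file names are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'modes_demo.txt')

f = open(path, 'w')      # 'w' creates the file because it does not exist yet
f.write('first\n')
f.close()

f = open(path, 'w')      # 'w' again: the old contents are erased
f.write('second\n')
f.close()

f = open(path, 'a')      # 'a' keeps the contents and writes at the end
f.write('third\n')
f.close()

f = open(path, 'r')      # 'r' only reads
contents = f.read()
f.close()
print(contents)

# 'r' does not create files: opening a missing file raises FileNotFoundError.
missing_error = False
try:
    open(os.path.join(folder, 'missing.txt'), 'r')
except FileNotFoundError:
    missing_error = True
print(missing_error)

shutil.rmtree(folder)    # remove the temporary folder
```

Only 'second' and 'third' remain in the file, because the second open() in 'w' mode erased 'first'.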

Writing Data to a Text File

The write() method: The write() method takes a string as an argument and writes it to the text file. It does not add a new line on its own, which is why each string below ends with \n. The following program writes four lines to the text file oceans.txt.

Program (oceans.py)

outfile = open('oceans.txt', 'w') #Step 1
outfile.write('Atlantic\n') #Step 2
outfile.write('Pacific\n')
outfile.write('Indian\n')
outfile.write('Arctic\n')
outfile.close() #Step 3

Output

Running this program creates a new text file 'oceans.txt' in the current working directory. If you run the program from the folder where you saved 'oceans.py', the text file is stored in that same folder.

Now let us discuss the above program step by step:

Step 1

outfile = open('oceans.txt', 'w') #Step 1

The open function creates the file 'oceans.txt' (or empties it if it already exists) and returns a file object, which is assigned to outfile. 'w' is the file mode used for writing to a text file.

Step 2

outfile.write('Atlantic\n') #Step 2

The above statement writes the contents of the string to the file.

Step 3

outfile.close()

outfile.close() closes the file and immediately frees any system resources used by it. Closing also makes sure that any data still waiting in memory is actually written to the disk.

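The writelines() method: The writelines() method takes an iterable as its argument (for example, a tuple or a list). Each item in the iterable is expected to be a string. writelines() does not add new lines between the items, so each string must end with \n if it should appear on its own line. The above program can be written using the writelines() method: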

outfile = open('oceans.txt', 'w')
text = ('Atlantic\n', 'Pacific\n', 'Indian\n', 'Arctic\n')
outfile.writelines(text)
outfile.close()

Note: if you want to write a single string you can do this with write(). If you have a sequence of strings you can write them all using writelines().
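
When the strings in a list do not already end with \n, one possible approach is to add it while writing. The sketch below assumes a list called names and a file called oceans_demo.txt in a temporary folder; both are made up for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

names = ['Atlantic', 'Pacific', 'Indian', 'Arctic']   # no '\n' at the end

outfile = open(path, 'w')
# writelines() writes the items exactly as given, so '\n' is added to each one here.
outfile.writelines(name + '\n' for name in names)
outfile.close()

infile = open(path, 'r')
written = infile.read()
infile.close()
print(written)

shutil.rmtree(folder)   # remove the temporary folder
```

Without the added '\n', all four names would end up joined together on a single line.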

Relative and Absolute Path

Relative Path: A relative path is a path that is relative to the current working directory on your computer. In Program (oceans.py) we used the relative path of oceans.txt in the open function. Therefore, the text file 'oceans.txt' is created in the current working directory, which is the folder where 'oceans.py' is saved when the program is run from there.

Absolute Path: An absolute path contains the entire path to the file that you need to access. On Windows this path begins with the drive letter and ends with the file that you wish to access. Example:

C:\Users\91981\Desktop\oceans.txt

Program (absolute.py)

outfile = open(r'C:\Users\91981\Desktop\oceans.txt', 'w')
outfile.write('Atlantic\n')
outfile.write('Pacific\n')
outfile.write('Indian\n')
outfile.write('Arctic\n')
outfile.close()

In the above program we have put r before the path string. This is because, in Python, a backslash inside a string introduces a special character (for example, \n means a new line). Here we want the backslashes to be actual backslashes. r stands for "raw" and causes backslashes in the string to be kept as they are rather than treated as the start of a special character.
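
The effect of the r prefix can be checked without opening any file. In this sketch the path C:\new\table.txt is a made-up example chosen because it contains \n and \t; the variable names are also only illustrative.

```python
# The same path written with doubled backslashes and as a raw string.
plain = 'C:\\Users\\91981\\Desktop\\oceans.txt'
raw = r'C:\Users\91981\Desktop\oceans.txt'
print(plain == raw)        # both spellings produce the same string

# Hypothetical path that goes wrong without r: \n becomes a new line, \t a tab.
tricky = 'C:\new\table.txt'
raw_tricky = r'C:\new\table.txt'
print(len(tricky), len(raw_tricky))
print('\n' in tricky, '\n' in raw_tricky)
```

The string without r is two characters shorter, because each of \n and \t was turned into a single special character.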

Append Data to a File

Program (appendocean.py)

outfile = open('oceans.txt', 'a') #'a' For Append
outfile.write('Southern\n')
outfile.close()

Reading Data From a File

To open a file in read mode, use the 'r' mode in the open() function. You can use the read(), readline() or readlines() methods for reading data from a file.

The read([n]) method: The read() method reads and returns a string of up to n characters, or the entire file as a single string if n is not provided.

Program (readfile1.py)

infile = open('oceans.txt', 'r')
data = infile.read()
print(data)
infile.close()

Output

Atlantic
Pacific
Indian
Arctic
Southern

Program (readfile2.py)

infile = open('oceans.txt', 'r')
data = infile.read(5)
print(data)
infile.close()

Output

Atlan

In the above programs the open() function opens the oceans.txt file for reading, using the 'r' mode. The infile.read() method reads the file content into memory as a string, which is then assigned to the data variable.
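
Each call to read(n) continues from where the previous call stopped. The following sketch illustrates this; the file oceans_demo.txt and its two-line content are assumptions made for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

outfile = open(path, 'w')
outfile.write('Atlantic\nPacific\n')
outfile.close()

infile = open(path, 'r')
first = infile.read(5)     # first five characters
second = infile.read(5)    # the next five, including the '\n'
rest = infile.read()       # everything that is left
at_end = infile.read()     # nothing left: an empty string
infile.close()

print(repr(first), repr(second), repr(rest), repr(at_end))
shutil.rmtree(folder)      # remove the temporary folder
```

An empty string returned by read() is the signal that the end of the file has been reached.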

Reading File Using readline() method

In Python you can use the readline() method to read a line from a file. A line is simply a string of characters terminated with a \n. The method returns the line as a string, including the \n. When the end of the file has been reached, readline() returns an empty string. The following program uses the readline method to read the contents of the oceans.txt file, one line at a time.

Program (readline_demo.py)

infile = open('oceans.txt', 'r')
line1 = infile.readline()
line2 = infile.readline()
line3 = infile.readline()
print(line1)
print(line2)
print(line3)
infile.close()

Output

Atlantic

Pacific

Indian

rstrip()

The output of the above program shows extra blank lines between the names. This is because the readline() method returns the string including the \n, and print() then adds a new line of its own. In many cases you want to remove the \n from a string after it is read from a file. In Python the rstrip() method removes specific characters from the end of a string.

Program (readfile3.py)

infile = open('oceans.txt', 'r')
line1 = infile.readline().rstrip('\n')
line2 = infile.readline().rstrip('\n')
line3 = infile.readline().rstrip('\n')
print(line1)
print(line2)
print(line3)
infile.close()

Output

Atlantic
Pacific
Indian
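
When the number of lines is not known in advance, readline() and rstrip('\n') can be combined in a loop that stops at the empty string returned at the end of the file. This is a sketch only; the file oceans_demo.txt, its content (which deliberately contains an empty line) and the list lines_read are assumptions made for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

outfile = open(path, 'w')
outfile.write('Atlantic\n\nPacific\n')   # the second line is empty
outfile.close()

lines_read = []
infile = open(path, 'r')
line = infile.readline()
while line != '':                        # '' means end of file
    lines_read.append(line.rstrip('\n')) # remove only the trailing '\n'
    line = infile.readline()
infile.close()

print(lines_read)
shutil.rmtree(folder)                    # remove the temporary folder
```

An empty line inside the file is read as '\n', not as '', so it does not end the loop. Note also that rstrip() without an argument removes all trailing whitespace, including spaces and tabs, while rstrip('\n') removes only new line characters.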

The readlines() method

The readlines() method returns a list of strings, each representing a single line of the file, including its trailing \n. An optional size hint can be passed to limit how much is read; if it is not provided, all lines of the file are returned.
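
A short sketch of readlines() in use follows. The file oceans_demo.txt, its three-line content and the variable names are assumptions made for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

outfile = open(path, 'w')
outfile.write('Atlantic\nPacific\nIndian\n')
outfile.close()

infile = open(path, 'r')
lines = infile.readlines()     # every line, each still ending with '\n'
infile.close()

# Remove the trailing '\n' from each item.
clean = [line.rstrip('\n') for line in lines]

print(lines)
print(clean)
shutil.rmtree(folder)          # remove the temporary folder
```

Because readlines() loads every line into memory at once, it is best suited to files of modest size.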

Using a for Loop to Read All Lines from a File

Program (readfile4.py)

file = open('oceans.txt', 'r')

for line in file:
    line = line.rstrip()
    print(line)

file.close()

Output

Atlantic
Pacific
Indian
Arctic
Southern

The first time the loop iterates, the line variable references the first line in the file; the second time, it references the second line, and so forth. Remember that the line variable initially contains the trailing new line character, e.g. "Atlantic\n", which is why the program calls rstrip() before printing.
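
The same loop can also number the lines with the built-in enumerate() function. This is a sketch under assumptions: the file oceans_demo.txt, its content and the list numbered are made up for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

outfile = open(path, 'w')
outfile.write('Atlantic\nPacific\nIndian\n')
outfile.close()

numbered = []
infile = open(path, 'r')
# enumerate() supplies a counter alongside each line; start=1 begins counting at 1.
for number, line in enumerate(infile, start=1):
    numbered.append(str(number) + ': ' + line.rstrip('\n'))
infile.close()

for entry in numbered:
    print(entry)

shutil.rmtree(folder)   # remove the temporary folder
```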

The tell() and seek() methods

Python provides the seek() and tell() methods for random access, that is, for moving directly to any position in a file instead of reading it from start to finish.

The tell() method returns the current file position in a file stream. Syntax:

file_object.tell()

The seek() method sets the current file position in a file stream and returns the new position. Syntax:

file_object.seek(offset [, reference_point])

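where offset is the number of positions to move, and reference_point defines the point from which the offset is measured (0 if omitted):
0: means your reference point is the beginning of the file
1: means your reference point is the current file position
2: means your reference point is the end of the file
In a file opened in text mode, only offsets from the beginning of the file are allowed, apart from seek(0, 1) and seek(0, 2). To move relative to the current position or the end of the file, open the file in binary mode.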

infile = open("testfile.txt", "r")
print("Initially, the position is: ", infile.tell())
infile.seek(10)    # move to position 10, counted from the beginning
print("Position after moving 10 bytes is:", infile.tell())
infile.seek(0, 2)  # move to the end of the file
print("Last position of file: ", infile.tell())
infile.close()

Output (assuming testfile.txt is 55 bytes long)

Initially, the position is:  0
Position after moving 10 bytes is: 10
Last position of file:  55
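
Reference points 1 and 2 with a non-zero offset require a file opened in binary mode, because text mode only accepts offsets measured from the beginning. The sketch below shows all three reference points; the file oceans_demo.txt and its 17-byte content are assumptions made for the example.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'oceans_demo.txt')

outfile = open(path, 'wb')
outfile.write(b'Atlantic\nPacific\n')   # 17 bytes in total
outfile.close()

infile = open(path, 'rb')               # binary mode allows all reference points

infile.seek(9)                          # 9 bytes from the beginning
word = infile.read(7)                   # reads b'Pacific'
pos_after_read = infile.tell()          # reading moved the position forward

new_pos = infile.seek(-8, 2)            # 8 bytes before the end of the file
from_end = infile.read(7)

infile.seek(-7, 1)                      # 7 bytes back from the current position
part = infile.read(3)

infile.close()
print(word, pos_after_read, new_pos, from_end, part)
shutil.rmtree(folder)                   # remove the temporary folder
```

seek() returns the new position, which is why new_pos can be stored directly from the call.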

Opening a File Using the with Statement

When you use the with statement with the open function, you do not need to close the file yourself, because with closes it automatically when the block ends, even if an error occurs inside the block.

with open('somefile.txt', "w") as outfile:
    outfile.write("Some text\n")
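
That the file really is closed can be checked through the closed attribute of the file object. In this sketch the file somefile_demo.txt, the temporary folder and the deliberately raised ValueError are assumptions made for illustration.

```python
import os
import shutil
import tempfile

# Temporary folder and file name are assumptions made for this sketch.
folder = tempfile.mkdtemp()
path = os.path.join(folder, 'somefile_demo.txt')

with open(path, 'w') as outfile:
    outfile.write('Some text\n')
closed_after_with = outfile.closed      # the block has ended, so the file is closed

# The file is also closed when an error interrupts the block.
try:
    with open(path, 'r') as infile:
        first_line = infile.readline()
        raise ValueError('simulated problem while processing')
except ValueError:
    pass
closed_after_error = infile.closed

print(closed_after_with, closed_after_error, repr(first_line))
shutil.rmtree(folder)                   # remove the temporary folder
```

With a plain open() and close() pair, the error in the second part would have skipped the close() call unless extra error handling were written around it.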

