A memory leak occurs when a program allocates memory on the heap and never releases it once it is no longer needed. It is a type of resource leak. As leaked memory accumulates, the process consumes more and more of the machine's memory, which slows down both the application and the machine. This is a serious issue when building large, scalable applications.
Requests: The requests library is a widely used third-party Python package for making HTTP requests to a specified URL. Whether you work with REST APIs or web scraping, requests is worth learning before going further with these technologies. When you make a request to a URI, it returns a response. The library provides built-in functionality for managing both the request and the response.
gc module: The gc module is a built-in Python module that provides an interface to Python's garbage collector. It lets you enable or disable the collector, tune the collection frequency, set debugging options, and more.
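As a quick illustration of that interface, the sketch below checks whether the collector is enabled, switches it off and on again, and reads the collection thresholds. It only uses functions from the standard gc module; the variable names are chosen for this example.

```python
import gc

# Remember whether automatic collection was on when we started.
was_enabled = gc.isenabled()

# Switch the automatic collector off and confirm the change.
gc.disable()
disabled_state = gc.isenabled()

# Switch it back on.
gc.enable()

# The thresholds control how often each generation is collected.
thresholds = gc.get_threshold()
print("Collection thresholds:", thresholds)

# Put the collector back into the state we found it in.
if not was_enabled:
    gc.disable()
```

The thresholds can be changed with gc.set_threshold() if the default collection frequency does not suit an application.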
In lower-level languages such as C and C++, the programmer must manually free resources that are no longer in use, i.e. write code to manage them. High-level languages such as Python and Java have an automatic memory manager known as the garbage collector, which manages the allocation and release of memory for an application.
Some gc functions that we will use are listed below.
- get_objects(): Returns a list of all objects tracked by the garbage collector, excluding the list being returned.
- collect(): Runs a collection and frees the unreferenced objects that the collector tracks. Some unreferenced objects are not freed immediately on their own, for example when they are part of a reference cycle, so they stay in memory until a collection runs.
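To make the behaviour of collect() concrete, here is a minimal sketch that needs no network access. The Node class and the function names are hypothetical and exist only for this illustration. Two objects reference each other, so they survive after the last outside reference is gone, and they are reclaimed only when collect() runs.

```python
import gc
import weakref


class Node:
    """Hypothetical helper class used only for this illustration."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Create two objects that reference each other; return a weak reference to one."""
    a, b = Node(), Node()
    a.other = b
    b.other = a
    return weakref.ref(a)


def demo():
    was_enabled = gc.isenabled()
    gc.collect()   # start from a clean state
    gc.disable()   # keep the automatic collector out of the way
    try:
        ref = make_cycle()
        alive_before = ref() is not None   # the cycle keeps both objects alive
        found = gc.collect()               # number of unreachable objects found
        alive_after = ref() is not None    # the objects are gone after collection
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, found, alive_after
```

Calling demo() returns whether the object was alive before the collection, how many unreachable objects collect() reported, and whether the object is still alive afterwards.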
We will use the get() method in requests, which returns a response object. When the response object is no longer referenced, i.e. deleted, its memory should be freed immediately, but due to its implementation the resource is not freed automatically. This is where it starts leaking memory.
Identify Memory Leak:
Approach:
- Get and store the number of objects tracked (created and alive) by the collector. You can use gc.get_objects() to get the list of tracked objects and then use the len() function to count them.
- Call the function that calls the requests.get() method.
- Print the response status code, so that we can confirm that the object was created.
- Return from the function. When the function returns, all the objects created within it should be deleted.
- Get the number of objects currently tracked and compare the value with the previous value to find leaked objects.
- If the current object count is greater, there is a memory leak.
Below is the implementation:
- Python3
import requests
import gc


def call():
    # Call get() with a URL; google.com is used here.
    # get() returns a response object.
    response = requests.get('https://google.com')

    # Print the status code of the response.
    print("Status code", response.status_code)

    # Once the function returns, the response
    # object is no longer referenced.
    return


def main():
    print("No.of tracked objects before calling get method")

    # gc.get_objects() returns a list of the objects
    # tracked by the collector; len() gives the count.
    print(len(gc.get_objects()))

    # Call the function that calls the get method.
    call()

    print("No.of tracked objects after calling get method")

    # Print the length of the object list again.
    print(len(gc.get_objects()))


if __name__ == "__main__":
    main()
Output:
No.of tracked objects before calling get method
16071
Status code 200
No.of tracked objects after calling get method
16158
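The two totals show that the count grew, but not which kinds of objects account for the growth. One possible way to find out is to count tracked objects by type name before and after the call. This is a sketch, not part of the original program: type_counts(), growth(), Payload, _kept and fake_call() are hypothetical names, and fake_call() stands in for the real network call so the example runs offline.

```python
import gc
from collections import Counter


def type_counts():
    """Count the objects tracked by the collector, grouped by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def growth(before, after):
    """Return only the types whose count increased, with the size of the increase."""
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


class Payload:
    """Hypothetical stand-in for whatever a real call leaves behind."""


_kept = []  # hypothetical list that keeps the stand-in objects alive


def fake_call():
    """Stand-in for the network call: leaves five Payload objects behind."""
    _kept.extend(Payload() for _ in range(5))
```

To use it, take type_counts() before and after the call under investigation and pass both results to growth(). With the stand-in above, the result includes an entry for Payload; with a real call, the entries point at the types worth investigating.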
Fix Memory Leak:
A simple solution is to call the gc.collect() method manually. This frees the unused resources immediately.
Approach:
- Get and store the number of objects tracked (created and alive) by the collector. You can use gc.get_objects() to get the list of tracked objects and then use the len() function to count them.
- Call the function that calls the requests.get() method.
- Print the response status code, so that we can confirm that the object was created.
- Return from the function. When the function returns, all the objects created within it should be deleted.
- Call the garbage collector to free the unused resources, i.e. call the gc.collect() method.
- Get the number of objects currently tracked. The count is now lower, because the unused resources have been cleaned up.
Below is the implementation:
- Python3
import requests
import gc


def call():
    # Call get() with a URL; google.com is used here.
    # get() returns a response object.
    response = requests.get('https://google.com')

    # Print the status code of the response.
    print("Status code", response.status_code)

    # Once the function returns, the response
    # object is no longer referenced.
    return


def main():
    print("No.of tracked objects before calling get method")

    # gc.get_objects() returns a list of the objects
    # tracked by the collector; len() gives the count.
    print(len(gc.get_objects()))

    # Call the function that calls the get method.
    call()

    # collect() immediately frees the resources of
    # objects that are no longer referenced.
    gc.collect()

    # Print the length of the object list after
    # the unreferenced objects have been removed.
    print("No.of tracked objects after removing non-referenced objects")
    print(len(gc.get_objects()))


if __name__ == "__main__":
    main()
Output:
No.of tracked objects before calling get method
16071
Status code 200
No.of tracked objects after removing non-referenced objects
15954
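The effect of gc.collect() on the count can also be observed without a network connection. In the sketch below, Handle, work() and measure() are hypothetical stand-ins: work() plays the role of the call and leaves behind objects that reference themselves. Automatic collection is disabled during the measurement, which is an assumption made here only to keep the numbers repeatable.

```python
import gc


class Handle:
    """Hypothetical object that ends up in a reference cycle."""

    def __init__(self):
        self.me = self   # self-reference: reference counting alone cannot free it


def work():
    """Stand-in for the network call: creates garbage that sits in cycles."""
    for _ in range(50):
        Handle()


def measure():
    """Return the tracked-object count before the call, after it, and after collect()."""
    was_enabled = gc.isenabled()
    gc.collect()
    gc.disable()   # assumption: keeps the numbers repeatable
    try:
        before = len(gc.get_objects())
        work()
        after_call = len(gc.get_objects())
        gc.collect()   # the fix: reclaim the unreachable objects now
        after_collect = len(gc.get_objects())
    finally:
        if was_enabled:
            gc.enable()
    return before, after_call, after_collect
```

The three values returned by measure() correspond to the counts taken in the approach above: the count rises after the call and drops again once collect() has run.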