Summary: in this tutorial, you will learn how to use Python Semaphore to control the number of threads that can access a shared resource simultaneously.
Introduction to the Python Semaphore
A Python semaphore is a synchronization primitive that allows you to control access to a shared resource. Basically, a semaphore is a counter associated with a lock that limits the number of threads that can access a shared resource simultaneously.
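A semaphore helps prevent thread synchronization issues such as race conditions, where multiple threads access the resource at the same time and interfere with each other’s operations. Note that a semaphore only caps how many threads are inside the protected section at once: with an initial count of one it behaves like a lock, while with a larger count the threads it admits still run concurrently with each other.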
A semaphore maintains a count. When a thread wants to access the shared resource, the semaphore checks the count.
If the count is greater than zero, the semaphore decreases the count and lets the thread access the resource. If the count is zero, the semaphore blocks the thread until the count becomes greater than zero.
A semaphore has two main operations:
- Acquire: the acquire operation checks the count and decrements it if it is greater than zero. If the count is zero, the semaphore blocks the thread until another thread releases the semaphore.
- Release: the release operation increments the count, which allows a waiting thread to acquire the semaphore.
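The two operations can be observed without starting any threads. The following is a minimal sketch, assuming a semaphore with an initial count of two and using non-blocking acquires (blocking=False) so that a failed acquire returns False instead of waiting; the variable names are illustrative only:

```python
import threading

sem = threading.Semaphore(2)  # internal counter starts at 2

first = sem.acquire(blocking=False)   # counter 2 -> 1, returns True
second = sem.acquire(blocking=False)  # counter 1 -> 0, returns True
third = sem.acquire(blocking=False)   # counter is 0: returns False instead of blocking

sem.release()                          # counter 0 -> 1
fourth = sem.acquire(blocking=False)  # counter 1 -> 0, returns True again

results = [first, second, third, fourth]
print(results)  # [True, True, False, True]
```

With the default blocking=True, the third call would wait until some other thread called release().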
Using a Python semaphore
To use a semaphore, follow these steps:
First, import the threading module:
    import threading
Second, create a Semaphore object and specify the number of threads that can acquire it at the same time:
    semaphore = threading.Semaphore(3)
In this example, we create a Semaphore object that allows up to three threads to acquire it at the same time.
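As a small illustration of the constructor argument: when it is omitted, the initial count defaults to 1, and a negative value raises a ValueError. The variable names in this sketch are made up for the example:

```python
import threading

# With no argument, the initial count defaults to 1,
# so the semaphore admits one thread at a time.
single = threading.Semaphore()

# An explicit initial count of 3 admits up to three threads at once.
triple = threading.Semaphore(3)

# A negative initial count is rejected with ValueError.
try:
    threading.Semaphore(-1)
    negative_rejected = False
except ValueError:
    negative_rejected = True

print(negative_rejected)  # True
```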
Third, acquire the semaphore from a thread by calling the acquire() method:
    semaphore.acquire()
If the semaphore count is zero, the thread waits until another thread releases the semaphore. Once the thread has acquired the semaphore, it can execute the critical section of code.
Finally, release the semaphore after running the critical section of code by calling the release() method:
    semaphore.release()
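When acquire() and release() are called by hand, a common pattern is to put the release in a finally clause so the count is restored even if the critical section raises. A minimal sketch, assuming a hypothetical use_resource() function and a list that stands in for the shared resource:

```python
import threading

semaphore = threading.Semaphore(3)
log = []  # stand-in for a shared resource


def use_resource(name):
    semaphore.acquire()
    try:
        # Critical section: at most three threads are here at once.
        log.append(name)
    finally:
        # Runs on both normal exit and exceptions,
        # so the semaphore count is always restored.
        semaphore.release()


for i in range(5):
    use_resource(f"task-{i}")

print(len(log))  # 5
```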
To ensure a semaphore is properly acquired and released, even if an exception occurs while running the critical section, you can use the with statement:
    with semaphore:
        # Code within this block has acquired the semaphore
        # Perform operations on the shared resource
        # ...
        pass  # placeholder so the block is valid Python
    # The semaphore is released outside the with block
The with statement acquires and releases the semaphore automatically, making your code less error-prone.
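To see that the with statement releases the semaphore even when the block fails, here is a small sketch. The risky() function and the simulated error are assumptions made for illustration:

```python
import threading

semaphore = threading.Semaphore(1)


def risky():
    with semaphore:
        # The semaphore is held here (count is 0).
        raise ValueError("simulated failure")


try:
    risky()
except ValueError:
    pass

# If the with block had not released the semaphore,
# this non-blocking acquire would return False.
released = semaphore.acquire(blocking=False)
print(released)  # True
```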
Python semaphore example
The following example illustrates how to use a semaphore to limit the maximum number of concurrent downloads to three using multithreading in Python:
    import threading
    import urllib.request

    MAX_CONCURRENT_DOWNLOADS = 3
    semaphore = threading.Semaphore(MAX_CONCURRENT_DOWNLOADS)


    def download(url):
        with semaphore:
            print(f"Downloading {url}...")
            response = urllib.request.urlopen(url)
            data = response.read()
            print(f"Finished downloading {url}")
            return data


    def main():
        # URLs to download
        urls = [
            'https://www.ietf.org/rfc/rfc791.txt',
            'https://www.ietf.org/rfc/rfc792.txt',
            'https://www.ietf.org/rfc/rfc793.txt',
            'https://www.ietf.org/rfc/rfc794.txt',
            'https://www.ietf.org/rfc/rfc795.txt',
        ]

        # Create threads for each download
        threads = []
        for url in urls:
            thread = threading.Thread(target=download, args=(url,))
            threads.append(thread)
            thread.start()

        # Wait for all threads to complete
        for thread in threads:
            thread.join()


    if __name__ == '__main__':
        main()
Output:
    Downloading https://www.ietf.org/rfc/rfc791.txt...
    Downloading https://www.ietf.org/rfc/rfc792.txt...
    Downloading https://www.ietf.org/rfc/rfc793.txt...
    Finished downloading https://www.ietf.org/rfc/rfc792.txt
    Downloading https://www.ietf.org/rfc/rfc794.txt...
    Finished downloading https://www.ietf.org/rfc/rfc791.txt
    Downloading https://www.ietf.org/rfc/rfc795.txt...
    Finished downloading https://www.ietf.org/rfc/rfc793.txt
    Finished downloading https://www.ietf.org/rfc/rfc794.txt
    Finished downloading https://www.ietf.org/rfc/rfc795.txt
The output shows that at most three threads download at the same time:
    Downloading https://www.ietf.org/rfc/rfc791.txt...
    Downloading https://www.ietf.org/rfc/rfc792.txt...
    Downloading https://www.ietf.org/rfc/rfc793.txt...
Once the number of threads reaches three, the next thread needs to wait for the semaphore to be released by another thread.
For example, the following shows that thread #2 (downloading rfc792.txt) completed and released the semaphore, after which the next waiting thread started downloading the URL https://www.ietf.org/rfc/rfc794.txt:
    Finished downloading https://www.ietf.org/rfc/rfc792.txt
    Downloading https://www.ietf.org/rfc/rfc794.txt...
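The interleaving in the output depends on network timing, so it differs from run to run. To check the limit without relying on the network, the following sketch replaces the download with a short sleep and records the highest number of threads that were inside the semaphore at once. The worker() function, the counters, and the sleep duration are assumptions made for this illustration:

```python
import threading
import time

MAX_CONCURRENT = 3
semaphore = threading.Semaphore(MAX_CONCURRENT)

counter_lock = threading.Lock()  # protects the two counters below
active = 0  # threads currently inside the semaphore
peak = 0    # highest value of active seen so far


def worker():
    global active, peak
    with semaphore:
        with counter_lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.05)  # simulated download
        with counter_lock:
            active -= 1


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Peak concurrency: {peak}")
```

Because every thread must hold the semaphore while it is counted as active, the recorded peak cannot exceed MAX_CONCURRENT.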
How the program works.
First, import the threading and urllib.request modules:
    import threading
    import urllib.request
Second, create a Semaphore object that limits the number of threads that can download at the same time to three:
    MAX_CONCURRENT_DOWNLOADS = 3
    semaphore = threading.Semaphore(MAX_CONCURRENT_DOWNLOADS)
Third, define the download() function that downloads from a URL. The download() function acquires and releases the semaphore using the with statement. It also uses the urllib.request module to download data from a URL:
    def download(url):
        with semaphore:
            print(f"Downloading {url}...")
            response = urllib.request.urlopen(url)
            data = response.read()
            print(f"Finished downloading {url}")
            return data
Fourth, define the main() function that creates five threads based on a URL list and starts them to download data:
    def main():
        # URLs to download
        urls = [
            'https://www.ietf.org/rfc/rfc791.txt',
            'https://www.ietf.org/rfc/rfc792.txt',
            'https://www.ietf.org/rfc/rfc793.txt',
            'https://www.ietf.org/rfc/rfc794.txt',
            'https://www.ietf.org/rfc/rfc795.txt',
        ]

        # Create threads for each download
        threads = []
        for url in urls:
            thread = threading.Thread(target=download, args=(url,))
            threads.append(thread)
            thread.start()

        # Wait for all threads to complete
        for thread in threads:
            thread.join()
Finally, call the main() function in the if __name__ == '__main__' block:
    if __name__ == '__main__':
        main()
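One detail worth noting: download() returns the data, but threading.Thread discards the return value of its target function. The following is a minimal sketch of one way to keep the results, assuming a shared dictionary guarded by a lock. The fetch() function is a hypothetical stand-in for the urllib.request call so the sketch runs without network access, and the example.com URLs are placeholders:

```python
import threading

semaphore = threading.Semaphore(3)
results = {}                     # maps each URL to its downloaded bytes
results_lock = threading.Lock()  # protects the results dictionary


def fetch(url):
    # Hypothetical stand-in for urllib.request.urlopen(url).read()
    return f"contents of {url}".encode()


def download(url):
    # Only the download itself is limited by the semaphore.
    with semaphore:
        data = fetch(url)
    # Store the result instead of returning it.
    with results_lock:
        results[url] = data


urls = [f"https://example.com/file{i}.txt" for i in range(5)]
threads = [threading.Thread(target=download, args=(url,)) for url in urls]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(len(results))  # 5
```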