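I. Basic Concepts of Multithreading

A thread is the smallest unit of execution that the operating system schedules, and every Python program starts with one main thread. Multithreaded programming lets a program make progress on several tasks concurrently, which can shorten total running time, especially when the tasks spend much of their time waiting.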
II. Creating and Starting Threads

1. Using the threading module

    import threading
    import time


    def task(name):
        print(f"Task {name} started")
        time.sleep(2)
        print(f"Task {name} completed")


    # Create two threads that run the same function with different arguments
    thread1 = threading.Thread(target=task, args=('A',))
    thread2 = threading.Thread(target=task, args=('B',))

    # Start both threads
    thread1.start()
    thread2.start()

    # Wait for both threads to finish
    thread1.join()
    thread2.join()

    print("All tasks completed")
2. Subclassing Thread

    import threading
    import time


    class MyThread(threading.Thread):
        def __init__(self, name):
            super().__init__()
            self.name = name  # Thread.name is a settable property

        def run(self):
            # run() executes in the new thread once start() is called
            print(f"Task {self.name} started")
            time.sleep(2)
            print(f"Task {self.name} completed")


    # Create the threads
    thread1 = MyThread('A')
    thread2 = MyThread('B')

    # Start the threads
    thread1.start()
    thread2.start()

    # Wait for the threads to finish
    thread1.join()
    thread2.join()

    print("All tasks completed")
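One practical reason to subclass Thread is to keep per-thread state, such as a result that is read after join(). The sketch below is only an illustration of that idea: the class name SquareThread and its result attribute are hypothetical and not part of the threading module.

```python
import threading


class SquareThread(threading.Thread):
    """Hypothetical example: computes value ** 2 in a separate thread."""

    def __init__(self, value):
        super().__init__()
        self.value = value
        self.result = None  # filled in by run()

    def run(self):
        self.result = self.value ** 2


workers = [SquareThread(n) for n in (2, 3, 4)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()  # once join() returns, run() has finished and result is set

results = [worker.result for worker in workers]
print(results)
```

Reading result only after join() matters: before that point the thread may not have run yet and the attribute would still be None.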
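III. Thread Synchronization

1. Lock

When multiple threads access a shared resource at the same time, the data can end up in an inconsistent state. A lock ensures that only one thread at a time executes the code that touches the shared resource: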
    import threading

    # Shared resource
    counter = 0
    # Lock that protects every update to counter
    lock = threading.Lock()


    def increment():
        global counter
        for _ in range(1000000):
            with lock:  # acquired here, released when the block exits
                counter += 1


    def decrement():
        global counter
        for _ in range(1000000):
            with lock:
                counter -= 1


    thread1 = threading.Thread(target=increment)
    thread2 = threading.Thread(target=decrement)

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

    # Every update was made under the lock, so this prints 0
    print(f"Final counter value: {counter}")
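The same idea can be packaged so that callers cannot forget to take the lock. The following is a minimal sketch, assuming a small wrapper class is acceptable in your code base; SafeCounter and worker are hypothetical names introduced for this illustration.

```python
import threading


class SafeCounter:
    """Hypothetical wrapper: every access to the value goes through one lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def add(self, amount):
        with self._lock:
            self._value += amount

    @property
    def value(self):
        with self._lock:
            return self._value


shared = SafeCounter()


def worker(amount, times):
    for _ in range(times):
        shared.add(amount)


threads = [
    threading.Thread(target=worker, args=(1, 10000)),
    threading.Thread(target=worker, args=(-1, 10000)),
    threading.Thread(target=worker, args=(5, 100)),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

final_value = shared.value
print(f"Final value: {final_value}")  # 10000 - 10000 + 5 * 100 = 500
```

Keeping the lock private to the class means the locking rule lives in one place instead of at every call site.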
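2. Condition

A condition variable lets threads communicate: a thread waits until another thread signals that a particular condition has become true, and only then continues: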
    import threading
    import time

    condition = threading.Condition()
    # Shared buffer
    data = []


    def producer():
        for i in range(5):
            time.sleep(1)
            with condition:
                data.append(i)
                print(f"Produced: {i}")
                # Wake up one waiting consumer
                condition.notify()


    def consumer():
        for _ in range(5):
            with condition:
                # Re-check in a loop: wait() may return while data is still empty
                while not data:
                    condition.wait()
                item = data.pop(0)
                print(f"Consumed: {item}")


    thread1 = threading.Thread(target=producer)
    thread2 = threading.Thread(target=consumer)

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

    print("All tasks completed")
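For a plain producer/consumer hand-off like this one, the standard library's queue.Queue does the locking and waiting internally. The sketch below swaps the Condition for a Queue, so it is an alternative rather than the same technique; using None as an end-of-stream sentinel is an assumption made for this example.

```python
import queue
import threading

tasks = queue.Queue()
consumed = []
SENTINEL = None  # assumption: None never appears as a real item


def producer():
    for i in range(5):
        tasks.put(i)
    tasks.put(SENTINEL)  # tell the consumer that nothing more is coming


def consumer():
    while True:
        item = tasks.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        consumed.append(item)


threads = [
    threading.Thread(target=producer),
    threading.Thread(target=consumer),
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Consumed: {consumed}")
```

A Condition is still the better fit when the thing being waited for is not simply "an item is available", for example a state change involving several variables.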
3. Semaphore

A semaphore limits how many threads can access a shared resource at once: several threads may hold it at the same time, up to a fixed maximum:
    import threading
    import time

    # Allow at most 2 threads inside the protected section at once
    semaphore = threading.Semaphore(2)


    def task(name):
        print(f"Task {name} waiting")
        with semaphore:
            print(f"Task {name} started")
            time.sleep(2)
            print(f"Task {name} completed")


    threads = []
    for i in range(5):
        thread = threading.Thread(target=task, args=(i,))
        threads.append(thread)
        thread.start()

    for thread in threads:
        thread.join()

    print("All tasks completed")
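To check that the limit is really enforced, one option is to record how many threads are inside the protected section at the same moment. This is a small sketch for illustration only; MAX_CONCURRENT, stats, and stats_lock are names made up for the example.

```python
import threading
import time

MAX_CONCURRENT = 2
semaphore = threading.Semaphore(MAX_CONCURRENT)

# Bookkeeping for the measurement, protected by its own lock
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()


def task():
    with semaphore:
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        time.sleep(0.05)  # simulate work while holding the semaphore
        with stats_lock:
            stats["active"] -= 1


threads = [threading.Thread(target=task) for _ in range(6)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(f"Peak concurrency: {stats['peak']} (limit {MAX_CONCURRENT})")
```

The peak can never exceed the value passed to Semaphore, no matter how many threads are started.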
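IV. Thread Pools

A thread pool reuses a fixed set of worker threads, which makes threads easier to manage and avoids the overhead of repeatedly creating and destroying them: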
    from concurrent.futures import ThreadPoolExecutor
    import time


    def task(name):
        print(f"Task {name} started")
        time.sleep(2)
        print(f"Task {name} completed")
        return f"Result of task {name}"


    # Create a pool with at most 3 worker threads
    with ThreadPoolExecutor(max_workers=3) as executor:
        # Submit the tasks; each submit() returns a Future
        futures = [executor.submit(task, i) for i in range(5)]

        # Collect the results in submission order
        for future in futures:
            result = future.result()  # blocks until that task has finished
            print(f"Received: {result}")

    print("All tasks completed")
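Reading futures in submission order means a slow early task delays the results of faster later ones. If results should be handled as soon as each task finishes, concurrent.futures.as_completed can be used instead. A minimal sketch, with the delays chosen arbitrarily for the illustration:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def task(delay):
    time.sleep(delay)
    return delay


delays = [0.3, 0.1, 0.2]
completion_order = []

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(task, delay) for delay in delays]
    # as_completed yields each future as soon as it is done
    for future in as_completed(futures):
        completion_order.append(future.result())

print(f"Finished in this order: {completion_order}")
```

With these delays the shortest task will normally be reported first, but the exact order depends on thread scheduling.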
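V. The GIL (Global Interpreter Lock)

CPython, the reference Python interpreter, has a global interpreter lock (GIL) that ensures only one thread executes Python bytecode at a time. As a result, even on a multi-core CPU, threads in a standard CPython build run Python code concurrently but not in parallel.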
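For CPU-bound tasks, multithreading may not improve performance and can even reduce it because of thread-switching overhead. For I/O-bound tasks it can help, because a thread releases the GIL while it waits for I/O, so other threads can keep running.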
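The I/O-bound case can be observed with a rough timing comparison. In this sketch time.sleep stands in for a blocking I/O call (it releases the GIL while waiting); fake_io and the 0.2 second delay are assumptions made for the illustration, and the measured times will vary from machine to machine.

```python
import threading
import time


def fake_io(delay=0.2):
    time.sleep(delay)  # stands in for a blocking I/O call


# Run four "I/O" calls one after another
start = time.perf_counter()
for _ in range(4):
    fake_io()
sequential = time.perf_counter() - start

# Run the same four calls in four threads
start = time.perf_counter()
threads = [threading.Thread(target=fake_io) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

The waits overlap in the threaded version, so it finishes in roughly the time of one call. Replacing fake_io with a pure-Python computation would not be expected to show a similar gain under the GIL.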
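VI. Advantages and Disadvantages of Multithreading

1. Advantages
Better responsiveness: while one thread waits for I/O, other threads can keep running
Better CPU utilization: for I/O-bound tasks, the CPU can do useful work in one thread while others are waiting
Simpler program structure: giving each thread one specific task can make the program's structure clearer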
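2. Disadvantages
Thread-safety issues: when multiple threads access a shared resource at the same time, data can become inconsistent
GIL limitation: in CPython, threads cannot execute Python bytecode truly in parallel
Harder debugging: multithreaded programs are harder to debug than single-threaded ones because the order in which threads run is not deterministic
Resource consumption: every thread needs some memory and CPU resources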
VII. Practical Examples

1. Downloading files concurrently

    import threading

    import requests  # third-party package: pip install requests


    def download_file(url, filename):
        print(f"Downloading {url}")
        # A timeout keeps a stalled server from blocking the thread forever
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # fail on HTTP errors instead of saving an error page
        with open(filename, 'wb') as f:
            f.write(response.content)
        print(f"Downloaded {filename}")


    # Files to download: (URL, local filename)
    files = [
        ('https://www.example.com', 'example1.html'),
        ('https://www.python.org', 'python.html'),
        ('https://www.google.com', 'google.html')
    ]

    # Start one thread per download
    threads = []
    for url, filename in files:
        thread = threading.Thread(target=download_file, args=(url, filename))
        threads.append(thread)
        thread.start()

    # Wait for all downloads to finish
    for thread in threads:
        thread.join()

    print("All downloads completed")
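With bare threads, an exception inside download_file is only reported by that thread and the main program never learns which URL failed. One possible way to handle this, sketched below, is to run the downloads in a thread pool and collect failures per URL. The helper names fetch and download_all are hypothetical, and fetch uses the standard library's urllib.request.urlopen as a stand-in for requests.get so that the sketch has no third-party dependency.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Return the response body as bytes (stand-in for requests.get)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, fetch_func=fetch, max_workers=3):
    """Hypothetical helper: returns (results, errors), both keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_url = {executor.submit(fetch_func, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                results[url] = future.result()  # re-raises the worker's exception
            except Exception as exc:
                # One failed URL should not stop the others
                errors[url] = exc
    return results, errors
```

A call such as `results, errors = download_all(['https://www.example.com', 'https://www.python.org'])` would then return the downloaded bodies and, separately, the exception raised for each URL that failed.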
2. Processing data concurrently

    from concurrent.futures import ThreadPoolExecutor
    import time


    def process_data(data):
        print(f"Processing data: {data}")
        time.sleep(1)  # simulate a slow operation
        return data * 2


    # Data to process
    data_list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    # Process the items with up to 4 worker threads
    with ThreadPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(process_data, data_list))

    print(f"Results: {results}")
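A detail worth knowing about executor.map is that it returns results in the order of the inputs, not in the order in which the tasks happen to finish. The sketch below illustrates this with deliberately uneven delays; slow_double and the (value, delay) pairs are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_double(item):
    value, delay = item
    time.sleep(delay)  # the first item takes the longest
    return value * 2


items = [(1, 0.3), (2, 0.1), (3, 0.2)]

with ThreadPoolExecutor(max_workers=3) as executor:
    doubled = list(executor.map(slow_double, items))

print(doubled)  # [2, 4, 6], matching the input order
```

This makes map convenient when each result has to be matched back to its input by position.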