Mutex Contention Tanking Your QPS? A Look from the GMP Perspective

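Preface

"Lao Wang, why can't our service's QPS go any higher? Adding a lock actually made it slower!" Xiao Li, a backend engineer, looked anxious.

I checked the monitoring and found that lock wait time accounted for more than 50% of the total. "Your lock contention is too heavy!"

"Lock contention? Isn't it just a matter of adding a lock?"

Looks like we need to start from GMP, the Go scheduler's model of goroutines (G), OS threads (M), and logical processors (P). Today we'll talk about how Mutex contention affects performance.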

1. Under the Hood

1.1 How Mutex Contention Affects GMP

When a goroutine tries to acquire a Mutex that is already held:

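graph TD
    A["G1 holds the lock"] --> B["G1 is working"]
    C["G2 requests the lock"] --> D{"Lock available?"}
    D -->|No| E["G2 blocks"]
    E --> F["G2 is removed from P"]
    F --> G["P schedules other Gs"]
    F --> H["G2 enters the wait queue"]
    A --> I["G1 releases the lock"]
    I --> J["Wake G2"]
    J --> K["G2 re-enters a P's run queue"]
    K --> L["G2 waits for P to schedule it"]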

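Key effects:

G2 gives up its P when it blocks (on multi-core machines it spins briefly first, and parks only if that fails)
The P goes on to schedule other Gs
After being woken, G2 still has to queue for a P, and in normal mode it must also compete with newly arriving goroutines for the lock
This park, wake, and reschedule round trip costs far more than an uncontended lock operation, which is a single atomic compare-and-swap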
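One way to check whether goroutines really are parking on a lock is Go's built-in mutex profile. The sketch below is a minimal illustration, not the setup from the story above: `contendedStacks` and the sleep-inside-the-lock workload are assumptions made up for the example, while `runtime.SetMutexProfileFraction` and `pprof.Lookup("mutex")` are standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contendedStacks runs a deliberately contended workload and returns the
// number of distinct call stacks recorded in the mutex profile.
// The function name and the workload are illustrative assumptions.
func contendedStacks() int {
	// Sample every contention event; a real service would use a larger value.
	prev := runtime.SetMutexProfileFraction(1)
	defer runtime.SetMutexProfileFraction(prev)

	var mu sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 5; j++ {
				mu.Lock()
				time.Sleep(time.Millisecond) // hold the lock so the others must wait
				mu.Unlock()
			}
		}()
	}
	wg.Wait()

	return pprof.Lookup("mutex").Count()
}

func main() {
	fmt.Println("contended stacks:", contendedStacks())
}
```

In a long-running service the same profile is usually exposed through `net/http/pprof` and inspected with `go tool pprof`, which shows where the waiting time accumulates.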

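1.2 Comparing Lock Modes

Lock mode	Pros	Cons
Normal mode	High throughput	Unfair: a woken waiter can lose the lock to newly arriving goroutines
Starvation mode	Fair: the lock is handed directly to the waiter at the head of the queue	Lower throughput
RWMutex (a separate type, not a Mutex mode)	Concurrent reads	Writes are exclusive

sync.Mutex switches from normal mode to starvation mode when a waiter has failed to acquire the lock for more than 1ms.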

2. Quick Start

Here is a typical lock contention scenario:

    package main

    import (
        "fmt"
        "sync"
        "time"
    )

    func main() {
        var mu sync.Mutex
        counter := 0
        var wg sync.WaitGroup

        start := time.Now()
        // 1000 goroutines all increment one counter behind one lock.
        for i := 0; i < 1000; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := 0; j < 10000; j++ {
                    mu.Lock()
                    counter++
                    mu.Unlock()
                }
            }()
        }
        wg.Wait()
        fmt.Printf("Mutex: %v, count: %d\n", time.Since(start), counter)
    }

With this many goroutines fighting over a single lock, performance is dismal.

An improved version using sharding:

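    import "sync"

    // ShardedCounter spreads increments over 32 independently locked shards.
    type ShardedCounter struct {
        shards [32]struct {
            mu    sync.Mutex
            value int64
        }
    }

    // Inc increments the shard selected by key.
    func (sc *ShardedCounter) Inc(key int) {
        idx := uint(key) % 32 // the uint conversion keeps the index valid for negative keys
        sc.shards[idx].mu.Lock()
        sc.shards[idx].value++
        sc.shards[idx].mu.Unlock()
    }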
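The sharded counter above has no way to read the combined value, and neighboring shards can end up on the same CPU cache line. The sketch below shows one possible way to address both. `PaddedShardedCounter`, `Total`, and `countConcurrently` are hypothetical names introduced for illustration, and the 64-byte cache line is an assumption about the target CPU.

```go
package main

import (
	"fmt"
	"sync"
)

const numShards = 32

// paddedShard pads each shard to 64 bytes so that neighboring shards are less
// likely to share a cache line. The 64-byte line size is an assumption.
type paddedShard struct {
	mu    sync.Mutex
	value int64
	_     [48]byte // sync.Mutex (8 bytes) + int64 (8 bytes) + 48 = 64 bytes
}

// PaddedShardedCounter is a hypothetical variant of ShardedCounter.
type PaddedShardedCounter struct {
	shards [numShards]paddedShard
}

// Inc increments the shard selected by key; negative keys are handled
// by converting to uint before taking the remainder.
func (c *PaddedShardedCounter) Inc(key int) {
	s := &c.shards[uint(key)%numShards]
	s.mu.Lock()
	s.value++
	s.mu.Unlock()
}

// Total sums all shards. Shards are locked one at a time, so the result is
// an exact total only when no increments are in flight.
func (c *PaddedShardedCounter) Total() int64 {
	var sum int64
	for i := range c.shards {
		s := &c.shards[i]
		s.mu.Lock()
		sum += s.value
		s.mu.Unlock()
	}
	return sum
}

// countConcurrently is a small driver that exercises the counter.
func countConcurrently(goroutines, perGoroutine int) int64 {
	var c PaddedShardedCounter
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				c.Inc(id) // each goroutine sticks to one shard
			}
		}(g)
	}
	wg.Wait()
	return c.Total()
}

func main() {
	fmt.Println(countConcurrently(100, 1000)) // prints 100000
}
```

Keying by goroutine ID keeps each goroutine on a single shard. With more goroutines than shards some sharing remains, but far less than with one global lock.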

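3. Core APIs / Going Deeper

3.1 Quick Reference: Techniques for Reducing Lock Contention

Technique	How	Effect
Shrink the critical section	Lock only the code that needs it	Significant
Read/write separation	RWMutex	Good for read-heavy workloads
Sharded locks	Multiple locks	Significant
Lock-free	atomic	Best, for simple values such as counters and flags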

3.2 Using RWMutex Correctly

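    import "sync"

    type Cache struct {
        mu   sync.RWMutex
        data map[string]string
    }

    // NewCache initializes the map; writing to a nil map in Set would panic.
    func NewCache() *Cache {
        return &Cache{data: make(map[string]string)}
    }

    // Get takes the read lock, so concurrent readers do not block one another.
    func (c *Cache) Get(key string) string {
        c.mu.RLock()
        defer c.mu.RUnlock()
        return c.data[key]
    }

    // Set takes the write lock, which excludes all readers and other writers.
    func (c *Cache) Set(key, val string) {
        c.mu.Lock()
        defer c.mu.Unlock()
        c.data[key] = val
    }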

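In read-heavy, highly concurrent workloads, RWMutex can outperform a plain Mutex because readers do not block one another. The gain is largest when reads hold the lock for a meaningful amount of time. With very short critical sections, RLock's own bookkeeping can eat much of the benefit, so measure before switching.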
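To make the sharing rule concrete, here is a small probe of what RWMutex allows while one reader holds the read lock. It is only a sketch: `probe` is a hypothetical helper, and `TryRLock` and `TryLock` need Go 1.18 or later. They are used here purely to observe the lock state, not as a recommended locking pattern.

```go
package main

import (
	"fmt"
	"sync"
)

// probe reports what happens while one reader already holds the read lock:
// whether a second reader is admitted, and whether a writer is admitted.
func probe() (secondReaderOK, writerOK bool) {
	var mu sync.RWMutex

	mu.RLock() // first reader
	defer mu.RUnlock()

	if mu.TryRLock() { // a second reader shares the lock
		secondReaderOK = true
		mu.RUnlock()
	}
	if mu.TryLock() { // a writer needs exclusive access
		writerOK = true
		mu.Unlock()
	}
	return secondReaderOK, writerOK
}

func main() {
	r, w := probe()
	fmt.Printf("second reader admitted: %v, writer admitted: %v\n", r, w)
	// prints "second reader admitted: true, writer admitted: false"
}
```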

3.3 Replacing Locks with Atomic Operations

For simple counters, use atomic:

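    import "sync/atomic"

    type Counter struct {
        value int64
    }

    // Inc adds 1 atomically; no lock is taken and the goroutine never parks.
    func (c *Counter) Inc() {
        atomic.AddInt64(&c.value, 1)
    }

    // Get reads the current value atomically.
    func (c *Counter) Get() int64 {
        return atomic.LoadInt64(&c.value)
    }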
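On Go 1.19 or later, the same counter can be written with the typed `atomic.Int64`, which rules out an accidental plain read or write of the field. This is a sketch under that version assumption; `TypedCounter` and `hammer` are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// TypedCounter is a hypothetical variant of Counter built on atomic.Int64
// (Go 1.19+). The value can only be touched through atomic methods.
type TypedCounter struct {
	value atomic.Int64
}

func (c *TypedCounter) Inc()       { c.value.Add(1) }
func (c *TypedCounter) Get() int64 { return c.value.Load() }

// hammer increments one counter from many goroutines and returns the result.
func hammer(goroutines, perGoroutine int) int64 {
	var c TypedCounter
	var wg sync.WaitGroup
	for g := 0; g < goroutines; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < perGoroutine; j++ {
				c.Inc()
			}
		}()
	}
	wg.Wait()
	return c.Get()
}

func main() {
	fmt.Println(hammer(100, 1000)) // prints 100000
}
```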

4. Hands-On Practice

Compare how the different locking schemes behave under high concurrency:

    package main

    import (
        "fmt"
        "sync"
        "sync/atomic"
        "time"
    )

    type AtomicCounter struct {
        value int64
    }

    func (c *AtomicCounter) Inc() {
        atomic.AddInt64(&c.value, 1)
    }

    type MutexCounter struct {
        mu    sync.Mutex
        value int64
    }

    func (c *MutexCounter) Inc() {
        c.mu.Lock()
        c.value++
        c.mu.Unlock()
    }

    type ShardedCounter struct {
        shards [64]shard
    }

    type shard struct {
        mu    sync.Mutex
        value int64
    }

    func (c *ShardedCounter) Inc(key int) {
        s := &c.shards[key%64]
        s.mu.Lock()
        s.value++
        s.mu.Unlock()
    }

    func main() {
        n := 1000            // number of goroutines
        iterations := 100000 // increments per goroutine
        var wg sync.WaitGroup

        // atomic
        ac := &AtomicCounter{}
        start := time.Now()
        for i := 0; i < n; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := 0; j < iterations; j++ {
                    ac.Inc()
                }
            }()
        }
        wg.Wait()
        fmt.Printf("Atomic: %v\n", time.Since(start))

        // Mutex
        mc := &MutexCounter{}
        start = time.Now()
        for i := 0; i < n; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := 0; j < iterations; j++ {
                    mc.Inc()
                }
            }()
        }
        wg.Wait()
        fmt.Printf("Mutex: %v\n", time.Since(start))

        // Sharded locks: the loop index picks the shard, so each goroutine
        // cycles through all 64 shards.
        sc := &ShardedCounter{}
        start = time.Now()
        for i := 0; i < n; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := 0; j < iterations; j++ {
                    sc.Inc(j)
                }
            }()
        }
        wg.Wait()
        fmt.Printf("Sharded: %v\n", time.Since(start))
    }
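A single wall-clock measurement like the one above also includes goroutine startup and can be noisy. As a hedged alternative, the sketch below runs two of the variants through `testing.Benchmark` with `b.RunParallel`, which picks the iteration count automatically and reports nanoseconds per operation. The lower-case counter types and the `benchMutex`, `benchAtomic`, and `measure` functions are hypothetical names for this example. No numbers are quoted here because results depend on the machine.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"testing"
)

type mutexCounter struct {
	mu    sync.Mutex
	value int64
}

func (c *mutexCounter) Inc() {
	c.mu.Lock()
	c.value++
	c.mu.Unlock()
}

type atomicCounter struct {
	value int64
}

func (c *atomicCounter) Inc() { atomic.AddInt64(&c.value, 1) }

// benchMutex measures a mutex-protected increment under parallel load.
func benchMutex(b *testing.B) {
	c := &mutexCounter{}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			c.Inc()
		}
	})
}

// benchAtomic measures the same increment done with an atomic add.
func benchAtomic(b *testing.B) {
	c := &atomicCounter{}
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			c.Inc()
		}
	})
}

// measure runs both benchmarks outside of "go test" via testing.Benchmark.
func measure() (mutexRes, atomicRes testing.BenchmarkResult) {
	return testing.Benchmark(benchMutex), testing.Benchmark(benchAtomic)
}

func main() {
	m, a := measure()
	fmt.Printf("mutex:  %d ns/op over %d iterations\n", m.NsPerOp(), m.N)
	fmt.Printf("atomic: %d ns/op over %d iterations\n", a.NsPerOp(), a.N)
}
```

`b.RunParallel` starts GOMAXPROCS goroutines by default, so this measures contention among running goroutines rather than the cost of scheduling 1000 of them.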

5. Pitfalls and Best Practices

💡 **Tip: shrink the critical section**
The shorter a lock is held, the less contention there is.
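As an illustration of this tip, the sketch below stores a SHA-256 digest in a shared map in two ways: one hashes while holding the lock, the other hashes first and locks only for the map write. `Digests`, `PutWide`, and `PutNarrow` are hypothetical names, and hashing simply stands in for any slow work that does not touch shared state.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// Digests is a hypothetical store of hex-encoded SHA-256 digests keyed by name.
type Digests struct {
	mu sync.Mutex
	m  map[string]string
}

func NewDigests() *Digests { return &Digests{m: make(map[string]string)} }

// PutWide holds the lock while hashing, so every caller waits on the hash.
func (d *Digests) PutWide(name string, payload []byte) {
	d.mu.Lock()
	defer d.mu.Unlock()
	sum := sha256.Sum256(payload) // slow work inside the critical section
	d.m[name] = hex.EncodeToString(sum[:])
}

// PutNarrow hashes first and takes the lock only for the map write.
func (d *Digests) PutNarrow(name string, payload []byte) {
	sum := sha256.Sum256(payload) // slow work happens without the lock
	digest := hex.EncodeToString(sum[:])

	d.mu.Lock()
	d.m[name] = digest
	d.mu.Unlock()
}

// Get returns the stored digest for name, if any.
func (d *Digests) Get(name string) (string, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	v, ok := d.m[name]
	return v, ok
}

func main() {
	d := NewDigests()
	d.PutWide("a", []byte("hello"))
	d.PutNarrow("b", []byte("hello"))
	a, _ := d.Get("a")
	b, _ := d.Get("b")
	fmt.Println(a == b) // prints true: both variants store the same digest
}
```

Both methods produce the same result. The difference is that `PutNarrow` lets other goroutines use the map while the hash is being computed.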

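⚠️ **Warning: don't guard read-mostly data with a plain Mutex**
Use RWMutex so that readers don't block one another. Data that is never modified after initialization needs no lock at all.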

✅ **Recommended: if atomic will do, skip the lock**
An atomic operation never parks the goroutine, so it incurs no scheduling overhead.

6. Putting It All Together

A sharded cache that reduces lock contention:

    package main

    import (
        "fmt"
        "sync"
    )

    const shardCount = 256

    type CacheShard struct {
        items map[string]int
        mu    sync.RWMutex
    }

    type ConcurrentCache struct {
        shards [shardCount]*CacheShard
    }

    func NewCache() *ConcurrentCache {
        c := &ConcurrentCache{}
        for i := range c.shards {
            c.shards[i] = &CacheShard{
                items: make(map[string]int),
            }
        }
        return c
    }

    // getShard hashes the key to pick a shard. The hash is unsigned, so it
    // wraps on overflow and the index can never be negative.
    func (c *ConcurrentCache) getShard(key string) *CacheShard {
        var hash uint32
        for _, b := range key {
            hash = hash*31 + uint32(b)
        }
        return c.shards[hash%shardCount]
    }

    func (c *ConcurrentCache) Get(key string) (int, bool) {
        s := c.getShard(key)
        s.mu.RLock()
        val, ok := s.items[key]
        s.mu.RUnlock()
        return val, ok
    }

    func (c *ConcurrentCache) Set(key string, val int) {
        s := c.getShard(key)
        s.mu.Lock()
        s.items[key] = val
        s.mu.Unlock()
    }

    func main() {
        cache := NewCache()
        var wg sync.WaitGroup
        for i := 0; i < 100; i++ {
            wg.Add(1)
            go func() {
                defer wg.Done()
                for j := 0; j < 100000; j++ {
                    key := fmt.Sprintf("key_%d_%d", j/100, j%100)
                    cache.Set(key, j)
                    cache.Get(key)
                }
            }()
        }
        wg.Wait()
        fmt.Println("sharded cache done")
    }
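Sharding only helps if keys spread evenly across shards. As one possible alternative to the hand-rolled hash in `getShard`, the sketch below uses the standard library's 32-bit FNV-1a from `hash/fnv` and prints how 100000 sample keys are distributed. `shardIndex` and the key format are assumptions for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

const shards = 256

// shardIndex maps a key to a shard number in [0, shards) using 32-bit FNV-1a.
func shardIndex(key string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return h.Sum32() % shards
}

func main() {
	// Count how many of the sample keys land on each shard.
	counts := make([]int, shards)
	for i := 0; i < 100000; i++ {
		counts[shardIndex(fmt.Sprintf("key_%d", i))]++
	}

	lo, hi := counts[0], counts[0]
	for _, c := range counts {
		if c < lo {
			lo = c
		}
		if c > hi {
			hi = c
		}
	}
	fmt.Printf("keys per shard: min=%d max=%d\n", lo, hi)
}
```

A large gap between the minimum and maximum would mean a few shards carry most of the traffic, which brings the contention right back.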