Vulnerability Change Records for CVE-2024-26628

Change History

CVE Translated by kernel.org 3/20/2024 1:15:07 PM

Action Type Old Value New Value
Removed Translation
Title (translated from Spanish): Linux kernel
Description (translated from Spanish):
In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Fix lock dependency warning

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock:
ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
       __mutex_lock+0x97/0xd30
       kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
       kfd_ioctl+0x1b2/0x5d0 [amdgpu]
       __x64_sys_ioctl+0x86/0xc0
       do_syscall_64+0x39/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
       down_read+0x42/0x160
       svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
       __lock_acquire+0x1426/0x2200
       lock_acquire+0xc1/0x2b0
       __flush_work+0x80/0x550
       __cancel_work_timer+0x109/0x190
       svm_range_bo_release+0xdc/0x1c0 [amdgpu]
       svm_range_free+0x175/0x180 [amdgpu]
       svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.

CVE Modified by kernel.org 3/20/2024 1:15:07 PM

Action Type Old Value New Value
Changed Description

Old Value:
In the Linux kernel, the following vulnerability has been resolved:

drm/amdkfd: Fix lock dependency warning

======================================================
WARNING: possible circular locking dependency detected
6.5.0-kfd-fkuehlin #276 Not tainted
------------------------------------------------------
kworker/8:2/2676 is trying to acquire lock:
ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550

but task is already holding lock:
ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #2 (&svms->lock){+.+.}-{3:3}:
       __mutex_lock+0x97/0xd30
       kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu]
       kfd_ioctl+0x1b2/0x5d0 [amdgpu]
       __x64_sys_ioctl+0x86/0xc0
       do_syscall_64+0x39/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #1 (&mm->mmap_lock){++++}-{3:3}:
       down_read+0x42/0x160
       svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

-> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}:
       __lock_acquire+0x1426/0x2200
       lock_acquire+0xc1/0x2b0
       __flush_work+0x80/0x550
       __cancel_work_timer+0x109/0x190
       svm_range_bo_release+0xdc/0x1c0 [amdgpu]
       svm_range_free+0x175/0x180 [amdgpu]
       svm_range_deferred_list_work+0x15d/0x340 [amdgpu]
       process_one_work+0x27a/0x540
       worker_thread+0x53/0x3e0
       kthread+0xeb/0x120
       ret_from_fork+0x31/0x50
       ret_from_fork_asm+0x11/0x20

other info that might help us debug this:

Chain exists of:
  (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&svms->lock);
                               lock(&mm->mmap_lock);
                               lock(&svms->lock);
  lock((work_completion)(&svm_bo->eviction_work));

I believe this cannot really lead to a deadlock in practice, because
svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO
refcount is non-0. That means it's impossible that svm_range_bo_release
is running concurrently. However, there is no good way to annotate this.
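
The warning above is the result of a cycle in a lock-order graph. As a rough illustration only (this is not how lockdep is implemented in the kernel), the reported chain can be modelled as directed edges "held, then acquired" and checked for a cycle with a depth-first search. The node names, the `find_cycle` helper, and the two edge lists are hypothetical and exist only for this sketch; the second edge list assumes the `cancel_work_sync` call in the release path is gone.

```python
from collections import defaultdict

# Hypothetical labels for the three lock classes named in the report.
WORK = "(work_completion)(&svm_bo->eviction_work)"
MMAP = "&mm->mmap_lock"
SVMS = "&svms->lock"

# Each edge means "the first lock was held while the second was acquired".
EDGES_WITH_CANCEL = [
    (WORK, MMAP),  # #1: the eviction worker takes mmap_lock for reading
    (MMAP, SVMS),  # #2: svms->lock is taken under mmap_lock
    (SVMS, WORK),  # #0: the release path flushes the work under svms->lock
]
# Assumption: without cancel_work_sync the SVMS -> WORK edge does not exist.
EDGES_WITHOUT_CANCEL = EDGES_WITH_CANCEL[:2]


def find_cycle(edges):
    """Return one cycle as a list of nodes (first == last), or None."""
    graph = defaultdict(list)
    for held, acquired in edges:
        graph[held].append(acquired)

    visiting, done, path = set(), set(), []

    def dfs(node):
        visiting.add(node)
        path.append(node)
        for nxt in graph[node]:
            if nxt in visiting:
                # Back edge: the cycle is the tail of the current path.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                found = dfs(nxt)
                if found:
                    return found
        visiting.discard(node)
        done.add(node)
        path.pop()
        return None

    for start in list(graph):
        if start not in done:
            found = dfs(start)
            if found:
                return found
    return None
```

Calling `find_cycle(EDGES_WITH_CANCEL)` finds a cycle through all three lock classes, while `find_cycle(EDGES_WITHOUT_CANCEL)` returns `None`. Such a graph records only acquisition order, not the refcount condition described above, which is why the report appears even though the author believes a real deadlock cannot occur.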

To avoid the problem, take a BO reference in
svm_range_schedule_evict_svm_bo instead of in the worker. That way it's
impossible for a BO to get freed while eviction work is pending and the
cancel_work_sync call in svm_range_bo_release can be eliminated.

v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also
removed redundant checks that are already done in
amdkfd_fence_enable_signaling.
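
The idea of taking the reference when the work is scheduled, rather than inside the worker, can be sketched in user-space Python. This is a simplified model under stated assumptions, not the amdgpu code: `BufferObject`, `schedule_evict`, `evict_worker`, and `work_queue` are hypothetical names, a plain lock stands in for the kernel's atomic refcount, and setting `released` stands in for the release path.

```python
import threading
from collections import deque


class BufferObject:
    """Toy refcounted object; starts with one reference held by its owner."""

    def __init__(self):
        self._lock = threading.Lock()
        self.refcount = 1
        self.released = False

    def ref_unless_zero(self):
        """Take a reference only if the object is still alive."""
        with self._lock:
            if self.refcount == 0:
                return False
            self.refcount += 1
            return True

    def put(self):
        """Drop a reference; the last put runs the release path."""
        with self._lock:
            self.refcount -= 1
            last = self.refcount == 0
        if last:
            # Release path: no pending work can exist here, because any
            # queued work item still holds its own reference.
            self.released = True


work_queue = deque()  # stands in for a kernel workqueue


def schedule_evict(bo):
    """Queue eviction work, pinning the object for the work's lifetime."""
    if not bo.ref_unless_zero():
        return False  # object is already being released; nothing to evict
    work_queue.append(bo)
    return True


def evict_worker():
    """Run one queued eviction and drop the reference taken at schedule time."""
    bo = work_queue.popleft()
    assert not bo.released  # guaranteed by the reference held since scheduling
    # ... eviction would happen here ...
    bo.put()
```

In this model, if the owner drops its reference while work is queued, the object stays alive until `evict_worker` calls `put`, so the release path never needs to wait for or cancel pending work.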

New Value:
Rejected reason: This CVE ID has been rejected or withdrawn by its CVE Numbering Authority.

Removed Reference
kernel.org https://git.kernel.org/stable/c/28d2d623d2fbddcca5c24600474e92f16ebb3a05

Removed Reference
kernel.org https://git.kernel.org/stable/c/47bf0f83fc86df1bf42b385a91aadb910137c5c9

Removed Reference
kernel.org https://git.kernel.org/stable/c/7a70663ba02bd4e19aea8d70c979eb3bd03d839d

Removed Reference
kernel.org https://git.kernel.org/stable/c/8b25d397162b0316ceda40afaa63ee0c4a97d28b

Removed Reference
kernel.org https://git.kernel.org/stable/c/cb96e492d72d143d57db2d2bc143a1cee8741807

CVE Rejected by kernel.org 3/20/2024 1:15:07 PM

Action Type Old Value New Value