Make the Linux kernel ReBAR-over-Thunderbolt friendly
Here's a suggestion for the kernel devs, now that Thunderbolt eGPUs have become more common: make the Linux kernel ReBAR-over-Thunderbolt friendly.
The current behavior is this: BAR 2's hardware register powers up at 256 MB — the default size programmed into the BAR's address decoder by Intel at the factory. The PCIe Resizable BAR capability advertises support for up to 16 GB, but it's passive — software must explicitly exercise it. When a Thunderbolt eGPU is hotplugged at runtime, the kernel's PCI subsystem enumerates the new device, reads the BAR at its 256 MB default, sizes the bridge windows to match, and assigns addresses — all before any driver loads. The ReBAR capability is never consulted(!) during this process.
The current workaround is thunderbolt.host_reset=0, which preserves the BIOS's PCIe tunnel and BAR assignments from POST (where the BIOS does exercise ReBAR). This delivers the full 16 GB BAR but only works for cold-plug(!) scenarios — if the eGPU is power-cycled at runtime, the new tunnel gets the 256 MB default.
The proper fix would be for the kernel's PCI hotplug resource assignment to first check for ReBAR capability during enumeration, resize the BAR to the largest supported size that fits within available bridge headroom, and then commit bridge windows and assign addresses. This is essentially what the BIOS does during POST. It hasn't been implemented yet because eGPU-over-Thunderbolt-with-ReBAR is (was?) a niche use case.
Well, no more niche use case. eGPU-over-Thunderbolt is becoming mainstream.
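The resize step proposed above can be sketched as follows. This is a minimal illustration in Python, not kernel code: the function name and the size list are assumptions, the two example windows are borrowed from the dmesg posted in the comments, and the model covers only a single BAR that must be naturally aligned to its own size inside the bridge window (other BARs, the ROM, and sibling bridges are ignored).

```python
def largest_fitting_bar(window_start, window_end, supported_sizes):
    """Return (size, base) for the largest supported BAR size that fits.

    window_start / window_end are inclusive byte addresses of the bridge
    window. A BAR must be naturally aligned, so the base is rounded up to a
    multiple of the candidate size before checking that it still fits.
    Returns None if no supported size fits.
    """
    for size in sorted(supported_sizes, reverse=True):
        base = (window_start + size - 1) // size * size  # align up
        if base + size - 1 <= window_end:
            return size, base
    return None


MiB = 1 << 20
GiB = 1 << 30

# Hypothetical device: supports every power-of-two size from 256 MiB to 16 GiB.
sizes = [256 * MiB << i for i in range(7)]

# 384 MiB window: only the 256 MiB size fits once alignment is applied.
print(largest_fitting_bar(0x6028000000, 0x603FFFFFFF, sizes))

# 32 GiB window: the full 16 GiB size fits.
print(largest_fitting_bar(0x5010000000, 0x580FFFFFFF, sizes))
```

A real implementation would also have to grow the upstream bridge windows before committing the new size, which is the part that is hard to do after enumeration has already assigned addresses.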
Comments
-
Adding empirical data on Linux 6.17 + ReBAR-over-TB4, with detailed dmesg of the
allocator's failure mode. May be useful for whoever picks up the fix.

Setup

- Framework 13 (Intel Core Ultra 5 125H, BIOS INSYDE 3.06): TB4 host, no Above 4G Decoding / Resizable BAR BIOS toggles exposed.
- Razer Core X V2: Intel Barlow Ridge 4-port USB4-v2 hub. Three downstream ports empty, one populated by the GPU.
- MSI RTX 5060 Ti 16G: advertises ReBAR sizes 64 MB through 16 GB (per lspci -vvv and /sys/bus/pci/devices/.../resource1_resize = 0x7fc0).
- Ubuntu 24.04, kernel 6.17.0-22-generic, NVIDIA 580.142-open.
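For anyone comparing their own hardware, the resource1_resize value can be decoded by hand. In the sysfs resourceN_resize bitmask, bit n set means a BAR size of 2^n MB is supported (bit 0 = 1 MB). A small Python sketch, with helper names made up for illustration:

```python
def decode_rebar_mask(mask):
    """List the BAR sizes (in MB) encoded in a resourceN_resize bitmask."""
    return [1 << bit for bit in range(mask.bit_length()) if mask >> bit & 1]


def human(mb):
    """Format a size given in MB as MB or GB."""
    return f"{mb // 1024} GB" if mb >= 1024 else f"{mb} MB"


supported = decode_rebar_mask(0x7FC0)
print(", ".join(human(s) for s in supported))
# prints: 64 MB, 128 MB, 256 MB, 512 MB, 1 GB, 2 GB, 4 GB, 8 GB, 16 GB
```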
Failure mode
GPU BAR 1 caps at 256 MB on every cmdline I tried. Standard guidance
(pci=realloc=on, pci=hpmmioprefsize=N) does not lift this on this
topology; see comparison below.

What the allocator does (dmesg, current boot, with cmdline hints)
    pci 0000:00:07.2: PCI bridge to [bus 55-7e]
    pci 0000:00:07.2: bridge window [mem 0x5010000000-0x580fffffff 64bit pref]    # initial 32 GB
    [V2 enumerated, 4 downstream sibling bridges]
    pcieport 0000:00:07.2: Assigned bridge window [mem 0x6028000000-0x784fffffff 64bit pref] to [bus 55-7e] cannot fit 0x1830000000 required for 0000:55:00.0 bridging to [bus 56-7e]
    pcieport 0000:00:07.2: bridge window [mem 0x91000000-0x97ffffff]: failed to expand by 0x38000000
    pcieport 0000:00:07.2: bridge window [mem 0x91000000-0x97ffffff]: failed to add optional 38000000
    pci 0000:55:00.0: Assigned bridge window [mem 0x6028000000-0x784fffffff 64bit pref] to [bus 56-7e] cannot fit 0x30000000 required for 0000:56:00.0 bridging to [bus 57]
    pci 0000:56:00.0: bridge window [mem 0x00000000-0x17ffffff 64bit pref] to [bus 57] requires relaxed alignment rules
    pci 0000:56:00.0: bridge window [mem 0x08000000-0x1fffffff 64bit pref] to [bus 57] add_size c100000 add_align 10000000
    pci 0000:56:00.0: bridge window [mem size 0x24100000 64bit pref]: can't assign; no space
    pci 0000:56:00.0: bridge window [mem size 0x24100000 64bit pref]: failed to assign
    pci 0000:56:00.0: bridge window [mem 0x6028000000-0x603fffffff 64bit pref]: assigned
    pci 0000:56:00.0: bridge window [mem 0x6028000000-0x603fffffff 64bit pref]: failed to expand by 0xc100000
    pci 0000:56:00.0: bridge window [mem 0x6028000000-0x603fffffff 64bit pref]: failed to add optional c100000
    pci 0000:57:00.0: BAR 1 [mem 0x6030000000-0x603fffffff 64bit pref]: assigned

The kernel knows 56:00.0 (the populated downstream bridge) needs to grow.
It tries to add 0xc100000 (193 MB) more on top of its existing 384 MB and
fails because the parent (55:00.0) has no slack: the three empty sibling
bridges already consumed it.

Final bridge windows
| Bridge | Populated | Prefetchable window |
|---|---|---|
| 00:07.2 (host root port) | yes | 96 GB |
| 55:00.0 (V2 inbound) | yes | 96 GB (passed through) |
| 56:00.0 (populated, GPU) | YES | 384 MB ← starved |
| 56:01.0 (empty) | no | 32 GB |
| 56:02.0 (empty) | no | 32 GB |
| 56:03.0 (empty) | no | 32 GB |

GPU BAR 1 sizes to 256 MB, the largest power-of-2 that fits inside the 384 MB
window after subtracting BAR 3 (32 MB) + ROM + alignment.

Pristine vs. patched comparison
Two boots on the same hardware; only the cmdline differs. Same eGPU, same
GPU, same physical port.

- Pristine: default Ubuntu cmdline, no pci= parameters.
- Patched: pci=realloc=on pci=hpmmiosize=256M,hpmmioprefsize=32G (the flags this thread typically points users toward).
| Bridge / BAR | Pristine | Patched |
|---|---|---|
| 00:07.2 window (host root port) | 32 GB | 96 GB |
| 55:00.0 window (V2 inbound) | 32 GB | 96 GB |
| 56:00.0 window (populated, GPU) | 384 MB | 384 MB |
| Empty sibling windows (each) | 10.5 GB | 32 GB |
| GPU BAR 1 | 256 MB | 256 MB |

The hints grow everything except the bridge that actually has a 16 GB BAR
request behind it. The redistribution pass triggered by pci=realloc=on cannot
recover from this: by the time it tries to expand the populated bridge, the
empty siblings have already been allocated their hint-sized windows.

Where the fix probably lives
The hot-plug bridge sizing path applies the pci=hpmmioprefsize hint
uniformly across all hot-pluggable downstream ports of a switch, regardless of
whether anything is currently behind those ports. When ReBAR is in play, this
is exactly backwards: the populated port (whose downstream device advertises a
large ReBAR) should get a larger share than the empty ports, not the same share.

A reasonable heuristic: when distributing prefetchable budget across siblings
of a hot-plug switch, weight by the maximum ReBAR size each downstream device
advertises (default 0 for empty ports, the normal hint for cold-pluggable
unknown topologies). This would let the populated port absorb most of the
parent's window and leave each empty port with the minimum needed for future
hot-plug events.

What I tried that didn't work (for completeness)
- pci=realloc=on (alone or with size hints): no effect on populated bridge.
- pci=hpmmiosize=256M,hpmmioprefsize=32G: grows empties, populated unchanged.
- Live PCI rescan via sysfs (echo 1 > .../remove, then echo 1 > /sys/bus/pci/rescan): leaves parent bridge with prefetchable window disabled. GPU doesn't re-enumerate cleanly. Worse, not better.
- NVIDIA driver upgrade 570 → 580: driver requests resize on a window that the allocator has already locked at 384 MB. No effect.
- Cold boot with eGPU pre-attached: no help. BIOS doesn't size for ReBAR on this platform (Framework BIOS lacks the toggle), so the kernel sees the same starting state as a hotplug.
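
To make the weighting heuristic from "Where the fix probably lives" concrete, here is a toy Python model. It is not a patch against the kernel allocator: the function and parameter names are invented, the per-port reserve for empty ports is an arbitrary assumption, and alignment is ignored. Empty ports get a fixed reserve; populated ports split the remainder in proportion to the largest ReBAR size advertised behind them.

```python
def split_prefetchable(parent_size, max_rebar, empty_reserve):
    """Divide a parent bridge window among its downstream ports.

    max_rebar maps port name -> largest ReBAR size advertised behind the
    port (0 for an empty port). Empty ports get `empty_reserve` each; the
    rest is shared among populated ports in proportion to their weight.
    """
    empty = [port for port, weight in max_rebar.items() if weight == 0]
    remaining = parent_size - empty_reserve * len(empty)
    if remaining < 0:
        raise ValueError("parent window too small for the empty-port reserves")
    total_weight = sum(max_rebar.values())
    shares = {}
    for port, weight in max_rebar.items():
        if weight == 0:
            shares[port] = empty_reserve
        else:
            shares[port] = remaining * weight // total_weight
    return shares


GiB = 1 << 30
# Topology from this report: one populated port (16 GiB max ReBAR), three empty.
ports = {"56:00.0": 16 * GiB, "56:01.0": 0, "56:02.0": 0, "56:03.0": 0}
shares = split_prefetchable(96 * GiB, ports, empty_reserve=1 * GiB)
for port, size in shares.items():
    print(port, size // GiB, "GiB")
```

With the 96 GB parent window from the patched boot and a 1 GiB reserve per empty port, the populated port ends up with 93 GiB, well above what a 16 GB BAR needs.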
Downstream impact
Heavy GPU work over the 256 MB BAR triggers an NVIDIA full-chip-reset cascade
(Xid 119: GSP RPC timeout → Xid 154: GPU Reset Required), reproduced via
Fallout 76 on this hardware. Confirmed identical mechanism on both Xorg and
Wayland sessions; only the blast radius differs (Xorg session takes the
desktop with it, Wayland on PRIME on-demand keeps the iGPU compositor alive
because it isn't dependent on the dead NVIDIA card).
-
Hello,
I am facing a similar problem, but my setup is different.
Minisforum S1Max
CPU: AMD RYZEN AI MAX+ 395 (32) @ 5.19 GHz
GPU: AMD Radeon 8060S Graphics [Integrated]
I have two Thunderbolt 5 ports.
I enabled ReBAR in the BIOS; I did not see any toggle for Above 4G Decoding. I updated the BIOS to the latest version, 1.0.6.
I am running Arch Linux.
I have a Minisforum deg2 dock connected through Thunderbolt 5. Inside the dock, I have an Intel Arc Pro B70.
I get this after sourcing the Intel oneAPI environment:
sycl-ls
WARNING: Resizable BAR not detected for device 0000:6c:00.0
WARNING: Resizable BAR not detected for device 0000:6c:00.0
[opencl:cpu][opencl:0] Intel(R) OpenCL, AMD RYZEN AI MAX+ 395 w/ Radeon 8060S OpenCL 3.0 (Build 0)
lspci -vv -s 6c:00.0 | grep -iE 'region|prefetch|resizable|bar'
Region 0: Memory at 9090000000 (64-bit, prefetchable) [size=16M]
Region 2: Memory at 9080000000 (64-bit, prefetchable) [size=256M]
If you need me to try something out or provide more information, I will be happy to help.
Thank you.