<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title>Lan Tian @ Blog</title>
<link>https://lantian.pub/</link>
<atom:link href="https://lantian.pub/rss2.xml" rel="self" type="application/rss+xml"/>
<description></description>
<pubDate>Wed, 17 May 2023 07:08:32 GMT</pubDate>
<generator>http://hexo.io/</generator>
<item>
<title>How to Kill the DN42 Network (Updated 2023-05-12)</title>
<link>https://lantian.pub/article/modify-website/how-to-kill-the-dn42-network.lantian/</link>
<guid>https://lantian.pub/article/modify-website/how-to-kill-the-dn42-network.lantian/</guid>
<pubDate>Fri, 12 May 2023 14:03:33 GMT</pubDate>
<description><blockquote>
<p>DN42 is a <strong>test network</strong> where everyone helps everyone. Nobody will blame you if you accidentally break something. You can ask for help in DN42's <a href="https://wiki.dn42.us/services/IRC">IRC channel</</description>
<enclosure url="https://lantian.pub//usr/uploads/202008/i-love-niantic-network.png" type="image"/>
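<content:encoded><![CDATA[<blockquote><p>DN42 is a <strong>test network</strong> where everyone helps everyone. Nobody will blame you if you accidentally break something. You can ask for help in DN42's <a href="https://wiki.dn42.us/services/IRC">IRC channel</a>, <a href="https://wiki.dn42.us/contact#contact_mailing-list">mailing list</a>, or the <a href="https://t.me/Dn42Chat">unofficial Telegram group</a>.</p></blockquote><p>DN42 is an experimental network with plenty of newcomers and beginners, so every now and then a beginner's misconfiguration affects the whole DN42 network, or even takes it down.</p><p>Now, as an elder (x), I will teach you beginners exactly what to do to blow up DN42, and, if you are a beginner's neighbor (that is, a peer), how to keep the blast from reaching you.</p><blockquote><p>Note: you should not actually do any of this on the DN42 network. Focus on defending against the damage instead.</p><p>Malicious sabotage will get you kicked out of DN42.</p></blockquote><p>This article is adapted from <strong>real disasters</strong> in the Telegram group and on IRC.</p><h1 id="更新记录">Changelog</h1><ul><li>2023-05-12: Added the section on a routing loop caused by modifying BGP Localpref.</li><li>2020-08-27: Formatting changes; added the full IRC logs, Chinese translations of some content, another case of a wrong prefix mask, and the case of an ASN missing a digit.</li><li>2020-07-13: Added the case of a wrong IPv6 prefix mask in the Registry, and the case of Bird protocols fighting each other.</li><li>2020-05-30: First version, covering OSPF, Babel, and route flapping.</li></ul><h1 id="ospf-真好玩">OSPF Is So Much Fun</h1><p>You have just joined DN42 and plan to connect your handful of servers to it. You found several people over email, IRC, or Telegram to peer with each of your servers, but you have not set up internal route distribution yet.</p><p>So you decide to configure OSPF, open Bird's configuration file, and add a protocol:</p><pre><code class="hljs language-bash">protocol ospf {
    ipv4 {
        import all;
        <span class="hljs-built_in">export</span> all;
    };
    area 0.0.0.0 {
        interface zt0 {
            <span class="hljs-built_in">type</span> broadcast;
            <span class="hljs-comment"># some unimportant parameters omitted</span>
        };
    };
};</code></pre><p>Satisfied, you copy the configuration file to every server, run <code>birdc configure</code>, and see each server learn the others' routes over OSPF.</p><p>Suddenly a notification pops up from IRC / Telegram. You open it and see:</p><pre><code class="hljs language-html"><mc**> shit.... 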
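as424242**** is hijacking my prefixes, for example 172.23.*.*/27
<he**> yup, I see some roa fails for them as well</code></pre><p>Congratulations, you have successfully hijacked (part of) the DN42 network.</p><h2 id="发生了什么">What Happened</h2><p>When your servers peer with other people over BGP, every route carries path information: where it originated and which nodes it passed through to reach you. For example, the route <code>172.22.76.184/29</code> might carry the path <code>4242422547 -> 4242422601 -> 424242****</code>, where <code>4242422547</code> is the origin of the route (that's me) and <code>4242422601</code> is your neighbor (Burble, in this example).</p><p>Your internal network, however, passes routes around with OSPF, and OSPF does not preserve the BGP path because it knows nothing about it. Another one of your servers now learns <code>172.22.76.184/29</code> over OSPF without any path information, and in its BGP announcements to its neighbors it advertises the route as originating from your own ASN, which amounts to a hijack.</p><p>As a diagram, it looks roughly like this:</p><pre><code class="hljs language-bash">[2547] -> [2601] -> [your node A] -> [your node B] -> [your node B's neighbor]
 2547      2547      2547             gone!            your ASN (BOOM)
           2601      2601
                     your ASN</code></pre><h1 id="babel-也很好玩">Babel Is Fun Too</h1><p>The folks on Telegram are a friendly bunch. While helping you fix the bug above, they also recommend Babel to you:</p><ul><li>Babel automatically picks the shortest path based on latency;</li><li>Babel is very simple to configure.</li></ul><p>However, they advise against Bird's built-in Babel support, because Bird's Babel cannot select routes based on latency.</p><p>You are sold. You delete the OSPF configuration and install Babeld. Soon every machine shows routes sent by the other nodes over Babel. You wait a few minutes, and nothing seems to explode.</p><p>But you notice that Bird is not announcing these routes over BGP. The folks in the group egg you on to enable Learn in Bird's Kernel protocol:</p><pre><code class="hljs language-bash">protocol kernel sys_kernel_v4 {
    scan time 20;
    <span class="hljs-comment"># the group chat eggs you on to add this line</span>
    learn;
    <span class="hljs-comment"># unimportant parts omitted</span>
};</code></pre><p>You do as you are told. A few minutes later, people on IRC and Telegram are frantically @-mentioning you. Yes, you have hijacked other people's networks again.</p><h2 id="发生了什么-1">What Happened</h2><p>This is really the same problem as in the OSPF section: Babel drops the BGP path information when passing routes along. The only difference is that, by default, Bird ignores routes written to the kernel routing table by other routing software, unless you turn on learn.</p><h2 id="正确的操作">The Right Way</h2><ul><li>Always remember one principle: IGPs (interior gateway protocols) such as OSPF and Babel should not handle BGP routing information. Leave BGP routes to BGP itself.<ul><li>There are several ways to run BGP inside your network; see the article "<a href="/article/modify-website/bird-confederation.lantian/">Configuring BGP Confederation in Bird</a>".</li></ul></li><li>Likewise, routes from your interior routing protocol should not leak into BGP, unless every prefix handled by the interior protocol belongs to you.</li><li>So write your BGP <code>export filter</code> 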
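like this:</li></ul><pre><code class="hljs language-bash"><span class="hljs-built_in">export</span> filter {
    <span class="hljs-comment"># only export routes that come from the STATIC (manually configured) and BGP protocols</span>
    <span class="hljs-keyword">if</span> <span class="hljs-built_in">source</span> ~ [RTS_STATIC, RTS_BGP] <span class="hljs-keyword">then</span> accept;
    <span class="hljs-comment"># reject routes from every other routing protocol</span>
    reject;
}</code></pre><h2 id="如何防御">How to Defend</h2><ul><li>The best approach is ROA (<code>Route Origin Authorization</code>), which restricts the origin ASN of each route.<ul><li>For DN42, the ROA configuration is generated automatically from the Registry. You can download it from the <a href="https://wiki.dn42.us/howto/Bird#route-origin-authorization">Bird configuration page on the DN42 Wiki</a> and set up a cron job to update it automatically.</li></ul></li><li>If you do not want to configure ROA, you can try peering with as many people as possible.<ul><li>BGP prefers the path that crosses the fewest ASes by default, so if you are directly connected to many people, your network will still prefer those direct paths even while someone is hijacking routes.</li><li>Note that this <strong>does not guarantee</strong> protection against route hijacking, for example when:<ul><li>the path from the real AS to you is longer than the hijacker's;</li><li>the hijacker's and the real AS's paths to your AS are equally long, in which case which one wins is down to luck;</li><li>you have configured the DN42 Community Filter, and it gives the hijacker's route a higher priority.</li></ul></li></ul></li></ul><h1 id="左右横跳">Flapping Back and Forth</h1><p>Flapping is an umbrella term for several kinds of mistakes that make BGP routing daemons switch their best path frequently. Because the best path is passed on to other nodes through peering, the switching sets off a chain reaction: many connected nodes switch along with the one faulty node, and the fault eventually spreads across the whole network.</p><p>This burns a lot of traffic, and since most people in DN42 run their nodes on cheap VPSes, in the long run there are only two outcomes:</p><ol><li>Your neighbors notice the abnormal traffic and cut their peering with you;</li><li>Your hosting provider (and possibly your neighbors' providers) notices that you are hogging bandwidth (or that you have used up your traffic quota) and suspends your VPS.</li></ol><p>Flapping can also have serious consequences:</p><ul><li>If the faulty AS peers with several other ASes, route changes can still reach your AS through them even after you cut your direct connection to it.<ul><li>Fixing a problem in one AS may require disconnecting several ASes.</li></ul></li></ul><p>For example, a member of the Telegram group had an accident while moving from Fullmesh + Direct to Multihop, which caused a huge amount of route switching.</p><p><img src="../../../../usr/uploads/202008/i-love-niantic-network.png" alt="我永远喜欢 Niantic Network"></p><p>He did not shut down BGP during the migration, and a Babel misconfiguration caused large numbers of routes to be announced and withdrawn.</p><p>Because of the chain reaction described above, and because this member had quite a few peerings, several larger ASes were forced to cut their connections to each other to contain the problem (until he woke up).</p><p>He had also caused several similar flapping incidents before, but this margin is too small to contain them. (funny face)</p><h2 id="案情回顾">Case Review</h2><pre><code class="hljs language-html"><bur*> is someone awake who is on telegram ?
<bur*> Kio*, sun*, ie**, lantian perhaps ? 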
<Kio*> Kio* is here
<fox*> I am in that dn42 telegram chat too but I do not understand moon runes
<fox*> also its midnight for china?
<bur*> yes, I'm going to be nuking a lot of peerings if they are all asleep
<bur*> I think its originating from NIA*, but a lovely multi mb/s flap going on for the past hour
<bur*> and its like whack-a-mole, if I disable one peering the traffic just pops up on a different one
<fox*> petition for bur* network to stop accepting new peers to help save dn42 network health
<Kio*> NIA* is awake now
<bur*> NIA* certainly has ipv4 next hop problems, they are advertising routes with next hops in other networks
<Kio*> He says he is adjusting his "network from full-mesh to rr and multihops"
<bur*> well its not working ;)
<stv*> bur*: I also took down our peering
<bur*> stv*, too much traffic from the grc? [GRC: the global route collector]
<stv*> I added a new peer around 1hr ago. Just to check that this hasnt be the cause..
<stv*> bur*: no the grc is still up and running
<bur*> ah, if you are getting a lot of route updates its cos of NIA*
<bur*> grc is currently pumping about 4mb/s to downstram peers
<sun*> bur*: what happen?
<bur*> NIA* is having issues
<bur*> sun* anyway, you are up late! 
<sun*> I just came back from the bar:)
<do**> don't drink and root [root: changing the system with administrator privileges]
<bur*> nice :)
<sun*> l like drink ;)
<bur*> ok, I'm bored of this now, if you are currently sending me more than 1mb/s of bgp traffic your peering is about to get disabled.
<bur*> Kio*, sun*, Tch*, jrb*, lantian, ie**, so far
<Kio*> barely notice any flapping here, is it v4 or v6 ?
<bur*> 4 mostly, I think. you got killed on us-nyc1
<bur*> Nap*
<Nap*> Shut mine down if you need, I can't look into with much detail until tonight
<bau*> half of dn42 is about to loose connectivity due to bur* disableing peerings lol
<do**> oh yeah, this looks nice
<Kio*> thats why everybody should be at least multi homed with two peers
<jrb*> bur*: and on which peering?
<Kio*> you shouldnt loose connectivity if only one peer drops
<bur*> jrb* us-nyc1 and us-lax1 for you so far
<jrb*> mapping table says us-3 and us-5, let me check.
<Nap*> Do we know what routes are flapping causing the updates?
<Kio*> filtering problematic ASN on my us node now
<bur*> Nap* its NIA*
<bur*> AS42424213**
<jrb*> sun*, rou*: disabling my peerings with you for now, there seems to be serious flapping
<do**> him again?
<sun*> what?
<sun*> is me problem?
<bur*> sun*, I've killed all of our peerings
<sun*> why?
<bur*> sun*, you are distributing the problems from NIA*
<Nap*> bur*: K, gonna try to filter on ATL/CHI at least. 
<bur*> thanks Nap*
<Kio*> recommend everybody to temporarily enable "bgp_path ~" filter for the problematic ASN
<sun*> i disabled NIA*, would fix problem?
<do**> bur*: I also peer with NIA* and I don't get any bgp updates from him
<do**> ah wait
<bur*> sun*, depends if you are also getting the updates from other peers too
<do**> now I see it
<do**> disabling peering
<sun*> if bgp_path ~ [= 42424213** =] then reject; [a Bird filter statement]
<bur*> ~ [= * 42424213** * =] to reject all paths [that is, every path containing that ASN]
<sun*> ohh
<jrb*> bur*: seems to be mostly rou* from my perspective
<Kio*> Should be filtered on my side, if anyone continues to receive those updates please notify
<bur*> sun*, I tried re-enabling you on lax1 but you jumped striaght to 1mb/s+ again
<bur*> jrb*, re-enabled
<sun*> i have disabled NIA*
<bur*> Kio*, re-enabled
<do**> oh btw, I have notified NIA* about this issue
<jrb*> do**: also tell him to notify everybody to get out of the blacklists.
<do**> jrb*: will do
<Nap*> bur*: I should have it filtered on my ATL (your CHI)
<Kio*> wrote NIA* also directly on telegram
<sun*> bur*: is it better now?
<bur*> for the record, this is the first time that I've mass disabled peerings, but this was causing issues across the board
<bur*> sun*, no not really
<An**> I've stop importing route from NIA*
<stv*> I am also dropping NIA* now
<bur*> sun*, thats like 1k updates every few seconds
<Nap*> bur*: all host should have it filtered now. 
<bur*> Nap*, looks to me, thanks
<sun*> bur*: seems to have reduced traffic
<bur*> sun*, yes that looks better
<bur*> sun*, is that now ok across all your nodes ?
<sun*> yep
<bur*> sun*, ok re-enabled
<do**> alright, also filtered 42424213**
<tm**> hi, also filtered 42424213**
<bur*> I guess they got the message, seems we're back to normal again and everyone I disabled is back again
<do**> bur*: I think NIA* is asleep, probably everyone filtered it
<do**> or disabled peering
<bur*> do**, there is that, but I also renabled NIA* and am not getting the same errors now
<do**> oh, interesting
<bur*> I might regret doing that by morning, but hey. I do try and keep everything open as best as possible. [that is, if NIA*'s problem comes back while bur* is asleep]
<do**> bur*: last time when NIA* did that I waited for their response [before taking action]
<Kio*> Nope nia* just messaged in Telegram about it
<do**> ah
<bur*> my peering hasn't re-established, so I guess they hit the big red shutdown button
<Kio*> He tried to migrate his network to a full mesh
<Kio*> and is now "pulling all the wires"
<do**> Kio*: did you message him directly or was that on any of the groups?
<Kio*> on the telegram group
<do**> bur*: you didn't get that many bgp updates from me?
<sun*> NIA* woke up :)
<bur*> do**, you went from an average of ~3kbs to ~10kbs+, peaking at 50kbs. 
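In the grand scheme of things that was lost in the noise
<do**> interesting
<do**> I also peer directly with NIA*
<bur*> do**, yes, interesting. Is the link restricted in bandwidth ?
<do**> not at all</code></pre><h2 id="如何防御-1">How to Defend</h2><ul><li>The ideal solution is route dampening, which limits how many route updates are accepted within a period of time.<ul><li>But Bird does not support it. No cure, nothing to do but wait for the end, goodbye.heic</li></ul></li><li>The next best option is to monitor every node with tools such as Prometheus and Grafana, get alerted when traffic looks abnormal, and step in manually.<ul><li>Obviously, if you are offline at the time, several GB of traffic may already be gone by the time you see the alert.</li></ul></li><li>The option after that is to rate-limit the ports used for peering.<ul><li>Since DN42 currently has almost no high-traffic applications, this does keep you safe.</li><li>The downside is obvious: reduced performance.</li></ul></li><li>The deep-pockets option is to buy servers with unlimited traffic.</li></ul><h1 id="这段地址到底多长">Just How Long Is This Prefix</h1><p>It is 2020, so you decide to add IPv6 addresses to your network. Following <a href="/article/modify-website/dn42-experimental-network-2020.lantian">my DN42 registration guide</a>, you quickly register an IPv6 block for yourself, and it is soon merged into the Registry.</p><p>As far as you can tell, everything is fine. But on the other side of the planet, a message pops up on someone's phone / computer telling him that his DN42 ROA generator has hit an error. He opens the Registry, facepalms with a sigh, and commits this change:</p><p><img src="../../../../usr/uploads/202007/dn42-registry-error.png" alt="DN42 Registry 中的错误 IPv6 地址块"></p><p><a href="https://git.dn42.dev/dn42/registry/commit/9f45ee31cdea4a997d59a262c4a8ac8eb3cbd1f1">https://git.dn42.dev/dn42/registry/commit/9f45ee31cdea4a997d59a262c4a8ac8eb3cbd1f1</a></p><h2 id="发生了什么-2">What Happened</h2><p>This group member added the address block <code>fd37:03b3:cae6:5158::/48</code>. An IPv6 address consists of 32 hexadecimal digits (128 bits in total), and this block explicitly specifies the first 16 of them (64 bits), so the prefix length should be <code>/64</code> or longer. A <code>/48</code> covers only the first 12 digits, which leaves the <code>5158</code> group outside the prefix.</p><p>For unknown reasons, neither the DN42 Registry's validation script nor the administrator who merged the change caught the mistake, and it made it into the Registry.</p><p>Then the ROA generator ran into this malformed block while parsing the Registry, and simply exited with an error.</p><h1 id="再放-送">Rerun</h1><p><img src="../../../../usr/uploads/202008/dn42-registry-error.png" alt="DN42 Registry 中的错误 IPv6 地址块 - 再放送"></p><p><a href="https://git.dn42.dev/dn42/registry/commit/00f90f592a35e325152ce28157f64d3fca7c8d7d">https://git.dn42.dev/dn42/registry/commit/00f90f592a35e325152ce28157f64d3fca7c8d7d</a></p><h2 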
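id="正确的操作-1">The Right Way</h2><ul><li>Users should check the prefix length and the validity of the block when registering it.</li><li>The DN42 Registry's validation script, or the administrator merging the change, should have caught the mistake.</li><li>The ROA generator should skip the bad record and process the rest of the data normally, rather than exiting with an error.</li></ul><p>Fortunately, the impact on the DN42 network as a whole was small: ROA updates were just delayed by a few hours.</p><h2 id="如何防御-2">How to Defend</h2><p>DN42 has emphasized decentralization from day one, so you can write your own ROA generator as a backup.</p><blockquote><p>Although this time my ROA generator went down too...</p></blockquote><p>The reasoning is that programs written by different people differ subtly in implementation even when they do the same job. When a bug in the input like this one shows up, someone's program may well keep running.</p><h1 id="bird-左右互搏">Bird Fighting Itself</h1><p>I have this friend... fine, it's me.</p><p>I am connected to both DN42 and NeoNetwork and also have an internal network of my own, so to keep my internal routes from leaking into DN42 and NeoNetwork, I did the following:</p><ul><li>Tag every route from the Kernel protocol (which reads routes from the kernel) and the Direct protocol (which reads the subnets of the system's network interfaces) with a Community.</li><li>Filter those routes out on my DN42 and NeoNetwork peerings.</li><li>This way my internal IPs are never announced, while my DN42 and NeoNetwork routes, which are configured in the Static protocol, are unaffected.</li></ul><p>After this change everything looked fine, until a few days later group members noticed that my Telegram bot (that is, my Looking Glass) could not ping any IP inside DN42.</p><h2 id="发生了什么-3">What Happened</h2><p>At first everything worked, and my prefix <code>172.22.76.184/29</code> was announced normally. Then at some point the Direct protocol refreshed, picked up <code>172.22.76.184/29</code> from one of the system's network interfaces, and pushed it into the routing table again.</p><p>This new route overwrote the original one, and because it came from the Direct protocol, it was tagged with the Community and no longer announced. And since Static is, as the name says, a "static" protocol whose contents never change, it never produced a new route to overwrite it back.</p><p>At that point I had effectively stopped announcing my prefix, so naturally no return packets could reach me.</p><h2 id="正确的操作-2">The Right Way</h2><p>In Bird, avoid having multiple protocols produce the same route wherever you can. Routes overwriting each other can have unpredictable consequences.</p><p>In the end I added a filter that restricts the Direct protocol to my internal prefixes, so it can no longer overwrite my DN42 prefix.</p><h1 id="星际玩家">StarCraft Player</h1><p>A new player registered an ASN:</p><p><img src="../../../../usr/uploads/202008/dn42-asn-error.png" alt="DN42 Registry 中的错误 ASN"></p><p>Here is how DN42 reacted:</p><ul><li><p>Telegram group:</p><p><img src="../../../../usr/uploads/202008/dn42-asn-error-response.png" alt="Telegram 群友的反应"></p></li><li><p>Quite the show:</p><p><img src="../../../../usr/uploads/202008/dn42-asn-error-response-2.png" alt="Telegram 群友的反应 2"></p></li><li><p>IRC:</p><pre><code class="hljs language-html"><span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Someone successfully registered in DN42 with ASN 424242236 (9 digits)
<span 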
class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Is this expected?
<xu**> doh
<xu**> shouldt have happened
<xu**> probably forgot the extra 2
<xu**> 424242 2236
<Kai*> too late tho. it already has one peer with tech9
<dne*> filtering fail!
<xu**> pomoke? [a username]
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> yep, doesn't seem to be on irc though
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> nor on telegram
<0x7*> so how a 9-digit ASN passed the schema checker...?
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> I don't think schema checker checks ASN, or it will block out clearnet ASNs
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> But maybe we need a warning?
<xu**> probably a bug in the policy checker
<xu**> i wish we had gone with a prefix that had a visual space
<xu**> like AS424200xxxx
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Well pomoke tried to peer with me via email (but ended in spam folder)
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> I'm going to tell him/her to correct the ASN
<Kai*> 9 is a good number tho
<Kai*> once in a blue moon that bur* made mistake
<sun*> westerners love digital 9
<bur*> crap
<bur*> lantian, are you in contact with pomoke? if they can submit a fix quickly then I'll merge it. 
Otherwise I'll need to pull the commit
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> bur*: I sent him/her an email, not sure about response time
<bur*> umm, I'm going to have to pull it then</code></pre></li><li><p>The Scythe of Judgment:</p><p><img src="../../../../usr/uploads/202008/dn42-asn-error-correction.png" alt="错误被撤销"></p></li></ul><h2 id="如何防御-3">How to Defend</h2><ul><li><p>Just sit back and enjoy the show; this kind of thing is very rare:</p><pre><code class="hljs language-html"><Kai*> once in a blue moon that bur* made mistake</code></pre></li><li><p>That said, show or no show, do hop on IRC and report that something is wrong.</p></li><li><p>When peering with someone, double-check their information.</p></li><li><p>And when you have a spare moment, browse <a href="https://t.me/DN42new">DN42 New ASN</a>, a Telegram channel that automatically posts newly registered ASNs.</p></li></ul><h1 id="小心-bgp-local-pref">Beware of BGP Local Pref</h1><p>While helping someone debug their network in the DN42 Telegram group, I suddenly noticed a routing loop between two of my own nodes:</p><pre><code class="hljs language-bash">traceroute to fd28:cb8f:4c92:1::1 (fd28:cb8f:4c92:1::1), 30 hops max, 80 byte packets
 1  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  88.023 ms
 2  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  94.401 ms
 3  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  167.664 ms
 4  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  174.235 ms
 5  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  247.213 ms
 6  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  253.499 ms
 7  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  326.690 ms
 8  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  333.412 ms
 9  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  406.978 ms
10  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  413.537 ms
11  us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1)  486.762 ms
12  lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1)  493.147 ms
18 hops not responding.</code></pre><p>I logged in to both nodes, and sure enough, the VirMach node preferred the route received from BuyVM, while BuyVM had chosen the route from VirMach.</p><p>BGP 
is supposed to prevent loops, isn't it? Why would these two nodes pick each other's routes?</p><h2 id="发生了什么-4">What Happened</h2><p>The problem involves four nodes in three ASes:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="87.54,-8 87.54,-139 170.08,-139 170.08,-8 87.54,-8"></polygon><text text-anchor="middle" x="128.81" y="-122.4" font-family="Times,serif" font-size="14.00"> My nodes</text></g><!-- VirMach --><g id="node1" class="node"><title> VirMach</title><polygon fill="white" stroke="white" points="162.08,-106 95.54,-106 95.54,-70 162.08,-70 162.08,-106"></polygon><text text-anchor="middle" x="128.81" y="-83.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="159.76,-52 97.87,-52 97.87,-16 159.76,-16 159.76,-52"></polygon><text text-anchor="middle" x="128.81" y="-29.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="black" d="M149.72,-69.51C150.94,-67.25 151.68,-64.99 151.93,-62.73"></path><polygon fill="black" stroke="black" points="155.15,-62.12 149.53,-53.14 148.33,-63.67 155.15,-62.12"></polygon></g><!-- BuyVM->VirMach --><g id="edge3" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="black" d="M108.1,-52.14C106.83,-54.4 106.04,-56.66 105.74,-58.93"></path><polygon fill="black" stroke="black" points="102.47,-59.42 107.91,-68.51 109.33,-58.01 102.47,-59.42"></polygon></g><!-- KSKB --><g id="node3" 
class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="56.77,-106 2.77,-106 2.77,-70 56.77,-70 56.77,-106"></polygon><text text-anchor="middle" x="29.77" y="-83.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- KSKB->VirMach --><g id="edge4" class="edge"><title>KSKB->VirMach</title><path fill="none" stroke="black" d="M57.03,-88C65.51,-88 75.13,-88 84.5,-88"></path><polygon fill="black" stroke="black" points="84.29,-91.5 94.29,-88 84.29,-84.5 84.29,-91.5"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="59.54,-52 0,-52 0,-16 59.54,-16 59.54,-52"></polygon><text text-anchor="middle" x="29.77" y="-29.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- KSKB->Lutoma --><g id="edge1" class="edge"><title>KSKB->Lutoma</title><path fill="none" stroke="black" d="M29.77,-69.51C29.77,-67.34 29.77,-65.17 29.77,-63"></path><polygon fill="black" stroke="black" points="33.27,-63.14 29.77,-53.14 26.27,-63.14 33.27,-63.14"></polygon></g><!-- Lutoma->BuyVM --><g id="edge5" class="edge"><title>Lutoma->BuyVM</title><path fill="none" stroke="black" d="M59.89,-34C68.3,-34 77.65,-34 86.65,-34"></path><polygon fill="black" stroke="black" points="86.37,-37.5 96.37,-34 86.37,-30.5 86.37,-37.5"></polygon></g></g></svg></p><p>KSKB is the origin of the route <code>fd28:cb8f:4c92::/48</code>. He announced it to Lutoma and to my VirMach node. Lutoma then announced it to my BuyVM node.</p><p><strong>All of my nodes have the <code>add paths yes;</code> option enabled</strong>, which means they exchange every route they receive with each other, not just the best route each node has selected and written to the kernel routing table. As a result, my BuyVM node has two paths to the origin of the route:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" 
points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-139 90.54,-139 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-122.4" font-family="Times,serif" font-size="14.00"> My nodes</text></g><!-- VirMach --><g id="node1" class="node"><title> VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="175.31,-52 121.31,-52 121.31,-16 175.31,-16 175.31,-52"></polygon><text text-anchor="middle" x="148.31" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge2" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.85,-34C91.6,-34 101.13,-34 110.11,-34"></path><polygon fill="black" stroke="black" points="110.09,-37.5 120.09,-34 110.09,-30.5 110.09,-37.5"></polygon></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-106 18.32,-106 18.32,-70 80.22,-70 80.22,-106"></polygon><text text-anchor="middle" x="49.27" y="-83.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- BuyVM->VirMach --><g id="edge1" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="black" d="M49.27,-69.51C49.27,-67.34 49.27,-65.17 49.27,-63"></path><polygon fill="black" stroke="black" points="52.77,-63.14 49.27,-53.14 45.77,-63.14 52.77,-63.14"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="178.08,-106 118.54,-106 118.54,-70 178.08,-70 178.08,-106"></polygon><text text-anchor="middle" x="148.31" y="-83.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- 
BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.71,-88C89.15,-88 98.44,-88 107.34,-88"></path><polygon fill="black" stroke="black" points="107.31,-91.5 117.31,-88 107.31,-84.5 107.31,-91.5"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M148.31,-69.51C148.31,-67.34 148.31,-65.17 148.31,-63"></path><polygon fill="black" stroke="black" points="151.81,-63.14 148.31,-53.14 144.81,-63.14 151.81,-63.14"></polygon></g></g></svg></p><p>The same is true for the VirMach node:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-139 90.54,-139 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-122.4" font-family="Times,serif" font-size="14.00"> My nodes</text></g><!-- VirMach --><g id="node1" class="node"><title> VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-106 18.32,-106 18.32,-70 80.22,-70 80.22,-106"></polygon><text text-anchor="middle" x="49.27" y="-83.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="black" 
d="M49.27,-52.14C49.27,-54.31 49.27,-56.48 49.27,-58.65"></path><polygon fill="black" stroke="black" points="45.77,-58.51 49.27,-68.51 52.77,-58.51 45.77,-58.51"></polygon></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="175.31,-52 121.31,-52 121.31,-16 175.31,-16 175.31,-52"></polygon><text text-anchor="middle" x="148.31" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge1" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.85,-34C91.6,-34 101.13,-34 110.11,-34"></path><polygon fill="black" stroke="black" points="110.09,-37.5 120.09,-34 110.09,-30.5 110.09,-37.5"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="178.08,-106 118.54,-106 118.54,-70 178.08,-70 178.08,-106"></polygon><text text-anchor="middle" x="148.31" y="-83.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.71,-88C89.15,-88 98.44,-88 107.34,-88"></path><polygon fill="black" stroke="black" points="107.31,-91.5 117.31,-88 107.31,-84.5 107.31,-91.5"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M148.31,-69.51C148.31,-67.34 148.31,-65.17 148.31,-63"></path><polygon fill="black" stroke="black" points="151.81,-63.14 148.31,-53.14 144.81,-63.14 151.81,-63.14"></polygon></g></g></svg></p><p>Normally, the VirMach node would certainly pick the direct route to KSKB rather than the two-hop route through BuyVM and Lutoma (hops inside the same AS over iBGP are not counted). In that case, whether the BuyVM node picks Lutoma or the VirMach node as its next hop, it gets a reachable route and no loop forms.</p><p>The catch is that <strong>I manually adjust route priority with a BIRD filter</strong>. <a href="https://wiki.dn42.dev/howto/Bird-communities">DN42 has a standard set of BGP Communities that mark the region each route originates from.</a> To reduce latency, I use the following (simplified) algorithm to adjust route priority:</p><pre><code class="hljs language-bash">priority = 200 - 10 * hop count
if the current node and the route origin are in the same region:
    priority 
+= 100</code></pre><p>问题发生时,KSKB 的原始路由并没有添加来源地区的 Community。<strong>但是 Lutoma 的网络配置错误,给来自 KSKB 的路由也加上了来源地区 Community</strong>,地区和我的 VirMach 节点相同。(根据 DN42 的标准,各个网络只应该给自己的路由添加来源地区 Community,不能给别人的路由添加。)</p><p>此时我的 BuyVM 节点算出了以下的路由优先级,并选择了经过我的 VirMach 节点的路由:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="290pt" height="173pt" viewBox="0.00 0.00 289.64 173.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 169)"><polygon fill="white" stroke="none" points="-4,4 -4,-169 285.64,-169 285.64,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-157 90.54,-157 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-140.4" font-family="Times,serif" font-size="14.00"> 我的节点</text></g><!-- VirMach --><g id="node1" class="node"><title> VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="278.87,-52 224.87,-52 224.87,-16 278.87,-16 278.87,-52"></polygon><text text-anchor="middle" x="251.87" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge2" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="red" d="M82.95,-34C118.76,-34 176.12,-34 213.64,-34"></path><polygon fill="red" stroke="red" points="213.61,-37.5 223.61,-34 213.61,-30.5 213.61,-37.5"></polygon></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-124 18.32,-124 18.32,-88 80.22,-88 80.22,-124"></polygon><text text-anchor="middle" x="49.27" 
y="-101.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- BuyVM->VirMach --><g id="edge1" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="red" d="M49.27,-87.59C49.27,-80.11 49.27,-71.29 49.27,-62.99"></path><polygon fill="red" stroke="red" points="52.77,-63.17 49.27,-53.17 45.77,-63.17 52.77,-63.17"></polygon><text text-anchor="middle" x="37.87" y="-65.8" font-family="Times,serif" font-size="14.00">200 - 10 * 1 = 190</text></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="281.64,-124 222.1,-124 222.1,-88 281.64,-88 281.64,-124"></polygon><text text-anchor="middle" x="251.87" y="-101.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.3,-106C115.02,-106 172.31,-106 210.8,-106"></path><polygon fill="black" stroke="black" points="210.61,-109.5 220.61,-106 210.61,-102.5 210.61,-109.5"></polygon><text text-anchor="middle" x="152.32" y="-110.2" font-family="Times,serif" font-size="14.00">200 - 10 * 2 = 180</text></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M251.87,-87.53C251.87,-79.36 251.87,-71.19 251.87,-63.02"></path><polygon fill="black" stroke="black" points="255.37,-63.28 251.87,-53.28 248.37,-63.28 255.37,-63.28"></polygon></g></g></svg></p><p>而我的 VirMach 节点反而选择了经过 BuyVM 节点的路由:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="290pt" height="189pt" viewBox="0.00 0.00 289.64 189.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 185)"><polygon fill="white" stroke="none" points="-4,4 -4,-185 285.64,-185 285.64,4 -4,4"></polygon><g id="clust1" 
class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-173 90.54,-173 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-156.4" font-family="Times,serif" font-size="14.00"> 我的节点</text></g><!-- VirMach --><g id="node1" class="node"><title> VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-140 18.32,-140 18.32,-104 80.22,-104 80.22,-140"></polygon><text text-anchor="middle" x="49.27" y="-117.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="red" d="M49.27,-52.23C49.27,-64.03 49.27,-79.65 49.27,-93.11"></path><polygon fill="red" stroke="red" points="45.77,-92.79 49.27,-102.79 52.77,-92.79 45.77,-92.79"></polygon><text text-anchor="middle" x="29.47" y="-82.2" font-family="Times,serif" font-size="14.00">200 - 10 * 2 + 100 = 280</text><text text-anchor="middle" x="29.47" y="-65.4" font-family="Times,serif" font-size="14.00">(地区相同)</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="278.87,-64 224.87,-64 224.87,-28 278.87,-28 278.87,-64"></polygon><text text-anchor="middle" x="251.87" y="-41.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge1" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.95,-35.95C118.76,-38.1 176.12,-41.53 213.64,-43.77"></path><polygon fill="black" stroke="black" points="213.42,-47.33 223.61,-44.43 213.83,-40.34 213.42,-47.33"></polygon><text text-anchor="middle" x="152.32" y="-65" font-family="Times,serif" font-size="14.00">200 - 10 * 1 = 190</text><text 
text-anchor="middle" x="152.32" y="-48.2" font-family="Times,serif" font-size="14.00">(路由无地区信息)</text></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="281.64,-138 222.1,-138 222.1,-102 281.64,-102 281.64,-138"></polygon><text text-anchor="middle" x="251.87" y="-115.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="red" d="M80.3,-121.7C115.02,-121.35 172.31,-120.78 210.8,-120.4"></path><polygon fill="red" stroke="red" points="210.65,-123.89 220.61,-120.29 210.58,-116.89 210.65,-123.89"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="red" d="M251.87,-101.79C251.87,-92.95 251.87,-84.12 251.87,-75.28"></path><polygon fill="red" stroke="red" points="255.37,-75.5 251.87,-65.5 248.37,-75.5 255.37,-75.5"></polygon></g></g></svg></p><p>这样,环路就形成了。</p><h2 id="正确的操作-3">正确的操作</h2><p>这个问题出现时,以下三个因素缺一不可:</p><ol><li><strong>开启 <code>add paths yes;</code> 选项</strong>,导致备选路由被同时发给其它节点。如果不开启此选项,BuyVM 节点在选择 VirMach 作为下一跳时,就不会把经过 Lutoma 的路由也发给 VirMach 节点了,此时 VirMach 节点只有直连 KSKB 的一条路由可走。</li><li>路由优先级调整算法导致<strong>其它节点的备选路由反而在当前节点被优先选择</strong>。因此,如果要保持 <code>add paths yes;</code> 选项开启,就需要在设计 iBGP 用的优先级算法时,保证在任何情况下,来自同一节点的路由之间<strong>优先级顺序都不变</strong>,从而保证总能选到这一节点的首选路由,而非备选路由。</li><li>经过 Lutoma 的路由被异常加上了 BGP Community,导致了优先级顺序的变化。</li></ol><p>我解决问题的方法是,不再在 iBGP 内部重新计算路由优先级,而是统一使用由收到路由的节点计算的、由 iBGP 传递来的优先级,来保证首选、备选路由的优先级顺序不变。</p>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-website/">网站与服务端</category>
<category domain="https://lantian.pub/tag/DN42/">DN42</category>
<category domain="https://lantian.pub/tag/BGP/">BGP</category>
<comments>https://lantian.pub/article/modify-website/how-to-kill-the-dn42-network.lantian/#disqus_thread</comments>
</item>
<item>
<title>How to Kill the DN42 Network (Updated 2023-05-12)</title>
<link>https://lantian.pub/en/article/modify-website/how-to-kill-the-dn42-network.lantian/</link>
<guid>https://lantian.pub/en/article/modify-website/how-to-kill-the-dn42-network.lantian/</guid>
<pubDate>Fri, 12 May 2023 14:03:33 GMT</pubDate>
<description><blockquote>
<p>DN42 is an <strong>experimental network</strong>, where everyone helps everyone. Nobody is going to blame you if you screwed</description>
<enclosure url="https://lantian.pub//usr/uploads/202008/i-love-niantic-network.png" type="image"/>
<content:encoded><![CDATA[<blockquote><p>DN42 is an <strong>experimental network</strong>, where everyone helps everyone. Nobody is going to blame you if you screwed up. You may seek help at DN42's <a href="https://wiki.dn42.us/services/IRC">IRC channel</a>, <a href="https://wiki.dn42.us/contact#contact_mailing-list">mailing list</a> or the <a href="https://t.me/Dn42Chat">unofficial Telegram group</a>.</p></blockquote><p>Since DN42 is a network for experimentation, a lot of relatively inexperienced users also participate in it. As a result, an inexperienced user occasionally misconfigures their system and disrupts the whole DN42 network, or even takes it down.</p><p>As a more experienced user, I will walk new users through some operations that can kill the network, and through the defenses that everyone can set up against peers who make these mistakes.</p><blockquote><p>WARNING: You should not actually perform these operations in DN42. You should focus more on protecting yourself against them.</p><p>Malicious actions will get you kicked out of DN42.</p></blockquote><p>The stories are based on <strong>real disasters</strong> in the Telegram group and IRC channel.</p><h1 id="changelog">Changelog</h1><ul><li>2023-05-12: Add content on a routing loop caused by modifying BGP local preference.</li><li>2020-08-27: Format changes, add full IRC logs, add another netmask error, and add content on a missing digit in an ASN.</li><li>2020-07-13: Add an IPv6 netmask error in the registry and Bird's conflicts between different protocols.</li><li>2020-05-30: Initial version, including OSPF, Babel, and route flaps.</li></ul><h1 id="ospf-is-fun">OSPF is Fun</h1><p>You just joined DN42 and plan to connect all of your servers. You've already peered with a few others on several of your nodes, but you haven't finished setting up your internal routing yet.</p><p>So you plan to configure OSPF. 
You opened Bird's configuration file and added a protocol:</p><pre><code class="hljs language-bash">protocol ospf {
  ipv4 {
    import all;
    <span class="hljs-built_in">export</span> all;
  };
  area 0.0.0.0 {
    interface zt0 {
      <span class="hljs-built_in">type</span> broadcast;
      <span class="hljs-comment"># Unimportant stuff redacted</span>
    };
  };
};</code></pre><p>Satisfied, you copied the config file to every server and ran <code>birdc configure</code>. You checked and confirmed that every server obtained routes from each other via OSPF.</p><p>Suddenly, a notification popped up in your IRC client / Telegram. You clicked on it:</p><pre><code class="hljs language-html"><mc**> shit.... as424242**** is hijacking my prefixes, for example 172.23.*.*/27
<he**> yup, I see some roa fails for them as well</code></pre><p>Congratulations! You've successfully hijacked (part of) DN42.</p><h2 id="whats-going-on">What's Going On</h2><p>When your server peers with others via BGP, each route carries path information: the origin AS as well as the list of ASes it passed through. For example, the route <code>172.22.76.184/29</code> may have the path information of <code>4242422547 -> 4242422601 -> 424242****</code>, where <code>4242422547</code> is the origin (me by the way), and <code>4242422601</code> is your neighbor (Burble here, as an example).</p><p>But since your internal networking uses OSPF, which has no idea what BGP paths are, it doesn't preserve them while passing routes around. Another node of yours then obtains <code>172.22.76.184/29</code> via OSPF, but without any path information. It proceeds to announce the route to your peers with your own ASN as the origin, causing a hijack.</p><p>Here is a graph of what's going on:</p><pre><code class="hljs language-bash">[2547] -> [2601] -> [Your Node A] -> [Your Node B] -> [Peer of Node B]
 2547      2547      2547             Gone!            Your ASN (BOOM)
           2601      2601
                     Your ASN</code></pre><h1 id="babel-is-fun-too">Babel is Fun, Too</h1><p>Those in the Telegram group are really nice guys. While helping you fix the problem, they also recommended Babel to you:</p><ul><li>Babel automatically selects the shortest path by latency.</li><li>Babel is extremely simple to configure.</li></ul><p>But they didn't recommend Bird's built-in Babel support, since it doesn't support selecting paths by latency.</p><p>Persuaded, you removed the OSPF configuration and installed Babeld. Soon each of your nodes was getting Babel routes. You waited for a few minutes. No sign of catastrophe yet.</p><p>But you did notice that Bird wasn't announcing the routes via BGP. The Telegram guys talked you into enabling the <code>learn</code> option of Bird's kernel protocol:</p><pre><code class="hljs language-bash">protocol kernel sys_kernel_v4 {
  scan time 20;
  <span class="hljs-comment"># You're gonna add this line!</span>
  learn;
  <span class="hljs-comment"># Unimportant stuff redacted</span>
};</code></pre><p>You did this. A few minutes later, you were called out again by people on IRC and Telegram. Yes, you hijacked other people's networks. Again.</p><h2 id="whats-going-on-1">What's Going On</h2><p>It is actually the same problem as the OSPF one, since Babel also drops all BGP path information while passing routes around. By default, however, Bird ignores routes installed into the system by other routing software, so nothing happened until you enabled <code>learn</code>.</p><h2 id="correct-way-to-do-this">Correct Way to Do This</h2><ul><li>Always remember: Interior Gateway Protocols, including OSPF, Babel, etc., should never process BGP routing information. BGP routing should be handled solely by BGP.<ul><li>There are multiple schemes to configure BGP in a network. 
For one example, see: <a href="/en/article/modify-website/bird-confederation.lantian">Bird BGP Confederation: Configuration and Emulation</a>.</li></ul></li><li>Similarly, interior routes should not be passed to BGP, unless you own every IP address that you use internally on DN42.</li><li>So you should set BGP's <code>export filter</code> to this in Bird:</li></ul><pre><code class="hljs language-bash"><span class="hljs-built_in">export</span> filter {
  <span class="hljs-comment"># Only allow announcing STATIC (manually configured) and BGP routes</span>
  <span class="hljs-keyword">if</span> <span class="hljs-built_in">source</span> ~ [RTS_STATIC, RTS_BGP] <span class="hljs-keyword">then</span> accept;
  <span class="hljs-comment"># Reject routes from other protocols</span>
  reject;
};</code></pre><h2 id="defensive-measures">Defensive Measures</h2><ul><li>The best countermeasure is ROA, or Route Origin Authorization. It restricts which ASN may originate each route.<ul><li>For DN42, ROA configuration is generated automatically based on registry data. 
It can be downloaded from <a href="https://wiki.dn42.us/howto/Bird#route-origin-authorization">DN42 Wiki's Bird Config Page</a>, and can be automatically updated with a cron job.</li></ul></li><li>If you don't want to configure ROA, you may try to peer with more people.<ul><li>Since BGP prefers the path that crosses the fewest ASes, if you're directly connected to a lot of people, your network will prefer these direct routes even if someone is hijacking the prefix.</li><li>But this <strong>doesn't guarantee</strong> full defense against the problem, for example:<ul><li>The path from the hijacker to you is shorter than the path from the real AS.</li><li>The paths from the hijacker and from the real AS are of equal length, and your routing software picks one of them arbitrarily.</li><li>You use the DN42 Community Filter, and for some reason it prefers the hijacker's route over the real one.</li></ul></li></ul></li></ul><h1 id="route-flapping">Route Flapping</h1><p>Route flapping covers a whole range of errors with one common symptom: they cause BGP routing software to frequently switch (or flap) the best route it has chosen. Since the best route gets announced to other nodes via peering, the flapping sets off a chain reaction, where multiple connected nodes flap together because of one node's mistake. Eventually, the problem spreads to the whole network.</p><p>This process consumes a significant amount of bandwidth and traffic. 
Since many people in DN42 run their nodes on cheap VPSes, it eventually ends in one of two ways:</p><ol><li>Your peer notices the abnormal traffic and cuts the peering with you.</li><li>Your hosting provider (or even your peer's provider) notices your high bandwidth consumption (or you use up your traffic quota) and shuts down the VPS.</li></ol><p>In addition, route flapping can have a severe impact:</p><ul><li>If the problematic AS peers with many other ASes, the route flap may still reach your AS through another AS even after you disconnect from it.<ul><li>To contain one problematic AS, you may have to cut off multiple ASes.</li></ul></li></ul><p>For example, one user in the Telegram group misconfigured his network while transitioning from Full-mesh + Direct connections to Multihop.</p><p><img src="../../../../../usr/uploads/202008/i-love-niantic-network.png" alt="I Always Love Niantic Network"></p><p>He didn't disconnect BGP in the process, and a Babel configuration error caused a large number of routes to be repeatedly announced and withdrawn.</p><p>Because of the chain reaction and the number of peerings the guy had set up, multiple large ASes had to disconnect from each other to control the problem (before he woke up).</p><blockquote><p>By the way, this guy had a number of similar accidents before at a smaller scale, which this margin is too narrow to contain.</p></blockquote><h2 id="case-review">Case Review</h2><pre><code class="hljs language-html"><bur*> is someone awake who is on telegram ?
<bur*> Kio*, sun*, ie**, lantian perhaps ?
<Kio*> Kio* is here
<fox*> I am in that dn42 telegram chat too but I do not understand moon runes
<fox*> also its midnight for china?
<bur*> yes, I'm going to be nuking a lot of peerings if they are all asleep
<bur*> I think its originating from NIA*, but a lovely multi mb/s flap going on for the past hour
<bur*> and its like whack-a-mole, if I disable one peering the traffic just pops up on a different one
<fox*> petition for bur* network to stop accepting new peers to help save dn42 network health
<Kio*> NIA* is awake now
<bur*> NIA* certainly has ipv4 next hop problems, they are advertising routes with next hops in other networks
<Kio*> He says he is adjusting his "network from full-mesh to rr and multihops"
<bur*> well its not working ;)
<stv*> bur*: I also took down our peering
<bur*> stv*, too much traffic from the grc?
<stv*> I added a new peer around 1hr ago. Just to check that this hasnt be the cause..
<stv*> bur*: no the grc is still up and running
<bur*> ah, if you are getting a lot of route updates its cos of NIA*
<bur*> grc is currently pumping about 4mb/s to downstram peers
<sun*> bur*: what happen?
<bur*> NIA* is having issues
<bur*> sun* anyway, you are up late!
<sun*> I just came back from the bar:)
<do**> don't drink and root
<bur*> nice :)
<sun*> l like drink ;)
<bur*> ok, I'm bored of this now, if you are currently sending me more than 1mb/s of bgp traffic your peering is about to get disabled.
<bur*> Kio*, sun*, Tch*, jrb*, lantian, ie**, so far
<Kio*> barely notice any flapping here, is it v4 or v6 ?
<bur*> 4 mostly, I think. you got killed on us-nyc1
<bur*> Nap*
<Nap*> Shut mine down if you need, I can't look into with much detail until tonight
<bau*> half of dn42 is about to loose connectivity due to bur* disableing peerings lol
<do**> oh yeah, this looks nice
<Kio*> thats why everybody should be at least multi homed with two peers
<jrb*> bur*: and on which peering?
<Kio*> you shouldnt loose connectivity if only one peer drops
<bur*> jrb* us-nyc1 and us-lax1 for you so far
<jrb*> mapping table says us-3 and us-5, let me check.
<Nap*> Do we know what routes are flapping causing the updates?
<Kio*> filtering problematic ASN on my us node now
<bur*> Nap* its NIA*
<bur*> AS42424213**
<jrb*> sun*, rou*: disabling my peerings with you for now, there seems to be serious flapping
<do**> him again?
<sun*> what?
<sun*> is me problem?
<bur*> sun*, I've killed all of our peerings
<sun*> why?
<bur*> sun*, you are distributing the problems from NIA*
<Nap*> bur*: K, gonna try to filter on ATL/CHI at least.
<bur*> thanks Nap*
<Kio*> recommend everybody to temporarily enable "bgp_path ~" filter for the problematic ASN
<sun*> i disabled NIA*, would fix problem?
<do**> bur*: I also peer with NIA* and I don't get any bgp updates from him
<do**> ah wait
<bur*> sun*, depends if you are also getting the updates from other peers too
<do**> now I see it
<do**> disabling peering
<sun*> if bgp_path ~ [= 42424213** =] then reject;
<bur*> ~ [= * 42424213** * =] to reject all paths
<sun*> ohh
<jrb*> bur*: seems to be mostly rou* from my perspective
<Kio*> Should be filtered on my side, if anyone continues to receive those updates please notify
<bur*> sun*, I tried re-enabling you on lax1 but you jumped striaght to 1mb/s+ again
<bur*> jrb*, re-enabled
<sun*> i have disabled NIA*
<bur*> Kio*, re-enabled
<do**> oh btw, I have notified NIA* about this issue
<jrb*> do**: also tell him to notify everybody to get out of the blacklists.
<do**> jrb*: will do
<Nap*> bur*: I should have it filtered on my ATL (your CHI)
<Kio*> wrote NIA* also directly on telegram
<sun*> bur*: is it better now?
<bur*> for the record, this is the first time that I've mass disabled peerings, but this was causing issues across the board
<bur*> sun*, no not really
<An**> I've stop importing route from NIA*
<stv*> I am also dropping NIA* now
<bur*> sun*, thats like 1k updates every few seconds
<Nap*> bur*: all host should have it filtered now.
<bur*> Nap*, looks to me, thanks
<sun*> bur*: seems to have reduced traffic
<bur*> sun*, yes that looks better
<bur*> sun*, is that now ok across all your nodes ?
<sun*> yep
<bur*> sun*, ok re-enabled
<do**> alright, also filtered 42424213**
<tm**> hi, also filtered 42424213**
<bur*> I guess they got the message, seems we're back to normal again and everyone I disabled is back again
<do**> bur*: I think NIA* is asleep, probably everyone filtered it
<do**> or disabled peering
<bur*> do**, there is that, but I also renabled NIA* and am not getting the same errors now
<do**> oh, interesting
<bur*> I might regret doing that by morning, but hey. I do try and keep everything open as best as possible.
<do**> bur*: last time when NIA* did that I waited for their response
<Kio*> Nope nia* just messaged in Telegram about it
<do**> ah
<bur*> my peering hasn't re-established, so I guess they hit the big red shutdown button
<Kio*> He tried to migrate his network to a full mesh
<Kio*> and is now "pulling all the wires"
<do**> Kio*: did you message him directly or was that on any of the groups?
<Kio*> on the telegram group
<do**> bur*: you didn't get that many bgp updates from me?
<sun*> NIA* woke up :)
<bur*> do**, you went from an average of ~3kbs to ~10kbs+, peaking at 50kbs. In the grand scheme of things that was lost in the noise
<do**> interesting
<do**> I also peer directly with NIA*
<bur*> do**, yes, interesting. Is the link restricted in bandwidth ?
<do**> not at all</code></pre><h2 id="defensive-measures-1">Defensive Measures</h2><ul><li>The best solution is Route Dampening, which limits how many routing updates are accepted within a time window.<ul><li>But Bird doesn't support this. You'd have to put up with it.</li></ul></li><li>Alternatively, you can monitor your nodes with Prometheus, Grafana, etc., so you get an alarm when something's off and can handle it manually.<ul><li>But obviously, if you aren't online at that time, you may have already used a few gigs of traffic before you're aware.</li></ul></li><li>Another solution is to rate-limit the peering connection.<ul><li>Since almost no application in DN42 requires lots of bandwidth, this is a viable solution that ensures safety.</li><li>But the downside is also obvious: degradation of performance.</li></ul></li><li>If you're rich enough, get a server with uncapped traffic.</li></ul><h1 id="how-long-is-that-ip-block">How Long is That IP Block?</h1><p>Since it's the year 2020, you plan to add an IPv6 block to your network. With <a href="/en/article/modify-website/dn42-experimental-network-2020.lantian">my DN42 registration guide</a>, you registered yourself an IPv6 block, which quickly got merged into the registry.</p><p>From your perspective, everything is normal. Yet on the other side of the planet, a message pops up on one person's phone/computer that his DN42 ROA generator is malfunctioning. He opens the registry page, facepalms, and commits this change:</p><p><img src="../../../../../usr/uploads/202007/dn42-registry-error.png" alt="Erroneous IPv6 Block in DN42 Registry"></p><p><a href="https://git.dn42.dev/dn42/registry/commit/9f45ee31cdea4a997d59a262c4a8ac8eb3cbd1f1">https://git.dn42.dev/dn42/registry/commit/9f45ee31cdea4a997d59a262c4a8ac8eb3cbd1f1</a></p><h2 id="whats-going-on-2">What's Going On</h2><p>This user added an IPv6 block, <code>fd37:03b3:cae6:5158::/48</code>. 
Since an IPv6 address consists of 32 hex digits (128 bits total), and this block defined the first 16 digits (or 64 bits), the corresponding netmask should be <code>/64</code> or higher.</p><p>But for some reason, this error wasn't detected by DN42 Registry's schema checker, nor by the admin who inspected and merged the change, so it ended up in the registry.</p><p>Later, the ROA generator found the erroneous IP block while parsing the registry and crashed.</p><h1 id="and-it-happened-again">And It Happened Again</h1><p><img src="../../../../../usr/uploads/202008/dn42-registry-error.png" alt="Erroneous IPv6 Block Happened Again"></p><p><a href="https://git.dn42.dev/dn42/registry/commit/00f90f592a35e325152ce28157f64d3fca7c8d7d">https://git.dn42.dev/dn42/registry/commit/00f90f592a35e325152ce28157f64d3fca7c8d7d</a></p><h2 id="correct-way-to-do-this-1">Correct Way to Do This</h2><ul><li>While registering an IP block, the user should check the validity of netmasks and address blocks.</li><li>The DN42 Registry schema checker, or the admin performing the merge operation, should have caught the problem.</li><li>The ROA generator should skip the problematic record and properly handle the rest of the data instead of crashing.</li></ul><p>Fortunately, apart from delaying the ROA update by a few hours, this error didn't impact the network itself much.</p><h2 id="defensive-measures-2">Defensive Measures</h2><p>Since DN42 has been decentralized by design since its birth, you can write your own ROA generator as a backup.</p><blockquote><p>Although my ROA generator also failed this time...</p></blockquote><p>The reasoning is that different implementations differ in minor ways even though they do the same thing. When bad input like this appears, some implementations may survive.</p><h1 id="bird-protocol-fights">Bird Protocol Fights</h1><p>The story starts with my friend Joe... Fine. 
The story starts with me.</p><p>My network is connected to both DN42 and NeoNetwork, and it also includes my internal network with a private IP range. To avoid announcing my internal network to DN42 and NeoNetwork, I did this:</p><ul><li>All routes from the Kernel protocol (from the OS routing table) and the Direct protocol (from network interface addresses) are labeled with a BGP community.</li><li>Routes with the community are filtered out in exterior peerings with DN42 and NeoNetwork.</li><li>This way, my internal IPs won't be announced to other networks, but since my DN42 and NeoNetwork IP blocks are configured in the Static protocol, they won't be impacted.</li></ul><p>Initially, everything looked normal, until a few days later, when some users on Telegram found that my looking glass bot timed out on any IP in DN42.</p><h2 id="whats-going-on-3">What's Going On</h2><p>Initially, everything was indeed normal, and my IP block <code>172.22.76.184/29</code> was announced correctly. That lasted until the Direct protocol performed a refresh, obtained <code>172.22.76.184/29</code> from one of the network interfaces, and sent the route to Bird's routing table again.</p><p>The new route overwrote the previous route, and since it came from the Direct protocol, it was labeled with the community and was no longer announced. The Static protocol, on the other hand, is indeed "static", and wouldn't overwrite the route again.</p><p>At this point, I had effectively stopped announcing my IP range. 
No wonder I cannot receive any packets coming back to my nodes now.</p><h2 id="correct-way-to-do-this-2">Correct Way to Do This</h2><p>In Bird, you should avoid getting the same route entry from multiple routing protocols, as they overwrite each other and may cause unexpected behavior.</p><p>I finally chose to limit Direct protocol to my internal IP range with a filter, so it won't overwrite my DN42 ranges again.</p><h1 id="need-better-glasses">Need Better Glasses?</h1><p>A new user registered an ASN:</p><p><img src="../../../../../usr/uploads/202008/dn42-asn-error.png" alt="Errorneous ASN in DN42 Registry"></p><p>This is what happened to DN42:</p><ul><li><p>Telegram Group: (Translation available below the image)</p><p><img src="../../../../../usr/uploads/202008/dn42-asn-error-response.png" alt="Telegram Reactions"></p><p>Translation:</p><pre><code class="hljs language-html"><span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Why someone with an ASN of 424242236 came to peer with me<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Yep, 9 digits<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> /whois@lantian_lg_bot 424242236 <span class="hljs-tag"><<span class="hljs-name">lg</span>></span> (outputs WHOIS information of the AS)<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> And it has proper WHOIS information <span class="hljs-tag"><<span class="hljs-name">KaiKai</span>></span> https://net-info.nia.ac.cn/#424242236 <span class="hljs-tag"><<span class="hljs-name">KaiKai</span>></span> Really, it exists <span class="hljs-tag"><<span class="hljs-name">Pastel</span>></span> Burble didn't spot the error? 
<span class="hljs-tag"><<span class="hljs-name">Pastel</span>></span> Like the /64, which crashed the ROA generator</code></pre></li><li><p>IRC:</p><pre><code class="hljs language-html"><span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Someone successfully registered in DN42 with ASN 424242236 (9 digits)
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Is this expected?
<xu**> doh
<xu**> shouldt have happened
<xu**> probably forgot the extra 2
<xu**> 424242 2236
<Kai*> too late tho. it already has one peer with tech9
<dne*> filtering fail!
<xu**> pomoke?
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> yep, doesn't seem to be on irc though
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> nor on telegram
<0x7*> so how a 9-digit ASN passed the schema checker...?
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> I don't think schema checker checks ASN, or it will block out clearnet ASNs
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> But maybe we need a warning?
<xu**> probably a bug in the policy checker
<xu**> i wish we had gone with a prefix that had a visual space
<xu**> like AS424200xxxx
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> Well pomoke tried to peer with me via email (but ended in spam folder)
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> I'm going to tell him/her to correct the ASN
<Kai*> 9 is a good number tho
<Kai*> once in a blue moon that bur* made mistake
<sun*> westerners love digital 9
<bur*> crap
<bur*> lantian, are you in contact with pomoke? if they can submit a fix quickly then I'll merge it. Otherwise I'll need to pull the commit
<span class="hljs-tag"><<span class="hljs-name">lantian</span>></span> bur*: I sent him/her an email, not sure about response time
<bur*> umm, I'm going to have to pull it then</code></pre></li><li><p>Justice Has Arrived:</p><p><img src="../../../../../usr/uploads/202008/dn42-asn-error-correction.png" alt="Erroneous Commit Reverted"></p></li></ul><h2 id="defensive-measures-3">Defensive Measures</h2><ul><li><p>Just have fun, as this is so rare:</p><pre><code class="hljs language-html"><Kai*> once in a blue moon that bur* made mistake</code></pre></li><li><p>But while having fun, remember to point out the problem on IRC.</p></li><li><p>Double-check your peer's information when peering.</p></li><li><p>Check <a href="https://t.me/DN42new">DN42 New ASN</a>, a Telegram channel that announces new DN42 ASNs, in your free time.</p></li></ul><h1 id="be-careful-of-bgp-local-preferences">Be Careful of BGP Local Preferences</h1><p>When I was helping others debug their networks in the DN42 Telegram group, I suddenly noticed a routing loop between two of my nodes:</p><pre><code class="hljs language-bash">traceroute to fd28:cb8f:4c92:1::1 (fd28:cb8f:4c92:1::1), 30 hops max, 80 byte packets
 1 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 88.023 ms
 2 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 94.401 ms
 3 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 167.664 ms
 4 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 174.235 ms
 5 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 247.213 ms
 6 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 253.499 ms
 7 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 326.690 ms
 8 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 333.412 ms
 9 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 406.978 ms
10 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 413.537 ms
11 us-new-york-city.virmach-ny1g.lantian.dn42 (fdbc:f9dc:67ad:8::1) 486.762 ms
12 lu-bissen.buyvm.lantian.dn42 (fdbc:f9dc:67ad:2::1) 493.147 ms
18 hops not responding.</code></pre><p>I logged onto these two nodes, and indeed, the VirMach node did choose BuyVM's route as the preferred path, and the BuyVM node did the same for VirMach's route.</p><p>Isn't BGP supposed to prevent loops? Why is each of these two nodes choosing the route from the other?</p><h2 id="whats-going-on-4">What's Going On</h2><p>The problem involves 4 nodes from 3 ASes:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="87.54,-8 87.54,-139 170.08,-139 170.08,-8 87.54,-8"></polygon><text text-anchor="middle" x="128.81" y="-122.4" font-family="Times,serif" font-size="14.00">My Nodes</text></g><!-- VirMach --><g id="node1" class="node"><title>VirMach</title><polygon fill="white" stroke="white" points="162.08,-106 95.54,-106 95.54,-70 162.08,-70 162.08,-106"></polygon><text text-anchor="middle" x="128.81" y="-83.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="159.76,-52 97.87,-52 97.87,-16 159.76,-16 159.76,-52"></polygon><text text-anchor="middle" x="128.81" y="-29.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="black" d="M149.72,-69.51C150.94,-67.25 151.68,-64.99 151.93,-62.73"></path><polygon fill="black" 
stroke="black" points="155.15,-62.12 149.53,-53.14 148.33,-63.67 155.15,-62.12"></polygon></g><!-- BuyVM->VirMach --><g id="edge3" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="black" d="M108.1,-52.14C106.83,-54.4 106.04,-56.66 105.74,-58.93"></path><polygon fill="black" stroke="black" points="102.47,-59.42 107.91,-68.51 109.33,-58.01 102.47,-59.42"></polygon></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="56.77,-106 2.77,-106 2.77,-70 56.77,-70 56.77,-106"></polygon><text text-anchor="middle" x="29.77" y="-83.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- KSKB->VirMach --><g id="edge4" class="edge"><title>KSKB->VirMach</title><path fill="none" stroke="black" d="M57.03,-88C65.51,-88 75.13,-88 84.5,-88"></path><polygon fill="black" stroke="black" points="84.29,-91.5 94.29,-88 84.29,-84.5 84.29,-91.5"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="59.54,-52 0,-52 0,-16 59.54,-16 59.54,-52"></polygon><text text-anchor="middle" x="29.77" y="-29.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- KSKB->Lutoma --><g id="edge1" class="edge"><title>KSKB->Lutoma</title><path fill="none" stroke="black" d="M29.77,-69.51C29.77,-67.34 29.77,-65.17 29.77,-63"></path><polygon fill="black" stroke="black" points="33.27,-63.14 29.77,-53.14 26.27,-63.14 33.27,-63.14"></polygon></g><!-- Lutoma->BuyVM --><g id="edge5" class="edge"><title>Lutoma->BuyVM</title><path fill="none" stroke="black" d="M59.89,-34C68.3,-34 77.65,-34 86.65,-34"></path><polygon fill="black" stroke="black" points="86.37,-37.5 96.37,-34 86.37,-30.5 86.37,-37.5"></polygon></g></g></svg></p><p>KSKB is the source for the route <code>fd28:cb8f:4c92::/48</code>. He broadcasted the route to Lutoma, as well as my VirMach node. 
Lutoma then broadcast the route to my BuyVM node.</p><p><strong>All my nodes have the <code>add paths yes;</code> option turned on</strong>, which means my nodes exchange all the routes they receive, rather than only the preferred ones written into the kernel routing table. Therefore, as far as the BuyVM node is concerned, it can choose from two paths to the source:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-139 90.54,-139 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-122.4" font-family="Times,serif" font-size="14.00">My Nodes</text></g><!-- VirMach --><g id="node1" class="node"><title>VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="175.31,-52 121.31,-52 121.31,-16 175.31,-16 175.31,-52"></polygon><text text-anchor="middle" x="148.31" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge2" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.85,-34C91.6,-34 101.13,-34 110.11,-34"></path><polygon fill="black" stroke="black" points="110.09,-37.5 120.09,-34 110.09,-30.5 110.09,-37.5"></polygon></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" 
points="80.22,-106 18.32,-106 18.32,-70 80.22,-70 80.22,-106"></polygon><text text-anchor="middle" x="49.27" y="-83.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- BuyVM->VirMach --><g id="edge1" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="black" d="M49.27,-69.51C49.27,-67.34 49.27,-65.17 49.27,-63"></path><polygon fill="black" stroke="black" points="52.77,-63.14 49.27,-53.14 45.77,-63.14 52.77,-63.14"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="178.08,-106 118.54,-106 118.54,-70 178.08,-70 178.08,-106"></polygon><text text-anchor="middle" x="148.31" y="-83.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.71,-88C89.15,-88 98.44,-88 107.34,-88"></path><polygon fill="black" stroke="black" points="107.31,-91.5 117.31,-88 107.31,-84.5 107.31,-91.5"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M148.31,-69.51C148.31,-67.34 148.31,-65.17 148.31,-63"></path><polygon fill="black" stroke="black" points="151.81,-63.14 148.31,-53.14 144.81,-63.14 151.81,-63.14"></polygon></g></g></svg></p><p>The same applies for my VirMach node:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="186pt" height="155pt" viewBox="0.00 0.00 186.08 155.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 151)"><polygon fill="white" stroke="none" points="-4,4 -4,-151 182.08,-151 182.08,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-139 90.54,-139 90.54,-8 8,-8"></polygon><text text-anchor="middle" 
x="49.27" y="-122.4" font-family="Times,serif" font-size="14.00">My Nodes</text></g><!-- VirMach --><g id="node1" class="node"><title>VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-106 18.32,-106 18.32,-70 80.22,-70 80.22,-106"></polygon><text text-anchor="middle" x="49.27" y="-83.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="black" d="M49.27,-52.14C49.27,-54.31 49.27,-56.48 49.27,-58.65"></path><polygon fill="black" stroke="black" points="45.77,-58.51 49.27,-68.51 52.77,-58.51 45.77,-58.51"></polygon></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="175.31,-52 121.31,-52 121.31,-16 175.31,-16 175.31,-52"></polygon><text text-anchor="middle" x="148.31" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge1" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.85,-34C91.6,-34 101.13,-34 110.11,-34"></path><polygon fill="black" stroke="black" points="110.09,-37.5 120.09,-34 110.09,-30.5 110.09,-37.5"></polygon></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="178.08,-106 118.54,-106 118.54,-70 178.08,-70 178.08,-106"></polygon><text text-anchor="middle" x="148.31" y="-83.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.71,-88C89.15,-88 98.44,-88 107.34,-88"></path><polygon fill="black" stroke="black" points="107.31,-91.5 117.31,-88 
107.31,-84.5 107.31,-91.5"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M148.31,-69.51C148.31,-67.34 148.31,-65.17 148.31,-63"></path><polygon fill="black" stroke="black" points="151.81,-63.14 148.31,-53.14 144.81,-63.14 151.81,-63.14"></polygon></g></g></svg></p><p>Generally speaking, the VirMach node should prefer the direct route to KSKB over the path through my BuyVM node and Lutoma's node, which takes 2 hops in total (iBGP hops within the same AS aren't counted). In that case, regardless of which next hop the BuyVM node prefers, Lutoma's node or my VirMach node, it will have a working path rather than a routing loop.</p><p>The problem is that <strong>I manually adjusted route preferences with a BIRD filter</strong>. <a href="https://wiki.dn42.dev/howto/Bird-communities">DN42 has a standard set of BGP communities to mark the source region of each route.</a> To reduce network latency, I used the following algorithm (simplified) to adjust my route preferences:</p><pre><code class="hljs language-bash">Preference = 200 - 10 * (Hop count)
If the current node is <span class="hljs-keyword">in</span> the same region as the route <span class="hljs-built_in">source</span>:
  Preference += 100</code></pre><p>When the problem happened, the original route from KSKB didn't have a source region community set. <strong>However, Lutoma's network was set up incorrectly, and added a source region community to KSKB's route as well</strong>, one marking the same region as my VirMach node. 
(According to the standard of DN42, networks should only add source region communities to their own routes, not to routes received from other networks.)</p><p>Now my BuyVM node calculated the following route preferences, and chose the route through my VirMach node:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="290pt" height="173pt" viewBox="0.00 0.00 289.64 173.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 169)"><polygon fill="white" stroke="none" points="-4,4 -4,-169 285.64,-169 285.64,4 -4,4"></polygon><g id="clust1" class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-157 90.54,-157 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-140.4" font-family="Times,serif" font-size="14.00">My Nodes</text></g><!-- VirMach --><g id="node1" class="node"><title>VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="278.87,-52 224.87,-52 224.87,-16 278.87,-16 278.87,-52"></polygon><text text-anchor="middle" x="251.87" y="-29.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge2" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="red" d="M82.95,-34C118.76,-34 176.12,-34 213.64,-34"></path><polygon fill="red" stroke="red" points="213.61,-37.5 223.61,-34 213.61,-30.5 213.61,-37.5"></polygon></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-124 18.32,-124 18.32,-88 80.22,-88 80.22,-124"></polygon><text text-anchor="middle" 
x="49.27" y="-101.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- BuyVM->VirMach --><g id="edge1" class="edge"><title>BuyVM->VirMach</title><path fill="none" stroke="red" d="M49.27,-87.59C49.27,-80.11 49.27,-71.29 49.27,-62.99"></path><polygon fill="red" stroke="red" points="52.77,-63.17 49.27,-53.17 45.77,-63.17 52.77,-63.17"></polygon><text text-anchor="middle" x="37.87" y="-65.8" font-family="Times,serif" font-size="14.00">200 - 10 * 1 = 190</text></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="281.64,-124 222.1,-124 222.1,-88 281.64,-88 281.64,-124"></polygon><text text-anchor="middle" x="251.87" y="-101.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="black" d="M80.3,-106C115.02,-106 172.31,-106 210.8,-106"></path><polygon fill="black" stroke="black" points="210.61,-109.5 220.61,-106 210.61,-102.5 210.61,-109.5"></polygon><text text-anchor="middle" x="152.32" y="-110.2" font-family="Times,serif" font-size="14.00">200 - 10 * 2 = 180</text></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="black" d="M251.87,-87.53C251.87,-79.36 251.87,-71.19 251.87,-63.02"></path><polygon fill="black" stroke="black" points="255.37,-63.28 251.87,-53.28 248.37,-63.28 255.37,-63.28"></polygon></g></g></svg></p><p>Yet my VirMach node chose the route through BuyVM:</p><p><!--?xml version="1.0" encoding="UTF-8" standalone="no"?--><!-- Generated by graphviz version 8.0.5 (0) --><!-- Pages: 1 --><svg width="290pt" height="189pt" viewBox="0.00 0.00 289.64 189.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink"><g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 185)"><polygon fill="white" stroke="none" points="-4,4 -4,-185 285.64,-185 285.64,4 -4,4"></polygon><g id="clust1" 
class="cluster"><title>cluster_1</title><polygon fill="lightgrey" stroke="lightgrey" points="8,-8 8,-173 90.54,-173 90.54,-8 8,-8"></polygon><text text-anchor="middle" x="49.27" y="-156.4" font-family="Times,serif" font-size="14.00">My Nodes</text></g><!-- VirMach --><g id="node1" class="node"><title>VirMach</title><polygon fill="white" stroke="white" points="82.54,-52 16,-52 16,-16 82.54,-16 82.54,-52"></polygon><text text-anchor="middle" x="49.27" y="-29.8" font-family="Times,serif" font-size="14.00">VirMach</text></g><!-- BuyVM --><g id="node2" class="node"><title>BuyVM</title><polygon fill="white" stroke="white" points="80.22,-140 18.32,-140 18.32,-104 80.22,-104 80.22,-140"></polygon><text text-anchor="middle" x="49.27" y="-117.8" font-family="Times,serif" font-size="14.00">BuyVM</text></g><!-- VirMach->BuyVM --><g id="edge2" class="edge"><title>VirMach->BuyVM</title><path fill="none" stroke="red" d="M49.27,-52.23C49.27,-64.03 49.27,-79.65 49.27,-93.11"></path><polygon fill="red" stroke="red" points="45.77,-92.79 49.27,-102.79 52.77,-92.79 45.77,-92.79"></polygon><text text-anchor="middle" x="29.47" y="-82.2" font-family="Times,serif" font-size="14.00">200 - 10 * 2 + 100 = 280</text><text text-anchor="middle" x="29.47" y="-65.4" font-family="Times,serif" font-size="14.00">(Same region)</text></g><!-- KSKB --><g id="node3" class="node"><title>KSKB</title><polygon fill="none" stroke="black" points="278.87,-64 224.87,-64 224.87,-28 278.87,-28 278.87,-64"></polygon><text text-anchor="middle" x="251.87" y="-41.8" font-family="Times,serif" font-size="14.00">KSKB</text></g><!-- VirMach->KSKB --><g id="edge1" class="edge"><title>VirMach->KSKB</title><path fill="none" stroke="black" d="M82.95,-35.95C118.76,-38.1 176.12,-41.53 213.64,-43.77"></path><polygon fill="black" stroke="black" points="213.42,-47.33 223.61,-44.43 213.83,-40.34 213.42,-47.33"></polygon><text text-anchor="middle" x="152.32" y="-65" font-family="Times,serif" font-size="14.00">200 - 10 * 1 = 
190</text><text text-anchor="middle" x="152.32" y="-48.2" font-family="Times,serif" font-size="14.00">(No region info)</text></g><!-- Lutoma --><g id="node4" class="node"><title>Lutoma</title><polygon fill="none" stroke="black" points="281.64,-138 222.1,-138 222.1,-102 281.64,-102 281.64,-138"></polygon><text text-anchor="middle" x="251.87" y="-115.8" font-family="Times,serif" font-size="14.00">Lutoma</text></g><!-- BuyVM->Lutoma --><g id="edge3" class="edge"><title>BuyVM->Lutoma</title><path fill="none" stroke="red" d="M80.3,-121.7C115.02,-121.35 172.31,-120.78 210.8,-120.4"></path><polygon fill="red" stroke="red" points="210.65,-123.89 220.61,-120.29 210.58,-116.89 210.65,-123.89"></polygon></g><!-- Lutoma->KSKB --><g id="edge4" class="edge"><title>Lutoma->KSKB</title><path fill="none" stroke="red" d="M251.87,-101.79C251.87,-92.95 251.87,-84.12 251.87,-75.28"></path><polygon fill="red" stroke="red" points="255.37,-75.5 251.87,-65.5 248.37,-75.5 255.37,-75.5"></polygon></g></g></svg></p><p>And now we have a routing loop.</p><h2 id="correct-way-to-do-this-3">Correct Way to Do This</h2><p>For this problem to appear, all three of the following conditions must be met:</p><ol><li><strong>The <code>add paths yes;</code> option is turned on</strong>, so that secondary routes are sent to other nodes as well. If this option were off, then as soon as the BuyVM node chose the VirMach node as its next hop, it would stop announcing its route through Lutoma to the VirMach node. The VirMach node would then have no choice but to send traffic directly to KSKB.</li><li>The route preference algorithm <strong>prefers secondary routes over primary routes from other nodes</strong>. 
Therefore, if we want to keep the <code>add paths yes;</code> option on, then when designing the iBGP route preference algorithm, we need to guarantee that routes from the same node <strong>keep their priorities in the same order as on that node</strong>, so that primary routes are always used over secondary routes.</li><li>Routes passing through Lutoma's network were incorrectly tagged with a new BGP community, which changed the order of route priorities.</li></ol><p>My solution is to no longer recalculate the priority of routes received over iBGP. Instead, I always use the priority value calculated by the edge node that received the route, which is passed along with the route announcement over iBGP. This guarantees that the order of primary and secondary routes never changes.</p>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-website/">Website and Servers</category>
<category domain="https://lantian.pub/tag/DN42/">DN42</category>
<category domain="https://lantian.pub/tag/BGP/">BGP</category>
<comments>https://lantian.pub/en/article/modify-website/how-to-kill-the-dn42-network.lantian/#disqus_thread</comments>
</item>
<item>
<title>NVIDIA GPU Passthrough to a Virtual Machine on an Optimus MUXed Laptop (Updated 2023-05)</title>
<link>https://lantian.pub/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/</link>
<guid>https://lantian.pub/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/</guid>
<pubDate>Sun, 07 May 2023 16:28:52 GMT</pubDate>
<description><p>A year ago, so that I could browse the web and write code on Arch Linux while using Windows for games and other tasks that can't be done conveniently on Linux, <a href="/article/modify-computer/laptop-intel-nvidia-optimus-passth</description>
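<content:encoded><![CDATA[<p>A year ago, so that I could browse the web and write code on Arch Linux while using Windows for games and other tasks that can't be done conveniently on Linux, <a href="/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian">I tried GPU passthrough on my Lenovo R720 gaming laptop</a>. However, that laptop uses the Optimus MUXless architecture (the earlier post introduces the different architectures), meaning the discrete GPU has no output ports and relies entirely on the integrated GPU for display. That severely limited what the setup could do, and I eventually gave it up.</p><p>But now I have a new laptop. Its HDMI output port is wired directly to the NVIDIA discrete GPU, which makes it an Optimus MUXed architecture. With this architecture, there is a way to make the virtual machine see a "monitor attached to the discrete GPU", so that most functionality works normally. So I can finally build a GPU passthrough setup that is usable in the long term.</p><h1 id="更新日志">Changelog</h1><ul><li>2023-05-08: Updated some content for the new Looking Glass B6 release.</li><li>2022-01-26: The PCIe power saving patch turned out to be ineffective in testing.</li></ul><h1 id="准备工作">Preparation</h1><p>Before following this post, you need to prepare:</p><ol><li><p>A laptop with the Optimus MUXed architecture. My laptop is an HP OMEN 17t-ck000 (i7-11800H, RTX 3070).</p><ul><li>(2022-01) My operating system was Arch Linux, updated to the latest version.</li><li>(2023-05) For this update I was running NixOS, but most steps also apply to other Linux distributions.</li><li>I recommend disabling Secure Boot, although since you already have Linux installed, you have most likely done so. In theory, Secure Boot may impose some restrictions on PCIe passthrough.</li></ul></li><li><p>A Windows 10 or Windows 11 virtual machine set up with Libvirt (Virt-Manager). I use Windows 11.</p><ul><li>My VM boots in UEFI (OVMF) mode, but in theory BIOS mode (SeaBIOS) works too. Nothing in these steps requires UEFI boot.</li><li><strong>Make sure Secure Boot is disabled in the VM! Otherwise some drivers cannot be installed!</strong><ul><li>The Windows 11 installer checks whether Secure Boot is enabled. With Secure Boot disabled, it may report that the computer is incompatible and refuse to install. In that case, you can follow this article to work around it: <a href="https://sysin.org/blog/windows-11-no-tpm/">https://sysin.org/blog/windows-11-no-tpm/</a></li></ul></li><li>First set up a QXL virtual display adapter, so that you can see the VM's video output.</li></ul></li><li><p>(Optional) Depending on your laptop's video output port, a dummy plug (fake monitor) for HDMI, DP, or USB Type-C. They usually cost a few to a dozen or so CNY each on Taobao.</p><ul><li>(2023-05) Alternatively, you can install a virtual display driver.</li><li><img src="../../../../usr/uploads/202201/hdmi-dummy-plug.jpg" alt="HDMI dummy plug"></li></ul></li><li><p>(Optional) An external USB keyboard and mouse set.</p></li></ol><p>Before you start, a few warnings:</p><ul><li>You will reboot the host system several times during the process, and some operations risk crashing the host. Back up your data.</li><li>You do not need to download any NVIDIA driver manually during the process. Just let Windows download it automatically.<ul><li>If Windows fails to download it automatically, the furthest you should go in manual installation is downloading the driver EXE and double-clicking it to install.</li><li>Never manually pick a device for driver installation in Device Manager.</li><li>Manually installing GPU drivers can sometimes interfere with troubleshooting.</li></ul></li></ul><h2 id="购买-optimus-muxed-架构的新电脑">Buying a New Laptop with the Optimus MUXed Architecture</h2><p>If you are interested in trying GPU passthrough and are about to buy a new laptop, you can refer to how I choose one below.</p><p>The prerequisites for GPU passthrough are:</p><ol><li>NVIDIA 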
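discrete GPU itself must have video output capability</li><li>At least one video port on the chassis must be connected to the discrete GPU</li></ol><p>However, gaming laptop manufacturers rarely state on their product pages whether the video ports are connected to the discrete GPU or the integrated GPU. So we can only infer it from commonly listed specifications:</p><ol><li><p>Prefer laptops that support a direct connection between the discrete GPU and the built-in display, because in that case the discrete GPU definitely has video output capability, and the manufacturer will most likely wire the chassis video ports to the discrete GPU.</p><ul><li>Typical examples include the 2020 and 2021 models of the Lenovo Legion series, the HP OMEN series, and the Dell G15.</li><li><strong>But I do not guarantee that these examples are accurate!</strong> Do your own research or ask customer service to confirm that the laptop supports the "discrete GPU direct connection" feature.</li></ul></li><li><p>Alternatively, pick a laptop with a mid-range or high-end discrete GPU. Generally, the NVIDIA model number should end in 60 or higher.</p><ul><li>Mid-range and high-end NVIDIA GPUs generally have video output capability, and in that case the manufacturer will most likely wire the chassis video ports to the discrete GPU.</li><li>Do not buy laptops whose GPU model number ends in 50 or lower, such as the RTX 3050 or GTX 1650 Ti. They most likely do not support video output.</li></ul></li><li><p>Make good use of the seven-day no-questions-asked return policy.</p><ul><li>Because manufacturers do not advertise, or even particularly care about, how the video output ports are wired, we can only guess from the specifications and product pages. So it is entirely possible to follow the rules above and still end up with a laptop that cannot do GPU passthrough. In that case, consider returning or reselling it.</li><li>In some countries (including China), laptop manufacturers require for no-questions-asked returns that "the preinstalled Windows and Office have not been activated online", while the latest Windows 11 forces online activation in the first-boot setup wizard. So consider booting the laptop from Linux on a USB flash drive or portable hard drive to test it, and activate Windows only after it passes.</li></ul></li></ol><h2 id="关于-intel-gvt-g-虚拟核显">About Intel GVT-g Virtual Integrated GPUs</h2><p>The integrated GPUs in 5th to 9th generation Intel CPUs support virtualizing the GPU itself: the GPU is split into several virtual GPUs, which are passed through to virtual machines so that they get GPU acceleration, while the host can still use the GPU for display at the same time.</p><p>However, the GVT-g driver on Linux does not support 10th generation or newer CPUs, <a href="https://github.com/intel/gvt-linux/issues/126">and Intel has no plans to support them</a>. On top of that, a GVT-g virtual GPU cannot form an Optimus configuration with the NVIDIA discrete GPU, so it would not be of much use anyway.</p><p>So we can ignore GVT-g and pass through only the NVIDIA discrete GPU.</p><h2 id="2023-05关于-intel-核显-sr-iov-虚拟化">(2023-05) About SR-IOV Virtualization of Intel Integrated GPUs</h2><p>The integrated GPUs in 11th generation and later Intel CPUs use a different virtualization method: SR-IOV. Intel <a href="https://github.com/intel/linux-intel-lts/tree/lts-v5.15.49-adl-linux-220826T092047Z/drivers/gpu/drm/i915">has released the kernel module code for SR-IOV</a>, but it has not been merged into the mainline Linux kernel. <a href="https://github.com/strongtz/i915-sriov-dkms">A third-party project has ported this kernel code into a DKMS module</a>, but judging from its Issues the success rate is not high, and my test on the i7-11800H did not succeed either. So this post does not cover SR-IOV on Intel integrated GPUs.</p><h1 id="操作步骤">Steps</h1><h2 id="禁止宿主系统管理-nvidia-独显">Stop the Host System from Managing the NVIDIA Discrete GPU</h2><blockquote><p>Most of this section is the same as in <a href="/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian">this post from 2021</a>.</p></blockquote><p>The NVIDIA driver on the host claims the discrete GPU and prevents the VM from using it, so it first needs to be replaced with the <code>vfio-pci</code> driver, which is built for PCIe passthrough.</p><p>The steps to disable the NVIDIA driver and hand the discrete GPU over to the kernel module that handles PCIe passthrough for VMs are as follows:</p><ol><li><p>Run <code>lspci -nn | grep 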
NVIDIA</code>, and you will get output similar to the following:</p><pre><code class="hljs language-bash">0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] [10de:249d] (rev a1)
0000:01:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)</code></pre><p>Here <code>[10de:249d]</code> is the vendor ID and device ID of the discrete GPU: <code>10de</code> means the PCIe device is made by NVIDIA, and <code>249d</code> means it is a 3070. <code>228b</code> is the audio output of the HDMI port, which also needs to be taken over by the <code>vfio-pci</code> driver.</p></li><li><p>Create <code>/etc/modprobe.d/lantian.conf</code> with the following content:</p><pre><code class="hljs language-bash">options vfio-pci ids=10de:249d,10de:228b</code></pre><p>This configures <code>vfio-pci</code>, the kernel driver responsible for PCIe passthrough, to manage the discrete GPU. The <code>ids</code> parameter holds the vendor ID and device ID of the GPU to pass through.</p></li><li><p>Edit <code>/etc/mkinitcpio.conf</code> and add the following to <code>MODULES</code>:</p><pre><code class="hljs language-bash">MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd)</code></pre><p>Remove <code>nvidia</code> and any other kernel modules related to the NVIDIA driver, or make sure they come after the VFIO drivers. This way the PCIe passthrough modules claim the discrete GPU early in the boot process and keep the NVIDIA driver from claiming it later.</p></li><li><p>Run <code>mkinitcpio -P</code> to update the initramfs.</p></li><li><p>Reboot.</p></li></ol><p>(2023-05) If you are on NixOS, you can use the following configuration directly:</p><pre><code class="hljs language-nix">{
  boot.<span class="hljs-attr">kernelModules</span> = [<span class="hljs-string">"vfio-pci"</span>];
  boot.<span class="hljs-attr">extraModprobeConfig</span> = <span class="hljs-string">''
    # Change these to the vendor ID and device ID of your GPU
    options vfio-pci ids=10de:249d
  ''</span>;
  boot.<span class="hljs-attr">blacklistedKernelModules</span> = [<span class="hljs-string">"nouveau"</span> <span class="hljs-string">"nvidiafb"</span> <span class="hljs-string">"nvidia"</span> <span class="hljs-string">"nvidia-uvm"</span> <span class="hljs-string">"nvidia-drm"</span> <span class="hljs-string">"nvidia-modeset"</span>];
}</code></pre><h2 id="配置-nvidia-独显直通">Set Up NVIDIA Discrete GPU Passthrough</h2><p>In <a href="/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian">my 2021 post</a>, this section described a whole pile of ways to bypass restrictions in the NVIDIA driver. <a href="https://nvidia.custhelp.com/app/answers/detail/a_id/5173">But starting with version 465, NVIDIA lifted most of these restrictions</a>, so in theory you can now simply pass the GPU through to the VM and it will work.</p><p>But that is only in theory.</p><p>I still recommend completing all the steps for hiding the VM, because:</p><ol><li><p>(2022-01) For laptops, NVIDIA has not lifted all of the restrictions.</p><ul><li><del>At least in my tests, the GPU's PCIe bus address and whether the system has a battery could still make passthrough fail, with the driver reporting error code 43.</del></li><li>(2023-05) In my tests this time, the PCIe bus address and the presence of a battery no longer affect passthrough.</li></ul></li><li><p>Even if the NVIDIA driver does not detect VMs, the programs you run may. Hiding the VM's characteristics improves the chance that these programs run successfully.</p><ul><li>Typical examples include online games with anti-cheat systems, and some commercial software that requires online activation.</li></ul></li></ol><p>So, let's get started:</p><ol><li><p>Unlike with the Optimus MUXless architecture, this time I got GPU passthrough working without manually extracting the GPU BIOS or modifying the UEFI firmware.</p><ul><li>If drivers cannot be installed after the GPU is passed through to the VM, that is, Windows does not download and install them automatically and the installer downloaded from NVIDIA's website also reports that no compatible GPU was found, then you most likely still need to extract the GPU BIOS.</li><li>To double-check, open Device Manager in the VM, find your GPU, and look at its hardware ID, which looks like <code>PCI\VEN_10DE&DEV_1C8D&SUBSYS_39D117AA&REV_A1</code>. If <code>SUBSYS</code> is followed by a string of zeros, the GPU BIOS failed to load and you need to extract it manually.</li><li>For detailed steps, see the "Set Up NVIDIA Discrete GPU Passthrough" section of <a href="/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian">last year's post</a>.</li></ul></li><li><p>Edit your VM configuration with <code>virsh edit Windows</code> and make the following changes:</p><pre><code class="hljs language-xml"><span class="hljs-comment"><!-- Change the features section to this, so that QEMU hides the VM's characteristics --></span>
<span class="hljs-tag"><<span class="hljs-name">features</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">acpi</span>/></span>
  <span class="hljs-tag"><<span class="hljs-name">apic</span>/></span>
  <span class="hljs-tag"><<span class="hljs-name">hyperv</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"custom"</span>></span>
    <span class="hljs-tag"><<span class="hljs-name">relaxed</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">vapic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">spinlocks</span> <span class="hljs-attr">state</span>=<span 
class="hljs-string">"on"</span> <span class="hljs-attr">retries</span>=<span class="hljs-string">"8191"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">vpindex</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">runtime</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">synic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">stimer</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">reset</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">vendor_id</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"GenuineIntel"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">frequencies</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
    <span class="hljs-tag"><<span class="hljs-name">tlbflush</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
  <span class="hljs-tag"></<span class="hljs-name">hyperv</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">kvm</span>></span>
    <span class="hljs-tag"><<span class="hljs-name">hidden</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span>
  <span class="hljs-tag"></<span class="hljs-name">kvm</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">vmport</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"off"</span>/></span>
<span class="hljs-tag"></<span class="hljs-name">features</span>></span>
<span class="hljs-comment"><!-- Add the PCIe device for GPU passthrough --></span>
<span class="hljs-tag"><<span class="hljs-name">hostdev</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">'subsystem'</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'pci'</span> <span class="hljs-attr">managed</span>=<span class="hljs-string">'yes'</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">source</span>></span>
    <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">'0x0000'</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">'0x01'</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">'0x00'</span> <span class="hljs-attr">function</span>=<span class="hljs-string">'0x0'</span>/></span>
  <span class="hljs-tag"></<span class="hljs-name">source</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">rom</span> <span class="hljs-attr">bar</span>=<span class="hljs-string">'off'</span>/></span>
  <span class="hljs-comment"><!-- Note that the PCIe bus address here must be exactly 01:00.0 --></span>
  <span class="hljs-comment"><!-- If saving reports a PCIe bus address conflict, delete the <address> of all other devices --></span>
  <span class="hljs-comment"><!-- so that Libvirt reassigns the PCIe addresses --></span>
  <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'pci'</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">'0x0000'</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">'0x01'</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">'0x00'</span> <span class="hljs-attr">function</span>=<span class="hljs-string">'0x0'</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">'on'</span>/></span>
<span class="hljs-tag"></<span class="hljs-name">hostdev</span>></span>
<span class="hljs-comment"><!-- Add a block of memory shared between the VM and the host, used to send the VM's display output back to the host --></span>
<span class="hljs-tag"><<span class="hljs-name">shmem</span> <span 
class="hljs-attr">name</span>=<span class="hljs-string">'looking-glass'</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'ivshmem-plain'</span>/></span> <span class="hljs-comment"><!-- The memory size formula here is: width x height / 131072, then round up to the next power of 2 --></span> <span class="hljs-comment"><!-- Most HDMI dummy plugs report 3840 x 2160, which gives 63.28 MB, rounded up to 64 MB --></span> <span class="hljs-tag"><<span class="hljs-name">size</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">'M'</span>></span>64<span class="hljs-tag"></<span class="hljs-name">size</span>></span><span class="hljs-tag"></<span class="hljs-name">shmem</span>></span><span class="hljs-comment"><!-- Disable the memory balloon (dynamic memory resizing), which severely hurts performance --></span><span class="hljs-tag"><<span class="hljs-name">memballoon</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"none"</span>/></span><span class="hljs-comment"><!-- Add these arguments before </qemu:commandline> --></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">'-acpitable'</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">'file=/ssdt1.dat'</span>/></span></code></pre><p>The ssdt1.dat file here is a modified ACPI table that emulates a fully charged battery. It corresponds to the Base64 below; you can convert it into a binary file with an <a href="https://base64.guru/converter/decode/file">online Base64 decoder</a> and place it in the root directory, or <a href="../../../../usr/uploads/202007/ssdt1.dat">download it from this site</a>.</p><pre><code class="hljs language-bash">U1NEVKEAAAAB9EJPQ0hTAEJYUENTU0RUAQAAAElOVEwYEBkgoA8AFVwuX1NCX1BDSTAGABBMBi5fU0JfUENJMFuCTwVCQVQwCF9ISUQMQdAMCghfVUlEABQJX1NUQQCkCh8UK19CSUYApBIjDQELcBcLcBcBC9A5C1gCCywBCjwKPA0ADQANTElPTgANABQSX0JTVACkEgoEAAALcBcL0Dk=</code></pre></li><li><p>Fix the permissions of the shared memory file.</p><ol><li><p>Edit <code>/etc/apparmor.d/local/abstractions/libvirt-qemu</code> and add the following line:</p><pre><code class="hljs language-bash">/dev/shm/looking-glass 
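rw,</code></pre><p>Then run <code>sudo systemctl restart apparmor</code> to restart AppArmor.</p></li><li><p>Create <code>/etc/tmpfiles.d/looking-glass.conf</code> with the following content, replacing <code>lantian</code> with your own username:</p><pre><code class="hljs language-bash">f /dev/shm/looking-glass 0660 lantian kvm -</code></pre><p>Then run <code>sudo systemd-tmpfiles /etc/tmpfiles.d/looking-glass.conf --create</code> to apply it.</p></li></ol></li><li><p>Start the VM and wait a while; Windows will install the NVIDIA driver automatically.</p><ul><li>If the GPU shows an exclamation mark in Device Manager with code 43, meaning the driver failed to load, check whether you skipped any of the steps above and whether every setting is correct.<ul><li>(2022-01) Switch Device Manager to the <code>Device by Connection</code> view and confirm that the GPU is at Bus 1, Slot 0, Function 0, and that the PCIe port above it is at Bus 0, Slot 1, Function 0.</li><li>If the addresses do not match, reassign the devices' PCIe addresses using the method described above.</li><li>(2023-05) In this round of testing I no longer needed this step.</li></ul></li><li>If the system does not install the NVIDIA driver automatically, and the installer you downloaded manually also reports an incompatible system or no GPU found, open the GPU's properties and check whether <code>SUBSYS</code> in its hardware ID is followed by a string of zeros.<ul><li>If it is, refer to step 1.</li></ul></li></ul></li><li><p>Shut the VM down and start it again (<strong>a full shutdown, not a reboot</strong>), then confirm in Device Manager once more that the GPU works.</p><ul><li>If code 43 appears at this point, check whether you added the emulated battery at the end of step 2.</li><li>My first attempt used Windows 10 LTSC 2019, and code 43 also appeared after a restart. Because I had not added the emulated battery at that time, I cannot tell whether the cause was the NVIDIA driver being incompatible with that Windows version or the missing battery. I recommend using the latest version of Windows 10 or Windows 11.</li></ul></li><li><p>Choose one of the following two options:</p><ol><li>(2022-01) Plug your HDMI dummy plug into the computer; the VM should detect a new monitor.</li><li>(2023-05) Install a virtual display driver:<ol><li>Download the <a href="https://github.com/ge9/IddSampleDriver">ge9/IddSampleDriver</a> virtual display driver and extract it to <code>C:\IddSampleDriver</code>. Note that this folder must not be moved elsewhere afterwards!</li><li>Open <code>C:\IddSampleDriver\option.txt</code>. The first line is the number 1 (do not change it), followed by a list of resolutions and refresh rates. Keep only the one resolution and refresh rate you want and delete all the others.</li><li>Open Device Manager, choose "Action - Add legacy hardware" from the menu, click through "select from a list - Show All Devices - Have Disk", then select <code>C:\IddSampleDriver\IddSampleDriver.inf</code> and click Next until the installation finishes.</li><li>Windows should now detect a new monitor.</li><li>In my testing, some pixels in the Looking Glass output were wrong when using the virtual display. If you can, I still recommend an HDMI dummy plug.</li></ol></li></ol></li><li><p>(2023-05) Recent versions of Looking Glass install the IVSHMEM driver (the driver for memory shared between the VM and the host) automatically, so you no longer need to install it by hand. The manual steps are kept here for reference:</p><ol><li><p>(2022-01) Download this Virtio driver package, copy it into the VM, and extract it. <strong>Make sure it is this exact package, since most other versions do not include the IVSHMEM 
driver</strong>:</p><p><a href="https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/upstream-virtio/virtio-win10-prewhql-0.1-161.zip">https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/upstream-virtio/virtio-win10-prewhql-0.1-161.zip</a></p></li><li><p>Open Device Manager in the VM and find System devices - <code>PCI standard RAM controller</code>:</p><ol><li>Right-click it and choose "Update driver"</li><li>Click "Browse my computer for drivers"</li><li>Click "Let me pick from a list"</li><li>Click the "Have Disk" button</li><li>Select the <code>Win10/amd64/ivshmem.inf</code> file inside the extracted Virtio driver folder</li><li>Click Next until the driver is installed; the device name should now have changed to <code>IVSHMEM</code></li></ol></li></ol></li><li><p>Install <a href="https://looking-glass.io/downloads">Looking Glass</a>, a tool that sends the VM's display output to the host.</p><ul><li>The dummy display we added becomes the only monitor the VM can see. Without Looking Glass you would not be able to see the VM's screen at all.</li><li>Click "Windows Host Binary" on the page linked above to download it, then double-click it inside the VM to install.</li></ul></li><li><p>(2023-05) If you follow the 2022-01 steps, you will not see anything while the VM boots and before Looking Glass starts. I therefore recommend disabling the QXL virtual GPU in Device Manager instead. The old steps are kept below for reference.</p><ul><li><p>(2022-01) Shut the VM down and edit its configuration with <code>virsh edit Windows</code>.</p><p>Find <code><video><model type="qxl" ...></video></code> and change <code>type</code> to <code>none</code> to disable the QXL virtual GPU:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">video</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"none"</span>/></span><span class="hljs-tag"></<span class="hljs-name">video</span>></span></code></pre></li></ul></li><li><p>Install the Looking Glass client on the host. Arch Linux users can install the <code>looking-glass</code> package directly from the AUR. Run <code>looking-glass-client</code> to start the client.</p></li><li><p>Go back to Virt-Manager, close the VM's window (the one used to view the VM's desktop and edit its configuration), then right-click your VM in the Virt-Manager main window and click Run.</p></li><li><p>After a short wait, the Looking Glass client shows the VM's screen, and GPU passthrough is fully set up.</p></li></ol><h1 id="性能和体验优化">Performance and usability tuning</h1><p>GPU passthrough now works, but the VM experience still needs tuning. Specifically:</p><ol><li>(2022-01) Looking Glass can forward mouse and keyboard input but not audio, so the VM has no sound;<ul><li>(2023-05) The latest Looking Glass can now forward audio.</li></ul></li><li>(2022-01) Looking Glass 
sometimes drops keystrokes when forwarding mouse and keyboard input;<ul><li>(2023-05) The latest Looking Glass now forwards mouse and keyboard input reliably.</li></ul></li><li>The IVSHMEM shared memory feature has a host-side kernel module that lets Looking Glass on the host use DMA for better performance;</li><li>After the VM shuts down, the discrete GPU is put into the PCIe D3hot state, in which it still draws around 10 W and shortens battery life.</li></ol><p>We will address these issues one by one.</p><h2 id="传输虚拟机声音">Forwarding the VM's audio</h2><p>(2023-05) Recent versions of Looking Glass can forward audio. The steps below are kept for reference.</p><div class="btn-group btn-group" role="group" id="lti-g509757"><input type="radio" name="lti-g509757" id="lti-509757-4081349" class="btn-check lti-option" autocomplete="off" data-lti-tag="sound_hide"><label class="btn btn-outline-primary" for="lti-509757-4081349">Hide</label><input type="radio" name="lti-g509757" id="lti-509757-5152650" class="btn-check lti-option" autocomplete="off" data-lti-tag="sound_show"><label class="btn btn-outline-primary" for="lti-509757-5152650">Show the old steps from 2022-01</label></div><div id="lti-content-sound_show" class="lti-content"><p>Virt-Manager itself can connect to the VM over the SPICE protocol and carry its audio, but Looking Glass also uses SPICE to forward keyboard and mouse input, and a VM accepts only one SPICE connection at a time. This means we can no longer use Virt-Manager to hear the VM's sound.</p><p>Instead, we can install <a href="https://github.com/duncanthrax/scream">Scream</a>, a virtual sound card for Windows that sends audio over the VM's network interface, and receive it on the host with the Scream client.</p><p>In the VM, download the Scream installer from <a href="https://github.com/duncanthrax/scream/releases">Scream's releases page</a>, extract it, right-click <code>Install-x64.bat</code> and run it as administrator to install the driver, then reboot.</p><p>On the host, install the Scream client. Arch Linux users can install the <code>scream</code> package from the AUR.</p><p>Open a terminal on the host and run <code>scream -v</code>, then play some audio in the VM to check whether you can hear it. If not, try specifying the network interface the Scream client listens on, for example <code>scream -i virbr0 -v</code>, where <code>virbr0</code> is the interface for Virt-Manager's default NAT network, over which your VM talks to the host.</p><p>Finally, you can create a systemd service to start the Scream client conveniently. Create <code>~/.config/systemd/user/scream.service</code> with the following content:</p><pre><code class="hljs language-bash">[Unit]
Description=Scream
[Service]
Type=simple
Restart=always
RestartSec=1
ExecStart=/usr/bin/scream -i virbr0 -v
[Install]
WantedBy=graphical-session.target</code></pre><p>From then on, you only need to run <code>systemctl --user start scream</code>.</p></div><h2 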
id="直通键盘鼠标操作">Passing keyboard and mouse input through</h2><p>(2023-05) Recent versions of Looking Glass forward mouse and keyboard input reliably. The steps below are kept for reference.</p><div class="btn-group btn-group" role="group" id="lti-g4084584"><input type="radio" name="lti-g4084584" id="lti-4084584-9969683" class="btn-check lti-option" autocomplete="off" data-lti-tag="keyboardmouse_hide"><label class="btn btn-outline-primary" for="lti-4084584-9969683">Hide</label><input type="radio" name="lti-g4084584" id="lti-4084584-5180584" class="btn-check lti-option" autocomplete="off" data-lti-tag="keyboardmouse_show"><label class="btn btn-outline-primary" for="lti-4084584-5180584">Show the old steps from 2022-01</label></div><div id="lti-content-keyboardmouse_show" class="lti-content"><p>Looking Glass's keyboard and mouse forwarding is not very reliable and sometimes drops input, so if you want to play games in the VM you need a more dependable way to get input into it.</p><p>There are two options: let the Libvirt VM capture the host's keyboard and mouse input directly, or pass a USB keyboard and mouse through to the VM.</p><ol><li><p>Capturing the host's keyboard and mouse input.</p><p>On Linux, all keyboard and mouse input reaches the desktop environment through the <code>evdev</code> (<code>Event Device</code>) framework. Libvirt can listen to these events and forward them to the VM. Libvirt can also switch input between the VM and the host when you press Left Ctrl + Right Ctrl together, so a single keyboard and mouse can control both the host and the VM.</p><p>First, run <code>ls -l /dev/input/by-path</code> on the host to list your <code>evdev</code> devices. Mine, for example, are:</p><pre><code class="hljs language-bash">pci-0000:00:14.0-usb-0:1:1.1-event-mouse <span class="hljs-comment"># external USB mouse</span>
pci-0000:00:14.0-usb-0:1:1.1-mouse
pci-0000:00:14.0-usb-0:6:1.0-event
pci-0000:00:15.0-platform-i2c_designware.0-event-mouse <span class="hljs-comment"># built-in touchpad</span>
pci-0000:00:15.0-platform-i2c_designware.0-mouse
pci-0000:00:1f.3-platform-skl_hda_dsp_generic-event
platform-i8042-serio-0-event-kbd <span class="hljs-comment"># built-in keyboard</span>
platform-pcspkr-event-spkr</code></pre><p>Entries whose names contain <code>event-mouse</code> are mice, and those containing <code>event-kbd</code> are keyboards.</p><p>Next, edit the VM configuration with <code>virsh edit Windows</code> and add the following inside <code><devices></code>:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"evdev"</span>></span> <span 
class="hljs-comment"><!-- Change the mouse or keyboard path according to the ls output above --></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"/dev/input/by-path/platform-i8042-serio-0-event-kbd"</span> <span class="hljs-attr">grab</span>=<span class="hljs-string">"all"</span> <span class="hljs-attr">repeat</span>=<span class="hljs-string">"on"</span>/></span><span class="hljs-tag"></<span class="hljs-name">input</span>></span><span class="hljs-comment"><!-- If you have several mice or keyboards, repeat the block for each --></span><span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"evdev"</span>></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"/dev/input/by-path/pci-0000:00:15.0-platform-i2c_designware.0-event-mouse"</span> <span class="hljs-attr">grab</span>=<span class="hljs-string">"all"</span> <span class="hljs-attr">repeat</span>=<span class="hljs-string">"on"</span>/></span><span class="hljs-tag"></<span class="hljs-name">input</span>></span></code></pre><p>Start the VM. The keyboard and mouse will stop responding on the host because the VM has captured them. Press Left Ctrl + Right Ctrl to give control back to the host, and press the combination again to control the VM.</p><p>We can then disable Looking Glass's keyboard and mouse forwarding. Create <code>/etc/looking-glass-client.ini</code> with the following content:</p><pre><code class="hljs language-ini"><span class="hljs-section">[spice]</span>
<span class="hljs-attr">enable</span>=<span class="hljs-literal">no</span></code></pre></li><li><p>USB keyboard and mouse passthrough</p><p>Capturing input does not work for everything. My touchpad, for example, cannot be captured properly: the cursor inside the VM does not move.</p><p>If you run into the same problem and have a USB keyboard and mouse, you can pass them through to the VM and dedicate them to controlling it. USB passthrough for VMs is very mature, so you are unlikely to hit problems.</p><p>In Virt-Manager, choose <code>Add Hardware</code> - <code>USB Host Device</code> and select your mouse and keyboard.</p></li></ol></div><h2 id="用内核模块加速-looking-glass">Speeding up Looking Glass with a kernel module</h2><blockquote><p>Most of this section comes from <a href="https://looking-glass.io/docs/B6/module/">https://looking-glass.io/docs/B6/module/</a></p></blockquote><p>Looking Glass provides a kernel module for the IVSHMEM shared memory device that lets 
Looking Glass read the VM's frames efficiently via DMA, which improves the frame rate.</p><ol><li><p>Install the Linux kernel headers and DKMS. On Arch Linux, these are the <code>linux-headers</code> and <code>dkms</code> packages.</p></li><li><p>Install <code>looking-glass-module-dkms</code> from the AUR.</p></li><li><p>Configure a udev rule: create <code>/etc/udev/rules.d/99-kvmfr.rules</code> with the following content:</p><pre><code class="hljs language-bash">SUBSYSTEM==<span class="hljs-string">"kvmfr"</span>, OWNER=<span class="hljs-string">"lantian"</span>, GROUP=<span class="hljs-string">"kvm"</span>, MODE=<span class="hljs-string">"0660"</span></code></pre><p>Replace <code>lantian</code> with your own username.</p></li><li><p>Configure the memory size: create <code>/etc/modprobe.d/looking-glass.conf</code> with the following content:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># The memory size here is computed the same way as for the VM's shmem element.</span>
options kvmfr static_size_mb=64</code></pre></li><li><p>Load the module automatically at boot: create <code>/etc/modules-load.d/looking-glass.conf</code> containing the single line <code>kvmfr</code>.</p></li><li><p>Run <code>sudo modprobe kvmfr</code> to load the module. A new <code>kvmfr0</code> device appears under <code>/dev</code>; this is the Looking Glass memory device.</p></li><li><p>Edit <code>/etc/apparmor.d/local/abstractions/libvirt-qemu</code> and add the following line:</p><pre><code class="hljs language-bash">/dev/kvmfr0 rw,</code></pre><p>This allows the VM to access the device. Run <code>sudo systemctl restart apparmor</code> to restart AppArmor.</p></li><li><p>Edit the VM configuration with <code>virsh edit Windows</code>:</p><ol><li><p>Delete the <code><shmem></code> section from <code><devices></code>:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">shmem</span> <span class="hljs-attr">name</span>=<span class="hljs-string">'looking-glass'</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'ivshmem-plain'</span>/></span> <span class="hljs-tag"><<span class="hljs-name">size</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">'M'</span>></span>64<span class="hljs-tag"></<span class="hljs-name">size</span>></span><span class="hljs-tag"></<span 
class="hljs-name">shmem</span>></span></code></pre></li><li><p>Add the following lines inside <code><qemu:commandline></code>:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-device"</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>driver<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>ivshmem-plain<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>shmem-looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>memdev<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>}"</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-object"</span>/></span><span class="hljs-comment"><!-- The next line contains 67108864, which is 64 MB * 1048576 --></span><span class="hljs-comment"><!-- If you configured a different memory size earlier, change it accordingly --></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>qom-type<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>memory-backend-file<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>mem-path<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>/dev/kvmfr0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>size<span 
class="hljs-symbol">&quot;</span>:67108864,<span class="hljs-symbol">&quot;</span>share<span class="hljs-symbol">&quot;</span>:true}"</span>/></span></code></pre></li><li><p>Start the VM.</p></li></ol></li><li><p>Edit <code>/etc/looking-glass-client.ini</code> and add the following:</p><pre><code class="hljs language-ini"><span class="hljs-section">[app]</span>
<span class="hljs-attr">shmFile</span>=/dev/kvmfr0</code></pre></li><li><p>Start Looking Glass; you should now see the VM's screen.</p></li><li><p>(2023-05) If you use NixOS, you can use the following configuration directly:</p></li></ol><pre><code class="hljs language-nix">{
  boot.<span class="hljs-attr">extraModulePackages</span> = <span class="hljs-keyword">with</span> config.boot.kernelPackages; [ kvmfr ];
  boot.<span class="hljs-attr">extraModprobeConfig</span> = <span class="hljs-string">''
    # The memory size here is computed the same way as for the VM's shmem element.
    options kvmfr static_size_mb=64
  ''</span>;
  boot.<span class="hljs-attr">kernelModules</span> = [<span class="hljs-string">"kvmfr"</span>];
  services.udev.<span class="hljs-attr">extraRules</span> = <span class="hljs-string">''
    SUBSYSTEM=="kvmfr", OWNER="root", GROUP="libvirtd", MODE="0660"
  ''</span>;
  environment.etc.<span class="hljs-string">"looking-glass-client.ini"</span>.<span class="hljs-attr">text</span> = <span class="hljs-string">''
    [app]
    shmFile=/dev/kvmfr0
  ''</span>;
}</code></pre><h2 id="不使用虚拟机时给独立显卡断电">Powering off the discrete GPU when the VM is not in use</h2><p><strong>Update 2022-01-26: in practice, the NVIDIA GPU is still not fully powered off after applying this patch, and power draw is the same as without it. This section is no longer valid.</strong></p><div class="btn-group btn-group" role="group" id="lti-g3995260"><input type="radio" name="lti-g3995260" id="lti-3995260-7719856" class="btn-check lti-option" autocomplete="off" data-lti-tag="power_hide"><label class="btn btn-outline-primary" for="lti-3995260-7719856">Obsolete content hidden</label><input type="radio" name="lti-g3995260" id="lti-3995260-8577973" class="btn-check lti-option" autocomplete="off" data-lti-tag="power_show"><label class="btn btn-outline-primary" for="lti-3995260-8577973">Click to show</label></div><div id="lti-content-power_show" 
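class="lti-content"><blockquote><p>This section applies only to NVIDIA GPUs of the 20 series and newer; with NVIDIA's official driver, these GPUs can also power themselves off automatically. NVIDIA GPUs of the 10 series and older do not support this feature.</p><p>This section involves compiling your own kernel and <strong>using a kernel patch that has not been rigorously reviewed or tested</strong>. It is not recommended for users unfamiliar with Linux. Weigh the risks yourself.</p></blockquote><p>When you are not using the VM, the <code>vfio-pci</code> driver that manages PCIe passthrough puts the device into the <code>D3</code> state, the power-saving state for PCIe devices. <code>D3</code> comes in two variants, though: <code>D3hot</code>, in which the device remains powered, and <code>D3cold</code>, in which the device is completely powered off. The <code>vfio-pci</code> driver in the current kernel supports only <code>D3hot</code>, so the NVIDIA discrete GPU, whose chip is still powered, keeps drawing around 10 W and shortens the laptop's battery life.</p><p>An NVIDIA engineer posted a patch series on the Linux kernel mailing list that adds <code>D3cold</code> support to <code>vfio-pci</code>. With the patches applied, the NVIDIA discrete GPU is powered off completely when the VM shuts down, saving power.</p><p>The series can be found at <a href="https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/">https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/</a>. It consists of three patches, which I merged into one and uploaded to <a href="https://github.com/xddxdd/pkgbuild/blob/master/linux-xanmod-lantian/0007-vfio-pci-d3cold.patch">https://github.com/xddxdd/pkgbuild/blob/master/linux-xanmod-lantian/0007-vfio-pci-d3cold.patch</a>.</p><p>On Arch Linux, patching the kernel is fairly easy. Most kernel PKGBUILDs in the AUR apply patches automatically: download a kernel PKGBUILD and add this patch to its <code>source</code> section. For the exact changes, see this commit of mine: <a href="https://github.com/xddxdd/pkgbuild/commit/406adb7bf5657cfe07bb17ff561d11ed97ebab39">https://github.com/xddxdd/pkgbuild/commit/406adb7bf5657cfe07bb17ff561d11ed97ebab39</a></p><p><strong>Note that this patch is not guaranteed to be stable.</strong></p><p>According to the mailing list discussion:</p><ol><li>It is an RFC (Request for Comments) patch, that is, an experimental one; the subject line on the mailing list carries a prominent <code>[RFC]</code> tag.</li><li>If the GPU driver inside the VM tries to switch the GPU into <code>D3cold</code>, this patch risks resetting the GPU, losing its state, and crashing the VM as a result. I have not run into this with my Windows 11 VM so far, but you should be aware of the risk.</li><li>So far the developer has tested only some NVIDIA GPUs; support for other PCIe devices is not guaranteed.</li></ol><p><strong>Use at your own risk.</strong></p></div><h1 id="资料来源">Sources</h1><p>Thanks to everyone who worked on GPU passthrough before me; this article could not exist without their efforts.</p><p>These are the references I consulted during setup:</p><ul><li>NVIDIA dGPU passthrough<ul><li>Misairu-G's NVIDIA Optimus MUXed passthrough guide on GitHub <a 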
href="https://gist.github.com/Misairu-G/616f7b2756c488148b7309addc940b28">https://gist.github.com/Misairu-G/616f7b2756c488148b7309addc940b28</a></li><li>The virtual battery patch from the VFIO subreddit <a href="https://www.reddit.com/r/VFIO/comments/ebo2uk/nvidia_geforce_rtx_2060_mobile_success_qemu_ovmf/">https://www.reddit.com/r/VFIO/comments/ebo2uk/nvidia_geforce_rtx_2060_mobile_success_qemu_ovmf/</a></li><li>Arch Linux Wiki <a href="https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF">https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF</a></li></ul></li><li>Looking Glass documentation<ul><li>Installation guide <a href="https://looking-glass.io/docs/B6/install/">https://looking-glass.io/docs/B6/install/</a></li><li>Kernel module guide <a href="https://looking-glass.io/docs/B6/module/">https://looking-glass.io/docs/B6/module/</a></li></ul></li><li>Virtual display drivers<ul><li>The version used in this article, with configurable resolution and refresh rate <a href="https://github.com/ge9/IddSampleDriver">https://github.com/ge9/IddSampleDriver</a></li><li>The original version, with fixed resolution and refresh rate <a href="https://github.com/roshkins/IddSampleDriver">https://github.com/roshkins/IddSampleDriver</a></li></ul></li><li>VFIO D3cold patch<ul><li>Phoronix's coverage <a href="https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Runtime-PM-VFIO-PCI">https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Runtime-PM-VFIO-PCI</a></li><li>The Linux kernel mailing list thread <a href="https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/">https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/</a></li></ul></li></ul><h2 id="附录最终-libvirt-xml-文件">Appendix: final Libvirt XML file</h2><div class="btn-group btn-group" role="group" id="lti-g4710752"><input type="radio" name="lti-g4710752" id="lti-4710752-7543940" class="btn-check lti-option" autocomplete="off" data-lti-tag="xml_hide"><label class="btn btn-outline-primary" for="lti-4710752-7543940">Hide</label><input type="radio" name="lti-g4710752" id="lti-4710752-9622013" class="btn-check lti-option" autocomplete="off" data-lti-tag="xml_show"><label class="btn btn-outline-primary" 
for="lti-4710752-9622013">Show the full XML file</label></div><div id="lti-content-xml_show" class="lti-content"><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">domain</span> <span class="hljs-attr">xmlns:qemu</span>=<span class="hljs-string">"http://libvirt.org/schemas/domain/qemu/1.0"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"kvm"</span>></span> <span class="hljs-tag"><<span class="hljs-name">name</span>></span>Windows11<span class="hljs-tag"></<span class="hljs-name">name</span>></span> <span class="hljs-tag"><<span class="hljs-name">uuid</span>></span>5d5b00d8-475a-4b6c-8053-9dda30cd2f95<span class="hljs-tag"></<span class="hljs-name">uuid</span>></span> <span class="hljs-tag"><<span class="hljs-name">metadata</span>></span> <span class="hljs-tag"><<span class="hljs-name">libosinfo:libosinfo</span> <span class="hljs-attr">xmlns:libosinfo</span>=<span class="hljs-string">"http://libosinfo.org/xmlns/libvirt/domain/1.0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">libosinfo:os</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"http://microsoft.com/win/11"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">libosinfo:libosinfo</span>></span> <span class="hljs-tag"></<span class="hljs-name">metadata</span>></span> <span class="hljs-tag"><<span class="hljs-name">memory</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"KiB"</span>></span>16777216<span class="hljs-tag"></<span class="hljs-name">memory</span>></span> <span class="hljs-tag"><<span class="hljs-name">currentMemory</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"KiB"</span>></span>16777216<span class="hljs-tag"></<span class="hljs-name">currentMemory</span>></span> <span class="hljs-tag"><<span class="hljs-name">vcpu</span> <span class="hljs-attr">placement</span>=<span class="hljs-string">"static"</span>></span>16<span class="hljs-tag"></<span 
class="hljs-name">vcpu</span>></span> <span class="hljs-tag"><<span class="hljs-name">os</span>></span> <span class="hljs-tag"><<span class="hljs-name">type</span> <span class="hljs-attr">arch</span>=<span class="hljs-string">"x86_64"</span> <span class="hljs-attr">machine</span>=<span class="hljs-string">"pc-q35-8.0"</span>></span>hvm<span class="hljs-tag"></<span class="hljs-name">type</span>></span> <span class="hljs-tag"><<span class="hljs-name">loader</span> <span class="hljs-attr">readonly</span>=<span class="hljs-string">"yes"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pflash"</span>></span>/run/libvirt/nix-ovmf/OVMF_CODE.fd<span class="hljs-tag"></<span class="hljs-name">loader</span>></span> <span class="hljs-tag"><<span class="hljs-name">nvram</span> <span class="hljs-attr">template</span>=<span class="hljs-string">"/run/libvirt/nix-ovmf/OVMF_VARS.fd"</span>></span>/var/lib/libvirt/qemu/nvram/Windows11_VARS.fd<span class="hljs-tag"></<span class="hljs-name">nvram</span>></span> <span class="hljs-tag"></<span class="hljs-name">os</span>></span> <span class="hljs-tag"><<span class="hljs-name">features</span>></span> <span class="hljs-tag"><<span class="hljs-name">acpi</span>/></span> <span class="hljs-tag"><<span class="hljs-name">apic</span>/></span> <span class="hljs-tag"><<span class="hljs-name">hyperv</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"custom"</span>></span> <span class="hljs-tag"><<span class="hljs-name">relaxed</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vapic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">spinlocks</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">retries</span>=<span class="hljs-string">"8191"</span>/></span> <span 
class="hljs-tag"><<span class="hljs-name">vpindex</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">runtime</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">synic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">stimer</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">reset</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vendor_id</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"GenuineIntel"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">frequencies</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">tlbflush</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">hyperv</span>></span> <span class="hljs-tag"><<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">hidden</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">vmport</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"off"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">features</span>></span> <span class="hljs-tag"><<span class="hljs-name">cpu</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"host-passthrough"</span> <span 
class="hljs-attr">check</span>=<span class="hljs-string">"none"</span> <span class="hljs-attr">migratable</span>=<span class="hljs-string">"on"</span>></span> <span class="hljs-tag"><<span class="hljs-name">topology</span> <span class="hljs-attr">sockets</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">dies</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">cores</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">threads</span>=<span class="hljs-string">"2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">cpu</span>></span> <span class="hljs-tag"><<span class="hljs-name">clock</span> <span class="hljs-attr">offset</span>=<span class="hljs-string">"localtime"</span>></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"rtc"</span> <span class="hljs-attr">tickpolicy</span>=<span class="hljs-string">"catchup"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pit"</span> <span class="hljs-attr">tickpolicy</span>=<span class="hljs-string">"delay"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"hpet"</span> <span class="hljs-attr">present</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"hypervclock"</span> <span class="hljs-attr">present</span>=<span class="hljs-string">"yes"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">clock</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_poweroff</span>></span>destroy<span class="hljs-tag"></<span class="hljs-name">on_poweroff</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_reboot</span>></span>restart<span class="hljs-tag"></<span 
class="hljs-name">on_reboot</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_crash</span>></span>destroy<span class="hljs-tag"></<span class="hljs-name">on_crash</span>></span> <span class="hljs-tag"><<span class="hljs-name">pm</span>></span> <span class="hljs-tag"><<span class="hljs-name">suspend-to-mem</span> <span class="hljs-attr">enabled</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">suspend-to-disk</span> <span class="hljs-attr">enabled</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">pm</span>></span> <span class="hljs-tag"><<span class="hljs-name">devices</span>></span> <span class="hljs-tag"><<span class="hljs-name">emulator</span>></span>/run/libvirt/nix-emulators/qemu-system-x86_64<span class="hljs-tag"></<span class="hljs-name">emulator</span>></span> <span class="hljs-tag"><<span class="hljs-name">disk</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> <span class="hljs-attr">device</span>=<span class="hljs-string">"disk"</span>></span> <span class="hljs-tag"><<span class="hljs-name">driver</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"qemu"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"qcow2"</span> <span class="hljs-attr">discard</span>=<span class="hljs-string">"unmap"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">file</span>=<span class="hljs-string">"/var/lib/libvirt/images/Windows11.qcow2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"vda"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">boot</span> <span class="hljs-attr">order</span>=<span class="hljs-string">"1"</span>/></span> <span 
class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x04"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">disk</span>></span> <span class="hljs-tag"><<span class="hljs-name">disk</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> <span class="hljs-attr">device</span>=<span class="hljs-string">"cdrom"</span>></span> <span class="hljs-tag"><<span class="hljs-name">driver</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"qemu"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"raw"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">file</span>=<span class="hljs-string">"/mnt/root/persistent/media/LegacyOS/Common/virtio-win-0.1.215.iso"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"sdb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"sata"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">readonly</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"drive"</span> <span class="hljs-attr">controller</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">target</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">disk</span>></span> <span 
class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"qemu-xhci"</span> <span class="hljs-attr">ports</span>=<span class="hljs-string">"15"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x10"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span 
class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"2"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"2"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x11"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"3"</span> <span 
class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"3"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x12"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"4"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"4"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x13"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span 
class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"5"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"5"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x14"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x4"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"6"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span 
class="hljs-attr">chassis</span>=<span class="hljs-string">"6"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x15"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x5"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"7"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"7"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x16"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x6"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span 
class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x17"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x7"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"9"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"9"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x18"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span 
class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"10"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"10"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x19"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"11"</span> <span class="hljs-attr">model</span>=<span 
class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"11"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1a"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"12"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"12"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1b"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span 
class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"13"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"13"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1c"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x4"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"14"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span 
class="hljs-string">"14"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1d"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x5"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"sata"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x1f"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio-serial"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">slot</span>=<span 
class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">interface</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"network"</span>></span> <span class="hljs-tag"><<span class="hljs-name">mac</span> <span class="hljs-attr">address</span>=<span class="hljs-string">"52:54:00:f4:bf:15"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">network</span>=<span class="hljs-string">"default"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">interface</span>></span> <span class="hljs-tag"><<span class="hljs-name">serial</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pty"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"isa-serial"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"isa-serial"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">target</span>></span> <span class="hljs-tag"></<span 
class="hljs-name">serial</span>></span> <span class="hljs-tag"><<span class="hljs-name">console</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pty"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"serial"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">console</span>></span> <span class="hljs-tag"><<span class="hljs-name">channel</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio"</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"com.redhat.spice.0"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio-serial"</span> <span class="hljs-attr">controller</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">channel</span>></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"mouse"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"ps2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"mouse"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span 
class="hljs-attr">bus</span>=<span class="hljs-string">"0x06"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">input</span>></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"keyboard"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"ps2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"keyboard"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x07"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">input</span>></span> <span class="hljs-tag"><<span class="hljs-name">tpm</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"tpm-crb"</span>></span> <span class="hljs-tag"><<span class="hljs-name">backend</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"passthrough"</span>></span> <span class="hljs-tag"><<span class="hljs-name">device</span> <span class="hljs-attr">path</span>=<span class="hljs-string">"/dev/tpm0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">backend</span>></span> <span class="hljs-tag"></<span class="hljs-name">tpm</span>></span> <span class="hljs-tag"><<span class="hljs-name">graphics</span> <span class="hljs-attr">type</span>=<span 
class="hljs-string">"spice"</span> <span class="hljs-attr">autoport</span>=<span class="hljs-string">"yes"</span>></span> <span class="hljs-tag"><<span class="hljs-name">listen</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"address"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">image</span> <span class="hljs-attr">compression</span>=<span class="hljs-string">"off"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">graphics</span>></span> <span class="hljs-tag"><<span class="hljs-name">sound</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"ich9"</span>></span> <span class="hljs-tag"><<span class="hljs-name">audio</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x1b"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">sound</span>></span> <span class="hljs-tag"><<span class="hljs-name">audio</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spice"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">video</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"qxl"</span> <span class="hljs-attr">ram</span>=<span class="hljs-string">"65536"</span> <span class="hljs-attr">vram</span>=<span class="hljs-string">"65536"</span> <span class="hljs-attr">vgamem</span>=<span class="hljs-string">"16384"</span> <span 
class="hljs-attr">heads</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">primary</span>=<span class="hljs-string">"yes"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">video</span>></span> <span class="hljs-tag"><<span class="hljs-name">hostdev</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"subsystem"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">managed</span>=<span class="hljs-string">"yes"</span>></span> <span class="hljs-tag"><<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x05"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span 
class="hljs-name">hostdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">redirdev</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">redirdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">redirdev</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">redirdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">watchdog</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"itco"</span> <span class="hljs-attr">action</span>=<span class="hljs-string">"reset"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">memballoon</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"none"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">devices</span>></span> <span class="hljs-tag"><<span class="hljs-name">qemu:commandline</span>></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-device"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span 
class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>driver<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>ivshmem-plain<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>shmem0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>memdev<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>}"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-object"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>qom-type<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>memory-backend-file<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>mem-path<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>/dev/kvmfr0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>size<span class="hljs-symbol">&quot;</span>:67108864,<span class="hljs-symbol">&quot;</span>share<span class="hljs-symbol">&quot;</span>:true}"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-acpitable"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"file=/etc/ssdt1.dat"</span>/></span> <span class="hljs-tag"></<span 
class="hljs-name">qemu:commandline</span>></span><span class="hljs-tag"></<span class="hljs-name">domain</span>></span></code></pre></div>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-computer/">计算机与客户端</category>
<category domain="https://lantian.pub/tag/%E6%98%BE%E5%8D%A1/">显卡</category>
<category domain="https://lantian.pub/tag/%E8%99%9A%E6%8B%9F%E6%9C%BA/">虚拟机</category>
<category domain="https://lantian.pub/tag/NVIDIA/">NVIDIA</category>
<category domain="https://lantian.pub/tag/MUXed/">MUXed</category>
<comments>https://lantian.pub/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/#disqus_thread</comments>
</item>
<item>
<title>NVIDIA GPU Passthrough on an Optimus MUXed Laptop (Updated 2023-05)</title>
<link>https://lantian.pub/en/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/</link>
<guid>https://lantian.pub/en/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/</guid>
<pubDate>Sun, 07 May 2023 16:28:52 GMT</pubDate>
<description><p>A year ago, to simultaneously browse webpages and write code on my Arch Linux installation and use Windows to run tasks infeasible on Li</description>
<content:encoded><![CDATA[<p>A year ago, to simultaneously browse webpages and write code on my Arch Linux installation and use Windows to run tasks infeasible on Linux (such as gaming), <a href="/en/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian/">I tried GPU passthrough on my Lenovo R720 gaming laptop</a>. But since that laptop has an Optimus MUXless architecture (as mentioned in that post), its dedicated GPU doesn't have output ports, and the integrated GPU is in charge of all the displays. Therefore, severe limitations exist for that setup, and I eventually gave up on it.</p><p>But now, I've purchased a new laptop. The HDMI output port on this laptop is directly connected to its NVIDIA dedicated graphics card, or in other words, it has an Optimus MUXed architecture. Since there is a way to make the virtual machine aware of a "monitor on the dedicated GPU", most functionality works normally. I am finally able to create a GPU passthrough setup that works long-term.</p><h1 id="changelog">Changelog</h1><ul><li>2023-05-08: Update some content for the new Looking Glass B6 release.</li><li>2022-01-26: The PCIe power-saving patch isn't effective.</li></ul><h1 id="preparation">Preparation</h1><p>Before following the steps in this post, you need to prepare:</p><ol><li><p>A laptop with the Optimus MUXed architecture. My laptop is an HP OMEN 17t-ck000 (i7-11800H, RTX 3070).</p><ul><li>(2022-01) My operating system is Arch Linux, with the latest updates.</li><li>(2023-05) As of this update, my operating system is NixOS. Most of the steps, however, should still apply to other Linux distros.</li><li>It's recommended to turn off Secure Boot, but you likely did it anyway since you installed Linux. Theoretically, Secure Boot may cause limitations on the PCIe passthrough functionality.</li></ul></li><li><p>Set up a virtual machine of Windows 10 or Windows 11 with Libvirt (Virt-Manager). 
I'm using Windows 11.</p><ul><li>My VM boots in UEFI (OVMF) mode, but theoretically, this guide will also work with BIOS (SeaBIOS) mode. There are no steps that specifically require UEFI boot mode.</li><li><strong>You MUST turn off Secure Boot in the VM, or some drivers won't work!</strong><ul><li>The Windows 11 installer checks whether Secure Boot is enabled. With Secure Boot off, the installer might report that the computer is incompatible and refuse to install Windows. You can follow the steps in this post to fix the problem: <a href="https://www.tomshardware.com/how-to/bypass-windows-11-tpm-requirement">https://www.tomshardware.com/how-to/bypass-windows-11-tpm-requirement</a></li></ul></li><li>Set up the emulated QXL graphics card first, so you get video output from the VM.</li></ul></li><li><p>(Optional) Depending on the video output ports on your computer, purchase an HDMI, DP, or USB Type-C dummy plug. You can get one for a few bucks on Amazon.</p><ul><li>(2023-05) Or you can choose to install a virtual monitor driver.</li><li><img src="../../../../../usr/uploads/202201/hdmi-dummy-plug.jpg" alt="HDMI Dummy Plug"></li></ul></li><li><p>(Optional) A USB keyboard and mouse combo.</p></li></ol><p>A reminder before we begin:</p><ul><li>Multiple reboots of the host OS are required, and your host OS may crash! Back up your data.</li><li>You don't need to download any NVIDIA driver manually. 
Windows will do it for you automatically.<ul><li>If it doesn't, don't go any further than downloading the driver EXE and double-clicking it.</li><li><strong>Never</strong> specify the exact driver to be used in Device Manager.</li><li>Debugging will be harder if you do this.</li></ul></li></ul><h2 id="purchasing-a-new-optimus-muxed-laptop">Purchasing A New Optimus MUXed Laptop</h2><p>If you are interested in GPU passthrough and are looking for a new laptop, you can refer to my guidelines.</p><p>The prerequisites for laptop GPU passthrough are:</p><ol><li>The NVIDIA GPU itself must be capable of video output</li><li>There is at least one video output directly connected to that GPU</li></ol><p>However, it's extremely rare for a laptop manufacturer to mention the port connection schemes on their product pages, so we have to infer from more common specifications:</p><ol><li><p>Prefer a laptop with a MUX switch, that is, one that can switch its internal screen onto the dedicated GPU. In this case, the dedicated GPU must be capable of video output, and there's a high chance that the manufacturer connected the chassis video outputs to the dedicated GPU:</p><ul><li>Common examples are: 2020 and 2021 Lenovo Legion series, HP OMEN series, and Dell G15.</li><li><strong>I do not guarantee that this list is accurate!</strong> Do your own research or ask a sales agent to make sure.</li></ul></li><li><p>Or choose a laptop with a mid-range to high-end graphics card. For NVIDIA GPUs, the model number should end with 60 or higher.</p><ul><li>It's common for mid to top-tier NVIDIA GPUs to have video output functionality, and the manufacturer is likely to connect chassis video ports to them.</li><li>Do not purchase a laptop with a GPU whose model number ends in 50 or lower, like an RTX 3050 or GTX 1650 Ti. 
They very likely don't support video output.</li></ul></li><li><p>Take advantage of unconditional return policies.</p><ul><li>Since manufacturers won't advertise or won't even care much about their video output connections, we have to guess from the specifications and advertising pages. Therefore, it's possible that you followed every rule above yet got a laptop that doesn't support GPU passthrough. In this case, you may consider returning or reselling it.</li><li>In some countries, laptop manufacturers will only accept an unconditional return if the preinstalled Windows and Office are never activated online. However, the latest Windows 11 requires you to activate online in its first-boot setup wizard. You may consider trying GPU passthrough with a Linux installation on your USB drive before activating Windows.</li></ul></li></ol><h2 id="about-intel-gvt-g-virtual-gpus">About Intel GVT-g Virtual GPUs</h2><p>5th to 9th-Gen Intel integrated graphics support virtualizing the GPU itself, or in other words, splitting it into several virtual GPUs. The virtual GPUs can be passed through into VMs so they get GPU acceleration, while the host can still display stuff on the very same GPU.</p><p>However, the GVT-g driver in Linux doesn't support 10th-Gen or newer Intel CPUs, and <a href="https://github.com/intel/gvt-linux/issues/126">Intel has no plan to support them</a>. In addition, the GVT-g virtual GPU cannot form an Optimus configuration with an NVIDIA GPU, so it isn't useful anyway.</p><p>That's why we're ignoring GVT-g and focusing on the NVIDIA GPU in this guide.</p><h2 id="2023-05-about-intel-sr-iov-virtual-gpus">(2023-05) About Intel SR-IOV Virtual GPUs</h2><p>11th-Gen and later Intel integrated graphics support another form of virtualization: SR-IOV. 
Intel has <a href="https://github.com/intel/linux-intel-lts/tree/lts-v5.15.49-adl-linux-220826T092047Z/drivers/gpu/drm/i915">officially released the source code to the kernel module with SR-IOV</a>, but it isn't merged into Linux mainline as of now. <a href="https://github.com/strongtz/i915-sriov-dkms">There's a third-party project that ports the code into a DKMS module</a>, but the success rate is not high according to reports in its Issues section. I tried it with my i7-11800H and didn't succeed. Therefore, this time we will not try SR-IOV on Intel GPUs.</p><h1 id="steps">Steps</h1><h2 id="stop-host-os-from-tampering-with-nvidia-gpu">Stop Host OS from Tampering with NVIDIA GPU</h2><blockquote><p>Most of the content is the same as <a href="/en/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian/">my post in 2021</a>.</p></blockquote><p>The NVIDIA driver on the host OS will hold control of the dGPU and stop the VM from using it. Therefore, you need to replace the driver with <code>vfio-pci</code>, built solely for PCIe passthrough.</p><p>Here are the steps for disabling the NVIDIA driver and passing control to the PCIe passthrough module:</p><ol><li><p>Run <code>lspci -nn | grep NVIDIA</code> and obtain an output similar to:</p><pre><code class="hljs language-bash">0000:01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA104M [GeForce RTX 3070 Mobile / Max-Q] [10de:249d] (rev a1)
0000:01:00.1 Audio device [0403]: NVIDIA Corporation GA104 High Definition Audio Controller [10de:228b] (rev a1)</code></pre><p>Here <code>[10de:249d]</code> is the vendor ID and device ID of the dGPU, where <code>10de</code> means this device is manufactured by NVIDIA, and <code>249d</code> means this is an RTX 3070. 
<code>228b</code> is the audio output on the HDMI port, which should also be taken over by <code>vfio-pci</code>.</p></li><li><p>Create <code>/etc/modprobe.d/lantian.conf</code> with the following content:</p><pre><code class="hljs language-bash">options vfio-pci ids=10de:249d,10de:228b</code></pre><p>This configures <code>vfio-pci</code>, the kernel module responsible for PCIe passthrough, to manage the dGPU. <code>ids</code> lists the vendor ID and device ID of each device to be passed through.</p></li><li><p>Modify <code>/etc/mkinitcpio.conf</code> and add the following contents to <code>MODULES</code>:</p><pre><code class="hljs language-bash">MODULES=(vfio_pci vfio vfio_iommu_type1 vfio_virqfd)</code></pre><p>And remove anything related to NVIDIA drivers (such as <code>nvidia</code>), or make sure they're listed after the VFIO drivers. Now the PCIe passthrough module will take control of the dGPU early in the boot process, preventing NVIDIA drivers from taking control.</p></li><li><p>Run <code>mkinitcpio -P</code> to update the initramfs.</p></li><li><p>Reboot.</p></li></ol><p>(2023-05) If you're using NixOS, you can use the following config:</p><pre><code class="hljs language-nix">{
  boot.<span class="hljs-attr">kernelModules</span> = [<span class="hljs-string">"vfio-pci"</span>];
  boot.<span class="hljs-attr">extraModprobeConfig</span> = <span class="hljs-string">''
    # Change to your GPU's vendor ID and device ID
    options vfio-pci ids=10de:249d
  ''</span>;
  boot.<span class="hljs-attr">blacklistedKernelModules</span> = [<span class="hljs-string">"nouveau"</span> <span class="hljs-string">"nvidiafb"</span> <span class="hljs-string">"nvidia"</span> <span class="hljs-string">"nvidia-uvm"</span> <span class="hljs-string">"nvidia-drm"</span> <span class="hljs-string">"nvidia-modeset"</span>];
}</code></pre><h2 id="setting-up-nvidia-dgpu-passthrough">Setting up NVIDIA dGPU Passthrough</h2><p>In <a href="/en/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian/">my 
post in 2021</a>, I mentioned a lot of configurations to circumvent restrictions of the NVIDIA driver. But <a href="https://nvidia.custhelp.com/app/answers/detail/a_id/5173">since version 465, NVIDIA lifted most of the restrictions</a>, so theoretically, you pass a GPU into the VM, and everything should just work.</p><p>But that's just the theory.</p><p>I still recommend that everyone follow all the steps and hide the VM characteristics, because:</p><ol><li><p>(2022-01) Not all restrictions are lifted for laptops.</p><ul><li><del>At least in my tests, an incorrect PCIe bus address for the GPU and the absence of a battery still cause passthrough to fail, and the driver will error out with the infamous code 43.</del></li><li>(2023-05) In my attempt today, the PCIe bus address and the absence of a battery no longer affect the outcome of GPU passthrough.</li></ul></li><li><p>Even if the NVIDIA driver isn't detecting VMs, the programs you run might. Hiding VM characteristics increases the chance of running them successfully.</p><ul><li>Examples include online games with anti-cheat systems, or commercial software that requires online activation.</li></ul></li></ol><p>And here we start:</p><ol><li><p>Unlike with the Optimus MUXless architecture, I didn't have to manually extract the graphics card's video BIOS or modify the UEFI firmware; everything just works.</p><ul><li>If you cannot install the GPU driver after passing the GPU into the VM, including cases where Windows won't automatically install it, or NVIDIA's official installer fails and reports no compatible devices, you likely will still need to extract your GPU's video BIOS.</li><li>To double-check, open Device Manager in the VM, and look at the Hardware ID, which looks like <code>PCI\VEN_10DE&DEV_1C8D&SUBSYS_39D117AA&REV_A1</code>. 
If <code>SUBSYS</code> is followed by a sequence of zeros, then the GPU video BIOS is missing, and you need the manual steps.</li><li>Refer to <a href="/en/article/modify-computer/laptop-intel-nvidia-optimus-passthrough.lantian/">my post last year</a>, specifically the NVIDIA GPU passthrough section, for detailed steps.</li></ul></li><li><p>Modify your VM configuration, <code>virsh edit Windows</code>, and make the following changes:</p><pre><code class="hljs language-xml"><span class="hljs-comment"><!-- Modify the features section, so QEMU will hide the fact that this is a VM --></span><span class="hljs-tag"><<span class="hljs-name">features</span>></span> <span class="hljs-tag"><<span class="hljs-name">acpi</span>/></span> <span class="hljs-tag"><<span class="hljs-name">apic</span>/></span> <span class="hljs-tag"><<span class="hljs-name">hyperv</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"custom"</span>></span> <span class="hljs-tag"><<span class="hljs-name">relaxed</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vapic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">spinlocks</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">retries</span>=<span class="hljs-string">"8191"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vpindex</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">runtime</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">synic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">stimer</span> <span 
class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">reset</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vendor_id</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"GenuineIntel"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">frequencies</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">tlbflush</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">hyperv</span>></span> <span class="hljs-tag"><<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">hidden</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">vmport</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"off"</span>/></span><span class="hljs-tag"></<span class="hljs-name">features</span>></span><span class="hljs-comment"><!-- Add the PCIe passthrough device, must be below the hostdev for iGPU --></span><span class="hljs-tag"><<span class="hljs-name">hostdev</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">'subsystem'</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'pci'</span> <span class="hljs-attr">managed</span>=<span class="hljs-string">'yes'</span>></span> <span class="hljs-tag"><<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">'0x0000'</span> <span class="hljs-attr">bus</span>=<span 
class="hljs-string">'0x01'</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">'0x00'</span> <span class="hljs-attr">function</span>=<span class="hljs-string">'0x0'</span>/></span> <span class="hljs-tag"></<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">rom</span> <span class="hljs-attr">bar</span>=<span class="hljs-string">'off'</span>/></span> <span class="hljs-comment"><!-- The PCIe bus address here MUST BE EXACTLY 01:00.0 --></span> <span class="hljs-comment"><!-- If there is a PCIe bus address conflict when saving config changes, --></span> <span class="hljs-comment"><!-- Remove <address> of all other devices --></span> <span class="hljs-comment"><!-- And Libvirt will reallocate PCIe bus addresses --></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'pci'</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">'0x0000'</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">'0x01'</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">'0x00'</span> <span class="hljs-attr">function</span>=<span class="hljs-string">'0x0'</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">'on'</span>/></span><span class="hljs-tag"></<span class="hljs-name">hostdev</span>></span><span class="hljs-comment"><!-- Add a shared memory between VM and host --></span><span class="hljs-comment"><!-- So VM can transfer its display content to host --></span><span class="hljs-tag"><<span class="hljs-name">shmem</span> <span class="hljs-attr">name</span>=<span class="hljs-string">'looking-glass'</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'ivshmem-plain'</span>/></span> <span class="hljs-comment"><!-- Size is calculated as: display resolution width x height / 131072 --></span> <span 
class="hljs-comment"><!-- then round up to power of 2 --></span> <span class="hljs-comment"><!-- Most HDMI dummy plugs have a resolution of 3840 x 2160 --></span> <span class="hljs-comment"><!-- The result is 63.28MB which rounds up to 64MB --></span> <span class="hljs-tag"><<span class="hljs-name">size</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">'M'</span>></span>64<span class="hljs-tag"></<span class="hljs-name">size</span>></span><span class="hljs-tag"></<span class="hljs-name">shmem</span>></span><span class="hljs-comment"><!-- Disable memory ballooning, since ballooning hurts performance significantly --></span><span class="hljs-tag"><<span class="hljs-name">memballoon</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"none"</span>/></span><span class="hljs-comment"><!-- Add these parameters before </qemu:commandline> --></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">'-acpitable'</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">'file=/ssdt1.dat'</span>/></span></code></pre><p>The <code>ssdt1.dat</code> file is an ACPI table that emulates a fully charged battery. It corresponds to the Base64 below. It can be converted to a binary file with a <a href="https://base64.guru/converter/decode/file">Base64 decoding website</a> or <a href="../../../../../usr/uploads/202007/ssdt1.dat">downloaded from this site</a>. 
Place it in the root directory (<code>/</code>).</p><pre><code class="hljs language-bash">U1NEVKEAAAAB9EJPQ0hTAEJYUENTU0RUAQAAAElOVEwYEBkgoA8AFVwuX1NCX1BDSTAGABBMBi5fU0JfUENJMFuCTwVCQVQwCF9ISUQMQdAMCghfVUlEABQJX1NUQQCkCh8UK19CSUYApBIjDQELcBcLcBcBC9A5C1gCCywBCjwKPA0ADQANTElPTgANABQSX0JTVACkEgoEAAALcBcL0Dk=</code></pre></li><li><p>Modify permissions for the shared memory.</p><ol><li><p>Modify <code>/etc/apparmor.d/local/abstractions/libvirt-qemu</code> and add this line:</p><pre><code class="hljs language-bash">/dev/shm/looking-glass rw,</code></pre><p>Then execute <code>sudo systemctl restart apparmor</code> to restart AppArmor.</p></li><li><p>Create <code>/etc/tmpfiles.d/looking-glass.conf</code> with the following contents, replacing <code>lantian</code> with your username:</p><pre><code class="hljs language-bash">f /dev/shm/looking-glass 0660 lantian kvm -</code></pre><p>Then execute <code>sudo systemd-tmpfiles /etc/tmpfiles.d/looking-glass.conf --create</code> to apply it.</p></li></ol></li><li><p>Start the VM and wait a while. Windows will automatically install NVIDIA drivers.</p><ul><li>If Device Manager shows the GPU with an exclamation mark and error code 43, you need to check if you've missed any steps and if you've configured everything correctly.<ul><li>(2022-01) <del>Switch Device Manager to <code>Device by Connection</code> and verify that the NVIDIA GPU is at Bus 1, Slot 0, Function 0. The parent PCIe port to the dGPU should be at Bus 0, Slot 1, Function 0.</del></li><li><del>If they don't match, you need to reallocate PCIe addresses with the method above.</del></li><li>(2023-05) This step was no longer needed in my attempt today.</li></ul></li><li>If the OS didn't automatically install the NVIDIA driver, and your manually downloaded driver installer also shows that the system is incompatible, you need to check the properties of the GPU device. 
Check if there is a sequence of zeros after <code>SUBSYS</code> in its Hardware ID.<ul><li>If there is, refer to step 1.</li></ul></li></ul></li><li><p>Turn off the virtual machine and restart it. <strong>This is not just a reboot.</strong> Confirm in the device manager that the GPU is working.</p><ul><li>If you get code 43 this time, check if you have the emulated battery in step 2.</li><li>I tried first on Windows 10 LTSC 2019 and got this error. Since I didn't have the emulated battery set up, I cannot confirm if it's an incompatibility between the NVIDIA driver and the OS or the lack of battery. I recommend using the latest versions of Windows 10 or Windows 11.</li></ul></li><li><p>Do either one of the following steps:</p><ol><li>(2022-01) Plug your HDMI dummy plug into your laptop, and the VM should detect a new monitor.</li><li>(2023-05) Install a virtual monitor driver:<ol><li>Download the virtual monitor driver from <a href="https://github.com/ge9/IddSampleDriver">ge9/IddSampleDriver</a>, and decompress it to <code>C:\IddSampleDriver</code>. Note that you must not move the folder anywhere else!</li><li>Open <code>C:\IddSampleDriver\option.txt</code>. You'll see the number 1 on the first line (don't change it), followed by a list of resolutions and refresh rates. Only keep the one resolution and refresh rate entry you want, and remove all other items.</li><li>Open Device Manager, select "Action - Add Legacy Hardware", click "Let me pick from a list... - All Devices - Have disk", choose the file <code>C:\IddSampleDriver\IddSampleDriver.inf</code>, and complete the installation.</li><li>Windows should now detect a new monitor.</li><li>In my testing, while using the virtual monitor driver, I saw some corrupted pixels on the Looking Glass display. 
As long as you can obtain an HDMI dummy plug, I would recommend it over the virtual monitor driver.</li></ol></li></ol></li><li><p>(2023-05) Newer versions of Looking Glass install the IVSHMEM driver (the driver for shared memory between VM and host) automatically. You no longer need to install it manually. These manual installation steps are kept for reference only:</p><ol><li><p>(2022-01) Download this Virtio driver, copy it into the VM, and extract it. <strong>You MUST use this copy, as no other copies have the IVSHMEM driver!</strong></p><p><a href="https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/upstream-virtio/virtio-win10-prewhql-0.1-161.zip">https://fedorapeople.org/groups/virt/virtio-win/direct-downloads/upstream-virtio/virtio-win10-prewhql-0.1-161.zip</a></p></li><li><p>Open Device Manager in the VM, and find <code>System devices - PCI standard RAM controller</code>:</p><ol><li>Right-click and select "Update driver"</li><li>Click "Browse my computer for drivers"</li><li>Click "Let me pick from a list of device drivers on my computer"</li><li>Click the "Have disk" button</li><li>Navigate to the <code>Virtio Drivers/Win10/amd64/ivshmem.inf</code> file</li><li>Click "Next" to install the driver. The device's name should change to <code>IVSHMEM</code></li></ol></li></ol></li><li><p>Install <a href="https://looking-glass.io/downloads">Looking Glass</a>, a tool to transfer the display output from the VM to the host.</p><ul><li>Our dummy plug is going to be the only monitor the VM can see. If we don't install Looking Glass, we won't be able to see the VM desktop.</li><li>Click "Windows Host Binary" on the page linked above, then double-click the downloaded file in the VM to install it.</li></ul></li><li><p>(2023-05) If you followed the steps from 2022-01, you won't be able to see the startup screen while the VM is booting and before Looking Glass starts. Therefore, I recommend disabling the QXL virtual adapter in Device Manager instead. 
The following older steps are kept for reference only.</p><ul><li><p>(2022-01) Turn off the VM and run <code>virsh edit Windows</code> to edit the VM config.</p><p>Find <code><video><model type="qxl" ...></video></code>, and change <code>type</code> to <code>none</code> to disable the QXL emulated GPU:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">video</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"none"</span>/></span>
<span class="hljs-tag"></<span class="hljs-name">video</span>></span></code></pre></li></ul></li><li><p>Install the Looking Glass client on the host. Arch Linux users can simply install <code>looking-glass</code> from AUR. Run <code>looking-glass-client</code> to start the client.</p></li><li><p>Go back to Virt-Manager, close the window of the VM (the window that shows the VM desktop and changes VM configurations), right-click on the VM in Virt-Manager's main window, and select Run.</p></li><li><p>In a moment, you should see the VM's display in the Looking Glass client. Now the GPU passthrough setup is complete.</p></li></ol><h1 id="performance-and-experience-optimizations">Performance and Experience Optimizations</h1><p>Although GPU passthrough is done, there is still room for user experience optimization. Particularly:</p><ol><li>(2022-01) <del>Looking Glass can relay keyboard and mouse events, but not audio, so we can't hear the VM;</del><ul><li>(2023-05) The latest Looking Glass can relay audio now.</li></ul></li><li>(2022-01) <del>Looking Glass may miss a few keystrokes from time to time;</del><ul><li>(2023-05) The latest Looking Glass can relay keyboard and mouse events reliably now.</li></ul></li><li>There is a host kernel module for IVSHMEM, which improves Looking Glass performance with its DMA mode;</li><li>After VM shutdown, the GPU is set to PCIe D3hot mode. 
It still consumes around 10 watts of power, which is undesirable for battery life.</li></ol><p>We will fix the problems one by one.</p><h2 id="get-audio-output-from-vm">Get Audio Output from VM</h2><p>(2023-05) The latest Looking Glass can relay audio now. These steps are kept for reference only.</p><div class="btn-group btn-group" role="group" id="lti-g8466248"><input type="radio" name="lti-g8466248" id="lti-8466248-5433526" class="btn-check lti-option" autocomplete="off" data-lti-tag="sound_hide"><label class="btn btn-outline-primary" for="lti-8466248-5433526">Hide</label><input type="radio" name="lti-g8466248" id="lti-8466248-2793512" class="btn-check lti-option" autocomplete="off" data-lti-tag="sound_show"><label class="btn btn-outline-primary" for="lti-8466248-2793512">Show older steps from 2022-01</label></div><div id="lti-content-sound_show" class="lti-content"><p>While Virt-Manager can connect to the VM with SPICE protocol to get the VM's sound output, Looking Glass also relays keyboard and mouse events through SPICE. Since the VM only accepts one simultaneous SPICE connection, we cannot get the audio output with Virt-Manager.</p><p>We can install <a href="https://github.com/duncanthrax/scream">Scream</a>, a virtual sound card software in Windows, to transfer audio output over the network. A Scream client can be run on the host to receive the audio signal.</p><p>Download the Scream installer from <a href="https://github.com/duncanthrax/scream/releases">its download page</a> on the VM, extract it, run <code>Install-x64.bat</code> as administrator to install the driver, and reboot.</p><p>Install the Scream client on the host. Arch Linux users can install the <code>scream</code> package from AUR.</p><p>Open a terminal on the host and run <code>scream -v</code>. Test by playing some sound in the VM. 
If you can't hear anything, try specifying the network interface that connects to the VM, like <code>scream -i virbr0 -v</code>, where <code>virbr0</code> is the bridge interface of Virt-Manager's default NAT network, which links the VM and the host.</p><p>Finally, you can create a systemd user service to run the Scream client conveniently later. Create <code>~/.config/systemd/user/scream.service</code> with the following content:</p><pre><code class="hljs language-bash">[Unit]
Description=Scream

[Service]
Type=simple
Restart=always
RestartSec=1
ExecStart=/usr/bin/scream -i virbr0 -v

[Install]
WantedBy=graphical-session.target</code></pre><p>From then on, you only need to run <code>systemctl --user start scream</code>.</p></div><h2 id="passthrough-keyboard--mouse-operations">Passthrough Keyboard & Mouse Operations</h2><p>The latest Looking Glass can relay keyboard and mouse events reliably now. These steps are kept for reference only.</p><div class="btn-group btn-group" role="group" id="lti-g7658874"><input type="radio" name="lti-g7658874" id="lti-7658874-1730685" class="btn-check lti-option" autocomplete="off" data-lti-tag="keyboardmouse_hide"><label class="btn btn-outline-primary" for="lti-7658874-1730685">Hide</label><input type="radio" name="lti-g7658874" id="lti-7658874-6205099" class="btn-check lti-option" autocomplete="off" data-lti-tag="keyboardmouse_show"><label class="btn btn-outline-primary" for="lti-7658874-6205099">Show older steps from 2022-01</label></div><div id="lti-content-keyboardmouse_show" class="lti-content"><p>The keyboard and mouse relay in Looking Glass isn't very stable, as inputs can be dropped from time to time. 
Therefore, if you want to play some games in the VM, you need a more reliable way to pass your keyboard and mouse into the VM.</p><p>We have two options: letting Libvirt capture the keyboard and mouse events, or simply passing your keyboard and mouse into the VM.</p><ol><li><p>Capturing Keyboard and Mouse Events.</p><p>On Linux, all keyboard and mouse operations are passed to the desktop environment via the <code>evdev</code> (or <code>Event Device</code>) framework. Libvirt can capture your operations and pass them to the VM. In addition, Libvirt can switch control between the host and the VM whenever you press Left Ctrl and Right Ctrl together, so you can operate both the host and the VM with one keyboard-mouse combo.</p><p>First run <code>ls -l /dev/input/by-path</code> on the host to see your present <code>evdev</code> devices. I have these, for example:</p><pre><code class="hljs language-bash">pci-0000:00:14.0-usb-0:1:1.1-event-mouse <span class="hljs-comment"># USB mouse</span>
pci-0000:00:14.0-usb-0:1:1.1-mouse
pci-0000:00:14.0-usb-0:6:1.0-event
pci-0000:00:15.0-platform-i2c_designware.0-event-mouse <span class="hljs-comment"># Builtin Touchpad</span>
pci-0000:00:15.0-platform-i2c_designware.0-mouse
pci-0000:00:1f.3-platform-skl_hda_dsp_generic-event
platform-i8042-serio-0-event-kbd <span class="hljs-comment"># Builtin Keyboard</span>
platform-pcspkr-event-spkr</code></pre><p>Those with <code>event-mouse</code> are mice, and the <code>event-kbd</code> ones are keyboards.</p><p>Then, run <code>virsh edit Windows</code> to edit the VM config. 
Add these into the <code><devices></code> section:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"evdev"</span>></span>
  <span class="hljs-comment"><!-- Change the mouse or keyboard path based on your ls result --></span>
  <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"/dev/input/by-path/platform-i8042-serio-0-event-kbd"</span> <span class="hljs-attr">grab</span>=<span class="hljs-string">"all"</span> <span class="hljs-attr">repeat</span>=<span class="hljs-string">"on"</span>/></span>
<span class="hljs-tag"></<span class="hljs-name">input</span>></span>
<span class="hljs-comment"><!-- Repeat if you have multiple mice or keyboards --></span>
<span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"evdev"</span>></span>
  <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"/dev/input/by-path/pci-0000:00:15.0-platform-i2c_designware.0-event-mouse"</span> <span class="hljs-attr">grab</span>=<span class="hljs-string">"all"</span> <span class="hljs-attr">repeat</span>=<span class="hljs-string">"on"</span>/></span>
<span class="hljs-tag"></<span class="hljs-name">input</span>></span></code></pre><p>Start the VM, and you should notice that your keyboard and mouse no longer work on the host. They're captured by the VM. Press Left Ctrl + Right Ctrl to return control to the host. Press the combination again to control the VM.</p><p>Now we can disable the keyboard and mouse relay of Looking Glass. 
Create <code>/etc/looking-glass-client.ini</code> with the following content:</p><pre><code class="hljs language-ini"><span class="hljs-section">[spice]</span>
<span class="hljs-attr">enable</span>=<span class="hljs-literal">no</span></code></pre></li><li><p>USB Keyboard and Mouse Passthrough</p><p>Capturing keyboard and mouse operations doesn't always work. For example, my touchpad cannot be captured properly: I can't move the cursor in the VM.</p><p>If you encounter the same issue and have a USB keyboard and mouse, you can pass them into the VM and dedicate them to it. USB passthrough to VMs is a mature technology, so the chance of running into problems is very low.</p><p>Simply click <code>Add Hardware - USB Host Device</code> in Virt-Manager and select your keyboard and mouse.</p></li></ol></div><h2 id="accelerating-looking-glass-with-kernel-modules">Accelerating Looking Glass with Kernel Modules</h2><blockquote><p>Most of the content in this section is from <a href="https://looking-glass.io/docs/B6/module/">https://looking-glass.io/docs/B6/module/</a></p></blockquote><p>Looking Glass provides a kernel module for the IVSHMEM shared memory device. 
It allows Looking Glass to read the display output efficiently with DMA to improve the framerate.</p><ol><li><p>Install the Linux kernel headers and DKMS (the <code>linux-headers</code> and <code>dkms</code> packages on Arch Linux).</p></li><li><p>Install <code>looking-glass-module-dkms</code> from AUR.</p></li><li><p>Set up a udev rule: create <code>/etc/udev/rules.d/99-kvmfr.rules</code> with the following content:</p><pre><code class="hljs language-bash">SUBSYSTEM==<span class="hljs-string">"kvmfr"</span>, OWNER=<span class="hljs-string">"lantian"</span>, GROUP=<span class="hljs-string">"kvm"</span>, MODE=<span class="hljs-string">"0660"</span></code></pre><p>Replace <code>lantian</code> with your own username.</p></li><li><p>Configure the memory size: create <code>/etc/modprobe.d/looking-glass.conf</code> with the following content:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># The memory size is calculated in the same way as VM's shmem.</span>
options kvmfr static_size_mb=64</code></pre></li><li><p>Load the module automatically on boot: create <code>/etc/modules-load.d/looking-glass.conf</code> containing the single line <code>kvmfr</code>.</p></li><li><p>Run <code>sudo modprobe kvmfr</code> to load the module. Now a <code>kvmfr0</code> device should appear under <code>/dev</code>; this is the shared memory device used by Looking Glass.</p></li><li><p>Edit <code>/etc/apparmor.d/local/abstractions/libvirt-qemu</code> and add this line:</p><pre><code class="hljs language-bash">/dev/kvmfr0 rw,</code></pre><p>It allows the VM to access the device. 
Run <code>sudo systemctl restart apparmor</code> to restart AppArmor.</p></li><li><p>Run <code>virsh edit Windows</code> to change the VM's configuration:</p><ol><li><p>Delete <code><shmem></code> section from <code><devices></code>:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">shmem</span> <span class="hljs-attr">name</span>=<span class="hljs-string">'looking-glass'</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">'ivshmem-plain'</span>/></span> <span class="hljs-tag"><<span class="hljs-name">size</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">'M'</span>></span>64<span class="hljs-tag"></<span class="hljs-name">size</span>></span><span class="hljs-tag"></<span class="hljs-name">shmem</span>></span></code></pre></li><li><p>Add these lines under <code><qemu:commandline></code>:</p><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-device"</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>driver<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>ivshmem-plain<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>shmem-looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>memdev<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>}"</span>/></span><span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-object"</span>/></span><span class="hljs-comment"><!-- 
There is a number 67108864 in the next line, which is 64MB * 1048576 --></span>
<span class="hljs-comment"><!-- Change accordingly if you've set a different memory size --></span>
<span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>qom-type<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>memory-backend-file<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>mem-path<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>/dev/kvmfr0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>size<span class="hljs-symbol">&quot;</span>:67108864,<span class="hljs-symbol">&quot;</span>share<span class="hljs-symbol">&quot;</span>:true}"</span>/></span></code></pre></li><li><p>Start the VM.</p></li></ol></li><li><p>Edit <code>/etc/looking-glass-client.ini</code> and add the following content:</p><pre><code class="hljs language-ini"><span class="hljs-section">[app]</span>
<span class="hljs-attr">shmFile</span>=/dev/kvmfr0</code></pre></li><li><p>Start Looking Glass. You should see the VM display now.</p></li><li><p>(2023-05) If you use NixOS, you can directly use the config below:</p></li></ol><pre><code class="hljs language-nix">{
  boot.<span class="hljs-attr">extraModulePackages</span> = <span class="hljs-keyword">with</span> config.boot.kernelPackages; [ kvmfr ];
  boot.<span class="hljs-attr">extraModprobeConfig</span> = <span class="hljs-string">''
    # The memory size is calculated in the same way as VM's shmem.
    options kvmfr static_size_mb=64
  ''</span>;
  boot.<span class="hljs-attr">kernelModules</span> = [<span class="hljs-string">"kvmfr"</span>];
  services.udev.<span class="hljs-attr">extraRules</span> = <span class="hljs-string">''
    SUBSYSTEM=="kvmfr", OWNER="root", GROUP="libvirtd", MODE="0660"
  ''</span>;
  environment.etc.<span class="hljs-string">"looking-glass-client.ini"</span>.<span class="hljs-attr">text</span> = <span class="hljs-string">''
    [app]
    shmFile=/dev/kvmfr0
  ''</span>;
}</code></pre><h2 id="cutting-power-to-gpu-when-unused">Cutting Power to GPU When Unused</h2><p><strong>2022-01-26 Update: testing shows that the NVIDIA GPU still isn't completely shut down after applying the patch. The power draw is the same as before. This section is now invalid.</strong></p><div class="btn-group btn-group" role="group" id="lti-g7380466"><input type="radio" name="lti-g7380466" id="lti-7380466-3642182" class="btn-check lti-option" autocomplete="off" data-lti-tag="power_hide"><label class="btn btn-outline-primary" for="lti-7380466-3642182">Invalid contents are hidden</label><input type="radio" name="lti-g7380466" id="lti-7380466-3006630" class="btn-check lti-option" autocomplete="off" data-lti-tag="power_show"><label class="btn btn-outline-primary" for="lti-7380466-3006630">Show</label></div><div id="lti-content-power_show" class="lti-content"><blockquote><p>This section only applies to NVIDIA 20-series GPUs or newer. They can shut themselves down with the official NVIDIA drivers. The 10-series or older GPUs don't support this feature.</p><p>This section involves compiling a kernel yourself, and <strong>using a patch without extensive inspection or testing</strong>. Not intended for novice users. Evaluate the risks yourself.</p></blockquote><p>When you aren't using the VM, the <code>vfio-pci</code> driver in charge of PCIe passthrough sets the device to the <code>D3</code> mode, aka the power saving mode of PCIe devices. 
But there are two types of <code>D3</code> modes: <code>D3hot</code>, where the device is still powered, and <code>D3cold</code>, where the device is shut off completely. Currently, the <code>vfio-pci</code> driver in the kernel only supports <code>D3hot</code>, and the NVIDIA GPU will still consume around 10 watts of power since the chip itself remains powered. This impacts the battery life of laptops.</p><p>An NVIDIA engineer posted a patchset for <code>vfio-pci</code>'s <code>D3cold</code> support on the Linux kernel mailing list. With this patchset, the NVIDIA GPU will be shut down completely when the VM is off, which saves battery power.</p><p>The patchset, which consists of three patches, can be found at <a href="https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/">https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/</a>. I combined the three patches and uploaded the result to <a href="https://github.com/xddxdd/pkgbuild/blob/master/linux-xanmod-lantian/0007-vfio-pci-d3cold.patch">https://github.com/xddxdd/pkgbuild/blob/master/linux-xanmod-lantian/0007-vfio-pci-d3cold.patch</a>.</p><p>Patching the kernel is relatively simple on Arch Linux. Most kernel PKGBUILDs in AUR can apply patches automatically. All you have to do is download the PKGBUILD for a kernel and add the patch to its <code>source</code> section. See my commit for an example: <a href="https://github.com/xddxdd/pkgbuild/commit/406adb7bf5657cfe07bb17ff561d11ed97ebab39">https://github.com/xddxdd/pkgbuild/commit/406adb7bf5657cfe07bb17ff561d11ed97ebab39</a>.</p><p><strong>DO NOTE that this patch doesn't guarantee stability.</strong></p><p>Based on mailing list discussions:</p><ol><li>It's an RFC patch, aka a testing patch. 
There is a big <code>[RFC]</code> in the title of the e-mails.</li><li>If the GPU driver in the VM ever attempts to switch the GPU to <code>D3cold</code> mode, with this patch, there is a risk of resetting the GPU, losing all state, and crashing the VM. Although I have never encountered such problems myself, you should be aware of the possible outcomes.</li><li>The developer only tested some GPUs from NVIDIA. The patchset does not guarantee support for other PCIe devices.</li></ol><p><strong>Use this at your own risk.</strong></p></div><h2 id="references">References</h2><p>Huge thanks to previous explorers on the topic of GPU passthrough. Without their efforts, this post wouldn't have existed in the first place.</p><p>Here are the sources I referenced when I did my configuration:</p><ul><li>NVIDIA dGPU passthrough<ul><li>GitHub Misairu-G's NVIDIA Optimus MUXed passthrough guide <a href="https://gist.github.com/Misairu-G/616f7b2756c488148b7309addc940b28">https://gist.github.com/Misairu-G/616f7b2756c488148b7309addc940b28</a></li><li>Reddit r/VFIO's emulated battery patch <a href="https://www.reddit.com/r/VFIO/comments/ebo2uk/nvidia_geforce_rtx_2060_mobile_success_qemu_ovmf/">https://www.reddit.com/r/VFIO/comments/ebo2uk/nvidia_geforce_rtx_2060_mobile_success_qemu_ovmf/</a></li><li>Arch Linux Wiki <a href="https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF">https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF</a></li></ul></li><li>Looking Glass Documentation<ul><li>Installation Docs <a href="https://looking-glass.io/docs/B6/install/">https://looking-glass.io/docs/B6/install/</a></li><li>Kernel Module Docs <a href="https://looking-glass.io/docs/B6/module/">https://looking-glass.io/docs/B6/module/</a></li></ul></li><li>Virtual Monitor Driver<ul><li>The one used in this post, with ability to customize resolution and refresh rate <a href="https://github.com/ge9/IddSampleDriver">https://github.com/ge9/IddSampleDriver</a></li><li>The original version with fixed 
list of resolution and refresh rate <a href="https://github.com/roshkins/IddSampleDriver">https://github.com/roshkins/IddSampleDriver</a></li></ul></li><li>VFIO D3cold Patch<ul><li>News report from Phoronix <a href="https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Runtime-PM-VFIO-PCI">https://www.phoronix.com/scan.php?page=news_item&px=NVIDIA-Runtime-PM-VFIO-PCI</a></li><li>Link to Linux kernel mailing list <a href="https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/">https://lore.kernel.org/lkml/20211115133640.2231-1-abhsahu@nvidia.com/T/</a></li></ul></li></ul><h2 id="appendix-final-libvirt-xml-file">Appendix: Final Libvirt XML File</h2><div class="btn-group btn-group" role="group" id="lti-g5852534"><input type="radio" name="lti-g5852534" id="lti-5852534-4358712" class="btn-check lti-option" autocomplete="off" data-lti-tag="xml_hide"><label class="btn btn-outline-primary" for="lti-5852534-4358712">Hide</label><input type="radio" name="lti-g5852534" id="lti-5852534-7819576" class="btn-check lti-option" autocomplete="off" data-lti-tag="xml_show"><label class="btn btn-outline-primary" for="lti-5852534-7819576">Show the entire XML file</label></div><div id="lti-content-xml_show" class="lti-content"><pre><code class="hljs language-xml"><span class="hljs-tag"><<span class="hljs-name">domain</span> <span class="hljs-attr">xmlns:qemu</span>=<span class="hljs-string">"http://libvirt.org/schemas/domain/qemu/1.0"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"kvm"</span>></span> <span class="hljs-tag"><<span class="hljs-name">name</span>></span>Windows11<span class="hljs-tag"></<span class="hljs-name">name</span>></span> <span class="hljs-tag"><<span class="hljs-name">uuid</span>></span>5d5b00d8-475a-4b6c-8053-9dda30cd2f95<span class="hljs-tag"></<span class="hljs-name">uuid</span>></span> <span class="hljs-tag"><<span class="hljs-name">metadata</span>></span> <span class="hljs-tag"><<span 
class="hljs-name">libosinfo:libosinfo</span> <span class="hljs-attr">xmlns:libosinfo</span>=<span class="hljs-string">"http://libosinfo.org/xmlns/libvirt/domain/1.0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">libosinfo:os</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"http://microsoft.com/win/11"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">libosinfo:libosinfo</span>></span> <span class="hljs-tag"></<span class="hljs-name">metadata</span>></span> <span class="hljs-tag"><<span class="hljs-name">memory</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"KiB"</span>></span>16777216<span class="hljs-tag"></<span class="hljs-name">memory</span>></span> <span class="hljs-tag"><<span class="hljs-name">currentMemory</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"KiB"</span>></span>16777216<span class="hljs-tag"></<span class="hljs-name">currentMemory</span>></span> <span class="hljs-tag"><<span class="hljs-name">vcpu</span> <span class="hljs-attr">placement</span>=<span class="hljs-string">"static"</span>></span>16<span class="hljs-tag"></<span class="hljs-name">vcpu</span>></span> <span class="hljs-tag"><<span class="hljs-name">os</span>></span> <span class="hljs-tag"><<span class="hljs-name">type</span> <span class="hljs-attr">arch</span>=<span class="hljs-string">"x86_64"</span> <span class="hljs-attr">machine</span>=<span class="hljs-string">"pc-q35-8.0"</span>></span>hvm<span class="hljs-tag"></<span class="hljs-name">type</span>></span> <span class="hljs-tag"><<span class="hljs-name">loader</span> <span class="hljs-attr">readonly</span>=<span class="hljs-string">"yes"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pflash"</span>></span>/run/libvirt/nix-ovmf/OVMF_CODE.fd<span class="hljs-tag"></<span class="hljs-name">loader</span>></span> <span class="hljs-tag"><<span class="hljs-name">nvram</span> <span class="hljs-attr">template</span>=<span 
class="hljs-string">"/run/libvirt/nix-ovmf/OVMF_VARS.fd"</span>></span>/var/lib/libvirt/qemu/nvram/Windows11_VARS.fd<span class="hljs-tag"></<span class="hljs-name">nvram</span>></span> <span class="hljs-tag"></<span class="hljs-name">os</span>></span> <span class="hljs-tag"><<span class="hljs-name">features</span>></span> <span class="hljs-tag"><<span class="hljs-name">acpi</span>/></span> <span class="hljs-tag"><<span class="hljs-name">apic</span>/></span> <span class="hljs-tag"><<span class="hljs-name">hyperv</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"custom"</span>></span> <span class="hljs-tag"><<span class="hljs-name">relaxed</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vapic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">spinlocks</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span> <span class="hljs-attr">retries</span>=<span class="hljs-string">"8191"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vpindex</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">runtime</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">synic</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">stimer</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">reset</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">vendor_id</span> <span class="hljs-attr">state</span>=<span 
class="hljs-string">"on"</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"GenuineIntel"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">frequencies</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">tlbflush</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">hyperv</span>></span> <span class="hljs-tag"><<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">hidden</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">kvm</span>></span> <span class="hljs-tag"><<span class="hljs-name">vmport</span> <span class="hljs-attr">state</span>=<span class="hljs-string">"off"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">features</span>></span> <span class="hljs-tag"><<span class="hljs-name">cpu</span> <span class="hljs-attr">mode</span>=<span class="hljs-string">"host-passthrough"</span> <span class="hljs-attr">check</span>=<span class="hljs-string">"none"</span> <span class="hljs-attr">migratable</span>=<span class="hljs-string">"on"</span>></span> <span class="hljs-tag"><<span class="hljs-name">topology</span> <span class="hljs-attr">sockets</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">dies</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">cores</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">threads</span>=<span class="hljs-string">"2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">cpu</span>></span> <span class="hljs-tag"><<span class="hljs-name">clock</span> <span class="hljs-attr">offset</span>=<span class="hljs-string">"localtime"</span>></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span 
class="hljs-attr">name</span>=<span class="hljs-string">"rtc"</span> <span class="hljs-attr">tickpolicy</span>=<span class="hljs-string">"catchup"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pit"</span> <span class="hljs-attr">tickpolicy</span>=<span class="hljs-string">"delay"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"hpet"</span> <span class="hljs-attr">present</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">timer</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"hypervclock"</span> <span class="hljs-attr">present</span>=<span class="hljs-string">"yes"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">clock</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_poweroff</span>></span>destroy<span class="hljs-tag"></<span class="hljs-name">on_poweroff</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_reboot</span>></span>restart<span class="hljs-tag"></<span class="hljs-name">on_reboot</span>></span> <span class="hljs-tag"><<span class="hljs-name">on_crash</span>></span>destroy<span class="hljs-tag"></<span class="hljs-name">on_crash</span>></span> <span class="hljs-tag"><<span class="hljs-name">pm</span>></span> <span class="hljs-tag"><<span class="hljs-name">suspend-to-mem</span> <span class="hljs-attr">enabled</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">suspend-to-disk</span> <span class="hljs-attr">enabled</span>=<span class="hljs-string">"no"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">pm</span>></span> <span class="hljs-tag"><<span class="hljs-name">devices</span>></span> <span class="hljs-tag"><<span 
class="hljs-name">emulator</span>></span>/run/libvirt/nix-emulators/qemu-system-x86_64<span class="hljs-tag"></<span class="hljs-name">emulator</span>></span> <span class="hljs-tag"><<span class="hljs-name">disk</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> <span class="hljs-attr">device</span>=<span class="hljs-string">"disk"</span>></span> <span class="hljs-tag"><<span class="hljs-name">driver</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"qemu"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"qcow2"</span> <span class="hljs-attr">discard</span>=<span class="hljs-string">"unmap"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">file</span>=<span class="hljs-string">"/var/lib/libvirt/images/Windows11.qcow2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"vda"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">boot</span> <span class="hljs-attr">order</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x04"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">disk</span>></span> <span class="hljs-tag"><<span class="hljs-name">disk</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"file"</span> <span class="hljs-attr">device</span>=<span class="hljs-string">"cdrom"</span>></span> <span 
class="hljs-tag"><<span class="hljs-name">driver</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"qemu"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"raw"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">file</span>=<span class="hljs-string">"/mnt/root/persistent/media/LegacyOS/Common/virtio-win-0.1.215.iso"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">dev</span>=<span class="hljs-string">"sdb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"sata"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">readonly</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"drive"</span> <span class="hljs-attr">controller</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">target</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">unit</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">disk</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"qemu-xhci"</span> <span class="hljs-attr">ports</span>=<span class="hljs-string">"15"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span 
class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x10"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"2"</span> <span 
class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"2"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x11"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"3"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"3"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x12"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span 
class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"4"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"4"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x13"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"5"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span 
class="hljs-attr">chassis</span>=<span class="hljs-string">"5"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x14"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x4"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"6"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"6"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x15"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x5"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span 
class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"7"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"7"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x16"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x6"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"8"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x17"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span 
class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x02"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x7"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"9"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"9"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x18"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span> <span class="hljs-attr">multifunction</span>=<span class="hljs-string">"on"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"10"</span> <span class="hljs-attr">model</span>=<span 
class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"10"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x19"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"11"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"11"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1a"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span 
class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"12"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"12"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1b"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"13"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span 
class="hljs-string">"13"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1c"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x4"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"14"</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"pcie-root-port"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"pcie-root-port"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">chassis</span>=<span class="hljs-string">"14"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0x1d"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x5"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span 
class="hljs-string">"sata"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x1f"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">controller</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio-serial"</span> <span class="hljs-attr">index</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x03"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">controller</span>></span> <span class="hljs-tag"><<span class="hljs-name">interface</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"network"</span>></span> <span class="hljs-tag"><<span class="hljs-name">mac</span> <span class="hljs-attr">address</span>=<span class="hljs-string">"52:54:00:f4:bf:15"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">source</span> <span class="hljs-attr">network</span>=<span class="hljs-string">"default"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span 
class="hljs-string">"virtio"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">interface</span>></span> <span class="hljs-tag"><<span class="hljs-name">serial</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pty"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"isa-serial"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0"</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">name</span>=<span class="hljs-string">"isa-serial"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">target</span>></span> <span class="hljs-tag"></<span class="hljs-name">serial</span>></span> <span class="hljs-tag"><<span class="hljs-name">console</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pty"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"serial"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">console</span>></span> <span class="hljs-tag"><<span class="hljs-name">channel</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">target</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio"</span> <span 
class="hljs-attr">name</span>=<span class="hljs-string">"com.redhat.spice.0"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"virtio-serial"</span> <span class="hljs-attr">controller</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"1"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">channel</span>></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"mouse"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"ps2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"mouse"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x06"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">input</span>></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"keyboard"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"ps2"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">input</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"keyboard"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"virtio"</span>></span> <span class="hljs-tag"><<span 
class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x07"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">input</span>></span> <span class="hljs-tag"><<span class="hljs-name">tpm</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"tpm-crb"</span>></span> <span class="hljs-tag"><<span class="hljs-name">backend</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"passthrough"</span>></span> <span class="hljs-tag"><<span class="hljs-name">device</span> <span class="hljs-attr">path</span>=<span class="hljs-string">"/dev/tpm0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">backend</span>></span> <span class="hljs-tag"></<span class="hljs-name">tpm</span>></span> <span class="hljs-tag"><<span class="hljs-name">graphics</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spice"</span> <span class="hljs-attr">autoport</span>=<span class="hljs-string">"yes"</span>></span> <span class="hljs-tag"><<span class="hljs-name">listen</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"address"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">image</span> <span class="hljs-attr">compression</span>=<span class="hljs-string">"off"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">graphics</span>></span> <span class="hljs-tag"><<span class="hljs-name">sound</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"ich9"</span>></span> <span class="hljs-tag"><<span class="hljs-name">audio</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"1"</span>/></span> <span 
class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x1b"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">sound</span>></span> <span class="hljs-tag"><<span class="hljs-name">audio</span> <span class="hljs-attr">id</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spice"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">video</span>></span> <span class="hljs-tag"><<span class="hljs-name">model</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"qxl"</span> <span class="hljs-attr">ram</span>=<span class="hljs-string">"65536"</span> <span class="hljs-attr">vram</span>=<span class="hljs-string">"65536"</span> <span class="hljs-attr">vgamem</span>=<span class="hljs-string">"16384"</span> <span class="hljs-attr">heads</span>=<span class="hljs-string">"1"</span> <span class="hljs-attr">primary</span>=<span class="hljs-string">"yes"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">video</span>></span> <span class="hljs-tag"><<span class="hljs-name">hostdev</span> <span class="hljs-attr">mode</span>=<span 
class="hljs-string">"subsystem"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">managed</span>=<span class="hljs-string">"yes"</span>></span> <span class="hljs-tag"><<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x01"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">source</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"pci"</span> <span class="hljs-attr">domain</span>=<span class="hljs-string">"0x0000"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0x05"</span> <span class="hljs-attr">slot</span>=<span class="hljs-string">"0x00"</span> <span class="hljs-attr">function</span>=<span class="hljs-string">"0x0"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">hostdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">redirdev</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"2"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">redirdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">redirdev</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"usb"</span> <span 
class="hljs-attr">type</span>=<span class="hljs-string">"spicevmc"</span>></span> <span class="hljs-tag"><<span class="hljs-name">address</span> <span class="hljs-attr">type</span>=<span class="hljs-string">"usb"</span> <span class="hljs-attr">bus</span>=<span class="hljs-string">"0"</span> <span class="hljs-attr">port</span>=<span class="hljs-string">"3"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">redirdev</span>></span> <span class="hljs-tag"><<span class="hljs-name">watchdog</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"itco"</span> <span class="hljs-attr">action</span>=<span class="hljs-string">"reset"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">memballoon</span> <span class="hljs-attr">model</span>=<span class="hljs-string">"none"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">devices</span>></span> <span class="hljs-tag"><<span class="hljs-name">qemu:commandline</span>></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-device"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>driver<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>ivshmem-plain<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>shmem0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>memdev<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>}"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-object"</span>/></span> <span class="hljs-tag"><<span 
class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"{<span class="hljs-symbol">&quot;</span>qom-type<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>memory-backend-file<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>id<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>looking-glass<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>mem-path<span class="hljs-symbol">&quot;</span>:<span class="hljs-symbol">&quot;</span>/dev/kvmfr0<span class="hljs-symbol">&quot;</span>,<span class="hljs-symbol">&quot;</span>size<span class="hljs-symbol">&quot;</span>:67108864,<span class="hljs-symbol">&quot;</span>share<span class="hljs-symbol">&quot;</span>:true}"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"-acpitable"</span>/></span> <span class="hljs-tag"><<span class="hljs-name">qemu:arg</span> <span class="hljs-attr">value</span>=<span class="hljs-string">"file=/etc/ssdt1.dat"</span>/></span> <span class="hljs-tag"></<span class="hljs-name">qemu:commandline</span>></span><span class="hljs-tag"></<span class="hljs-name">domain</span>></span></code></pre></div>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-computer/">Computers and Clients</category>
<category domain="https://lantian.pub/tag/NVIDIA/">NVIDIA</category>
<category domain="https://lantian.pub/tag/MUXed/">MUXed</category>
<category domain="https://lantian.pub/tag/GPU/">GPU</category>
<category domain="https://lantian.pub/tag/Virtual-Machine/">Virtual Machine</category>
<comments>https://lantian.pub/en/article/modify-computer/laptop-muxed-nvidia-passthrough.lantian/#disqus_thread</comments>
</item>
<item>
<title>Modifying the APN to Fix China Telecom 4G Roaming Failure on AOSP ROMs</title>
<link>https://lantian.pub/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/</link>
<guid>https://lantian.pub/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/</guid>
<pubDate>Mon, 17 Apr 2023 11:16:39 GMT</pubDate>
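<description><p>Since official maintenance of my OnePlus 8T's stock system is about to end, I flashed Nameless OS, a third-party Android ROM based on Lineage OS, onto the phone. After flashing it, however, I found that my China Telecom SIM card could not connect to the local carriers' 4G networks for international roaming; it only got 2G or 3G signal. Because the local carriers have recently been</description>
<content:encoded><![CDATA[<p>Since official maintenance of my OnePlus 8T's stock system is about to end, I flashed Nameless OS, a third-party Android ROM based on Lineage OS, onto the phone. After flashing it, however, I found that my China Telecom SIM card could not connect to the local carriers' 4G networks for international roaming; it only got 2G or 3G signal. Because the local carriers have recently been retiring their 2G and 3G networks, the phone's roaming signal was very poor: text messages were sent and received with long delays, and the phone could not connect to VoLTE for normal calls.</p><p>I tested other third-party ROMs based on Lineage OS, and they all had similar problems.</p><p>After many attempts, I found that the problem <strong>seems</strong> to lie in the phone's APN settings. I say "seems" because changing the APN solved my problem, but I do not understand at all why it works, nor am I sure this is the correct fix.</p><p>These are the steps I took to solve the problem:</p><ol><li>Open the phone's Settings - SIM cards - China Telecom - APN settings.</li><li>Tap "中国电信 NET 设置" (<code>ctnet</code>, China Telecom NET Settings) to open its detailed settings.</li><li>Change the APN type (<code>APN Type</code>) to: <code>default,supl,dun,ims</code><ul><li>The original value is <code>default,supl,dun</code></li></ul></li><li>Tap the three-dot menu at the top right and choose Save.</li><li>Select the radio button to the right of "中国电信 NET 设置" to enable this APN.</li><li>Power off the phone, wait a few minutes, and power it on again.</li></ol>]]></content:encoded>
<category domain="https://lantian.pub/category/random-notes/">Random Notes</category>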
<category domain="https://lantian.pub/tag/AOSP/">AOSP</category>
<category domain="https://lantian.pub/tag/%E4%B8%AD%E5%9B%BD%E7%94%B5%E4%BF%A1/">China Telecom</category>
<category domain="https://lantian.pub/tag/%E6%BC%AB%E6%B8%B8/">Roaming</category>
<category domain="https://lantian.pub/tag/4G/">4G</category>
<category domain="https://lantian.pub/tag/LTE/">LTE</category>
<comments>https://lantian.pub/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/#disqus_thread</comments>
</item>
<item>
<title>Fix China Telecom 4G Roaming on AOSP ROM by Changing APN</title>
<link>https://lantian.pub/en/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/</link>
<guid>https://lantian.pub/en/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/</guid>
<pubDate>Mon, 17 Apr 2023 11:16:39 GMT</pubDate>
<description><p>Since the support life of my OnePlus 8T's official ROM is about to end, I flashed Nameless OS, a Lineage OS based third party Android ROM</description>
<content:encoded><![CDATA[<p>Since the support life of my OnePlus 8T's official ROM is about to end, I flashed Nameless OS, a third-party Android ROM based on Lineage OS, onto my phone. But after flashing the ROM, I found that my China Telecom SIM card could not roam on the 4G networks of local mobile service providers; only 2G and 3G worked. Since the local providers have recently been shutting down their 2G and 3G networks, the roaming signal is really bad. I experience high latency when sending or receiving messages, and I cannot use VoLTE to make calls normally.</p><p>I tested other third-party ROMs based on Lineage OS, and experienced the same problem.</p><p>After numerous attempts, I found that the problem <strong>seems to be</strong> with the phone's APN settings. I use the term <strong>seems to be</strong> because while changing the APN settings fixed my problem, I have no idea why this works, nor am I sure this is the correct way to fix it.</p><p>Here are my steps to resolve the problem:</p><ol><li>Enter phone settings - SIMs - China Telecom - Access Point Name settings.</li><li>Select "中国电信 NET 设置" (<code>ctnet</code>, China Telecom NET Settings) to manage detailed options.</li><li>Change <code>APN Type</code> to: <code>default,supl,dun,ims</code><ul><li>The original value is <code>default,supl,dun</code></li></ul></li><li>Tap the three-dot menu at the top right, and select Save.</li><li>Select the radio button to the right of <code>ctnet</code> to enable this APN.</li><li>Turn off the phone, wait a few minutes, and then turn it on again.</li></ol>]]></content:encoded>
<category domain="https://lantian.pub/category/random-notes/">Random Notes</category>
<category domain="https://lantian.pub/tag/AOSP/">AOSP</category>
<category domain="https://lantian.pub/tag/4G/">4G</category>
<category domain="https://lantian.pub/tag/LTE/">LTE</category>
<category domain="https://lantian.pub/tag/China-Telecom/">China Telecom</category>
<category domain="https://lantian.pub/tag/Roaming/">Roaming</category>
<comments>https://lantian.pub/en/article/random-notes/fix-china-telecom-roaming-no-4g-on-aosp-rom.lantian/#disqus_thread</comments>
</item>
<item>
<title>老款 HP 工作站配置 NAS + 软路由笔记</title>
<link>https://lantian.pub/article/random-notes/old-hp-workstation-as-all-in-one.lantian/</link>
<guid>https://lantian.pub/article/random-notes/old-hp-workstation-as-all-in-one.lantian/</guid>
<pubDate>Sun, 26 Mar 2023 07:05:19 GMT</pubDate>
<description><p>我买了一台 HP 的老款工作站,用作家里的 NAS + 软路由。本文简单记录这台工作站的配置过程。</p>
<h2 id="硬件选择">硬件选择</h2>
<p>NAS 的硬件主要有这些选择:</p>
<ul>
<li>成品 NAS(如群晖)
<ul>
<li>优点:开箱即</description>
<content:encoded><![CDATA[<p>我买了一台 HP 的老款工作站,用作家里的 NAS + 软路由。本文简单记录这台工作站的配置过程。</p><h2 id="硬件选择">硬件选择</h2><p>NAS 的硬件主要有这些选择:</p><ul><li>成品 NAS(如群晖)<ul><li>优点:开箱即用。</li><li>缺点:<ul><li>价格贵,到了「买系统送硬件」的程度。</li><li>而且与各种 Linux 发行版相比,NAS 的原厂系统难以定制。</li></ul></li></ul></li><li>二手服务器<ul><li>优点:<ul><li>便宜,在保修期结束后,这些服务器大都被数据中心当作废品处理,然后被以极低的成本回收,翻新后二次售卖。</li><li>稳定,这些服务器在设计时就考虑了长期运行,并且一直在温湿度稳定、无尘的数据中心中运行。</li></ul></li><li>缺点:<ul><li>噪音大,需要自行修改系统配置降低风扇转速,更换静音风扇,或者加装风扇减速线。</li><li>体积大(主要针对机架式服务器)。</li><li>专用配件,服务器厂商会定制各种配件供特定型号使用,如果你有扩展需求,需要加价购买这些专用配件。</li></ul></li></ul></li><li>二手工作站<ul><li>优点:<ul><li>便宜、稳定,与二手服务器相同。</li><li>噪音小,毕竟工作站是放在办公桌边,而不是在机房里使用的。</li></ul></li><li>缺点:<ul><li>专用配件,与二手服务器相同。</li></ul></li></ul></li><li>自己购买电脑配件组装<ul><li>优点:<ul><li>可以完全根据自己的需求定制。</li></ul></li><li>缺点:<ul><li>性价比不如二手服务器和工作站。</li><li>家用配件的稳定性可能不及商用服务器和工作站。</li></ul></li></ul></li></ul><p>而我对于这台 NAS 有这些需求:</p><ul><li>需要能用 Jellyfin 视频转码,而且最好支持 H265 编码,这需要一块 6 代(Skylake)以上、带核显的 Intel CPU,或者加装一块 NVIDIA 的显卡。</li><li>除此之外,只要不是非常差的 CPU(例如 Intel Atom 系列),对性能几乎没要求。</li><li>低噪音,我目前住在单间公寓,这台 NAS 会和我的床放在同一个房间。</li><li>我不喜欢成品 NAS 系统,希望自己安装 NixOS 然后自行配置。</li><li>能省则省。</li></ul><p>综上考虑,我选择购买二手工作站。由于我目前人在美国,我在 eBay 上花 50 刀购买了一台回收翻新的 HP 工作站。关键配置如下:</p><ul><li>型号:HP Z220 SFF</li><li>CPU:Intel E3-1225 v2</li><li> 内存:4GB DDR3</li><li> 硬盘:250GB 机械硬盘</li><li>电源:240W</li><li> 三根 PCI-E 插槽,一根 x16,一根 x4,一根 x1。</li><li>一根 PCI 插槽。</li></ul><h2 id="存储">存储</h2><p>作为一台 NAS,首先要考虑的就是数据存储。</p><p>这台工作站有 4 个 SATA 接口,两个 3.5 寸硬盘位,以及一个 5 寸光驱位(安装了光驱)。我拆掉了原装硬盘,加装了两块全新的 16TB 机械硬盘作为数据存储。我计划先将这两块硬盘组成 RAID-1 阵列,在未来容量不足时加装一块硬盘转换成 RAID-5。ZFS 不支持从 RAID-1 转换成 RAID-5,而且 ZFS 内核模块由于没有合入 Linux 主线,经常在升级内核版本时挂掉。而 Btrfs 本身的 RAID-5 功能有严重的稳定性问题,容易造成数据丢失。因此我最终选择的方案是 Btrfs + LVM RAID-1,让配置更加灵活的 LVM 来处理 RAID。</p><p>我还加装了一块 180GB 容量的 Intel 520 固态硬盘作为启动盘。虽然这块固态硬盘很老,但它的价格仅 15 刀,接近 eBay 上所有固态硬盘的最低价,同时它用的是 MLC 颗粒,相比现在的 TLC 颗粒寿命更长。这块硬盘上只装 NixOS 的系统文件(<code>/nix</code> 文件夹),以及一些会被每日备份到云存储的配置文件,所以我不担心丢失这块硬盘上的数据。</p><p>另外,我配置了 <a 
href="https://github.com/Zygo/bees">Bees</a> 软件用于 Btrfs 上的数据去重。我的两块硬盘 RAID-1 后有大约 14.6TB 的可用存储空间。Bees 要有效去重这些空间需要占用 2GB 内存创建一个硬盘内容的哈希表。再加上一些其它的服务,原装的 4GB 内存很快就不够用了。因此,我又花了 40 刀给这台工作站换上了 16GB 的 DDR3 ECC-UDIMM 内存。</p><h2 id="软路由">软路由</h2><p>这台工作站将直接接到运营商的光猫,作为我家的主路由。我不打算使用 OpenWRT 等专为路由器设计的 Linux 发行版,而是直接在 NixOS 上配置数据包转发和 NAT。开启转发功能只需要几行 sysctl 配置:</p><pre><code class="hljs language-nix">{ boot.kernel.<span class="hljs-attr">sysctl</span> = { <span class="hljs-string">"net.ipv4.conf.all.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv4.conf.default.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv4.conf.*.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.all.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.default.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.*.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; };}</code></pre><p>而 NAT 则需要配置防火墙。我使用的是 Nftables,简化后的配置如下:</p><pre><code class="hljs language-bash">table inet lantian { <span class="hljs-comment"># 内网 IPv4 地址</span> <span class="hljs-built_in">set</span> RESERVED_IPV4 { <span class="hljs-built_in">type</span> ipv4_addr flags constant,interval elements = { 10.0.0.0/8, 172.16.0.0/12, 192.0.0.0/24, 192.0.2.0/24, 192.168.0.0/16, 198.18.0.0/15, 198.51.100.0/24, 203.0.113.0/24, 233.252.0.0/24, 240.0.0.0/4 } } <span class="hljs-comment"># 内网 IPv6 地址</span> <span class="hljs-built_in">set</span> RESERVED_IPV6 { <span class="hljs-built_in">type</span> ipv6_addr flags constant,interval elements = { 64:ff9b::/96, 64:ff9b:1::/48, 2001:2::/48, 2001:20::/28, 2001:db8::/32, fc00::/7 } } chain NAT_POSTROUTING { <span class="hljs-built_in">type</span> nat hook postrouting priority srcnat + 5; policy accept; <span 
class="hljs-comment"># 对来自内网,去向公网网卡的 IPv4 数据包进行 NAT</span> ip saddr @RESERVED_IPV4 oifname @INTERFACE_WAN masquerade <span class="hljs-comment"># 对来自内网,去向公网网卡的 IPv6 数据包进行 NAT</span> <span class="hljs-comment"># 这里只处理源地址是私有 IP 的情况,不影响直接分配了公网 IP 的设备</span> ip6 saddr @RESERVED_IPV6 oifname @INTERFACE_WAN masquerade }}</code></pre><p>然后,启用 MiniUPnPd,让客户端上的软件可以按需配置端口转发:</p><pre><code class="hljs language-nix">{ services.<span class="hljs-attr">miniupnpd</span> = { <span class="hljs-attr">enable</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">upnp</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">natpmp</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">internalIPs</span> = [<span class="hljs-string">"192.168.0.1"</span>]; <span class="hljs-attr">externalInterface</span> = <span class="hljs-string">"eth-wan"</span>; };}</code></pre><p>最后在内网接口上配置好 IP 和 DHCP 服务器:</p><pre><code class="hljs language-nix">{ systemd.network.networks.<span class="hljs-attr">eth-lan</span> = { matchConfig.<span class="hljs-attr">PermanentMACAddress</span> = <span class="hljs-string">"12:34:56:12:34:56"</span>; <span class="hljs-attr">address</span> = [<span class="hljs-string">"192.168.0.1/24"</span>]; <span class="hljs-attr">networkConfig</span> = { <span class="hljs-attr">DHCP</span> = <span class="hljs-string">"no"</span>; <span class="hljs-attr">DHCPServer</span> = <span class="hljs-string">"yes"</span>; }; <span class="hljs-attr">dhcpServerConfig</span> = { <span class="hljs-attr">PoolOffset</span> = <span class="hljs-number">10</span>; <span class="hljs-attr">PoolSize</span> = <span class="hljs-number">200</span>; <span class="hljs-attr">EmitDNS</span> = <span class="hljs-string">"yes"</span>; <span class="hljs-attr">DNS</span> = [<span class="hljs-string">"8.8.8.8"</span>]; }; };}</code></pre><p>应用配置,<code>networkctl reload</code> 后,客户端即可上网。</p><h2 id="视频转码">视频转码</h2><p>作为一台小体积(Small Form Factor)工作站,HP Z220 SFF 只能使用矮尺寸的 
PCI-E 扩展卡(高 8 厘米)而非标准尺寸的扩展卡(高 12 厘米)。同时,功率仅 240W 的电源也没有额外的显卡供电线,所以我也无法使用高性能高功率的显卡。</p><p>我最开始花 40 刀买了一块 NVIDIA Quadro P400 显卡,TDP 为 30W。这块显卡本身的性能非常差,约等于传说中的 NVIDIA GT 1010 显卡,连 1030 都不如。但是,它搭载了和 10 系列显卡相同的视频编解码电路,<a href="https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new">支持 H265 10-bit 视频编码</a>。除了 GPU 核心工作频率不同外,Quadro P400 的视频转码性能和其它 10 系列显卡完全相同。因此,Quadro P400 也是 Reddit 上很多 DIY NAS 玩家的选择。</p><p>我买到卡后进行测试,发现它确实可以转码 4K HEVC 的视频,速度能达到每秒 60 帧。但如果我开启了 Jellyfin 的色彩空间映射(Tone-mapping),转码速度就会降到十几帧,导致视频无法流畅播放。Tone-mapping 的用途是将视频的特殊色彩空间(例如 HDR 或杜比视界)映射到标准的 SDR 色彩空间,让不支持 HDR 的设备也可以正确地显示这些视频。Jellyfin Tone-mapping 用到了显卡的 3D 处理单元,而 Quadro P400 的 3D 性能非常弱,导致了瓶颈的出现。</p><p>我后来把显卡换成了 NVIDIA Tesla P4,TDP 为 50W。虽然它是数据中心用的 Tesla 系列计算卡,但它也支持视频编解码,拥有和 10 系列显卡相同的编解码电路。这张卡本身在 eBay 上卖 100 刀左右,但它是被动散热卡,所以还需要加 20 刀加装风扇和导风罩,让它能在普通台式电脑中散热。</p><p>经过测试,这张卡可以在开启 Tone-mapping 的情况下,以 60 帧左右的速度将视频转码到 4K 80Mbps,完全满足我的视频转码需求。唯一的缺点是,HP Z220 SFF 的主板上没有标准的风扇接口,我只能使用 USB 5V 到 12V 的升压线给风扇供电,让风扇一直保持在满速状态。风扇噪音稍大,但对我来说可以接受。</p><p>另外,Tesla P4 还原生支持 NVIDIA GRID 功能,可以生成虚拟显卡给虚拟机使用。但 NVIDIA GRID 需要使用专门的宿主机驱动,不支持 3D、视频转码等功能,导致 Jellyfin 的视频转码无法使用。所以,我目前还是选择使用普通的驱动。</p>]]></content:encoded>
<category domain="https://lantian.pub/category/random-notes/">随手记</category>
<category domain="https://lantian.pub/tag/HP/">HP</category>
<category domain="https://lantian.pub/tag/NAS/">NAS</category>
<category domain="https://lantian.pub/tag/%E8%BD%AF%E8%B7%AF%E7%94%B1/">软路由</category>
<comments>https://lantian.pub/article/random-notes/old-hp-workstation-as-all-in-one.lantian/#disqus_thread</comments>
</item>
<item>
<title>Notes on Setting Up NAS+Router on Old HP Workstation</title>
<link>https://lantian.pub/en/article/random-notes/old-hp-workstation-as-all-in-one.lantian/</link>
<guid>https://lantian.pub/en/article/random-notes/old-hp-workstation-as-all-in-one.lantian/</guid>
<pubDate>Sun, 26 Mar 2023 07:05:19 GMT</pubDate>
<description><p>I purchased an old HP workstation to use as a NAS and router at my home. This post is a short note of my process of setting it up.</p>
<h</description>
<content:encoded><![CDATA[<p>I purchased an old HP workstation to use as a NAS and router at my home. This post is a short note of my process of setting it up.</p><h2 id="hardware-choice">Hardware Choice</h2><p>For a NAS, you usually have these hardware choices:</p><ul><li>Ready-to-use NAS (e.g. Synology)<ul><li>Pros: ready to use out of the box.</li><li>Cons:<ul><li>Expensive, to the extent of "free hardware for software purchase".</li><li>Harder to customize, when comparing the stock operating system with various Linux distributions.</li></ul></li></ul></li><li>Second-hand servers<ul><li>Pros:<ul><li>Cheap. Most servers are thrown away by datacenters once their warranty ends, and are obtained at minimum cost, refurbished and then resold.</li><li>Stable. These servers are built to last, and are used in a datacenter with controlled temperature, humidity and no dust.</li></ul></li><li>Cons:<ul><li>Noise. In order to lower the fan speed, you need to modify system settings, replace with quieter fans, and/or install a separate fan speed controller.</li><li>Large (mainly concerning rack-mount servers).</li><li>Proprietary parts. Server manufacturers customize their components only for specific models. You will need to purchase these parts at extra cost if you want to extend your system.</li></ul></li></ul></li><li>Second-hand workstations<ul><li>Pros:<ul><li>Cheap and stable, just like second-hand servers.</li><li>Lower noise. 
After all, workstations are meant to be used next to office desks rather than in datacenters.</li></ul></li><li>Cons:<ul><li>Proprietary parts, same as second-hand servers.</li></ul></li></ul></li><li>DIY from regular PC parts<ul><li>Pros:<ul><li>Customizable to your exact needs.</li></ul></li><li>Cons:<ul><li>Worse performance-to-price ratio compared to second-hand servers and workstations.</li><li>Stability of consumer-grade parts may be worse than business-grade servers and workstations.</li></ul></li></ul></li></ul><p>And I have these requirements for my NAS:</p><ul><li>I want to use Jellyfin video transcoding, with H265 encoding if possible. This requires a 6th gen (Skylake) or newer Intel CPU with integrated graphics, or an NVIDIA graphics card.</li><li>Other than that, I have almost no performance requirements, as long as the CPU isn't extremely weak (e.g. the Intel Atom series).</li><li>Low noise. I'm currently living in a studio apartment, and the NAS will be in the same room where I sleep.</li><li>I don't like ready-to-use NAS operating systems. I want to install NixOS and customize it.</li><li>As cheap as it can get.</li></ul><p>Based on the requirements above, I chose to purchase a second-hand workstation. Since I'm in the United States at the moment, I purchased a refurbished HP workstation on eBay for $50. Its key specifications are:</p><ul><li>Model: HP Z220 SFF</li><li>CPU: Intel E3-1225 v2</li><li>RAM: 4GB DDR3</li><li>HDD: 250GB Spinning Disk</li><li>PSU: 240W</li><li>Three PCI-E slots, one x16, one x4, and one x1.</li><li>One PCI slot.</li></ul><h2 id="storage">Storage</h2><p>The first thing to consider for a NAS is data storage.</p><p>This workstation has 4 SATA ports, two 3.5 inch HDD slots, and one 5 inch DVD drive slot (with a DVD drive in it). I removed the original hard drive, and installed two brand new 16TB drives as data storage. I plan to set them up in a RAID-1 array first, and then expand to RAID-5 with an extra drive when I run out of capacity.
ZFS doesn't support conversion from RAID-1 to RAID-5, and ZFS's kernel module often fails on kernel upgrades due to its out-of-mainline status. Btrfs, on the other hand, has major stability issues with its RAID-5 feature, and can easily cause data loss. Therefore, I ended up with Btrfs + LVM RAID-1, and chose the more flexible LVM for the RAID layer.</p><p>I also installed a 180GB Intel 520 SSD as the boot drive. Although it's an old drive, it only cost $15, almost as cheap as it gets for SSDs on eBay. In addition, it uses MLC NAND, which lasts longer than current TLC NAND. This drive will only hold NixOS's system files (in <code>/nix</code>), and some configuration files that are backed up to cloud storage every day. Therefore, I'm not worried about losing data on this drive.</p><p>In addition, I set up <a href="https://github.com/Zygo/bees">Bees</a> to deduplicate data on Btrfs. With my two drives in RAID-1, I have a usable storage space of approximately 14.6TB. To deduplicate that much space effectively, Bees needs 2GB of RAM to create a hash table of the data on the drives. With some additional services running, I quickly ran out of memory with the original 4GB RAM. Therefore, I spent an additional $40 on 16GB of DDR3 ECC-UDIMM RAM sticks.</p><h2 id="router">Router</h2><p>This workstation will directly connect to the ISP modem, and act as the main router at my home. I don't plan to use OpenWRT, or other Linux distributions designed for routers. Instead, I want to set up packet forwarding and NAT directly on NixOS.
Enabling packet forwarding is just a few lines of sysctl settings:</p><pre><code class="hljs language-nix">{ boot.kernel.<span class="hljs-attr">sysctl</span> = { <span class="hljs-string">"net.ipv4.conf.all.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv4.conf.default.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv4.conf.*.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.all.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.default.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; <span class="hljs-string">"net.ipv6.conf.*.forwarding"</span> = lib.mkForce <span class="hljs-number">1</span>; };}</code></pre><p>NAT, on the other hand, requires setting up the firewall. I used Nftables for the purpose, and my (simplified) configuration is:</p><pre><code class="hljs language-bash">table inet lantian { <span class="hljs-comment"># LAN IPv4 address range</span> <span class="hljs-built_in">set</span> RESERVED_IPV4 { <span class="hljs-built_in">type</span> ipv4_addr flags constant,interval elements = { 10.0.0.0/8, 172.16.0.0/12, 192.0.0.0/24, 192.0.2.0/24, 192.168.0.0/16, 198.18.0.0/15, 198.51.100.0/24, 203.0.113.0/24, 233.252.0.0/24, 240.0.0.0/4 } } <span class="hljs-comment"># LAN IPv6 address range</span> <span class="hljs-built_in">set</span> RESERVED_IPV6 { <span class="hljs-built_in">type</span> ipv6_addr flags constant,interval elements = { 64:ff9b::/96, 64:ff9b:1::/48, 2001:2::/48, 2001:20::/28, 2001:db8::/32, fc00::/7 } } chain NAT_POSTROUTING { <span class="hljs-built_in">type</span> nat hook postrouting priority srcnat + 5; policy accept; <span class="hljs-comment"># NAT all packets from LAN IPv4 addresses to WAN</span> ip saddr @RESERVED_IPV4 oifname @INTERFACE_WAN masquerade <span class="hljs-comment"># NAT all 
packets from LAN IPv6 addresses to WAN</span> <span class="hljs-comment"># This only handles packets with a private source IP, not affecting devices with public IPv6</span> ip6 saddr @RESERVED_IPV6 oifname @INTERFACE_WAN masquerade }}</code></pre><p>Then, enable MiniUPnPd to allow clients to set up port forwarding as needed:</p><pre><code class="hljs language-nix">{ services.<span class="hljs-attr">miniupnpd</span> = { <span class="hljs-attr">enable</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">upnp</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">natpmp</span> = <span class="hljs-literal">true</span>; <span class="hljs-attr">internalIPs</span> = [<span class="hljs-string">"192.168.0.1"</span>]; <span class="hljs-attr">externalInterface</span> = <span class="hljs-string">"eth-wan"</span>; };}</code></pre><p>Finally, set up IP addressing and DHCP server on the LAN port:</p><pre><code class="hljs language-nix">{ systemd.network.networks.<span class="hljs-attr">eth-lan</span> = { matchConfig.<span class="hljs-attr">PermanentMACAddress</span> = <span class="hljs-string">"12:34:56:12:34:56"</span>; <span class="hljs-attr">address</span> = [<span class="hljs-string">"192.168.0.1/24"</span>]; <span class="hljs-attr">networkConfig</span> = { <span class="hljs-attr">DHCP</span> = <span class="hljs-string">"no"</span>; <span class="hljs-attr">DHCPServer</span> = <span class="hljs-string">"yes"</span>; }; <span class="hljs-attr">dhcpServerConfig</span> = { <span class="hljs-attr">PoolOffset</span> = <span class="hljs-number">10</span>; <span class="hljs-attr">PoolSize</span> = <span class="hljs-number">200</span>; <span class="hljs-attr">EmitDNS</span> = <span class="hljs-string">"yes"</span>; <span class="hljs-attr">DNS</span> = [<span class="hljs-string">"8.8.8.8"</span>]; }; };}</code></pre><p>Apply the config, run <code>networkctl reload</code>, and the clients can now access the Internet.</p><h2 
id="video-transcoding">Video Transcoding</h2><p>As an SFF (small form factor) workstation, the HP Z220 SFF can only use low profile PCI-E expansion cards (with a height of 8cm) rather than standard expansion cards (with a height of 12cm). In addition, the power supply, with a mere 240W output, has no extra GPU power cables, meaning that I cannot use GPUs with high performance and high power consumption.</p><p>Initially I purchased an NVIDIA Quadro P400 for $40, with a TDP of 30W. The performance of this card is extremely weak, similar to the NVIDIA GT 1010 which few have heard of. It's even worse than a GT 1030! However, it comes with the same video encoding/decoding circuits as the 10 series GPUs, <a href="https://developer.nvidia.com/video-encode-and-decode-gpu-support-matrix-new">and has support for H265 10-bit video encoding</a>. Other than a different GPU core frequency, the Quadro P400 has the exact same video transcoding performance as other 10 series GPUs. Therefore, the Quadro P400 is the choice of many NAS DIYers on Reddit.</p><p>After obtaining the GPU and testing it, I found that it can, indeed, transcode 4K HEVC videos, at a speed of 60 FPS. But once I enable Jellyfin's tone-mapping feature, the transcoding speed drops to 10-20 FPS, causing stuttery video playback. Tone-mapping is used to convert videos from a special color space (like HDR or Dolby Vision) to the standard SDR color space, so they can be correctly displayed on devices without HDR support. Jellyfin's tone-mapping implementation relies on the GPU's 3D processing units, and the Quadro P400 is quite lacking in 3D performance, causing the bottleneck.</p><p>I later replaced the GPU with an NVIDIA Tesla P4, with a TDP of 50W. Although it's a Tesla series compute card meant for datacenters, it still has support for video transcoding, with the same circuits as 10 series GPUs.
This card sells on eBay for around $100, but since it's passively cooled, you need to spend an extra $20 on fans and shrouds to cool it in a regular workstation.</p><p>Testing shows that this GPU can transcode videos to 4K 80Mbps at around 60 FPS with tone-mapping enabled. This completely satisfies my video transcoding requirements. The only downside is that, with no standard fan headers on the HP Z220 SFF motherboard, I have to power the fan with a USB 5V to 12V step-up cable and run the fan at maximum speed at all times. This creates a bit of noise, but I personally find it acceptable.</p><p>As a bonus, the NVIDIA Tesla P4 also supports NVIDIA GRID, which creates virtual GPUs for virtual machines.
But NVIDIA GRID requires a special host driver lacking 3D and video transcoding support, causing my Jellyfin video transcodes to fail. Therefore, I went with the regular GPU driver for now.</p>]]></content:encoded>
<category domain="https://lantian.pub/category/random-notes/">Random Notes</category>
<category domain="https://lantian.pub/tag/HP/">HP</category>
<category domain="https://lantian.pub/tag/NAS/">NAS</category>
<category domain="https://lantian.pub/tag/Router/">Router</category>
<comments>https://lantian.pub/en/article/random-notes/old-hp-workstation-as-all-in-one.lantian/#disqus_thread</comments>
</item>
<item>
<title>NixOS 系列(四):「无状态」操作系统</title>
<link>https://lantian.pub/article/modify-computer/nixos-impermanence.lantian/</link>
<guid>https://lantian.pub/article/modify-computer/nixos-impermanence.lantian/</guid>
<pubDate>Fri, 13 Jan 2023 17:55:12 GMT</pubDate>
<description><blockquote>
<p>NixOS 系列文章目录:</p>
<ul>
<li><a href="/article/modify-website/nixos-why.lantian/">NixOS 系列(一):我为什么心动了</a></li>
<li><a href="/a</description>
<enclosure url="https://lantian.pub//usr/uploads/202110/nixos-social-preview.png" type="image"/>
<content:encoded><![CDATA[<blockquote><p>NixOS 系列文章目录:</p><ul><li><a href="/article/modify-website/nixos-why.lantian/">NixOS 系列(一):我为什么心动了</a></li><li><a href="/article/modify-website/nixos-initial-config-flake-deploy.lantian/">NixOS 系列(二):基础配置,Nix Flake,和批量部署</a><ul><li>推荐阅读:<a href="https://thiscute.world/posts/nixos-and-flake-basics/">NixOS 与 Nix Flakes 新手入门</a>,作者 Ryan Yin</li></ul></li><li><a href="/article/modify-computer/nixos-packaging.lantian/">NixOS 系列(三):软件打包,从入门到放弃</a></li><li><a href="/article/modify-computer/nixos-impermanence.lantian/">NixOS 系列(四):「无状态」操作系统</a></li></ul></blockquote><blockquote><p>更新记录:</p><p>2023-02-18:在「移动 Nix Daemon 的临时文件夹」一段,修正配置不对 root 用户生效的问题。</p></blockquote><p>NixOS 广为人知的一大特点是,系统大部分软件的设置都由 Nix 语言的配置文件统一生成并管理。即使这些软件在运行时修改了自己的配置文件,在下次切换 Nix 配置或者系统重启时,NixOS 也会将配置文件重新覆盖。</p><p>例如,在运行 NixOS 的电脑上运行 <code>ls -alh /etc</code>,可以看到大部分配置文件都只是到 <code>/etc/static</code> 的软链接:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># 省略部分不相关的行</span>lrwxrwxrwx 1 root root 18 Jan 13 03:02 bashrc -> /etc/static/bashrclrwxrwxrwx 1 root root 18 Jan 13 03:02 dbus-1 -> /etc/static/dbus-1lrwxrwxrwx 1 root root 17 Jan 13 03:02 fonts -> /etc/static/fontslrwxrwxrwx 1 root root 17 Jan 13 03:02 fstab -> /etc/static/fstablrwxrwxrwx 1 root root 21 Jan 13 03:02 fuse.conf -> /etc/static/fuse.conf-rw-r--r-- 1 root root 913 Jan 13 03:02 grouplrwxrwxrwx 1 root root 21 Jan 13 03:02 host.conf -> /etc/static/host.conflrwxrwxrwx 1 root root 18 Jan 13 03:02 <span class="hljs-built_in">hostid</span> -> /etc/static/hostidlrwxrwxrwx 1 root root 20 Jan 13 03:02 hostname -> /etc/static/hostnamelrwxrwxrwx 1 root root 17 Jan 13 03:02 hosts -> /etc/static/hosts<span class="hljs-comment"># ...</span></code></pre><p>而 <code>/etc/static</code> 本身则被链接到 <code>/nix/store</code>,被 NixOS 统一管理:</p><pre><code class="hljs language-bash">lrwxrwxrwx 1 root root 51 Jan 13 03:02 /etc/static -> /nix/store/41plm7py84sp29w3bg4ahb41dpfxwf9l-etc/etc</code></pre><p>那么问题就来了:有必要把 
<code>/etc</code> 的内容存在硬盘上吗?反正每次重启或切换配置时,这里的内容都会被重新生成一遍。</p><p>类似的,看起来 NixOS 根目录下大部分文件都是可以根据配置生成的:</p><ul><li><code>/bin</code> 文件夹下只有一个 <code>/bin/sh</code>,被软链接到 <code>/nix/store</code> 里的 Bash;</li><li><code>/etc</code> 文件夹中的大部分文件都由 NixOS 的配置文件管理;</li><li><code>/usr</code> 文件夹下只有一个 <code>/usr/bin/env</code>,被软链接到 <code>/nix/store</code> 里的 Coreutils;</li><li><code>/mnt</code> 和 <code>/srv</code> 文件夹默认是空的;<ul><li>并且 <code>/mnt</code> 本身一般不存数据,只用来放其它分区的挂载点。</li></ul></li><li><code>/dev</code>, <code>/proc</code> 和 <code>/sys</code> 本身就是存放硬件设备和系统状态的虚拟文件夹;</li><li><code>/run</code> 和 <code>/tmp</code> 本身都是存放临时文件的内存盘。<ul><li>注:在给软件打包时,Nix Daemon 会将临时文件存在 <code>/tmp</code> 目录下。如果 <code>/tmp</code> 是内存盘,打大型软件包(例如 Linux 内核)时容易爆内存。因此 NixOS 的 <code>/tmp</code> <strong>默认不是内存盘</strong>,需要手动用 <code>boot.tmpOnTmpfs = true;</code> 开启。</li></ul></li></ul><p>排除上面的文件夹,只有少数几个文件夹存放了需要真正写入硬盘的数据:</p><ul><li><code>/boot</code> 存放启动引导器;</li><li><code>/home</code> 和 <code>/root</code> 存放各个用户的家目录;</li><li><code>/nix</code> 存放 NixOS 的所有软件包;</li><li><code>/var</code> 存放系统软件的数据文件。</li></ul><p>实际上,NixOS 本身只需要 <code>/boot</code> 和 <code>/nix</code> 就可以正常启动。从 <a href="https://nixos.org/download.html">NixOS 官网下载页面</a>下载的 ISO 里面除了 ISOLinux 启动引导器,就只有一个 <code>nix-store.squashfs</code> 文件,对应 <code>/nix/store</code> 里的数据:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># unsquashfs -l nix-store.squashfs | 
head</span>
squashfs-root
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtl8822cu_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8723d_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8821c_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8822b_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8822c_fw.bin.xz
<span class="hljs-comment"># ...</span></code></pre><p>So, is it possible to modify NixOS to mimic the installation ISO, and keep only the data of the necessary folders <code>/boot</code>, <code>/home</code>, <code>/nix</code>, <code>/root</code> and <code>/var</code> on disk? To put it more directly, is it possible to make the root directory <code>/</code> a RAM drive, and mount the data of these folders at their expected locations?</p><p>The answer is yes, and no modification is needed. Apart from mounting <code>nix-store.squashfs</code>, NixOS on the installation ISO behaves exactly like NixOS installed on a hard drive.</p><h2 id="无状态的优点">Pros of Going "Stateless"</h2><p>Compared to a regular NixOS, a "stateless" NixOS configured this way only stores the "state" you designate on disk. That state may include the files of your website, the contents of your database, or your browser history. Everything else, which you did not designate, is discarded on reboot.</p><p>This is the biggest advantage of such a configuration: you only keep the state you want.</p><ul><li>If some software silently changes its config file, or stores data where it shouldn't, those changes are lost on reboot, which guarantees that the software's configuration matches exactly what your Nix config specifies.</li><li>Your <code>/etc</code> will contain no leftovers from uninstalled software. If there are any, they are gone after the next reboot.</li><li>You only need to back up the state not managed by Nix (such as <code>/home</code>, <code>/root</code> and <code>/var</code>), plus your Nix config files, to be able to restore an identical system.</li><li>Since most files in the root directory are soft links generated from the config, the root RAM drive takes almost no space. On one of my servers, for example, the root directory takes only 660KB:</li></ul><pre><code class="hljs language-bash"><span class="hljs-comment"># sudo du -h -d1 -x /</span>
0       /srv
0       /mnt
0       /usr
0       /bin
0       /home
572K    /root
0       /tmp
84K     /etc
4.0K    /var
660K    /</code></pre><h2 id="准备工作">Preparation</h2><p>To configure NixOS following this post, you need:</p><ol><li>A finished NixOS installation that manages its configuration with Flake.</li><li>A LiveCD of NixOS or any other Linux distro, since we need to move critical NixOS files around.</li></ol><h2 id="将根目录修改成内存盘">Convert Root to RAM Drive</h2><p>After a regular NixOS installation, your Nix config usually has a root partition entry similar to this:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/dev/vda1"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"btrfs"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"compress-force=zstd"</span> ];
};
<span class="hljs-comment"># You may have also mounted other partitions, like /boot</span>
fileSystems.<span class="hljs-string">"/boot"</span> = {};</code></pre><p>The most important folder for NixOS is <code>/nix</code>, so we change the root directory <code>/</code> to a <code>tmpfs</code> RAM drive, and mount the original partition on <code>/nix</code>:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"tmpfs"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"tmpfs"</span>;
  <span class="hljs-comment"># You must set mode=755. The tmpfs default is world-writable (1777), and OpenSSH will complain and refuse logins</span>
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"relatime"</span> <span class="hljs-string">"mode=755"</span> ];
};
fileSystems.<span class="hljs-string">"/nix"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/dev/vda1"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"btrfs"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"compress-force=zstd"</span> ];
};
<span class="hljs-comment"># No need to change the config for other partitions</span>
fileSystems.<span class="hljs-string">"/boot"</span> = {};</code></pre><p>Theoretically, if you apply this config, shut down, and move the files to their correct locations from a LiveCD, you get a NixOS that <strong>only persists whatever is in your Nix configuration</strong>. That is enough for temporary use (like the NixOS installation ISO), but since important state not managed by the Nix config is not persisted, it is not suitable for everyday use.</p><p>The important state that is not preserved includes:</p><ul><li><code>/etc/machine-id</code>, a random ID systemd generates for each system, used for log management</li><li><code>/etc/NetworkManager/system-connections</code>, connections stored by Network Manager</li><li><code>/etc/ssh/ssh_host_ed25519_key.pub</code>, OpenSSH public key</li><li><code>/etc/ssh/ssh_host_rsa_key.pub</code>, OpenSSH public key</li><li><code>/etc/ssh/ssh_host_ed25519_key</code>, OpenSSH private key</li><li><code>/etc/ssh/ssh_host_rsa_key</code>, OpenSSH private key</li><li>and the data files in <code>/home</code>, <code>/root</code> and <code>/var</code></li></ul><p>So the next step is to designate each of these files and folders, so that they are persisted on disk as well.</p><h2 id="保存重要的状态文件">Persisting Important State Files</h2><p>Since the <code>/nix</code> partition is already mounted, I chose to store the state files in <code>/nix/persistent</code>. You can also store them on another partition.</p><p>Then, use bind mounts to map the state files back to where they belong:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/etc/machine-id"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/nix/persistent/etc/machine-id"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"bind"</span> ];
};
<span class="hljs-comment"># ...</span></code></pre><p>If you need to persist many files, you have to add a separate mount for every file or folder, which is cumbersome and error-prone. The good news is that the Nix community provides a NixOS module for this use case, <a href="https://github.com/nix-community/impermanence">Impermanence</a>, which makes it easy to map files to another location in batch.</p><p>First, add Impermanence to <code>inputs</code> in your <code>flake.nix</code>:</p><pre><code class="hljs language-nix">{
  <span class="hljs-attr">inputs</span> = {
    impermanence.<span class="hljs-attr">url</span> = <span class="hljs-string">"github:nix-community/impermanence"</span>;
  };
}</code></pre><p>Then add Impermanence to the NixOS module list:</p><pre><code class="hljs language-nix">{
  <span class="hljs-attr">outputs</span> = { self, nixpkgs, ... }@inputs: {
    nixosConfigurations.<span class="hljs-string">"nixos"</span> = nixpkgs.lib.nixosSystem {
      <span class="hljs-attr">system</span> = <span class="hljs-string">"x86_64-linux"</span>;
      <span class="hljs-attr">modules</span> = [
        <span class="hljs-comment"># Add the following line</span>
        inputs.impermanence.nixosModules.impermanence

        ./configuration.nix
      ];
    };
  };
}</code></pre><p>You can then map files in batch with the following format, without writing a pile of <code>fileSystems</code> mounts yourself:</p><pre><code class="hljs language-nix"><span class="hljs-comment"># /nix/persistent is where the files are actually stored</span>
environment.persistence.<span class="hljs-string">"/nix/persistent"</span> = {
  <span class="hljs-comment"># Hide these mounts from the sidebar of file managers</span>
  <span class="hljs-attr">hideMounts</span> = <span class="hljs-literal">true</span>;

  <span class="hljs-comment"># Folders you want to map</span>
  <span class="hljs-attr">directories</span> = [
    <span class="hljs-string">"/etc/NetworkManager/system-connections"</span>
    <span class="hljs-string">"/home"</span>
    <span class="hljs-string">"/root"</span>
    <span class="hljs-string">"/var"</span>
  ];

  <span class="hljs-comment"># Files you want to map</span>
  <span class="hljs-attr">files</span> = [
    <span class="hljs-string">"/etc/machine-id"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_ed25519_key.pub"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_ed25519_key"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_rsa_key.pub"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_rsa_key"</span>
  ];

  <span class="hljs-comment"># Similarly, you can map files and folders in a user's home directory</span>
  users.<span class="hljs-attr">lantian</span> = {
    <span class="hljs-attr">directories</span> = [
      <span class="hljs-comment"># Personal files</span>
      <span class="hljs-string">"Desktop"</span>
      <span class="hljs-string">"Documents"</span>
      <span class="hljs-string">"Downloads"</span>
      <span class="hljs-string">"Music"</span>
      <span class="hljs-string">"Pictures"</span>
      <span class="hljs-string">"Videos"</span>

      <span class="hljs-comment"># Config folders</span>
      <span class="hljs-string">".cache"</span>
      <span class="hljs-string">".config"</span>
      <span class="hljs-string">".gnupg"</span>
      <span class="hljs-string">".local"</span>
      <span class="hljs-string">".ssh"</span>
    ];
    <span class="hljs-attr">files</span> = [ ];
  };
};</code></pre><h2 id="移动-nix-daemon-的临时文件夹">Move Temp Directory of Nix Daemon</h2><p>While building packages, Nix Daemon stores its temporary files under <code>/tmp</code>. If <code>/tmp</code> is a RAM drive, the system can easily run out of memory while building large packages (such as the Linux kernel).</p><p><code>/tmp</code> in NixOS is not a RAM drive by default, but with our configuration, <code>/tmp</code> ends up on the root directory's RAM drive as well. Therefore, we can move Nix Daemon's temporary files onto the disk. I use <code>/var/cache/nix</code>, for example:</p><pre><code class="hljs language-nix">systemd.services.<span class="hljs-attr">nix-daemon</span> = {
  <span class="hljs-attr">environment</span> = {
    <span class="hljs-comment"># Location for temporary files</span>
    <span class="hljs-attr">TMPDIR</span> = <span class="hljs-string">"/var/cache/nix"</span>;
  };
  <span class="hljs-attr">serviceConfig</span> = {
    <span class="hljs-comment"># Create /var/cache/nix automatically when Nix Daemon starts</span>
    <span class="hljs-attr">CacheDirectory</span> = <span class="hljs-string">"nix"</span>;
  };
};</code></pre><p>However, this option does not take effect for the root user, because under root the nix command handles build requests itself instead of sending them to Nix Daemon. Therefore, we also need to add the environment variable <code>NIX_REMOTE=daemon</code> to force the nix command to call the daemon:</p><pre><code class="hljs language-nix">environment.variables.<span class="hljs-attr">NIX_REMOTE</span> = <span class="hljs-string">"daemon"</span>;</code></pre><blockquote><p>Thanks to "洗白白" from the NixOS Chinese Telegram group for raising the problem, and to "Nick Cao" for providing the fix.</p></blockquote><h2 id="激活配置">Activate Config</h2><p>With the configuration above complete, it is finally time to activate it.</p><p>First, run <code>sudo nixos-rebuild boot --flake .</code> so that the new config is activated on the next boot. Do not use <code>sudo nixos-rebuild switch --flake .</code>, because before the config actually takes effect, we need to move the files to their correct locations from a LiveCD.</p><p>Reboot the computer into the LiveCD, then mount and <code>cd</code> into the original root partition:</p><ul><li><strong>BACK UP YOUR DATA if you're unfamiliar with the process!</strong></li><li>Create a <code>persistent</code> folder, which corresponds to <code>/nix/persistent</code> after the system starts;</li><li>Move all the paths to be preserved, as listed above, into the <code>persistent</code> folder, keeping their original directory structure (for example, <code>etc/machine-id</code> becomes <code>persistent/etc/machine-id</code>);</li><li>Remove all folders except <code>nix</code> and <code>persistent</code>;<ul><li><strong>BACK UP YOUR DATA BEFORE REMOVAL!</strong></li></ul></li><li>Move all folders inside <code>nix</code> to the current directory;</li><li>Finally, remove the <code>nix</code> folder and reboot.</li></ul><p>If you did everything correctly, you will boot into a "stateless" NixOS. All the data files you chose to keep are mapped back to their original locations, so using the system should feel no different. But now your root partition has become a <code>tmpfs</code> RAM drive, all the state you don't want disappears after a reboot, and you get a "brand new" operating system every time you start your computer.</p><h2 id="资料来源">References</h2><p>My configuration process referenced the following resources:</p><ul><li><a href="https://grahamc.com/blog/erase-your-darlings">Erase your darlings - Graham Christensen</a><ul><li>The earliest "stateless" implementation, which restores state from a ZFS snapshot on every reboot.</li></ul></li><li><a href="https://elis.nu/blog/2020/05/nixos-tmpfs-as-root/">NixOS: tmpfs as root - Elis Hirwing</a></li><li><a href="https://github.com/nix-community/impermanence">Impermanence</a><ul><li>NixOS helper module for going stateless.</li></ul></li></ul><p>You can find my related configuration here:</p><ul><li>Impermanence module config: <a href="https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/common-components/impermanence.nix">https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/common-components/impermanence.nix</a></li><li>User home directory config: <a href="https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/client-components/impermanence.nix">https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/client-components/impermanence.nix</a></li></ul>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-computer/">Computers and Clients</category>
<category domain="https://lantian.pub/tag/NixOS/">NixOS</category>
<comments>https://lantian.pub/article/modify-computer/nixos-impermanence.lantian/#disqus_thread</comments>
</item>
<item>
<title>NixOS Series 4: "Stateless" Operating System</title>
<link>https://lantian.pub/en/article/modify-computer/nixos-impermanence.lantian/</link>
<guid>https://lantian.pub/en/article/modify-computer/nixos-impermanence.lantian/</guid>
<pubDate>Fri, 13 Jan 2023 17:55:12 GMT</pubDate>
<description><blockquote>
<p>List of NixOS Series Posts:</p>
<ul>
<li><a href="/en/article/modify-website/nixos-why.lantian/">NixOS Series 1: Why I fell </description>
<enclosure url="https://lantian.pub//usr/uploads/202110/nixos-social-preview.png" type="image"/>
<content:encoded><![CDATA[<blockquote><p>List of NixOS Series Posts:</p><ul><li><a href="/en/article/modify-website/nixos-why.lantian/">NixOS Series 1: Why I fell in love</a></li><li><a href="/en/article/modify-website/nixos-initial-config-flake-deploy.lantian/">NixOS Series 2: Basic Config, Nix Flake & Batch Deploy</a><ul><li>Recommended: <a href="https://thiscute.world/en/posts/nixos-and-flake-basics/">NixOS & Nix Flakes - A Guide for Beginners</a> by Ryan Yin</li></ul></li><li><a href="/en/article/modify-computer/nixos-packaging.lantian/">NixOS Series 3: Software Packaging 101</a></li><li><a href="/en/article/modify-computer/nixos-impermanence.lantian/">NixOS Series 4: "Stateless" Operating System</a></li></ul></blockquote><blockquote><p>Changelog:</p><p>2023-02-18: Fix config not applied to the root user, in the "Move Temp Directory of Nix Daemon" section.</p></blockquote><p>One of the most famous features of NixOS is that most software configurations on the system are generated and managed exclusively by a Nix-language config file. 
Even if such software modifies its config file while running, the config file will still be overwritten on the next Nix config switch or the next reboot.</p><p>For example, if you run <code>ls -alh /etc</code> on a computer running NixOS, you can observe that most config files are simply soft links to <code>/etc/static</code>:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># Unrelated lines omitted</span>
lrwxrwxrwx 1 root root 18 Jan 13 03:02 bashrc -> /etc/static/bashrc
lrwxrwxrwx 1 root root 18 Jan 13 03:02 dbus-1 -> /etc/static/dbus-1
lrwxrwxrwx 1 root root 17 Jan 13 03:02 fonts -> /etc/static/fonts
lrwxrwxrwx 1 root root 17 Jan 13 03:02 fstab -> /etc/static/fstab
lrwxrwxrwx 1 root root 21 Jan 13 03:02 fuse.conf -> /etc/static/fuse.conf
-rw-r--r-- 1 root root 913 Jan 13 03:02 group
lrwxrwxrwx 1 root root 21 Jan 13 03:02 host.conf -> /etc/static/host.conf
lrwxrwxrwx 1 root root 18 Jan 13 03:02 <span class="hljs-built_in">hostid</span> -> /etc/static/hostid
lrwxrwxrwx 1 root root 20 Jan 13 03:02 hostname -> /etc/static/hostname
lrwxrwxrwx 1 root root 17 Jan 13 03:02 hosts -> /etc/static/hosts
<span class="hljs-comment"># ...</span></code></pre><p><code>/etc/static</code> itself, by the way, is linked to <code>/nix/store</code> and managed by NixOS:</p><pre><code class="hljs language-bash">lrwxrwxrwx 1 root root 51 Jan 13 03:02 /etc/static -> /nix/store/41plm7py84sp29w3bg4ahb41dpfxwf9l-etc/etc</code></pre><p>Here's the question: is it really necessary to store the contents of <code>/etc</code> on the disk drive? 
They're going to be regenerated on each reboot or config switch anyway.</p><p>Similarly, most files on the NixOS root partition can be generated from the config file:</p><ul><li>The <code>/bin</code> folder only contains <code>/bin/sh</code>, which is a soft link to Bash in <code>/nix/store</code>;</li><li>The <code>/etc</code> folder contains mostly files managed by NixOS;</li><li>The <code>/usr</code> folder only contains <code>/usr/bin/env</code>, which is a soft link to Coreutils in <code>/nix/store</code>;</li><li><code>/mnt</code> and <code>/srv</code> are empty by default;<ul><li>And instead of actual data, <code>/mnt</code> usually holds mount points for other partitions.</li></ul></li><li><code>/dev</code>, <code>/proc</code> and <code>/sys</code> are virtual folders that expose the state of hardware devices and the system itself;</li><li><code>/run</code> and <code>/tmp</code> are RAM-backed storage for temporary files.<ul><li>Note: Nix Daemon stores its temporary files under <code>/tmp</code> while packaging. If <code>/tmp</code> is backed by RAM, the system may run out of memory while building large packages (such as the Linux kernel). Therefore, <code>/tmp</code> in NixOS is <strong>NOT BACKED BY RAM BY DEFAULT</strong>; a RAM-backed <code>/tmp</code> has to be enabled with <code>boot.tmpOnTmpfs = true;</code> (renamed to <code>boot.tmp.useTmpfs</code> in NixOS 23.05).</li></ul></li></ul><p>Excluding these folders, only a few folders store data that actually needs preserving on disk:</p><ul><li><code>/boot</code> for the bootloader;</li><li><code>/home</code> and <code>/root</code> for home directories of users;</li><li><code>/nix</code> for all packages of NixOS;</li><li><code>/var</code> for data files of system-level software.</li></ul><p>In fact, NixOS itself only requires <code>/boot</code> and <code>/nix</code> to boot. 
The ISO downloaded from the <a href="https://nixos.org/download.html">Official NixOS download page</a> contains, in addition to the ISOLinux bootloader, only a <code>nix-store.squashfs</code> holding the data of <code>/nix/store</code>:</p><pre><code class="hljs language-bash"><span class="hljs-comment"># unsquashfs -l nix-store.squashfs | head</span>
squashfs-root
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtl8822cu_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8723d_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8821c_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8822b_fw.bin.xz
squashfs-root/01qm2r3cihmf4np82mim8vy9phzgc9cn-rtw88-firmware-unstable-2022-11-05-xz/lib/firmware/rtw88/rtw8822c_fw.bin.xz
<span class="hljs-comment"># ...</span></code></pre><p>Then, is it possible to modify NixOS to mimic the behavior of the installation ISO, and keep only the necessary folders <code>/boot</code>, <code>/home</code>, <code>/nix</code>, <code>/root</code> and <code>/var</code> on disk? Or to put it more directly, is it possible to back the root directory <code>/</code> with RAM, and mount these folders at their expected locations?</p><p>The answer is yes, and there's no modification needed. 
Apart from mounting <code>nix-store.squashfs</code>, NixOS on the installation ISO uses exactly the same boot sequence as a regular NixOS on a hard drive.</p><h2 id="pros-for-going-stateless">Pros of Going "Stateless"</h2><p>Compared to a regular NixOS, such a "stateless" NixOS only stores the "state" you designate on your disk. That state may include the files of your website, the contents of your database, or your browser history. The rest of the state, which you did not designate, is discarded on reboot.</p><p>This is the most significant "pro" of such a configuration: you only preserve the state you want.</p><ul><li>If some software silently changes its config file, or stores its data in the wrong location, such modifications are lost on reboot, and the software's configuration will be exactly what your Nix-language config file specifies.</li><li>There will be no leftover config files from uninstalled software in your <code>/etc</code> folder. If there are any, they are gone on the next reboot.</li><li>You only need to back up the state not managed by Nix (such as <code>/home</code>, <code>/root</code> and <code>/var</code>), in addition to the Nix-language config file, to ensure that you can restore the system to the exact same state.</li><li>Since most files in the root directory are soft links generated according to the config file, the root RAM drive takes almost no space. 
On one of my servers, the root directory takes as little as 660KB of space:</li></ul><pre><code class="hljs language-bash"><span class="hljs-comment"># sudo du -h -d1 -x /</span>
0       /srv
0       /mnt
0       /usr
0       /bin
0       /home
572K    /root
0       /tmp
84K     /etc
4.0K    /var
660K    /</code></pre><h2 id="preparation">Preparation</h2><p>To set up NixOS following the steps in this post, you need to prepare:</p><ol><li>A NixOS installation that manages its configuration with Flake.</li><li>A LiveCD of NixOS or any other Linux distro, since we need to move critical files of NixOS.</li></ol><h2 id="convert-root-to-ram-drive">Convert Root to RAM Drive</h2><p>With a regular NixOS installation, you usually have a config entry for the root partition similar to this:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/dev/vda1"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"btrfs"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"compress-force=zstd"</span> ];
};
<span class="hljs-comment"># You may have also mounted other partitions, like /boot</span>
fileSystems.<span class="hljs-string">"/boot"</span> = {};</code></pre><p>The most important folder for NixOS is <code>/nix</code>, so we change the root folder <code>/</code> to a <code>tmpfs</code> RAM drive, and mount the original partition onto <code>/nix</code>:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"tmpfs"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"tmpfs"</span>;
  <span class="hljs-comment"># You must set mode=755. The tmpfs default is world-writable (1777), and OpenSSH will complain and refuse logins</span>
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"relatime"</span> <span class="hljs-string">"mode=755"</span> ];
};
fileSystems.<span class="hljs-string">"/nix"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/dev/vda1"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"btrfs"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"compress-force=zstd"</span> ];
};
<span class="hljs-comment"># No need to change the config for other partitions</span>
fileSystems.<span class="hljs-string">"/boot"</span> = {};</code></pre><p>Theoretically, if you apply this config, shut down, and move the files to their correct locations from a LiveCD, you will get a NixOS that <strong>only persists whatever is in your Nix configuration</strong>. This is good enough for temporary use (like the NixOS installation ISO), but since other important state not managed by Nix is not persisted, it is not ideal for everyday use.</p><p>The important state that is not preserved includes:</p><ul><li><code>/etc/machine-id</code>, a random ID generated by systemd, used for log management</li><li><code>/etc/NetworkManager/system-connections</code>, connections stored by Network Manager</li><li><code>/etc/ssh/ssh_host_ed25519_key.pub</code>, OpenSSH public key</li><li><code>/etc/ssh/ssh_host_rsa_key.pub</code>, OpenSSH public key</li><li><code>/etc/ssh/ssh_host_ed25519_key</code>, OpenSSH private key</li><li><code>/etc/ssh/ssh_host_rsa_key</code>, OpenSSH private key</li><li>and data files in <code>/home</code>, <code>/root</code> and <code>/var</code></li></ul><p>Our next step is to add rules for each of these files and folders, to persist them on disk.</p><h2 id="persisting-important-state-files">Persisting Important State Files</h2><p>Since the <code>/nix</code> partition is already mounted, I chose to store the state in <code>/nix/persistent</code>. 
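</p><p>The state does not have to live under <code>/nix</code>. As a rough sketch of the alternative, assuming a hypothetical second partition <code>/dev/vda2</code> formatted as ext4 and mounted at <code>/persistent</code> (none of these names come from my setup), a dedicated state partition could be declared as follows. As far as I know, NixOS mounts <code>/nix</code> in the initrd on its own, while a custom mount point has to ask for that with <code>neededForBoot</code>, so that it is available before the persisted paths are mapped back:</p><pre><code class="hljs language-nix"><span class="hljs-comment"># Hypothetical example: device, fsType and mount point are assumptions</span>
fileSystems.<span class="hljs-string">"/persistent"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/dev/vda2"</span>;
  <span class="hljs-attr">fsType</span> = <span class="hljs-string">"ext4"</span>;
  <span class="hljs-comment"># Mount this file system early, in the initrd</span>
  <span class="hljs-attr">neededForBoot</span> = <span class="hljs-literal">true</span>;
};</code></pre><p>With such a layout, every <code>/nix/persistent</code> path in the rest of this post would become <code>/persistent</code>. 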
Any other partition works too; where you store them is up to you.</p><p>Then, use bind mounts to map the files back to where they should be:</p><pre><code class="hljs language-nix">fileSystems.<span class="hljs-string">"/etc/machine-id"</span> = {
  <span class="hljs-attr">device</span> = <span class="hljs-string">"/nix/persistent/etc/machine-id"</span>;
  <span class="hljs-attr">options</span> = [ <span class="hljs-string">"bind"</span> ];
};
<span class="hljs-comment"># ...</span></code></pre><p>If you need to persist lots of files, you need one mount for each file or folder, which is cumbersome and error-prone. The good news is that the Nix community provides a NixOS module for this scenario, <a href="https://github.com/nix-community/impermanence">Impermanence</a>, which offers a convenient way to map files to another location in batch.</p><p>First, add Impermanence to <code>inputs</code> in your <code>flake.nix</code>:</p><pre><code class="hljs language-nix">{
  <span class="hljs-attr">inputs</span> = {
    impermanence.<span class="hljs-attr">url</span> = <span class="hljs-string">"github:nix-community/impermanence"</span>;
  };
}</code></pre><p>Then add Impermanence to the module list of NixOS:</p><pre><code class="hljs language-nix">{
  <span class="hljs-attr">outputs</span> = { self, nixpkgs, ... }@inputs: {
    nixosConfigurations.<span class="hljs-string">"nixos"</span> = nixpkgs.lib.nixosSystem {
      <span class="hljs-attr">system</span> = <span class="hljs-string">"x86_64-linux"</span>;
      <span class="hljs-attr">modules</span> = [
        <span class="hljs-comment"># Add the following line</span>
        inputs.impermanence.nixosModules.impermanence

        ./configuration.nix
      ];
    };
  };
}</code></pre><p>You can then map files in batch with the following format, without writing a long list of mounts in <code>fileSystems</code>:</p><pre><code class="hljs language-nix"><span class="hljs-comment"># /nix/persistent is the location where you plan to store the files</span>
environment.persistence.<span class="hljs-string">"/nix/persistent"</span> = {
  <span class="hljs-comment"># Hide these mounts from the sidebar of file managers</span>
  <span class="hljs-attr">hideMounts</span> = <span class="hljs-literal">true</span>;

  <span class="hljs-comment"># Folders you want to map</span>
  <span class="hljs-attr">directories</span> = [
    <span class="hljs-string">"/etc/NetworkManager/system-connections"</span>
    <span class="hljs-string">"/home"</span>
    <span class="hljs-string">"/root"</span>
    <span class="hljs-string">"/var"</span>
  ];

  <span class="hljs-comment"># Files you want to map</span>
  <span class="hljs-attr">files</span> = [
    <span class="hljs-string">"/etc/machine-id"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_ed25519_key.pub"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_ed25519_key"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_rsa_key.pub"</span>
    <span class="hljs-string">"/etc/ssh/ssh_host_rsa_key"</span>
  ];

  <span class="hljs-comment"># Similarly, you can map files and folders in users' home directories</span>
  users.<span class="hljs-attr">lantian</span> = {
    <span class="hljs-attr">directories</span> = [
      <span class="hljs-comment"># Personal files</span>
      <span class="hljs-string">"Desktop"</span>
      <span class="hljs-string">"Documents"</span>
      <span class="hljs-string">"Downloads"</span>
      <span class="hljs-string">"Music"</span>
      <span class="hljs-string">"Pictures"</span>
      <span class="hljs-string">"Videos"</span>

      <span class="hljs-comment"># Config folders</span>
      <span class="hljs-string">".cache"</span>
      <span class="hljs-string">".config"</span>
      <span class="hljs-string">".gnupg"</span>
      <span class="hljs-string">".local"</span>
      <span class="hljs-string">".ssh"</span>
    ];
    <span class="hljs-attr">files</span> = [ ];
  };
};</code></pre><h2 id="move-temp-directory-of-nix-daemon">Move Temp Directory of Nix Daemon</h2><p>Nix Daemon stores its temporary files under <code>/tmp</code> while packaging. If <code>/tmp</code> is backed by RAM, the system may run out of memory while building large packages (such as the Linux kernel).</p><p><code>/tmp</code> in NixOS is not backed by RAM by default, but with our configuration, <code>/tmp</code> ends up on the root folder's RAM drive. Therefore, we can move Nix Daemon's temp files onto the disk. I'm moving them to <code>/var/cache/nix</code>, for example:</p><pre><code class="hljs language-nix">systemd.services.<span class="hljs-attr">nix-daemon</span> = {
  <span class="hljs-attr">environment</span> = {
    <span class="hljs-comment"># Location for temporary files</span>
    <span class="hljs-attr">TMPDIR</span> = <span class="hljs-string">"/var/cache/nix"</span>;
  };
  <span class="hljs-attr">serviceConfig</span> = {
    <span class="hljs-comment"># Create /var/cache/nix automatically on Nix Daemon start</span>
    <span class="hljs-attr">CacheDirectory</span> = <span class="hljs-string">"nix"</span>;
  };
};</code></pre><p>However, this option does not apply to the root user. Under the root user, the nix command handles the build request itself rather than passing it to the Nix Daemon. 
Therefore, we need to add an environment variable <code>NIX_REMOTE=daemon</code> to force the nix command to call the daemon:</p><pre><code class="hljs language-nix">environment.variables.<span class="hljs-attr">NIX_REMOTE</span> = <span class="hljs-string">"daemon"</span>;</code></pre><blockquote><p>Thanks to NixOS CN Telegram group user "洗白白" for pointing out the problem, and to "Nick Cao" for providing a fix.</p></blockquote><h2 id="activate-config">Activate Config</h2><p>With the configuration complete, it's finally time to activate it.</p><p>First, run <code>sudo nixos-rebuild boot --flake .</code> to activate the config on the next reboot. Remember not to use <code>sudo nixos-rebuild switch --flake .</code>, since we need to move the files to their correct locations from a LiveCD before we can use that config.</p><p>Reboot the computer into the LiveCD, then mount and <code>cd</code> into the original root partition:</p><ul><li><strong>BACK UP YOUR DATA if you're unfamiliar with the process!</strong></li><li>Create a <code>persistent</code> folder, corresponding to <code>/nix/persistent</code> after the system starts;</li><li>Move all preserved paths listed above into the <code>persistent</code> folder, keeping their original directory structure (for example, <code>etc/machine-id</code> becomes <code>persistent/etc/machine-id</code>);</li><li>Remove all folders except <code>nix</code> and <code>persistent</code>;<ul><li><strong>BACK UP YOUR DATA BEFORE REMOVAL!</strong></li></ul></li><li>Move all folders in <code>nix</code> to the current directory;</li><li>Finally, remove the <code>nix</code> directory and reboot.</li></ul><p>If you did everything correctly, you will boot into a "stateless" NixOS. Everything you chose to persist will be mounted back to its original location, so the system should behave exactly the same. 
However, your root partition has become a <code>tmpfs</code> RAM disk, so all state you don't intend to keep will disappear after a reboot, and you will get a "brand new" operating system each time you start your computer.</p><h2 id="references">References</h2><p>During my configuration process, I referenced the following resources:</p><ul><li><a href="https://grahamc.com/blog/erase-your-darlings">Erase your darlings - Graham Christensen</a><ul><li>The earliest "stateless" implementation, which restores state from a ZFS snapshot on every reboot.</li></ul></li><li><a href="https://elis.nu/blog/2020/05/nixos-tmpfs-as-root/">NixOS: tmpfs as root - Elis Hirwing</a></li><li><a href="https://github.com/nix-community/impermanence">Impermanence</a><ul><li>NixOS helper module for going stateless.</li></ul></li></ul><p>You can find my related configuration at these links:</p><ul><li>Impermanence module config: <a href="https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/common-components/impermanence.nix">https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/common-components/impermanence.nix</a></li><li>User home directory config: <a href="https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/client-components/impermanence.nix">https://github.com/xddxdd/nixos-config/blob/f7cbc14f23a7d6bb21ca4edb153f704735fe5419/nixos/client-components/impermanence.nix</a></li></ul>]]></content:encoded>
<category domain="https://lantian.pub/category/modify-computer/">Computers and Clients</category>
<category domain="https://lantian.pub/tag/NixOS/">NixOS</category>
<comments>https://lantian.pub/en/article/modify-computer/nixos-impermanence.lantian/#disqus_thread</comments>
</item>
</channel>
</rss>