Preface


電胖達 is hard at work generating electricity again today (lol)

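懶懶 recently started playing Palworld. There are no friends to play with, but 懶懶 still wanted to try setting up a dedicated multiplayer server by hand. The developer thijsvanloef has published the palworld-server-docker Docker image, and 懶懶's Google Cloud account happened to have trial credit left, so 懶懶 spun up a 4 vCPU / 16 GB VM in the Taiwan region.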

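After a while, though, it turned out that the single-core performance of the AMD EPYC Milan processor (T2D series) was not enough. A thrown Pal Sphere would often freeze on the capture-rate screen, and the capture result appeared only after a delay of roughly 5 to 10 seconds, which hurt the gameplay badly. So 懶懶 turned to the AMD EPYC Genoa processor (C3D series), which was only offered in the Singapore region. That is where the trouble started: the new Singapore server could not reach the network to update through Steam!!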

Note from 懶懶: if you want to try the AMD EPYC Genoa processor, open the Compute Engine API page in the console and ask Google for a quota increase!


The problem: SteamCMD inside the Docker environment cannot update the app, and the logs show the following messages:

  1. Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)
  2. Error! App '2394010' state is 0x2 after update job.

root@gcp-sg:~# docker logs palworld-server  
...  
Update state (0x3) reconfiguring, progress: 0.00 (0 / 0)  
Error! App '2394010' state is 0x2 after update job.  
./PalServer.sh does not exist.  
Try restarting with UPDATE_ON_BOOT=true  
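
If the container keeps restarting, it can help to check for this failure without reading the whole log. The following is a minimal sketch: the function name has_update_failure is made up for illustration, and only the matched message and the container name palworld-server come from the output above.

```sh
# Return 0 if the log text on stdin contains the SteamCMD failure shown above.
# has_update_failure is an illustrative name, not part of any tool.
has_update_failure() {
  grep -Eq "Error! App '[0-9]+' state is 0x[0-9A-Fa-f]+ after update job"
}

# Usage (container name taken from this post):
#   docker logs palworld-server 2>&1 | has_update_failure && echo "SteamCMD update failed"
```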

Troubleshooting


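  1. First, 懶懶 compared the GCP Taiwan server with the new GCP Singapore server. Apart from the CPU they are nearly identical. The notable difference is that the new C3D series VM uses a gVNIC network interface by default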

Boot disk
Advanced options - Network interface card

  2. Inspect the VM's hardware devices and network, then test whether a Docker container can ping the outside network
root@gcp-sg:~# lspci
00:00.0 Host bridge: Intel Corporation 440FX - 82441FX PMC [Natoma] (rev 02)
00:01.0 ISA bridge: Intel Corporation 82371AB/EB/MB PIIX4 ISA (rev 03)
00:01.3 Bridge: Intel Corporation 82371AB/EB/MB PIIX4 ACPI (rev 03)
00:03.0 Ethernet controller: Google, Inc. Compute Engine Virtual Ethernet [gVNIC]
00:04.0 VGA compatible controller: Google, Inc. Device a002 (rev 01)
00:05.0 Unclassified device [00ff]: Red Hat, Inc. Virtio RNG
00:06.0 Non-Volatile memory controller: Google, Inc. Device 001f (rev 01)
root@gcp-sg:~# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc mq state UP group default qlen 1000
    link/ether 42:01:0a:94:00:02 brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 10.148.0.2/32 metric 100 scope global dynamic ens3
       valid_lft 44458sec preferred_lft 44458sec
    inet6 fe80::4001:aff:fe94:2/64 scope link 
       valid_lft forever preferred_lft forever
3: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:24:49:b2:ea brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:24ff:fe49:b2ea/64 scope link 
       valid_lft forever preferred_lft forever
122: veth303eb2c@if121: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master docker0 state UP group default 
    link/ether 16:50:3c:57:2a:0b brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet6 fe80::1450:3cff:fe57:2a0b/64 scope link 
       valid_lft forever preferred_lft forever
root@gcp-sg:~# docker network ls
NETWORK ID     NAME      DRIVER    SCOPE
4b00b3d5bbe5   bridge    bridge    local
7a187e3138e4   host      host      local
f07e4a2daf45   none      null      local
root@gcp-sg:~# docker ps
CONTAINER ID   IMAGE     COMMAND   CREATED        STATUS        PORTS     NAMES
20b37d2b7bc2   alpine    "ash"     1 hours ago   Up 1 hours             alpine1
root@gcp-sg:~# docker exec -it alpine1 ash
/ # ping www.google.com
PING www.google.com (142.250.4.99): 56 data bytes
64 bytes from 142.250.4.99: seq=0 ttl=114 time=0.709 ms
64 bytes from 142.250.4.99: seq=1 ttl=114 time=0.434 ms
64 bytes from 142.250.4.99: seq=2 ttl=114 time=0.435 ms
^C
--- www.google.com ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.434/0.526/0.709 ms
/ #
  3. 懶懶 then set up the server following the official Palworld tech guide. In that test SteamCMD updated normally
steamcmd +login anonymous +app_update 2394010 validate +quit
...
 Update state (0x81) verifying update, progress: 92.12 (2082252453 / 2260368082)
Success! App '2394010' fully installed.

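The tests show that a direct install, without a Docker container, updates Palworld without trouble. Yet an alpine container inside Docker can also ping the outside network, so the problem is evidently not that simple!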
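
One reason a successful ping can be misleading is that the default payload is only 56 bytes, as the output above shows, far below any MTU on this machine. A more telling probe sends a full-size packet with fragmentation forbidden. The sketch below computes the payload size for that. The helper name max_icmp_payload is hypothetical, and the -M do option is assumed to be available, which is the case for iputils ping but not for the BusyBox ping inside alpine.

```sh
# Largest ICMP echo payload that fits in one IPv4 packet for a given MTU:
# MTU minus 20 bytes of IPv4 header and 8 bytes of ICMP header.
# max_icmp_payload is an illustrative name.
max_icmp_payload() {
  echo $(( $1 - 28 ))
}

# Usage with iputils ping (-M do forbids fragmentation):
#   ping -M do -c 3 -s "$(max_icmp_payload 1460)" www.google.com
#   ping -M do -c 3 -s "$(max_icmp_payload 1500)" www.google.com
```

If the first probe succeeds and the second fails, packets sized for a 1500-byte MTU are not getting through.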

Solution


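After a lot of digging through posts, 懶懶 finally found that the problem lies in two of the network interfaces, ens3 and docker0:

  1. docker0 is configured with an MTU of 1500
  2. ens3 is configured with an MTU of 1460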

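When docker0 sends a packet larger than the MTU (maximum transmission unit) limit of ens3, the packet cannot pass through, so the network works only intermittently. This would also explain the earlier ping test: a 56-byte ping fits easily within 1460 bytes, while SteamCMD's full-size download packets do not.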
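
The same comparison can be made without reading through the ip addr output, because Linux exposes each interface's MTU under /sys/class/net. This is a rough sketch: the names NET_SYSFS, read_mtu and mtu_exceeds are invented here, and the interface names are the ones from this VM.

```sh
# Directory holding per-interface attributes. NET_SYSFS is a name invented for
# this sketch; it can be overridden to point the helpers at another directory.
NET_SYSFS="${NET_SYSFS:-/sys/class/net}"

# Print the MTU of an interface, e.g. read_mtu ens3
read_mtu() {
  cat "$NET_SYSFS/$1/mtu"
}

# Return 0 if the first interface's MTU is larger than the second one's
mtu_exceeds() {
  [ "$(read_mtu "$1")" -gt "$(read_mtu "$2")" ]
}

# Usage:
#   mtu_exceeds docker0 ens3 && echo "docker0 MTU is larger than ens3 MTU"
```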

So manually setting the MTU of docker0 is enough to solve the problem!

For detailed setup steps, see reference #3 at the bottom!

# Create the following file with an editor (or edit it if it already exists)
sudo nano /etc/docker/daemon.json

# Add the following content
{
  "mtu": 1460
}

# Restart the Docker daemon
sudo systemctl restart docker
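
To confirm the new setting took effect after the restart, one option is to read the MTU from inside a throwaway container on the default bridge network. This is only a sketch under a few assumptions: the helper names container_mtu and check_container_mtu are hypothetical, the alpine image can be pulled, and the container's interface is called eth0, which is the usual name.

```sh
# Print the MTU seen by a fresh container on the default bridge network.
# Assumes the alpine image is available and the interface is named eth0.
container_mtu() {
  docker run --rm alpine cat /sys/class/net/eth0/mtu
}

# Return 0 if the container MTU equals the expected value
check_container_mtu() {
  [ "$(container_mtu)" = "$1" ]
}

# Usage:
#   check_container_mtu 1460 && echo "containers now use MTU 1460"
```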

Here is 懶懶's Docker Compose file for reference

services:
   palworld:
      image: thijsvanloef/palworld-server-docker:latest
      restart: unless-stopped
      container_name: palworld-server
      stop_grace_period: 30s
      ports:
        - 8211:8211/udp
        - 27015:27015/udp
      environment:
         PUID: 1000
         PGID: 1000
         PORT: 8211 # Optional but recommended
         PLAYERS: 16 # Optional but recommended
         MULTITHREADING: true
         RCON_ENABLED: true
         RCON_PORT: 25575
         TZ: "UTC"
         ADMIN_PASSWORD: ""
         COMMUNITY: false
         SERVER_NAME: "懶懶幻獸帕魯伺服器"
         SERVER_DESCRIPTION: "懶懶幻獸帕魯伺服器"
      env_file:
         - .env
      volumes:
         - ./palworld:/palworld/
networks:
   default:
      driver: bridge
      driver_opts:
         com.docker.network.driver.mtu: 1460 # Set the MTU of the default network

References

  1. https://www.cloudflare.com/zh-tw/learning/network-layer/what-is-mtu/
  2. https://www.googlecloudcommunity.com/gc/Infrastructure-Compute-Storage/Urgent-need-help-My-Docker-Container-can-t-access-internet/m-p/623132
  3. https://www.zeng.dev/post/2022-the-docker-mtu-problem/