Something Wrong with Sonic on Arista 7050QX-32. I can not run sonic on arista 7050QX-32. #592
Comments
Did this get worked out? |
It seems the created 'tmpfs' is too small, I've increased the size by remounting. Still not sure on how to boot up the containers tho.
|
Adding more DDR3 unbuffered ECC memory will work; the default speed is 1333 MHz. |
This device unfortunately has a really small storage device (2GB). Here are some pointers as to where you can poke to make it work for your case.
If it's a lab/hobby device, you should consider upgrading the storage capacity of your device. |
Hello @Staphylo I have the same issue here.
So actually, it is not the flash size that limits our space but the tmpfs size.
About resolution:
Thanks |
Both the flash size and the RAM size are troublesome at this point. You have to pick the right release based on their sizes.
A USB key or an SSD should work if the necessary changes are made in the code. The main problem you're facing is that the install destination is implied to be /mnt/flash, which is the internal storage device. Have you tried the following?
Yes, sonic-net/sonic-buildimage@48ba459 is possible for this product. |
Maybe only for this last time (before next versions grow in size), a bigger tmpfs could fit.
Regarding this extract, does it happen only for the first boot / installation, or each boot?
flash_size is what toggles the use of docker_inram, so even with a large flash device, as soon as this device is detected with any "real flash size", we will try to extract into the tmpfs, which is defined as 1.5G while we need at least 1.9G. Am I wrong?
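To make the size mismatch concrete, here is a minimal Python sketch of the comparison. The two figures (a 1.5G tmpfs and a 1.9G requirement) are the ones quoted in this comment; the function name is hypothetical and this is not the actual boot0 logic.

```python
def tmpfs_shortfall_gib(tmpfs_gib: float, required_gib: float) -> float:
    """Return how many GiB are missing in the tmpfs (0.0 if the content fits)."""
    return max(0.0, required_gib - tmpfs_gib)


# Figures quoted in the comment: tmpfs defined as 1.5G, at least 1.9G needed.
shortfall = tmpfs_shortfall_gib(1.5, 1.9)
print(f"missing {shortfall:.1f} GiB")
```

With these numbers the extraction is about 0.4 GiB short, which is consistent with the "No space left on device" error in the issue log.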
If I saw correctly, if you wipe the internal flash (partition table), Aboot mounts the USB volume as /mnt/flash (this is my hack for now). I suppose it works for SONiC as the created partition is placed on
No, I will try. Many thanks for your quick answer and your support. I am new to SONiC and my first challenge is to install it! |
Yes 1.9/2G should do just fine.
Yes the extraction process only happens on firstboot.
Right, you can override kernel parameters by putting new ones in
You shouldn't wipe the table partition. The storage device should have at least 1 partition available, you might run into problems otherwise. |
Will try 🤞
Ok, so eventually I will try to use an external device for the first boot and then move to the flash, if I still have issues. I understand that this will not be a good solution with many devices.
Thanks for this hint, if it lets us change settings without rebuilding SONiC each time!
Sure, I recreated it (identical to before). I only saw that it changes the mount point and "fakes" the USB device to be mounted as the internal flash.
Which should be in vfat when EOS is installed. -> I'll try the different approaches above and I'll be back to give feedback. |
First observations:
I think this is enough to say that it will never fit into 2G, which makes replacing the internal flash of this device mandatory before anything else. Second part:
So, if we have to recommend something on this question, for a normal installation + later upgrades, is it more or less right to say:
? Do we need an absolute minimum of 8G of flash, while 16 to 32G would leave admins some headroom for logs and the like?
So not in this way, but in the end it should do the same as what I did. If the final FS takes ~2.5G right after the install, I think we can skip it and go straight to hardware replacement. Another thing not directly related: |
Glad you were able to tweak things your way!
Indeed, it's pretty convenient, though you should know that the "primary" image is self-installing.
Yes 8GB is the bare minimum nowadays, I'd definitely recommend upgrading to 16G minimum if possible.
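The sizing guidance in this thread can be summarized in a small Python sketch. The thresholds are the ones stated here (8G as the bare minimum, 16G or more recommended); the function name and the labels are only illustrative.

```python
def flash_verdict(size_gb: float) -> str:
    """Classify a storage size using the thresholds quoted in this thread."""
    if size_gb < 8:
        return "too small"      # e.g. the stock 2G flash of this device
    if size_gb < 16:
        return "bare minimum"   # 8G is described as the bare minimum
    return "recommended"        # 16G or more is the recommended upgrade


for size in (2, 8, 16, 32):
    print(size, flash_verdict(size))
```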
Not quite sure why that is. It has happened when the flash is full but seems like it's not the case here. |
Ok, understood. I didn't have an issue when editing only "the first" boot0, maybe because flash_size is not used for anything else. But I got your point and will do it that way next time.
I found 32G USB keys cheaper than 16G ones. In any case, the cost of flash storage is not really an issue: always less than 20-30€ now (for 32G).
It happened with a successful installation. I started from scratch with new USB keys, boot0 edited as you told me, and I am back with these logs (I think it is not related to this issue btw, so I can create another one for this problem if you prefer). |
And logs
Resolution
|
I was able to repro the issue and it seems to be crashing in SONiC recently added support for FIPS certification by changing from |
Hello, if someone has the same issue, I tried to summarize everything we said in this issue here: https://github.com/hugocollignon/SONiC/blob/main/SONiC-Arista-7050QX-32.md If all the hints are ok for the community, maybe we could add a page in the wiki https://github.com/sonic-net/SONiC/wiki and a link to it in https://github.com/sonic-net/SONiC/wiki/Supported-Devices-and-Platforms? |
Hi all, sorry to necro this issue, it was the most fitting one I could find in this repo. I want to run SONiC on an Arista 7050QX-32S. Its DOM is too small, so I bought some SATA M.2s to put in my switches and use as new boot drive. I followed these steps:
and even modified the boot0 included in the sonic-aboot-barefoot.swi for my switches to switch the remaining flash: to drive: but I still can't get it to work. This is what I did, and the error I receive after the initial extraction has happened and it tries to boot SONiC:
Am I doing something wrong, or are there still pointers to flash: somewhere I didn't check, given that /host points to /mnt/flash, not /mnt/drive? |
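One way to look for leftover references is to scan the edited script text for the two names mentioned in this comment. The following Python sketch is only an illustration: the search patterns are assumptions based on this comment, and the sample text is hypothetical, not real boot0 content.

```python
import re

# Assumed patterns: the "flash:" device prefix and the /mnt/flash mount point.
FLASH_PATTERN = re.compile(r"flash:|/mnt/flash")


def find_flash_references(script_text: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for every line still mentioning flash."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if FLASH_PATTERN.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical sample, not taken from the real boot0 script.
sample = """\
target_path=/mnt/drive
image_path=/mnt/flash/image
boot=drive:sonic.swi
"""

for number, line in find_flash_references(sample):
    print(f"{number}: {line}")
```

To check a real file, read its text (for example with `pathlib.Path(...).read_text()`) and pass it to `find_flash_references`.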
Booting from Simple boot should be achievable in reasonable time by poking at the Also note that you are using a |
@Staphylo Thanks for the feedback. So my only remedy for this switch is to buy a larger DOM or get a DOM to USB adapter to use a USB drive within the switch? Also, regarding the barefoot vs broadcom thing, I used the official download link for my switch model from here: https://github.com/sonic-net/SONiC/blob/sonic_image_md_update/supported_devices_platforms.md which points to https://artprodcus3.artifacts.visualstudio.com/Af91412a5-a906-4990-9d7c-f697b81fc04d/be1b070f-be15-4154-aade-b1d3bfb17054/_apis/artifact/cGlwZWxpbmVhcnRpZmFjdDovL21zc29uaWMvcHJvamVjdElkL2JlMWIwNzBmLWJlMTUtNDE1NC1hYWRlLWIxZDNiZmIxNzA1NC9idWlsZElkLzM3MTI2MS9hcnRpZmFjdE5hbWUvc29uaWMtYnVpbGRpbWFnZS5iYXJlZm9vdA2/content?format=file&subpath=/target/sonic-aboot-barefoot.swi even though it says SONiC-Aboot-Broadcom. Maybe I should have used https://sonic-net.github.io/SONiC/Supported-Devices-and-Platforms.html instead, because there it's correct.
error when using it. See also #1664 |
Yes, at this point I believe your options are either to replace the flash device or to spend cycles trying to add support for /mnt/drive in SONiC. I am aware of the issue with the image links being outdated. |
I am not too familiar with Azure DevOps, so an add-on question: |
@hugocollignon don't suppose you remember which branch you installed from to get this far? |
Good question, let's try to find an answer. Inside, I found "At this date (20221110)".
In the same page, there is a copy/paste of the first boot: I didn't find exactly the same artifact, but this should give a good starting point for a version which works. Then I'll let you find the last working one. To answer to
It was 202205 IIRC (which seems consistent with the image name). Hope that helps! |
@hugocollignon I'm getting the same issue with the 202205 branch. If I try and run
|
No idea how this works. @Staphylo maybe? |
@hugocollignon I ended up putting EOS back on to see if it was just all-round cooked. The first attempt at installing it didn't detect the fans or PSUs (but was still able to boot). I can't remember what I ran to get the warning, but it alerted me about needing to put the 2GB variant of EOS on, as the version I was using was incompatible. After doing that, it properly detected the hardware, as the startup sequence didn't just go straight to fixed fan speed. I've got a 64GB USB in there, but it did indeed come with a 2GB DOM. |
Aboot# ls
MD5SUMS dev init mnt root tmp
bin etc lib proc sys
Aboot# cd /mnt/
Aboot# ls
flash flash.conf
Aboot# cd flash
Aboot# ls
EOS-4.17.7M.swi debug sonic-aboot-broadcom.swi
boot-config persist startup-config
config_match schedule zerotouch-config
Aboot# boot /mnt/flash/sonic-aboot-broadcom.swi
42.05: Cleaning flash content /mnt/flash
46.39: Generating boot-config, machine.conf and cmdline
46.53: Installing image under /mnt/flash/image-HEAD.247-7bc8f129
46.53: Moving swi to a tmpfs
75.79: Extracting swi content
143.30: Extracting dockerfs.tar.gz from swi
187.61: Unpacking dockerfs.tar.gz delayed to initrd because /mnt/flash is vfat or docker_inram is on
187.61: Remove installer
193.79: Kexecing[ 193.801871] Starting new kernel
...
[ 4.239067] sd 4:0:0:0: [sda] No Caching mode page found
[ 4.302475] sd 4:0:0:0: [sda] Assuming drive cache: write through
Checking that no-one is using this disk right now ... OK
Disk /dev/sda: 7.5 GiB, 8044675072 bytes, 15712256 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x31625cc4
Old situation:
Device Boot Start End Sectors Size Id Type
/dev/sda1 2048 15712255 15710208 7.5G 83 Linux
New situation:
Device Boot Start End Sectors Size Id Type
/dev/sda1 2048 15712255 15710208 7.5G 83 Linux
The partition table has been altered.
Calling ioctl() to re-read partition table.
Re-reading the partition table failed.: Device or resource busy
The kernel still uses the old table. The new table will be used at the next reboot or after you run partprobe(8) or kpartx(8).
Syncing disks.
mke2fs 1.43.4 (31-Jan-2017)
/dev/sda1 contains a vfat file system
Creating filesystem with 1963776 4k blocks and 491520 inodes
Filesystem UUID: 0c20ebd7-8d93-4345-aa20-0283ca91e8ea
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
tar: write error: No space left on device
[ 175.081028] rc.local[471]: + sonic-cfggen -y /etc/sonic/sonic_version.yml -v build_version
[ 179.015145] kdump-tools[470]: Starting kdump-tools: no crashkernel= parameter in the kernel cmdline ... failed!
[ 184.140222] rc.local[471]: + SONIC_VERSION=HEAD.247-7bc8f129
[ 184.216815] rc.local[471]: + FIRST_BOOT_FILE=/host/image-HEAD.247-7bc8f129/platform/firsttime
[ 184.325718] rc.local[471]: + logger SONiC version HEAD.247-7bc8f129 starting up...
[ 184.428719] rc.local[471]: + [ ! -e /host/machine.conf ]
[ 184.500714] rc.local[471]: + . /host/machine.conf
[ 184.560685] rc.local[471]: + aboot_version=2.0.10-1458058
[ 184.632667] rc.local[471]: + aboot_vendor=arista
[ 184.692670] rc.local[471]: + aboot_platform=x86_64-arista_7050_qx32
[ 184.768673] rc.local[471]: + aboot_machine=arista_7050_qx32
[ 184.840693] rc.local[471]: + aboot_arch=x86_64
[ 184.900692] rc.local[471]: + aboot_build_date=2013-09-21T19:34:39.000000000
[ 184.992670] rc.local[471]: + program_console_speed
[ 185.053733] rc.local[471]: + cat /proc/cmdline
[ 185.113403] rc.local[471]: + cut -d , -f2
[ 185.168938] rc.local[471]: + grep -Eo console=ttyS[0-9]+,[0-9]+
[ 185.245220] rc.local[471]: + speed=
[ 185.288725] rc.local[471]: + [ -z ]
[ 185.332675] rc.local[471]: + CONSOLE_SPEED=9600
[ 185.392667] rc.local[471]: + sed -i s|--keep-baud .* %I| 9600 %I|g /lib/systemd/system/serial-getty@.service
[ 185.520697] rc.local[471]: + systemctl daemon-reload
[ 185.580777] rc.local[471]: + [ -f /host/image-HEAD.247-7bc8f129/platform/firsttime ]
[ 185.684691] rc.local[471]: + echo First boot detected. Performing first boot tasks...
[ 185.792706] rc.local[471]: First boot detected. Performing first boot tasks...
[ 185.888721] rc.local[471]: + [ -n x86_64-arista_7050_qx32 ]
[ 185.960704] rc.local[471]: + platform=x86_64-arista_7050_qx32
[ 186.036709] rc.local[471]: + [ -d /host/old_config ]
[ 186.096689] rc.local[471]: + [ -f /host/minigraph.xml ]
[ 186.168728] rc.local[471]: + [ -n ]
[ 186.308188] rc.local[471]: + touch /tmp/pending_config_initialization
[ 186.396714] rc.local[471]: + touch /tmp/notify_firstboot_to_platform
[ 186.484706] rc.local[471]: + [ ! -d /host/reboot-cause/platform ]
[ 186.560741] rc.local[471]: + mkdir -p /host/reboot-cause/platform
[ 186.640758] rc.local[471]: + [ -d /host/image-HEAD.247-7bc8f129/platform/x86_64-arista_7050_qx32 ]
[ 186.748688] rc.local[471]: + sync
[ 190.768230] arista: waiting for switch chip
[ 190.918819] arista: switch chip is ready
[ 192.967795] arista: yielding...
[ OK ] Started Arista early platform initialization.
Starting Arista late platform initialization...
Starting Opennsl kernel modules init...
[ 195.083340] rc.local[471]: + [ -n ]
[ OK ] Started /etc/rc.local Compatibility.
[ 195.136653] rc.local[471]: + mkdir -p /var/platform
[ OK ] Started Opennsl kernel modules init.
[ 195.282353] rc.local[471]: + firsttime_exit
[ OK ] Started Getty on tty1.
[ OK ] Started Serial Getty on ttyS0.
[ OK ] Reached target Login Prompts.
[ 195.404451] rc.local[471]: + rm -rf /host/image-HEAD.247-7bc8f129/platform/firsttime
[ 195.680364] rc.local[471]: + exit 0
[ OK ] Started Docker Application Container Engine.
Starting Database container...
[FAILED] Failed to start Database container.
See 'systemctl status database.service' for details.
[DEPEND] Dependency failed for BGP container.
[DEPEND] Dependency failed for switch state service.
[DEPEND] Dependency failed for ICCPD container.
[DEPEND] Dependency failed for Management Framework container.
[DEPEND] Dependency failed for NAT container.
[DEPEND] Dependency failed for sFlow container.
[DEPEND] Dependency failed for syncd service.
[DEPEND] Dependency failed for LLDP container.
[DEPEND] Dependency failed for Config initialization and migration service.
[DEPEND] Dependency failed for Update minigr… configuration based on minigraph.
[DEPEND] Dependency failed for Control Plane ACL configuration daemon.
[DEPEND] Dependency failed for Update rsyslog configuration.
[DEPEND] Dependency failed for DHCP relay container.
[DEPEND] Dependency failed for TEAMD container.
[DEPEND] Dependency failed for Update interfaces configuration.
[DEPEND] Dependency failed for Host config enforcer daemon.
[DEPEND] Dependency failed for Router advertiser container.
[DEPEND] Dependency failed for Update hostname based on configdb.
[DEPEND] Dependency failed for Process and d…ry utilization data export daemon.
[DEPEND] Dependency failed for Platform monitor container.
[DEPEND] Dependency failed for Update NTP configuration.
[DEPEND] Dependency failed for Monitor warm …ry and disable warmboot when done.
Starting Credo phy init...
[FAILED] Failed to start Credo phy init.
See 'systemctl status phy-credo.service' for details.
Debian GNU/Linux 9 sonic ttyS0
sonic login: admin
Password:
Linux sonic 4.9.0-11-2-amd64 #1 SMP Debian 4.9.189-3+deb9u2 (2019-11-11) x86_64
You are on
  ____   ___  _   _ _  ____
 / ___| / _ \| \ | (_)/ ___|
 \___ \| | | |  \| | | |
  ___) | |_| | |\  | | |___
 |____/ \___/|_| \_|_|\____|
-- Software for Open Networking in the Cloud --
Unauthorized access and/or use are prohibited.
All access and/or use are subject to monitoring.
Help: http://azure.github.io/SONiC/
admin@sonic:~$ ls
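
As a cross-check of the fdisk and mke2fs output in the log above, the following Python sketch redoes the arithmetic using only numbers printed in the log (sector size, sector counts, 4k filesystem blocks). The variable names are illustrative.

```python
SECTOR_BYTES = 512         # "Units: sectors of 1 * 512 = 512 bytes"
DISK_SECTORS = 15712256    # "Disk /dev/sda: ... 15712256 sectors"
PART_SECTORS = 15710208    # Sectors column for /dev/sda1
FS_BLOCK_BYTES = 4096      # mke2fs reports "4k blocks"

disk_bytes = DISK_SECTORS * SECTOR_BYTES
disk_gib = disk_bytes / 2**30
fs_blocks = PART_SECTORS * SECTOR_BYTES // FS_BLOCK_BYTES

print(disk_bytes)          # byte count reported by fdisk
print(round(disk_gib, 1))  # size in GiB
print(fs_blocks)           # block count reported by mke2fs
```

The results (8044675072 bytes, about 7.5 GiB, 1963776 blocks) match the values in the log, so the partition table and the new filesystem cover the same 7.5 GiB device.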