
Archive for the SuSE Category

How to recover broken grub2 boot on SLES15

Posted on Wed, Nov 25, 2020 at 12:28 by Hubertus A. Haniel

I am currently having issues where the boot setup of one of my VMs gets corrupted and I frequently have to recover it. I sometimes forget to mount one of the required mount points, so I have noted the steps here, mainly to jog my own memory, but they may be helpful to others.

These are the steps I currently follow to recover the boot sector:

  1. Boot the system from the DVD or PXE into rescue mode
  2. Mount the existing root file system under /mnt which is on /dev/rootvg/rootlv using "mount /dev/rootvg/rootlv /mnt"
  3. mount -t proc none /mnt/proc
  4. mount -t sysfs sys /mnt/sys
  5. mount -o bind /dev /mnt/dev
  6. chroot /mnt
  7. /sbin/mkinitrd (to recreate the initrd for the existing kernel)
  8. /usr/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg (To regenerate the grub config)
  9. /usr/sbin/grub2-install --force /dev/sda (/dev/sda is my boot disk)

Now exit the chroot jail and reboot the system; hopefully everything should be back to normal.
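
The steps above can be sketched as a small script. This is only a sketch under a few assumptions: the names ROOT_DEV, BOOT_DISK, DRY_RUN, run and recover_boot are made up for illustration, and instead of entering the chroot interactively it runs each command through "chroot /mnt ...". With DRY_RUN=1 (the default here) it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the recovery steps; defaults match the devices used in this post.
ROOT_DEV=${ROOT_DEV:-/dev/rootvg/rootlv}
BOOT_DISK=${BOOT_DISK:-/dev/sda}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

recover_boot() {
    run mount "$ROOT_DEV" /mnt || return 1
    run mount -t proc none /mnt/proc || return 1
    run mount -t sysfs sys /mnt/sys || return 1
    run mount -o bind /dev /mnt/dev || return 1
    # Assumption: running each tool via chroot is equivalent to typing it inside the chroot.
    run chroot /mnt /sbin/mkinitrd || return 1
    run chroot /mnt /usr/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg || return 1
    run chroot /mnt /usr/sbin/grub2-install --force "$BOOT_DISK" || return 1
}
```

Calling recover_boot with DRY_RUN=1 is a cheap way to check that no mount point has been forgotten before running it for real from the rescue system with DRY_RUN=0.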

Edited on: Wed, Nov 25, 2020 12:36

Remove graphical splash screen on SLES15

Posted on Wed, Nov 25, 2020 at 12:07 by Hubertus A. Haniel

By default, modern SLES/OpenSuSE distributions have a graphical boot screen, which can be quite irritating as it hides valuable debugging information during the boot process. It can be dismissed on any given boot by hitting Escape, but having to do this every time soon becomes tedious. The simple solution is to disable it completely by modifying the boot options in grub. The options for this are:

splash=verbose plymouth.enable=0

As for the plymouth option, you could also permanently remove the plymouth packages, but the above option is probably less intrusive if you wish to re-enable the splash screen at a later stage.
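
To make the options permanent they can be added to the kernel command line in the grub defaults file. The following is a minimal sketch, not a tested procedure: set_verbose_boot is a made-up helper name, and it assumes GNU sed and a GRUB_CMDLINE_LINUX_DEFAULT line that sits on one line, uses double quotes and has nothing after the closing quote.

```shell
#!/bin/sh
# Hypothetical helper: force the verbose boot options in a grub defaults file.
# $1 = path to the file (normally /etc/default/grub)
set_verbose_boot() {
    file=$1
    # First drop any existing splash= and plymouth.enable= settings,
    # then append the two options before the closing quote.
    sed -i -E \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/ ?(splash|plymouth\.enable)=[^" ]*//g' \
        -e '/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/"$/ splash=verbose plymouth.enable=0"/' \
        "$file"
}

# Usage (as root), followed by regenerating the grub config:
#   set_verbose_boot /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg
```

Running the helper twice leaves a single copy of each option, because the old values are stripped before the new ones are appended.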

Edited on: Wed, Nov 25, 2020 12:37

Linux SCSI rescan? - Reboot!?

Posted on Tue, Dec 06, 2011 at 11:56 by Hubertus A. Haniel

Recently people keep asking me how to add and remove storage on a Linux system running a 2.6 kernel and get Linux to rescan the SCSI bus and add (or remove) storage dynamically without rebooting.

So here is how to do it:

1 - Find the host number for the HBA:

ls /sys/class/fc_host/

You will have something like host1 or host2

2 - Ask the HBA to issue a LIP signal to rescan the FC bus:

echo 1 > /sys/class/fc_host/host1/issue_lip

3 - Wait for a few seconds for the LIP command to complete

4 - Ask the Linux kernel to rescan the SCSI devices on that HBA:

echo "- - -" > /sys/class/scsi_host/host1/scan

( - - - means every channel, every target and every LUN )
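
On a system with more than one HBA the same steps can be looped over every FC host. This is a rough sketch only: rescan_fc_hosts is a made-up name, the two optional arguments (a sysfs root and the wait time in seconds) are there purely so the function can be tried against a scratch directory, and the default of 5 seconds is my own guess at "a few seconds".

```shell
#!/bin/sh
# Hypothetical helper: issue a LIP and a SCSI rescan on every FC host.
# $1 = sysfs root (default /sys), $2 = seconds to wait after the LIP (default 5)
rescan_fc_hosts() {
    sysfs=${1:-/sys}
    wait_secs=${2:-5}
    for host in "$sysfs"/class/fc_host/host*; do
        [ -e "$host" ] || continue          # glob did not match: no FC HBAs present
        name=${host##*/}                    # e.g. host1
        echo 1 > "$host/issue_lip"          # step 2: rescan the FC bus
        sleep "$wait_secs"                  # step 3: give the LIP time to complete
        echo "- - -" > "$sysfs/class/scsi_host/$name/scan"   # step 4
        echo "rescanned $name"
    done
}

# Usage (as root):
#   rescan_fc_hosts
```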

Edited on: Tue, Dec 06, 2011 12:07

LXC - Linux Containers on OpenSuSE/SLES

Posted on Wed, Oct 05, 2011 at 9:27 by Hubertus A. Haniel

Recently somebody pointed me at LXC so I thought I would give it a try.

As I mainly work on SuSE/SLES, I attempted this following the documentation at http://en.opensuse.org/LXC with a little extra help from http://lxc.teegra.net using OpenSuSE 11.4.

At the time of writing the OpenSuSE guide worked pretty well, but I had to make a few adjustments to get the container running properly:

  • To get the network to start up properly I had to comment out the paragraph that sets the mode to "onboot" in /etc/init.d/network, so during boot it is just called with start. This is a bit of a hack and may break things if the networking setup is more complex than a single interface. I also adjusted my config slightly to use DHCP rather than static addresses as that is a little easier to handle in my test environment.
  • This is not mentioned in the documentation I found, but autofs has problems within a container. My home directory gets mounted from an NFS server and autofs just seemed to hang, while a hard NFS mount in fstab would work just fine.
  • Booting the environment came up with lots of errors about udev, so I re-enabled it even though the documentation mentions that it should be taken out.

I have the advantage that I have puppet in my environment to adjust the configuration of systems to suit my test environment, but a few things that would make LXC on SuSE viable in a proper environment would be:

  • LXC/YaST integration so AutoYaST templates could be fed into container creation.
  • Currently there are no LXC templates for OpenSuSE, or only the frameworks, so one would have to create proper templates to use the lxc-create/lxc-destroy commands to create and destroy containers on the fly.
  • LXC is part of the SLES11 distribution but there does not seem to be any documentation on what Novell would support inside a container in a production environment. I had to hack startup scripts in /etc/init.d, so I think the startup scripts would need to be properly adjusted to do the right things when they are running inside a container. Hacking the startup scripts is not really an option as those changes may get reverted during patching.

Other than the above concerns and gotchas, LXC is a very interesting project with the potential of Solaris Zones, and it gives Linux a full complement of virtualisation technologies alongside UML (not really used any more), Xen and KVM.

OpenSuSE filesystem snapshots on btrfs

Posted on Mon, Sep 19, 2011 at 17:16 by Hubertus A. Haniel

Only recently did I find out about a cool new feature in the OpenSuSE Factory builds called snapper. It has been in the works for quite a while and was actually announced back in April, but it is only recently that I have tried these tools out while familiarising myself with btrfs.

Snapper is reasonably easy to configure and there are quite a few guides on the OpenSuSE website. Once it is all configured it creates snapshots of your system at regular intervals via cron, and also every time you make modifications to your system using zypper and YaST, but it requires that your root filesystem is btrfs. The only drawback currently is that you cannot boot off btrfs, so you cannot create snapshots of the /boot filesystem and therefore you would not be able to roll back kernel updates unless you replicate the contents of /boot into your btrfs filesystem before kernel upgrades.

I followed a detailed guide on how to get snapper going, but it should be noted that the subvolume for snapshots should be created as /.snapshots rather than /snapshots, otherwise things will just not work.

So on your btrfs root filesystem or any filesystem that you want to snapshot you create a subvolume called .snapshots like so:

  btrfs subvolume create /.snapshots

Then you create your configuration in /etc/snapper/configs/root for your root filesystem. I took the example from /etc/snapper/config-templates/default.

Then you add "root" to SNAPPER_CONFIGS in /etc/sysconfig/snapper.

Now everything should be ready to work and you can try to create an initial snapshot with the following command:

    snapper create --description "initial"
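
The SNAPPER_CONFIGS edit can also be scripted. This is only a sketch: add_snapper_config is a made-up helper name, it assumes GNU sed, and it assumes the variable is already present on a single line in the form SNAPPER_CONFIGS="..." (if the line is missing the helper does nothing).

```shell
#!/bin/sh
# Hypothetical helper: add a config name to SNAPPER_CONFIGS if it is not listed yet.
# $1 = sysconfig file (normally /etc/sysconfig/snapper), $2 = config name, e.g. root
add_snapper_config() {
    file=$1
    name=$2
    # Extract the current space-separated list between the double quotes.
    current=$(sed -n 's/^SNAPPER_CONFIGS="\(.*\)"$/\1/p' "$file")
    case " $current " in
        *" $name "*) return 0 ;;            # already listed, nothing to do
    esac
    current="${current:+$current }$name"
    sed -i "s/^SNAPPER_CONFIGS=.*/SNAPPER_CONFIGS=\"$current\"/" "$file"
}

# Usage (as root):
#   add_snapper_config /etc/sysconfig/snapper root
```
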
Edited on: Sat, Sep 24, 2011 17:41

Mirroring a LVM based root disk to a second disk on Linux

Posted on Fri, Dec 17, 2010 at 16:02 by Hubertus A. Haniel

I did this on OpenSuSE 11.2 so on other systems you may have to adjust this procedure slightly.

Note: Make sure that kernel dumping is disabled as OpenSuSE 11.2 does not seem to be able to dump to a raided LVM setup. Also, before you do this you should ensure that you have backups of everything as you could end up with a corrupted system if you get this procedure wrong.

On my setup the system is installed on /dev/sda and my spare disk which I want to mirror to is /dev/sdb. I have two partitions, with /dev/sda1 being the /boot filesystem and /dev/sda2 being under LVM control containing the root filesystem and swap space. Obviously on a raided system you trade a bit of write performance for disk redundancy, as every write to the disks will happen twice.

You need to ensure that both disks are the same size or that /dev/sdb is bigger than /dev/sda. In my system both disks are exactly the same size, which is preferable as the disk geometry will then be the same.

You also want to ensure that any applications are shut down to minimize any other disk access to the system while you are doing this.

As a first step you need to re-label all your partitions on the primary disk to raid (type fd):

carling:~ # fdisk /dev/sda
The number of cylinders for this disk is set to 2610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

The above example needs to be repeated for /dev/sda2 (partition 2) as well, and then you should exit fdisk with the command "w" to write the partition table. You may get a warning that the new partition table is not used until the next reboot, but you can ignore that. Listing the partitions with sfdisk should already show the new setup, which should look like this:

carling:~ # sfdisk -d /dev/sda
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start= 63, size= 208782, Id=fd, bootable
/dev/sda2 : start= 208845, size= 41720805, Id=fd
/dev/sda3 : start= 0, size= 0, Id= 0
/dev/sda4 : start= 0, size= 0, Id= 0

Now we will copy the above partition table to the second disk using sfdisk again:

sfdisk -d /dev/sda > partitions.txt
sfdisk /dev/sdb < partitions.txt

At this stage you may get an error about /dev/sdb having an invalid DOS signature which can safely be ignored. You should check though that the bootable flag for /dev/sdb is set as below. This was not the case for me so I had to do this manually using fdisk:

carling:~ # fdisk -l /dev/sdb

Disk /dev/sdb: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x3dc7fd79
   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14        2610    20860402+  fd  Linux raid autodetect    
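
A quick way to double-check that the copy worked, bootable flag included, is to compare the two sfdisk dumps with the device names stripped out. This is just a sketch: normalise_dump and same_layout are made-up names, and it assumes the old dump format shown above (newer sfdisk versions add header lines such as the disk identifier, which would differ between disks).

```shell
#!/bin/sh
# Print an sfdisk dump with the device names removed so two disks can be compared.
normalise_dump() {
    sed -E -e '/^# partition table of /d' \
           -e 's#^/dev/[a-z]+([0-9]+) *:#part\1:#' "$1"
}

# Succeeds when both dump files describe the same layout, bootable flags included.
same_layout() {
    [ "$(normalise_dump "$1")" = "$(normalise_dump "$2")" ]
}

# Usage:
#   sfdisk -d /dev/sda > sda.txt
#   sfdisk -d /dev/sdb > sdb.txt
#   same_layout sda.txt sdb.txt || echo "partition tables differ"
```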

Next we create a degraded array on /dev/sdb NOT touching /dev/sda:

carling:~ # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
mdadm: array /dev/md0 started.
carling:~ # mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 missing
mdadm: array /dev/md1 started.
carling:~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[0]
      1582336 blocks [2/1] [U_]

md0 : active raid1 sdb1[0]
      513984 blocks [2/1] [U_]

unused devices: <none>

Now create and copy /boot on /dev/md0 - in my case /boot is a reiserfs filesystem:

mkreiserfs /dev/md0
mount /dev/md0 /mnt
cp -a /boot/* /mnt/    

Now you need to edit /etc/fstab and change /dev/sda1 to /dev/md0 for the boot filesystem. After this is done you need to umount /mnt and /boot and do a mount /boot. The output of df should now show that boot is on /dev/md0 as below:

carling:~ # df -h /boot
Filesystem            Size  Used Avail Use% Mounted on
/dev/md0              102M   50M   53M  49% /boot    

Now we can migrate rootvg, which in my case is the volume group on /dev/sda2 containing my root filesystem and swap space, to /dev/md1 with the following commands:

pvcreate /dev/md1
vgextend rootvg /dev/md1
pvmove /dev/sda2 /dev/md1
vgreduce rootvg /dev/sda2
pvremove /dev/sda2    

Once this is done we can attach the first disk to the new raid setup using the commands "mdadm -a /dev/md0 /dev/sda1" and "mdadm -a /dev/md1 /dev/sda2". It will now take a while for your raid to synchronise, which can be monitored by checking the contents of /proc/mdstat with cat. While the synchronisation completes you can carry on with modifying some files. You should create /etc/mdadm.conf with the following contents:

carling:~ # cat /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 devices=/dev/sdb1,/dev/sda1
ARRAY /dev/md1 level=raid1 devices=/dev/sdb2,/dev/sda2    

You also need to edit the filter line in /etc/lvm/lvm.conf to ensure that /dev/md1 is an accepted device for LVM, otherwise you will not be able to boot. After modification my line now reads:

filter = [ "r|/dev/.*/by-path/.*|", "r|/dev/.*/by-id/.*|", "a/.*/", "a|/dev/md1|" ]

I have also read that some people on some Linux distributions have had the problem that LVM would always find the SCSI devices before the MD devices and use them instead, which may give some unpredictable results, but I have not run into this issue on OpenSuSE 11.2. After you have modified /etc/lvm/lvm.conf you need to run "vgscan" to update the .cache file in the same directory.

Now you need to create a new initrd using "mkinitrd -f md" which will tell mkinitrd to include the needed md kernel modules in the initrd.

carling:~ # mkinitrd -f md
Kernel image: /boot/vmlinuz-
Initrd image:   /boot/initrd-
Root device: /dev/rootvg/rootlv (mounted on / as reiserfs)
Resume device: /dev/rootvg/swaplv
setup-md.sh: md127 found multiple times
Kernel Modules: scsi_transport_spi mptbase mptscsih mptspi libata ata_piix ata_generic ide-core piix ide-pci-generic hwmon thermal_sys processor thermal
fan dm-mod dm-snapshot rtc-lib rtc-core rtc-cmos reiserfs ohci-hcd ehci-hcd uhci-hcd hid usbhid raid0 raid1 xor async_tx async_memcpy async_xor raid6_pq
raid456 linear
Features: dm block usb md lvm2
19415 blocks

As seen above the feature list includes md and lvm2
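
Since the grub step has to wait until the arrays have finished syncing, a small check against /proc/mdstat can save some guesswork. This is a sketch only: md_all_synced is a made-up name, and it simply treats any "_" in the [UU] status field or any resync/recovery line as "not ready yet".

```shell
#!/bin/sh
# Succeeds (exit 0) only when no array in the given mdstat file is degraded or rebuilding.
# $1 = mdstat file to inspect (default /proc/mdstat)
md_all_synced() {
    mdstat=${1:-/proc/mdstat}
    # A "_" inside the status field, e.g. [U_], marks a missing or unsynced member.
    if grep -Eq '\[[U_]*_[U_]*\]' "$mdstat"; then
        return 1
    fi
    # A running or pending rebuild shows up as a resync or recovery line.
    if grep -Eq 'resync|recovery' "$mdstat"; then
        return 1
    fi
    return 0
}

# Usage: block until both mirrors are complete.
#   until md_all_synced; do sleep 30; done
```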

Once the raid is fully synchronised according to /proc/mdstat we can update the grub records and ensure that they are present in the boot sectors of both disks. This is done from the grub shell by running "grub":

    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ]

grub> find /boot/grub/stage1

grub> root (hd1,0)
Filesystem type is reiserfs, partition type 0xfd

grub> setup (hd1)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
Running "embed /boot/grub/reiserfs_stage1_5 (hd1)"... 23 sectors are embedded.
Running "install /boot/grub/stage1 (hd1) (hd1)1+23 p (hd1,0)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
grub> root (hd0,0)
 Filesystem type is reiserfs, partition type 0xfd

grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
Running "embed /boot/grub/reiserfs_stage1_5 (hd0)"... 23 sectors are embedded.
Running "install /boot/grub/stage1 (hd0) (hd0)1+23 p (hd0,0)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded

grub> quit

You should now be in a position to reboot the system and it should be able to boot straight off the raid set. For me things went a little wrong while I was writing this procedure as I messed up the lvm filter line, so I had to boot off rescue media (CD/PXE), and for some reason my md devices were shuffled around a bit and rootlv ended up on /dev/md127, as you may have noticed from some of the above output.

Edited on: Sat, Sep 24, 2011 17:16

openSuSE 11.1 / SLES11 and add-on repositories

Posted on Thu, Jul 15, 2010 at 13:14 by Hubertus A. Haniel

In the past you could add a file called "add_on_products" to the root of the installation source. This has now changed to an XML format and the file is called add_on_products.xml in OpenSuSE 11.1 or SLES11:

<?xml version="1.0"?>
<add_on_products xmlns="http://www.suse.com/1.0/yast2ns"
    xmlns:config="http://www.suse.com/1.0/configns">
    <product_items config:type="list">
        <product_item>
            <name>11.1 updates</name>
            <ask_user config:type="boolean">false</ask_user>
            <selected config:type="boolean">true</selected>
        </product_item>
        <!-- Another product item -->
        <product_item />
    </product_items>
</add_on_products>

To be able to use the file you also have to sign it and make sure the signature is available in the installer like so:

    sha1sum add_on_products.xml > SHA1SUMS

Sign it with your GPG Key:

    gpg -b --sign --armor SHA1SUMS

A file SHA1SUMS.asc will be created which contains the signature for the SHA1SUMS file. That means, if you change the SHA1SUMS file from now on, you have to recreate the SHA1SUMS.asc file too.

The installer needs to know your public gpg key now, so it can check the signature of that file. You need to add your public gpg key to the initrd AND you have to store it in a file called SHA1SUMS.key. First of all you need to export your public gpg key like this:

    gpg --export --armor $KEYID > SHA1SUMS.key

Now update the YaST directory listing:

    ls > directory.yast

Copy the key file to a file with a gpg extension:

    cp SHA1SUMS.key my-key.gpg
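
As the checksum, the signature and the directory listing all have to be redone whenever add_on_products.xml changes, the steps in the repository directory can be wrapped in a small function. This is only a sketch: refresh_addon_metadata is a made-up name, the KEYID environment variable and the use of gpg's --local-user option to pick the key are my own additions, and signing is skipped when KEYID is not set.

```shell
#!/bin/sh
# Hypothetical helper: refresh the metadata in an installation source directory.
# $1 = directory containing add_on_products.xml
# KEYID (optional, environment) = GPG key to sign with; no signing when unset.
refresh_addon_metadata() {
    (
        cd "$1" || exit 1
        sha1sum add_on_products.xml > SHA1SUMS
        if [ -n "${KEYID:-}" ]; then
            rm -f SHA1SUMS.asc              # avoid gpg's overwrite prompt
            gpg --local-user "$KEYID" -b --sign --armor SHA1SUMS
            gpg --export --armor "$KEYID" > SHA1SUMS.key
        fi
        ls > directory.yast                 # same listing command as above
    )
}

# Usage:
#   KEYID=<your key id> refresh_addon_metadata /path/to/installation/source
```

Running "sha1sum -c SHA1SUMS" in the same directory afterwards confirms that the checksum file matches the XML.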

Now you have to add that key to the initrd in /boot/i386/loader/initrd on the DVD or on your tftp server for PXE booting. Add the key like this:

	mv initrd initrd.gz
	gunzip initrd.gz
	find my-key.gpg | cpio -o -A -F initrd -H newc
	gzip initrd
	mv initrd.gz initrd
Edited on: Sat, Sep 24, 2011 15:43
