Hubba's Blog

Notes from a Linux/Unix Engineer

Archive for the HowTo category

Managing AD Computer Accounts with adcli and kerberos on Linux

Posted on Mon, Jun 02, 2025 at 12:12 by Hubertus A. Haniel

When configuring Samba on Linux against active directory these steps are part of this as well but you may just want to use kerberos on its own so these are the initial streps to get it working on RHEL8/9

First you need to install the krb5-workstation and adcli packages which should be available in the default repos.

Then you need to configure /etc/krb5.conf to reflect your AD domain (mine is upnor.localnet.lan)

    
 includedir /etc/krb5.conf.d/

[logging]
    default = FILE:/var/log/krb5libs.log
    kdc = FILE:/var/log/krb5kdc.log
    admin_server = FILE:/var/log/kadmind.log

[libdefaults]
    dns_lookup_realm = false
    dns_lookup_kdc = false
    ticket_lifetime = 24h
    renew_lifetime = 7d
    forwardable = true
    rdns = false
    pkinit_anchors = FILE:/etc/pki/tls/certs/ca-bundle.crt
    spake_preauth_groups = edwards25519
    default_realm = UPNOR.LOCALNET.LAN
    default_ccache_name = KEYRING:persistent:%{uid}

[realms]
  UPNOR.LOCALNET.LAN = {
  kdc = 192.168.0.10
  }

Now we need to join the domain and for this the command is something like:

 adcli join -v --domain "upnor.localnet.lan" -U <userid> -O OU=Unix\ Samba\ Servers,OU=SERVERS,DC=upnor,DC=localnet,DC=lan
  

Note that the OU stuff seems back to front to what it shows in the Windows Active Directory GUI where my OU or path is "\SERVERS\Unix Samba Servers" and you obviously have to escape the spaces with \ - The userid needs to be somebody that has the rights to manage computer accounts in that OU. - This has to be run as root.

The command will create the computer account and the /etc/krb5.keytab file.

You should now be able to get a kerberos ticket with "kinit <userid>"

Now we are in a position to run other commands and we can authenticate against AD with the kerberos ticket (-C option)

So we can for example create a SPN for our host (again as root as /etc/krb5.keytab will get modified)

 adcli update --add-service-principal=cifs/alias.upnor.localnet.lan --domain "upnor.localnet.lan" -v -C
  

alias.localnet.lan is an alias to my server running samba and we may need this to authenticate against samba on this server using this alias. - All these commands I have run in verbose mode (-v) as with this command I noticed that while adding an SPN where the update in AD failed but it still carried on updating the local keytab file.

We should be able to query the SPN from a windows client using "setspn -T upnor.localnet.lan -Q */alias.upnor.localnet.lan"

We can also pre-set a computer account for another server that may not have adcli installed but we want to join the domain using samba with "net ads join -U <userid>" because samba for some reason does not create computer accounts and certainly can not create them in a specific OU:

 adcli preset-computer <other server name> -domain "upnor.localnet.lan" -U <userid> -O OU=Unix\ Samba\ Servers,OU=SERVERS,DC=upnor,DC=localnet,DC=lan -v -C
  

The only bit I can not figure out is how to edit the SPN's for a remote host like you can with setspn in windows - I have, without success, tried various combinations to archive the same as:

 setspn -S http/daserver daserver1
   It will register SPN "http/daserver" for computer "daserver1"
    if no such SPN exists in the domain
 setspn -D http/daserver daserver1
   It will delete SPN "http/daserver" for computer "daserver1"   

If you work it out - let me know and I will add it here!

Edited on: Mon, Jun 02, 2025 13:30

Posted in HowTo (RSS), System - Linux (RSS), System - Windows (RSS)

Version comparison using rpm

Posted on Thu, Feb 20, 2025 at 11:35 by Hubertus A. Haniel

I have been playing a little bit with ChatGPT and its code generators and while doing this I stumbled across this. I wrote about versiion comparison in a previos post which is sort of a common thing that keeps coming up and I have used the function that I refer to there lots of times. It seems that rpm actually has a build in function to do this which returns result codes so you can refer to this with a function like this but obviously it will not work on other platforms and it seems that this has not been available on all rpm versions but I do not know when it was introduced:

compare_rpm_versions() {
    local version1="$1"
    local version2="$2"
    
    if [[ -z "$version1" || -z "$version2" ]]; then
        printf "Error: Two versions must be provided\n" >&2
        return 1
    fi

    if ! command -v rpm &>/dev/null; then
        printf "Error: rpm command not found\n" >&2
        return 2
    fi

    if rpm --eval "%{lua: print(rpm.vercmp('$version1', '$version2'))}" &>/dev/null; then
        local result; result=$(rpm --eval "%{lua: print(rpm.vercmp('$version1', '$version2'))}")
        case "$result" in
            1)  printf "%s is newer than %s\n" "$version1" "$version2"; return 0 ;;
            0)  printf "%s and %s are identical\n" "$version1" "$version2"; return 0 ;;
            -1) printf "%s is older than %s\n" "$version1" "$version2"; return 0 ;;
            *)  printf "Error: Unexpected comparison result: %s\n" "$result" >&2; return 3 ;;
        esac
    else
        printf "Error: Failed to compare versions\n" >&2
        return 4
    fi
}    
  
Edited on: Wed, Jun 04, 2025 15:00

Posted in HowTo (RSS), Packaging (RPM) (RSS), Shell Scripting (RSS), System - Linux (RSS)

Large File System out of lots of small chunks of free space.....

Posted on Thu, Dec 05, 2024 at 10:03 by Hubertus A. Haniel

DON'T TRY THIS AT WORK!
Just because it is possible it does not mean it is good practice!
I would not endorse this trickery/hack in a commercial environment.

This would not be supported in an enterprise environment and is probably not the safest way of keeping your data so I would not recommend this in a critical environment with critical data. On top of this not being the safest way to store your data it will also not be very efficient and I would expect a performance impact as a result of this. So this is just a bit of fun and may just help you out with a temporary fix to get you over a hurdle. I have done this on RHEL9 but it will work in the same way on other Linux distributions

Lets say you have a system with lots of file systems of which the size does not really matter but there are a few gigabytes here and there and you may even be able to add NFS mounted stuff although if there is a network failure you may end up with corruptions when the NFS parts fail. You can not shrink or rearrange the file systems to free up enough to store a larger file. In this example:

/filesystem-01
/filesystem-02
/filesystem-03
/filesystem-04

It is irrelevant of how big these file systems are but let say each of these have only about 25gb free but you want to create a file that is in the region of 80gb to 100gb for what ever reason. So lets go ahead and in each of these file systems we will create a sparse file (We can just create a full size file but a sparse file is faster to create. You will find that after that step the real space is not actually being used yet but with ls -al you will see the file size is shown as what it could grow to.)

dd if=/dev/zero of=/filesystem-01/filestore.img bs=1 count=0 seek=25G
dd if=/dev/zero of=/filesystem-02/filestore.img bs=1 count=0 seek=25G
dd if=/dev/zero of=/filesystem-03/filestore.img bs=1 count=0 seek=25G
dd if=/dev/zero of=/filesystem-04/filestore.img bs=1 count=0 seek=25G

Next we will create loop back devices that point to these files:

losetup --show -f /filesystem-01/filestore.img
losetup --show -f /filesystem-02/filestore.img
losetup --show -f /filesystem-03/filestore.img
losetup --show -f /filesystem-04/filestore.img

These will most likely end up being /dev/loop0 through to /dev/loop3 but if you have other loop back stuff mounted it may differ. The command "losetup -a" will list them for you.

We can now create a raid device on top of these loop devices with level raid 0 to have on continuous device:

mdadm -C /dev/md/filestore -l 0 -n 4 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3

You can now treat /dev/md/filestore like a normal disk device. So you can partition it create one large filesystem on it.

When you want to unmount it (before rebooting for example although Linux may do it for you the steps are:

  1. Unmount the file system
  2. Stop the raid device ("mdadm --stop /dev/md/filestore"
  3. Remove the loop devices ("losetup -d <for each device created>")

To remount the device again the steps are:

  1. "losetup --show -f" for each of the image files as above
  2. "mdadm --assemble /dev/md/filestore /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3"
  3. mount the device as previously

If you are going to keep this setup for a longer time you may want to script the above to ensure it gets done on boot etc. - The same is probably archievable just with plain LVM but I have not attempted that and I suspect it may be more intrusive as you may have to modify lvm.conf to scan the loop devices. You also run in the danger of messing up your lvm meta data so I did not want to take the risk on creating volume groups on files that are already on top of a volume group which is the case on my system. You also would not want to extend existing logical volumes on to files bearing in mind that you may not be able to shrink stuff down when you want to remove it again.

Edited on: Thu, Dec 05, 2024 13:42

Posted in HowTo (RSS), System - Linux (RSS)

AIX 7.1 to 7.2 upgrade

Posted on Wed, Mar 13, 2024 at 15:23 by Hubertus A. Haniel

Note: This guide is not aimed as a step by step command reference as systems may be configured in different way - It is more of a reminder of the steps that are involved and it is still a work in progress guide.

One should familiarize themselves with the following articles:

The assumption is that we are working on an AIX server where rootvg is mirrored across hdisk0 and hdisk1. Just to ensure that the boot partitions are up to date it is advisable to execute "bosboot -ad /dev/hdisk0" and the same for hdisk1.

It should be ensured that we have an up to date mksysb or we should create one preferably on a NIM server that we can boot of and recover this image.

We will now have to break the rootvg mirror using "unmirrorvg rootvg hdisk1" which will now free up hdisk1.

Now we can use "alt_disk_copy -d hdisk1" to create an alternative rootdisk copy which is NOT a mirror but a copy in itself. - This will create a copy and set hdisk1 as a bootdisk.

After the copy has completed we can reboot the server and we should now see that hdisk1 has become the active rootvg and hdisk0 is in a volume group called old_rootvg.

While running on this rootvg copy we should upgrade any components that may need to be upgraded as a pre requisite to AIX 7.2 eg Veritas Filesystems if they are in use.

To do the migration to 7.2 we need to boot of the 7.2 boot media from NIM or CD/DVD and the NIM server may need to be prepped for that by adding the NIM client for a bos_install.

Once we are successfully booting of the media (remember the LED lights so you can see the process) we should be prompted on the console to press 1 to recognise that we are on the correct console. We may also be prompted for a language selection.

We should then be dropped into the install menu which by default should have chosen "Migration" rather then "Full install" - This can be checked in the advanced install configuration menu and one should also make sure that the correct disk is selected which in our case should be hdisk1 but the default may go for hdisk0 which we do not want to touch.

After the migration the server should then boot into AIX 7.2 on hdisk1 and once we have confirmed that everything is OK we can remove old_rootvg and fully remirror the disks not forgetting bosboot on all mirrors to make sure the boot sector is populated.

Edited on: Wed, Mar 13, 2024 15:55

Posted in HowTo (RSS), System - AIX (RSS)

SSH troubleshooting

Posted on Fri, Feb 16, 2024 at 11:40 by Hubertus A. Haniel

When SSH issues are reported it is all to tempting to jump on a box make changes to the config file to fix the suspected issues and restarting sshd.

This may not always be the best way because:

  • Error messages in syslog may be misleading as it is difficult to track down an individual session and debug messages may be filtered out in syslog
  • On a busy system other users that still work may get disconnected/locked out while the problem is being worked on
  • In the worst case you will get disconnected and will not get back into the system other than via the console

The way to avoid this is to start ssh with the "-d" option which will start sshd in debugging mode and it will listen for one session only. If required multiple -dd (up to three) can be specified to increase the debugging level. Obviously the running ssh session is already listening on port 22 so we do not interfere with that so we need to specify a different port to listen on that is not in use with the -p option. For security reasons build into sshd you must run sshd with the full path of where it is installed.

# /usr/sbin/sshd -ddd -p 2222
debug2: load_server_config: filename /etc/ssh/sshd_config
debug2: load_server_config: done config len = 595
debug2: parse_server_config: config /etc/ssh/sshd_config len 595
debug3: /etc/ssh/sshd_config:21 setting Protocol 2
debug3: /etc/ssh/sshd_config:36 setting SyslogFacility AUTHPRIV
debug3: /etc/ssh/sshd_config:66 setting PasswordAuthentication yes
debug3: /etc/ssh/sshd_config:70 setting ChallengeResponseAuthentication no
debug3: /etc/ssh/sshd_config:80 setting GSSAPIAuthentication yes
debug3: /etc/ssh/sshd_config:82 setting GSSAPICleanupCredentials yes
debug3: /etc/ssh/sshd_config:96 setting UsePAM yes
debug3: /etc/ssh/sshd_config:99 setting AcceptEnv LANG LC_CTYPE LC_NUMERIC LC_TIME LC_COLLATE LC_MONETARY LC_MESSAGES
debug3: /etc/ssh/sshd_config:100 setting AcceptEnv LC_PAPER LC_NAME LC_ADDRESS LC_TELEPHONE LC_MEASUREMENT
debug3: /etc/ssh/sshd_config:101 setting AcceptEnv LC_IDENTIFICATION LC_ALL LANGUAGE
debug3: /etc/ssh/sshd_config:102 setting AcceptEnv XMODIFIERS
debug3: /etc/ssh/sshd_config:108 setting X11Forwarding yes
debug3: /etc/ssh/sshd_config:131 setting Subsystem sftp /usr/libexec/openssh/sftp-server
debug3: /etc/ssh/sshd_config:138 setting PermitRootLogin without-password
debug1: sshd version OpenSSH_5.3p1
debug3: Not a RSA1 key file /etc/ssh/ssh_host_rsa_key.
debug1: read PEM private key done: type RSA
debug1: private host key: #0 type 1 RSA
debug3: Not a RSA1 key file /etc/ssh/ssh_host_dsa_key.
debug1: read PEM private key done: type DSA
debug1: private host key: #1 type 2 DSA
debug1: rexec_argv[0]='/usr/sbin/sshd'
debug1: rexec_argv[1]='-ddd'
debug1: rexec_argv[2]='-p'
debug1: rexec_argv[3]='2222'
debug2: fd 3 setting O_NONBLOCK
debug1: Bind to port 2222 on 0.0.0.0.
Server listening on 0.0.0.0 port 2222.
debug2: fd 4 setting O_NONBLOCK
debug1: Bind to port 2222 on ::.
Server listening on :: port 2222.

Now the user can connect to that port with something like "ssh -p 2222 user@host" which will then give us detailed information of what is happening with that connection.

To make changes to the config file and to debug/test these changes it is best to make a copy of the existing config file and edit this file instead so we copy the config with something like:

# cp -a /etc/ssh/sshd_config /etc/ssh/sshd_config.TEST

Then we can start a session using this config file with:

# /usr/sbin/sshd -ddd -p 2222 -f /etc/ssh/sshd_config.TEST
debug2: load_server_config: filename /etc/ssh/sshd_config.TEST

Once confident that our changes are safe and they will not break anything else we can copy the changes to the real config file and restart the main ssh daemon on the system.

Edited on: Fri, Feb 16, 2024 12:36

Posted in HowTo (RSS), System - AIX (RSS), System - Linux (RSS), System - Solaris (RSS)

How to recover the hscroot password on an HMC

Posted on Sat, Dec 02, 2023 at 15:01 by Hubertus A. Haniel

We recently where locked out of an old HMC that is no longer supported but had problems with an LPAR attached to it which we are trying to get rid off so IBM where not very helpful to get us into it so I found the below procedure somewhere on the internet and now tested it on V7R7.9.0 before I will do on the real system by installing the HMC code on a VMware system - for that some hacks that are described at http://omnitech.net/reference/2013/05/01/installing-hmc-in-virtualbox/ where necessary I have lost the link to the procedure below so I am putting it here in case somebody else needs it. I did not actually have to follow the full procedure as init=/bin/rcpwsh prompted me to change the hscroot password rather than dropping me into a shell and then continued to fully boot at which stage I could just could log straight back in.

Anybody familiar with Linux will probably be familar with this procedure as it works on most linux distributions with init=/bin/bash. In case of the HMC I just was not sure how locked down these devices where and would I be challanged with encrypted filesystems and stuff like that....

Here we go:

1) Power off the HMC.

2) Power on the HMC, and as soon as the Loading grub message is displayed

quickly press the F1 key to get into grub.

The Grub menu will show one line with the text hmc.

3) On the Grub menu, select e for edit. The next GRUB screen is displayed with two lines:

root (hd0,0)

kernel (hd0,1)/boot/bzImage ro root=/dev/hda2 vga=0x317 apm=power-off

Note: The root device can vary by model: hda2 C03, C04, CR2, and hdc2 for CR3.

4) Move the cursor down to the line starting with kernel. Select e for edit.

Move the cursor to the right and append the following to the end of the string:

V5.1.0 to V6.1.1: init=/bin/bash

V6.1.2 and later: init=/bin/rcpwsh

The final string will vary slightly by version and model:

kernel (hd0,1)/boot/bzImage ro root=/dev/hda2 vga=0x317 apm=power-off init=/bin/rcpwsh

Press the Enter key to save the changes.

5) Press b to boot the changed selection.

This will boot to a bash shell on older HMC's - On newer HMC's this willl prompt you for a new hscroot password after the kernel is loaded and after changing the password it will continue to boot so you can skip the next steps until step 9. You may want to choose a simple password as the keyboard mapping may not match your locale if you are outside the US.

6) Verify root is mounted read/write. If not you may need to rmount it with

mount -o remount,rw /dev/hda2 /

Note: The root device can vary by model: hda2 C03, C04; hdc2 for CR2,CR3; sda2 for CR4.

7) Reset root and hscroot passwords. Run the following commands to reset the passwords. The command will prompt the user to enter the new password and a confirmation password. Any warning concerning the password being too simplistic can be ignored.

Reset root password:

/usr/bin/passwd

Reset hscroot password:

/usr/bin/passwd hscroot

8.) Reboot the HMC (left ctl+left alt+del).

9) Log on as hscroot.

10) Immediately after logon, use the Web-based System Manager (HMC GUI) or the chhmcusr

Edited on: Wed, Mar 13, 2024 15:56

Posted in HowTo (RSS), System - AIX (RSS)

Colour output in your scripts

Posted on Wed, Aug 02, 2023 at 13:03 by Hubertus A. Haniel

On Linux I have been using tput to produce colours in my output but then I noticed the other day that this does not actually seem to work on Solaris but I am not sure why so I had to resort to the old fashioned way of using escape sequences. This works perfectly fine in Linux:

#!/bin/bash
GREEN=$(tput setaf 2)
RED=$(tput setaf 1)
YELLOW=$(tput setaf 3)
NOCOL=$(tput sgr0)

echo "This works in Linux...."
echo "This is ${GREEN} Green${NOCOL} in Green"
echo "This is ${RED} Red${NOCOL} in Red"
echo "This is ${YELLOW} Yellow${NOCOL} in Yellow"
echo ""

So on Solaris this would be done like this (And this also works on Linux):

#!/bin/bash
GREEN="\033[0;32m"
RED="\033[0;31m"
YELLOW="\033[0;33m"
NOCOL="\033[0m"

echo "This works in Linux and Solaris...."
echo -e "This is ${GREEN} Green${NOCOL} in Green"
echo -e "This is ${RED} Red${NOCOL} in Red"
echo -e "This is ${YELLOW} Yellow${NOCOL} in Yellow"

    

So I guess I am going to have to stick to the second method to make my stuff work across platforms - From the script bits above you can see that a font effect is turned on with a code and you will have you will have to use a reset code "\033[0m" to turn it back off. The \033 ANSI escape sequence has a lot of codes to go in hand with it to do all sort of clever effects.

    echo -e "\033[31;1;4mHello\033[0m"

This example above has a comma separated list of codes so you got 31 for red, 1 for bold and 4 for underline and all this is cleared again with 0

This is a table that lists all the effect codes:

Code Effect Note
0 Reset / Normal all attributes off
1 Bold or increased intensity
2 Faint (decreased intensity) Not widely supported.
3 Italic Not widely supported. Sometimes treated as inverse.
4 Underline
5 Slow Blink less than 150 per minute
6 Rapid Blink MS-DOS ANSI.SYS; 150+ per minute; not widely supported
7 [[reverse video]] swap foreground and background colors
8 Conceal Not widely supported.
9 Crossed-out Characters legible, but marked for deletion. Not widely supported.
10 Primary(default) font
11–19 Alternate font Select alternate font n-10
20 Fraktur hardly ever supported
21 Bold off or Double Underline Bold off not widely supported; double underline hardly ever supported.
22 Normal color or intensity Neither bold nor faint
23 Not italic, not Fraktur
24 Underline off Not singly or doubly underlined
25 Blink off
27 Inverse off
28 Reveal conceal off
29 Not crossed out
30–37 Set foreground color See color table below
38 Set foreground color Next arguments are 5;<n> or 2;<r>;<g>;<b>, see below
39 Default foreground color implementation defined (according to standard)
40–47 Set background color See color table below
48 Set background color Next arguments are 5;<n> or 2;<r>;<g>;<b>, see below
49 Default background color implementation defined (according to standard)
51 Framed
52 Encircled
53 Overlined
54 Not framed or encircled
55 Not overlined
60 ideogram underline hardly ever supported
61 ideogram double underline hardly ever supported
62 ideogram overline hardly ever supported
63 ideogram double overline hardly ever supported
64 ideogram stress marking hardly ever supported
65 ideogram attributes off reset the effects of all of 60-64
90–97 Set bright foreground color aixterm (not in standard)
100–107 Set bright background color aixterm (not in standard)

    

The table below lists the basic 8bit color table which should be sufficient for most cases - there are plenty of other sources to give you 256 colours but in most cases that would not be required

Edited on: Thu, Dec 05, 2024 16:29

Posted in HowTo (RSS), Shell Scripting (RSS), System - AIX (RSS), System - Linux (RSS), System - Solaris (RSS)

AIX for System Administrators

Posted on Thu, Jun 08, 2023 at 10:50 by Hubertus A. Haniel

While searching for some specific information on AIX today I came across a really good site which I thought I bookmark here as it contains lots of useful information: - https://aix4admins.blogspot.com/

I hope it will help others as well.

Edited on: Wed, Aug 02, 2023 14:33

Posted in HowTo (RSS), System - AIX (RSS)

USB drive going to sleep on Linux? - How to prevent that?

Posted on Mon, Mar 20, 2023 at 14:40 by Hubertus A. Haniel

I have recently been experimenting with some stuff that needed quite a bit of storage but was not throughput intensive so I thought I use some old drives I had lying around with some USB to SATA adapters. It was working like a charm until I discovered that the application was complaining about latency which I eventually pinned down to the drives spinning down if not used for a few minutes. So any time I wanted to read data in short moments but stretched apart for a few minutes I had a delay while the drive was spinning up of about 5 to 10 seconds.

Normally I disable this in the BIOS for all drives because I don't think it actually is very healthy for the drive constantly spinning up and down but that is another debate...

It turned out that the disks actually have a kind of a BIOS setting on their own that is used when they are on USB or not controlled via some other way. - The tools that will be of help here are smartctl, hdparm and sdparm. Unfortunately there is no telling of which tools will work best in your scenario since some of the tools work differently on different models of drives and even more so if they are hanging of USB.

In some cases sdparm is your friend:

sdparm --all /dev/sda | grep STANDBY
  STANDBY_Y     0  [cha: n, def:  0, sav:  0]
  STANDBY       1  [cha: y, def:  1, sav:  0]

The above has power saving enabled - to disable this we can do:

sdparm --clear=STANDBY /dev/sda

Which should set the previous to:

sdparm --all /dev/sda | grep STANDBY
      STANDBY_Y     0 [cha: n, def: 0, sav: 0]
      STANDBY       0 [cha: y, def: 1, sav: 0]

I have also seen in some cases a STANDBY_Z variable that was set to 1 and clearing that did the trick.

The more elegant way is using hdparm but it would not work on all of my drives

 hdparm -B /dev/sdb
   /dev/sdb:
   APM_level = 128

The lower the number of APM_level is the more aggressive is the power management vs I/O numbers below 128 permit spin down and any higher numbers up to 254 prevent spin down - 255 will disable Power Management:

hdparm -B255 /dev/sdb
/dev/sdb:
 setting Advanced Power Management level to disabled
SG_IO: bad/missing sense data, sb[]:  70 00 01 00 00 00 00 0a 00 00 00 00 00 1d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
APM_level = off

There is also an hdparm -C and -S option - The -C is used to query the APM_level and -S 0 should diable power saving which did not work for me though - although take a look at the man page with the -Z option for Seagate drives.

I could not get smartcl to work reliably on my USB converter so I can not comment on the usage much as smartctl -g apm on any of my devices would just return unavailable.

Edited on: Mon, Mar 20, 2023 15:39

Posted in HowTo (RSS), System - Linux (RSS)

AIX - Get the serial number of the system.

Posted on Fri, Oct 28, 2022 at 11:55 by Hubertus A. Haniel

On AIX the serial number of the system can be retrieved multible ways. - "lsconf | head" will get you the serial number and you can grep it out of that output but actually this is quite an expensive call to make if you are doing this across a large estate to populate your inventory database. lsconf will actually go down and probe the hardware for things and will be quite in efficiant.

The serial number is actually stored on the local filesystem in the ODM which is the system registry for an AIX system and it is much more efficient to retrieve it from there using "odmget CuAt | grep -p systemid"

Posted in HowTo (RSS), System - AIX (RSS)

function cleanexit {} - Clean your shit!

Posted on Mon, Jun 20, 2022 at 11:42 by Hubertus A. Haniel

When writing shell scripts in bash that create temporary files I prefer to stick a clean exit function at the top of the script that runs the clean up no matter how the script exits - this should even remove files when the script got interrupted:

	function cleanexit
{
rm -f /var/tmp/tmpfile
}
trap cleanexit INT QUIT TERM EXIT

Let me know if you have a better idea!

Edited on: Wed, Aug 02, 2023 14:36

Posted in HowTo (RSS), Shell Scripting (RSS), System - AIX (RSS), System - Linux (RSS), System - Solaris (RSS)

How to recover broken grub2 boot on SLES15

Posted on Wed, Nov 25, 2020 at 12:28 by Hubertus A. Haniel

I am currently having issues where the boot of one of my VM's gets corrupted and I frequently have to recover it - I sometimes miss out mounting one of the required mount points so I noted them here to jolt my memory mainly but they may be helpful for others......

These are the steps I currently follow to recover the boot sector:

  1. - Boot the system from the DVD or PXE into rescue mode
  2. Mount the existing root file system under /mnt which is on /dev/rootvg/rootlv using "mount /dev/rootvg/rootlv /mnt"
  3. mount -t proc none /mnt/proc
  4. mount -t sysfs sys /mnt/sys
  5. mount -o bind /dev /mnt/dev
  6. chroot /mnt
  7. /sbin/mkinitrd (to recreate the initrd for the existing kernel
  8. /usr/sbin/grub2-mkconfig -o /boot/grub2/grub.cfg (To regenerate the grub config)
  9. /usr/sbin/grub2-install --force /dev/sda (/dev/sda is my boot disk)

Now exit the chroot jail and reboot the system and hopefully everything should be back to normal

Edited on: Wed, Nov 25, 2020 12:36

Posted in HowTo (RSS), System - Linux (RSS)

What is my WWN of my Storage Adapter

Posted on Tue, Mar 13, 2012 at 11:37 by Hubertus A. Haniel

I am frequently being asked on how one finds the WWN for the storage adapter on a modern Linux system with a 2.6 kernel.

This information used to be in /proc but now has been moved into /sys and can be found simply looking into /sys/class/fc_host/host*/portname (host* because you may have multible adapters eg host0 and host1)

On my system I get the following:

cat /sys/class/fc_host/host*/port_name
0x5000101000000414
0x5000101000000416

These numbers are required by your storage administrator to present storage to you in an enterprise class storage array.

Posted in HowTo (RSS), System - Linux (RSS)

Linux SCSI rescan? - Reboot!?

Posted on Tue, Dec 06, 2011 at 11:56 by Hubertus A. Haniel

Recently people keep asking me how do I add and remove storage from a Linux system running a 2.6 kernel and get linux to rescan the SCSI bus and add (or remove) storage dynamically with out rebooting.

So here is how to do it:

1 - Find the host number for the HBA:

ls /sys/class/fc_host/

You will have something like host1 or host2

2 - Ask the HBA to issue a LIP signal to rescan the FC bus:

echo 1 > /sys/class/fc_host/host1/issue_lip

3 - Wait for a few seconds for the LIP command to complete

4 - Ask the linux kernel to rescan the SCSI devices on that HBA

echo "- - -" > /sys/class/scsi_host/host1/scan

( - - - means every channel, every target and every lun )

Edited on: Tue, Dec 06, 2011 12:07

Posted in HowTo (RSS), System - Linux (RSS)

LXC - Linux Containers on OpenSuSE/SLES

Posted on Wed, Oct 05, 2011 at 9:27 by Hubertus A. Haniel

Recently somebody pointed me at LXC so I thought I give it a try.

As I mainly work on SuSE/SLES - I attempted this following the documentation at http://en.opensuse.org/LXC with a little extra help from http://lxc.teegra.net unsing OpenSuSE 11.4

At the time of me writing this the OpenSuSE guide did work pretty well but I had to make a few adjustments to get the container running properly:

  • To get the network to start up properly I had to comment out the paragraph that sets the mode to "onboot" in /etc/init.d/network so during boot it is just called with start. - This is a bit of a hack and may break things if the networking setup is a little more complex then a single interface. I also adjusted my config slightly to use DHCP rather then static addresses as that is a little easier to handle in my test environment.
  • This is not mentioned in the documentation I found but autofs has problems within a container. - My homedirectory gets mounted from a NFS server and autofs just seemed to hang while a hard nfs mount in fstab would work just fine.
  • Booting the environment came up with lots of errors about udev so I re-enabled that even though the documentation mentions that it should be taken out.

I have the advantage that I have puppet in my environment to adjust the configuration of systems to suit my test environment but a few things to make LXC on SuSE viable in a proper environment would be:

  • LXC/YaST intigration so AutoYaST templates could be fed into a container creation.
  • Currently there are no LXC templates for OpenSuSE or only the frameworks so one would have to create proper templates to use the lxc-create/lxc-destroy commands to create and destroy containers on the fly.
  • LXC is part of the SLES11 distribution but there does not seem to be any documentation what Novell would support inside a container in a production environment especially since I had to hack startup scripts in /etc/init.d so I think the startup scripts would need to be properly adjusted to be aware to do the right things if they are running inside a container. Hacking the start up scripts is not really an option as those changes may get reversed out during patching.

Other then the above concerns and gotchas LXC is a very interesting project and has the potential of Solaris Zones and offers for Linux a full compliment of Virtualisation technologies alongside UML (Not realy used any more), Xen and KVM.

Posted in HowTo (RSS), System - Linux (RSS), Virtualization (RSS)

Creating upgradable rpm packages (-Uvh)

Posted on Sat, Sep 24, 2011 at 11:23 by Hubertus A. Haniel

When creating rpm packages one should bear in mind that during a package upgrade the pre and post install and uninstall scripts are also run and it can happen that modifications are executed that you may not want to execute during the upgrade eg removal of files in a postuninstall which will actually still be needed after the upgrade. I belive the postuninstall is actually being run after the package has been upgraded. To get around this one should use the following framework:

%pre
if [ "$1" = "1" ]; then
Do stuff that should happen during initial installation.
elif [ "$1" = "2" ]; then
Do stuff that should happen during upgrade.
fi

%post
if [ "$1" = "1" ]; then
Do stuff that should happen during initial installation.
elif [ "$1" = "2" ]; then
Do stuff that should happen during upgrade.
fi

%preun
if [ "$1" = "0" ]; then
Do stuff that should happen during removal of the package.
elif [ "$1" = "1" ]; then
Do stuff that should happen during upgrade.
fi

%postun
if [ "$1" = "0" ]; then
Do stuff that should happen during removal of the package.
elif [ "$1" = "1" ]; then
Do stuff that should happen during upgrade.
fi



Edited on: Tue, Mar 13, 2012 11:34

Posted in HowTo (RSS), Packaging (RPM) (RSS), System - Linux (RSS)

Verification-time script in rpm

Posted on Fri, Sep 23, 2011 at 18:14 by Hubertus A. Haniel

This could be useful - found this in the rpm documentation while I was searching for something else so I thought I put it here for later :)

The %verifyscript executes whenever the installed package is verified by RPM's verification command. The contents of this script is entirely up to the package builder, but in general the script should do whatever is necessary to verify the package's proper installation. Since RPM automatically verifies the existence of a package's files, along with other file attributes, the %verifyscript should concentrate on different aspects of the package's installation. For example, the script may ensure that certain configuration files contain the proper information for the package being verified:

for n in ash bsh; do
   echo -n "Looking for $n in /etc/shells... "
   if ! grep "^/bin/${n}\$" /etc/shells > /dev/null; then 
      echo "missing"
      echo "${n} missing from /etc/shells" >&2
   else
      echo "found"
   fi
done

In this script, the config file /etc/shells, is checked to ensure that it has entries for the shells provided by this package.

It is worth noting that the script sends informational and error messages to stdout, and error messages only to stderr. Normally RPM will only display error output from a verification script; the output sent to stdout is only displayed when the verification is run in verbose mode.

Edited on: Sat, Sep 24, 2011 18:39

Posted in HowTo (RSS), Packaging (RPM) (RSS), System - Linux (RSS)

nosrc.rpm files

Posted on Thu, Sep 22, 2011 at 17:43 by Hubertus A. Haniel

It is rather often that we have to package up binaries in rpm format for which there are no sources available and the vendor does not supply them in rpm format. - For this the best way to do this is using the NoSource tag in the SPEC file,

When building the RPM the result is apart from the package you want is also that rather then having a src.rpm as the source rpm you end up with a nosrc.rpm file. This nosrc.rpm file will only contain the spec file and not the sources that have been specified. This can be quite handy to grab files out of a running system.

So in the header of the SPEC file you would have something like

Source0: some.binary.%{_target_cpu}
NoSource: 0

As this binary will be architecture specific in most cases you should also ensure that you set the ExclusiveArch tag to ensure that the package does not get rebuild with the wrong architecture as a target. In this case I have two architecture specific binaries and with the above %{_target_cpu} the correct one gets selected when I specify the --target parameter on rpmbuild. My ExclusiveArch tag is set as below:

ExclusiveArch: i586 x86_64

An example of a package where I have used this concept can be found here.

Edited on: Sat, Sep 24, 2011 18:38

Posted in HowTo (RSS), Packaging (RPM) (RSS), System - Linux (RSS)

OpenSuSE filesystem snapshots on btrfs

Posted on Mon, Sep 19, 2011 at 17:16 by Hubertus A. Haniel

Only recently did I find out about a new cool feature on the OpenSuSE Factory builds called snapper. It has been in the works for quite a while and was actually announced back in April here but it is only recently that I have tried these tools out while familiarising myself with btrfs

Snapper is reasonably easy to configure and there are quite a few guides on the OpenSuSE website. Once it is all configured it creates snapshots of your system on regular intervals via cron and also everytime you make modifications to your system using zypper and YaST but it requires that your root filesystem is btrfs. - The only drawback currently is that you can not boot of btrfs so you can not create snapshots of the /boot filesystem and therfore you would not be able to roll back kernel updates unless you replicate the contents of /boot into your btrfs filesystem before kernel upgrades.

A detailed guide on how to get snapper going which I followed is here but it should be noted that the subvolume for snapshots should be created as /.snapshots rather then /snapshots otherwise things will just not work.

So on your btrfs root filesystem or any filesystem that you want to snapshot you create a subvolume called .snapshots like so:

    
  btrfs subvolume create /.snapshots

Then you create your configuration in /etc/snapper/configs/root for your root filesystem. - I got the example out of /etc/snapper/config-templates/default

Then you add "root" to SNAPPER_CONFIGS in /etc/sysconfig/snapper

Now everything should be ready to work and you can try to create an initial snapshot with the following command:

    snapper create --description "initial"
Edited on: Sat, Sep 24, 2011 17:41

Posted in HowTo (RSS), System - Linux (RSS)

Mirroring a LVM based root disk to a second disk on Linux

Posted on Fri, Dec 17, 2010 at 16:02 by Hubertus A. Haniel

I did this on OpenSuSE 11.2 so on other systems you may have to adjust this procedure slightly.

Note: Make sure that kernel dumping is disabled as OpenSuSE 11.2 does not seem to be able to dump to a raided lvm setup. Also before you do this you should ensure that you have backups of everything as you could end up with a corupted system if you get this procedure wrong.

On my setup the system is installed on /dev/sda and my spare disk which I want to mirror to is /dev/sdb. I have two partitions with /dev/sda1 being the /boot filesystem and /dev/sda2 being under LVM control containing the root filesystem and swap space. Obviously on a raided system you may have to trade in a bit of performance against the disk redundancy as every write to the disks will happen twice.

You need to ensure that both disks are the same size or /dev/sdb is bigger then /dev/sda. In my system both disks are exactly the same size which is preferable as the disk geometry will be the same then.

You also want to ensure that any applications are shut down to minimize any other disk access to the system while you are doing this.

As a first step you need to re-label all your partitions on the primary disk to raid (type fd):

    
carling:~ # fdisk /dev/sda
 
The number of cylinders for this disk is set to 2610.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
(e.g., DOS FDISK, OS/2 FDISK)

Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)

The above example needs to be done for /dev/sda2 (partition 2) as well and then you should exit fdisk with the command "w" to write the partition table. You may get a warning that the new partition type is not used until the next re-boot but you can ignore that. Listing the partitions with sfdisk should still give you the new setup though which should look like this:


carling:~ # sfdisk -d /dev/sda
# partition table of /dev/sda
unit: sectors

/dev/sda1 : start= 63, size= 208782, Id=fd, bootable /dev/sda2 : start= 208845, size= 41720805, Id=fd /dev/sda3 : start= 0, size= 0, Id= 0 /dev/sda4 : start= 0, size= 0, Id= 0

Now we will copy the above partition table to the second disk using sfdisk again:

sfdisk -d /dev/sda > partitions.txt
sfdisk /dev/sdb < partitions.txt

At this stage you may get an error about /dev/sdb having an invalid DOS signature which can safely be ignored. You should check though that the bootable flag for /dev/sdb is set as below. This was not the case for me so I had to do this manually using fdisk:

carling:~ # fdisk -l /dev/sdb

Disk /dev/sdb: 21.5 GB, 21474836480 bytes
255 heads, 63 sectors/track, 2610 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x3dc7fd79
 
Device Boot Start End Blocks Id System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14        2610    20860402+  fd  Linux raid autodetect    

Next we create a degraded array on /dev/sdb NOT touching /dev/sda:

carling:~ # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 missing
mdadm: array /dev/md0 started.
 
Carling:~ # mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdb2 missing
mdadm: array /dev/md1 started.
 
carling:~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[0]
      1582336 blocks [2/1] [U_]

md0 : active raid1 sdb1[0]
      513984 blocks [2/1] [U_]

unused devices: <none>

Now create and copy /boot on /dev/md0 - in my case /boot is a reiserfs filesystem:

mkreiserfs /dev/md0
mount /dev/md0 /mnt
cp -a /boot/* /mnt/    

Now you need to edit /etc/fstab and change /dev/sda1 to /dev/md0 for the boot filesystem. After this is done you need to umount /mnt and /boot and do a mount /boot. The output of df should now show that boot is on /dev/md0 as below:

carling:~ # df -h /boot
Filesystem Size Used Avail Use% Mounted on
/dev/md0              102M   50M   53M  49% /boot    

Now we can migrate rootvg which in my case is the volume group for /dev/sda2 containing my root filesystem and swap space with the following commands to /dev/md1:

pvcreate /dev/md1
vgextend rootvg /dev/md1
pvmove /dev/sda2 /dev/md1
vgreduce rootvg /dev/sda2
pvremove /dev/sda2    

Once this is done we can attach the first disk to the new raid setup using the command "mdadm -a /dev/md0 /dev/sda1" and "mdadm -a /dev/md1 /dev/sda2". It will now take a while for your raid to syncronise with can be monitored by checking the contents of /proc/mdstat with cat. While the syncronisation of the raid completes you can carry on with modifying some files. You should create /etc/mdadm.conf with the following contents:

carling:~ # cat /etc/mdadm.conf
DEVICE partitions
ARRAY /dev/md0 level=raid1 devices=/dev/sdb1,/dev/sda1
ARRAY /dev/md1 level=raid1 devices=/dev/sdb2,/dev/sda2    

You also need to edit the filter line in /etc/lvm/lvm.conf to ensure that /dev/md1 is an accepted device for LVM otherwise you will not be able to boot. My line now reads after modification:

filter = [ "r|/dev/.*/by-path/.*|", "r|/dev/.*/by-id/.*|", "a/.*/" "a|/dev/md1|" ]    

I have also read that some people on some Linux distributions have had the problem that LVM would always find the SCSI devices before the MD devices and use them instead which may give some unpredictable results but I have not run into this issue on OpenSuSE 11.2. After you have modified /etc/lvm/lvm.conf you eed to run "vgscvan" to update the .cache file in the same directory.

Now you need to create a new initrd using "mkinitrd -f md" which will tell mkinitrd to include the needed md kernel modules in the initrd.

    
carling:~ # mkinitrd -f md
 
Kernel image: /boot/vmlinuz-2.6.31.8-0.1-default
Initrd image:   /boot/initrd-2.6.31.8-0.1-default
Root device: /dev/rootvg/rootlv (mounted on / as reiserfs)
Resume device: /dev/rootvg/swaplv
setup-md.sh: md127 found multiple times
Kernel Modules: scsi_transport_spi mptbase mptscsih mptspi libata ata_piix ata_generic ide-core piix ide-pci-generic hwmon thermal_sys processor thermal
fan dm-mod dm-snapshot rtc-lib rtc-core rtc-cmos reiserfs ohci-hcd ehci-hcd uhci-hcd hid usbhid raid0 raid1 xor async_tx async_memcpy async_xor raid6_pq
raid456 linear
Features: dm block usb md lvm2
19415 blocks
  

As seen above the feature list includes md and lvm2

Once the raid is fully syncronised according to /proc/mdstat we can update the grub records and ensure that they are present on both disks boot sectors. This is done from the grub shell by running "grub":

    GNU GRUB  version 0.97  (640K lower / 3072K upper memory)

[ Minimal BASH-like line editing is supported. For the first word, TAB
lists possible command completions. Anywhere else TAB lists the possible
completions of a device/filename. ]


grub> find /boot/grub/stage1
 (hd0,0)
 (hd1,0)

grub> root (hd1,0)
Filesystem type is reiserfs, partition type 0xfd

grub> setup (hd1)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
Running "embed /boot/grub/reiserfs_stage1_5 (hd1)"... 23 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd1) (hd1)1+23 p (hd1,0)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.
 
grub> root (hd0,0)
 Filesystem type is reiserfs, partition type 0xfd

grub> setup (hd0)
Checking if "/boot/grub/stage1" exists... yes
Checking if "/boot/grub/stage2" exists... yes
Checking if "/boot/grub/reiserfs_stage1_5" exists... yes
Running "embed /boot/grub/reiserfs_stage1_5 (hd0)"... 23 sectors are embedded.
succeeded
Running "install /boot/grub/stage1 (hd0) (hd0)1+23 p (hd0,0)/boot/grub/stage2 /boot/grub/menu.lst"... succeeded
Done.

grub> quit
  

You should now be in a position to reboot the system and it should be able to boot stright of the raid set. - for me things went a little wrong while I was writing this procedure as I messed up the lvm filter line so I had to boot of rescue media (CD/PXE) and for some reaso my md devices have shuffled around a bit and rootlv ended up on /dev/md127 as you may have noticed from some of the above output.

Edited on: Sat, Sep 24, 2011 17:16

Posted in HowTo (RSS), System - Linux (RSS)