Collecting Server Software Versions

You manage several servers running Linux distributions. Each server’s installation may be different, due to operation purpose or due to error. How do you know which ones need attention? Need upgrades?

Make your servers tell you their installed software lists. Each list shows what is installed, but comparing lists from several servers is not easy because they have lots of software. Reformat them into a table: one column for each server, one row for each software package.

In this article you’ll find sections addressing two different ways to get your software lists. One way works for Debian systems including Ubuntu. Another way works for Fedora systems including Red Hat Enterprise Linux.

Section 1: Acquiring Debian Lists

List installed Debian software using apt(8). Put lists into files named for servers:

$ apt list --installed >$(hostname)-Installed.txt

Collecting the lists is easy with only a couple of servers. Just log in to each server, run the command to save the list, then transfer the list file to your collection server. If you have many servers, run a loop on your collection server’s command line to get them.

for s in LIST-YOUR-SYSTEMS-HERE
do
  ssh -Y ${s} 'apt list --installed' >${s}-Installed.txt
done

On your collection server, this works if your login username is the same on all the systems listed after in. If not, include your username, such as yourloginname@${s}, for the relevant systems. If groups of systems share a username, you might need one loop for each group. But, if they all use different usernames, you’ll have to do this manually. When finished, you’ll have a collection of files, each named for its system.
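A lookup table is one way to keep a single loop when usernames differ. This is a minimal sketch, assuming bash 4 or later. The host names, the login names, and the helper name ssh_target are hypothetical.

```shell
#!/bin/bash
# Hypothetical map: host name -> login name on that host.
declare -A login=( [mysrv02]=admin [mysrv07]=ops )

# Print "user@host" when the host has an entry in the map, else just "host".
ssh_target() {
  local host=$1
  if [[ -n ${login[$host]:-} ]]; then
    echo "${login[$host]}@${host}"
  else
    echo "${host}"
  fi
}

# Inside the collection loop, the target would replace the bare ${s}:
#   ssh "$(ssh_target "${s}")" 'apt list --installed' >"${s}-Installed.txt"
for s in mysrv01 mysrv02 mysrv07
do
  echo "would connect to: $(ssh_target "${s}")"
done
```

Hosts missing from the map fall back to your local username, which is what ssh does by default.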

In each collected filename, a hyphen separates the hostname from the rest of the filename. That will be important later. If your hostnames use a hyphen, pick a different separating character, one that no system uses.
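If changing the separator is inconvenient, the server name can also be recovered by removing the fixed suffix instead of splitting on the hyphen. This is only a sketch of an alternative, assuming every collected file ends in -Installed.txt. The sample hostname web-01 is hypothetical. The rest of this article keeps splitting on the hyphen.

```shell
# Remove the known suffix; hyphens inside the hostname are left alone.
f='web-01-Installed.txt'
server=${f%-Installed.txt}
echo "${server}"
```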

Understanding Debian’s List

Here’s a sample of apt list output:

Listing...
acl/stable,now 2.2.52-2 amd64 [installed]
adduser/stable,now 3.113+nmu3 all [installed]
adwaita-icon-theme/stable,now 3.14.0-2 all [installed,automatic]
amd64-microcode/stable,now 2.20160316.1~deb8u1 amd64 [installed,automatic]
apt/stable,stable,now 1.0.9.8.4 amd64 [installed]
apt-listchanges/stable,now 2.85.13+nmu1 all [installed]
apt-transport-https/stable,stable,now 1.0.9.8.4 amd64 [installed]
apt-utils/stable,stable,now 1.0.9.8.4 amd64 [installed]
aptitude/stable,now 0.6.11-1+b1 amd64 [installed]
aptitude-common/stable,now 0.6.11-1 all [installed]
aptitude-doc-en/stable,now 0.6.11-1 all [installed,automatic]
aspell/stable,now 0.60.7~20110707-1.3 amd64 [installed,automatic]
aspell-en/stable,now 7.1-0-1.1 all [installed,automatic]
at/stable,now 3.1.16-1 amd64 [installed]
at-spi2-core/stable,now 2.14.0-1 amd64 [installed,automatic]
avahi-daemon/stable,now 0.6.31-5 amd64 [installed,automatic]

Each line follows the same format:

  • Application name
  • Slash (/)
  • Distribution conditions, e.g., “stable”, “unknown”, “now”
  • Space
  • Version number
  • Space
  • CPU class, e.g., “amd64”, “all”
  • Space
  • Condition in brackets, e.g., “[installed]”, “[upgradable…]”
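To see that format in action, here is a minimal sketch that takes one sample line apart using only shell features. The variable names are illustrative, not anything apt defines.

```shell
line='adwaita-icon-theme/stable,now 3.14.0-2 all [installed,automatic]'

name=${line%%/*}      # everything before the first slash
rest=${line#*/}       # everything after the first slash

# Split the remainder on spaces into the four remaining parts.
read -r dist version arch state <<EOF
${rest}
EOF

echo "name=${name} version=${version} arch=${arch} state=${state}"
```

Because state is the last variable given to read, it receives the rest of the line, so a bracketed condition that contains spaces stays in one piece.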

Isolating Debian’s List

Because the command asked for --installed, results show only what’s on the system, not everything available to install. You need to take the name up to the slash and the version in the second space-separated field. Test that as follows:

$ apt list --installed | cut -d/ -f1 | less
Listing...
acl
adduser
adwaita-icon-theme
amd64-microcode
apt
apt-listchanges
apt-transport-https
apt-utils
aptitude
aptitude-common
aptitude-doc-en
aspell
aspell-en
at
at-spi2-core
avahi-daemon

$ apt list --installed | cut -d' ' -f1,2 | less
Listing...
acl/stable,now 2.2.52-2
adduser/stable,now 3.113+nmu3
adwaita-icon-theme/stable,now 3.14.0-2
amd64-microcode/stable,now 2.20160316.1~deb8u1
apt/stable,stable,now 1.0.9.8.4
apt-listchanges/stable,now 2.85.13+nmu1
apt-transport-https/stable,stable,now 1.0.9.8.4
apt-utils/stable,stable,now 1.0.9.8.4
aptitude/stable,now 0.6.11-1+b1
aptitude-common/stable,now 0.6.11-1
aptitude-doc-en/stable,now 0.6.11-1
aspell/stable,now 0.60.7~20110707-1.3
aspell-en/stable,now 7.1-0-1.1
at/stable,now 3.1.16-1
at-spi2-core/stable,now 2.14.0-1
avahi-daemon/stable,now 0.6.31-5

With two different delimiters, turn the slash (/) into a space " ". It makes extraction easier, leaving one delimiter to get both name and version:

$ apt list --installed | tr '/' ' ' | cut -d' ' -f1,3
Listing...
acl 2.2.52-2
adduser 3.113+nmu3
adwaita-icon-theme 3.14.0-2
amd64-microcode 2.20160316.1~deb8u1
apt 1.0.9.8.4
apt-listchanges 2.85.13+nmu1
apt-transport-https 1.0.9.8.4
apt-utils 1.0.9.8.4
aptitude 0.6.11-1+b1
aptitude-common 0.6.11-1
aptitude-doc-en 0.6.11-1
aspell 0.60.7~20110707-1.3
aspell-en 7.1-0-1.1
at 3.1.16-1
at-spi2-core 2.14.0-1
avahi-daemon 0.6.31-5
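The same extraction can be sketched with awk(1) in one step. This is an alternative under stated assumptions, not a replacement for the tr and cut pipeline: it treats both the slash and the space as field separators and keeps only lines containing a slash, which drops the Listing... header. The function name name_version is hypothetical.

```shell
# Split on either a slash or a space; keep only lines that contain a slash,
# which skips the "Listing..." header.
name_version() {
  awk -F'[/ ]' '/\// { print $1, $3 }'
}

printf '%s\n' \
  'Listing...' \
  'acl/stable,now 2.2.52-2 amd64 [installed]' \
  'apt/stable,stable,now 1.0.9.8.4 amd64 [installed]' |
name_version
```

On a real system the input would come from apt list --installed instead of printf.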

Organizing Debian’s List

Now you have a list of all software names and version numbers. Specialized servers may have software that others don’t need. You can’t compare these lists without considering those differences.

First collect a list of package names across all servers before identifying the versions installed. Go back to cut -d/ -f1, but this time take every file collected from all the servers, sort the names, and eliminate any duplicates.

$ cut -d/ -f1 *-Installed.txt | sort -u

This gets you a list of all packages installed without regard to which server has it. Store the list in an array:

pkgs=( $(cut -d/ -f1 *-Installed.txt | sort -u) )

This holds all the package names in sorted order. In bash, like other programming languages, indexed arrays use the form array[index]=value. You select one element of an array with ${array[index]}. The pkgs array is this kind. The index, such as 0, selects one value, so echo ${pkgs[0]} gives acl from the example list.

An array’s index ranges from 0 to one less than the total elements in the array. If you had 1000 elements in an array, its total would be 1000, but its index would range from 0 to 999. Bash gives you an array’s total element count when you put a # in front of the name and use an asterisk (*) as the index, such as ${#pkgs[*]}. Think of # as saying, “Give me the number”, and * as saying, “for every element”. If you use an index number like 0 instead of the *, bash reports the number of characters stored in that element. In the example list, echo ${#pkgs[0]} would give 3 for acl.
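Here is a minimal sketch of those array operations using two small sample files in a temporary directory. The file contents are made up for illustration. One assumption differs from the command above: grep -h '/' keeps only lines with a slash, so the Listing... header that apt prints does not end up as an array element.

```shell
#!/bin/bash
tmp=$(mktemp -d)
printf '%s\n' 'Listing...' \
  'acl/stable,now 2.2.52-2 amd64 [installed]' \
  'apt/stable,now 1.0.9.8.4 amd64 [installed]' >"${tmp}/mysrv01-Installed.txt"
printf '%s\n' 'Listing...' \
  'acl/stable,now 2.2.52-2 amd64 [installed]' \
  'at/stable,now 3.1.16-1 amd64 [installed]' >"${tmp}/mysrv02-Installed.txt"

# Unique package names across both files, header lines filtered out.
pkgs=( $(grep -h '/' "${tmp}"/*-Installed.txt | cut -d/ -f1 | sort -u) )

echo "first element:   ${pkgs[0]}"
echo "element count:   ${#pkgs[*]}"
echo "length of first: ${#pkgs[0]}"
echo "last index:      $(( ${#pkgs[*]} - 1 ))"

rm -r "${tmp}"
```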

Associate Debian Names, Versions, & Systems

To see an example, use one package name. Put a trailing slash to end the name. Some packages add details to a name, such as aptitude vs aptitude-common vs aptitude-doc-en. To see only aptitude, use the same slash that apt terminates the name with:

$ egrep "^aptitude/" *-Installed.txt | tr ' ' '/' | cut -d/ -f1,3
mysrv01-Installed.txt:aptitude/0.6.11-1+b1
mysrv06-Installed.txt:aptitude/0.6.11-1+b1
mysrv07-Installed.txt:aptitude/0.6.11-1+b1
mysrv08-Installed.txt:aptitude/0.6.11-1+b1
mysrv10-Installed.txt:aptitude/0.6.8.2-1
mysrv11-Installed.txt:aptitude/0.6.8.2-1
mysrv12-Installed.txt:aptitude/0.6.8.2-1
mysrv13-Installed.txt:aptitude/0.6.11-1+b1
mysrv14-Installed.txt:aptitude/0.6.11-1+b1

Because multiple files were named on egrep’s command line, each matching instance starts with the filename and a colon. The trick is to turn this into array notation.

Bash has two types of arrays. The indexed array uses index numbers to select elements. The other type is called the associative array. It associates names with values. Its array storing format is array[name]=value. They store a name as the index and a value related to that name.

For the package collecting software, make the name a combination of filename:package, as in the egrep example up to the slash. Set the value from the version number.

Without doing anything special, creating an array in bash makes an indexed array. That’s bash’s default array. You must tell bash ahead of time that a certain variable will be an associative array:

declare -A sys

Typically, such declarations go at the beginning of a bash program. Now bash knows that an array named sys will be associative. You don’t have to say anything special to create an indexed array, but if you must differentiate, declare indexed arrays with -a (lowercase).
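A minimal sketch of an associative array, assuming bash 4 or later. The two entries are typed by hand here only to show the notation.

```shell
#!/bin/bash
declare -A sys                 # must come before the first assignment

sys[mysrv01-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv10-Installed.txt:aptitude]=0.6.8.2-1

echo "${sys[mysrv10-Installed.txt:aptitude]}"   # value stored under that name
echo "${#sys[*]}"                               # number of stored elements
```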

To create the format, sys[name]=value, turn the filename:package results into the name by preceding each with sys[. Closing with the ]= requires replacing the /. Do both these replacements at once using sed(1), the stream editor:

sed -e 's/^/sys[/' -e 's,/,]=,'

The sed -e option gives editing expressions. These use regular expressions, overviewed in regex(7), which can get complicated. This use is simple. A sed expression starting with s is a substitution: replace one string of characters with another. The first character after s signals the beginning of the find-string: the string of text characters to search for. Whatever that first character is — commonly the slash, but any character is possible — it becomes the separator between the parts. The next time the separator appears it ends the find-string and starts the replace-string: the string of text characters to replace the find-string. Terminate the replace-string with one last repetition of the separator character.

In the sed command above, the first -e expression uses a caret (^) symbol. This is a regular expression symbol for the beginning of a line. Signify line ends with a dollar sign ($). With a caret as the only find-string, this expression matches the beginning of every line. The replace-string is sys[ so every line will start with that.

The second -e expression in the above sed command uses a comma (,) instead of a slash as the separating character. When the find-string or replace-string uses a slash as a character to find or replace, you can’t use that slash as the separator. Pick another character that won’t appear in either the find-string or the replace-string. This one uses the comma. So, on each line, replace the first / with ]=. After cut, each line has only one slash.

Here’s the result:

$ egrep "^aptitude/" *-Installed.txt |
tr ' ' '/' |
cut -d/ -f1,3 |
sed -e 's/^/sys[/' -e 's,/,]=,'
sys[mysrv01-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv06-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv07-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv08-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv10-Installed.txt:aptitude]=0.6.8.2-1
sys[mysrv11-Installed.txt:aptitude]=0.6.8.2-1
sys[mysrv12-Installed.txt:aptitude]=0.6.8.2-1
sys[mysrv13-Installed.txt:aptitude]=0.6.11-1+b1
sys[mysrv14-Installed.txt:aptitude]=0.6.11-1+b1

Note: Long command lines can break after the pipe symbol. Doing so helps readability. Bash shows line continuation by changing the prompt from $ to >. The continuation prompts are not shown here, to aid copy/paste into your command line testing.

This output, after the sed line, shows the associative array syntax for assigning to the sys[] array, with filename:package as the index name and the version number as the value. All of these vary among the servers. But this only prints text in array notation. It shows the syntax; it doesn’t store anything in an array.

Storing the values in the array requires bash to evaluate this output. Bash’s eval command does this. Try it out:

$ declare -A sys

$ eval $(
egrep "^aptitude/" *-Installed.txt |
tr ' ' '/' |
cut -d/ -f1,3 |
sed -e 's/^/sys[/' -e 's,/,]=,'
)

Note: Long lines can break after an open parenthesis and don’t complete until the close parenthesis comes along.

First, declare the associative array named sys[]. Next, use eval to execute the assignment strings created in the subshell. How many elements are in the array?

$ echo ${#sys[*]}
9

What are all the values?

$ echo ${sys[*]}
0.6.11-1+b1 0.6.11-1+b1 0.6.11-1+b1 0.6.11-1+b1 0.6.8.2-1 0.6.8.2-1 0.6.8.2-1 0.6.11-1+b1 0.6.11-1+b1

They appear in a space-separated list. What are their index names?

$ echo ${!sys[*]}
mysrv06-Installed.txt:aptitude mysrv14-Installed.txt:aptitude mysrv08-Installed.txt:aptitude mysrv01-Installed.txt:aptitude mysrv10-Installed.txt:aptitude mysrv12-Installed.txt:aptitude mysrv11-Installed.txt:aptitude mysrv13-Installed.txt:aptitude mysrv07-Installed.txt:aptitude

They also show in a space-separated list. Using the index names, loop through the list.

$ for n in ${!sys[*]}
do
  echo "${n} is ${sys[${n}]}"
done
mysrv06-Installed.txt:aptitude is 0.6.11-1+b1
mysrv14-Installed.txt:aptitude is 0.6.11-1+b1
mysrv08-Installed.txt:aptitude is 0.6.11-1+b1
mysrv01-Installed.txt:aptitude is 0.6.11-1+b1
mysrv10-Installed.txt:aptitude is 0.6.8.2-1
mysrv12-Installed.txt:aptitude is 0.6.8.2-1
mysrv11-Installed.txt:aptitude is 0.6.8.2-1
mysrv13-Installed.txt:aptitude is 0.6.11-1+b1
mysrv07-Installed.txt:aptitude is 0.6.11-1+b1

In these last three output examples, notice the associative array does not list in the order each was assigned. Never depend on that order in an associative array. Make the order you want.
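One way to make the order you want is to sort the index names before looping. A minimal sketch, assuming bash 4 or later, with three hand-typed sample entries. The function name sorted_report is hypothetical.

```shell
#!/bin/bash
declare -A sys=(
  [mysrv10-Installed.txt:aptitude]=0.6.8.2-1
  [mysrv01-Installed.txt:aptitude]=0.6.11-1+b1
  [mysrv06-Installed.txt:aptitude]=0.6.11-1+b1
)

# Print the index names one per line, sort them, then look each one up.
sorted_report() {
  local n
  printf '%s\n' "${!sys[@]}" | sort |
  while read -r n
  do
    echo "${n} is ${sys[${n}]}"
  done
}

sorted_report
```

The report now lists mysrv01, mysrv06, then mysrv10, no matter how bash stores the elements internally.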

Producing Debian Table Headings

The pkgs[] array was stored in sorted order by package name. With lots of packages, more than there are servers, make each server a table column. Rows get package versions. First row headings have server names. In the first column, put the package names. Separate each column with a tab. Each row is one line. This makes parsing easy for a spreadsheet.

To generate the column headings, produce a tab-separated list of server names. Each filename includes the server name followed by a hyphen. Use the hyphen to isolate the server name from the filename.

Told you that hyphen would be important.

$ for f in *-Installed.txt
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done; echo
mysrv01 mysrv02 mysrv03 mysrv04 mysrv05 mysrv06 mysrv07 mysrv08 mysrv09 mysrv10 mysrv11 mysrv12 mysrv13 mysrv14

That did it. Echo’s -en option allows escape characters (-e) and suppresses newlines (-n). Escape characters use a backslash before a character to mean a special symbol, such as \t for tab and \n for newline (a linefeed). Echo produces what happens inside the quotation marks:

  1. Start a subshell.
  2. Send the current filename in $f through the pipe.
  3. From the pipe, cut using a hyphen delimiter produces field 1.
  4. Close the subshell.
  5. Append a tab to that subshell’s output.

The result delivers the tab-separated list to a single line. When the loop finishes, the last echo drops down to the next line to start the next row.

Let’s see that output again but make the tabs visible. Just insert |cat -t between the done and the semicolon to make the tabs visible:

$ for f in *-Installed.txt
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done | cat -t; echo
mysrv01^Imysrv02^Imysrv03^Imysrv04^Imysrv05^Imysrv06^Imysrv07^Imysrv08^Imysrv09^Imysrv10^Imysrv11^Imysrv12^Imysrv13^Imysrv14^I

A tab (^I) appears after each name, including the last. Proof of the final echo is that the prompt returns on the line below the output. If the final echo were not there, the prompt ($) would appear after the last tab.
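If the trailing tab bothers you, the heading row can also be built in a variable and printed once. This is a sketch of an alternative, assuming bash. It uses parameter expansion instead of a subshell with cut, and the function name heading_row is hypothetical.

```shell
#!/bin/bash
# Build a tab-separated row of server names from a list of file names.
heading_row() {
  local row="" tab=$'\t' f
  for f in "$@"
  do
    # ${f%%-*} keeps the text before the first hyphen: the server name.
    # The tab is added only when the row already holds a name.
    row+="${row:+${tab}}${f%%-*}"
  done
  printf '%s\n' "${row}"
}

heading_row mysrv01-Installed.txt mysrv02-Installed.txt mysrv03-Installed.txt | cat -t
```

Piping through cat -t shows one ^I between names and none after the last.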

Produce the Debian Table

Looping through all the package names in the pkgs[] array, every package name must appear in the first column. After it comes each version number for that package for each server. They come from the sys[] associative array. Output a tab to get to the next column, then index for the correct system and the package name. Do this in order of the filenames because their order becomes the column sequence from left to right.

Here is the whole program:

#!/bin/bash

declare -A sys

# Collect package names.
pkgs=( $(cut -d/ -f1 $* | sort -u) )

# Format sys array definitions and create them, each with a version number.
for pkg in ${pkgs[@]}
do
  eval $(
  egrep "^${pkg}/" $* |
  tr ' ' '/' |
  cut -d'/' -f1,3 |
  sed -e 's/^/sys[/' -e 's,/,]=,'
  )
done

# Produce headings.
echo -en "Name\t"
for f in $*
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done
echo

# Generate package names & versions per sys.
for pkg in ${pkgs[@]}
do
  echo -en "${pkg}"
  for f in $*
  do
    echo -en "\t${sys[${f}:${pkg}]}"
  done
  echo
done

Use chmod(1) to make this executable. If you call this program softdiffs.sh, the command line for running this is:

$ ./softdiffs.sh *-Installed.txt >softversions.txt

In the program code, $* is the space-separated list of all arguments on the command line, not including the program name and not including any redirections. Because wildcards expand in sorted order, each system’s package list will be in sorted order by system name. That becomes the column order.
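Unquoted $* works here because the collected filenames contain no spaces. A small sketch shows what would change if one did; the second filename is a hypothetical example, and count_words is a throwaway helper.

```shell
#!/bin/bash
count_words() { echo $#; }

# Simulate a command line with one file name that contains a space.
set -- 'mysrv01-Installed.txt' 'old server-Installed.txt'

unquoted=$(count_words $*)      # word splitting breaks the second name in two
quoted=$(count_words "$@")      # each argument stays one word

echo "unquoted: ${unquoted}, quoted: ${quoted}"
```

If your filenames might ever contain spaces, "$@" is the safer spelling.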

Import the tab-separated table into a spreadsheet. After importing, equalize the column widths and start adjusting your software installations.
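Before opening a spreadsheet, you might narrow the table to the rows that need attention. This is a minimal sketch, assuming the table layout the program produces: a heading row, then one row per package with one tab-separated version per server. The function name differing_rows is hypothetical.

```shell
# Print the heading row plus every row whose version columns are not all equal.
# An empty column (package not installed on that server) counts as a difference.
differing_rows() {
  awk -F'\t' '
    NR == 1 { print; next }
    {
      for (i = 3; i <= NF; i++)
        if ($i != $2) { print; next }
    }
  '
}

# Sample table: printf reuses the three-column format for each group of words.
printf '%s\t%s\t%s\n' \
  Name     mysrv01      mysrv10 \
  aptitude 0.6.11-1+b1  0.6.8.2-1 \
  at       3.1.16-1     3.1.16-1 |
differing_rows
```

With a real table the input would be the output file, for example differing_rows <softversions.txt.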

Section 2: Acquiring Fedora Lists

Fedora uses rpm(8) to keep track of installed software. Its list format makes parsing the package names from the version numbers more difficult:

$ rpm -qa | head
festvox-ked-diphone-0.19990610-32.fc24.noarch
mesa-libglapi-12.0.3-2.fc24.x86_64
cyrus-sasl-gssapi-2.1.26-26.2.fc24.x86_64
python-matplotlib-data-1.5.2-0.1.rc2.fc24.noarch
satyr-0.21-1.fc24.x86_64
gnome-session-3.20.2-1.fc24.x86_64
pangomm-2.40.0-1.fc24.x86_64
libbluray-0.9.3-3.fc24.x86_64
google-noto-emoji-fonts-20170223-1.fc24.noarch
marisa-0.2.4-15.fc24.x86_64

It’s difficult to parse because some software names use numbers as part of the name. Here are a few examples:

udisks2-2.1.7-1.fc24.x86_64
libnl3-3.2.28-4.fc24.x86_64
p11-kit-0.23.2-2.fc24.x86_64
byzanz-0.3-0.16.fc24.x86_64
bind99-libs-9.9.9-4.P6.fc24.x86_64
openssh-askpass-7.2p2-14.fc24.x86_64

It isn’t easy to distinguish those numbers from official version numbers. Hyphens often separate parts of the name as well as the name from the numbers, and even sometimes appear in version numbers. The fourth example looks easy with a name, a hyphen, and the version number. With all package names starting with a letter, scanning from left to right you could say the numbers after a letter are part of a name up to a hyphen followed by a number, such as the first three and the fifth. Then the sixth one fouls you up because the “7.2p2” makes you question whether there are other packages using digits and letters with periods as part of the name or only as part of the version. One could approach it from right to left, looking for the beginning. These suggest a lot of possible variations requiring many logical tests. Unfortunately, the rpm list is more difficult to parse than it’s worth.
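For what it’s worth, the right-to-left idea can be sketched with shell parameter expansion. It leans on two assumptions: every entry ends in a period and an architecture, and neither the version nor the release part contains a hyphen, which is the usual rpm naming convention. The function name split_nvra is hypothetical.

```shell
# Split one "rpm -qa" entry from right to left.
split_nvra() {
  local nvra=$1
  local arch=${nvra##*.}       # text after the last period
  local nvr=${nvra%.*}         # drop ".arch"
  local release=${nvr##*-}     # text after the last hyphen
  local nv=${nvr%-*}           # drop "-release"
  local version=${nv##*-}      # text after the remaining last hyphen
  local name=${nv%-*}          # whatever is left is the name
  echo "${name} ${version}-${release} ${arch}"
}

split_nvra bind99-libs-9.9.9-4.P6.fc24.x86_64
split_nvra openssh-askpass-7.2p2-14.fc24.x86_64
```

Entries that break the first assumption, such as ones listed without an architecture, would need separate handling.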

Fortunately, dnf(8) gives the same information, sorted, and has spaces separating its three fields. Here’s a brief example:

$ dnf -C list installed | head
Last metadata expiration check: 0:01:02 ago on Wed Apr 19 13:16:53 2017.
Installed Packages
GConf2.x86_64                          3.2.6-16.fc24                    @fedora
GeoIP.x86_64                           1.6.10-1.fc24                    @updates
GeoIP-GeoLite-data.noarch              2017.04-1.fc24                   @updates
GraphicsMagick.x86_64                  1.3.25-6.fc24                    @updates
ImageMagick.x86_64                     6.9.3.0-2.fc24                   @fedora
ImageMagick-libs.x86_64                6.9.3.0-2.fc24                   @fedora
LibRaw.x86_64                          0.17.2-1.fc24                    @fedora
ModemManager.x86_64                    1.6.4-1.fc24                     @updates

Remove the first two lines to get at the data. Causing trouble, dnf has a freakish dependence on uniform column widths. It always shows three columns, so it gauges the terminal width and fits its three columns to that width, expanding the columns as needed. But, redirect its output and it assumes the width should be 80 and fits its column widths accordingly. It doesn’t mind if the third column’s data extends wider than 80. Here are some examples:

NetworkManager-config-connectivity-fedora.x86_64
                                       1:1.2.6-1.fc24                   @updates
audacity-freeworld.x86_64              2.1.3-0.9.20161109git53a5c93.fc24
                                                                        @rpmfusion-free-updates

When a package name or version number is very long, invading the next column, dnf drops the next column’s data to the next line at the same column start position. This means that not all lines have the same number of columns. Two examples show the variations. First, the name in column 1 is too long, so dnf dropped column 2’s data to the next line and followed up with column 3 on that line. Second, the version in column 2 was too long, so dnf dropped column 3’s data to the next line.

There could be a way to recombine those columns. Consider an approach:

1. Remove the first two unneeded lines using tail(1):

dnf -C list installed | tail -n +3

2. Turn all spaces and newlines into one space each using echo.

echo $(dnf -C list installed | tail -n +3)

3. Ignore every third word.

The second item removes the line separations and makes everything a space-separated entry, perfect for storing in an array.

The third item, deleting every third word before storing in an array, is an interesting problem. Maybe use cut to remove the third column. You could put the result of the echo into an array, then loop through the array to keep only the first two of every three elements. More interesting, use awk(1) to detect when an input line has fewer than three fields, then attach the next input line.

Because deleting every third array element could be deferred until the first two are used, here’s the timing of just putting the list into an array:

$ time a=( $(echo $(dnf -C list installed | tail -n +3)) )

real    0m1.541s
user    0m1.322s
sys     0m0.169s

$ echo ${#a[*]}
6918

An algorithm presents itself:

  1. Expect 3 fields.
  2. If a line has one field, the next line has the other two.
  3. If a line has two fields, the next line will have the third.

Run the list through awk to test and combine as each line comes in:

$ unset a

$ time a=( $(dnf -C list installed | tail -n +3 |
awk '
  NF < 3 {
    printf "%s ", $1;
    if (NF == 2) {
      printf "%s ", $2;
      getline;
      print $1;
    }
    else {
      getline;
      print $1,$2;
    }
    next;
  }

  {
    print $1,$2,$3;
  }
') )

real    0m1.553s
user    0m1.324s
sys     0m0.181s

$ echo ${#a[*]}
6918

In awk scripts, each input line is a record. Records automatically divide into fields separated by any number of spaces or tabs. Spaces and tabs automatically conflate into one unless you change the field separator. So, awk will do what echo did previously, but line separation still exists.

Test every input line to see if the number of fields (NF) is under 3. If so, output the guaranteed first field. The line either has 2 fields or only 1. If 2, output the second field, get the next line, and print that new line’s one-and-only field. However, if the record only had 1 field, get the next line and print both its fields. Then skip all other operations by going to the next record.

All other lines have 3 fields. Just print all three of them.

In both cases the third field was left in place. As before, ignoring every third element could happen later while processing the array. But, the awk script includes the third field in output. It could just as easily not.

Notice the real times have minuscule differences. Multiple runs of these show the differences are more related to what else the system is doing. The two methods take about the same time. Why not delete the third column in the awk script and be done with it?

$ unset a

$ time a=( $(dnf -C list installed | tail -n +3 |
awk '
  NF < 3 {
    printf "%s ", $1;
    if (NF == 2) {
      print $2;   # finish the line; the next line holds only column 3
      getline;
    }
    else {
      getline;
      print $1;
    }
    next;
  }

  {
    print $1,$2;
  }
') )

real    0m1.413s
user    0m1.173s
sys     0m0.192s

$ echo ${#a[*]}
4612

The unneeded column is gone. Getting this list from each server does not need the array.
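To check the joining logic without a Fedora system at hand, feed awk a few sample lines. The sketch below uses a different technique from the getline script: it counts fields across lines and emits a record each time it has collected three. It assumes every package contributes exactly three whitespace-free fields. The function name join_wrapped is hypothetical.

```shell
# Collect fields across lines; after every third field, print the first two.
join_wrapped() {
  awk '{
    for (i = 1; i <= NF; i++) {
      field[++n] = $i
      if (n == 3) { print field[1], field[2]; n = 0 }
    }
  }'
}

printf '%s\n' \
  'GeoIP.x86_64                           1.6.10-1.fc24                    @updates' \
  'NetworkManager-config-connectivity-fedora.x86_64' \
  '                                       1:1.2.6-1.fc24                   @updates' \
  'audacity-freeworld.x86_64              2.1.3-0.9.20161109git53a5c93.fc24' \
  '                                                                        @rpmfusion-free-updates' |
join_wrapped
```

Each package comes out on its own line as name, a space, and version, whether or not dnf wrapped it.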

Don’t run the awk script on the remote server. Instead, run dnf remotely and feed its output to awk locally:

for s in LIST-YOUR-SYSTEMS-HERE
do
  ssh -Y ${s} 'dnf -C list installed | tail -n +3' |
  awk '
    NF < 3 {
      printf "%s ", $1;
      if (NF == 2) { print $2; getline; }
      else { getline; print $1; }
      next;
    }

    { print $1,$2; }
  ' >${s}-Installed.txt
done

Because the command is sent inside apostrophes, the dnf | tail pipeline runs on the remote system. That feeds its output through the local pipe to awk. Awk writes the space-separated, two-column lines to the local file named after the system.

On your local collection server, if your login username is different on the other systems listed after in, adjust it to use yourloginname@${s} for the relevant systems. When done, the list of filenames collected for each system will be on your local server. Each filename will have the server’s name and “-Installed.txt”. That hyphen will be important later.

Organizing Fedora’s List

Each server’s list with software names and version numbers may be different, slightly or greatly, from other servers. Those systems specializing in certain operations may have software that other servers don’t need. Comparing these lists without considering the differences will bite you later.

Collect the list of all package names throughout all servers without regard to which server has them. Version numbers don’t matter at this stage. Because all the Fedora lists show package names in the first field, eliminate the second field from all lists and sort the package names to remove the duplicates:

$ cut -d' ' -f1 *-Installed.txt | sort -u

This produces the list of all installed packages no matter which servers have it. Put that in an array:

pkgs=( $(cut -d' ' -f1 *-Installed.txt | sort -u) )

This array holds all package names in sorted order. To select one element from an array, use numbers ranging from 0 to one less than the total array elements. Bash is like other programming languages. Indexed arrays use the array[index]=value format. The first index is 0: ${pkgs[0]}. This ranges up to the total elements in the array minus 1. So, if you had 1000 elements, the last index would be 999.

The total array elements comes from a special bash syntax. Put a “#” in front of the array name and use an asterisk (*) as the index: ${#pkgs[*]}. The # says, “Give me the number”, and the * says, “for every element.”
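Bash 4 and later also offer mapfile, which stores one input line per array element without relying on word splitting. A minimal sketch with three hand-typed sample names standing in for the real cut and sort pipeline:

```shell
#!/bin/bash
# mapfile -t reads lines into the array and strips each trailing newline.
mapfile -t pkgs < <(printf '%s\n' GeoIP.x86_64 GConf2.x86_64 GeoIP.x86_64 | sort -u)

echo "${#pkgs[*]}"                       # number of elements
echo "${pkgs[0]}"                        # first element, index 0
echo "${pkgs[$(( ${#pkgs[*]} - 1 ))]}"   # last element, index total minus 1
```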

Associate Fedora Names, Versions, & Systems

For an example, use one package name. Put a slash between the name and version number to make the separation visible. Fedora typically suffixes package names with the host architecture, such as i686, x86_64, or noarch:

$ egrep "^NetworkManager.x86_64" *-Installed.txt | tr ' ' '/'
mysrv01-Installed.txt:NetworkManager.x86_64/1:1.0.12-2.fc23
mysrv02-Installed.txt:NetworkManager.x86_64/1:1.2.6-1.fc24
mysrv03-Installed.txt:NetworkManager.x86_64/1:1.4.4-3.fc25

With multiple files named on egrep’s command line, each instance is preceded by the discovered filename separated by a colon. How to turn this into an array?

Bash has two array types: indexed and associative. Indexed arrays have already been shown. They use a subscript number indexing the array element to get or set the array value. Associative arrays store a name as the subscript. The value to get or set associates with that name. Thus, the array storing format is array[name]=value.

For package software, use the combined filename:package as the name and the version number as the value. Bash uses indexed arrays by default. You must tell it you want a particular variable to be associative:

declare -A sys

This tells bash that the variable named sys will be an associative array. It does not require a declare for indexed arrays, but you can use -a (lowercase) to do that.

To create the array notation for sys[...], each version record must precede the filename with sys[. To close the bracket and make the assignment, replace the / with ]=. Use sed(1), the stream editor, to make this change easily:

sed -e 's/^/sys[/' -e 's,/,]=,'

The -e option gives sed editing expressions. These use regular expressions, as overviewed in regex(7), which can get complicated. This particular use is simple. In sed, an expression starting with s is a substitution: replace one string with another. Whatever character follows the s becomes the separator between the substitution parts.

In the first expression, the separator is a slash (/). After the first separator comes the find-string — what you want sed to find. This example uses the caret (^), a special symbol meaning find the beginning of the line. The separator appears again, signaling the beginning of the replace-string — what you want sed to replace the find-string with. This replace string is sys[. After the replace-string comes another separator, terminating the substitution. The whole thing means every line will end up starting with sys[.

The second expression, another substitution, uses a comma (,) instead of a slash as the separator. While slash is commonly used as the separator, when a slash is either part of the find-string or part of the replace-string, use a different character as the separator. This time it’s a comma. Reading it: substitute a / with a ]=.

Here’s what it looks like:

$ egrep "^NetworkManager.x86_64" *-Installed.txt |
tr ' ' '/' |
sed -e 's/^/sys[/' -e 's,/,]=,'
sys[mysrv01-Installed.txt:NetworkManager.x86_64]=1:1.0.12-2.fc23
sys[mysrv02-Installed.txt:NetworkManager.x86_64]=1:1.2.6-1.fc24
sys[mysrv03-Installed.txt:NetworkManager.x86_64]=1:1.4.4-3.fc25

Note: Long command lines can break after the pipe symbols. Doing that helps readability. Bash shows line continuation with the prompt “>” instead of “$”. The continuation prompts are not shown here, to assist copy/paste into your own command line testing.

This output produces associative array assignment syntax. The filename:package name becomes the sys[] array index and the version number becomes the value. Each server gets represented with its unique versions. But, this output doesn’t store the data in an array.

Storing these values in the array requires reevaluating these strings as commands. Bash turns data output into commands using its built-in eval command. Try it out:

$ declare -A sys

$ eval $(
egrep "^NetworkManager.x86_64" *-Installed.txt |
tr ' ' '/' |
sed -e 's/^/sys[/' -e 's,/,]=,'
)

Note: Long lines can break after open parentheses. They complete when the close parenthesis appears.

The first command declares an associative array named sys. The next command uses eval to execute the assignment strings produced in the subshell.

$ echo ${#sys[*]}
3

All three software references generated by egrep were assigned to the sys[] array. Here are their values:

$ echo ${sys[*]}
1:1.0.12-2.fc23 1:1.4.4-3.fc25 1:1.2.6-1.fc24

Echo turned them into a space-separated list. What do those version numbers associate with?

$ echo ${!sys[*]}
mysrv01-Installed.txt:NetworkManager.x86_64 mysrv03-Installed.txt:NetworkManager.x86_64 mysrv02-Installed.txt:NetworkManager.x86_64

Using an exclamation mark (!) before the associative variable’s name tells bash to produce the subscript names in the same order that it produces the values. See the association in the following loop:

$ for n in ${!sys[*]}
do
  echo "${n} is ${sys[${n}]}"
done
mysrv01-Installed.txt:NetworkManager.x86_64 is 1:1.0.12-2.fc23
mysrv03-Installed.txt:NetworkManager.x86_64 is 1:1.4.4-3.fc25
mysrv02-Installed.txt:NetworkManager.x86_64 is 1:1.2.6-1.fc24

The loop selected the indexes one at a time and echo showed the filename:package subscript in ${n} and the version number.
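Because eval runs whatever text it is handed, an unexpected character in a list could be executed as a command. A sketch of one alternative stores the pairs with read instead, so nothing is evaluated. It assumes bash 4 or later and lines shaped like egrep’s raw output, before tr and sed: filename:package, a space, then the version. The sample lines are copied from the example above.

```shell
#!/bin/bash
declare -A sys

# Sample "filename:package version" lines.
sample='mysrv01-Installed.txt:NetworkManager.x86_64 1:1.0.12-2.fc23
mysrv02-Installed.txt:NetworkManager.x86_64 1:1.2.6-1.fc24
mysrv03-Installed.txt:NetworkManager.x86_64 1:1.4.4-3.fc25'

# read splits each line at the first space: index name, then version.
while read -r key version
do
  sys[${key}]=${version}
done <<<"${sample}"

echo "${#sys[*]}"
echo "${sys[mysrv02-Installed.txt:NetworkManager.x86_64]}"
```

In real use the loop would read from the egrep command through process substitution rather than from a sample string.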

Producing Fedora Table Headings

Did you notice that the previous three output examples did not list the associative array elements in the order they were assigned? Never depend on their order in an associative array. Make the order you want.

The pkgs[] indexed array is stored in sorted order by package name. With many more packages than servers, a table best shows the servers in each column. Rows show the software package versions. Show the server names in the first row. The first column holds the package names. Separate each column with a tab. Each row is one line.

To generate column headings, separate the server name list with tabs. Each filename uses the server name followed by a hyphen.

Told you that hyphen was important.

$ for f in *-Installed.txt
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done; echo
mysrv01 mysrv02 mysrv03

That gives a server name list, each name followed by a tab. Use echo -en to allow escape characters (-e) and suppress the newline (-n). Escape characters use a backslash (\) with a printable character meaning a special symbol, such as \t for tab and \n for newline (a linefeed). Echo produces what happens inside the quotation marks:

  1. A subshell pipes the current filename in $f through cut.
  2. Cut uses a hyphen as the delimiter and selects the first field, the server name.

That finishes the subshell, but a tab appears right after the server name just produced. After the loop, one last echo outputs nothing followed by a newline. That makes up for the lack of newlines after the server names. To see where the tabs are, insert a pipe to cat -te between done and the semicolon (;):

$ for f in *-Installed.txt
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done | cat -te; echo
mysrv01^Imysrv02^Imysrv03^I

A tab (^I) appears after each name, including the last.

Produce the Fedora Table

Looping through every name in the pkgs[] array, every package name must appear in the first column. Then comes each version number for that package in each server’s column from the sys[] associative array. Tab to the next column, then index for the correct system and package name. Do this in order by filenames because their order becomes the column sequence going from left to right.

Here is the full program:

#!/bin/bash

declare -A sys

# Collect package names.
pkgs=( $(cut -d' ' -f1 $* | sort -u) )

# Format sys array definitions and create them, each with a version number.
for pkg in ${pkgs[@]}
do
  eval $(
    egrep "^${pkg} " $* |
    tr ' ' '/' |
    sed -e 's/^/sys[/' -e 's,/,]=,'
  )
done

# Produce headings.
echo -en "Name\t"
for f in $*
do
  echo -en "$(echo $f | cut -d- -f1)\t"
done
echo

# Generate package names & versions per sys.
for pkg in ${pkgs[@]}
do
  echo -en "${pkg}"
  for f in $*
  do
    echo -en "\t${sys[${f}:${pkg}]}"
  done
  echo
done

Make this executable with chmod(1). Calling this program softdiffs.sh, the command line is:

$ ./softdiffs.sh *-Installed.txt >softversions.txt

In the program, $* is the space-separated list of all arguments given on the command line except for the program name and any redirections. Because wildcards expand in sorted order, each system’s package list will be in sorted order by system name. That order becomes the column order.

Import the tab-separated table this builds into a spreadsheet. After importing, equalize the column widths to the data and start adjusting your installed software.
