Recently in Debian Category

Compendium of Spotify on Linux Tips

Getting Spotify to work nicely on Linux

Note: the Linux Spotify client will only work with a premium Spotify account.

I spoke at the NYC PostgreSQL Users' Group meeting in December, and while there someone mentioned that Spotify is a great music service (and that they are using PostgreSQL!). So I decided to give it a try. The issue was that, while it can be made to work on Linux, the process of making it work well on Linux is less than simple. I decided to document what I did (and my sources) as I had to pull information from several sources and added a few modifications of my own.

There are two main problems to deal with:


  1. Getting the program itself installed and running

  2. Getting Linux and your browser to handle the spotify protocol so that, for example, clicking on playlist URLs will work correctly

The answer to problem number one depends in part on your Linux distribution. I am only going to cover Ubuntu and Fedora here -- extrapolation is left as an exercise for the reader.

On Ubuntu (I'm using 11.10), the directions from Spotify seem to work fine. I'll paste them here for the sake of completeness:

# On Ubuntu
# This gets you the older released client
# From http://www.spotify.com/us/download/previews/
# -----------
# 1. Add this line to your list of repositories by
#    editing your /etc/apt/sources.list
deb http://repository.spotify.com stable non-free

# 2. If you want to verify the downloaded packages,
#    you will need to add our public key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4E9CFF4E

# 3. Run apt-get update
sudo apt-get update

# 4. Install spotify!
sudo apt-get install spotify-client-qt

I just noticed that the Ubuntu directions result in the older client working, not the shiny new preview version. See below for instructions to get the preview client working:

# On Ubuntu, new preview client
# From 
# http://getsatisfaction.com/spotify/topics/try_out_the_linux_apps_client_beta_preview
# -----------
wget \
http://download.spotify.com/preview/spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
ar vx spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
tar -xzvf data.tar.gz
cp -rf ./usr /

# From 
# http://meltingrobot.wordpress.com/2011/11/08/spotify-installation-on-fedora-16/
# modified to handle passing of arguments

vi /usr/local/bin/spotify

# add the following lines to /usr/local/bin/spotify
8<--------------------------
#!/bin/bash

/bin/rm -rf ~/.cache/spotify
/usr/share/spotify/spotify "$@"
8<--------------------------

# make the script executable
chmod +x /usr/local/bin/spotify

# arrange to use the script in place of the binary to work
# around a known issue causing segfaults
rm /usr/bin/spotify
ln -s /usr/local/bin/spotify /usr/bin/spotify

# create symlinks to work around library mismatches
ln -s /usr/lib/x86_64-linux-gnu/libplc4.so /usr/lib/x86_64-linux-gnu/libplc4.so.0d
ln -s /usr/lib/x86_64-linux-gnu/libnspr4.so /usr/lib/x86_64-linux-gnu/libnspr4.so.0d

On Fedora things are complicated by the fact that Spotify no longer distributes an RPM - at least not that I could find. There are several recipes for solving this dilemma that can be found scattered around the Internet. Here is what I used:

# On Fedora (I am on Fedora 15)
# From http://www.passwdshadow.com/
yum -y install perl-ExtUtils-MakeMaker gcc qt-webkit rpm-build git
cd /tmp
git clone git://git.kitenet.net/alien
cd alien
perl Makefile.PL; make; make install
wget \
http://download.spotify.com/preview/spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
/usr/local/bin/alien --to-rpm \
spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
rpm -Uvh --nodeps spotify-client-0.8.0.1031.ga1569aa.552-2.x86_64.rpm
ln -s /usr/lib64/libssl.so.1.0.0e /usr/lib64/libssl.so.0.9.8
ln -s /lib64/libcrypto.so.1.0.0e /lib64/libcrypto.so.0.9.8
ln -s /usr/lib64/libnss3.so /usr/lib64/libnss3.so.1d
ln -s /usr/lib64/libnssutil3.so /usr/lib64/libnssutil3.so.1d
ln -s /usr/lib64/libsmime3.so /usr/lib64/libsmime3.so.1d
ln -s /lib64/libplc4.so /lib64/libplc4.so.0d
ln -s /lib64/libnspr4.so /lib64/libnspr4.so.0d

# From 
# http://meltingrobot.wordpress.com/2011/11/08/spotify-installation-on-fedora-16/
# modified to handle passing of arguments

vi /usr/local/bin/spotify

# add the following lines to /usr/local/bin/spotify
8<--------------------------
#!/bin/bash

/bin/rm -rf ~/.cache/spotify
/usr/bin/spotify.bin "$@"
8<--------------------------

# make the script executable
chmod +x /usr/local/bin/spotify

# arrange to use the script in place of the binary to work
# around a known issue causing segfaults
mv /usr/bin/spotify /usr/bin/spotify.bin
ln -s /usr/local/bin/spotify /usr/bin/spotify

At this point you should be able to click on the Spotify desktop shortcut and the program will launch.
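
Both wrapper scripts hand their arguments on to the real binary, and the way they are quoted matters as soon as an argument contains whitespace. The following is only a minimal sketch to illustrate the difference; the function names are made up for this example and are not part of the recipes above. It compares forwarding with an unquoted $* against forwarding with a quoted "$@":

```shell
#!/bin/bash
# Hypothetical helpers, only to illustrate argument forwarding.

# Print how many arguments were received.
count_args() {
    echo "$#"
}

# Forward the way an unquoted $* does: arguments are re-split on whitespace.
forward_unquoted() {
    count_args $*
}

# Forward with "$@": every argument is passed on unchanged.
forward_quoted() {
    count_args "$@"
}
```

With the default IFS, forward_unquoted "a b" c reports three arguments, while forward_quoted "a b" c reports two.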

So on to problem number two. One of the key features of Spotify is the ability to share playlists. This is done via a "spotify" protocol URL. Unfortunately, at this point neither Linux nor your browser knows how to handle this protocol. I have only worked out the specifics for GNOME and Firefox, but here they are:

# Handling the spotify protocol -- e.g. to allow use of http://sharemyplaylists.com
# From http://kb.mozillazine.org/Register_protocol
# -------------------------------------------------
# At shell command prompt:
gconftool-2 -s \
/desktop/gnome/url-handlers/spotify/command '/usr/bin/spotify %s' --type String
gconftool-2 -s \
/desktop/gnome/url-handlers/spotify/enabled --type Boolean true

# In Firefox:
#    Type about:config into the Location Bar (address bar) and press Enter.
#    Right-click -> New -> Boolean
#          -> Name: network.protocol-handler.expose.spotify
#          -> Value -> false
# Next time you click a link of protocol-type spotify you will be asked
# which application to open it with. Select /usr/bin/spotify

I think that's everything. I used the preceding successfully on my Fedora 15 desktop and my Ubuntu 11.10 laptop. But use at your own risk -- no guarantees that the foregoing will work or will not eat your data ;-)

Hope this helps someone else!

Recursively finding Windows Internet Shortcut (*.url) files and changing them into GNOME desktop files

Over the past few days I have finally converted my wife's computer from WinXP to Linux (Ubuntu 10.10). One of the many fine points of the negotiation leading to this was that I needed to preserve her many Internet Shortcut files. Since she has literally thousands of them, and they are sprinkled about in many a nested folder, I needed a script that could find them, and create the equivalent GNOME desktop files. The following is my solution. Perhaps not the most elegant way to achieve these ends, but it worked great for me. However I cannot promise this script will not eat your files, so please test and use at your own risk ;-)

Create the following script (e.g. using vi)

vi /usr/local/bin/fix_url.sh

Put the following in fix_url.sh (press "i", and then type or paste):

#!/bin/bash

(
    IFS=$'\n'
    # quote the pattern so the shell does not expand it before find sees it
    files=$(find . -name '*.url')
    for fl in $files; do
        NEWFILE=${fl}.desktop
        cp "${fl}" "${NEWFILE}"
        sed -i 's/InternetShortcut/Desktop\ Entry/g' "${NEWFILE}"
        sed -i '/^\[DEFAULT\]/d' "${NEWFILE}"
        sed -i '/^BASEURL/d' "${NEWFILE}"
        sed -i '/^IconFile/d' "${NEWFILE}"
        sed -i '/^IconIndex/d' "${NEWFILE}"
        sed -i 's/\r$//' "${NEWFILE}"
        echo "Type=Link" >> "${NEWFILE}"
    done
)


Save the file by typing ":x" if you used vi.

Make it executable:

chmod +x /usr/local/bin/fix_url.sh

Test/run the new script. Do this first on an isolated test location, e.g. copy some Windows Internet Shortcut files to /tmp/windows_urls:

cd /tmp/windows_urls
/usr/local/bin/fix_url.sh

Check out the resulting *.desktop files. Verify they look correct, and that they actually work when clicked from Nautilus, etc.

If completely satisfied, change to the root of the real directory tree and re-run the script.
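
For trees with unusual file names (embedded newlines, glob characters), a null-delimited loop is more robust than splitting the output of find on newlines. The following is a sketch of that variant, not the script I actually ran. It assumes bash plus GNU find and sed, the function names are made up, and it applies the same edits as the script above in a single sed call:

```shell
#!/bin/bash
# convert_url_file FILE (hypothetical helper): write FILE.desktop next to FILE.
convert_url_file() {
    local src=$1 dst="$1.desktop"
    # Strip the Windows line endings first so the later patterns match cleanly.
    sed -e 's/\r$//' \
        -e 's/InternetShortcut/Desktop Entry/g' \
        -e '/^\[DEFAULT\]/d' \
        -e '/^BASEURL/d' \
        -e '/^IconFile/d' \
        -e '/^IconIndex/d' "$src" > "$dst"
    echo "Type=Link" >> "$dst"
}

# convert_tree DIR (hypothetical helper): convert every *.url file below DIR.
convert_tree() {
    find "$1" -type f -name '*.url' -print0 |
        while IFS= read -r -d '' f; do
            convert_url_file "$f"
        done
}
```

It would be called as convert_tree /tmp/windows_urls for the test run. The original *.url files are left untouched, as before.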

When you are all finished, the original *.url files are still hanging around. If you want to get rid of them (again test first):

cd /tmp/windows_urls
find . -name '*.url' -delete
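
A more cautious cleanup deletes an original only when a non-empty .desktop file sits next to it. This is a sketch under the same assumptions as before (bash, GNU find); remove_converted is a made-up name:

```shell
#!/bin/bash
# remove_converted DIR (hypothetical helper): delete each *.url below DIR,
# but only if a non-empty FILE.desktop exists beside it.
remove_converted() {
    find "$1" -type f -name '*.url' -print0 |
        while IFS= read -r -d '' f; do
            if [ -s "$f.desktop" ]; then
                rm -- "$f"
            else
                echo "kept (no .desktop counterpart): $f" >&2
            fi
        done
}
```

Anything reported as kept was never converted and is worth a second look before it is removed by hand.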

Hope this helps someone else!

[Howto] Debian preseed with Netboot

The vast majority of Debian installations can be simplified with the use of Preseeding and Netboot. Friedrich Weber, a school student on a work experience placement with us at our German office, has observed the process and captured it in a Howto here:

Imagine the following situation: you find yourself with ten to twenty brand new notebooks and the opportunity to install Debian on them and customise it to your own taste. Performing the Debian installation and configuration manually on each notebook would hardly be great fun, though. This is where Debian Preseed comes into play.

The concept is simple and self-explanatory: normally, whoever performs the installation is asked a number of questions along the way (e.g. language, partitioning, packages, bootloader, etc.). With Preseed, all of these questions can be answered in advance. The Debian installer then only asks the questions that Preseed does not already answer. Ideally these come up at the very start of the installation and concern settings that differ between target systems, which the administrator has to enter by hand; once they have been dealt with, the installation can be left to run unattended.

Preseed works from a simple configuration file, preseed.cfg. It contains, as described above, the answers to the questions asked during installation, in debconf format. Such a file consists of several lines, each of which defines a debconf configuration option - the answer to one question - for example:

    d-i debian-installer/locale string de_DE.UTF-8

The first element of the line is the name of the package being configured (d-i is an abbreviation for debian-installer), the second is the name of the option being set, the third is the type of the option (here a string), and the rest is the value of the option. In this example, we set the locale to German with UTF-8 encoding.
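
To make the four fields concrete, here is a small bash sketch that splits one such line. parse_preseed_line is a made-up name for illustration, and the sketch ignores comments and continuation lines:

```shell
#!/bin/bash
# parse_preseed_line LINE (hypothetical helper): split a debconf-format line
# into owner, question, type and value. The value keeps its inner spaces
# because it is the last variable handed to read.
parse_preseed_line() {
    local owner question type value
    read -r owner question type value <<< "$1"
    printf 'owner=%s\nquestion=%s\ntype=%s\nvalue=%s\n' \
        "$owner" "$question" "$type" "$value"
}
```

For example, parse_preseed_line 'd-i pkgsel/include string openssh-client vim less rsync' reports d-i as the owner, pkgsel/include as the question, string as the type, and the whole package list as the value.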

You can write such lines by hand, but it is simpler to use the tool debconf-get-selections: it lists the options that are set on the local system in exactly this format. From its output you can pick the settings you want, adjust them if necessary, and copy them into preseed.cfg.

Here is an example of preseed.cfg:

    d-i debian-installer/locale string de_DE.UTF-8
    d-i debian-installer/keymap select de-latin1
    d-i console-keymaps-at/keymap select de
    d-i languagechooser/language-name-fb select German
    d-i countrychooser/country-name select Germany
    d-i console-setup/layoutcode string de_DE

    d-i clock-setup/utc boolean true
    d-i time/zone string Europe/Berlin
    d-i clock-setup/ntp boolean true
    d-i clock-setup/ntp-server string ntp1

    tasksel tasksel/first multiselect standard, desktop, gnome-desktop, laptop
    d-i pkgsel/include string openssh-client vim less rsync

In addition to language and timezone settings, these options also select tasks and packages. This file alone will not carry an installation through completely unattended, but it makes a good start.
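
Before a file like this goes onto the boot server, a rough sanity check can catch truncated lines. The bash sketch below only verifies that every line that is neither blank nor a comment has at least an owner, a question and a type. check_preseed is a made-up name, and the sketch does not understand backslash continuation lines:

```shell
#!/bin/bash
# check_preseed FILE (hypothetical helper): report lines with fewer than
# three fields. The value itself may legitimately be empty.
check_preseed() {
    local file=$1 line owner question type value n=0 bad=0
    local skip_re='^[[:space:]]*(#.*)?$'
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        # Skip blank lines and comments.
        [[ $line =~ $skip_re ]] && continue
        read -r owner question type value <<< "$line"
        if [ -z "$type" ]; then
            echo "line $n: expected 'owner question type [value]': $line" >&2
            bad=1
        fi
    done < "$file"
    return "$bad"
}
```

On a Debian system, debconf-set-selections -c preseed.cfg performs a real format check and is the better tool where it is available.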

Now onto the question of where Preseed gets its data from. It is possible to use Preseed with CD and DVD images or USB sticks, but it is generally more convenient to use a Debian netboot image: essentially an installer that is started over the network and can fetch its Preseed configuration from there. Booting over the network is implemented with PXE and requires a system that can boot from its network card.

First, the target system has to boot from its network card. It broadcasts a request for an IP address to a DHCP server. The DHCP server replies not only with a suitable IP address but also with the IP address of a so-called boot server. The boot server is a TFTP server that provides a bootloader, which lets the administrator start the desired Debian Installer. At the same time, boot options can tell the Debian Installer that it should use Preseed and where to find the Preseed configuration. Here is a snippet of the PXELINUX configuration file pxelinux.cfg/default:

    label i386
        kernel debian-installer/i386/linux
        append vga=normal initrd=debian-installer/i386/initrd.gz netcfg/choose_interface=eth0 domain=example.com locale=de_DE debian-installer/country=DE debian-installer/language=de debian-installer/keymap=de-latin1-nodeadkeys console-keymaps-at/keymap=de-latin1-nodeadkeys auto-install/enable=false preseed/url=http://$server/preseed.cfg DEBCONF_DEBUG=5 -- quiet 

When the user types i386, the debian-installer/i386/linux kernel (found on the TFTP server) is downloaded and run, along with a whole load of boot options. The Debian Installer accepts debconf options as boot parameters. The installer has to be told where to find the Preseed configuration on the network (preseed/url). In order to download this Preseed configuration, it must also first be able to configure the network.

The options needed for that are passed as boot parameters as well (the hostname options are deliberately omitted here, since every target system has its own hostname). auto-install/enable would delay the language questions until after the network has been configured, so that they can be answered from preseed.cfg. It is not necessary here, because the language settings are also passed as kernel options, which ensures that the network configuration already runs in German.
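
If several architectures or preseed files are in play, entries like the one above can be generated rather than typed. The following bash sketch prints a shortened entry. pxe_entry is a made-up helper, the URL in the usage note is a placeholder, and only a few of the boot parameters from the snippet above are carried over:

```shell
#!/bin/bash
# pxe_entry ARCH PRESEED_URL (hypothetical helper): print a PXELINUX entry
# that boots the Debian Installer for ARCH and points it at a preseed file.
pxe_entry() {
    local arch=$1 url=$2
    printf 'label %s\n' "$arch"
    printf '    kernel debian-installer/%s/linux\n' "$arch"
    printf '    append vga=normal initrd=debian-installer/%s/initrd.gz ' "$arch"
    printf 'netcfg/choose_interface=eth0 preseed/url=%s -- quiet\n' "$url"
}
```

Running pxe_entry amd64 http://boot.example.com/preseed.cfg prints three lines (label, kernel, append) that can be appended to pxelinux.cfg/default.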

The examples and configuration excerpts mentioned here are obviously summarised and shortened. Even so, this blog post should have given you a glimpse into the concept of Preseed in connection with netboot. Finally, here is a complete version of preseed.cfg:

    d-i debian-installer/locale string de_DE.UTF-8
    d-i debian-installer/keymap select de-latin1
    d-i console-keymaps-at/keymap select de
    d-i languagechooser/language-name-fb select German
    d-i countrychooser/country-name select Germany
    d-i console-setup/layoutcode string de_DE

    # Network
    d-i netcfg/choose_interface select auto
    d-i netcfg/get_hostname string debian
    d-i netcfg/get_domain string example.com

    # Package mirror
    d-i mirror/protocol string http
    d-i mirror/country string manual
    d-i mirror/http/hostname string debian.example.com
    d-i mirror/http/directory string /debian
    d-i mirror/http/proxy string
    d-i mirror/suite string lenny

    # Timezone
    d-i clock-setup/utc boolean true
    d-i time/zone string Europe/Berlin
    d-i clock-setup/ntp boolean true
    d-i clock-setup/ntp-server string ntp.example.com

    # Root-Account
    d-i passwd/make-user boolean false
    d-i passwd/root-password password secretpassword
    d-i passwd/root-password-again password secretpassword

    # Further APT-Options
    d-i apt-setup/non-free boolean false
    d-i apt-setup/contrib boolean false
    d-i apt-setup/security-updates boolean true

    d-i apt-setup/local0/source boolean false
    d-i apt-setup/local1/source boolean false
    d-i apt-setup/local2/source boolean false

    # Tasks
    tasksel tasksel/first multiselect standard, desktop
    d-i pkgsel/include string openssh-client vim less rsync
    d-i pkgsel/upgrade select safe-upgrade

    # Popularity-Contest
    popularity-contest popularity-contest/participate boolean true

    # Command to run after the installation. `in-target` means that the
    # command is run in the installed environment rather than in the
    # installation environment.
    # Here http://$server/skript.sh is downloaded to /tmp, made executable
    # and run.
    d-i preseed/late_command string in-target wget -P /tmp/ http://$server/skript.sh; \
        in-target chmod +x /tmp/skript.sh; in-target /tmp/skript.sh

All Howtos of this blog are grouped together in the Howto category - and if you happen to be looking for Support and Services for Debian you've come to the right place at credativ.

[Howto] RHCS: install on Debian

Following our earlier introduction to RHCS we now present a real world example: the installation of RHCS with Debian to provide certain virtual machines as services.

Our RHCS overview already explained the basics of RHCS. This time we will take two hosts with shared storage and provide KVM guests as services.

Installation of the nodes

In this setup the nodes are the machines which are running KVM. Each running KVM guest is a service managed by RHCS. While installing the KVM hosts you should make sure you comply with the following suggestions:
  • /tmp/ and /var/ should be on different partitions; this improves performance.
  • Activate Debian backports, especially for the Kernel.
  • Make sure all IP addresses can be resolved in both directions - /etc/hosts helps here in the worst case.
  • The host name must not resolve to 127.0.0.1! You would only get problems with the Cluster Management System CMAN.
  • /etc/hosts and /etc/resolv.conf should be the same on all nodes.
  • Create password free ssh keys for all nodes and distribute them.
  • For ultimate performance it is best to install the latest Debian Linux kernel. In our example we used linux-image-2.6.32-bpo.2-amd64, which crashes the guest kernels >= 2.6.30. However, a patch is available, see bug #573071.
  • The network devices should be named in a way that makes sense, for example: rhcs-backbone and external instead of eth0 and eth1.
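
The name resolution requirement is easy to verify on each node. The sketch below assumes getent and awk are available, and the function names are made up. It treats the whole 127.0.0.0/8 range as loopback, because a default Debian installation usually maps the host name to 127.0.1.1 in /etc/hosts:

```shell
#!/bin/bash
# is_loopback ADDR (hypothetical helper): succeed if ADDR is a loopback address.
is_loopback() {
    case $1 in
        127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# check_node_name NAME (hypothetical helper): resolve NAME the way the system
# does and complain if it does not resolve or maps to a loopback address.
check_node_name() {
    local name=$1 addr
    addr=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    if [ -z "$addr" ]; then
        echo "$name does not resolve" >&2
        return 2
    fi
    if is_loopback "$addr"; then
        echo "$name resolves to $addr (loopback)" >&2
        return 1
    fi
    echo "$name resolves to $addr"
}
```

Calling check_node_name "$(hostname)" on every node before starting CMAN shows quickly whether /etc/hosts still needs attention.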

Configuring the shared storage

As with almost any HA solution, a key element of RHCS is the shared storage which is accessed by all the nodes. In this example we take a "private" machine and install an iSCSI target on it:
apt-get install iscsitarget iscsitarget-source 
echo 'ISCSITARGET_ENABLE=true' > /etc/default/iscsitarget
m-a a-i iscsitarget


Keep in mind that the iSCSI target must build properly, see bug #566740. The configuration of the shared storage is done via /etc/ietd.conf:

IncomingUser discovery_in YourSecurePwd1
OutgoingUser discovery_out YourSecurePwd2
Target YOURMACHINE:clvm1
       IncomingUser node_in YourSecurePwd1
       OutgoingUser node_out YourSecurePwd2
       Lun 0 Path=/dev/sdx1,Type=blockio


On the nodes the same target must be accessed, so make sure /etc/iscsi/iscsid.conf is correct:

discovery.sendtargets.auth.authmethod = CHAP
discovery.sendtargets.auth.username = discovery_in
discovery.sendtargets.auth.password = YourSecurePwd1
discovery.sendtargets.auth.username_in = discovery_out
discovery.sendtargets.auth.password_in = YourSecurePwd2
node.startup = automatic
node.session.auth.authmethod = CHAP
node.session.auth.username = node_in
node.session.auth.password = YourSecurePwd1
node.session.auth.username_in = node_out
node.session.auth.password_in = YourSecurePwd2


The service is started with /etc/init.d/open-iscsi start. Existing targets can be searched, deleted or added by the following commands:

# discovering the targets
iscsiadm -m discovery -t st -p YOURMACHINE -P 1
# deleting target on wrong interface
iscsiadm -m node -p 192.168.0.100:3260,1 -o delete
# opening the portal
iscsiadm -m node --targetname "iqn.2010-03.YOURMACHINE:clvm1" --portal "YOURMACHINE:3260" --

VM setup

The virtual machines are provided by KVM. Thus the appropriate KVM software must be installed first:
apt-get install linux-image-2.6.32-bpo.2-amd64 kvm libvirt-bin virtinst -t lenny-backports


When configuring the bridge, make sure that the bridge name is the same on all nodes. Also the libvirt configuration must be the same on all hosts, so it makes sense to use puppet or similar techniques.
Afterwards, bring up the guests with:

virt-install -n <NAME> -r 256 --vcpus=1 --disk path=/dev/vg_cluster#/<LV> \
  -c /root/debian-<VERSION>-amd64-netinst.iso --vnc --noautoconsole --os-type linux \
  --os-variant debianLenny --accelerate --network=bridge:bridge0 --hvm -k de


To monitor the process use virt-viewer -c qemu+ssh://:/system .

RHCS setup

The next step is the setup of RHCS itself. Again, first things first, the software: apt-get install redhat-cluster-suite. This pulls quite a number of services which are not needed in our example:
invoke-rc.d nfs-kernel-server stop
invoke-rc.d nfs-common stop
invoke-rc.d portmap stop
update-rc.d -f nfs-kernel-server remove
update-rc.d -f nfs-common remove
update-rc.d -f portmap remove


Btw., system-config-cluster is not available for Lenny, but our Philipp Hübner has created a backport:

wget --no-check-certificate https://www.credativ.com/~phu/lenny-backports/system-config-cluster/system-config-cluster_1.0.53-1_all.deb
dpkg -i system-config-cluster_1.0.53-1_all.deb
apt-get -f install
apt-get install xauth


In order to have locking on the LVM cluster, you now need to modify /etc/lvm/lvm.conf: check for the global part.

 locking_type = 3
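
If the file is managed by hand on several nodes, the change can be scripted. This is a sketch assuming GNU sed and an lvm.conf that already contains an uncommented locking_type line; set_locking_type is a made-up helper, and it keeps a copy of the original file with a .bak suffix:

```shell
#!/bin/bash
# set_locking_type FILE [TYPE] (hypothetical helper): rewrite the existing,
# uncommented locking_type line in FILE, keeping its indentation.
set_locking_type() {
    local file=$1 type=${2:-3}
    cp -p "$file" "$file.bak"
    sed -i -E "s/^([[:space:]]*)locking_type[[:space:]]*=.*/\1locking_type = $type/" "$file"
}
```

Commented-out lines are left alone because the pattern only accepts whitespace before the keyword. Usage would be set_locking_type /etc/lvm/lvm.conf.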


With the newer kernels the module lock_dlm also vanished, so the CMAN init script must be modified: comment out the line modprobe lock_dlm 2>&1 || return 1. Additionally, RHCS 2 only supports Xen, so for libvirt you need to load the resource handler vm.sh.

wget --no-check-certificate https://www.credativ.com/~phu/vm.sh -O /usr/share/cluster/vm.sh
chmod +x /usr/share/cluster/vm.sh

RHCS itself is called via

/etc/init.d/cman start
/etc/init.d/clvm start
/etc/init.d/rgmanager start
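
The order matters, and there is little point in starting rgmanager if cman or clvm did not come up. A small wrapper can enforce that. This is a sketch only: start_cluster_stack is a made-up name, and the optional directory argument exists purely so the function can be tried against dummy scripts:

```shell
#!/bin/bash
# start_cluster_stack [INIT_DIR] (hypothetical helper): start cman, clvm and
# rgmanager in that order and stop at the first failure.
start_cluster_stack() {
    local init_dir=${1:-/etc/init.d} svc
    for svc in cman clvm rgmanager; do
        if ! "$init_dir/$svc" start; then
            echo "starting $svc failed, not continuing" >&2
            return 1
        fi
    done
}
```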

Fencing

Fencing describes the automagical neutralization of nodes which cease to function properly. In our example we use a power plug which can be controlled via network, NETIO-230A. Currently there is no real fence agent available for the device, but an existing Python library offers the necessary background to quickly write one.

Closing words

This howto has shown the setup of RHCS on Debian in easy steps - but of course, the correct steps depend very much on the targeted services, so this is just an example. If you need help just ask - Open Source HA solutions are our speciality, and we offer services and support for KVM virtualization as part of our day to day business.

This week, credativ launches its Open Source Support Card. With this card Open Source Support can be bought at a fixed price - without a binding contract.

After a long preparation phase we are now offering our trusted services in a new, simple format; with the Open Source Support Card you get a fixed contingent of project-specific, pre-paid services.


Customers using the Open Source Support Card have the unique advantage of full cost control; the card can be purchased as a product, without any obligation to sign an agreement for a specific length of time. This may be of particular benefit to larger companies, where new contracts have to be reviewed and cross-reviewed before they can be authorised. The advantages of the new pre-paid support format include:

  • Open Source Support for a specific project
  • Support not restricted to a specific number of desktops and servers within a company
  • A tempting price model, starting at just £480
  • Full control of costs
  • Support available via telephone, e-mail and remote access
  • Bilingual support - help given in English or even German, if required! ;-)
  • Cost of support NOT determined by the number of CPUs or users
  • NO binding contract - easy way to purchase
  • NO call centre - direct access to the experts
  • Support units can be used for the following services:
    • administration
    • installation (remote)
    • consultancy

All support is provided to the usual credativ standard. Just as you would expect from our usual contracts, the cost of the service is not determined by the number of CPUs, users, or DB entries. Support units purchased through the Support Card can be used for all related problems within a company - no matter which workstation or server they come up on. The support itself is provided by our Open Source Support Centre: you won't have to deal with non-technical staff or battle through FAQ scripts - our Linux experts and Open Source specialists are on hand to take calls directly. Many of us are actively involved in contributing to a number of Open Source projects - as regular readers will already be aware. ;-)

The new Open Source Support Card is also an exciting development for the wider Open Source community. By offering yet another attractive support option for free distributions, we hope to prove that there is now no reason not to consider Debian and CentOS as viable alternatives to commercial distributions.

The Open Source Support Card is designed and marketed in such a way that resellers can also get on board, making access to support that bit easier for consumers: imagine purchasing your server online and while you're at it being able to drop a Support Card into the shopping basket as well - Open Source Support with just one click!

Currently the Support Card is just available for Debian and CentOS in the UK and in Germany, although we will soon be offering it in the US and Canada too. If you have any questions or comments we'd be pleased to hear from you - we've put a lot of effort into this new product, and are looking forward to the response from our customers and the wider community.

[Tip] Auto rotate images

Many digital cameras today do not just save an image, but also save various meta data in the Exif standard. This data includes information about the orientation of the camera when the image was taken (such as vertical or horizontal). However, some image programs use this data to rotate the image when displaying it while others don't, leaving the user to face inconsistent behaviour.

This can be fixed with the tool exiftran; it automatically rotates all images according to the Exif data, which it discards afterwards. It is also very easy to use for mass conversion:

# apt-get install exiftran
$ find . -type f -iname '*.jpg' -print0 | xargs -0 exiftran -ai

This tool might be shipped by other distributions under a different name. Fedora, for example, calls it fbida.

RHCS: an Introduction

The Red Hat Cluster Suite is a framework to bind two or more machines together to jointly handle one task. The following article gives an introduction to RHCS in terms of service failover.

Linux is used daily in mission-critical environments all over the world. It follows that Linux can be required to fulfil a range of needs with relation to availability and stability.  The Red Hat Cluster Suite (RHCS) is designed with these needs in mind; it enables the admin to set up a cluster of machines which all handle the same task or provide the same service. If the machine providing the service goes down, another machine then steps in and takes over.

Core elements of RHCS

RHCS consists of four core components:
  • cluster infrastructure
  • high availability service management
  • tools for the cluster administration
  • Linux virtual server routing
The cluster infrastructure includes all the core components necessary for the set up and running of a cluster of several nodes. These components manage the integration of nodes, shutting them down where problems occur (fencing), replicating the configuration and so on.

After the cluster has been set up the next step is to define the high availability service management. This is a service running on one node with other nodes configured for failover. The HA service management includes defining the service, start/stop scripts, ports, storage places and other resources as well as the priority of the different failover nodes.

The next core component is not so much a necessary key element but more a set of helpful tools: the cluster administration tools. In theory they are not critical to the running of the RHCS, although in practice it would be stupid to run the RHCS without them. They incorporate GUI tools, web pages for accessing cluster data and tools for status queries, among other things.

The situation is similar for the Linux virtual server routing; although RHCS documentation lists Linux virtual server routing as a core component, this functionality is not always needed as it "only" provides load balancing functions on IP level and re-routes the traffic when a node breaks down.

Besides these official core components of RHCS the system can incorporate other services when they are available: GFS (Global File System) and Cluster Logical Volume Manager. They help with mounting network block devices, making storage management much easier.

Structure of a RHCS Cluster

To create an initial RHCS cluster a substantial set of machines is needed:

  1. Shared storage like iSCSI or Fibre Channel.
  2. For each node a method to detach it from the cluster (fencing), either by network or by a controllable power switch.
  3. At least two nodes with a network connection.
  4. A switch.

It is important that the shared storage is not running on one of the nodes itself - that would render the idea of fencing useless. Also keep in mind that the machines listed here only describe the minimum hardware configuration - a larger cluster would of course require many more nodes.

Closing words

RHCS offers a well thought out framework for managing a cluster, especially when it comes to service failover. Using RHCS makes securing your mission-critical systems easy, and makes them highly available with standard hardware.

The R in RHCS implies that this method only runs on RHEL machines - but this is not the case, as we will demonstrate in one of our upcoming articles.

The administration of a large number of servers can be quite tiresome without a central configuration management. This article gives a first introduction into the configuration management tool, Puppet.

Introduction

In our daily work at the Open Source Support Center we maintain a large number of servers. Managing larger clusters or setups means maintaining dozens of machines with an almost identical configuration and only slight variations, if any. Without central configuration management, making small changes to the configuration would mean repeating the same step on all machines. This is where Puppet comes into play.

As with all configuration management tools, Puppet uses a central server which manages the configuration. The clients query the server on a regular basis for new configuration via an encrypted connection. If a new configuration is found, it is imported as the server instructs: the client imports new files, modifies rights, starts services and executes commands, whatever the server says. The advantages are obvious:
  • Each configuration change is done only once, regardless of the actual number of maintained servers. Unnecessary - and pretty boring - repetition is avoided, lucky us!
  • The configuration is streamlined for all machines, which makes it much easier to maintain.
  • A central infrastructure makes it easier to quickly get an overview about the setup - "running around" is not necessary anymore.
  • Last but not least, a central configuration tree enables you to incorporate a simple version control of your configuration: for example, playing back the configuration "PRE-UPDATE" on all machines of an entire setup only takes a couple of commands!

Technical workflow

Puppet consists of a central server, called "Puppet Master", and the clients, called "Nodes". The nodes query the master for the current configuration. The master responds with a list of configuration and management items: files, services which have to be running, commands which need to be executed, and so on - the possibilities are practically endless:
  • The master can hand over files which the node copies to a defined place - if the file does not already exist there.
  • The node is asked to check certain file and directory permissions and to correct them if necessary.
  • Depending upon the operating system, the node checks the state of services and starts or stops them. It can also check whether packages are installed and up to date.
  • The master can force the node to execute arbitrary commands.
Of course, in general all tasks can be fulfilled by handing over files from the master to the client. However, in more complex setups this kind of behaviour is not easily arranged, nor does it simplify the setup. Puppet's strength is that it facilitates abstract system tasks (restart services, ensure installed packages, add users, etc.), regardless of the actual changed files in the background. You can even use the same statement in Puppet to configure different versions of Linux or Unix.
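The idea of describing a target state rather than a list of file changes can be illustrated with a small sketch. This is not Puppet code and not how Puppet is implemented internally: the dictionary keys, the state strings and the function name converge are assumptions made up for this illustration. It only shows that a node acts on the differences between its current state and the state the master describes, so a second run has nothing left to do.

```python
# Conceptual sketch only: a dict stands in for a node's state.
# The resource names and state strings are hypothetical, not Puppet internals.

def converge(current, desired):
    """Bring `current` in line with `desired`; return the actions taken."""
    actions = []
    for resource, wanted in sorted(desired.items()):
        if current.get(resource) != wanted:
            actions.append(f"{resource}: {current.get(resource)} -> {wanted}")
            current[resource] = wanted
    return actions


node_state = {"package:openssh-server": "installed", "service:ssh": "stopped"}
catalog = {"package:openssh-server": "installed", "service:ssh": "running"}

first_run = converge(node_state, catalog)
second_run = converge(node_state, catalog)

print(first_run)   # only the stopped service needs attention
print(second_run)  # nothing left to do
```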

Installation

First, you need the master, the center of all the configuration you want to manage:
apt-get install puppetmaster
Puppet expects that all machines in the network have FQDNs - but that should be the case anyway in a well maintained network.

Other machines become a node by installing the Puppet client:
apt-get install puppet

Puppet, main configuration

The Puppet nodes do not need to be configured - by default they look for a machine called "puppet" in the local network. As long as that name points to the master you do not have to do anything else.

Since the master provides files to the nodes, the internal file server must be configured accordingly. There are different solutions for the internal file server, depending on the needs of your setup. For example, it might be better for your setup to store all files you provide to the nodes in one place, and the actual configuration you provide to the nodes somewhere else. However, in our example we keep the files and the configuration for the nodes close together, as outlined in Puppet's Best Practice Guide and in the Module Configuration part of the Puppet documentation. Thus, it is enough to change the file /etc/puppet/fileserver.conf to:
[modules]
allow 192.168.0.1/24
allow *.credativ.de
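
To make the intent of the two allow lines concrete, here is a small sketch of how they can be read: a client is accepted if its address lies in the given network or its host name matches the domain pattern. This is only an interpretation of the rules written with Python's standard library, not Puppet's actual matching code; the function name is_allowed and the sample hosts are made up for this example. Note that 192.168.0.1/24 has host bits set, so the sketch has to pass strict=False to get the network 192.168.0.0/24.

```python
import ipaddress
from fnmatch import fnmatch

# The two rules from fileserver.conf, expressed with the standard library.
# strict=False is needed because 192.168.0.1/24 names a host inside the network.
ALLOWED_NETWORK = ipaddress.ip_network("192.168.0.1/24", strict=False)
ALLOWED_DOMAIN = "*.credativ.de"


def is_allowed(ip, hostname):
    """Return True if either allow rule matches (hypothetical helper)."""
    in_network = ipaddress.ip_address(ip) in ALLOWED_NETWORK
    in_domain = fnmatch(hostname.lower(), ALLOWED_DOMAIN)
    return in_network or in_domain


print(ALLOWED_NETWORK)
print(is_allowed("192.168.0.42", "unknown.example.org"))
print(is_allowed("10.0.0.5", "node1.credativ.de"))
print(is_allowed("10.0.0.5", "node1.example.org"))
```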

Configuration of the configuration - Modules

Puppet's way of managing configuration is to use sets of tasks grouped by topic. For example, all tasks related to SSH should go into the module "ssh", while all tasks related to Apache should be placed in the module "apache" and so on. These sets of tasks are called "Modules" and are the core of Puppet - in a perfect Puppet setup everything is defined in modules! We will explain the structure of an SSH module to highlight the basics and ideas behind Puppet's modules. We will also try to stay close to the Best Practice Guide to make it easier to check back against the Puppet documentation.

Please note, however, that this is only an example: in a real-world setup the SSH configuration would be a bit more dynamic, but here we focus on simple and easy-to-understand methods.

The SSH module

We have the following requirements:
  1. The package openssh-server must be installed and be the newest version.
  2. Each node's sshd_config file has to be the same as the one saved on the master.
  3. In the event that the sshd_config is changed on any node, the sshd service should be restarted.
  4. The user credativ needs to have certain files in his/her directory $HOME/.ssh.
To comply with these requirements we start by creating some necessary paths:
mkdir -p /etc/puppet/modules/ssh/manifests
mkdir -p /etc/puppet/modules/ssh/files
The directory "manifests" contains the actual configuration instructions of the module and the directory "files" provides the files we hand over to the clients.

The instructions themselves are written down in init.pp in the "manifests" directory. The set of instructions to fulfil aims 1 - 4 is grouped in a so-called "class". For each task the class contains one subsection, a resource of a certain type. So in our case we have four resources, one for each aim:
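class ssh {
        # Aim 1: keep the SSH server package installed and up to date.
        package { "openssh-server":
                ensure => latest,
        }
        # Aim 2: sshd_config must match the copy stored on the master.
        file { "/etc/ssh/sshd_config":
                owner   => "root",
                group   => "root",
                mode    => "0644",
                source  => "puppet:///ssh/sshd_config",
        }
        # Aim 3: keep the service running and restart it when sshd_config changes.
        service { "ssh":
                ensure     => running,
                hasrestart => true,
                subscribe  => File["/etc/ssh/sshd_config"],
        }
        # Aim 4: manage the user's .ssh directory and everything in it.
        file { "/home/credativ/.ssh":
                ensure  => directory,
                owner   => "credativ",
                group   => "credativ",
                mode    => "0600",
                recurse => true,
                source  => "puppet:///ssh/ssh",
        }
}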
Each type is another task and calls another action on the node:
package
Here we make sure that the package openssh-server is installed in the newest version.
file
A file on the node is compared with the version on the master and overwritten if necessary. Also, owner, group and permissions are adjusted.
service
As the name says, this deals with services: in our case the service ssh (the sshd daemon) must be running on the node. Also, in case the file /etc/ssh/sshd_config is modified, the service is restarted automatically.
file
Here we have the file type again, but this time we compare not a single file but an entire directory.
As mentioned above, the files and directories which the master provides to the nodes must be available in the directory /etc/puppet/modules/ssh/files/.
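
The interplay between the managed sshd_config file and the subscribing service can be sketched as follows. This is a simplified model, not Puppet's implementation: a dictionary stands in for the node's file system, and the function names apply_file and run as well as the event strings are invented for this illustration. The point is that the restart only happens on runs in which the file actually changed.

```python
# Simplified model: a dict stands in for the node's file system.
# Function names and event strings are hypothetical.

SSHD_CONFIG = "/etc/ssh/sshd_config"


def apply_file(node_files, path, master_content):
    """Overwrite the node's copy if it differs; report whether it changed."""
    if node_files.get(path) == master_content:
        return False
    node_files[path] = master_content
    return True


def run(node_files, master_content):
    """One agent run: sync the file, restart the service only on change."""
    events = []
    if apply_file(node_files, SSHD_CONFIG, master_content):
        events.append("file updated")
        events.append("service ssh restarted")  # effect of the subscribe
    return events


node_files = {SSHD_CONFIG: "Port 22\n"}
first_run = run(node_files, "Port 2222\n")
second_run = run(node_files, "Port 2222\n")

print(first_run)
print(second_run)
```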

Nodes and modules

We now have three parts: the master, the nodes and the modules. The next step is to tell the master which nodes are related to which modules. First, you must make the module known to the master by adding it to /etc/puppet/manifests/modules.pp:
import "ssh"
Next, you need to modify /etc/puppet/manifests/nodes.pp. This specifies which module is loaded for which node, and which modules should be loaded as default in the event that a node does not have a special entry. The entries for the nodes support inheritance.

So, for example, to have the module "rsyslog" ready for all nodes but the module "ssh" only ready for the node "external" you need the following entry:
node default {
    include rsyslog
}
node 'external' inherits default {
    include ssh
}
Puppet is now configured!
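
The effect of the node definitions above can be sketched in a few lines. This is only a model of the lookup described in the text (special entry if present, otherwise default, with inherited entries applied first); the data structure NODES, the function modules_for and the host name web01.example.com are assumptions for this example and say nothing about how the master resolves nodes internally.

```python
# Model of nodes.pp from the example: each entry lists its parent and modules.
NODES = {
    "default": {"inherits": None, "include": ["rsyslog"]},
    "external": {"inherits": "default", "include": ["ssh"]},
}


def modules_for(hostname):
    """Collect modules for a node, parents first (hypothetical helper)."""
    name = hostname if hostname in NODES else "default"
    chain = []
    while name is not None:
        chain.append(name)
        name = NODES[name]["inherits"]
    modules = []
    for entry in reversed(chain):
        modules.extend(NODES[entry]["include"])
    return modules


print(modules_for("external"))
print(modules_for("web01.example.com"))
```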

Certificates - secured communication between nodes and master

As mentioned above, the communication between master and node is encrypted. But that implies you have to verify the partners at least once. This can be done after a node queries the master for the first time. Whenever the master is queried by an unknown node it does not provide the default configuration but instead puts the node on a waiting list. You can check the waiting list with the command:
# puppetca --list

To accept a node and incorporate it into the Puppet system you need to sign its certificate:
# puppetca --sign external.example.com
The entire process is explained in more detail in the Puppet documentation.
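
The waiting-list workflow can be summarised in a tiny model. It deliberately ignores the actual certificates and cryptography; the class Master, its methods and the returned string are made up for this sketch and do not correspond to any Puppet API.

```python
# Toy model of the signing workflow; no real certificates are involved.
class Master:
    def __init__(self):
        self.waiting = set()  # nodes that asked but are not yet trusted
        self.signed = set()   # nodes whose certificate has been signed

    def request_config(self, node):
        """Unknown nodes get no configuration and land on the waiting list."""
        if node in self.signed:
            return "configuration"
        self.waiting.add(node)
        return None

    def sign(self, node):
        """Corresponds to the administrator running the sign command."""
        self.waiting.remove(node)
        self.signed.add(node)


master = Master()
before = master.request_config("external.example.com")
pending = sorted(master.waiting)
master.sign("external.example.com")
after = master.request_config("external.example.com")

print(before, pending, after)
```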

Closing words

The example introduced in this article is very simple - as I noted, a real world example would be more complex and dynamic. However, it is a good way to start with Puppet, and the documentation linked throughout this article will help the willing reader to dive deeper into the components of Puppet.

We here at credativ's Open Source Support Center have gained considerable experience with Puppet in recent years and really like the framework. Also, in our day-to-day support and consulting work we see the market growing as more and more customers are interested in the framework. Right now, Puppet is in the fast lane and it will be interesting to see how more established solutions like cfengine will react to this competition.

credativ employee Alexander Wirt is due to give a presentation at the German Chemnitz Linux Days about Single Sign On with Kerberos. Besides an introduction to the configuration of Kerberos, the talk will also focus on the configuration of its various services.

Kerberos is an authentication protocol which enables an admin to incorporate services and an operating system transparently into an existing setup. This makes Single Sign On possible: the user only has to enter his/her credentials once and thereafter can access any secured services and websites which support Kerberos without having to enter them again.

The Kerberos Single Sign On approach will be described by Alexander Wirt, credativ's expert on this topic, during a talk at the Chemnitz Linux Days due to take place in March. Besides the basic introduction to Kerberos based on Heimdal, he will also explain how to configure services such as SSH, Apache and IMAP. The topic of this talk will be very close to real-world usage, thus it should enable members of the audience to try it out easily themselves on their own networks. The talk will take place in German on March 13th 2010 at 15:00 in Room V4.

PostgreSQL Optimizer Bits: Semi and Anti Joins

The series "PostgreSQL Optimizer Bits" will introduce the strategies and highlights of the PostgreSQL optimiser. We start today with a new feature of PostgreSQL 8.4: Semi and Anti Joins.

Since version 8.4, PostgreSQL has offered a new optimisation strategy for certain queries: Semi and Anti Joins.

A Semi Join is a specific form of a join which returns a row of table a only if its key is also present in the associated table b; each row of a is returned at most once, however many matching rows b contains. An Anti Join is the negative form of a Semi Join: a row of table a is returned only if its key is not present in table b.

To summarize, Semi and Anti Joins are specific forms of a join which only take certain keys on the left side into account - where queries want to make sure certain keys exist, but are not concerned with the content of the matching rows. This pattern is common in Object-Relational Mappers (ORMs), which formulate such queries using EXISTS() or NOT EXISTS().
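
The difference between a plain join and the two join forms described above can be shown with a few rows of sample data. The following sketch is only an illustration of the semantics in Python, with made-up rows and function names; it is not how PostgreSQL executes these joins. Table a refers to b through a.id2 = b.id, and b deliberately contains the key 1 twice.

```python
# Sample rows (invented): a references b via a.id2 = b.id.
a = [{"id": 200, "id2": 1}, {"id": 200, "id2": 2}, {"id": 200, "id2": 3}]
b = [{"id": 1}, {"id": 1}, {"id": 3}]


def inner_join(left, right):
    """Plain join: one result per matching pair, so duplicates in b multiply."""
    return [(l, r) for l in left for r in right if l["id2"] == r["id"]]


def semi_join(left, right):
    """Like EXISTS: keep a row of `left` if at least one match exists."""
    keys = {row["id"] for row in right}
    return [row for row in left if row["id2"] in keys]


def anti_join(left, right):
    """Like NOT EXISTS: keep a row of `left` if no match exists."""
    keys = {row["id"] for row in right}
    return [row for row in left if row["id2"] not in keys]


print(len(inner_join(a, b)))
print([row["id2"] for row in semi_join(a, b)])
print([row["id2"] for row in anti_join(a, b)])
```

The plain join yields the row with id2 = 1 twice because b contains that key twice, while the semi join returns each row of a at most once.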

Compared to PostgreSQL 8.3, version 8.4 can run the same query with a much simpler and more efficient query plan. The following simple example shows this improvement: take two tables, a and b, and an EXISTS() query. We want to find the rows of a for which a matching row with a.id2 = b.id exists in b. Of course, this aim can also be accomplished by a plain join; however, this example shows how the optimizer's handling of such a query has improved.
EXPLAIN SELECT id FROM a WHERE a.id = 200 AND EXISTS(SELECT id FROM b WHERE a.id2 = b.id);
The optimiser in PostgreSQL 8.3 determines the following plan for this example. Keep in mind that the tables a and b have indexes on the columns id and id2.
                                QUERY PLAN
--------------------------------------------------------------------------
 Index Scan using a_id_idx on a  (cost=0.00..8355.27 rows=503 width=4)
   Index Cond: (id = 200)
   Filter: (subplan)
   SubPlan
     ->  Index Scan using b_id_idx on b  (cost=0.00..8.27 rows=1 width=4)
           Index Cond: ($0 = id)
In contrast, in PostgreSQL 8.4 the optimizer can use a hash Semi Join:
                                QUERY PLAN
---------------------------------------------------------------------------
 Hash Semi Join  (cost=27.52..78.16 rows=969 width=4)
   Hash Cond: (a.id2 = b.id)
   ->  Index Scan using a_id_idx on a  (cost=0.00..37.32 rows=969 width=8)
         Index Cond: (id = 200)
   ->  Hash  (cost=15.01..15.01 rows=1001 width=4)
         ->  Seq Scan on b  (cost=0.00..15.01 rows=1001 width=4)
The reduced cost of this query plan is obvious: the estimated total cost drops from 8355.27 to 78.16. Lower estimated costs generally mean fewer I/O accesses. So a more detailed analysis of such queries is worth a look in future.