[Howto] Zarafa mail extraction

Mail administrators who use SpamAssassin and its Bayesian filter need to train the classifier on a regular basis. SpamAssassin's sa-learn utility needs to process full RFC 822 emails, including the mail headers and the mail body. This howto describes how full emails can be extracted from the Zarafa mail system.

A Zarafa environment utilizes a MySQL database to maintain its users, mail stores, folders and other "mail objects". Two operating modes exist for Zarafa's attachment-storage:

  • In database operation mode, all objects and object types are stored within the database, including emails, headers and attachments.
  • In files operation mode, however, potentially large objects, namely attachments and full emails (which include their attachments as multipart blocks), are stored as files on the server's file system.
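
Which of the two modes is active can usually be read from the server configuration. The following shell sketch assumes that the settings live in /etc/zarafa/server.cfg as plain "key = value" lines; that path and the helper name zarafa_cfg_get are assumptions made for illustration, so adjust them to your installation.

```shell
# Hypothetical helper: print the value of a "key = value" setting.
# Assumption: the config file is /etc/zarafa/server.cfg unless a second
# argument names another file. Commented-out lines are ignored, and if a
# key appears more than once, the last occurrence is printed.
zarafa_cfg_get() {
    key=$1
    cfg=${2:-/etc/zarafa/server.cfg}
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" "$cfg" |
        sed 's/[[:space:]]*$//' |
        tail -n 1
}

# Example calls (uncomment on a Zarafa server):
# zarafa_cfg_get attachment_storage    # "database" or "files"
# zarafa_cfg_get attachment_path       # base directory used in files mode
```

The second argument makes it possible to try the helper against a copy of the configuration file before pointing it at a live system.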

The following points are particularly interesting:

  • The email header alone is found in the properties database table.
  • Data regarding user accounts are distributed amongst the users and stores tables.
  • When in database operation mode, Zarafa stores full emails in the lob table or, if using files operation mode, it stores them on the server's filesystem, typically under /var/lib/zarafa/....
  • Relationships between related data objects of multiple tables are established by instance IDs (SQL key attributes).

In files mode, the complete path of a gzip compressed full email has the form {attachment_path}/{INT}/{INT}/{Instance_ID}.gz, where the attachment_path equals /var/lib/zarafa in Zarafa's standard configuration. Its subfolders derive from the instance ID, so a concrete sample path might be /var/lib/zarafa/4/6/43664.gz.
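
The two sample paths in this howto (instance ID 43664 above and 54321 in the script's comments) are both consistent with the first subfolder being the instance ID modulo 10 and the second being the instance ID divided by 10, modulo 20. The shell sketch below encodes that inferred rule. It is an assumption drawn from those two examples rather than a documented guarantee, and zarafa_instance_path is a made-up helper name.

```shell
# Hypothetical helper: build the path of a gzip-compressed mail from its
# instance ID. The subfolder rule (id % 10, then (id / 10) % 20) is inferred
# from the sample paths in this howto, not taken from Zarafa documentation.
# The optional second argument overrides the assumed default attachment_path.
zarafa_instance_path() {
    iid=$1
    base=${2:-/var/lib/zarafa}
    printf '%s/%s/%s/%s.gz\n' "$base" "$((iid % 10))" "$((iid / 10 % 20))" "$iid"
}

# Example (uncomment on a server running in files mode):
# gzip -dc "$(zarafa_instance_path 43664)" > 43664.eml
```

Verify the computed path against a known mail on your own system before relying on it.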

The following script reads a string from standard input and searches for matching email headers in the properties database table. For every hit, the corresponding full email is extracted from the lob database table, so it is important that Zarafa runs with attachment_storage=database. Note that the SQL queries can put a high load on the database, depending on its size. The script has been tested with a standard installation of Zarafa 7.0.9. A sample run would be:

[root@zarafa ~]# cat needle.txt 
Subject: test Mon, 04 Feb 2013 15:05:12 +0100
[root@zarafa ~]# mkdir haystack; ./zarafa-extract.pl zarafauser haystack < needle.txt
 Writing haystack/1359986776.0000.txt ...
[root@zarafa ~]# cat haystack/1359986776.0000.txt
Date: Mon, 04 Feb 2013 15:05:12 +0100
To: zarafauser
From: zarafa.testsystem.intern
Subject: test Mon, 04 Feb 2013 15:05:12 +0100
X-Mailer: swaks v20100211.0 jetmore.org/john/code/swaks/
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_MIME_BOUNDARY_000_14806"

------=_MIME_BOUNDARY_000_14806
Content-Type: text/plain

This is a test mailing
------=_MIME_BOUNDARY_000_14806
Content-Type: application/octet-stream; name="test.xml"
Content-Description: test.xml
Content-Disposition: attachment; filename="test.xml"
Content-Transfer-Encoding: BASE64

SGVsbG8gV29ybGQK

The following Perl script uses both DBI and DBD::mysql modules, which are included in the libdbd-mysql-perl package under Debian, for example.

#!/usr/bin/perl -w

# Copyright (c) 2013, Damian Lukowski <damian.lukowski@credativ.de>
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#    * Redistributions of source code must retain the above copyright
#      notice, this list of conditions and the following disclaimer.
#    * Redistributions in binary form must reproduce the above copyright
#      notice, this list of conditions and the following disclaimer in the
#      documentation and/or other materials provided with the distribution.
#    * Neither the name of the <organization> nor the
#      names of its contributors may be used to endorse or promote products
#      derived from this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
# WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL <COPYRIGHT HOLDER> BE LIABLE FOR ANY
# DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
# (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
# ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
# SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

use strict;
use DBI;
use DBD::mysql;

my $DEBUG = 0;

# Fill in the MySQL server properties
my ($dbhost,$dbport,$dbname,$dbuser,$dbpass) = ('', 3306, 'zarafa', '', '');

if (@ARGV < 2)
{
	print "Usage:	$0 <zarafa-user> <outdir>\n
	$0 reads a mail header from STDIN and searches
	for zarafa-user's emails matching the given mail header.
	Each result is stored as a separate file within <outdir>.
	It is assumed that zarafa is configured with
	attachment_storage = database!\n";
	exit;
}

my ($username, $outdir) = @ARGV;

chomp(my $needle = join '', <STDIN>);
$needle = "%$needle%";
# RFC822 demands CR LF as a line separator. Convert bare LF only, so that
# input which already uses CR LF is left untouched.
$needle =~ s/(?<!\r)\n/\r\n/g;

my $dbh = DBI->connect("dbi:mysql:$dbname:$dbhost:$dbport", $dbuser, $dbpass)
	  or die $DBI::errstr;

my ($qry, $ref, $sth, $uid);

if ($username ne '.')
{

	# Use a placeholder so that quotes in the username cannot break the query.
	$qry = "SELECT user_id FROM stores WHERE user_name = ?";
	$ref = $dbh->selectall_arrayref($qry, undef, $username);

	if (@$ref > 1)
	{
		print "Ambiguous username. Matching userids: ",
		join ', ', map {$_->[0]} @$ref;
		exit;
	};

	die "Username not found\n" unless @$ref;
	$uid = $$ref[0][0];
	print "uid of $username: $uid\n" if $DEBUG;
};

if ($DEBUG && $username ne '.')
{
	$qry = "SELECT COUNT(*) FROM properties";
	printf "%d properties\n", ${$dbh->selectall_arrayref($qry)}[0][0];

	$qry = "SELECT COUNT(*) FROM lob";
	printf "%d large objects\n", ${$dbh->selectall_arrayref($qry)}[0][0];

	$qry = "SELECT COUNT(*) FROM properties WHERE tag = 125";
	printf "%d properties with tag 125 (headers)\n",
		${$dbh->selectall_arrayref($qry)}[0][0];

	$qry = "SELECT COUNT(*) FROM properties WHERE tag = 125 "
	     . "AND val_string LIKE ?";
	$sth = $dbh->prepare($qry);
	$sth->execute($needle);
	printf "%d tag-125 properties matching needle\n",
		${$sth->fetchrow_arrayref}[0];

        $qry = "SELECT h.owner, s.user_name, p.hierarchyid FROM "
             . "hierarchy h RIGHT JOIN properties p on h.id = p.hierarchyid "
             . "LEFT JOIN stores s ON s.user_id=h.owner WHERE p.tag = 125 "
             . "AND p.val_string LIKE ?";

	$sth = $dbh->prepare($qry);
	$sth->execute($needle);
	while (my @row = $sth->fetchrow_array())
	{
		printf " uid=%s, username=%s, hierarchyid=%s\n", @row;
	}

	$qry = "SELECT count(*) FROM hierarchy h JOIN properties p on "
	     . "h.id = p.hierarchyid LEFT JOIN singleinstances s on "
	     . "s.hierarchyid = h.id WHERE p.tag = 125 AND h.owner = $uid";
	printf "%d tag-125 properties of %s\n",
		${$dbh->selectall_arrayref($qry)}[0][0], $username;

	$qry = "SELECT h.owner, p.hierarchyid, s.instanceid FROM hierarchy h "
	     . "JOIN properties p on h.id = p.hierarchyid LEFT JOIN "
	     . "singleinstances s on s.hierarchyid = h.id WHERE "
	     . "h.owner = $uid AND p.tag = 125 AND p.val_string LIKE ?";
	$sth = $dbh->prepare($qry);
	$sth->execute($needle);
	$ref = $sth->fetchall_arrayref();
	printf "%d tag-125 properties of %s matching needle\n",
		(scalar @$ref), $username;

	for my $row (@$ref)
	{
		printf " uid=%s, hierarchyid=%s, instanceid=%s\n", @$row;
	}
}

# A tag value of 125 corresponds to the mapitag definition as found in zarafa's
# sources under mapi4linux/include/mapitags.h:
# define PR_TRANSPORT_MESSAGE_HEADERS          PROP_TAG(PT_TSTRING,    0x007D)
# define PR_TRANSPORT_MESSAGE_HEADERS_W        PROP_TAG(PT_UNICODE,    0x007D)
# define PR_TRANSPORT_MESSAGE_HEADERS_A        PROP_TAG(PT_STRING8,    0x007D)

if ($username ne '.')
{
	$qry	= '(SELECT s.instanceid FROM properties p JOIN hierarchy h '
		. 'ON h.id = p.hierarchyid JOIN singleinstances s ON '
		. "s.hierarchyid = h.id WHERE h.owner = $uid AND p.tag=125 AND "
		. 'p.val_string LIKE ?)';	
} else
{
	$qry	= '(SELECT s.instanceid FROM properties p '
		. 'JOIN singleinstances s ON '
		. "s.hierarchyid = p.hierarchyid WHERE p.tag=125 AND "
		. 'p.val_string LIKE ?)';
}

$sth = $dbh->prepare($qry);
$sth->execute($needle);

my $iid = undef;

# Hint:
# For zarafa deployments with attachment_storage = files, the requested mails
# are not stored in the lob table, but within /var/lib/zarafa/. Let iid be
# 54321, then the mail is stored in /var/lib/zarafa/1/12/54321.gz
# The extraction logic is not covered here ...

my $written = 0;

while (my @row = $sth->fetchrow_array())
{
	$iid = $row[0];
	$qry = "SELECT val_binary FROM lob WHERE instanceid = $iid";
	$ref = $dbh->selectall_arrayref($qry);

	print "Looking for full mail of iid=$iid\n" if $DEBUG;

	for my $row (@$ref)
	{
		print " Found full mail with iid=$iid\n" if $DEBUG;
		my $tmp_file = sprintf '%s/%s.%04d.txt', $outdir,
				time, $written++;
		open my $fh, '>', $tmp_file
			or die "Cannot write $tmp_file: $!\n";
		binmode $fh;
		print " Writing $tmp_file ...\n";
		print $fh $row->[0];
		close $fh;
	};
};

unless (defined $iid)
{
	print "No iid found, probably because the header didn't match.\n";
	print "If you are sure that the header should be found, try to use ",
	      "'.' (dot) as the username.\nThe DB query will be slower but ",
	      "might find the email.\n" unless $username eq '.';
} elsif ($written == 0)
{
	print	"Instanceid(s) found, but no files written? ", 
		"Probably attachment_storage != database\n";
}
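
Once the mails have been written to the output directory, they can be handed to sa-learn, which is the purpose stated at the start of this howto. The following is a minimal shell sketch, assuming that sa-learn is in the PATH and is run as the user whose Bayes database should be trained. The function name train_bayes and the SA_LEARN override are hypothetical and only used for this example.

```shell
# Hypothetical wrapper: train SpamAssassin's Bayes classifier with every
# mail in a directory.
# Usage: train_bayes spam|ham <dir>
# Set SA_LEARN=echo for a dry run that only prints the arguments.
train_bayes() {
    class=$1
    dir=$2
    case $class in
        spam|ham) ;;
        *) echo "class must be 'spam' or 'ham'" >&2; return 2 ;;
    esac
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    ${SA_LEARN:-sa-learn} "--$class" "$dir"
}

# Example (uncomment after a successful extraction run):
# train_bayes spam haystack
```

Keeping spam and ham in separate output directories avoids training the classifier with the wrong label.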

Case Study: Munich City Council

In the "Case Study" category we present recent customer projects. This case study focuses on one of our most prolific public sector projects to date, the administration of the Bavarian Capital, the City of Munich.

credativ GmbH have been providing support for the LiMux-Project of Munich's city council since 2008, during which time some 15,000 workstations have been migrated to Linux. After years of planning and a long start phase, the migration recently made real headway.

At the end of November 2012, the goal of the project was finally realized, with more than 12,000 migrated LiMux desktops in operation. The LiMux project has been hailed as a "Success Story" of mass migration in the public sector, with financial savings from switching away from Microsoft Windows and Microsoft Office amounting to more than 10 million euros.

Each Linux desktop is installed, configured and administrated by the council's so-called distribution server. The two main components are the LDAP-based configuration and administration management solution, GOsa², and the Installation Management tool FAI (Fully Automatic Installation).

These backend components have been the main focus of credativ's work for the City of Munich to date. Part of this work has involved planning and developing new features to facilitate daily administrative tasks. Another focus was to improve the scalability in mass installations, whereby the mass rollout of migration was made possible. In the course of this, credativ were able to eliminate many bugs, both small and complex, and standardise the user interface.

LiMux project lead, Peter Hoffman said,

"We have been completely satisfied throughout our collaboration with credativ GmbH in recent years. As a local, open-source oriented, medium-sized service company, credativ GmbH embodies our commitment to advancing the LiMux Project independently of the manufacturer and with open standards".

The main reason for the required work is the nature of the use of GOsa² by the LiMux project: GOsa² is basically a web application that enables the management of users and associated services such as email or file-sharing. The City of Munich does not use GOsa² to create users or change user data (a different program is used to do this), but for configuring users and administering workstations. Additionally, GOsa² is used extensively by the City of Munich to manage a large number of clients and users.

Figure: Main menu of GOsa²

Most of this functionality has been developed specifically for the migration requirements of the City of Munich and is used rarely, if ever, elsewhere. The main areas where GOsa² is deployed at the City of Munich are:

1. Configuring users' (or groups of users') desktop settings, such as: desktop shortcuts, start menu entries, login / logoff scripts, shares, printers, etc. These settings are queried in LDAP when a user logs in to a workstation and are configured by a corresponding script.

2. Configuring workstations (or groups of workstations) and distribution servers, such as LDAP / NTP server, system-wide shared services and printers. Various services can also be configured for (distribution) servers, such as some LDAP, NTP, or logging servers, as well as software repositories.

3. Configuration of FAI classes for software distribution. FAI is usually managed through text files; changing the different installation profiles and their partitioning schemes, package lists and configuration scripts is simple when the data is stored in LDAP with the GOsa² Web application.

4. Remote maintenance of workstations and distribution servers is possible with the GOsa-SI client / server system, through which systems can be restarted, reinstalled, or switched on and off. This client / server communication enables messages to be sent to logged-in users and allows monitoring of the progress of a system installation.

credativ implemented the extensive changes detailed here over the last few years. By mid-2011, the established code-base of the LiMux project was based on version 2.6 of GOsa² and was publicly available in the 2.6-lhm-branch in the Subversion repository of GOsa². credativ have added over 250 Change Sets to this branch since 2008.

In autumn 2011 it was decided to move the LiMux Project to the latest version of GOsa², 2.7. To achieve this, changes which were still necessary but not yet integrated into the main development branch were ported to the 2.7 code-base by the end of 2011. This resulted in about 90 changesets.

In GOsa² 2.7, software distribution with FAI and system administration with GOsa-SI had been gradually replaced by new projects, which meant some of the functions used by the City of Munich no longer worked properly. Since the beginning of 2012, these broken functions have been fixed or reimplemented, along with a number of new problems found during extensive testing of GOsa² 2.7. A series of new features have also been implemented. This has led to a further 200 changesets and 4000 new lines of code (while eliminating 1000 lines of code) in the space of around 4 months. This work was made available in the summer of 2012 in a public git repository and sent back upstream to the Open Source Community.

Concerning the paid work, Michael Dusel, Director of the LiMux Project's workstation development, said:

"The years of successful collaboration with credativ GmbH employees have been an absolute pleasure. Particularly during the migration from GOsa² 2.6 to 2.7, we were able to access their prompt and professional support. This meant that newfound problems were quickly resolved by the developers at credativ during the test phase in GOsa² 2.7 and significant new functionality was implemented to our satisfaction."

In addition to the work on GOsa², credativ GmbH has also provided extra support for the City of Munich. Various problems with the workstations have been successfully diagnosed; Debian specialists at credativ GmbH have, for example, supported the packaging and backporting of various packages (such as Firefox and its addons).

In summary, it can be said that the LiMux project is well on its way and we at credativ are proud of our part in it. We hope that the success of this project at the City of Munich will encourage other councils or public sector organisations to consider similar migrations, and we would be glad to offer our expertise and support. If you would like to know more please email limux@credativ.com.

Software Freedom Day 2012!

credativ UK once again celebrated Software Freedom Day this month. Although it was one of just 4 registered events taking place in the UK, Software Freedom Day is a global event. A lot of time and preparation went into making sure it was as successful as last year, and it paid off. Around a hundred visitors passed through the doors, drawn by various press releases, letters to local councillors and schools, and online marketing efforts from the sponsors, who included the local Rugby LUG (Linux User Group), social enterprise Cultivating Communities, and OpusVL, among other open source service providers in the area.

This year, organizers wanted to particularly attract teachers and representatives from local schools, as well as volunteers who might be interested in signing up to host or help with a CodeClub. The event itself proved popular with people of all ages and walks of life.

In the entrance hall visitors were met by experts volunteering advice and information; they could sample free software and enjoy refreshments cooked by the Indian cuisine group held weekly in the Benn Partnership Centre.

Craig Barnes, a regular at the Rugby LUG, had designed and built a drum kit with home-made pads and electronics, which was operated by an Arduino using open source software. This was great fun and popular with all ages.

Visitors enthused about the "hands-on" tech experience in the Arduino, Raspberry Pi and Scratch workshops, which were full for most of the day.

Photo: Working hard in the Arduino room.

Photo: Using Scratch to operate Raspberry Pi computers.

Scratch was quickly mastered by the kids passing through this room, who then received a certificate.

Rugby's local MP, Mark Pawsey, welcomed the opportunity to raise awareness about free software and open source technologies in the town. He appeared in press coverage beforehand and has been supportive of efforts by credativ and the Open Source Consortium in campaigning for government adoption of open source software during the past year.

Photo: Mark Pawsey with Stuart Mackintosh of OpusVL at Software Freedom Day.

Avon Valley School donated 20 PCs for use on the day, which were used to demo Linux based operating systems and open source applications.

Visitors thought the event was "well-presented", "inspiring" and an "excellent learning day", run by "very enthusiastic and informative experts". We would like to say thank you to all our wonderful volunteers who did such a fantastic job.

Photo: A few of our volunteers.

Based on the success of the event we are planning further events throughout the year.

The UK government is currently consulting on the use of Open Standards and Open Source as an alternative for proprietary software. The proposed policy is being attacked by the large corporations who dominate the market of UK public sector IT and want to ensure government policy does not undermine their market share.

Respond to the Consultation

As part of the open source community we have a responsibility to respond to this Consultation - not just passively dis/agree with it.

The closing date for submitting responses is fast approaching and coincides with election date next week, 3 May. It is the number of responses, rather than the amount written which will have the most impact - i.e. even if you can't answer every question, it's important just to respond explaining why you want Royalty Free Open Standards. So please, if you haven't already, do write in and encourage colleagues, family and friends to do so too.

Use 'writetothem' to write to your MP

It's important to draw the attention of MPs to this; they are trying to gather support at the moment so it could be convenient timing. Remember, if your company is in a different constituency from your home, you could write to both.

  1. Go to http://www.writetothem.com/
  2. Type your home postcode
  3. Click on the name under 'Your Member of Parliament'
  4. Fill in the box with your text and submit.

Suggested content for the letter to your MP

  • Introduce the Policy Exchange event on 30 April: Open standards for open government? and encourage them to attend. Even if they can't attend themselves, ask if they can send one of their London-based staff to attend in their place.
  • Encourage them to submit a response to the Consultation - perhaps offer to be available to discuss it or help them fill it in.
  • It doesn't have to be long and complicated - make it personal, don't send duplicate letters; explain what you regard to be the most pressing implications of failing to adopt royalty free open standards.

Let us know when you have written! Also, don't forget to add a link to the writetothem site and/or the Consultation to your company website/blog/other social media channels to get more support.

credativ is one of the partners carrying out the 2012 Future of Open Source Survey. The goal is to reach 1,000 respondents, so if you're passionate about Open Source, please fill it in. Your help is extremely important to the success of this year's survey.

The following blog post was originally published by SandHill.com and details the findings of last year's survey.

It's exciting to see the evolution of Open Source evident in findings in our Future of Open Source survey over the past five years. Open Source has now moved beyond the tipping point it reached last year in the private sector and now is mainstream in businesses of all sizes and even in mission-critical applications. In fact Open Source is clearly playing a central role in the cloud and mobile segments and has become a driver of innovative solutions in new spaces like Big Data.

The Pace of Open Source Adoption and What's Driving It

Fifty-six percent of the survey participants predicted that Open Source will comprise 50 percent or more of software purchases in five years. This has more than tripled over the last few years. Respondents continue to identify a turbulent economy as a primary reason why Open Source software is attractive. But perhaps more interestingly, they also indicated that avoiding vendor lock-in is a top driver for adopting Open Source solutions.

What this Means for Open Source Vendors

According to the survey (455 respondents, 40 percent of whom were vendors), people think the number-one impact on vendors will be Software as a Service (SaaS), which has a profound impact on revenue sources and business models.

How OSS Vendors will Make Money in the Future

A lot of Open Source vendors used to make their money through custom development; in fact, it was the single biggest source of revenue (26 percent). But when asked how that will change over the next five years, they predicted that it will shrink dramatically to being fourth on the list, dropping to only 17 percent. They believe that SaaS will be the way in which Open Source vendors will make their money. As the figure below shows, they believe that digital services - that is, value-add subscriptions and SaaS - will grow by about 25 percent over the next two years.

This makes perfect sense if you think about the way customers want to consume Software as a Service, and the stronger potential business model of SaaS for Open Source vendors in responding to that need. (For example, Gartner predicts that SaaS will grow from a $10 billion industry today representing just 10 percent of enterprise application spending to double that value by 2015.) Further, SaaS offers a much more flexible and agile approach to productize offerings rather than trying to rebuild a specific solution each time for every customer through professional services.

Accordingly, I think in the Open Source industry as a whole we will see companies getting better at putting digital or web services and SaaS capabilities in place over the next two years and moving away from so much dependence on services.

Also in looking at SaaS from another angle, I see Open Source continuing to be a driver in the build-up of the software stacks that enable SaaS and Cloud-based solutions. SaaS and Cloud together and Open Source will thus become synonymous with each other.

Open Source is Driving New Sectors like Mobile

We have been tracking it as a fund and survey participants identified Mobile as the sector that will be most disrupted by Open Source over the next five years. For example, it is notable that the 3,800 new mobile projects in 2010 were for newer platforms.

Clearly developers aren't targeting legacy platforms like Windows mobile. Ninety-four percent of those new projects were for Apple iOS or Android. Android, you can understand as it is fundamentally open; but it's good to know it's penetrating the iOS stack. We're investing at the intersection of OSS, Cloud and Mobile in companies like Apperian.

Macro Shifts in Open Source - Innovation in Areas like Big Data

To me, what's most exciting at a macro level is to see that Open Source is shifting from commoditization to innovation. For example, take the Big Data space. Open Source is the only way to address such a big problem space because you need a body as large as a community, rather than a single vendor, to tackle the problems of Big Data. Hence you see the innovation around Hadoop and the many companies it has spawned like Cloudera, MapR and the plethora of others.

Also, there is a new class of developers who are building on Open Source who are not willing to take the de facto legacy standard of relational databases and SQL as the solution. Through their innovation on frameworks like Spring, Hibernate and Ruby, they've learned that, in many instances, the best object relational mappings would benefit more from new innovative non-SQL, or so-called "NoSQL" approaches. In many instances these will be far superior to the traditional legacy approaches that came from closed-source giants like Oracle and others.

At the same time, I think it's also very interesting that Open Source is taking the lead in so-called NewSQL. For many transactional and mission-critical applications, where consistency is so important to data for the reliability of the transactions, SQL is still an incredibly powerful and natural way to program. But as applications scale to enormous levels with cloud and mobile applications accessing services in the cloud, there needs to be some new means to reinvigorate SQL. Hence NewSQL. Here we're investing in companies like Akiban to complement our NoSQL investment in Couchbase.

Either way, I believe Open Source is going to be the driver of new approaches to solving the Big Data problem because the breadth of solutions needed will require the depth of the Open Source community to address the challenge of data expanding at a rate of 50x over the next decade.

The Potential Traps for Open Source

Despite all the positives we're seeing in the Open Source market, there is a potential negative side. Open Source is not a panacea in and of itself; just because software is written in Open Source, it doesn't mean it will be better.

Open Source is not immune to the bloatware trap. It killed many legacy software companies and Open Source does not necessarily mean best practice. So we have to be careful as an industry not to fall into the trap of just adding more and more features in the name of progress. This is a trap that stifles innovation as more dollars go to maintenance of bloated codebases.

Fortunately communities tend to self-correct and recognize this and trend toward what I call "leanware". For example, I always celebrate projects with tight core kernels that are modular and extensible like Drupal.

However more broadly, feature functionality in many instances was not well thought through by vendors in the legacy approach of the closed-source world. We need to look at how Open Source can stay closer to the customer, and in the best of scenarios, the community will be the customer.

What Barriers Remain to Open Source Adoption?

At a macro level, the barriers to Open Source adoption have certainly changed over the years.

Initially, they were things like policy constraints or legal issues; now we're seeing much more traditional kinds of barriers that we see with any selection of software. According to the survey respondents, the top three barriers to Open Source selection are:

  • Lack of internal technical skills
  • Unfamiliarity with the Open Source solutions
  • Lack of commercial vendor support

Clearly, these could be barriers for any enterprise software selection. We used to see issues about security and licensing as the top barriers.

Next Set of Value Propositions around Open Source

In responding to a survey question on how Open Source impacts the manageability of applications, 53 percent said manageability is better or at least the same with Open Source. But as we drilled down further and talked with companies about the survey, they explained that in the past they considered Open Source a complexity - something they worried might infect their own software or something that would be difficult to manage as a component of their overall stack. They explained that's no longer the case. Open Source software is maturing and is properly packaged and properly delivered.

However, as Open Source gets more widely used, new considerations come into focus such as:

  • Once we've built the software, how do we deploy it?
  • When we've deployed it, how do we patch it and update it?
  • How can we make it more effective in the rest of the life cycle, moving it from development to operations? (The so-called dev-ops challenge)

Therein lies the path for the next set of value propositions around Open Source. I think cloud will play a role here because of the emergence of Platforms as a Service (PaaS) in the cloud.

The cycle of managing configuration, deployment and updating of traditional software is still an issue. That challenge may never go away unless there is a fundamental shift. The cloud and PaaS represent that shift. Instead of dealing with all the underlying components of the delivery application, companies will deal with everything being packaged as a service, either as infrastructure, as a platform or at the top of the stack as an application.

Cloud computing will embody, or contain or be built on Open Source. We're very much on that path today. The more cloud accelerates, the more it pulls Open Source. Examples of it are everywhere. Google is built almost entirely on Open Source, as is Amazon Web Services. And there are many examples of pure-play Open Source companies that are addressing the need for interoperability in the cloud by using Open Source. An example of this is how Eucalyptus is building an Amazon-compatible platform on Open Source.

So overall there's a significant shift occurring, and while I think we'll continue to see Open Source delivering more value disrupting and commoditizing existing software categories, I expect to see it both enabling new categories like Cloud and, perhaps most excitingly, driving new areas of innovation in areas like Big Data. It's an exciting time and I look forward to what the year ahead holds for us all.

Results of North Bridge Venture Partners' fifth annual open source survey, conducted in partnership with The 451 Group, were unveiled at the 2011 Open Source Business Conference in San Francisco. Click here to see a slide presentation of the survey findings including up-and-coming open source companies and some cool open source projects mentioned by survey participants.

Getting Spotify to work nicely on Linux

Note: the Linux Spotify client will only work with a premium Spotify account.

I spoke at the NYC PostgreSQL Users' Group meeting in December, and while there someone mentioned that Spotify is a great music service (and that they are using PostgreSQL!). So I decided to give it a try. The issue was that, while it can be made to work on Linux, the process of making it work well on Linux is less than simple. I decided to document what I did (and my sources) as I had to pull information from several sources and added a few modifications of my own.

There are two main problems to deal with:


  1. Getting the program itself installed and running

  2. Getting Linux and your browser to handle the spotify protocol so that, for example, clicking on playlist URLs will work correctly

The answer to problem number one depends in part on your Linux distribution. I am only going to cover Ubuntu and Fedora here -- extrapolation is left as an exercise for the reader.
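Which of the two recipes below applies can be checked from the shell. The following is only a rough sketch: the function name `detect_distro` is made up for illustration, and it relies on the marker files `/etc/fedora-release` and `/etc/debian_version` (Ubuntu ships the latter as well). The optional argument is a root prefix, which is only there to make the function easy to try out against a scratch directory.

```shell
#!/bin/bash
# Hypothetical helper: guess the distribution family from marker files.
# $1 (optional) is a root prefix, e.g. a scratch directory for testing.
detect_distro() {
    local root="${1:-}"
    if [ -f "$root/etc/fedora-release" ]; then
        echo fedora
    elif [ -f "$root/etc/debian_version" ]; then
        # Debian and Ubuntu both provide this file
        echo debian-like
    else
        echo unknown
    fi
}
```

Calling `detect_distro` without an argument inspects the running system.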

On Ubuntu (I'm using 11.10), the directions from Spotify seem to work fine. I'll paste them here for the sake of completeness:

# On Ubuntu
# This gets you the older released client
# From http://www.spotify.com/us/download/previews/
# -----------
# 1. Add this line to your list of repositories by
#    editing your /etc/apt/sources.list
deb http://repository.spotify.com stable non-free

# 2. If you want to verify the downloaded packages,
#    you will need to add our public key
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 4E9CFF4E

# 3. Run apt-get update
sudo apt-get update

# 4. Install spotify!
sudo apt-get install spotify-client-qt
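
Step 1 above can also be scripted rather than done in an editor. This is a minimal sketch, not part of Spotify's directions: `add_line_once` is a hypothetical helper that appends a line to a file only if that exact line is not already present, so running it twice does not duplicate the repository entry.

```shell
#!/bin/bash
# Hypothetical helper: append a line to a file only if it is not there yet.
add_line_once() {
    local line="$1" file="$2"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example (needs root, so it is left commented out here):
# add_line_once "deb http://repository.spotify.com stable non-free" /etc/apt/sources.list
```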

I just noticed that the Ubuntu directions result in the older client working, not the shiny new preview version. The instructions below get the preview client working:

# On Ubuntu, new preview client
# From 
# http://getsatisfaction.com/spotify/topics/try_out_the_linux_apps_client_beta_preview
# -----------
wget \
http://download.spotify.com/preview/spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
ar vx spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
tar -xzvf data.tar.gz
cp -rf ./usr /

# From 
# http://meltingrobot.wordpress.com/2011/11/08/spotify-installation-on-fedora-16/
# modified to handle passing of arguments

vi /usr/local/bin/spotify

# add the following lines to /usr/local/bin/spotify
8<--------------------------
#!/bin/bash

/bin/rm -rf ~/.cache/spotify
/usr/share/spotify/spotify "$@"
8<--------------------------

# make the script executable
chmod +x /usr/local/bin/spotify

# arrange to use the script in place of the binary to work
# around a known issue causing segfaults
rm /usr/bin/spotify
ln -s /usr/local/bin/spotify /usr/bin/spotify

# create symlinks to work around library mismatches
ln -s /usr/lib/x86_64-linux-gnu/libplc4.so /usr/lib/x86_64-linux-gnu/libplc4.so.0d
ln -s /usr/lib/x86_64-linux-gnu/libnspr4.so /usr/lib/x86_64-linux-gnu/libnspr4.so.0d
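
A note on the wrapper script's argument passing: how the arguments are forwarded only matters once one of them contains whitespace or glob characters, which spotify URLs normally do not. The small demonstration below (the function names are made up for illustration) shows the difference between an unquoted `$*` and `"$@"`.

```shell
#!/bin/bash
# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Forwarding with an unquoted $* re-splits the arguments on whitespace.
forward_star() { count_args $*; }

# Forwarding with "$@" passes each argument through unchanged.
forward_at() { count_args "$@"; }

forward_star "spotify:user:me" "my playlist"   # prints 3
forward_at   "spotify:user:me" "my playlist"   # prints 2
```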

On Fedora things are complicated by the fact that Spotify no longer distributes an RPM - at least not that I could find. There are several recipes for solving this dilemma that can be found scattered around the Internet. Here is what I used:

# On Fedora (I am on Fedora 15)
# From http://www.passwdshadow.com/
yum -y install perl-ExtUtils-MakeMaker gcc qt-webkit rpm-build git
cd /tmp
git clone git://git.kitenet.net/alien
cd alien
perl Makefile.PL; make; make install
wget \
http://download.spotify.com/preview/spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
/usr/local/bin/alien --to-rpm \
spotify-client_0.8.0.1031.ga1569aa.552-1_amd64.deb
rpm -Uvh --nodeps spotify-client-0.8.0.1031.ga1569aa.552-2.x86_64.rpm
ln -s /usr/lib64/libssl.so.1.0.0e /usr/lib64/libssl.so.0.9.8
ln -s /lib64/libcrypto.so.1.0.0e /lib64/libcrypto.so.0.9.8
ln -s /usr/lib64/libnss3.so /usr/lib64/libnss3.so.1d
ln -s /usr/lib64/libnssutil3.so /usr/lib64/libnssutil3.so.1d
ln -s /usr/lib64/libsmime3.so /usr/lib64/libsmime3.so.1d
ln -s /lib64/libplc4.so /lib64/libplc4.so.0d
ln -s /lib64/libnspr4.so /lib64/libnspr4.so.0d

# From 
# http://meltingrobot.wordpress.com/2011/11/08/spotify-installation-on-fedora-16/
# modified to handle passing of arguments

vi /usr/local/bin/spotify

# add the following lines to /usr/local/bin/spotify
8<--------------------------
#!/bin/bash

/bin/rm -rf ~/.cache/spotify
/usr/bin/spotify.bin "$@"
8<--------------------------

# make the script executable
chmod +x /usr/local/bin/spotify

# arrange to use the script in place of the binary to work
# around a known issue causing segfaults
mv /usr/bin/spotify /usr/bin/spotify.bin
ln -s /usr/local/bin/spotify /usr/bin/spotify
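
The library symlinks above assume that every target exists at exactly that path and that the link name is still free, which may not hold on a different release. As a more cautious variant, here is a sketch of a hypothetical helper, `link_if_missing`, that checks both conditions before calling `ln -s`.

```shell
#!/bin/bash
# Hypothetical helper: create a compatibility symlink only when the target
# exists and nothing is already sitting at the link name.
link_if_missing() {
    local target="$1" link="$2"
    if [ ! -e "$target" ]; then
        echo "skip: $target not found" >&2
        return 1
    fi
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "skip: $link already exists" >&2
        return 0
    fi
    ln -s "$target" "$link"
}

# Example (needs root, so it is left commented out here):
# link_if_missing /usr/lib64/libnss3.so /usr/lib64/libnss3.so.1d
```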

At this point you should be able to click on the Spotify desktop shortcut and the program will launch.

So on to problem number two. One of the key features of Spotify is the ability to share playlists. This is done via a "spotify" protocol URL. Unfortunately, at this point neither Linux nor your browser knows how to handle this protocol. I have only worked out the specifics for GNOME and Firefox, but here they are:

# Handling the spotify protocol -- e.g. to allow use of http://sharemyplaylists.com
# From http://kb.mozillazine.org/Register_protocol
# -------------------------------------------------
# At shell command prompt:
gconftool-2 -s \
/desktop/gnome/url-handlers/spotify/command '/usr/bin/spotify %s' --type String
gconftool-2 -s \
/desktop/gnome/url-handlers/spotify/enabled --type Boolean true

# In Firefox:
#    Type about:config into the Location Bar (address bar) and press Enter.
#    Right-click -> New -> Boolean
#          -> Name: network.protocol-handler.expose.spotify
#          -> Value -> false
# Next time you click a link of protocol-type spotify you will be asked
# which application to open it with. Select /usr/bin/spotify
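
The Firefox preference can also be set without clicking through about:config, by adding a `user_pref` line to the `user.js` file in your Firefox profile directory. This is a sketch only: `add_bool_pref` is a hypothetical helper, and the location of the profile directory differs per user, so it is passed in rather than guessed.

```shell
#!/bin/bash
# Hypothetical helper: add a boolean preference to a Firefox user.js file
# unless a line for that preference is already present.
add_bool_pref() {
    local userjs="$1" name="$2" value="$3"
    # -F: fixed-string match on the start of an existing user_pref line
    grep -qF -- "user_pref(\"$name\"" "$userjs" 2>/dev/null && return 0
    printf 'user_pref("%s", %s);\n' "$name" "$value" >> "$userjs"
}

# Example; PROFILE_DIR must point at your own profile directory:
# add_bool_pref "$PROFILE_DIR/user.js" network.protocol-handler.expose.spotify false
```

Firefox reads `user.js` at startup, so the browser needs a restart before the setting takes effect.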

I think that's everything. I used the preceding successfully on my Fedora 15 desktop and my Ubuntu 11.10 laptop. But use at your own risk -- no guarantees that the foregoing will work or will not eat your data ;-)

Hope this helps someone else!

OpenERP is increasingly becoming a serious contender in the ERP market, as its features and usability improve. This tip explains how its flexible views provide ways to save time retrieving the data you want. You can search for the exact data you are interested in by just filling out the necessary search filters, but if there is a search which you perform on a regular basis, you have the option of saving filters to make life easier.

This feature was added to the OpenERP web client in version 6.0.0 and is available to all users. Simply go to any tree view and you will notice a drop-down selection box labelled '-- Filters --' at the bottom right of the search filters; this is where the magic happens.

filters_open_scaled.png

To get started creating your own saved filter, enter the search as normal and after pressing the 'Search' button, simply select 'Save filter' from the drop down box, give it a name and press 'Save'.

filter_save_scaled.png

If you already have a filter defined for the view you are currently in, select your filter from the options available and it will be immediately applied on top of your current search results, replacing the currently selected filter if one is already selected.

For power users, there are ways to customize filters exactly how you want them. You can select 'Manage Filters' from the drop-down box, or go to Administration -> Customization -> Low Level Objects -> Actions -> Filters, where you will see a list of all filters for all views and can even fine-tune the search domain that each filter uses.

At the time of writing there are a few pitfalls to watch out for. When it comes to editing your filters, it may be easier to create a new one from scratch. The reason is that if you apply a filter together with other search parameters and then save, both search conditions are saved to the same filter rather than the new condition replacing the existing one. You may end up with far fewer results than you would expect, and might not even get any at all! Also, the hint text for the filter suggests that if a filter is not assigned to a user (by making it 'False'), it will be viewable by all users; this does not currently work but should be resolved in the next release.

All tips in this blog can be found in the Tip Category. Should you need further Support for Linux, you've come to the right place at credativ.

Careers update - credativ UK

| No Comments

This month, credativ is pleased to welcome two new members of staff to the team in Rugby; as a leading specialist in Linux and Free Software, we are expanding in order to accommodate the growing demand for our services.

At credativ we invest in our employees - their growth and development is important to us and, by working for a dynamic company which is constantly evolving, our employees gain exposure to a diverse range of opportunities which may not be available so readily in larger, more traditional organizations.

Current Opportunities
credativ UK is looking for competent open source developers and assistant free software engineers to join our development team permanently. We have a small development team in the UK, so you need to learn quickly and be the kind of person who gets things done and cares about your coding and craftsmanship. Current customer projects using Python and Ruby include:

  • back and front end web development for customers to use for training and recording of information security processes

  • creating and advancing modules for business enterprise systems

Work will involve supporting existing systems, improving free software packages and deploying new technologies for customers.

    What do we do?
    We develop and support business software solutions using free software; our key business areas are consulting, development, support and training. credativ supports a diverse range of clients and has a long history of contributing to free software projects.

    Our technical team is actively involved with software projects such as Debian, PostgreSQL and OpenERP, among others. Over the last decade, credativ has expanded from Germany to the UK, US, Canada and India, and worked to maintain excellent relationships with other free software organisations, companies and upstream projects. This means we have extensive links with the wider free software community and a vast pool of resources we can tap in to for the benefit of our customers.

    What do we use?
    Our platforms run on Linux and are all built using open source technologies. We use Python, Ruby, Rails, PostgreSQL, Django, Apache, C++, Git, and whatever is the best tool for the job. We use lightweight agile development processes, with a strong emphasis on test driven development; we like to get involved in user groups and open source community initiatives.

    Skills and Requirements
    Solid development skills, a hunger for learning new things and enthusiasm are the most important things. We are using some interesting technologies to solve some interesting problems, so a good approach to problem-solving is a must.
    Our platforms use a few core technologies; the more of them you are familiar with, the better. Here is a short list; for detailed job descriptions please see the careers pages on our website.

  • Linux

  • PostgreSQL, MySQL

  • Python, other object oriented programming languages

  • Ruby

  • Ruby on Rails

  • Open Object, OpenERP

  • GNU/Linux system administration
    How to get in touch
    Please send your CV, a covering letter, and links to your blog, github or any open source project contributions to careers@credativ.co.uk



    Saturday 17th September - credativ employees were among the volunteers of Rugby Linux Users Group (LUG) who held an open day at The Benn Partnership Centre in Rugby to promote Free Software to the local community.

    Software Freedom Day is celebrated in over 100 countries worldwide; Linux User Groups in cities all over the UK organise their own initiatives, but this is the first year that the Rugby LUG has held the event.

    "The idea of Software Freedom Day is to educate the public about the many benefits of using high-quality, free open source software that is available. Many people may be using Free Software already, without even realising it." says Nick Morrott, from the group. "By increasing awareness of Free Software, our vision is to empower everybody to be able to freely connect, create and share in a digital world that is participatory, transparent, and sustainable."

    At the free event, there were representatives from the Free and Open Source software community, as well as from local ICT companies who specialise in FOSS for the home, education and business.

    A specialised session for members of the education sector took place prior to the main event, where they could discuss their needs on a one-to-one basis with the specialists.

    Volunteers had set up a range of PCs, laptops and projectors so that visitors could experience first-hand the huge range of free software available. A wide range of software for the home, education and business was on hand, suitable for common tasks such as photo and video editing, multimedia, office and productivity, and games. Visitors could try out these free and open applications on both Microsoft and Linux platforms and benefit from informal and free advice, with the opportunity to arrange further follow-up sessions if required. Many took away information on the solutions as well as "Software Freedom Day" T-shirts, free CDs and memory sticks loaded with software.

    The event was a great success, bringing in over 100 visitors, from individuals wanting to know how FOSS can be applied to their home computers, to representatives from larger organisations, charities and local government.

    Rugby LUG hopes to organise a similar event next year. For more details, please see the group's website, http://www.rugby.lug.org.uk, or contact us at info@credativ.co.uk

    Rugby, UK - 19 July 2011: credativ have been helping clients to benefit from OpenERP for three years now. During this time we have made 10 deployments from our UK office, including customising and writing additional accounting, invoicing, VAT, and reporting modules for specific client requests, many of which have been sent back upstream to further the development of the project.

    Our engineers have also worked on integrating with Magento web stores and have merged many branches into OpenERP Server -web and -addons, including improving the performance of the web client by running it behind mod_wsgi. During the OpenERP Community Days in March this year, one of credativ's Consultants became the first community member (non-OpenERP employee) to have contributed to the upcoming web client 6.1.

    As part of each deployment, credativ have delivered user and admin training to customer candidates. Here we take a look at the journey with Made.com, with whom our UK experts have been working closely over the past year to design, develop and deploy further functionality in OpenERP.

    About Made.com
    Made.com is an ecommerce business in the home furnishings sector, selling indoor and outdoor furniture, artwork and leisure products directly from overseas factories to customers in the UK. They currently have 30 employees, based in London and Shanghai, with a warehouse near Ipswich. Their systems have 25 users, including the fulfilment, sourcing, customer service and quality control teams, as well as financial staff who use the data to manage the monthly ins and outs.

    The Challenge
    The Made.com business model is unique in that they take orders for stock prior to ordering it from suppliers, enabling them to fill containers and order in quantities high enough to keep the overall costs down, and thereby pass significant savings on to customers.
    Due to the long order lead time, it was important to Made.com to be able to automatically email customers at various stages along the process, such as when manufacture had completed and their item had been loaded into a container, and when it was about to be despatched from the warehouse. All this would help them deliver as transparent and seamless a user experience as possible to their customers.

    Requirements
    In order to provide accurate delivery information to their customers, they needed to be able to allocate stock to customers before it arrived at the warehouse in Ipswich. This allocation information is used throughout the day in the customer service department in response to customer queries, and is also passed to the Magento website to provide the user with up-to-the-minute information about the status and expected arrival date of their order. This requirement meant that they needed a software solution which could be modified to let them allocate customer orders to purchase orders before the stock arrived, which is not default behaviour.

    Solutions considered
    As the challenge was not based on an existing system, the solution was designed from scratch by Andy Skipper, Chief Technology Officer at Made.com, together with credativ and the Magento development agency used to build the initial version of the website. Made.com had considered several other stock management and ERP systems, including SAP, Netsuite and ERPLY, but decided that OpenERP offered the most flexibility and agile development capabilities. By selecting an open source solution, the whole process was vastly more cost-effective than it would have been had they tried to modify a proprietary system.

    Skipper recalls, “credativ provided a very focused development resource, and were capable of providing solutions to the complexities that were uncovered through the process. The Made.com ERP development project is a very good example of how using a modular and flexible core system can provide a comparatively fast turnaround for a large and complex system. credativ offered very impressive technical ability and project management to enable the project to be completed to our requirements and budget. We plan on using them for further open source projects in the future.”
     
    Future Plans
    credativ are now working on the reporting capability of OpenERP to improve usability for non-technical users. Made.com may potentially move all of their product database to OpenERP, in order to remove the use of static files in that part of the company. credativ are now also creating modules to integrate OpenERP with Metapak and Asterisk.

    Contact us for more information on how open source could benefit your business.