SSH configuration parameters

published on 2008|10|14

Having a number of hosts that follow the same naming scheme is good, but defining each of them is repetitive and boring. Maybe we could patch SSH to achieve something like this. Take a look at an example for ssh-config:

Host myhost%02d
    User lars
    Hostname myhost$1.example.com

With such a definition, ssh myhost02 would be expanded and would resolve to myhost02.example.com.

Antipattern: chaining stateless protocol requests

published on 2008|09|24

As we all know, HTTP is a stateless protocol. We use all sorts of hacks to add state, like ext/session in PHP. While such hacks work great for a lot of use cases, we should remind ourselves that they are hacks. There is a phenomenon of state creep: coupling unrelated HTTP requests. Think of a page that references a thumbnail in an <img/> tag, where the picture is generated as needed: it is possible to generate that image in the context of the request that embeds it. The template calls a helper to generate the thumbnail, and the thumbnail is written to the file system.

While this works well for a single host, such as your personal weblog about cooking and cats, it won’t work for something serious. When you start load balancing between two webserver nodes you are set on fire, as you can’t guarantee that the image is present on the correct node (besides, you are generating the image n times, where n is the number of nodes). The solution is not that hard: pregenerate all the images with a queuing system and display “This image is currently not available” placeholders as long as they are not ready or, if there are only a few image uploads, generate them when the image is uploaded. The other option is to generate them on the fly when they are requested. If you do the latter, do it in the context of the request that tries to retrieve the image, not in the embedding context (the page that embeds the image). Generating on the fly means that you deliver your files through PHP or something similar: this is fine as long as you have an HTTP accelerator in place.
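To make the on-the-fly variant a bit more concrete, here is a minimal sketch of generating the thumbnail in the context of the image request. The function name, the cache directory and the generator callback are assumptions made up for this illustration; the actual image scaling is left to the callback.

```php
<?php
// Hypothetical sketch: thumbnailPath(), $cacheDir and $generate are
// illustrative names, not part of any framework mentioned in this post.

/**
 * Return the path of a cached thumbnail, generating it on first request.
 * Meant to be called from the script that delivers the image itself,
 * never from the template of the page that embeds it.
 */
function thumbnailPath(string $cacheDir, string $name, callable $generate): string
{
    // basename() strips directory components, so a request cannot
    // point outside of $cacheDir.
    $target = $cacheDir . '/' . basename($name);

    if (!is_file($target)) {
        // Write to a temporary file and rename it into place, so that a
        // parallel request does not pick up a half-written thumbnail.
        $tmp = tempnam($cacheDir, 'thumb');
        file_put_contents($tmp, $generate($name));
        rename($tmp, $target);
    }

    return $target;
}
```

The image endpoint would then send the appropriate headers and call readfile() on the returned path. On a load balanced setup the cache directory would still have to live on shared storage or behind the HTTP accelerator.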

One of the systems that does it the way described above is Drupal. I’ve implemented MogileFS for image storage and retrieval in Drupal and, let me say, it was not a pleasure.

On a side note: HTTP 1.1 allows resources to be fetched in parallel, which makes generating images in the wrong context even worse from a user experience point of view, as the page will not show up until each thumbnail is generated.

8 Hints out of Testing-Turmoil

published on 2008|09|19
  1. Have a continuous integration solution in place. Really. If you don’t, you just burn money by writing tests. I would go so far as to say that if you don’t have continuous integration, you should stop writing unit tests and do click testing instead. Let your CI system generate API docs, high-level docs, code coverage reports, testdox and any static analysis information you produce.
  2. The definition of “tests pass” is “tests pass on the continuous integration system”. “Works for me” has no place in the bug tracker or anywhere else.
  3. If you can’t test it, the architecture is most likely wrong (exceptions are session and caching related code, which is generally hard to test). Testability should be your main concern when writing code. What’s the use of fast or wonderful-looking code if you can’t repeatably prove it is working?
  4. Prefer method calls over annotations. A typo in setExpectedException will trigger an immediately visible error, while a typo in expectedException will lead to an Obscure Test, and most likely a Mystery Guest.
  5. Run the whole test harness twice. This will help to identify setup/teardown bugs. Create a random test suite to identify the hard-to-track mistakes.
  6. Run your test suite really often. We run it every minute with a 15-second delay and I’m pretty happy with it.
  7. Use good test names that describe the behavior of the unit. The behavior is not the name of the unit under test, which I can see in the code anyway; it is something like “calling register changes the status of the user to foobar”, so a good test name would be “testRegisterChangesTheStatus …”.
  8. Aim for 100% code coverage. 95% is nothing to be proud of; I can guarantee that the missing 5% will be the hardest part.

Recovering a software RAID

published on 2008|09|13

The scenario: my RAID crashed because I messed around with the partition table of one of its disks. As a result, the RAID array cannot assemble itself because the superblock of the messed-up device is invalid. The trick is pretty easy: just recreate the whole RAID with mdadm. The existing metadata will not be overwritten; the current information is just replicated. I used to have a simple RAID1, but I’ve now recreated it as an incomplete RAID5 (--level=5, --raid-devices=2), as the missing disk is soon to be bought.

$ mdadm --create /dev/md0 --level=5 --raid-devices=2 /dev/<original> /dev/<crashed>

If you would like to stick with RAID1 and skip the migration to RAID5 along the way, just use --level=1 instead. I’m not really sure whether the order of the disks matters, and I’m not brave enough to find out.

Tomorrow I’m going to buy the next disk for the RAID to make sure the redundancy level is alright. Generally I’m pretty amazed that this kind of setup is so robust. Even me messing around with it can’t bring it down.

Testing PHP 5.3 alpha1

published on 2008|08|03

Finally Johannes Schlüter baked a first alpha-tarball for PHP 5.3. The new version contains a huge amount of new features, like closures, namespaces and late static binding. Such a huge amount requires thorough testing: if you are using a PHP application you would like to see fully working with our brand new version or you are developing a PHP application, this is your chance to make sure everything will go smoothly. If you are a web hosting provider, do your performance benchmarks now!
If you happen to be using the Gentoo Linux distribution, I have something for you: in my personal PHP overlay you can find an ebuild for PHP 5.3.0_alpha1. A few warnings: currently, ext/fileinfo does not compile because of #45636, and of course I did not test all possible USE-flag combinations. If you experience problems with it, just leave a comment here.

Specific env vars for Gentoo packages

published on 2008|08|02

Ever since Gentoo Portage introduced package-specific configuration in /etc/portage, there has been one thing I always missed: specifying environment variables per package. Some environment variables you might want to specify per package are CFLAGS, CXXFLAGS and FEATURES. Especially when you do debugging, some packages should not be stripped, which is the perfect use case for the FEATURES environment variable. While specifying USE flags and keywords per package is easy, the rest is not. Christian Hoffmann dropped me a link to this mail: the tip there works fine. I’ve played around with it and implemented it slightly differently: first, I would like to be informed which environment files are read, and second, I changed the resolution order so that the specific configuration inherits from the more generic one. So this is what my /etc/portage/bashrc looks like now:

for conf in ${PN} ${PN}-${PV} ${PN}-${PV}-${PR}; do
    env=/etc/portage/env/${CATEGORY}/${conf}.env
    if [[ -f ${env} ]]; then
        einfo "Reading specific environment from ${env}"
        . ${env}
    fi
done

For dev-lang/php-5.2.6-r3 I can use three different files to customize the build environment: /etc/portage/env/dev-lang/php.env applies to all PHP ebuilds, /etc/portage/env/dev-lang/php-5.2.6.env to all revisions of the 5.2.6 ebuild and /etc/portage/env/dev-lang/php-5.2.6-r3.env to that exact ebuild. My /etc/portage/env/dev-lang/php.env file now looks like this, to disable stripping the binaries after emerging them and to keep the working directory for better backtraces:

FEATURES="${FEATURES} nostrip keepwork"

Antipattern: the verbose constructor

published on 2008|07|31

Constructors are often used to shortcut dependency injection and parameter passing on instantiation. This is a valid practice and often leads to shorter code. Consider the following example (a simple value object, often used to not mess around with floats and to keep currency and amount together):

class Money
{
    protected $_amount;
    protected $_currency;
    protected $_divisor;
    public function __construct(
        $amount = null, $currency = null, $divisor = null)
    {
        if ($amount !== null)
            $this->setAmount($amount);
        if ($currency !== null)
            $this->setCurrency($currency);
        if ($divisor !== null)
            $this->setDivisor($divisor);
    }
    // ... setters and getters ...
}

Now consider instantiating this object. Instead of creating a new instance of “Money” and calling three setters, everything can be done compactly in the constructor.

$money = new Money(13200, 'EUR', 100);

So for the Money object this works pretty well. The code is easy to read. But wait: the first argument can be grasped easily, the second too, but the third? It is not at all obvious that a divisor is being passed. An alternative would be to change the constructor to accept an array. This is a substitute for true named arguments, as supported by e.g. Python. Solar uses that a lot, as does the Zend Framework.

$money = new Money(
    array(
        'amount' => 13200,
        'currency' => 'EUR',
        'divisor' => 100
    )
);

Much more readable, but does your IDE’s code completion still work? And what happens if you pass “amoµnt”, because your fingers are as clumsy as mine? Exactly, the parameter will be silently ignored.
But look at this:

$money = new Money();
$money->setAmount(13200);
$money->setCurrency('EUR');
$money->setDivisor(100);

It is at least equally short and readable, your IDE works, and if you have problems with the dimensions of the keys on your keyboard (they are too small, it has nothing to do with your fingers) you will be warned. But we could have an even shorter example while maintaining the readability. With fluent interfaces we would get the following:

$money = new Money();
$money->setAmount(13200)->setCurrency('EUR')->setDivisor(100);

Wonderful! If you want, you can add a newline before each object operator and you would have the same number of lines but less dense code (sad that we don’t have fluent constructors, isn’t it?). Sometimes setters are so elegant.
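For the chaining to work, every setter has to return the object itself. A minimal sketch of how the Money class could look with fluent setters; the setter and getter bodies are assumptions, since the class above elides them:

```php
<?php
// Sketch only: the setter and getter bodies are assumed for illustration.
class Money
{
    protected $_amount;
    protected $_currency;
    protected $_divisor;

    public function setAmount($amount)
    {
        $this->_amount = $amount;
        return $this; // returning the instance is what enables chaining
    }

    public function setCurrency($currency)
    {
        $this->_currency = $currency;
        return $this;
    }

    public function setDivisor($divisor)
    {
        $this->_divisor = $divisor;
        return $this;
    }

    public function getAmount()
    {
        return $this->_amount;
    }

    public function getCurrency()
    {
        return $this->_currency;
    }

    public function getDivisor()
    {
        return $this->_divisor;
    }
}

// One call per line keeps the code airy without repeating $money.
$money = new Money();
$money->setAmount(13200)
      ->setCurrency('EUR')
      ->setDivisor(100);
```

A misspelled setter name now fails loudly with a call to an undefined method, instead of being silently ignored like a misspelled array key.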

So by now one thing should be clear: it is not just about writing the code easily, but about the next guy understanding it too. Because you never write code for yourself. Never. But let’s investigate a real-life example. I work with a framework that allows me to define really nifty business logic by just sticking together a bunch of fields, with every field having a bunch of validators and filters attached.

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username', true, null, true));
    }
    protected function _getStorageClass()
    {
        return 'UserStorage';
    }
}

Every time I write such a definition, I need to look into the code to check the order of the parameters. I can remember the first parameter, but the rest are too similar. To explain it: the second parameter specifies whether the field is required, the third expects a default value and the fourth indicates whether the value can be changed after it has been set once. I’ve talked about filters and validators, right?

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username', true, null, true))
            ->addValidator(new UniqueUserValidator())
            ->addFilter(new LowercaseFilter())
            ->addValidator(new RegexValidator('/^[a-z]+$/'));
    }
}

Definition::addField() returns the passed field object to allow adding validators and filters. What works for validators and filters, should work for the rest too, shouldn’t it?

class User extends Model
{
    protected function _define(Definition $definition)
    {
        $definition->addField(new StringField('username'))
            ->setRequired(true)
            ->setReadonly(true);
    }
}

I admit, it is a bit more code to write, but a huge improvement in readability and therefore in maintainability. Another variant, for cases where setters are not a good solution, is to create an expressive factory. For example, we have a Criteria object that creates and orders Criterion objects internally. Because we don’t have a fluent constructor, we have a static create method for the Criteria object.

$criteria = Criteria::create('User')->field('id')->equal(1);

The alternative of using only the constructor would be horrible to read and would have limitations regarding the parameter parsing capabilities (except if func_get_args() is used, which is the total opposite of the paradigm of strict APIs). But back to the constructor-only example:

$criteria = new Criteria('User', array('id' => 1));

And how would you express “id not equal 1” with it? So that’s where expressive factories are an alternative.
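To illustrate, here is a rough sketch of such an expressive factory. The real Criteria and Criterion classes are not shown in this article, so everything here (the private constructor, notEqual(), toArray() and the array representation of a criterion) is a made-up assumption for demonstration purposes:

```php
<?php
// Hypothetical sketch: method names and internals are assumptions,
// not the actual framework code.
class Criteria
{
    private $model;
    private $field = null;
    private $criteria = array();

    // Private constructor: instances are only created through create().
    private function __construct($model)
    {
        $this->model = $model;
    }

    public static function create($model)
    {
        return new self($model);
    }

    // Select the field the next comparison applies to.
    public function field($name)
    {
        $this->field = $name;
        return $this;
    }

    public function equal($value)
    {
        return $this->add('=', $value);
    }

    public function notEqual($value)
    {
        return $this->add('!=', $value);
    }

    private function add($operator, $value)
    {
        if ($this->field === null) {
            throw new LogicException('Call field() before adding a comparison');
        }
        $this->criteria[] = array($this->field, $operator, $value);
        $this->field = null;
        return $this;
    }

    // Expose the collected criteria as (field, operator, value) triples.
    public function toArray()
    {
        return $this->criteria;
    }
}

$criteria = Criteria::create('User')->field('id')->notEqual(1);
```

With this shape, “id not equal 1” reads almost like the sentence itself, and further comparisons can be added as methods without touching the constructor signature.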

Constructors, like any other method, should have as few parameters as possible but as many as needed. Obvious. The constructor should only allow setting information that is vital for the object (if the object has a name, there is a good chance that the name is a parameter of the class’ constructor because it is considered vital). And the ease of use depends heavily on whether the parameters passed can be intuitively distinguished by looking at their values. That holds both when the code is first written and when it is maintained for the rest of your life.

(There are a bunch of other tricks to make parameters more readable, like using class constants as parameters, but this is beyond the scope of this article.)

Still a dog

published on 2008|07|16

But the fact remains: even if no one knows you’re a dog, you’re still a dog. A fantastical online identity never impresses readers. Instead, it makes them think you’re more into image than substance, or that you’re simply insecure. Use your real name for all interactions, or if for some reason you require anonymity, then make up a name that sounds like a perfectly normal real name, and use it consistently.
Producing OSS – Communications

Jabber for portage 0.0.5(.1)

published on 2008|07|05

portage-mod_jabber is a small elog module for Gentoo’s portage which provides Jabber notifications for portage. It basically allows you to receive notifications via Jabber when elog events occur.

The new version of portage-mod_jabber fixes a deprecation warning with portage 2.2 (but works with older versions too) and introduces a new syntax for the JABBERFROM-variable: both node:pw@host.com/resource and node@host.com/resource:pw are allowed now.

Grab it while it’s hot!

Note: I’ve released 0.0.5.1 to fix an invalid exception catch. I caught LoadError instead of ImportError. That’s what happens when you confuse Python with Ruby.

Over abbreviated

published on 2008|06|30


© Giant Ginkgo

Matthew Weier O’Phinney announced Zend’s naming scheme for the Zend Framework from the point where PHP 5.3 namespaces are used. The issue is that the PHP parser allows neither class Abstract nor interface Interface, as both “abstract” and “interface” are reserved keywords. So Zend suggests prefixing interfaces with “I” and abstract classes with “A”. Hungarian notation for classes and interfaces.

One of the bullet points in the list of “what makes a name a good name?” is and will forever be “as short as possible, as verbose as needed”; another is “you must understand the name without studying specific rules first”. The latter is why Hungarian notation sucks so tremendously. The IFoo/ABar scheme violates both of those criteria: first, it is not as verbose as it could be with just a few more keystrokes: AbstractBar would work fine and is much clearer. Second, it introduces a special notation you have to grasp first. While AbstractBar would be as descriptive as possible, ABar is cryptic for those who are not lucky enough to practice Python programming.

While we are at it, the scheme makes it impossible to have grammatically correct names: IFoo would be read as InterfaceFoo, which really should be FooInterface. And no, the fix is not FooI.

Sound on a 3rd generation MacBook

published on 2008|06|10

Thanks to the heroes on the Mactel list I have sound again. If you use Linux on a third-generation MacBook and your sound card has been detected but no sound comes out, putting the following in /etc/modprobe.conf (on Gentoo: /etc/modprobe.d/alsa, followed by modules-update) should fix the problem. Please reboot or reload snd-hda-intel after changing the configuration. NOTE: it has been reported to work fine on MacBook Pros too.

options snd-hda-intel index=0 model=mbp3

Join us

published on 2008|06|04


You were a bit bored lately: you wanted to have time and infrastructure for unit tests and continuous integration but it was “too expensive”, you wanted a more grown up, professional structure for development, a coding style, a build system, lots of books for training, augmentative thinking about architecture and object orientation or – more general – work you can both take pride in and sleep well with. This is how we want to develop software (and we are close to it and continuously improving). We are offering two positions: senior developer and another one more suited for career starters.
You will work on various projects, including a not yet released open source framework based on the Zend Framework (and yes, we are using PHP 5.3 for development). You should be fluent in PHP 5, at least know what unit tests are and have a good understanding of object-oriented programming. And no, you don’t need to know who invented the pepper mill or the handcar or PHP.
Additional benefits include table football, a Wii, free water and coffee, and quiet workplaces.
So the ball’s in your court: if you are interested, drop me a message.

Security "to go"?

published on 2008|05|20

I’m a huge fan of PHP-IDS. Mario Heiderich and Christian Matthies did an incredible job polishing this tool, adding new features and trying to catch every esoteric attack signature. However, I have the feeling there is some confusion (German) about what intrusion detection is for. On a server, intrusion detection is used to diagnose a break-in. First of all, you do everything not to let your server go down. You have a firewall, you try not to expose services to the outside, you do SSH with port knocking, you put risky services into jails or chroots, you use the Suhosin patchset and so on. There are various strategies for hardening a server. The hardening is the barrier against break-in attempts.
If hell freezes over, the intrusion detection mechanism comes into play to make sure the attempt is not overlooked and the machine does not become yet another zombie in a botnet. PHP-IDS is an intrusion detection tool on the application level. Application firewalls know about a certain protocol and its structure (e.g. HTTP) and inspect the protocol to detect attack patterns. Some of them are even capable of learning from usual request signatures and enforcing rules based on the learned data. There are various commercial products for application firewalling. PHP-IDS does the same for free and sits directly on the webserver in the scope of the application. For personal use or for lower-budget projects that can’t afford expensive products, it might be a good supplement. Besides being a supplement, application firewalls are a valid choice when security becomes an urgent problem: a lot of heavily flawed software is designed (often it is not even designed) and developed by people who have never even heard about security: “Yes you can inject HTML, that’s a feature!”, “‘ OR true/* lists you every item, isn’t that cool?”. If such projects become popular, application firewalls might be an option to hotfix the disaster. Nevertheless, the application needs to be fixed.
The inherent issue with application firewalls is that the only place that knows exactly what proper incoming data for the application looks like is the application itself. That’s why application firewalls can never be perfect. IDS is needed for the 2% the developer forgot. So it is not like coffee to go. It is like having the coffee and adding milk or sugar. Having milk without coffee seems pointless to me anyway.

PHP Unconference Hamburg - Day 1

published on 2008|04|27

The first day at the PHP Unconference in Hamburg was quite nice. The day started with a slightly confused registration, followed by the notorious voting for sessions. Our planned talk was magically lost but I was too tired to object.
I attended two sessions. The first was “Security Development Lifecycle”, a process model developed by Microsoft to strengthen the focus on security during development. While the entire process is pretty complex, there are a few ideas and basic rules that are worth adopting. Treating security problems as show-stoppers should be obvious, classifying attack surfaces, scenarios and privacy impacts is a thankless job, and regular security training for the development team is a good idea, but do you really do it? The second session was “Ask the core developer” by Johannes Schlüter. It ended up with people pitying one another and whining a bit about missing innovation in core, an impression I don’t share.
The interesting parts were not the sessions but the corridor conversations. It’s always interesting to hear how others do PHP.

NOWDOC + double quotes = HEREDOC

published on 2008|04|12

PHP 5.3 introduces a new syntax element, NOWDOC. If you know HEREDOC, NOWDOC is easy to understand: it is in fact HEREDOC taken literally. While variables are expanded in HEREDOC, in NOWDOC they are not. Just as a reminder, a small HEREDOC example:

$value = "Hello World!";
$var = <<<LABEL
$value
LABEL;

$var will contain “Hello World!” now.

<?php
$value = "Hello World!";
$var = <<<'LABEL'
$value
LABEL;

$value is not expanded, so $var contains literally “$value”.

For consistency and the sake of completeness, an alternative syntax has been introduced:

<?php
$value = "Hello World!";
$var = <<<"LABEL"
$value
LABEL;

Guess how it behaves …
