Specifications

I recently read one of Joel’s blog posts on how difficult it still is to reverse engineer a Microsoft Office document, even though Microsoft have now released their specifications for the formats. The problems I’ve been facing are nowhere near the order of magnitude faced by a developer attempting to reverse engineer one of Microsoft’s Office documents, but as some of you may know I’ve been attempting (mostly with success – more tomorrow on that) to create a clone of the Amazon S3 service from their freely published documentation.

The problem is that it’s quite easy to replicate the ‘happy path’ of the specification as that’s been quite clearly documented, but when you try to recreate how and when different errors are thrown from just the documentation, things become a little bit more murky. Say the documentation states that the service throws different errors depending on whether the Content-MD5 or the Content-Length doesn’t match what the service calculated from the data it received; how do you know which error will be sent first, given that if one condition fails the other will quite likely fail too? The specification doesn’t answer this, and my view is that it probably shouldn’t: these sorts of questions are best left to developer forums, as sometimes a specification can be so detailed that no-one ever reads it!
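To make the ambiguity concrete, here is a minimal Python sketch (not how S3 actually does it, which is exactly what the documentation leaves open). The function name `check_upload` and the `order` parameter are made up for illustration, and the two error names are only placeholders for the errors the documentation describes. A truncated upload fails both checks, so the error a client sees depends purely on which check the server happens to run first.

```python
import base64
import hashlib


def check_upload(body, headers, order=("length", "md5")):
    """Return the name of the first failing check's error, or None if all pass.

    `order` is a hypothetical knob: it decides which check runs first,
    which is the detail the documentation does not pin down.
    """
    def check_length():
        # Compare the declared length with the bytes actually received.
        if int(headers["Content-Length"]) != len(body):
            return "IncompleteBody"  # placeholder error name
        return None

    def check_md5():
        # Content-MD5 is the base64 encoding of the raw MD5 digest.
        digest = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
        if headers["Content-MD5"] != digest:
            return "BadDigest"  # placeholder error name
        return None

    checks = {"length": check_length, "md5": check_md5}
    for name in order:
        error = checks[name]()
        if error is not None:
            return error
    return None
```

If the headers describe an 11 byte body and only the first 5 bytes arrive, `order=("length", "md5")` reports the length error and `order=("md5", "length")` reports the digest error. Both are defensible readings of the same documentation.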

Then today I was thinking on my way to my parents’ house that maybe I was wrong to create the back end database layer first and I should have stuck with a contract first approach, but later on my way home I remembered the reason I didn’t: the Amazon S3 REST service doesn’t have a contract, it has documentation – which simply isn’t the same. The S3 SOAP service does have a contract of sorts – its WSDL – but even that doesn’t help you recreate or describe the ‘unhappy path’ of the underlying service. The only real way you can do this is to write tests against the real service and hope that they (the people who own the service) don’t change it much and that your tests map out most of the potential paths which exist. Even better, if the specification came with a downloadable set of software tests (JUnit et al) then that would make building a client even easier … a baseline reference implementation of sorts.

Simply put, contract first development works well when you own both the software behind the contract and the contract itself. I’m not fully convinced it works as well when you have neither and you’re trying to clone a service. I could write tests against S3, but that would mean signing up and possibly breaking the T&Cs, and this project wasn’t meant to threaten S3 or get me sued, but to understand it and the fundamental principles of well behaved web based services it bases itself on. I guess I’m someone who likes to take things apart to see how they work and that’s what I’ve done.

Also from my current experience it’s harder to develop a REST service than it is (in theory) a SOAP service, BUT I think a REST service is easier for clients to consume than SOAP. That is simply because SOAP has massive interoperability problems between tool kits, as the SOAP specification is itself ambiguous and is in small parts incompatible with several languages. REST has none of this because it is based on the great HTTP RFC 2616, which the entire web is based on (including the majority of SOAP based services).

I have no solutions, just more questions and that generally isn’t a bad thing!

Windows

Dear Windows,

I’m not sure how I’m going to say this, but I think our time together is at an end. I’m not sure that we were ever that suited to each other even when we first met; around that time I’d recently left Commodore and I think I was just looking for something different.

We’ve been together for many years now (15 years I believe) and I’ve seen you change from 3.1 to 95, 2000, XP and now finally Vista. I guess we’ve just grown apart and we simply don’t have the same interests anymore; you’re more into your business work and I’m still just a hacker at heart, a side of me I don’t think you ever satisfied or appreciated. I should tell you that while at University I met Solaris (who knows Unix and Linux) and that’s where my doubts about you started, but even then I stuck by you as they had serious issues which meant I couldn’t imagine living with them.

Many years ago I met Apple at work and although we didn’t like each other at first (OS 7 – 9), a couple of years ago I cheated on you with Apple by buying a Mac mini. Now after having fun with Apple for a couple of years I’ve decided that Apple and I are better suited than you and I ever were. You may think this is just a phase I’m going through and you may be right, but I think I need to try, so Apple is moving in next week (MacBook Pro is on its way!) and therefore I’m sorry to say you’ll need to move out (and BT says I can’t keep you). I’m sure we’ll see each other around the office.

Take care,
Milan

Flock

I first tried Flock when it came out in its initial beta many many moons ago, but with the recent death of Netscape and some fortunate stumbling I downloaded the 1.1 beta release (now final) and gave it a go…

Now for those who have never tried Flock as a browser, I can best describe it as Firefox on the inside with a social networking wrapper on the outside. My default layout for Flock (shown below – I lost the picture) is to have a left hand ‘People’ frame showing all updates from my Twitter, Facebook and Flickr friends, sorted by the most recent updates first, and a media stream of pictures my friends have uploaded to Facebook (although that’s usually hidden to gain browser space).

The one thing which does really annoy me is that some people update their Facebook status using Twitter and obviously this causes duplicates in my people feed (not Flock’s fault). I can see why they do that, but for me a tweet is different from a status update; it’s just plain lazy and pointless duplication.

You also have a ‘My World’ page which aggregates all of this, as well as any Atom/RSS feeds I have, into a single page view. And if that wasn’t enough you can save your bookmarks to del.icio.us, post directly to Twitter and Facebook and even write a blog posting. All of which I think is pretty damn cool.

OK, that’s all great and I do use it as my default browser, but what’s wrong with this picture … (I lost the picture)

Personally I think at the moment this is a cool, but ultimately fringe, browser for people who are interested in social networking or earn a living by it; there just aren’t that many people who will find it useful (i.e. most of my friends and family would never need it – yet). Also unless you already have accounts on Facebook, Twitter, Flickr etc. the appeal is very limited, and you can’t easily add more services to the browser as the ones which are there are baked in.

Then again I do have accounts on all those web services and I am interested in social networking, so it might as well be called Milan’s browser. I’m looking forward to future updates. This post was written using Flock.


Paranoia

I looked out of the window of my flat last night and saw this:

What is this?

My mate Piotr and I were baffled as to what this could be; I was tempted to turn off all my wifi kit just in case it was some sort of semi pro wifi hacker. Other options were some sort of mind control device, a not very discreet surveillance car, a mobile pirate radio station or even someone who really really wants to get good radio reception! It had gone by the time we’d finished playing squash … maybe I should have turned my wifi off!

Office

I wiped my computer a few weeks ago and it was only today I noticed that I don’t have Microsoft Office installed. I guess I really don’t need it anymore. Yah!

I think I’m going to stick with the old but useful XML Resume to update my terribly out of date CV!

Aspects and PHP

phpAspects is a project to bring Aspect-oriented programming (AOP) to PHP. If you don’t know much about aspects then, to state it very simply, aspects are a way to separate out concerns such as logging, and dependencies like database handling, to produce more manageable and maintainable code, but the Wikipedia article on aspects can describe this better than I can!
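As a rough illustration of the idea (and only the idea: this is a Python decorator, not phpAspects syntax, and every name in it is made up), logging can be written once as a wrapper and applied to a function, so the function body itself contains no logging code at all.

```python
import functools

# Hypothetical in-memory log so the effect of the "aspect" is easy to inspect.
call_log = []


def logged(func):
    """Wrap func so every call is recorded on entry and on exit."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(f"enter {func.__name__}")
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned normally or raised.
            call_log.append(f"exit {func.__name__}")
    return wrapper


@logged
def put_object(store, bucket, key, data):
    """Business logic only: no logging statements in here."""
    store[(bucket, key)] = data
    return len(data)
```

Calling `put_object({}, "photos", "cat.jpg", b"abc")` stores the data and leaves an enter and an exit entry in `call_log`. A real AOP framework goes further by weaving such advice in from outside the source file, but the separation of concerns is the same.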

Now this project isn’t really mature at the moment, i.e. it’s at alpha version 0.10, which means it’s very likely to change a lot before it becomes final. That is the main reason I’ve decided not to use it in my SimpleStorageService project, although I’m really tempted to play around with it anyway. The other reason why I’m not going to use it is that I’d have to add another PECL dependency to the project (on top of PDO), which is Parse_Tree (another alpha dependency… Hmmm!). Then again, over the last few days I’ve come to believe that if you want to write a REST interface on a LAMP style stack you’re going to need server level access to configure Apache to allow the PUT & DELETE HTTP verbs (editing httpd.conf), although I’m hoping to find a higher level solution to that problem for easier deployment.

But I do recommend that you keep your eye on this project, as sooner or later you’re going to want to use aspects to keep your code clean and simple. Once it is a bit more mature I will be using it in some of my more work related projects where I’ll have more control over which extensions are compiled into PHP.

Milan

While stumbling today I found a Microsoft project named Milan and started to wonder what else was using my name… Obviously I know of the city of Milan in Italy, but I also found these other regions, cities & towns (from Wikipedia):

In Canada: Milan, Quebec, a village

In Colombia: Milán, Caquetá, a town and municipality

In the United States of America:

Surprisingly, in the 2000 US census 27,795 Americans said they lived in Milan!!!

Milan is also:

  • MILAN, a European anti-tank guided missile, which is one way to get through London traffic
  • A.C. Milan and Inter Milan are two football teams of the city of Milan (Italy)
  • Milan, a variant of the Mirage V fighter – I wish I had one of these!
  • Mercury Milan, an ugly American Ford car – I don’t want one of these!
  • Milan Entertainment, an internationally operating record company
  • Milan Records, a record label which boasts an extensive electronic catalog which features down-tempo, chillout, and eclectic electronic releases. I approve.
  • Milan (aka The Leather Boy), a New York musician and producer active in the 1960s; with a name like The Leather Boy it’s no wonder he changed his name
  • Milan (film) – There are 4 movies with my name (all romantic, just like me)
  • Milan (website) – They make network components – yawn! If I’d had a little more money when I was a student that site would have been mine.

Let me know if you find any other interesting uses of my name. At least I’m not alone!

One reason not to use an email address as a username

I’m helping out a friend at the moment to include a forum in a charity site. The site is used by teachers and students as an educational resource and some of the resources are protected by your typical authentication system, which uses the user’s email address as a username and a password of their choosing. Now none of this would normally set off any alarm bells, but adding a forum to this site raised a question.

A typical forum uses a unique friendly name to identify users without exposing any contact information. If the only unique user identifier you have is an email address, and for very obvious reasons you don’t want teachers’ and students’ email addresses exposed, then how do you resolve this?

In this case a solution was achievable although with caveats which I’m not going to go into right now, but in future when I (or you) design an authentication system it might do you good to at least consider how your user identifier is going to be used.
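One way to think about it up front, sketched in Python purely as an illustration (this is not the solution used on the charity site, and every class and field name here is an assumption), is to keep the private login identifier and the public forum handle as two separate unique fields from day one.

```python
from dataclasses import dataclass


@dataclass
class User:
    email: str         # private login identifier, never shown publicly
    display_name: str  # public, unique forum handle


class UserRegistry:
    """Hypothetical in-memory registry enforcing uniqueness on both fields."""

    def __init__(self):
        self._by_email = {}
        self._by_name = {}

    def register(self, email, display_name):
        email_key = email.strip().lower()
        name = display_name.strip()
        name_key = name.lower()
        if "@" in name:
            # Stops people pasting their email address in as a handle.
            raise ValueError("display name must not look like an email address")
        if email_key in self._by_email:
            raise ValueError("email already registered")
        if name_key in self._by_name:
            raise ValueError("display name already taken")
        user = User(email=email_key, display_name=name)
        self._by_email[email_key] = user
        self._by_name[name_key] = user
        return user

    def public_profile(self, display_name):
        """Return only what the forum may show: no contact details."""
        user = self._by_name[display_name.strip().lower()]
        return {"display_name": user.display_name}
```

The forum only ever calls `public_profile`, so the email address never leaves the authentication layer. Retrofitting this onto an existing user base is the hard part, since every existing user then has to pick a handle.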

Before anyone says use OpenID I just don’t think it really would work in forums (yet); Although I know that you can have a nickname in OpenID it’s just too clunky a system at the moment to create a new persona for a new site for the average user (if your default nickname is already in use) i.e. it doesn’t pass my ‘Can my mum understand this?’ test!

Do I have a better solution – hell no, but I’m sure smarter people than me are thinking about it!

UPDATE: I found an excellent blog which articulates some of my concerns about OpenID

Simple Storage Service – Very Alpha Release

So after reading about the unscheduled downtime of Amazon S3 yesterday I thought that I should probably release what I’ve done so far, although most of the work has been focused on the storage layer and writing many many tests for it. So last night I spent a few hours hacking functionality into what will be the REST layer of the service, working mostly from a PHP S3 client, to provide a very basic service that shows what I’ve been doing. The responses are mostly handcrafted, although I’m probably going to use the PECL http extension to handle most of this in the future.

This isn’t really up to what I’d call alpha ‘quality’ in any respect, but it’s just a sneak peek with many many caveats, i.e.

  • Anonymous authentication doesn’t work at all (you need an authenticated user for all method calls)
  • Only putBucket, deleteBucket, putObject, getObject and deleteObject have been partially implemented, although most methods are implemented at the storage layer
  • Many many things need to be re-factored
  • Exception handling isn’t fully implemented yet
  • The REST layer has no tests and the SOAP layer hasn’t been started yet
  • You need the (PECL) PDO MySQL extension added to PHP (and probably some other PEAR libraries like Crypt/HMAC)
  • No documentation yet, but I’m willing to help with any questions
  • You need to be able to edit the httpd.conf for Apache to enable the PUT and DELETE HTTP verbs*
  • If you’re running PHP as CGI then you may need to modify my .htaccess (well maybe?)
  • You need to create your own user using createUser in the storage class (but I’ll add a script to the Subversion repository to help with this)
  • Security hasn’t been tested and the code is not optimized in any way
  • Plus some other stuff that I may have forgotten because I’m tired
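For anyone wondering how the five partially implemented operations named above relate to HTTP requests, here is a small Python sketch of the kind of dispatch a REST layer needs. It is not the project’s code: the function name `route` is made up, and it assumes path-style URLs of the form /bucket/key, where a key may itself contain slashes.

```python
def route(method, path):
    """Map an HTTP method and a path-style URL to an operation name.

    Only the five operations the release partially implements are covered;
    anything else raises ValueError.
    """
    parts = [p for p in path.split("/") if p]
    if len(parts) == 1:
        # /bucket
        table = {"PUT": "putBucket", "DELETE": "deleteBucket"}
    elif len(parts) >= 2:
        # /bucket/key (the key may contain further slashes)
        table = {
            "PUT": "putObject",
            "GET": "getObject",
            "DELETE": "deleteObject",
        }
    else:
        table = {}
    try:
        return table[method.upper()]
    except KeyError:
        raise ValueError(f"unsupported request: {method} {path}")
```

So `route("PUT", "/photos")` gives `putBucket` and `route("GET", "/photos/2008/cat.jpg")` gives `getObject`, while a GET on a bare bucket is rejected because listing is not among the operations exposed so far.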

You may have got the impression that I’m not entirely satisfied with this code yet and you’d be right. I’m only releasing this as *some* people *may* find it interesting. And one final thing: I don’t have an Amazon S3 account. I’ve basically cobbled this together from the documentation (which can be inconsistent), because I read the T&Cs and I wasn’t sure if Amazon would sue me if I agreed to them, so I didn’t!

Also you’ll need to create a MySQL database; note that the database details are hardcoded in the src/s3/lib/storage.php file, and in test/AbstractTest.php for the unit tests.

So … blah, blah … it might not work … blah, blah … give me a break and i’ll help you ….. blah, blah …. I won’t be able to do any more work on this for one week before I start again … so here is the SVN URL ….

http://svn.magudia.com/s3server/

On the positive side of things, when I do get time next week to continue working on this project, the hardest parts have either been thought about or already been completed, so implementing the REST and SOAP layers shouldn’t take as long as implementing the storage layer did.

* You need to modify your httpd.conf to allow the PUT and DELETE HTTP verbs by including these directives inside the <Directory> block for your htdocs (Apache doesn’t allow the PUT or DELETE HTTP verbs by default for sensible security reasons)

Script PUT /workspace/s3server/src/index.php

Script DELETE /workspace/s3server/src/index.php

Here the path to index.php should match where you checked out the code, relative to your htdocs path.

Agile and PHP

So since my last post I’ve actually started to write my SimpleStorageService project, and as I’m an agile developer I decided to write the project with the agile skills I’ve picked up over the last few years with Java, .NET, scrum-master training et al and check out how easy it actually is to ‘do agile’ with PHP.

So…. where should I begin….

Unit Testing (Test Driven Development):

Firstly PHP has had unit testing for quite some time with PHPUnit, which, after using unit testing in Java and C#, was actually quite straightforward. Although there are other testing frameworks like SimpleTest, I decided to go with PHPUnit as it seems more comprehensive. I did find that SimpleTest has a better mocking implementation than PHPUnit, but for now I’m sticking with PHPUnit.

Also PHPUnit can integrate with Selenium and has a partial implementation of DbUnit, but that’s not complete yet – hopefully it will be complete by PHPUnit 4.

Continuous Integration:

Now I didn’t think PHP had anything like this, so when I was looking into testing and found the phpUnderControl project it literally knocked my coding socks off. It’s a PHP wrapper for CruiseControl, but with a cool interface and extra PHP goodies: project code metrics, a Java-like Checkstyle which defaults to the PEAR coding standard, and phpDoc generation, as well as the normal CruiseControl stuff.

I was so impressed by this project that at the time (early January) I set it up on my Mac mini, although I did have to use MacPorts to replace the crippled default build of PHP that is bundled with OS X (please fix this Apple!). I initially installed version 0.20 of phpUnderControl, but I’m currently upgrading my install to the recently released version 0.30, which has a neat JavaScript metrics view – which is nice.

Finally phpUnderControl neatly integrates with PHPUnit, which is another reason why I’m using it, and the project is now hosted alongside PHPUnit, so I hope to see more integration between the projects in the future.

Integrated Development Environment:

Although this is by no means needed to practise Agile, a good IDE helps you write better code faster. I used to use DreamWeaver for all my PHP web development work, but as my SimpleStorageService is by definition a service project I didn’t need any HTML editing functionality. Anyway here was my IDE shortlist:

Ignoring DreamWeaver and TextPad as being out of date and inappropriate for the project, I began with Eclipse (with PDT), but I quickly found several problems with it, mainly SVN integration amongst other things. Then I gave Aptana a go, which was in beta at the time and did fix my SVN issues, but in the final version this was removed from the free edition (grrr!). So just when I thought that PHP didn’t have a good IDE I literally stumbled upon Zend Studio Neon, which ticked nearly every box I wanted from an IDE for PHP … PHPUnit, phpDoc, SVN, code coverage, code formatting, real time error checking, intellisense and much much more. The downsides are a bug where it doesn’t understand the PDO class when unit-testing (well it is beta!) and the fact that the final version isn’t free, so I’m using a time trial version which runs out in just over two weeks. Anyone want to buy me a copy ;-)

Source Control:

It still surprises me how many people don’t use or even understand the point of source-control, but I’ve been a big user for many years. Firstly with CVS and then once Subversion (SVN) was more stable I moved to that and didn’t look back. I know there are many other choices here, but as SVN is integrated into phpUnderControl and Zend Studio it was simply a no brainer. My DreamHost account includes SVN so all my code can be committed ‘off site’ and I can create an abstraction between my IDE and continuous integration environment.

Conclusion:

The state of Agile in PHP is good and much much better than it was even six months ago. I think once PHPUnit 4 is released, phpUnderControl reaches stability and Eclipse with PDT catches up with Zend Studio (adding unit testing and SVN projects) then Agile in PHP should be excellent and easy to accomplish. One thing I haven’t looked at is whether PHP has any good scrum management products (but I guess this doesn’t necessarily have to be in PHP).