My Way to PHP: Day 9 of 75

Day 9. I just published yesterday’s article and looked back at my first post. It was just about a week ago but it feels like months. I learned so much in the past days; it’s absolutely great.

In Mastering the SPL Library I’m on page 115 which is the start of the chapter about interfaces. My goal is 50 pages, i.e. 165 which is the end of the chapter about exceptions.

I really like the book so far. My biggest criticism is that the source code is printed in a grayish color. If it were just black I would be fine, and if it had been syntax-highlighted it would have been great. Strange that phparch didn’t do that.


  • IS-A = interfaces
  • HAS-A = abstract class

Traversable Allows the use of foreach(). However, you can’t implement that yourself. You can use Iterator or IteratorAggregate.

IteratorAggregate Returns an iterator for an aggregation the object has. E.g. A school has students, so you iterate over them.

ArrayAccess Just like the name sounds, it provides array access to your object, i.e. you can access elements by their index (set and get).

Serializable You no longer use __sleep() and __wakeup()!

SeekableIterator Get random access.
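To make the school example concrete for myself, here is a minimal sketch (not from the book) of a class that implements both IteratorAggregate and ArrayAccess. The School class and its contents are made up, and the signatures assume PHP 8.

```php
<?php
// Hypothetical School class: it HAS students and exposes them through
// IteratorAggregate (foreach) and ArrayAccess ($school[...]).
final class School implements IteratorAggregate, ArrayAccess
{
    private array $students = [];

    // IteratorAggregate: hand foreach() an iterator over the aggregation.
    public function getIterator(): Iterator
    {
        return new ArrayIterator($this->students);
    }

    // ArrayAccess: isset($school[$offset])
    public function offsetExists(mixed $offset): bool
    {
        return isset($this->students[$offset]);
    }

    // ArrayAccess: $school[$offset]
    public function offsetGet(mixed $offset): mixed
    {
        return $this->students[$offset] ?? null;
    }

    // ArrayAccess: $school[$offset] = $value and $school[] = $value
    public function offsetSet(mixed $offset, mixed $value): void
    {
        if ($offset === null) {
            $this->students[] = $value;
        } else {
            $this->students[$offset] = $value;
        }
    }

    // ArrayAccess: unset($school[$offset])
    public function offsetUnset(mixed $offset): void
    {
        unset($this->students[$offset]);
    }
}

$school = new School();
$school[] = 'Alice';
$school['b'] = 'Bob';

$names = [];
foreach ($school as $key => $name) {
    $names[$key] = $name;
}
```

The nice part is that the calling code can’t tell whether it is dealing with an array or an object.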


In the intro of the chapter about exceptions there’s a beautiful graph showing all exceptions and how they inherit from each other. Nice!

There are two main classes: LogicException (‘compile’ time) and RuntimeException (runtime). It’s quite clear which one you have to look at.

  • BadMethodCallException if you do some __call() magic use that
  • DomainException when something is against the domain’s rule (business rule)
  • InvalidArgumentException
  • RangeException when something is out of range

I think these 5 will be the most used ones. However, I’m going to check Github and look for results. I searched for “new language: PHP”.



Ok, the last chapter, which is called miscellaneous functionality.

  • iterator_apply() similar to array_walk() but for iterators (or anything that is a traversable)
  • iterator_to_array()
  • spl_object_hash() returns a 32-char hash. The problem is that the hash changes from process to process. That’s why serializing an object and unserializing it doesn’t give you back the same object!
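Here is a small sketch of how I understand these three functions, using an ArrayIterator with made-up data. It only shows behaviour within one process; the cross-process part of the hash problem can’t be shown in a single script.

```php
<?php
$it = new ArrayIterator(['a' => 1, 'b' => 2, 'c' => 3]);

// iterator_apply() calls the callback once per element. The callback only
// receives the extra arguments and has to return true to keep iterating.
$sum = 0;
$visited = iterator_apply(
    $it,
    function (Iterator $it) use (&$sum): bool {
        $sum += $it->current();
        return true;
    },
    [$it]
);

// iterator_to_array() copies any Traversable into a plain array.
$asArray = iterator_to_array($it);

// spl_object_hash() identifies a live object; the unserialized copy is a
// different object and therefore gets a different hash.
$obj = new stdClass();
$copy = unserialize(serialize($obj));
$sameHash = spl_object_hash($obj) === spl_object_hash($copy);
```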

There is also an implementation of Arrays as Objects. The idea is that you can work easier with references because Copy on Write doesn’t kick in.


This was a really good book! I’m quite relieved that after the fiasco from the other book, this book shined. It’s short, not fluffy, goes into depth and you can see that the author took time to write.


Reading-wise I want to read one last book on the internals of PHP. It’s called PHP Internals Book and it’s about the internals of PHP! I don’t know how many pages it has thus I will count the number of chapters which is 5!

After reading that book I’m okay with the knowledge about the intricacies of PHP.

The first chapter is about building PHP. I will mostly skim that.

  • Extensions removed from the main PHP distribution will land in PECL
  • You can use pecl to install extensions (dynamically linked) without recompiling the PHP binary: pecl install $extension

Ok, the build chapter is over and now the chapter about zvals starts.

The strings are implemented with their length and a pointer. The problem is that the length is in bytes which leads to problems using unicode. Here’s an example and the correct way to do it:

echo strlen("♞"); // 3
echo mb_strlen("♞"); // 1

Imho, PHP should slowly default to mb_* functions. If you’re developing new code, it’s probably a good idea to work with unicode from the ground up.


The reference count stored in a zval says how many things (variables, arrays, functions, etc.) are currently using that value.

In case of PHP references &$foo the attribute is_ref__gc is set to 1 which explicitly tells the VM not to copy on write.
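The refcount and is_ref fields are not visible from plain PHP, so this sketch only shows the observable effect of copy on write versus a PHP reference. The variable names are mine.

```php
<?php
$a = [1, 2, 3];
$b = $a;      // no copy yet: both names share one value
$b[] = 4;     // writing to $b triggers the copy; $a keeps its 3 elements

$c = [1, 2, 3];
$d = &$c;     // PHP reference: both names are bound to the same value
$d[] = 4;     // no separation happens, so $c sees the change as well
```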


Bottleneck Analysis

User perception = reality!

He tells the story of Apple, which uses screenshots to “restore” the screen after sleep. The user quickly sees something. I like the idea. For example, it would make sense for a website to first deliver some cached elements of the page, so that the user sees something quickly, and then inject the more dynamic content.

Start off with checking the user time

You can use the typical network traffic analysis tools in Firefox, Chrome, etc. This gives you a good idea where the problems are.


There’s a tool called Boomerang which measures the speed of your site as experienced by the end-user. It reports the data back through a URL request, so you can extract the data from your logs.


The next part is about web server optimization: Compression, buffers, network, i/o, etc. You can test all that with a static file.

For PHP use OPcache and FastCGI.

XHProf can be used in production with sampling, and it aggregates the data, which is pretty nice.

Errors can take a lot of time (thanks to writing the log). You should write error-free code anyway, but for bigger applications errors are even a performance problem.

A good idea is to treat all errors as fatal. That way errors get fixed pretty fast.
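One common way to get there, and this is my assumption about what was meant rather than something shown in the talk, is to turn every warning and notice into an ErrorException. A minimal sketch:

```php
<?php
// Convert every PHP warning/notice into an exception, so nothing is
// silently written to the log and then ignored.
set_error_handler(function (int $severity, string $message, string $file, int $line): bool {
    throw new ErrorException($message, 0, $severity, $file, $line);
});

try {
    // Without the handler this would only produce a warning.
    trigger_error('this would normally only be logged', E_USER_WARNING);
    $reached = true;
} catch (ErrorException $e) {
    $reached = false;
    $caught = $e->getMessage();
}

restore_error_handler();
```

If the exception is not caught anywhere, the script stops, which is the “fatal” behaviour.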

Log slow queries in your database


Again, a talk in German. The title is Code Reviews – Leave your ego at the door. The slides are in English though!

  • Editing is standard in publishing: the writers are good, but they know that they make errors
  • Code quality is about attitude. Be open to criticism and your quality will improve / you will learn
  • Reviews help sharing knowledge
  • No blame culture
  • Try to find problems not solutions
  • Review the critical code

Different types:

  • Adhoc: 5min problem solving
  • Peer desk check: Asynchronous and works for lots of code, some do 1 hour per day
  • Pair programming: Review on the fly
  • Walkthrough: Explains code, others ask questions, educational
  • Code Reading: some other person explains your code, you answer questions (this is a source material for comments)

The Architecture of Stack Overflow:

  • 560 million page views per month handled by 25 servers, developed by 5 devs

How is that possible?

  • Very simple system (around 1/10 of the average project)
  • The newest version is beta-tested on meta.stackoverflow.com by the users
  • Heavily cached (CDN/Browser, Server level, Redis, 384GB Ram whole DB cached in memory, SSDs)
  • Solve most problems at compile time
  • Have a great team

Something I have to say: I had around 240 talks queued up. I normally look at the first 10% and then decide whether to watch a talk completely or not. There seems to be a strong correlation between presenters, topics and community. I enjoyed the ones from the IT security community for a long time and I never would have thought that there are such good talks around the PHP community. Kudos!


This talk is called “How Instagram.com works”

It was built as a Single Page App (SPA).

The biggest problem is the poor load performance. Even gzipped, the complete JS for instagram.com is around 2.5 MB. To get around that they just load the minimal JS required for each page (entrypoint).

They use a module system and then bundle the modules. The bundling is optimized so that each file can be cached and reused! There’s also a front controller which asynchronously loads these bundles.


The next talk is called: Surviving a Prime Time TV Commercial

  • Expected visitors: up to 10k visitors within a minute or two
  • They used an ecommerce solution but rewrote the directly user-facing stuff.
  • Front-page, category pages, search etc. were rewritten
  • They hosted everything in the cloud (EC2)
  • Used Symfony as their framework – everything is bundled and decoupled
  • They stored all their data into Elasticsearch
  • Clever solution. As soon as you put something into your shopping cart (or login) you go into https mode and therefore on the old system. Otherwise you just browse in http on the new system
  • For loading they set a marker cookie if someone put something into the shopping cart. Then they included the dynamic cart info, otherwise just the cached version (ESI)
  • Outsource the tedious stuff: CDN, mail servers, hosting

Jim Coplien and Bob Martin Debate TDD

Wow, I’m surprised but Bob Martin actually agreed that good architecture doesn’t just emerge if you did enough TDD.

I also like that Jim says that you don’t start without knowledge if you implement something – especially if you implemented that before.

I really liked the format.


I saw this talk a few days ago but never wrote up my notes on it:

It’s called The Scams That Derail Programming, Motherfucker by Zed Shaw.

Generally, I like the talk because the criticism in the community against the consulting talk seems pretty small. On the other side, it’s clear that he has his own agenda to push.

There’s the saying: If you want the truth look at what the opposition has to say. I think it’s similar here.

Nonetheless, I can recommend watching it. There’s also a newer recording which however is a bit more self-censored.


You know, thinking about it: I don’t do much PM today, but if I did I would look into the papers researching methods in software. I’m pretty sure that there are hundreds of papers about agile, TDD, software quality, etc.

On the other hand, I still hold the opinion, from what I’ve seen, that the individuals are more important than anything else (language, methodology, amount of computer screens, etc.).


I’ve written about coding bootcamps / dojos / etc. before and my opinion is that they basically recreate something that failed 15, 25, and 35 years ago.

Now I stumbled upon this post called About Coding House. It’s a long read but it’s absolutely insane.

It’s about a code FOO called coding house. People actually paid around $10-$15k to learn coding and get a job in a few months. This alone is insane (like I said) but the practices here are absolutely disastrous.

I won’t go into detail – read the post instead – but it was a machine to deceive.

This makes me fucking angry. I looked at some of the pictures he posted and projects of other people who paid a fuckload of money.

There’s so much wrong with all of that. Jesus Christ.

The guys of WhalePath interviewed us and were we humiliated since none of the students could answer the JavaScript questions. I remember the CTO telling me after the technical interview that we should ask for our money back.

Jesus.


To calm down a bit I’m going back to the book!

This chapter is about the implementation of hashtables in PHP.

I’ve written about that before, but PHP handles hash collisions with linked lists. Actually, these are doubly linked lists. There’s also an additional linked list which keeps track of the insertion order of the elements in the hash table.

Here’s what the bucket looks like:

typedef struct bucket {
    ulong h;
    uint nKeyLength;
    void *pData;
    void *pDataPtr;
    struct bucket *pListNext;
    struct bucket *pListLast;
    struct bucket *pNext;
    struct bucket *pLast;
    char *arKey;
} Bucket;

What’s interesting is that h holds the index if it’s an int otherwise the index / key is in arKey.

There’s no implicit conversion, that means that a hash table can have both 23 and "23" as keys.

Hash tables also don’t shrink! If they expand, however, they double in size and rehash every time they do. The minimum hash table size is 8, which should work pretty well for the average PHP application.

Also every value is copied into the hashtable!


Neat, so here’s the hashing algorithm which PHP uses called DJBX33A:

static inline ulong zend_inline_hash_func(const char *arKey, uint nKeyLength)
{
    register ulong hash = 5381;

    for (uint i = 0; i < nKeyLength; ++i) {
        hash = hash * 33 + arKey[i];
    }

    return hash;
}

It’s very easy to understand so I won’t comment on it. However, there’s a neat way to generate keys that collide.

The first one is with integer keys. Normally, you create your hash and then apply a bit mask so that it fits into the hashtable. The starting size is 8 and integers don’t get hashed, which means that all keys which are a multiple of 8 end up in the same slot.
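The masking step can be sketched in a few lines of PHP. slotFor() is a hypothetical helper of mine, not the engine’s code, and it assumes the table size is a power of two.

```php
<?php
// Hypothetical helper mirroring the bit mask described above.
function slotFor(int $hash, int $tableSize): int
{
    // $tableSize is assumed to be a power of two, e.g. the starting size 8.
    return $hash & ($tableSize - 1);
}

// Integer keys are used as the hash directly, so multiples of 8 collide.
$slots = array_map(fn (int $key): int => slotFor($key, 8), [0, 8, 16, 24, 32]);
```

All five keys land in slot 0, so they would all sit in the same linked list.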

For string keys it’s a bit more complicated. Look at the algorithm again.

If we have a nKeyLength of 1 (which would mean that the loop just runs once), we get:

final_hash = 5381 * 33 + arKey[0];

That means that we would only get a collision if we used the same key. So we need a key length of at least two!

Then we get:

final_hash = (5381 * 33 + arKey[0]) * 33 + arKey[1]

Let’s make that more readable and I will introduce two possible keys A and B with length 2. We want them to be equal.

(5381 * 33 + A[0]) * 33 + A[1] = (5381 * 33 + B[0]) * 33 + B[1]

This can be simplified of course:

33 * A[0] + A[1] = 33 * B[0] + B[1]

And you can quickly see: if we increment the character that gets multiplied by 33 by X, we have to decrement the other character by X * 33. If we take X = 1, we just increment the first character by one and decrement the second by 33. And that’s it!
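To check the math I re-implemented the loop in userland PHP. This is only a sketch: the 32-bit mask is my simplification to avoid integer overflow (it assumes a 64-bit PHP build), and the keys 'Ez' and 'FY' are just one pair that fits the rule.

```php
<?php
// Userland re-implementation of the DJBX33A loop from the C code above.
function djbx33a(string $key): int
{
    $hash = 5381;
    for ($i = 0, $n = strlen($key); $i < $n; ++$i) {
        // Mask to 32 bits so the PHP integer never overflows into a float.
        $hash = ($hash * 33 + ord($key[$i])) & 0xFFFFFFFF;
    }
    return $hash;
}

// 'E' -> 'F' is +1 on the first character, 'z' -> 'Y' is -33 on the second.
$first  = djbx33a('Ez');
$second = djbx33a('FY');

// Colliding blocks can be concatenated to get more colliding keys.
$longA = djbx33a('EzEz');
$longB = djbx33a('FYFY');
```

Both pairs produce identical hashes, which is exactly what the simplified equation predicts.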


Doing Behavior-Driven Development (BDD) with Behat

The waterfall model was initially created to realize smaller projects. They should be bigger than one man projects but no longer than one year. Once again, people didn’t read the source material carefully.

Tests in TDD originally weren’t tests but rather examples for an existing feature. Then TDD actually makes a ton of sense!

Dan North called this approach BDD.

Instead of writing tests, you write examples of how the system will work. And instead of refactoring you do design which makes sense.

A lot more sense. The focus is on examples. You talk with the client about examples of how the system should work and you work internally with examples of how to implement that system.

This makes so much sense. There are different kinds of tests: the Behat story type, which is in the language of the customer, and the developer type, which is unit tests – also in their language!

BDD is when you use examples in conversations to illustrate behavior.


  • 45% of the features are never used by anyone
  • 13% are often used
  • 7% are used always

He talks about talking with the customer. I can only agree; I have been doing the same for a few years now.

I repeat myself but I still can say that finding the root cause is one of the most important things you can do as a professional.


For each example ask:

  • Why would anyone want this feature?
  • Who benefits from that the most?
  • What does he need in order to benefit?

Take all examples and let the stakeholder prioritize them.

Now you can get into individual features and start the example process again. Communicate and write.

Scenario:

Given some state
When something happens
Then some result

Useful questions:

  • Is that the only result?
  • Is there a state in which that doesn’t happen?

The language of the customer is the language of your code. Your scenarios written in your customer’s language give hints for classes and methods. The best thing is that your customer can help you to solve business problems.

Really excellent talk! Recommended!


And enough for today.


Updated Goals:

  • Get an overview of the standard library (SPL)
  • Learn the intricacies: how does the interpreter work, the OOP system, etc.
  • Learn about PHPUnit
  • Learn a bit more about legacy systems and how to handle them
  • Learn a bit more about MySQL
  • Learn Symfony2
  • Write at least one web app using Symfony2 and its core components (templating, testing, forms, validation, security, caching, i18n)
  • Get a bit more exposure to OOP and OOD
  • Watch one video per day on average

Progress status

Done

  • Mastering the SPL [done]

In Progress

  • Reading PHP Internals Book [4 of 5 chapters]
  • Watch one video per day on average [21 of 75]

My Way to PHP: Day 8 of 75

Day 8. Today, it will be one week since I started! I’m rather pleased with my progress and satisfied. My goal today is to finish PHP: The Right Way and watch at least one conference video.


I’m reading up on traits again. I still don’t have a good use case in mind. However, one interesting thing I learned is that use in namespaces resolves in the global namespace whereas use in traits is local.

After reading the original RFC and the wiki article on traits I stumbled on this question on SO: Traits in php – any real world examples?

Ok, so the main opinions are either a) you don’t need traits or b) you can use them for isolated cross-cutting concerns, e.g. logging.
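A minimal sketch of option b), assuming PHP 8. The trait and the two classes are hypothetical; the point is only that unrelated classes can share the logging code without a common parent.

```php
<?php
// Hypothetical trait for a cross-cutting concern (logging).
trait LogsMessages
{
    private array $log = [];

    protected function log(string $message): void
    {
        // static::class resolves to the class that uses the trait.
        $this->log[] = static::class . ': ' . $message;
    }

    public function logEntries(): array
    {
        return $this->log;
    }
}

final class Importer
{
    use LogsMessages;

    public function run(): void
    {
        $this->log('import started');
    }
}

final class Mailer
{
    use LogsMessages;

    public function send(): void
    {
        $this->log('mail sent');
    }
}

$importer = new Importer();
$importer->run();
$mailer = new Mailer();
$mailer->send();
```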


Something new, you can import functions via namespaces since PHP 5.6:

use function My\Namespace\functionName;

Composer! Here’s what I need to know

  • composer require package:~version to update the composer.json
  • composer install to install packages
  • require 'vendor/autoload.php' injects composer’s autoloader
  • You can add your own files to the autoloader in the composer.json (supports PSR-4)

You basically map your vendor namespace to a directory:

"autoload": {
    "psr-4": { "Foobar\\": "mysrc/"}
}

Now you can use a namespace like Foobar\Baz\Foo and it will include mysrc/Baz/Foo.php. And if you leave the key in the json (in this case Foobar\\) empty it will search your directory for any namespace!
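Roughly, the mapping works like the sketch below. This is not Composer’s actual implementation; psr4Path() is a hypothetical helper I wrote to see the path translation, reusing the Foobar\ prefix and mysrc/ directory from above.

```php
<?php
// Hypothetical helper: resolve a class name to a file path the PSR-4 way.
function psr4Path(string $class, string $prefix, string $baseDir): ?string
{
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null; // class is outside the mapped namespace
    }
    $relative = substr($class, strlen($prefix));

    return rtrim($baseDir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
}

// An autoloader built on that mapping: it only requires files that exist.
spl_autoload_register(function (string $class): void {
    $path = psr4Path($class, 'Foobar\\', 'mysrc/');
    if ($path !== null && is_file($path)) {
        require $path;
    }
});

$resolved = psr4Path('Foobar\\Baz\\Foo', 'Foobar\\', 'mysrc/');
$outside  = psr4Path('Other\\Foo', 'Foobar\\', 'mysrc/');
```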

Sweet!


Here’s a nice cheat sheet on using PHP and UTF8

  • Set your DB to utf8
  • Use mbstring and the mb_* functions
  • Set headers

I’ve looked into Behat – a storyBDD tool – a few days ago. It’s basically like Ruby’s Cucumber. Neat tool, especially nice for the final testing. I think I have a video about that in my queue. So it will come up some time.


Regarding caching, there’s APCu, which seems to have the same API as APC had. Also important to note: it caches inside a PHP process. That is, if you’re using (F)CGI, the cache won’t be shared between these processes. Here Memcached could be a solution. I also want to look into Redis in the future.


And done! Good read. I knew most of the stuff so far – which is what I aimed for. This was probably one of my first exposures since I started and I modeled my goals a bit after that!


Before starting the next book I want to read a few articles from PlanetPHP.


New terminology from the article Fault tolerant programming in PHP: Circuit breaker. That is a wrapper around a class which can fail (mostly I/O). It will handle the errors through either

  • Fallback
  • Retry
  • Monitors
  • Or just fast failing
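Here is a minimal sketch of the idea with a fallback and fast failing, assuming PHP 8. The class is hypothetical and deliberately leaves out retries, monitoring and any timeout that would close the circuit again.

```php
<?php
// Minimal circuit breaker sketch; class and parameter names are made up.
final class CircuitBreaker
{
    private int $failures = 0;

    public function __construct(private int $maxFailures = 3) {}

    public function call(callable $operation, callable $fallback): mixed
    {
        if ($this->isOpen()) {
            return $fallback(); // circuit is open: fail fast, skip the I/O
        }
        try {
            $result = $operation();
            $this->failures = 0; // a success resets the failure counter
            return $result;
        } catch (Throwable $e) {
            $this->failures++;
            return $fallback();
        }
    }

    public function isOpen(): bool
    {
        return $this->failures >= $this->maxFailures;
    }
}

$calls = 0;
$flaky = function () use (&$calls) {
    $calls++; // stands in for a failing I/O call
    throw new RuntimeException('service down');
};
$fallback = fn () => 'cached value';

$breaker = new CircuitBreaker(2);
for ($i = 0; $i < 5; $i++) {
    $last = $breaker->call($flaky, $fallback);
}
```

After two failures the breaker stops calling the broken service at all and only serves the fallback.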

And a new pattern: the Repository Design Pattern. It packs the data access into a central place which can be accessed through a public API.


Good. Let’s start a new book: Mastering the SPL Library. It’s 182 pages, obviously about the SPL. My first critique is that the title, spelled out, is Mastering the Standard PHP Library Library.

Let’s start! The SPL reminds me of C++’s STD at first sight. The intro is a short essay about complexity.

It also covers the PHP internal hash table. This is the stuff I wanted to know about! So the implementation is pretty standard. They use doubly linked lists in case of hash collisions. That also explains why the lookup time of the associative array degrades from O(1) towards O(n) when many keys collide.

The hash table grows automatically – it doubles in size. And every time it does that it has to recalculate hashes!


Another talk – in German – of which I will summarize the most interesting facts:

The title is “Vor BDD und TDD war PDD” which means “Before BDD and TDD there was PDD”.

  • PDD is Pain-Driven Development
  • They show an actual – and still developed – source file which is just one class, with about 400 methods, and around 25k LoC. Insane!
  • Sebastian (the creator of PHPUnit) actually built it because his professor mocked him for not wanting to do real software development in Java but in PHP. Sebastian then created the first rudimentary version of PHPUnit in one week, which was a clone of JUnit at the time.
  • About 6 months after the initial bet he expanded PHPUnit and published it. But for at least 2-3 years nobody really cared.
  • A lot of QA tools are made by German developers
  • They talked about Code Rank. Manuel’s project PHP Depend has a great overview of different software metrics.
  • Code Rank (CR) is basically PageRank (PR) applied to packages / classes. The higher the CR, the more this entity should be tested
  • Use the complexity metric (Cyclomatic Complexity Number) for starters – the goal is between 12 and 14 for a method!

I normally subscribe to the idea that a method / function shouldn’t be longer than your head (or fit on your screen).

  • The problem with CCN is that it doesn’t count the number of paths. It just counts the number of branches

If you want to have 100% path coverage, you need at least one test for each path you can take. You can calculate that number with NPath Complexity.
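A tiny made-up example shows the difference between the two metrics. The function has two independent if statements, so CCN is 2 + 1 = 3, but there are 2 * 2 = 4 distinct paths through it.

```php
<?php
// Hypothetical function with two independent decisions.
function shippingCost(bool $express, bool $abroad): int
{
    $cost = 5;
    if ($express) {   // decision point 1
        $cost += 10;
    }
    if ($abroad) {    // decision point 2
        $cost += 20;
    }
    return $cost;
}

// One call per path: 4 calls are needed, although CCN only says 3.
$paths = [
    shippingCost(false, false),
    shippingCost(true, false),
    shippingCost(false, true),
    shippingCost(true, true),
];
```

With every additional independent if the path count doubles, while CCN only grows by one.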

  • A huge amount of duplicated code is often a sign for communication problems in the dev team
  • People starting with PHPUnit (or other QA tools) often feel pain and project it onto the tool itself

Incredibly good talk! Sadly, only in German.


The next talk is titled: Recognising smelly code

  • Methods should do a specific task

Code smells:

  • High level of nesting (see CCN and NPath!) => decouple
  • Bad naming
  • Too many parameters => use an object instead if possible
  • Class should be a noun and should have a responsibility

Code smells:

  • God object => decouple
  • Tight coupling / typing new => DI
  • Using arrays instead of objects => check if it makes sense

General code smells:

  • Copy & paste
  • Speculative Generality => YAGNI

This talk is titled: FAIL: The Best Ways to Bring Down Your Website.

So far, interesting war stories! Also, failure normally means that you are out of something (CPU, bandwidth, money, whatever).

The best way to fail is to be inordinately successful!

  • Statelessness works great for scaling

Most of the talk is the admin / infrastructure side so far which I didn’t cover.

Even if you don’t care about the infrastructure watch the first 8 and the last 8 minutes for war stories.


This talk is titled: The 7 Deadly Sins of Dependency Injection

The first new terminology was Service Locator. I looked it up and it looked similar to a DI container. Here’s an article about the difference: Service Locator vs. DI container

The main difference is in the usage, not in the implementation. The idea is that instead of injecting all your dependencies you are going to write a class which returns you each dependency.

It also gets covered in the talk (around 33:00). For SL you explicitly call get and anyone could set them. For DIC however, you explicitly set the dependencies!

The problem is that you’re hiding the ‘mess’ with ServiceLocators most of the time.
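To see the difference in usage side by side, here is a minimal sketch assuming PHP 8. All class names are hypothetical; the locator is the simplest static registry I could think of, not any particular framework’s container.

```php
<?php
interface Logger
{
    public function log(string $message): void;
}

final class MemoryLogger implements Logger
{
    public array $lines = [];

    public function log(string $message): void
    {
        $this->lines[] = $message;
    }
}

// Service Locator: anyone can set services, and classes pull what they need.
final class ServiceLocator
{
    private static array $services = [];

    public static function set(string $id, object $service): void
    {
        self::$services[$id] = $service;
    }

    public static function get(string $id): object
    {
        return self::$services[$id];
    }
}

// The dependency on the logger is hidden inside the method body.
final class ReportWithLocator
{
    public function run(): void
    {
        ServiceLocator::get('logger')->log('report done');
    }
}

// Constructor injection: the dependency is visible and set from outside.
final class ReportWithInjection
{
    public function __construct(private Logger $logger) {}

    public function run(): void
    {
        $this->logger->log('report done');
    }
}

$logger = new MemoryLogger();
ServiceLocator::set('logger', $logger);
(new ReportWithLocator())->run();
(new ReportWithInjection($logger))->run();
```

Both variants log the same line, but only the second one tells you in its signature what it needs.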


The next terminology is ContainerAware. Here’s my source: ContainerAware considered harmful.

It’s like a DIC (DI Container) but not in the constructor but with one special setContainer() method. And this will be called implicitly at construction time!

They are even worse than ServiceLocators.


If there are too many dependencies injected, feel free to split the class up into multiple ones.

Use constructor DI. If there’s an optional injection you can use setter injection.


The whole talk reminded me of a part of The Zen of Python:

Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.


Another talk, PHP Under the Hood, however a bit newer, from 2014:

  • Profiling: looks for bottlenecks in the call graph, memory usage, etc.
  • Benchmarking: actual performance

Jesus, I feel sorry for that guy. He has such a hard time breathing :/


Ok, now it goes a bit in depth. All data is stored in zvals (Zend values). The zval saves 4 things (the value, refcount, type and reference). Here’s the struct:

typedef struct _zval_struct {
    zvalue_value value;
    zend_uint refcount__gc;
    zend_uchar type;
    zend_uchar is_ref__gc;
} zval;

An interesting bit is the type zvalue_value which is a union:

typedef union _zvalue_value {
    long lval; // stores ints, bools and resources
    double dval; // float
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;
    zend_object_value obj;
} zvalue_value;

What’s interesting is that PHP never changes the zval’s actual type but rather creates a new zval.

The rest of the talk looks at the opcodes of a few simple examples. It’s very understandable!

Generally, pretty approachable talk. However, I would say that you probably need a bit of background. I make this assumption given a note in the description:

Note: This is an advanced talk, you should be extremely familiar with PHP and have some experience with profiling if you are to get the most from this talk.

What helped me is to understand how a VM works, some C, some Assembly and a bit of knowledge about operating systems.


Talk: Caching Best Practices

Cache everything that is expensive – whether you get it once or once every so often

I like the generality of that statement. Memoization can also be seen as caching.
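As a reminder for myself what that looks like in PHP, here is a small sketch. memoize() is a hypothetical helper, and the “expensive” function is just a stand-in that counts its calls.

```php
<?php
// Hypothetical helper: wrap a callable so repeated calls hit a cache.
function memoize(callable $fn): callable
{
    $cache = [];

    return function (...$args) use (&$cache, $fn) {
        $key = serialize($args); // the arguments identify the cache entry
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }
        return $cache[$key];
    };
}

$calls = 0;
$slowSquare = function (int $n) use (&$calls): int {
    $calls++; // stands in for an expensive computation
    return $n * $n;
};

$fastSquare = memoize($slowSquare);
$first  = $fastSquare(12);
$second = $fastSquare(12); // served from the cache, no second computation
```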


Cache things that don’t change often

He presents several methodologies with variable flexibility. I quite like the idea of caching widgets and it made me aware about cache invalidation. There are cases when cache invalidation is clear and easy. One example is that, you have a push system. You can cache forever and then invalidate if you push out a change. That’s called update invalidation.

Alternatively, you can directly update the cache. However, it gets problematic if the database changes and the cache and DB get out of sync.


Pre-generation strategy: You proactively generate the cache. Facebook used that for new clusters.


The ideal form is the biggest-smallest re-usable object. Basically, cache everything which doesn’t change often but has a big influence on the performance.


Solutions:

  • APC for one machine, is incredibly fast (runs in memory), native PHP interface
  • Memcached / Redis for multiple servers

Multi-Layer Cache: Store data given access time and staleness.

  • PHP instance = immediately
  • APC = Quick expiration
  • Memcached = longer expiration
  • DB = worst case
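The lookup order can be sketched like this, assuming PHP 8. Each layer is just a plain array standing in for the PHP instance, APC and Memcached; expiration is not modeled, and all names are hypothetical.

```php
<?php
// Layered lookup sketch: fastest layer first, the database as worst case.
final class LayeredCache
{
    public function __construct(private array $layers) {}

    public function get(string $key, callable $loadFromDb): mixed
    {
        foreach ($this->layers as $layer) {
            if (array_key_exists($key, $layer)) {
                return $layer[$key];
            }
        }

        // Worst case: ask the database and fill every layer on the way back.
        $value = $loadFromDb($key);
        foreach (array_keys($this->layers) as $name) {
            $this->layers[$name][$key] = $value;
        }
        return $value;
    }
}

$dbHits = 0;
$load = function (string $key) use (&$dbHits): string {
    $dbHits++; // stands in for a real query
    return 'from db';
};

$cache = new LayeredCache([
    'instance'  => [],
    'apc'       => [],
    'memcached' => ['user:1' => 'Alice'],
]);

$hit   = $cache->get('user:1', $load); // found in the slowest cache layer
$miss  = $cache->get('user:2', $load); // falls through to the database
$again = $cache->get('user:2', $load); // now served from the first layer
```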

Never store something in cache that you can’t recreate

Also use a DB.

Have external configuration for caching

E.g. in case of a traffic spike you can increase the caching.

Prefix your keys with a version no.

Saves flushing, no problem when rolling out a new version, and no old stale data which might be accessed.


The next talk is called PHP Opcache Explained.

That’s interesting: a source file gets compiled and then executed, and as soon as an include, require or eval() statement appears during execution, that code gets compiled and executed in turn!


So an opcode consists of a uint (internal numeric representation), a handler (the actual function), two operands, a result, an extended value (e.g. a carry), the line no. and the type information.


OPcache evaluates constant calculations at compile time, which is great. For example, if you’re dealing with time you often write something like:

60 * 60 * 3 // 3 hours

This gets optimized!


OPCache just keeps old (wasted) files in the memory. If a threshold is reached it will recompile all the files.


Okay, back to the book!

Chapter 3 which presents all data structures.

SplDoublyLinkedList They actually implemented count() as O(1) by saving meta data. Otherwise, the complexity is as expected. What’s pretty cool is that you can set the iterator mode to LIFO or FIFO and so traverse the list backwards.

SplStack This shares the same codebase with the doubly linked list (inherits from it) and could be implemented with LIFO mode.

SplQueue Nothing special about that. Also inherits from SplDoublyLinkedList but uses FIFO mode.
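A quick sketch of these three classes with made-up values, mostly to see the iterator mode in action:

```php
<?php
$list = new SplDoublyLinkedList();
$list->push('a');
$list->push('b');
$list->push('c');

// The default is FIFO; switching to LIFO walks the same list backwards.
$list->setIteratorMode(SplDoublyLinkedList::IT_MODE_LIFO);
$backwards = [];
foreach ($list as $value) {
    $backwards[] = $value;
}

$stack = new SplStack();   // last in, first out
$stack->push(1);
$stack->push(2);
$top = $stack->pop();

$queue = new SplQueue();   // first in, first out
$queue->enqueue(1);
$queue->enqueue(2);
$front = $queue->dequeue();
```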


SplHeap This is an abstract class and you need to implement a function called compare(). That’s pretty cool. Otherwise it implements a binary tree internally.

However, there’s a special case when something in compare() goes wrong. Then the heap is corrupted and throws a RuntimeException.

SplMaxHeap Like SplHeap but the maximum element is at the top (i.e. compare() is already implemented).

SplMinHeap The opposite of SplMaxHeap.

SplPriorityQueue Similar to the heaps but with a different internal implementation. It still compares; however, you add a priority when using insert(). It will then hand out the most important (highest priority) item first.
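Here is a sketch of the heap family, assuming PHP 8. ShortestStringHeap is a hypothetical subclass of mine; the other two are used as they come.

```php
<?php
// Custom heap: compare() decides which element ends up at the top.
final class ShortestStringHeap extends SplHeap
{
    protected function compare(mixed $a, mixed $b): int
    {
        // A positive result ranks $a above $b, so shorter strings win here.
        return strlen($b) <=> strlen($a);
    }
}

$heap = new ShortestStringHeap();
foreach (['ccc', 'a', 'bb'] as $word) {
    $heap->insert($word);
}
$shortest = $heap->extract();

$min = new SplMinHeap(); // compare() is already implemented
foreach ([5, 1, 3] as $number) {
    $min->insert($number);
}
$smallest = $min->extract();

$pq = new SplPriorityQueue();
$pq->insert('low', 1);
$pq->insert('high', 10);
$mostImportant = $pq->extract(); // highest priority comes out first
```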


SplFixedArray Sweet. Traditional arrays implemented in PHP. You set the size at construction. You can however, also set a new size with setSize(). That’s pretty great.

SplObjectStorage Interesting. A hash map in which objects can be keys.
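And a short sketch for the last two structures, again with made-up values:

```php
<?php
$fixed = new SplFixedArray(3);  // the size is set at construction
$fixed[0] = 'x';
$fixed->setSize(5);             // but it can be changed afterwards
$size = $fixed->getSize();

$storage = new SplObjectStorage();
$alice = new stdClass();
$bob   = new stdClass();
$storage[$alice] = 'admin';     // the object itself is the key
$storage[$bob]   = 'guest';
$role = $storage[$alice];
```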

At the end of the chapter there’s a neat table listing all data structures and their complexities for different operations. Neat!


The next chapter is about iterators. There are a lot of iterators which I will talk about later. But how you can work with them is neat.

An example from the book:

$it = new DirectoryIterator("."); // lists all files in the current directory
$it = new RegexIterator($it, '/\.mp3$/i'); // keeps only files whose name matches the regex
$it = new LimitIterator($it, 0, 3); // returns only the first 3

foreach ($it as $file) {
    echo $file->getFilename(), PHP_EOL; // print the name of each matching file
}

That’s super cool!

To create your own iterator you just have to implement Iterator. And for filter iterators (like RegexIterator) implement FilterIterator – which has just one function called accept().
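A minimal sketch of such a filter, assuming PHP 8. EvenFilter is a hypothetical class of mine, and I chain it with LimitIterator just like in the book’s example.

```php
<?php
// Hypothetical filter iterator: keep only even numbers.
final class EvenFilter extends FilterIterator
{
    public function accept(): bool
    {
        // Look at the wrapped iterator's current element.
        return $this->getInnerIterator()->current() % 2 === 0;
    }
}

$even = new EvenFilter(new ArrayIterator([1, 2, 3, 4, 5, 6]));
$firstTwo = new LimitIterator($even, 0, 2);

// false = don't preserve the original keys
$result = iterator_to_array($firstTwo, false);
```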

Man, that’s beautiful. I really love those iterators. Let’s see which ones are available in the SPL.


  • AppendIterator = appends several iterators (key isn’t changed!)
  • CachingIterator = caches all ‘old’ elements, can look ahead one and you can specify the __toString() behaviour
  • CallbackFilterIterator = apply a callback to filter the elements
  • FilterIterator = abstract class which allows to implement a filter
  • InfiniteIterator = repeats the given iterator over and over again
  • LimitIterator = returns the Xth to Yth elements
  • MultipleIterator = returns the elements of several iterators simultaneously
  • NoRewindIterator = the iterator can only be used once
  • RecursiveIteratorIterator = iterates through all the children
  • There are also several recursive iterators and one

Okay, enough for today :)


Updated Goals:

  • Learn about composer
  • Learn about caching
  • Get an overview of the standard library (SPL)
  • Learn the intricacies: how does the interpreter work, the OOP system, etc.
  • Learn about PHPUnit
  • Learn a bit more about legacy systems and how to handle them
  • Learn a bit more about MySQL
  • Learn Symfony2
  • Write at least one web app using Symfony2 and its core components (templating, testing, forms, validation, security, caching, i18n)
  • Get a bit more exposure to OOP and OOD
  • Watch one video per day on average

Progress status

Done

  • Reading PHP: The Right Way [done]
  • Learn about composer
  • Learn about caching

In Progress

  • Mastering the SPL [155 of 186 pages]
  • Watch one video per day on average [13 of 75]

My Way to PHP: Day 7 of 75

Ok, like every day I published yesterday’s article. In The Pragmatic Programmer I’m on page 53, which is the beginning of section 11. My goal is 50 pages, which is around the beginning of Chapter 4 or section 21.


Something that bugged me in the past was that the highlighting plugin I use (Crayon) didn’t support the new PHP keywords. I just added them and created a PR, let’s see if it works.


I’m looking for more interesting videos on YouTube. And I realized how amazing the modern world is in regards to knowledge. I can watch talks that a) I wouldn’t have access to (e.g. talks at Google or Facebook) and/or b) would cost a lot of money (conferences, traveling, etc.). It’s pretty insane.


Idea: DeveloperTV – 24/7 streaming of YT videos about developing, IT sec, etc.


Here’s a talk by Gregor Kiczales about Aspect Oriented Programming (AOP):

He’s one of the main developers of AOP. The idea of AOP is that you have some method to access the direct environment of functions. That could be to do something before/after a specific call of a setter, getter, constructor, destructor, etc. And that’s basically it!

The beauty of it is that you don’t have to change your method call to use an aspect. An aspect is something that’s orthogonal to your actual method: Logging, caching, etc.

Then there are join points. These are the points at which you can insert your aspects. And there’s advice which is the actual action taken.

One obvious example is that of an observer. It’s elegant with AOP but not so much with pure OOP.

For me AOP is a great addition to OOP. It makes sense in a Lisp-way.

Let’s say you have code like that:

function foo() {
    // ...
    doX();
}

function bar() {
    // ...
    doX();
}

function baz() {
    // ...
    doX();
}

Let’s also say that there are many more methods and classes and all that stuff. In Lisp you would write a macro. A more elaborate version of this is AOP, which doesn’t just handle calling an advice after a call but also at other join points.
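Here’s a minimal sketch of the idea in plain PHP. It is not how AspectJ or any PHP AOP library works, just a closure wrapper. The helper withAfterAdvice() and the variable names are made up for illustration.

```php
<?php
// Hypothetical helper: returns a new callable that runs $advice after $fn.
function withAfterAdvice(callable $fn, callable $advice) {
    return function () use ($fn, $advice) {
        $result = call_user_func_array($fn, func_get_args());
        $advice();      // the cross-cutting concern, e.g. logging
        return $result; // pass the original result through untouched
    };
}

$log = [];
$doX = function () use (&$log) { $log[] = 'doX'; };

// The "business" functions no longer mention doX() at all.
$foo = withAfterAdvice(function () { return 'foo'; }, $doX);
$bar = withAfterAdvice(function ($n) { return $n * 2; }, $doX);

$fooResult = $foo();
$barResult = $bar(21);
```

Unlike real AOP, the call site has to use the wrapped callable here, so this only covers the "advice after a call" part.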

After the first 20 minutes most of the talk is about AspectJ which may be interesting if you’re developing Java but otherwise there wasn’t that much additional information.


I just learned that Twig – the template engine for Symfony2 – can be extended. You can basically write a DSL in it. I just skimmed through the docs about extending it but it seems pretty easy so far. Even adding new language constructs seems pretty manageable. Definitely something I have to keep in mind :)


Here’s a talk about testing set-top boxes:

I really like the idea and the simplicity of it. They recorded the output of the set-top boxes using GStreamer and then did simple pattern matching with OpenCV. It’s a quite elaborate framework (stb-test) but the basics are pretty simple and it’s usable. Kudos!


PHP under the hood:

A talk about the inner workings of PHP. I liked his section about references. Although I knew the problems, I agree with his recommendation for writing functions: don’t mutate the arguments.

Thanks to my functional programming background I’m damaged but really it’s so much clearer. The biggest disadvantage can be performance because PHP didn’t implement a performant immutable type system.
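A small sketch of what "don’t mutate the arguments" means in practice. The function names are made up and the example isn’t taken from the talk.

```php
<?php
// Mutating version: takes the array by reference and changes the caller's data.
function doubleInPlace(array &$numbers) {
    foreach ($numbers as $i => $n) {
        $numbers[$i] = $n * 2;
    }
}

// Non-mutating version: the argument is left alone, a new array is returned.
function doubled(array $numbers) {
    return array_map(function ($n) { return $n * 2; }, $numbers);
}

$original = [1, 2, 3];
$result = doubled($original); // $original is still [1, 2, 3]
```

With doubled() you can tell from the call site alone what happens to $original, which is the clarity I mean.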

If a PHP script is running longer than a second, in most cases there’s something wrong.


That was fast. My pull request was just accepted. Yay!


Back to the book. In the chapter about DSLs they make a good point about Python. You can implement your DSL in Python simply as functions and program against that interface. It’s pretty easy to read and write syntax-wise and you just have to define functions. Alternatively, Lua would be a good idea thanks to the easy integration.


I just looked up Lua and PHP and there’s actually an extension.

The interface is really simple. Here’s an example of loading an external file and giving Lua access to a variable and a function. I looked through the code and found an undocumented function called include which I will use.

$foo = ...;
function bar() {
    // ...
}

$lua = new Lua();
$lua->assign("lua_foo", $foo); // provide lua access to $foo
$lua->registerCallback("lua_bar", "bar"); // provide lua access to bar()
$lua->include("myfile.lua"); // include and run the file

I can’t say anything about access escalation. This is something I would definitely look up if I were to use Lua. However, it’s probable – given the functions – that it’s a closed system. That is, you could actually provide an interface for scripting your application. Pretty nice!


Here’s a cool exercise:

5. We want to implement a mini-language to control a simple drawing package (perhaps a turtle-graphics system). The language consists of single letter commands. Some commands are followed by a single number. For example, the following input would draw a rectangle.

P 2 # select pen 2
D   # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 1 # then back south
U   # pen up

Implement the code that parses this language. It should be designed so that it is simple to add new commands.


I will have two classes. I want one that does the drawing and one that does the interpretation. Let’s start with the drawing one.

I want to draw onto a canvas which will for now just be an 8×8 array. I hardcoded numbers just because this is an exercise and I don’t want to make it too complex. I could also extract some things into other classes but let’s keep it simple.

Ok, while thinking about the problem and trying something out I changed my design a bit. I have one class, Canvas, which you can draw on (with a given pen) and display.

class Canvas {
    protected $canvas;
    
    public function __construct() {
        $this->initCanvas();
    }
    
    protected function initCanvas() {
        $this->canvas = array();
        for($i=0; $i < 8; $i++) {
            $this->canvas[] = array_fill(0, 8, 0);
        }
    }

    public function display() {
        echo "\n\n";
        foreach($this->canvas as $line) {
            echo "\t", implode("", $line);
            echo "\n";
        }
    }
    
    
    public function draw($x, $y, $pen) {
        $this->canvas[$y][$x] = $pen;
    }
}

I left out any error or boundary checking. Like I said, for the sake of simplicity. I also wrote a Painter class which takes care of all the painting!

class Painter {
    protected $pen;
    protected $canvas;
    protected $penValue;
    // pen coordinates
    protected $x;
    protected $y;
    
    public function __construct(Canvas $canvas) {
        // start in the middle
        $this->x = 3;
        $this->y = 3;
        $this->pen = 0;
        $this->penValue = 0;
        $this->canvas = $canvas;
    }
    
    public function select($pen) {
        $this->pen = $pen;
    }
    
    public function down() {
        $this->penValue = $this->pen;
    }
    
    public function up() {
        $this->penValue = 0;
    }
    
    protected function moveHelper(array $orientation, $distance) {
        for($i = 0; $i < $distance; $i++) {
            $this->canvas->draw($this->x, $this->y, $this->penValue);
            $this->{$orientation['xy']} += $orientation['-+'];
        }   
    }
    
    public function goWest($distance = 1) {
        $this->moveHelper(['xy' => 'x', '-+' => -1], $distance);
    }
        
    public function goEast($distance = 1) {
        $this->moveHelper(['xy' => 'x', '-+' => 1], $distance);
    }
    
    public function goNorth($distance = 1) {
        $this->moveHelper(['xy' => 'y', '-+' => 1], $distance);
    }
    
    public function goSouth($distance = 1) {
        $this->moveHelper(['xy' => 'y', '-+' => -1], $distance);
    }
}

And now we can test it and it works :)

$canvas = new Canvas();

$painter = new Painter($canvas);
$painter->select(2);
$painter->down();
$painter->goWest(2);
$painter->goNorth(1);
$painter->goEast(2);
$painter->goSouth(1);
$painter->up();

$canvas->display();

[Image: canvas output]


Now, let’s look at the parser and interpreter. We have a few rules. Each command is on its own line. There are no empty lines. The first character in a line defines the function, which can be followed by one parameter separated by one whitespace. And there are also comments. I will make an incredibly easy parser which doesn’t care about much. There are better ways to do it – of course – but it’s just a small exercise.

class Interpreter {
    protected $painter;
    protected $syntax = ['P' => ['parameter' => true,  'function' => 'select'],
                         'D' => ['parameter' => false, 'function' => 'down'],
                         'U' => ['parameter' => false, 'function' => 'up'],
                         'S' => ['parameter' => true,  'function' => 'goSouth'],
                         'W' => ['parameter' => true,  'function' => 'goWest'],
                         'N' => ['parameter' => true,  'function' => 'goNorth'],
                         'E' => ['parameter' => true,  'function' => 'goEast']];
    
    public function __construct(Painter $painter) {
        $this->painter = $painter;
    }
    
    
    public function interpret($input) {
        $splitLines = explode("\n", $input);
        foreach($splitLines as $line) {
            $command = $line[0];
            $parameter = $line[2];
            $this->execute($command, $parameter);
        }
    }
    
    public function execute($command, $parameter) {
        $internalCommand = $this->syntax[$command];
        if($internalCommand['parameter'] === true && $parameter == ' ') {
            throw new Exception("You forgot the parameter for {$command}");
        }
        
        if($internalCommand['parameter'] === false) {
            $this->painter->{$internalCommand['function']}();
        } else {
            $this->painter->{$internalCommand['function']}($parameter);
        }
    }
}

Like I said, it’s super simple. All it does is take the first and the third character of each line (this also means that no parameter can be bigger than 9, which doesn’t matter because I restricted the canvas to 8×8). Then it looks up the command and its function call. Thanks to PHP’s variable method calls, we can just call the function by name and we’re done. I’ve also written a somewhat more robust interpreter which takes care of correct syntax, allows variable definitions and handles whitespace well, but for this example a super simple one is enough imho.
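If you want to lift the single-digit limit, one way (just a sketch, not the more robust interpreter I mentioned) is to parse each line with a regular expression. The function parseLine() is a made-up helper and isn’t part of the Interpreter class above.

```php
<?php
// Hypothetical helper: splits one line of the mini-language into
// [command, parameter]. The parameter is null if there is none.
// Returns null for blank or comment-only lines.
function parseLine($line) {
    // Drop everything from '#' on, then surrounding whitespace.
    $code = trim(preg_replace('/#.*$/', '', $line));
    if ($code === '') {
        return null;
    }
    // One uppercase letter, optionally followed by whitespace and digits.
    if (!preg_match('/^([A-Z])(?:\s+(\d+))?$/', $code, $m)) {
        throw new InvalidArgumentException("Cannot parse line: {$line}");
    }
    return [$m[1], isset($m[2]) ? (int) $m[2] : null];
}
```

The result could then be fed into something like execute(), which would have to check for null instead of a space character.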

And the final code and test:

$input =<<<END
P 2 # select pen 2
D   # pen down
W 2 # draw west 2cm
N 1 # then north 1
E 2 # then east 2
S 4 # then back south
U   # pen up
END;

$canvas = new Canvas();
$painter = new Painter($canvas);
$interpreter = new Interpreter($painter);
$interpreter->interpret($input);
$canvas->display();

[Image: canvas output from the interpreter]

It works! Pretty cool exercise. Writing this took me a bit more than an hour – a pretty good investment for that additional ease of use! (Funnily enough, the Painter & Canvas class took way longer than the interpreter)


In the section about estimating there’s some good advice: iterate and refine your estimates. You can give a fuzzy estimate at the start, e.g. 4-6 months. After a month, you can refine that. And then the cycle repeats.


Embrace the fact that debugging is just problem solving, and attack it as such.

This can be hard. I would also add: if you don’t find the bug after 15 minutes, do something else for a while. Clear your mind and come back.


What are the disadvantages of using persistent connections in PDO?

Interesting question! And a simple answer. You are fucked if your script dies half-way. That could mean that a transaction is still open or a database is locked, etc.


I liked the idea of Design by Contract when I first heard of it. However, I’ve only applied it successfully to interfaces agreed on with teammates. That’s something that works fantastically, btw. Agree on what your classes are receiving and sending.

For methods / functions however, I found that Design by Contract is either really hard to do, trivial or basically the same code.

Hard to do: you have an array with some arbitrary content. How do you check it? You’d probably use your program to do it anyway.

Trivial: a great example is the function sqrt(). You can beautifully check the precondition (input >= 0) and the postcondition (output * output = input).

Basically the same code. E.g. a function which returns max and you check that with max.

So, the usefulness wasn’t that great. However, I like one thing and that is that it checks the input variables and you don’t have to clutter your function with it. E.g. in the sqrt() case. You can write something like that:

/*
 * @precondition $x>=0
 */
function sqrt($x) {
    //...
}

Instead of putting an if and a throw statement in your function.
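For comparison, here’s roughly what the hand-written version looks like without any contract support. It’s a sketch: checkedSqrt() is a made-up name (PHP’s built-in sqrt() can’t be redeclared) and the tolerance in the postcondition is my own choice.

```php
<?php
// Hypothetical wrapper around the built-in sqrt() with explicit contract checks.
function checkedSqrt($x) {
    // Precondition: the input must not be negative.
    if ($x < 0) {
        throw new InvalidArgumentException('Precondition failed: $x >= 0');
    }
    $result = sqrt($x);
    // Postcondition: squaring the result gives back the input
    // (within a small tolerance, because floats are not exact).
    if (abs($result * $result - $x) > 1e-9 * max(1.0, $x)) {
        throw new LogicException('Postcondition failed: result * result == $x');
    }
    return $result;
}
```

Most of the function body is now contract checking, which is exactly the clutter the annotation would keep out.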


The Law of Demeter is briefly mentioned. I found a good explanation in the c2 wiki:

You can play with yourself.
You can play with your own toys (but you can’t take them apart),
You can play with toys that were given to you.
And you can play with toys you’ve made yourself.
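In code, the rhyme could look something like this. It’s a sketch with made-up classes (Wallet, Customer), not an example from the book or the wiki.

```php
<?php
class Wallet {
    private $balance;

    public function __construct($balance) {
        $this->balance = $balance;
    }

    public function withdraw($amount) {
        if ($amount > $this->balance) {
            throw new RuntimeException('Not enough money');
        }
        $this->balance -= $amount;
        return $amount;
    }

    public function getBalance() {
        return $this->balance;
    }
}

class Customer {
    private $wallet; // the customer's own toy

    public function __construct(Wallet $wallet) {
        $this->wallet = $wallet;
    }

    // Demeter-friendly: callers ask the customer, who handles the wallet.
    public function pay($amount) {
        return $this->wallet->withdraw($amount);
    }

    // This getter invites violations like $customer->getWallet()->withdraw(4),
    // i.e. taking somebody else's toy apart.
    public function getWallet() {
        return $this->wallet;
    }
}

$customer = new Customer(new Wallet(10));
$paid = $customer->pay(4);
```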


I’m currently on page 171 and it’s hard. Not that it’s hard to read, but it’s a drudge. Like I said, I read this book about 8 years ago. And I enjoyed it a lot. However, at the time I was new to software development. I had written some code – sure – but I had never learned about version control, shell scripting, processes, etc. In the meantime I’ve learned all that.

I’ll try to get as far as possible today so that I can get this off my list.


Starting a new project (or even a new module in an existing project) can be an unnerving experience.

I listened to a talk by Zed Shaw yesterday in which he talked about TDD and its advocates. He said that, in his experience, people advocating TDD seem to be pretty good at tasks like writing tests but often have problems coming up with innovative or new things.

After reading this line I have a theory. There are different types of personality. If you look at the Big Five factors there is one which stands out: openness to experience. I quote:

It is also described as the extent to which a person is imaginative or independent, and depicts a personal preference for a variety of activities over a strict routine.

Here’s my theory, or better, hypothesis: the TDD/Enterprise/JavaBeans guys score lower on openness than people like Zed Shaw. They feel comfortable because the style of working and the environment fit their personality. People like Zed, however, want new experiences and embrace them. People like him and me have no problem starting a new project or module. It’s exciting. You have an empty sheet before you and you can create something spectacular. People scoring lower on openness, however, like it when they know what to do and can work on existing products / code.


Ok, done with that book. I wonder if I would recommend it to a beginner. I think if so then to somebody who wasn’t exposed to the software development community – or not much. Otherwise, you will either get most of the tips from other people or learn it by imitation.


After all that fluff I need some substance. I had this book – which is the only book on the topic – on my reading list. I read one review about it which was bad, so my expectations are pretty low. The book is Functional Programming in PHP. It’s 122 pages long. Let’s start!


The book starts off with a history section and a bit of general foo about FP. One thing that irritates me a lot is the author’s focus on short functions or writing fewer characters. Really strange.

A lot of whitespace so far. I’m now on Chapter 8 (page 33). Coincidentally, the book has so far covered the same content which I presented in my oral exam at the finals in school (before college). This took about 15-20 minutes.

What I like so far is that he doesn’t write a lot of fluff. It’s to the point. The question is if somebody without any knowledge about FP can effectively learn from it.

I quite like the illustrations. Ok, I’m through. Took me about 45 minutes. I’m not pleased but I’ll write a blog post about that so others can get an additional review.

In case you want to read it: Review on Functional Programming in PHP.


Writing the review took a while – I think about as long as reading the book. I have another book from phparch on my resource list. That’s the same company that published FP in PHP. I’m a bit nervous, however there are several reviewers on Amazon who liked the book.


Great. I want to read through PHP The Right Way. It’s quite short, however I plan to read the linked resources.

I tested the built-in server, e.g. php -S localhost:8080 brings up a local server on port 8080. I just realized how easy it is to bring a program to the web. I had only one .php file in the directory – that was my Interpreter from some hours ago – and it worked on the web.


I haven’t decided yet if I’m going to develop on Linux or Windows. For the package manager alone I will probably go with Linux. Thanks to Vmbox I can develop on Linux on a Windows host, which will most likely be the one I will have at work.


Vagrant and PuPHPet are incredible. I really like that DevOps movement. Especially if I don’t have to set up all that stuff :P

I downloaded and installed Vagrant and generated the config files but it wants me to restart, which I will do tomorrow or so.


Ok, next stop is PHP-FIG. I’m going through all standards.

PSR-0: Autoloading Standard Normally, you’re not going to write an autoloader (probably), so the only things to remember are:

  • Namespaces are \<Vendor Name>\(<Namespace>\)*<Class Name>
  • Each namespace separator is a new subdirectory
  • _ in class names are like directory separators
  • Only alphabetic characters

PSR-1: Basic Coding Standard

Here are the interesting ones (or the ones I want to remember):

Files SHOULD either declare symbols (classes, functions, constants, etc.) or cause side-effects (e.g. generate output, change .ini settings, etc.) but SHOULD NOT do both.

I like that. Makes interop easier. Also reminds me of header files in C.

  • Classes in StudlyCaps: SomethingMaker
  • Methods in camelCase: makeSomething
  • Class constants with underscore and in uppercase HARD_GRANITE
  • Use namespace declarations

I have to say quite sensible so far.

PSR-2: Coding Style Guide:

  • 4 spaces for indenting (oh yeah baby)
  • Opening braces for methods and classes on the next line
  • For control structures on the same line
  • abstract and final before visibility, static after

I’m very pleased with that guide. It’s sensible in my opinion. And there are a lot of crazy styles.

PSR-3: Logger Interface: Ok. The most important thing is the interface:

There are the following methods: debug, info, notice, warning, error, critical, alert, emergency. There’s also log which logs at a given level.

Pretty good.

PSR-4: Improved Autoloading:

Here underscores in class names have no special meaning. Also, a namespace prefix is mapped to a base directory.
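To see the difference between the two autoloading standards, here’s a sketch of how a class name could be turned into a file path under each. Both functions (psr0Path() and psr4Path()) and the Acme example names are made up. Real autoloaders, like the one Composer generates, do more than this.

```php
<?php
// PSR-0 style: namespace separators become directory separators, and so do
// underscores in the class name (but not underscores in the namespace).
function psr0Path($class) {
    $class = ltrim($class, '\\');
    $pos = strrpos($class, '\\');
    $namespace = $pos === false ? '' : substr($class, 0, $pos + 1);
    $name = $pos === false ? $class : substr($class, $pos + 1);
    return str_replace('\\', '/', $namespace) . str_replace('_', '/', $name) . '.php';
}

// PSR-4 style: strip a namespace prefix and map the rest onto a base directory.
// Underscores are left alone. Returns null if the class is not under $prefix.
function psr4Path($class, $prefix, $baseDir) {
    $class = ltrim($class, '\\');
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return null;
    }
    $relative = substr($class, strlen($prefix));
    return rtrim($baseDir, '/') . '/' . str_replace('\\', '/', $relative) . '.php';
}
```

For example, psr0Path('Acme\Log\Writer_File') gives Acme/Log/Writer/File.php, while psr4Path('Acme\Log\Writer_File', 'Acme\Log\', 'src') gives src/Writer_File.php.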


Now, there are 3 coding standards: PEAR, Zend and Symfony. I’m going with the latter just because that’s my desired platform.

Symfony coding standard

  • javadoc style comments
  • @param type $parameter description
  • @return type|type description
  • @throws \ExampleException
  • Type hinting if possible
  • Comma after each element in array (even last one)
  • One class per file (exception: private helper class)
  • public methods > protected > private
  • Use sprintf for Exceptions
  • Underscores only for option and parameters names
  • Suffix for Interface, Trait and Exception
  • Abstract prefix for abstract classes

Incredible standards again. Great!
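A few of these rules in one place, as a sketch. The interface and class are made up, and I put both in one block only to keep the example short (the standard wants one class per file).

```php
<?php
// "Interface" suffix for interfaces.
interface GreeterInterface
{
    /**
     * @param string $name Who to greet
     *
     * @return string The greeting
     *
     * @throws \InvalidArgumentException When the name is empty
     */
    public function greet($name);
}

class PoliteGreeter implements GreeterInterface
{
    // Comma after each element, even the last one.
    private $templates = array(
        'default' => 'Hello, %s!',
        'formal' => 'Good day, %s.',
    );

    public function greet($name)
    {
        if ($name === '') {
            // sprintf for exception messages.
            throw new \InvalidArgumentException(sprintf('Expected a non-empty name, got "%s".', $name));
        }

        return sprintf($this->templates['default'], $name);
    }
}
```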


Ok, enough for today. I’m currently at the start of Language Highlights. I also added an additional goal. Watching one video per day – on average.


Updated Goals:

  • Reread the Pragmatic Programmer
  • Learn about functional programming in PHP
  • Learn about coding standards
  • Get an overview of the standard library (SPL)
  • Learn about composer
  • Learn about caching
  • Learn about PHPUnit
  • Learn a bit more about legacy systems and how to handle them
  • Learn a bit more about MySQL
  • Learn Symfony2
  • Write at least one web app using Symfony2 and its core components (templating, testing, forms, validation, security, caching, i18n)
  • Learn the intricacies: how does the interpreter work, the OOP system, etc.
  • Get a bit more exposure to OOP and OOD
  • Watch one video per day on average (75 – 6)

Progress status

Done

  • Reread the Pragmatic Programmer [done]
  • Learn about functional programming in PHP
  • Learn about coding standards

In Progress

  • Reading PHP: The Right Way [13 of 51 pages]

Functional Programming in PHP Book Review

I will just write a short review. There’s only one other review available at this time, by Stefan Kanev on Goodreads. Here’s the complete review. I reference his review a few times, so if you want the complete picture read both. Or just skip to the TL;DR at the bottom.

The book currently costs $12 for digital version and $19 for digital + print. It’s officially 112 pages long. I’ll start with the stuff I liked. The illustrations were nice. The author didn’t write too fluffy. And it was a quick read (took me around 45 minutes).

Who’s the audience?

However, there are a lot of problems. I’ll start with the first one: Who’s the audience? From the intro:

It goes without saying that this book is not aimed at beginner PHP
developers and some object-oriented experience is assumed. Having said
that, of course, this is as gentle of an introduction as possible and represents
a beginner guide to functional programming in PHP.

Quite frankly, if you never had exposure to functional programming (FP) you won’t learn enough from this book.

My book recommendations

Rather grab one of these books (in increasing difficulty; btw all are available for free):

Let’s say that you had some exposure to FP. It doesn’t matter if it’s a Lisp or Haskell or OCaml or whatever. You won’t need that book. Instead take a look at all these pages from the PHP manual:

I frankly don’t know to whom this book should appeal.

No real depth

There are several other problems – most were addressed by Stefan. It’s incredibly shallow – I think it would have been a lot better if the author had spent more time on the language fundamentals. Dig into some C code and look at how array_filter() is implemented. Or something I loved in a lot of books I read on FP: reimplement the language in the language. Go into detail.
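To illustrate what I mean by reimplementing: even a few lines like these would have taught more. This is my own userland sketch with a made-up name (myFilter), not the C source of array_filter() and not code from the book.

```php
<?php
// Hypothetical userland re-implementation of the basic array_filter() behaviour.
function myFilter(array $input, callable $predicate) {
    $result = [];
    foreach ($input as $key => $value) {
        if ($predicate($value)) {
            $result[$key] = $value; // keys are preserved, like array_filter() does
        }
    }
    return $result;
}

$isEven = function ($n) { return $n % 2 === 0; };
$evens = myFilter([1, 2, 3, 4], $isEven); // [1 => 2, 3 => 4]
```

The preserved keys are the kind of detail you only notice once you go into the implementation.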

Haskell envy

Another thing, which Stefan also addressed, is the Haskell envy. It’s like physics envy:

In science, the term physics envy is used to criticize a tendency (perceived or real) of softer sciences and liberal arts to try to obtain mathematical expressions of their fundamental concepts, as an attempt to move them closer to harder sciences, particularly physics.

PHP isn’t Haskell. And as much as the author seems to love Haskell, it also isn’t the only way. Especially given PHP’s type system in comparison to Haskell’s. Maybe it would have been more interesting to work with a language that is closer to PHP than Haskell is.

Missed chance and PHP isn’t ready, yet

There are two more things. On the one side I see a missed chance. The author could have taken a piece of code and refactored it using FP. Or he could have shown how the development could differ (he even mentioned a REPL at the start). Maybe shown some potential bugs and how they vanish using FP.

The next thing, and I think that’s one important point, is that PHP isn’t ready yet. All the more, I would have shown how you can apply the ideas of FP to PHP. E.g. write code without side effects. Learn the power of higher-order functions, etc. He talked a bit about that but there was just that lack of detail. What a pity!
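Here’s the kind of small refactoring I would have liked to see. It’s my own sketch with made-up data and function names, not an example from the book.

```php
<?php
// Imperative version: builds up $total and $names step by step.
function summarizeLoop(array $orders) {
    $total = 0;
    $names = [];
    foreach ($orders as $order) {
        if ($order['paid']) {
            $total += $order['amount'];
            $names[] = $order['customer'];
        }
    }
    return [$total, $names];
}

// Functional version: no variable is reassigned, each step is a
// higher-order function applied to the result of the previous one.
function summarize(array $orders) {
    $paid = array_filter($orders, function ($o) { return $o['paid']; });
    $total = array_reduce($paid, function ($sum, $o) { return $sum + $o['amount']; }, 0);
    // array_filter() and array_map() keep the original keys, so re-index.
    $names = array_values(array_map(function ($o) { return $o['customer']; }, $paid));
    return [$total, $names];
}

$orders = [
    ['customer' => 'a', 'amount' => 10, 'paid' => true],
    ['customer' => 'b', 'amount' => 5, 'paid' => false],
    ['customer' => 'c', 'amount' => 7, 'paid' => true],
];
$summary = summarize($orders); // [17, ['a', 'c']]
```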

Conclusion / TL;DR

I don’t recommend the book. Read the books I linked if you are new to functional programming or just read the PHP Manual if you aren’t.