Zope Criticisms

Chris McDonough just posted a capsule criticism of the Zope project and culture to zope-dev in a discussion I started. I believe Chris and I have been "violently" agreeing on many issues in this discussion... I think his characterization is quite interesting and I'd like to share it with the wider world. I agree with it so much and disagree so much at the same time.

Even though I disagreed with the decision to include underwear as a logo on a (now rejected) design for a new zope.org homepage, I do think it's good to sometimes focus on our dirty laundry as it can help with a cultural renewal I think the Zope community needs and is ready for.

I think this information can also be interesting for developers of other web frameworks. Look at the stuff we deal with after having been around so long! Don't let this post mislead you: I see a lot of value in Zope technology and its community, otherwise I wouldn't have stuck with it for more than 10 years. Chris obviously sees value in it too, otherwise he wouldn't be arguing with me on zope-dev and extracting so many of its concepts into Repoze components.

I hope he'll forgive me for quoting him here:

I have no faith whatsoever that staying on the course we've been on for the last 9 years (large interconnected codebase, backwards compatibility at all costs, lack of any consumable documentation at a package level, not much curiosity about how other Python web frameworks work, not much real cooperation with folks that maintain other Python web frameworks, a constitutional inability to attract new users) will bring any sort of positive change.

As background: Chris characterizes the Zope culture in such an extreme way because it contrasts with the Repoze project he leads, which tries to do these things differently. The Grok project shares quite a few of these values as well, and wants to get closer.

I agree with much of this in the sense that it's a useful caricature of what's wrong with the Zope community and what needs to be improved. It's amusing that when I look for ways to apply the lessons of Repoze and Grok (among others) to Zope, we end up arguing so much. Chris interpreted my proposals for cultural renewal as a way to codify all the bad things he describes, while my intent was more or less the opposite. I have to improve the way I express myself!

Let's go into the details:

I have no faith whatsoever that staying on the course we've been on for the last 9 years

9 years is a long time. I agree that some cultural deficiencies, such as the bad presentation of modern Zope technologies to developers and a bad installation story, have lasted a very long time without enough of an effort to fix them. But there are other deficiencies that we're aware of and are making progress on.

large interconnected codebase,

This characterizes the current Zope 3 codebase quite well, but at the same time doesn't characterize our goals or efforts.

We had an effort in 2007 to split up our large interconnected codebase into small components. These are now the many zope.* and related packages on PyPI. Some of these are reusable independently, but far too many still pull in way too many dependencies (basically all the others), due to some seriously circular dependencies between them.

Recently others and I have spent quite a bit of time trying to improve these dependencies, and this is part of an ongoing process.

I therefore think this is a mischaracterization, in the sense that it is something the community knows about and is actively working on.

backwards compatibility at all costs,

I agree that we have erred on the side of too much backwards compatibility. That increased the overhead of changes tremendously and blocked innovation.

That said, I also see a lot of value in having a lot of components that can work together, and we do have a large collection of those in the Zope ecosystem. This is why Grok is so careful to stay compatible with Zope 3, so we can share that pool of components.

I'm in favor of an evolutionary approach where backwards compatibility on occasion is broken and it's clearly documented what developers should do to fix their code. We've found this worked quite well for changes in Grok. I'm also in favor of an approach where due to proper dependency factoring we can dump whole chunks of code that we aren't using (such as the Zope 3 ZMI) in a large step.

lack of any consumable documentation at a package level,

I agree that most package-level documentation could be improved tremendously by focusing on writing real documentation instead of half-test/half-documentation stuff.

That said, we also have a tremendous level of package-level documentation and interface documentation, and it's a mischaracterization of the values of the Zope project to say we haven't cared about documentation at all. We innovated with interface-level documentation and doctests, as well as making those doctests available on PyPI.

Chris has said in the past that this is a sort of "false optimum" that stops people from really fixing documentation issues, and I agree with him there that this is wrong (even though I do value doctests). We should make an effort to change our culture and redirect our documentation efforts to go beyond doctests. We've seen the adoption of Sphinx in our community in the last year, and I have good hopes we will make a lot of progress on this in the coming year.

I'll also note that documentation for the whole system has traditionally been lacking (how to get started, install it?). For this my answer is Grok. If you want to use the Zope 3 technology stack, it's by far the easiest way to get started.

not much curiosity about how other Python web frameworks work,

I'm not sure whether this characterization is accurate or not. Because Zope was there sooner than many other Python web frameworks, it's probably true we've not studied the competition as much as if we'd been there later and had more chance to compare and contrast.

I've personally been quite interested in seeing how the cultures surrounding other web frameworks work and trying to adopt lessons from this (DRY and convention over configuration for Grok, for instance, and proper documentation).

I've been able to apply the lessons I've learned from other web frameworks far better in the context of Grok than I have been in the context of the wider Zope community, and I wish that would change. In fact I'm trying to apply some lessons I've observed from Chris' efforts, Repoze, to Zope.

So, we should do more of this, indeed.

not much real cooperation with folks that maintain other Python web frameworks,

It's hard to judge this one, because what is "real cooperation"? We've tried reaching out quite a few times over the history of Zope, but I do think we can do better. Of course reaching out like this is one of the main things that Repoze is trying to do, so it's unlikely we'll be able to get up to the level of the Repoze folks any time soon.

The culture of cooperation between other Python web frameworks has started really taking off surrounding WSGI. Zope has tried to integrate with WSGI (and Twisted before then), but with moderate success in gaining community benefits from this. This means we need to try harder. I believe Grok's upcoming 1.0 release will help us a lot with the adoption of this technology, as we've been working quite hard to make sure it works out of the box.

I think we should do our best to integrate other technology in our own stuff, and we've had some progress with things like WSGI, Twisted and SQLAlchemy. Maybe Repoze is next, but I hear they think very badly of us indeed, probably because they know us so well. :)

a constitutional inability to attract new users

I share that concern very much. Part of the reason we don't attract more users is our lack of attention to documentation, proper web presentation, and our "here's a giant toolbox, it's flexible, you figure it out" approach. Grok has been one answer to this, and we've had a lot of good progress in bringing the old Zope 2 documentation up to speed as well.

It's good that the Zope technology is so central to other projects which do attract new users (Grok, Zope 2, especially through Plone) so we still have an influx of new users that way. We also get an influx of users of our individual libraries such as zope.interface and zope.component, and we want to encourage that happening more.

Besides this, I'm always surprised we are able to attract new users of the Zope 3 app server directly itself as well - there are more than you'd think given our rather bad public presentation. There must be some value in it after all then! :)

I think we should recognize the position of the Zope technology as central to Zope web frameworks that do attract users. I want to call that technology the "Zope Framework". It's not something users install directly, but it's something that is used to build Grok, Zope 3 and Zope 2, that can be installed.

We need to manage the Zope Framework as such. In it, there is a tension between the concerns of the individual libraries (the parts) and the whole (the integrated set of them that's traditionally been called Zope 3, but is also used by Zope 2 and Grok). We need to think of the Zope Framework as a whole, and as parts. In order to make the whole better we need to improve the parts, but in order to improve the parts we often need to think about how they fit into the whole as well.

We need to manage it so that we can resolve this tension so that we can have both good individual libraries and a better integrated experience. I'm optimistic we can resolve this tension to the betterment of the whole and the parts.

We need to look at ways to make the Zope Framework smaller, composed of more easily digestible parts, and being a whole that's easier to comprehend.

In reality we're not managing one big thing, but a tree of libraries that depend on each other, and people can approach parts of the tree as well as the whole. Breaking the tree metaphor, branches or nodes of the tree can be adopted into other trees such as repoze.bfg and Twisted. The Zope Framework, like Chris' description, is in a way a caricature of something more complex. It's a handy concept to organize a community around.

That community is the Zope community. Here's our dirty laundry. We're washing it so you can use it too. And we'll need to wash it again in the future. We're used to doing laundry. We've been at it for over a decade, and we won't be going away any time soon. Care to join us? :)

Cleaning up Zope 3's dependencies

This week a bunch of us (myself, Christian Theune, Wolfgang Schnerring, Brandon Rhodes, Jan-Wijbrand Kolman and Sylvain Viollon) have been sprinting in my house at the "Grok Cave sprint". We've been working on cleaning up Zope 3's dependency structure, which in places is very hairy. This meant that you could often pull in one fairly innocent looking Zope 3 package and as a result pull in almost all of them. This makes it difficult to reuse packages and upgrade code. Loosely coupled code and all that.
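To make the "pull in one package, get almost all of them" effect concrete, here is a minimal sketch that computes everything a package drags in through its dependencies. The package names and the edges between them are made up for illustration; they are not the real Zope 3 dependency graph, and this is not how tl.eggdeps works internally.

```python
def transitive_deps(graph, package):
    """Return every package that installing `package` would pull in."""
    seen = set()
    stack = list(graph.get(package, ()))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue
        seen.add(dep)
        stack.extend(graph.get(dep, ()))
    seen.discard(package)  # a cycle can lead back to the starting package
    return seen


# Hypothetical "before" graph: a circular dependency drags in everything.
before = {
    "app.container": ["app.component", "core.location"],
    "app.component": ["app.security", "app.container"],  # cycle
    "app.security": ["app.publisher"],
    "app.publisher": [],
    "core.location": [],
}

# Hypothetical "after" graph: the extracted package depends on very little.
after = {
    "core.container": ["core.location"],
    "core.location": [],
}

print(sorted(transitive_deps(before, "app.container")))
print(sorted(transitive_deps(after, "core.container")))
```

The point of the sketch is only that one edge in the wrong direction is enough to make the reachable set explode, which is why reversing or removing single dependencies can pay off so much.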

See also Mark Ramm's talk at the Django Conference about why having a sane physical dependency structure between packages is a good thing.

One major part of this dependency reduction project has involved extracting a new package zope.container out of zope.app.container . Another part involved reversing the dependency relationship between zope.traversing and zope.location [UPDATE: I mistyped zope.security instead of zope.traversing here previously]. We've also started extracting zope.site from zope.app.component and moved some code from zope.app.security and zope.app.component into zope.security. In addition we created two new tools (z3c.recipe.compattest and an addition to the Zope test runner) to help us keep track of things. We've also made a lot of use of an existing tool to track dependencies between packages called tl.eggdeps.

That's all gobbledygook to most people. So here are the before and the after pictures (of the zope.container and zope.location/zope.traversing work in particular).

Here is the before, the dependency graph of zope.app.container (with core packages zope.interface and setuptools excluded as they don't depend on anything and clutter up the graph):

http://startifact.com/cavesprint2009/zope.app.container-before.png

And here is the after, zope.container, which can be used instead of zope.app.container almost everywhere (and zope.app.container in fact uses it too), again with zope.interface and setuptools not shown:

http://startifact.com/cavesprint2009/zope.container-after.png

We believe this is significant progress! It's still a lot of packages of course, but we can at least motivate the existence of a dependency relationship on them in most cases.

All this work is far from done. It's been a lot of work to get this far. There are still many more dependencies to clean up. It will take more work after the sprint to get to a good dependency structure for the complete set of Zope 3 libraries. We're starting to see some light at the end of the tunnel now though.

Grok's songlist application

I've been following with interest a number of posts that talk about creating a simple REST-based web application that persists the number of plays various songs have had. Here's the history:

First a few comments on the protocol: the RESTfulness of this protocol could definitely be improved. As was remarked by a comment on the original post, REST-based apps return lists of URLs in overviews and the overview in this app doesn't. I'd also modify the way new songs get registered with the app and make that a POST request on the song container, instead of implicitly creating such resources by the mere act of traversing. I haven't made any such modifications to the protocol even though the last improvement would simplify my code as it'd let me get rid of the traverse method.

I thought it'd be interesting to implement the same using Grok (1.0a1). Without more ado, here it is:

import grok
from zope.app.publication.interfaces import IBeforeTraverseEvent

class App(grok.Application, grok.Container):
    def traverse(self, id):
        if id not in self:
            song = self[id] = Song()
            return song
        return self[id]

@grok.subscribe(App, IBeforeTraverseEvent)
def applySkin(obj, event):
    # make rest layer the default if necessary
    if not IRESTLayer.providedBy(event.request):
        grok.util.applySkin(event.request, IRESTLayer, grok.IRESTSkinType)

class IRESTLayer(grok.IRESTLayer):
    grok.restskin('main')

class AppREST(grok.REST):
    grok.context(App)
    grok.layer(IRESTLayer)

    def GET(self):
        return ','.join(['%s=%s' % (k, v.count) for k, v in self.context.items()])

    def DELETE(self):
        for key in list(self.context.keys()):
            del self.context[key]

class Song(grok.Model):
    def __init__(self):
        self.count = 0

class SongREST(grok.REST):
    grok.context(Song)
    grok.layer(IRESTLayer)

    def GET(self):
        return str(self.context.count)

    def POST(self):
        self.context.count += 1
        return str(self.context.count)

Before I go into the good news (Grok gives you two important features here that the other framework examples don't have), first the bad news.

It's about 10 lines longer than the CherryPy and Restish examples (the Werkzeug example is shorter still but rather low-level).

Performance-wise it's the slowest of the bunch. On my machine, which is comparable to the machines of the others (in my case an Intel Core 2 Duo 2400 MHz Linux box), I get about 580 requests per second (not too shabby):

Concurrency Level:      1
Time taken for tests:   17.39500 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      2500000 bytes
HTML transferred:       10000 bytes
Requests per second:    586.87 [#/sec] (mean)
Time per request:       1.704 [ms] (mean)
Time per request:       1.704 [ms] (mean, across all concurrent requests)
Transfer rate:          143.26 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     1    1   1.1      1      61
Waiting:        0    1   1.0      1      60
Total:          1    1   1.1      1      61

Now to the good news. The other examples all store their information in a global variable in the form of a dictionary. Gulp. This application actually features true persistence. When you restart your server, the counted information is still there - it's in the database. This means that the benchmark actually includes database access (to the ZODB).

You may have noted that there isn't much database access code there. That's because the ZODB allows transparent persistence of Python objects. This actually made it trivially easy to write this application with true persistence.
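The difference between a module-level dictionary and a store that survives a restart can be sketched without any framework. The example below is not ZODB and not what Grok does internally: it uses the standard library's shelve module as a stand-in, and the function names and file location are invented for the illustration. With the ZODB you don't even write the explicit open and store steps shown here.

```python
import os
import shelve
import tempfile


def play(db_path, title):
    """Increment and return the play count for `title`, stored on disk."""
    with shelve.open(db_path) as db:
        db[title] = db.get(title, 0) + 1
        return db[title]


def count(db_path, title):
    """Return the stored play count for `title` (0 if never played)."""
    with shelve.open(db_path) as db:
        return db.get(title, 0)


# Every call opens and closes the store again, which stands in for
# restarting the server between requests.
db_path = os.path.join(tempfile.mkdtemp(), "songs")
play(db_path, "song1")
play(db_path, "song1")
print(count(db_path, "song1"))
```

A global dictionary would start from zero again after each restart; here the counts come back from disk.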

The second feature has more to do with framework power. This is not low-level code, and that shouldn't be underestimated. We have available to us a framework that offers a ton of features, both out of the box and as extensions. I'll talk about some out-of-the-box features here.

Grok's REST system allows you to extend existing (persistent) objects in your application with RESTful behavior. These objects can retain their original UI entirely. If I actually left out the applySkin code above, the RESTful URLs would look like this:

http://localhost:8080/++rest++main/song/1

and the normal URLs to the objects would look like this:

http://localhost:8080/main/song/1

This means that you could give your app both a normal UI as well as REST-based access. In the example I've used the applySkin line to consolidate them into a single URL space however.

In addition, Grok's REST support also features a powerful, built-in security system. You can protect each method with a permission by adding the @grok.require decorator:

@grok.require('some.permission')
def GET(self):
    ...

Grok also takes care of URL management for you out of the box. The objects in the app all have a URL automatically. Should I want to display the URL of each song object in addition to its count I'd change the GET method of SongREST to this:

def GET(self):
    return grok.url(self.request, self.context) + ' ' + str(self.context.count)

If you want to see a much bigger REST-ful app I've written with Grok for a customer (ID-StudioLab at the Technical University of Delft), please check out imageSTORE (it comes with a lot of doctests). It's a RESTful persistent storage of image information.

The Ghost of Packaging Past and Future

There has been a recent discussion on how packaging of Python libraries should be done. As frequently happens, the Zope community encounters problems before many other communities do, and I think we have an example of that here.

James Bennett wrote:

"Please, for the love of Guido, stop using setuptools and easy_install, and use distutils and pip instead."

Ian Bicking then pointed out that this is impossible, as pip is actually built upon setuptools.

With Grok we don't use easy_install much. setuptools, overall, is great to have. It's hardly perfect, but it offers many essential features that distutils alone doesn't offer, such as the ability to actually list the dependencies of your project in the first place. We also greatly enjoy the benefits that Buildout brings.

Even though we've been dealing with advanced packaging issues in the Zope project for quite a while now, I am not going to be presumptuous and tell you what to do or not to do, for the love of Guido or not. I'm not going to tell you to use Buildout, for instance. Instead I'm going to sketch out what we do right now, and what our use cases are. Hopefully this will help inspire further improvements to the Python packaging infrastructure.

What Grok does

Let me first sketch out the way Grok's installation system works.

To install Grok, you first need to install a small tool called grokproject. This is the only time when easy_install comes in. It is used to download and install the small grokproject distribution. Many people do this in a virtualenv. After that we don't use easy_install anymore, and I actually imagine you could use pip as well in this step.

Once you have grokproject installed you have a new command-line utility to create Grok-based projects. It's appropriately named grokproject. To create a new Grok project, you type something like this:

$ grokproject myproject

grokproject will ask you a few questions (username/password of the admin user), and will then create a project directory called myproject and install Grok's basic environment. This includes startup and other useful scripts in a bin directory, a setup.py file and a src directory which will contain the actual source code of the project in a Python package.

In the Grok project we go for code reuse as this enables us to reuse the brains of smart people, so all this is done with a combination of Ian Bicking's Paster and Jim Fulton's Buildout.

Paster is used to create the directory and file layout from a template. You can in fact also use the Grok paster template using the paster create command to create a project directory and then call buildout manually - it will do the same thing.

Buildout is then automatically invoked. It will do whatever we tell it to do through the project's buildout.cfg file. Buildout is pluggable with recipes that are themselves distributed on PyPI. In Grok's case the buildout step does the following:

  • download Grok and all its dependencies from PyPI (we actually have an optimization here in that it will download a large bundle of libraries if it can)

  • install all eggs in a centrally shared eggs directory (in your home directory typically). This way the next time you create a project with grokproject you will not have to download anything new. Note that Grok's eggs are installed unzipped; we don't like zipped eggs much either.

  • install an empty ZODB database.

  • install the zope control scripts for starting and stopping the server and such.

With Grok 1.0 (coming in early 2009) we will start using paster serve for managing the Grok process, instead of our home-grown zopectl script, so we can enjoy a lot of benefits of WSGI support out of the box.

Version management

Very important in this whole procedure is version management. When someone installs Grok it's essential they only get versions of dependencies that we actually know will work. If they got an arbitrary recently released version of a library, any new release of a library could potentially break Grok. In addition, different Grok projects would end up using different versions of dependencies, which would make communication about bugs rather difficult. We struggled with this for a bit in the second half of 2007 and then solved it, using a feature of Buildout.

Buildout allows you to specify the versions of the dependencies you need in a special section. The specification looks like this:

[versions]
grok = 0.14.1
grokcore.component = 1.5.1

The versions specification can be maintained in a separate file (similar to pip's requirements file), and indeed grokproject will automatically install a file versions.cfg in your project that contains the versions specification for anything you use in your package. These files can be loaded from the filesystem or from URLs.

We publish these version lists on canonical URLs on http://grok.zope.org/releaseinfo that grokproject automatically inspects when you create a new project.

An important feature is that in your project you can override which versions you really want, and extend it with versions of other packages that your project is using. Grok gives you its own list to reuse, but doesn't force you to do so. This is possible because buildout has a built-in feature for overriding sections. Note that such overrides wouldn't work if you locked the versions down in the individual setup.py files themselves.
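The override behavior can be sketched in a few lines of Python. This is only an illustration of the idea, not Buildout's implementation: it parses two [versions] sections with the standard library's configparser and lets the project's pins win over the framework's. The project-side entries (the newer grokcore.component pin and my.library) are invented for the example.

```python
import configparser

# The list a framework such as Grok would publish.
FRAMEWORK_VERSIONS = """
[versions]
grok = 0.14.1
grokcore.component = 1.5.1
"""

# Hypothetical project-level pins: one override, one extension.
PROJECT_VERSIONS = """
[versions]
grokcore.component = 1.5.2
my.library = 0.3
"""


def read_versions(text):
    """Parse the [versions] section of a config text into a dict."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep package names exactly as written
    parser.read_string(text)
    return dict(parser["versions"])


def combine(base, overrides):
    """Return base pins updated with the project's overrides and additions."""
    merged = dict(base)
    merged.update(overrides)
    return merged


pins = combine(read_versions(FRAMEWORK_VERSIONS), read_versions(PROJECT_VERSIONS))
for name in sorted(pins):
    print(name, "=", pins[name])
```

The framework's list stays untouched; the project only states where it differs, which is what makes the shared list reusable.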

When you want to upgrade an existing grok-based project to a new release of Grok, you simply download the appropriate versions.cfg for that Grok release into your project and re-run buildout. It will proceed to download any new packages you may need and you're ready to go. Alternatively you can just make your project point to the releaseinfo URL directly (we used to do this by default), although that has the drawback that you cannot re-run buildout while you are working offline.

Who manages version requirements?

Ian Bicking wrote:

Pip requirement files are an assertion of versions that work together. setup.py requirements (the Setuptools requirements) should contain two things: 1: all the libraries used by the distribution (without which there's no way it'll work) and 2: exclusions of the versions of those libraries that are known not to work. setup.py requirements should not be viewed as an assertion that by satisfying those requirements everything will work, just that it might work. Only the end developer, testing the system together, can figure out if it really works. Then pip gives you a way to record that working set (using pip freeze), separate from any single distribution or library.

I'll note that we've been managing the setup.py requirements in the way that Ian recommends from the start. It should express which libraries are needed and the minimum version requirements that the developer writing the setup.py knows about.
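The distinction between "might work" (minimum requirements) and "known to work" (a frozen list) can be shown with a small check. This is a sketch under simplifying assumptions: the package names and version numbers are hypothetical, versions are purely numeric and dot-separated, and real tools such as setuptools handle far more version formats than this.

```python
def parse_version(text):
    """Turn '1.5.1' into (1, 5, 1); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def unmet(minimums, pins):
    """Return names whose pinned version is missing or below the minimum."""
    problems = []
    for name, minimum in minimums.items():
        pinned = pins.get(name)
        if pinned is None or parse_version(pinned) < parse_version(minimum):
            problems.append(name)
    return sorted(problems)


# What a setup.py would express: libraries needed, with known minimums.
MINIMUMS = {"lib.a": "1.5", "lib.b": "0.10", "lib.c": "2.0"}

# What a shared known-good list would express: exact tested versions.
KNOWN_GOOD = {"lib.a": "1.5.1", "lib.b": "0.9.7"}

print(unmet(MINIMUMS, KNOWN_GOOD))
```

Passing this check says nothing about whether the combination really works; it only rules out pins that contradict what the library authors already know.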

Who however is the "end developer"? Would that be a developer that uses a framework like Grok, or the developers of the framework itself? I don't know whether Ian meant this, but let's define "end developer" as the developer actually building an application for the sake of this discussion.

Of course the end developer always makes the final determination whether libraries work together. But reuse is possible here, and a project like Grok wants to give the end developer as much help as we can.

One of the advantages of a framework that's distributed as a single large distribution containing many sub-packages, like Django, is that the developers of the package can determine whether this set really works together, not the end developer that uses Django to build an app. There is no chance someone will install a version of Django where subpackage 'a' doesn't work properly with subpackage 'b', unless the Django developers themselves made a mistake.

Indeed the Zope project used to distribute Zope 3 this way (and we still distribute it that way as an option), but we wanted more flexibility for end developers to select their individual dependencies, and evolve different parts of the project at different rates. We also wanted to make it easier for others to use individual components developed by the Zope project, so we split Zope up into a whole bunch of packages and started distributing them that way. Now we've been plugging away at unwinding all sorts of dependencies that grew up between many of our supposedly independently reusable packages while we weren't looking, but that's another story.

If you have de-coupled packages, like Grok does, the burden of maintaining which versions work well together should not fall solely on the end developers that use a set of libraries. This would mean that each developer that used Grok would need to figure out a large list of versions, as Grok is composed of a large list of dependencies.

While I fully agree that it's important that end developers should have the ability to determine the exact version of each package if they should want to, I think often they want to delegate this job to the developers of a larger project (like Grok or Django) to determine what combination is known to work.

We want people to be able to work together to determine the known-good set. We don't want to force end developers to figure out which list of versions to freeze each time they create a new project. We need to have the ability to share these lists of frozen versions, work on them as a team, and release them so others can reuse them, and potentially extend and override them.

That's what we've been doing with Grok for more than a year now. We maintain lists of versions for the developers that use Grok, but we give our users the opportunity to override and extend our choices should they wish to do so.

From what I see pip's requirement file mechanism and bundle mechanism get there part of the way, but I do not yet see patterns for delegating the management of such requirements files to a project like Grok, while still allowing individual developers to override these choices where needed. Perhaps I'm missing something.

The future?

Grok's current approach could be improved. When I build a framework on top of Grok, I need to host a new requirements file for that new framework as well on some URL. While it's possible to combine requirement lists together into a larger list with Buildout's mechanisms, it is a bit frustrating we are forced to have to maintain such a separate infrastructure entirely independently from PyPI. It's an extra release step too, making the entire release process that's automated so nicely by distutils and setuptools more cumbersome again.

I think it would be useful to allow people to ship "known-good" requirements along with the distributions, and host these directly on PyPI. Someone that uses a library will not be forced to use these requirements, but can opt to do so if they wish. Requirement lists should be able to build on and override other requirements lists. I wrote down some ideas on how this might work in 2007. Perhaps the people working on this problem today will find some of my old thoughts useful, so if you're really interested in these topics, please read the linked article.

I like doctests

It seems to be a recent trend to point out things you don't like about doctests. There are two articles by Andrew (? - see update below) and one by Ned Batchelder. There's also one by Marius Gedminas.

I take the doctest negativity as a sign of increased popularity of doctesting in the Python world. Doctests are now being seen and read by a larger number of Python programmers, so there are now more people to talk about the undoubted drawbacks of doctests. (Of course it is also a sign of people disliking aspects of doctests. Marius for one has been exposed to narrative doctests for years.)

I like narrative doctests, and have been using them for years now. They often constitute the bulk of the tests of my code. To Andrew at least, that probably means I "abuse" them. Why do I like them?

Doctests are not an ideal testing tool. There are pros and cons. Read the linked articles for some cons (and pros: Andrew points out they are easy to write). Narrative doctests aren't an ideal form of developer documentation either: a well-written, well-maintained dedicated text is better.

The great thing about doctests is that you can write fair tests and fair developer documentation, at the same time. You can use doctests to provide reasonable test coverage suitable for solid, real-world code. Importantly, those same doctests then also provide developer-level documentation that may not always be great, but is still much better than the frequent alternative (nothing).

One advantage of using doctests to describe your API is that you use the API of your code in the doctest before you actually use it for real. As a result, the API of the code you write becomes better as you are forced to think about it early on during the design process. Unit tests of course have the same benefit: improved API design is actually one of the great but rather underacknowledged benefits of unit testing. But doctests encourage you to think about your API design more than plain unit tests, as you're actually writing prose that tries to explain the way your API works to the reader. If it's hard to explain, it may be time to change the design.

Doctests often contain usage examples. Unit tests do too, but doctests have a narrative around them, including the often all-important setup code. Instead of digging around to see which objects you're supposed to create and what methods you're supposed to call in what order, you have a narrative in the doctest that tells you what to do.

Another advantage of doctests is that they spell out the intent of the tests better than a typical unit test suite does. An individual unit test can of course describe its intent by being well-named and by having comments, but nothing encourages you to do so. Doctests have an actual narrative, so this style of testing actively encourages writing down the intent.
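To make the idea of a narrative doctest concrete, here is a minimal sketch. The Basket class, its methods and the file name 'basket.txt' are hypothetical, invented purely for this illustration; only the doctest module from the standard library is real. The prose between the examples is the narrative, and it states the setup and the intent of each check.

```python
import doctest


class Basket:
    """A hypothetical shopping basket, used only for illustration."""

    def __init__(self):
        self._items = {}

    def add(self, name, amount=1):
        self._items[name] = self._items.get(name, 0) + amount

    def count(self, name):
        return self._items.get(name, 0)

    def names(self):
        return sorted(self._items)


# The narrative would normally live in a text file next to the code.
NARRATIVE = """
A basket starts out empty, so we first need to create one:

    >>> basket = Basket()
    >>> basket.names()
    []

Adding an item makes it show up in the basket:

    >>> basket.add('apple')
    >>> basket.names()
    ['apple']

Adding the same item again increases its count instead of listing it twice:

    >>> basket.add('apple', 2)
    >>> basket.count('apple')
    3
    >>> basket.names()
    ['apple']
"""


def run_narrative():
    """Run the narrative as a test and return the (failed, attempted) results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(NARRATIVE, {'Basket': Basket}, 'basket.txt', None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)
```

Calling run_narrative() executes every example in the text and reports how many were attempted and how many failed, so the same document serves as both test and developer documentation.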

Here are some examples of narrative doctests I've written over the years. I don't find it particularly hard to work with doctest. In all cases below they form the bulk of the tests in the codebase. Is this abuse of the doctest format? Judge for yourself whether you like narrative doctests or not:

  • hurry.resource (a general framework for including resources (CSS, JS) in web pages. I intend to write more about it in the future)

  • martian (declarative configuration embedded in Python code without metaclasses)

  • classix (an experimental configuration system for hooking up classes to lxml elements)

  • z3c.vcsync (synchronizing application state with a version control system, for Zope 3)

  • hurry.workflow (a simple workflow system for Zope 3)

  • z3c.saconfig (integrating SQLAlchemy with Zope 3's component architecture)

  • imageSTORE (an example of using doctests with an application instead of a library. This doctest works through the REST protocol that this application offers)

Narrative doctests are not an ideal tool; no tool is. You have to actually write a narrative and if you don't, you are left with a testing tool that is in many respects worse than unit tests. I maintain that doctesting is an approach that's good enough to write good, solid software with reasonable developer-level documentation. You can enhance the reuse potential and API design of your library by writing a narrative doctest for it.

Feel free to use unittest and doctest where appropriate, to your taste. But don't be scared off by the recent negativity that seems to surround doctests. Doctests have many benefits. Doctests are a good balance for me personally, and perhaps they will be for you as well.

[update: I don't know why I gave Andrew a last name; I'm not sure where I got that from so I'll take it away again.]

Grok tech in Plone continued


In my previous blog entry I tried to make the case for Plone adopting Grok technology. I gave some background on Grok and talked about some of the pain points that I think Plone has and that Grok has been trying to tackle. My goal was to explain where Grok is coming from, and why it's a good fit for Plone.

Martin Aspeli, in a comment, wrote the following:

However, there are three charges that the advocates of Grok-in-Plone would need to counter:

  • Not enough people are using Grok, i.e. it's yet another esoteric technology choice

  • It's too much magic (or too hard to debug when things don't do what you'd expect)

  • It's yet another way of doing things that can't completely displace the existing ways

I'd be interested to hear your responses to these. :-)

These are each good points that need to be discussed. Let me respond to them here.

Not enough people are using Grok; Grok is too esoteric

The too-clever answer would be that enough people would be using Grok if the Plone community hopped on board. :)

Grok technology expands on a technology choice that Plone already made: Zope 3. Grok is the project that is trying to make Zope 3 technology feel less esoteric. It seems to me therefore that Grok technology is a way to make Plone's technology choices feel less esoteric, not more.

I would also be interested in hearing alternative suggestions. Are people suggesting Plone dump Zope 3 technology? What would replace it? Or are people saying there are better choices to make Zope 3 easier to work with? Or are people saying Zope 3 technology as it stands is just fine already?

Grok provides a smooth upgrade path from technology that you're already using, it has been around for two years now, and it's focused on reducing pain points for developers. The Grok community isn't an enormous community, but it's active and growing.

I agree with the general idea that we shouldn't be making too many esoteric technology choices. On the other hand, the Zope community is sometimes tackling problems that others haven't encountered yet. The solutions will inevitably seem esoteric to some people, because we simply were the first to get there.

It's too much magic

Grok's way of doing component configuration builds on the Martian library. You can extend Grok in a declarative fashion. We've continued to work on expanding Martian so that individual bits of Grok become more declarative and less special. We're also documenting this behavior extensively.

One can make the charge that the defaulting rules that Grok's directives (to associate a view with a context, give a name for a view, and so on) use are too much magic. They're also the core of Grok's convention over configuration approach. Convention over configuration has two positive effects:

  • it encourages different people to follow the same conventions in code, so that code that follows the conventions is more regular.

  • it allows people to learn about Grok gradually. They don't need to know all directives in order to accomplish something, but once they come to a road block and they want to override the default behavior, they are ready to learn a new directive.

If you are unsure about Grok's behavior, you can always be explicit by writing all the directives down explicitly yourself.
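The defaulting idea can be sketched in plain Python. This is not Grok's actual implementation or API: the View base class, the name attribute and the view_name helper are hypothetical stand-ins for a directive that has a conventional default and can be overridden explicitly.

```python
class View:
    """Hypothetical base class; `name` plays the role of a directive."""
    name = None  # not set: fall back to the convention


def view_name(view_class):
    """Return the explicitly declared name, else the lowercased class name."""
    # Look only at the class itself, so a subclass does not silently
    # inherit the explicit name of its parent.
    explicit = view_class.__dict__.get('name')
    if explicit is not None:
        return explicit
    return view_class.__name__.lower()


class Index(View):
    pass  # relies on the convention


class EditForm(View):
    name = 'edit'  # overrides the convention explicitly


# A toy registry built from the deduced names.
registry = {view_name(cls): cls for cls in (Index, EditForm)}
```

Index is registered under a name derived from the class, while EditForm states its name outright. A newcomer can start with the first form and only learn the explicit one when the default stops fitting.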

As to the suggestion that Grok code is hard to debug when things don't do what you expect, I'd be interested to see some examples of this problem. I haven't noticed things being particularly hard to debug myself, but of course my perspective is special here. Grok ships with a built-in introspector which should help, although we can certainly improve it quite a bit more.

Grok is a form of automation. One person's automation is another person's magic. We're on guard against inconsistent automation with lots of special cases.

It will be a new way that cannot entirely displace the existing ways

I think this is not a comment particular to Grok. Any new way you introduce is vulnerable to this charge. It's inherent to the evolution of a large framework and the tension between moving forward and backwards compatibility, which I already described in my previous article.

Grok provides a smooth transition path at least for the Zope 3 code that already exists in Plone. It allows a gradual transition. This is good, as it is the only way forward for a larger system with so many users. It is also bad, because two ways will co-exist at the same time. I'm not sure this can be avoided, however.

As for non-Zope 3 APIs in Plone, where you need to register things with Zope 2 or the CMF, Martian technology can help there too. The Silva project has been doing interesting work in this direction; it uses Grok technology to register its distinctly Zope 2 content types.

I think some people have the impression that Grok takes away from the power of ZCML. They seem to think Grok is for some basic stuff, but when things really get tough you have to fall back on hard-core ZCML. That's not the case. I find myself writing very little ZCML these days. Grok isn't a simplifying layer over ZCML; it's an alternative configuration system for the Zope 3 component architecture that we believe is easier to use. That said, Grok allows ZCML and Grok code to coexist just fine if you want or need to have both.

Whether Grok technology, or any technology, will displace old ways entirely in Plone is not something that is limited by the technology itself. It is a choice the Plone developers make, and if they make it, a challenge the Plone developers must rise to.

Why I think Grok can make Plone easier


Let me start off by saying that I'm not a Plone developer. I've got a lot of experience with Zope 2 development though, and I keep an eye on the Plone community. I hope the Plone developers will not mind a relative outsider's observations and suggestions.

I understand that many of the people developing with Plone, or learning Plone, are frustrated. Now this is nothing unusual - I think fundamentally it's pretty difficult to extend and customize such a feature-rich system.

We should ever strive to move forward however, and the situation is clearly worse than it could be: Plone presents many different ways of doing things to the developer. This is largely a result of its evolution over time. Things a Plone developer has to deal with are Zope 2, the CMF, various Zope 3 technologies, the Five integration layer, and many technologies the Plone developers created.

I'm responsible for creating the Five project some years ago (many others did a lot too), bringing Zope 3 technology to the Zope 2 platform. This ultimately led to the introduction of Zope 3 technologies into Plone. This has been, I take it, a mixed blessing.

I think Zope 3 technologies have undoubtedly given Plone developers new abilities to create powerful, pluggable features for Plone. That's great. It's also allowed Plone to adopt well-engineered Zope 3 libraries. It has also aligned the interests of the Plone community better with those of the Zope 3 community, and the Python community as a whole. This led to us all working together better; a not to be underestimated benefit in open source development. As a result, Plone developers are now working on code that is useful outside of Plone (for instance with Grok), and similarly Zope 3 developers are working on code useful in Plone.

On the other hand, the addition of Zope 3 technology adds yet another way to do it for those people who want to extend and customize Plone. It's yet another set of concepts to learn about. What's worse, these concepts interact in "interesting" ways with older bits that are already there. This results in frustration and head-scratching.

In addition, when given a powerful tool for creating pluggable frameworks, it's easy for a developer to go overboard, and create something powerful, very pluggable, but not very agile. That's not directly the fault of the tool itself. Building a good framework and using tools well is a learning process. But in the meantime this results in yet more complexity for people to deal with.

Why can't there be a simple, uniform way of extending and customizing Plone?

In fact, the Zope 3 developers created at least the basis of such a uniform system. It's been informed by history: it built on lessons learned with Zope 2 and CMF. This uniform system for customizing and extending software is called the Zope 3 component architecture. In addition, there's an explicit, and valuable, notion of configuration that supports extension and customization of software by changing the configuration.

Plone is a victim of its success. It's been around for a while. Plone is therefore a piece of software with a history. There are many useful features in Plone that use the older ways of doing things. The community has finite resources to rewrite these. It's generally not a good idea to start rewriting everything at once, either. And if you manage to rewrite something, you'll have people complain about breaking backwards compatibility...

Still, Plone is being rewritten, step by step, to make use of Zope 3's component and configuration technology. Missteps are made. People are frustrated. But I do believe Plone is fundamentally going in the right direction with this. The Plone developers have been working for years on creating a uniform system for customizing and extending Plone, building on Zope 3 foundations. Ironically, but entirely understandably, this effort is also a source of frustration.

Why can't this component and configuration system "Just Be Python"?

Zope 3 is far from perfect. Traditionally, configuration is done with Zope 3's ZCML; the bit that's making life harder for many people working with Plone today. I can understand this frustration very well myself, as I've felt it when developing with Zope 3.

Now I'm not someone who is fundamentally against the notion of domain-specific configuration languages. As I indicated above, I think there is a lot of value in having a notion of explicit configuration in your system. ZCML and Zope 3's configuration system are valuable. So I understood why ZCML was there. I even defended it - you can dislike it because it's an XML dialect, but that's really superficial.

Then I worked with it. I learned over time that ZCML is one of the aspects of Zope 3 that cuts down on my agility as a developer. Not because it's XML, but because it's separate. I have to do too many mental context switches - Python code to ZCML and back. If I forget to write a bit of ZCML to hook things up, which is easy to do, things won't work. I don't see the complete picture when reading the code, and I don't see the complete picture when reading ZCML. I concluded it didn't fit my brain. I think it fits the brains of some people, but I also think there are quite a lot of people like me.

ZCML wasn't the only thing that made Zope 3 harder for me to work with than I thought it should be. Other bits are its very explicit and invasive security system (but that doesn't bother a Zope 2 developer already, as it's not in Five), and the tendency (partially cultural and partially encouraged by Zope 3's security system) to write a lot of explicit interfaces before they are really needed. I see a lot of value in the explicit interfaces that zope.interface supports, but I typically want to add them to my code gradually as I evolve it. I don't want to feel the pressure to write them up front.

My response to these issues, back in the summer of 2006, was to start thinking about Grok. In the fall of 2006, now two years ago, some of us got together and fleshed out the design of Grok in more detail, then started building it. Along with improving the technical aspects of the Zope 3 experience, we also wanted to take a fresh look at the way the Zope 3 community works and how it presents itself. We got quite far in two years.

The Grok technologies we developed are now finding their way back into Zope 2. The process started with the extraction of the grokcore.component library from Grok. grokcore.component works out of the box in Zope 2. Seeing this, Zope 2 developers then started working on the five.grok project, bringing back more Grok technologies to Zope 2.

There has been a nice synergy between the efforts to bring Grok technology to Zope 2, and the efforts to integrate Grok technology in non-Grok ("legacy" :) Zope 3 applications. The libraries we've been extracting for reuse in Zope 2 are reusable in Zope 3 as well, and vice versa, with only a little bit of extra effort. Both motivations (use in Zope 2, use in Zope 3) nicely drive development in the same direction. It is, I think, a nice example of how our larger community can work together, something the Five project has helped make possible.

I hope Grok technologies make their way into Plone. I think they can help the developer's experience a lot.

Why should Plone developers believe a guy who is partially responsible for bringing the whole ZCML mess to you in the first place? Above I make the case that the Five project brought a lot of value to us, not just frustration. And the Five project allows us to share our frustrations in a larger community: although I'm not a Plone developer, I've shared some of your frustrations, and helped start work on a solution: Grok.

Grok is here today. The technology is there for the taking. It's essentially "Just Python": you do not write ZCML, and you don't even have to write much more than the Python code you'd need to write anyway - it just gets configured for you. Frustrated with the Zope 3 component and configuration system? Grok technology is designed to make those things easier!

People have a legitimate fear it will result in yet another layer they will have to learn about, and have to deal with during development. This will be true in part, as ZCML isn't going to go away right away everywhere. This means, I fear, people will need to know about it at least a bit. But at least by adopting Grok technology you'd be moving forward, in backwards compatible fashion. You'd keep the good things that Zope 3 already offers while getting rid of some of the bad aspects of it.

Grok, as a sub-community, cares intensely about the developer experience. We care about DRY: don't repeat yourself. We care about bringing a smooth learning curve towards the advanced technologies of Zope 3, while not breaking compatibility with it. We try to make our "new layer" as thin as possible, choosing to replace an existing layer instead of piling yet another new one on top of it. We're about making the easy things easy; thanks to Zope 3, the hard things are already possible and we do not take away from this power.

I hope the Plone project will align its interests with those of the Grok project. I admit I am self-interested; I think adoption of Grok technology in Plone would be of benefit to the Grok project at least as much as it would be to the Plone project. But that's what good open source development is about: aligning interests.

Thanks for listening!

Happy Second Birthday Grok!


Grok the codebase is 2 years old this week. Two years ago we had the first Grok sprint in Halle, Germany, at the Gocept offices. A lot has happened since then. For me personally Grok is my development workhorse now and has been since early 2007. It's something I use during development every day.

Let's review some of the highlights of what has happened since last year's birthday.

  • Grok has a Plone-based website. This site hopefully looks welcoming to new users and thanks to the CMS backing it, it allows people to contribute documentation for Grok easily.

  • In addition to the Plone-driven website, Grok has a Sphinx-driven documentation website. This allows the developers to maintain important pieces of documentation in Subversion, along with the codebase.

  • Grok now officially works with Python 2.5.

  • Grok's technology became usable in non-Grok Zope 3 applications and in Zope 2 as well, as we split off important functionality into reusable libraries.

  • Some feature highlights since the last year are viewlet support and ZCML auto-inclusion, as well as eggbasket support for grokproject.

What's currently in the works?

  • From what I've seen, Grok's going to really improve the development experience in the next feature release of the Silva CMS. A lot of the drive to port Grok technology to Zope 2 has also come from members of the Plone community, so I have good hopes that Plone will benefit from Grok technology as well. I know there is currently a lot of pain felt by Plone developers - too many ways to do things, and too many files to edit. I hope Grok technology can be useful to make their lives easier.

  • WSGI out-of-the-box support is still in the works, but should be released very soon now. People have been using Grok with WSGI for a long time now, but we still need to ship with a story that works straight away.

  • People have been working on improving integration between Grok and the powerful z3c.form form generation and handling library.

  • We've worked a lot on Grok's integration with SQLAlchemy in the form of megrok.rdb. An initial release of this package should be around the corner.

  • We've been reviewing how Grok's view story works in connection with inheritance, and we have hopes to improve this.

  • We've been working on improving Grok's support for the inclusion of static resources.

I think we should be heading towards a Grok 1.0 release within the next couple of months. Meanwhile, we're already thinking about larger changes that can go into Grok afterwards. I think there are quite a few exciting technologies that should be included in Grok out of the box, and also a lot of opportunities for engineering Grok's underlying technology to be even better.

Grok's release mill


It already happened last week, but I thought I'd mention our Grok 0.14 release. Grok 0.14 is the first release of Grok that officially works with Python 2.5, though unofficially Grok has worked with Python 2.5 for a while on many platforms. There is already a report of Grok working with Python 2.6!

The other major change in this release is the spin-off of three new libraries that are also reusable in plain Zope 3 applications as well as in Zope 2 (through five.grok):

We've also spun off the grokui.admin package. This contains Grok's user interface. This should mean it becomes possible to deploy Grok applications without this user interface included, which adds security to deployments.

Earlier, Grok had already spun off grokcore.component, which is a layer on top of zope.component and zope.configuration enabling you to use Zope's component architecture while writing plain Python code.

Earlier still, the Grok project spun off the martian library for deducing configuration from Python code itself. Martian has much improved since its first release, making it easier than ever to auto-register classes with any configuration system of your choosing, define directives and declare sensible defaults in case those directives are not used.
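As a rough illustration of what deducing configuration from the code itself can mean, the sketch below scans a namespace for subclasses of a base class and hands each one to a registration callback. It does not use Martian's real API; Component, scan and the example classes are hypothetical names chosen for this sketch.

```python
import inspect


class Component:
    """Hypothetical marker base class for things that get auto-registered."""


def scan(namespace, base, register):
    """Call register(cls) for every proper subclass of `base` in `namespace`."""
    # Sort by name so the registration order is predictable.
    for _, obj in sorted(namespace.items()):
        if inspect.isclass(obj) and issubclass(obj, base) and obj is not base:
            register(obj)


class Mailer(Component):
    pass


class Indexer(Component):
    pass


class Helper:
    pass  # not a Component, so the scan ignores it


found = []
scan(
    {'Mailer': Mailer, 'Indexer': Indexer, 'Helper': Helper,
     'Component': Component, 'answer': 42},
    Component,
    found.append,
)
```

For a real module one would pass vars(module) as the namespace. The register callback is where "any configuration system of your choosing" would plug in; here it simply appends to a list.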

Why do we spin all these packages off? Why not develop Grok as one big lump of code? The reality is that we can't anyway - Grok is based on a whole range of Zope 3 components and other packages that are developed separately. Why though do we make our lives harder and split off things from Grok itself?

First, thanks to buildout our lives are not that much more difficult. It's easy enough for us to continue to develop these packages in sync with each other when we want to. We can use svn externals to pull a development version of a package into another and then tell buildout we want to develop using that version. This is basically an enhanced version of setuptools develop installation mode.

Still, why is it worth it?

In the abstract, splitting off separate packages helps us safeguard conceptual integrity of a package. Giving a package a separate identity makes us think about what this package is really for, and helps keep the scope of a package clear.

Since we use separate packages with separate responsibilities, it allows us a smoother framework evolution as well - we can more easily decide to drop a feature from the core and adopt another feature into the core if these features are independently packaged.

Most importantly and most concretely, the separate packaging helps with code reusability. The Grok developers want the Grok technologies to be used by many people. We've already seen the uptake of our grokcore.* packages in plain Zope 3 projects by developers who like some of Grok's features but don't want to pull in all of Grok. Even more importantly Zope 2 developers are starting to use our technology in CMS projects such as Silva and Plone. The martian package has also seen use in projects like repoze.bfg. This way a wider community becomes a stakeholder in the Grok project, and we feel this is very important.

How easy is it to use our code in non-Zope Python projects? The answer varies. Martian has only a dependency on zope.interface. grokcore.component has a slightly larger but still very well defined set of dependencies and is still entirely reusable in any Python project. The other packages unfortunately are dependent on a larger set of Zope 3 dependencies making them not reusable outside of a Zope setting.

The Repoze project has made many Zope technologies available to the outside world. The Grok project is looking at starting to use some of that code. The Zope developers are also looking at the Zope packages themselves to see whether we can't cut some more dependencies here and there increasing their reusability. That's another advantage of separate packages: it makes us aware of reusability issues, and lets us work on it.

What Zope can learn from Zope


So we've had what Django can learn from Zope, what Zope can learn from Mozilla, and a few years ago already we had what Zope can learn from Ruby on Rails. (and www.zope.org still sucks, but grok.zope.org and repoze.org don't)

Since the Zope of almost a decade ago was mentioned by a TurboGears guy in a negative light to a Django audience, a few of course took this opportunity to bash Zope, as the word Zope is like a red flag. Zope has accumulated a long, rich history of good and bad ideas, so we must present a tempting target.

Zope can learn from Django, and is of course doing so. We've been around for a decade and you aren't around for that long if you don't learn and adapt.

Let's now talk about what Zope can learn from Zope itself. How can that be?

The Past from the Future

The obvious way for Zope to learn from Zope is for the past to learn from the future. We've presumably learned some things over time, after all. What could the old Zope 1 and Zope 2 hackers have learned from modern web development with Zope 3, Repoze, and Grok?

Filesystem development

Zope 2 has a user interface in which software development can take place. Limited, untrusted Python code is stored in the object database. This has two major drawbacks:

  • it's Python, but Python with a few twists: untrusted so normal imports and operations are not always possible, and implicit acquisition pulling things in everywhere. This sucks; we want to use Python the way it is designed to be used.

  • you can't use normal file system tools to manage your code. Instead you have to use the web browser UI. That still isn't ideal even with modern AJAX goodness, and certainly wasn't ideal in 1999. It means reimplementing all the tools that already exist (editors, version control systems, searching, etc) on top of Zope. This was an attractive effort that many in the Zope community undertook, and most of that work is now lost.
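Implicit acquisition is easiest to appreciate with a toy model. The sketch below is a deliberately simplified imitation and not the real Acquisition machinery of Zope 2: the Acquirer class is hypothetical, and all it shows is attribute lookups that fail on an object silently falling through to its container.

```python
class Acquirer:
    """Toy object that 'acquires' missing attributes from its container."""

    def __init__(self, container=None, **attrs):
        self.container = container
        self.__dict__.update(attrs)

    def __getattr__(self, name):
        # Only called when normal lookup fails: ask the container instead.
        container = self.__dict__.get('container')
        if container is None:
            raise AttributeError(name)
        return getattr(container, name)


# A small containment chain: site contains folder, folder contains page.
site = Acquirer(title='My Site', contact='webmaster@example.com')
folder = Acquirer(container=site, title='Docs')
page = Acquirer(container=folder)
```

The page object defines neither title nor contact, yet both lookups succeed: title comes from the folder and contact from the site. Which value you get depends on where the object happens to live, which is convenient and also the kind of surprise that makes this "Python with a few twists".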

Zope 2 development through the filesystem was also possible, but it has a completely different development model than developing through the web, and this accounts for part of the infamous Z-shaped learning curve of Zope 2.

Zope 3, and Grok, have filesystem-based development only. You just write normal Python code in normal Python modules which reside in normal Python packages. You use your favorite editor, and your favorite version control system.

Fine-grained components

Zope 2 introduced a powerful component programming model. Zope 2 is extensible through so-called Products, which often were used to create new ways to help developers program in the web UI. As an example, one product I wrote back in 2001 is called Formulator, which helps people construct web forms by putting together fields in the Zope UI.

Unfortunately Zope 2 components are rather coarse-grained. Single Zope 2 components would have to inherit from a lot of mixin classes in order to play within the Zope 2 framework, fattening up their APIs to unmanageable sizes. Components would be so coarse-grained it became hard to reuse them in other contexts, as they would make assumptions that were hard to override. It was hard to just use Formulator's widget classes and not its Form class for instance.

Zope 3 instead has a powerful component model that allows for fine-grained components. These components define interfaces which typically only provide a few methods, and this allows for loosely-coupled development and true reusable components. We distribute these components in many separate packages, which allows for a lot of flexibility - application developers may choose not to use some.
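The sketch below imitates the flavor of small interfaces and loosely coupled adapters using only the standard library. It is not the zope.interface or zope.component API; ISized, Document, DocumentSize, the adapters registry and the adapt function are all hypothetical names for this illustration.

```python
from abc import ABC, abstractmethod


class ISized(ABC):
    """A deliberately tiny interface: one method only."""

    @abstractmethod
    def size(self):
        """Return a size in bytes."""


class Document:
    """Plain content class; it knows nothing about ISized."""

    def __init__(self, text):
        self.text = text


class DocumentSize(ISized):
    """Adapter that provides ISized for a Document."""

    def __init__(self, context):
        self.context = context

    def size(self):
        return len(self.context.text.encode('utf-8'))


# Hypothetical registry: (interface, content class) -> adapter factory.
adapters = {(ISized, Document): DocumentSize}


def adapt(interface, obj):
    """Look up and instantiate the adapter providing `interface` for `obj`."""
    # Walk the class hierarchy so subclasses of Document are covered too.
    for cls in type(obj).__mro__:
        factory = adapters.get((interface, cls))
        if factory is not None:
            return factory(obj)
    raise LookupError('no adapter for %r' % (obj,))
```

Because Document does not inherit from anything framework-specific, it stays small and reusable, and the sizing behavior can be swapped by changing one registry entry. That is the contrast with the mixin-heavy classes described above.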

The Future from the Past

Now let's turn it around and consider what Zope 3 can learn from Zope 2. What can we learn from the successes of our past? What have we lost? What should we bring back?

A user-interface

So the Zope 2 user interface has drawbacks for developers. It locks code into a hard to manage Zope-only format. It was also Zope 2's killer feature.

Zope 2's through the web user interface contributed enormously to Zope 2's success. Zope 2 was being adopted by non-Python programmers who discovered they could be very productive using the Zope 2 UI. These developers often turned into full-fledged Python programmers later on.

A user interface offers discoverability: in Zope 2 you get a drop-down list of components you could add to your object database. See an unfamiliar one? Just try creating one of those, see what UI options it presents, and click the "help" button. The UI encouraged experimentation and learning.

The Zope 2 UI also allows less hardcore developers to easily tweak layouts here and there and do a bit of simple scripting.

The Zope 2 UI was also used as a crude CMS and often extended to build simple CMSes that were still quite powerful for their time. In this sense the Zope 2 UI was quite similar to the admin UI that is one of the killer features of Django.

Zope 3 has lost almost all of this. Zope 3 does have a UI modeled after Zope 2's, but it's rarely used. Zope 3's UI was intentionally crippled compared to Zope 2's to prevent unmaintainable code from being created. Some ideas existed for checking out code from the object database for development on the filesystem, but they never really went anywhere. As a result Zope 3 is only approachable for Python developers, and is harder for beginners to pick up.

Grok has made Zope 3 a lot more approachable already, and I think is competitive with the other modern Python web frameworks, but a user interface could bring many more people in beyond this.

The Zope 2 UI had the drawback that it was all UIs in one, and did none of them very well as a result. I think we should work on bringing most of the features of the Zope 2 UI back to Zope 3:

  • an introspector UI. See what content objects are stored in your database, what APIs they offer, what views exist for them. There is a Zope 3 apidoc tool that does this. In addition, the Grok introspector was developed in last year's summer of code, and more work on it was done in this year's summer of code.

  • a management UI: install, uninstall, configure and monitor applications and the server. Grok has a simple UI that allows you to do this. There is also the ZAM effort.

  • a through the web development UI: we shouldn't be thinking about a full-fledged IDE here. This one should be tackled carefully, step by step. Tweaking page templates is something that could be presented in a UI quite well, for instance. Some work to this effect was done in the five.customerize package.

  • an "editing backend" - it would offer navigation through the object tree, would display the contents of containers and would allow form-based editing of schema-driven contents. This would help provide a "good enough" solution for many applications that need some form of editing backend.

A component with a user interface starts to look like the coarse-grained components of Zope 2. We should therefore be careful: the user interface should be easily replaceable and de-installable. We shouldn't be locked into it, but it should be there when we need it.

The good news is that many of the pieces are already in place for this work, and we should continue these efforts.