An open letter to anonymous Network World writers

Dear anonymous Network World writers claiming to speak for corporate IT executives,

I read your recent open letter to the open source community with interest. You say that before free software can take on a higher profile in big IT departments you need to see the following things from us in the open source community:

  • more enterprise-class support
  • better documentation
  • a sense of stability
  • access to more platforms
  • a commitment to stay open
  • a focus on the end user.

I'm happy to say that the open source community can deliver on all of these. I presume, of course, that you are also willing to pay first for the implementation of this wishlist. Supporting a healthy ecosystem of open source companies, after all, generally works better than slinging open letters at us. Open letters with wishlists in them feel a bit too much like looking gift horses in the mouth.

Thank you,

One member of the open source community

my EuroPython 2005 slides online

I had a terribly good time at EuroPython 2005. The quality of talks was better than ever, with a nice diverse and big selection. I gave two of them. I also was on two panels, and just shouted out comments through other talks :). If you missed it, you missed quite the event.

Next year's conference is at CERN, the place with the giant particle accelerator. We're already planning on using it to create transphasic mesonic resonance rays to make the world Safe for Python. We're also going to look for the quantum singularities and antigravity generators we're just sure they are storing there. Hope to see you there!

Last Friday, I put the slides for my two talks online. Note that they're in magicpoint, so you may have to install that (mgp), or just read the plaintext source.

My first talk was on lxml, a not unfamiliar project to the readers of this blog.

The second talk was called Five in Action, and is about the Zope 3 in Zope 2 system that I helped build.

Note that the download links at the bottom of each page insist on calling the downloaded file download.cpy, so you'll have to rename things to lxml.mgp and five_in_action.tgz respectively.

lxml should now compile with gcc 4.0

Recently I started getting reports that lxml does not compile with gcc 4.0. Investigating this, I quickly identified an issue with Pyrex -- it generates C code that is in fact illegal; older gcc versions accepted it, but gcc 4.0 refuses to.

Now we're going to see open source cooperation in action. The issue with gcc 4.0 was first reported by Mikhail Sobolev, who is bravely running Ubuntu Breezy, which apparently switched over to using gcc 4.0 for its build environment. Just after the release I got a second report by Olivier Grisel (from Nuxeo, a company Infrae does a lot of work with), saying it also didn't work on Fedora Core 4, which apparently also uses gcc 4.0.

As a result of this, I installed gcc 4.0, and went to the Pyrex list to see whether anyone had already reported the same thing. Yes, someone had: John Palmieri of Red Hat had reported the issue (mentioning an earlier report by Jeremy Katz) and actually included a patch to solve it, though against an older version of Pyrex, 0.9.2.1.

It did, however, apply cleanly to 0.9.3 as well. After applying this patch, I discovered I would still get compiler errors, though fewer, and they looked more like the ones reported by Mikhail and Olivier. In some further communication, it turned out that FC4 as well as Ubuntu Breezy already have this patch to Pyrex applied. Meanwhile I also learned that Mac OS X Tiger is shipping with a gcc 4.0 build environment as well, further increasing the number of platforms on which lxml won't work.

Not really expecting I'd get anywhere, I started looking through the Pyrex compiler code to see whether I could fix the remaining issue. Since I knew what kind of bogus C code was being created, I could track down the code that did this fairly easily. I then made a small change, not guided by any actual knowledge of the Pyrex compiler. It does, however, seem to work.

I reported this back to the Pyrex and lxml mailing lists. Mikhail had some trouble applying my patch, and said it was already there. Some confusion arose; had the Ubuntu people produced the exact same patch as I did, line for line? Surprising, but since it was only a few lines, not impossible.

Olivier did get everything working though and produced a cumulative patch for Pyrex 0.9.3. Mikhail's problem was finally resolved when Andrew Bennetts from Canonical (the Ubuntu developers) popped in on the lxml mailing list (wow!) and pointed out my mistake -- I accidentally had produced a patch file that was in reverse, removing my fixes.

After this was clarified, it all worked for Mikhail too. I updated the lxml web pages and installation notes to include this info for those compiling with gcc 4.0. You can find them here:

http://codespeak.net/lxml/installation.html

Thank you everybody!

lxml 0.7 released!

I'm happy to announce I've released lxml 0.7 earlier today! It contains quite a bit of important work, from XInclude and XML Schema support to a better implementation of tostring(). See the changelog for more information.

You can get more information and a download here:

Unfortunately there seem to be some problems compiling with gcc 4.0 due to some issues with Pyrex. There is a patch out there which aims to fix it, but it's either incomplete or lxml is doing something weird, as it doesn't appear to work with lxml yet. The lxml community is helping me track this one down and test it. Thanks community; your feedback is very useful to have.

In other lxml related news, I'm going to give a presentation at EuroPython about lxml. So if you're curious and you're showing up at EuroPython, please consider coming to my talk.

lxml upcoming new features

lxml has undergone quite a bit of development since lxml 0.6. While 0.7 is not yet released, this release should be coming soon, and to whet your appetites here's a partial list of new features:

  • XMLSchema validator support
  • XInclude support
  • more control over namespace prefixes when generating XML

I'll talk about the least spectacular sounding feature that in fact cost me the most time to implement: control over namespace prefixes.

I found myself generating XML quite a lot in a recent project, and experimented with a bunch of different APIs in lxml to control which prefixes are created for namespaces (and what the default namespace should be).

As you may or may not know, the ElementTree API doesn't have any official support for controlling what prefixes are output. This can result in entirely correct but ugly XML with namespace prefixes like ns0, ns17, etc.
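To make the ns0 effect concrete, here is a minimal sketch. It assumes the ElementTree implementation that ships in the current Python standard library (xml.etree.ElementTree), which is not necessarily the version discussed in this post, and it reuses the example namespace URI from further down.

```python
import xml.etree.ElementTree as ET

# Build a namespaced element using the {uri}tag notation.
elem = ET.Element("{http://ns.infrae.com/foo}bar")

# No prefix is known for this URI, so the serializer makes one up.
xml = ET.tostring(elem, encoding="unicode")
print(xml)
```

The printed element carries a generated prefix (ns0) together with a matching xmlns:ns0 declaration: correct XML, but not something anyone would write by hand.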

Even though prefixes are not part of the XML infoset, some control over what they look like in XML is frequently desirable, as the intent of XML is to be at least somewhat human readable. It's easy to start leaning too much into the other direction, though: one should be careful not to offer too much control to the user either.

The W3C DOM, as usual, offers way too much API for namespace handling, which results in all kinds of scary interactions I don't want to worry about. I did add an attribute to read prefix information, but unlike the DOM, will not make this writeable, as this quickly gets pretty insane, so that route towards namespace control is out.

After quite a bit of thinking, I ended up supporting a second special argument to the Element and SubElement constructors. The first special argument, part of ElementTree, is 'attrib', which is a dictionary to control attributes. I added a new argument called 'nsmap', which is a dictionary to control namespaces. The keys are the namespace prefixes, the values the namespace URIs. A key of None (the Python object, not a string) sets the default namespace. If a namespace is already known higher up in the tree, that will be reused instead.

Here's an example:

>>> from lxml import etree
>>> e = etree.Element('{http://ns.infrae.com/foo}bar',
...                   nsmap={'foo': 'http://ns.infrae.com/foo'})
>>> e.prefix
'foo'
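The default-namespace case and the reuse of a namespace declared higher up in the tree can be sketched in the same way. This assumes a current lxml release is installed; the element names bar and baz are only illustrative.

```python
from lxml import etree

NS = "http://ns.infrae.com/foo"

# A key of None in nsmap declares NS as the default namespace.
root = etree.Element("{%s}bar" % NS, nsmap={None: NS})

# The child lives in the same namespace; no nsmap is passed, so the
# declaration made on the parent is reused instead of a new one.
child = etree.SubElement(root, "{%s}baz" % NS)

xml = etree.tostring(root)
print(xml)
```

Both elements report a prefix of None, and the serialized document contains a single xmlns declaration on the root element.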

lxml 0.6 released

I've released lxml 0.6. lxml is an alternative, more Pythonic binding for the libxml2 and libxslt XML processing libraries.

lxml 0.6 contains important bugfixes, in particular better namespace support while handling attributes, as well as a fix for what turned out to be totally broken behavior for etree.tostring(). An upgrade is recommended.

A great new development is that the tests that helped expose these bugs were contributed by people working with lxml on various projects. This really helped me track down important problems. Thank you for the contributions!

You can download lxml 0.6 as well as find out more information at the site:

http://codespeak.net/lxml

Silva Flexible XML released

I'm happy to have released today a new Silva extension, Silva Flexible XML. Silva Flexible XML combines a lot of the interests and themes of my technical life in a single package. It's an extension to the Silva CMS that allows the user to create and manage XML content. The important part is XML standards support: the XML content can be configured to be checked by a Relax NG schema or transformed to XHTML using a particular XSLT stylesheet.

Silva Flexible XML builds on a lot of other things. It integrates with Silva, the CMS I helped build. It is extensible using Five, the Zope 3 integration layer for Zope 2 that I helped create. Through lxml, a Python binding I created (I hope soon I'll be able to say "helped create" too for it) it makes use of the powerful XSLT and Relax NG facilities of libxml2 and libxslt.

Silva Flexible XML is not finished yet; it needs better XML error reporting for instance, which will need some work in the lxml layer. Infrae is putting Silva Flexible XML out there now in the hope that someone likes the concepts and wants to help us (by work or funding) to develop this further.

The amount of XML technology exposed by it, and the way you can easily integrate your own, is in my opinion quite impressive. We haven't seen anything like it in other Zope-based CMSes. What's interesting about this is that it does this not through a thick layer of Silva-specific code, but mostly by leveraging Five and lxml, general libraries which have nothing to do with Silva, and in the case of lxml, nothing to do even with Zope. That is, we're not building much that is very Silva or even Zope 2 specific.

The release of Silva Flexible XML is therefore also a message about something else I helped build: Infrae. If you need one or more of the following:

  • content management/document management expertise
  • XML expertise
  • Python expertise
  • Zope expertise (Zope 2, Zope 3, and Zope 2 + Zope 3)

...then Infrae wouldn't be a bad place to start looking. We'll throw in client-side development ("Ajax") skills too; we had more than a little to do with Kupu. We're innovators, head in the sky but our feet firmly on the ground: we know to leverage existing standards and software where possible, but also know how to build new stuff where needed. I'm quite proud of Infrae, both the team and the software we created. We'd love to build more so don't hesitate to contact us!

iCalendar 0.10 released!

The last couple of months I've been heavily involved in a cool Zope + Five based project involving calendaring, about which more will be announced shortly. As part of this project I've had the opportunity to improve Five. The project also needed support for the iCalendar RFC, and this led me to become the janitor of the awesome Python iCalendar library.

The iCalendar specification defines a format for importing and exporting information about calendars. Basically, you can specify in it that a particular event with a particular title will be happening on a particular date. There is however much more to it; certain events recur every week, for instance, and there is a way to describe that. Events may be private or public, and there's a way to describe that. And so on.
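As a rough illustration of what such data looks like, the sketch below writes out one weekly, private event as raw iCalendar text and reads the properties back with plain string handling. It deliberately does not use the iCalendar package; the PRODID, UID, dates and the helper name properties are made up for the example, and the tiny parser ignores property parameters and line folding.

```python
# One calendar holding one event; content lines are separated by CRLF.
ICS = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//example//hypothetical//EN",   # made-up product id
    "BEGIN:VEVENT",
    "UID:hypothetical-1@example.com",        # made-up identifier
    "DTSTAMP:20050701T090000Z",
    "DTSTART:20050704T100000Z",              # when the event starts
    "SUMMARY:Weekly team meeting",           # the event's title
    "RRULE:FREQ=WEEKLY",                     # recurs every week
    "CLASS:PRIVATE",                         # private rather than public
    "END:VEVENT",
    "END:VCALENDAR",
]) + "\r\n"


def properties(text):
    """Split content lines into (name, value) pairs.

    Hypothetical helper: no parameter or line-folding support.
    """
    return [tuple(line.split(":", 1)) for line in text.splitlines() if line]


event = dict(properties(ICS))
print(event["SUMMARY"], event["DTSTART"], event["RRULE"], event["CLASS"])
```

A real library earns its keep on everything this sketch skips: folded lines, parameters such as time zones, typed values and the many recurrence options.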

When I was faced with the task to enable iCalendar support for the project I was working on, I did a websearch first, to see whether there was any package that did this already out there. I already knew the Schoolbell project has an implementation, but wanted to make sure I wasn't missing something.

Soon enough I ran into the iCalendar package created by Max M. It seemed general and capable, more so than the Schoolbell code, and I figured I'd give it a shot. I did have a few smaller niggles though: I did not like the package and module structure very much -- the files were CamelCase and all directly in the distribution directory, and I've become used to a more structured approach with lowercase package names and a setup.py and the like. The package contained many doctests, but didn't contain any testrunner to run them all at once, so I wanted to fix that too.

I contacted the author, Max M, and I got permission from him to move the stuff over onto codespeak.net subversion and restructure the package structure. It turned out he had some requests from other parties as well, including Linux distributors.

So I became what I can best describe as this project's janitor. I reorganized the package structure, hooked in a test runner, and cleaned up various texts some. I do not think I actually touched the code at all so far, though there are a few places where I have some ideas for improvements now.

The more I used the iCalendar package in the project afterwards, the happier I became about it. It is very powerful and the iCalendar import/export facility came together very quickly. My goal for the iCalendar package became broader: I must let others know about this! It needed a project website, mailing lists, the works! We must form a community of users so we can work on this together. Being a project janitor is a new and interesting experience to me, and we'll see whether my efforts will help make the library more well known. If we can establish a community to develop it beyond Max M, myself and Lennart Regebro (who created a simple setup.py), I also have a chance of not being its janitor forever. :) One of my hopes is that the schoolbell people will start using this instead of their home-grown import/export system.

Max M graciously gave his permission to start working on a web presence on codespeak.net. The result of this is that for the release today, I've also (thanks to the great guys at codespeak.net) set up a little website that I linked to earlier. We also have a mailing list and a checkins mailing list now. If you're interested in this project, I hope you'll subscribe.

Five 1.0 released!

Yesterday I released the one dot oh version of Five, the Zope 2 product that allows you to use Zope 3 technology in Zope 2, today.

This release has been the effort of a bunch of very smart developers from all over. I think it's great that in about a year's time, it turned from a few experiments of mine into a truly open source, multi-person project with huge contributions from many others. Check out the CREDITS.txt in the Five distribution to see a list of the people who contributed, and sorry if I forgot anyone!

What's more, Five 1.0 is also the version that is going to be folded into Zope 2.8. Zope 2.8 will be shipping with Five technology out of the box! It is my hope that Five will become an important driver for further Zope 2 evolution, throughout Zope 2.9 and onwards.

Meanwhile, Five development post-1.0 is going full-speed ahead. Philipp von Weitershausen has been refactoring the Five package structure on a branch, to make it more maintainable, and he, Lennart Regebro and Florent Guillaume have been working on a version of Five that enables one to use the Zope 3 i18n infrastructure inside Zope 2. This is something I'd like to use for Silva eventually.

More about the Five 1.0 release here:

http://www.infrae.com/newsitems/five_1_0_released

and check out the Five website of course:

http://codespeak.net/z3/five

the Clarity Template Language

The ClearSilver templating language does not have a very pleasant syntax for people familiar with the TAL notation of Zope Page Templates. That's not to say ClearSilver's syntax is awful; it's deliberately simple, and I'm sure one could get used to it pretty quickly. Still, I started wondering what ClearSilver syntax would look like if it were more like TAL. Let's call such a theoretical TAL-like ClearSilver "Clarity". Perhaps this is a bit confusing, as it's the same name as the ClearSilver integration package I talked about before, but it's a nice name. :)

The Clarity templating language could be seen as a simple frontend to ClearSilver, meaning there is a one-to-one mapping of any Clarity code to ClearSilver code.

I realized that it should be possible to implement such a frontend fairly easily by involving yet another templating language: XSLT. I could use an XSLT template to translate Clarity templates to the equivalent ClearSilver templates.

So, I started an experiment. So far it's working out fairly well, though only a very limited set of ClearSilver commands can be produced right now (just var and set). Clarity right now defines the following statements: cs:define, cs:content and cs:replace, which more or less work like the equivalents in TAL, though the define syntax and semantics are different (that of set in ClearSilver). I even added support for "structure", in that I use the ClearSilver html_quote() by default everywhere, unless "structure" is used.

Pretty neat stuff, though somewhat perverse. I could even integrate this into the Clarity Zope 3 package using lxml, so that any .cla templates will be automatically preprocessed to ClearSilver templates whenever needed...

And for the people who know him, no, this is not a secret project to drive Paul Everitt completely nuts. :)

The code can be found here:

http://codespeak.net/svn/z3/clarity/trunk/xslt/

In particular, here's an example of Clarity code:

http://codespeak.net/svn/z3/clarity/trunk/xslt/example.cla