Obviel 1.0!

I'm proud to announce the release of Obviel 1.0:

http://www.obviel.org/en/1.0/

Obviel is a client-side web framework that supports powerful UI composition based on an easy to learn core. On top of that Obviel adds a lot of features, such as templating, i18n support, form generation and validation, and routing. Obviel stays close to HTML but lets you build sophisticated components when you need to.

Obviel has come a long way since its beginnings. On top of the core we've added a template language, an internationalization system that integrates with JavaScript and templates alike and a routing library. So it's high time for the One Dot Oh release!

Standout Features of Obviel

  • a powerful core that enables model-driven UI composition. If you know jQuery it will be easy to learn this core.

  • Integrated internationalization (i18n) based on gettext. In a world with multiple languages we need UIs that can be easily translated to other languages. I haven't met a client-side web framework yet that can beat Obviel in this area.

    Obviel offers an i18n approach that lets you mark strings in your JavaScript code in the gettext style:

    _("My translatable string")

    and also lets you mark translatable strings in templates:

    <p data-trans="">My translatable string</p>

    Obviel then offers tools to automatically extract these strings into a .pot file that you can offer to translators in various ways; gettext has very extensive tool support.

    See Obviel Template i18n and Internationalizing your Obviel Project for more information.

  • Extensive documentation. Documentation has been and is a priority and Obviel has been documented to bits.

  • Testing is as much a priority as the docs are: Obviel is extensively unit tested using the Buster.JS JavaScript testing framework.

  • Automatic form construction using client-side validation: Obviel Forms

    Obviel Forms: http://www.obviel.org/en/1.0/form.html

  • Routing: path (/foo/bar) to object and object to path with Traject.

    Traject lets you build a nested navigation space on the client-side. Not only can you route a path to an object, but you can also generate a path for an object, something that results in cleaner and more decoupled code.
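To make the two directions of routing concrete, here is a minimal sketch of a toy two-way router. It is not the Traject API: the names register, resolve and locate, the $name placeholder syntax and the example pattern are all assumptions made up for illustration.

```javascript
// Toy two-way router. This is NOT the Traject API; all names are hypothetical.
var routes = [];

// Register a pattern such as 'departments/$departmentId' for a kind of object.
function register(pattern, kind) {
    routes.push({ parts: pattern.split('/').filter(Boolean), kind: kind });
}

// Path -> object: match the path against each pattern, collecting variables.
function resolve(path) {
    var segments = path.split('/').filter(Boolean);
    for (var i = 0; i < routes.length; i++) {
        var route = routes[i];
        if (route.parts.length !== segments.length) {
            continue;
        }
        var obj = { kind: route.kind };
        var matched = route.parts.every(function (part, j) {
            if (part.charAt(0) === '$') {
                obj[part.slice(1)] = segments[j];
                return true;
            }
            return part === segments[j];
        });
        if (matched) {
            return obj;
        }
    }
    return null;
}

// Object -> path: fill the pattern's variables from the object's properties.
function locate(obj) {
    for (var i = 0; i < routes.length; i++) {
        if (routes[i].kind === obj.kind) {
            return '/' + routes[i].parts.map(function (part) {
                return part.charAt(0) === '$' ? obj[part.slice(1)] : part;
            }).join('/');
        }
    }
    return null;
}

register('departments/$departmentId/employees/$employeeId', 'employee');
var employee = resolve('/departments/1/employees/7');
var employeePath = locate(employee);
```

Because the same registered pattern drives both directions, code that needs a link to an object can ask for its path instead of building the URL string by hand.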

What's next for Obviel?

We have been transitioning the code from bitbucket to github and from mercurial to git to make it more accessible to JavaScript hackers who are more familiar with git. We're still busy updating the docs, but the transfer has been completed and the code now lives on github.

We've been doing lots of research on using a JavaScript module system for Obviel so we can better maintain the codebase. I've been overwhelmed by the options, but soon we'll pick one, I promise! We'll also introduce various JavaScript codebase maintenance tools such as the Grunt task runner.

We are still busy working on a configurable transceiver framework for integrating Obviel with a diversity of backends (HTTP, websockets, localstorage): Obviel Sync. More on this soon!

JS Dependency Tools Redux

Introduction

[UPDATE: This post has a new 2015 followup]

Recently I looked into JavaScript dependency management and wrote a long post about it where I was overwhelmed and found some solutions. Since then I've learned a few new things and have thought a bit more, and I thought I'd share.

My goal is still to be able to develop Obviel with smaller modules than I do now. Obviel is a client-side web framework, but besides that you should not need to know anything about it in order to understand this post.

So, I'm looking for a JavaScript dependency framework for the client-side.

AMD versus CommonJS

Last time I mentioned Asynchronous Module Definition (AMD) and how it contrasts with the CommonJS module definition. AMD wraps everything in your module in a function:

define(['jquery'], function($) {
   var myModule = {
      foo: function() { ... }
   };
   return myModule;
});

Whereas CommonJS uses this pattern:

var $ = require('jquery');

module.exports = { foo: function() { ... }};

Though AMD has sugar to make it look much like CommonJS plus the define() function wrapper.

AMD was designed to work in web-browsers, client-side, whereas CommonJS modules are mostly used on the server, typically in a NodeJS environment. AMD modules can be directly loaded into the browser without build step, which is an important advantage during development. AMD modules like CommonJS modules can also refer to individual modules in other packages. You could install such an AMD-based package using Bower, like you'd install a CommonJS-based package using npm.
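To show why the function wrapper makes a build step unnecessary, here is a minimal sketch of a toy, synchronous module registry in the AMD style. It is not RequireJS: defineModule and requireModule are hypothetical names, modules are registered by explicit name, and real AMD loaders fetch files asynchronously, which this sketch leaves out.

```javascript
// Toy, synchronous stand-in for an AMD-style registry (not RequireJS itself).
var registry = {};

// Record a named module: its dependency names and its factory function.
function defineModule(name, deps, factory) {
    registry[name] = { deps: deps, factory: factory, loaded: false };
}

// Resolve a module by running its factory once, after its dependencies.
function requireModule(name) {
    var entry = registry[name];
    if (!entry) {
        throw new Error('Unknown module: ' + name);
    }
    if (!entry.loaded) {
        entry.exports = entry.factory.apply(null, entry.deps.map(requireModule));
        entry.loaded = true;
    }
    return entry.exports;
}

defineModule('greeting', [], function () {
    return { text: 'hello' };
});
defineModule('myModule', ['greeting'], function (greeting) {
    return { foo: function () { return greeting.text + ' from foo'; } };
});
```

Since a factory only runs when something asks for its module, the definitions can arrive in any order, and the module source is plain JavaScript that the browser can execute as-is.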

The most well known AMD implementation is called RequireJS. There is a small implementation of it called Almond that RequireJS can use in JavaScript modules that are packaged together for release.

Recently I also learned about RaptorJS, which is a larger framework that features an extended AMD implementation as well as a separate server-side module loader. It contains some interesting ideas such as "adaptive packaging" which helps adjust codebases to different target environments (client, servers, different browsers, etc).

CommonJS on the client

Of course people have worked on bringing CommonJS modules to the client too. And have they! I ran into these so far:

May a thousand flowers bloom! Overwhelmed, me?

I've done a cursory review of these, so I apologize if I get something wrong, but here goes:

  • browserify is a tool that can take a file with a CommonJS module (and its dependencies) and bundle them all up in a .js file you can use in the browser.

  • OneJS seems to do something very similar. The docs don't make it immediately obvious what its distinctive features are.

  • commonjs-everywhere does the same again (I think... the docs are a bit technical...), but has more features.

    One cool thing is source maps support. Source maps I just found out about: they are basically a way to add debugging symbols to a minified, bundled .js file so the debugger can find out about the original source. This is handy if you bundle plain JS, and also makes it possible to offer better debugger support for languages such as CoffeeScript which compile to JavaScript. Source maps are only supported in Chrome right now, with Firefox support coming up.

    [update: a commenter pointed out that browserify supports source maps too, pass a --debug flag during building to enable this]

  • browser-build is a very recent implementation that does the same as the others, but is focused on performance: producing a browser version of CommonJS modules of your project really fast, so you never have to wait for your tool during development. It has support for source maps.

    But it also does something more: if you write your code in plain JavaScript (as opposed to CoffeeScript, say), it makes your original modules available to the browser in only very slightly edited form (the line numbers are the same). This should help debugging a great deal in browsers that don't support source maps.

    But I'm unsure about the details, as browser-build needs to have CoffeeScript available to compile the sources and run it, and I'm too lazy to try this right now.

  • component is also a package manager (I'll mention it again later in that context), but also contains a build system to generate a single .js file from CommonJS modules.

All of these approaches need a build step during development, which makes debugging harder, though source maps will help. browser-build minimizes the build step during development to the bare minimum, however, and this will help debugging.

Both? uRequire

Then there's uRequire.

uRequire allows you to use CommonJS or AMD or both and converts them to whatever is needed. It talks about a Universal Module Definition (UMD) I haven't studied yet, but apparently it's not necessary to use its boilerplate when using uRequire. From what I understand, a build step is required in order to use code in the browser.

JS frameworks with a module system

There are lots of JavaScript client-side frameworks that have grown their own module systems. I'll only talk about a few here that seem relevant.

  • Dojo had its own module system, but has started to use AMD in recent versions and has been pushing this forward. You can use Dojo modules directly in your own codebase - this kind of inter-package reuse is something I will go into in more detail later.

  • Closure Tools by Google contains a lot of things, such as a powerful JavaScript compiler and various JavaScript libraries. It also features its own module system, which I'll also talk about more later.

JavaScript Packages and Modules Redux

In the previous blog entry I explored some concepts surrounding dependencies and modules. I've had some new insights on how these concepts apply to the JavaScript world. I'll review some of this again, this time with a focus on what these concepts look like in JavaScript.

  • a project is the codebase you're hacking on. In open-source JS land, typically it's something published on github. It's in hackable form, so contains a lot of separate .js files to separate concerns: modules.

  • a module is a JavaScript file that provides some functionality by exposing an API.

  • a module may depend on another module. A module dependency is expressed in the module itself, in source code. In JavaScript there are multiple systems for expressing modules and their dependencies, such as CommonJS, AMD and Google Closure.

  • a package is a collection of one or more modules that is published somewhere so others may use it (this may be published on the internet, or internal to a project). It has metadata that describes the package, its version number, who wrote it, and what other published packages it depends on.

    CommonJS packages on the server-side are distributed as essentially an archive of a CommonJS project: a lib directory full of modules, with package.json for metadata.

    Traditionally client-side JavaScript packages are just distributed as URLs pointing to a .js file that people can download. So, to get jQuery, you download a version at the jQuery website. This is a very large difference between browser and server.

    Bower packages are a formalization of this traditional pattern: there is a single .js file with bower.json metadata to describe it. Bower adds metadata (bower.json) and a package index to the original story.

    In fact the Bower story is more complicated: it does allow you to package up a directory of multiple modules too, which you could then use using, say, RequireJS. This is an entirely different way to use modules, but Bower is agnostic and just installs the stuff. Bower also supports listing more than one .js file in its bower.json configuration file; it's unclear to me what the semantics of this is exactly.

  • Package generation. This is something I skipped in the previous discussion of concepts, but is very important in the JavaScript world especially.

    CommonJS packages are just archived versions of a particular project layout: a directory with a package.json, with a lib subdirectory which contains the individual .js modules.

    Browser-targeted packages are most commonly shipped as a single .js file as mentioned before. In the most simple case you maintain this .js file by hand directly and give it to others to use.

    But you can also generate a single .js package file by compiling a bunch of .js module files together. This is what the CommonJS generators described above do, except for browser-build, which actually maintains a tree of unbundled .js modules.

    The realization I had, perhaps obvious, is that a client-side JavaScript package is often shipped as a single compiled .js file. It's like how a C linker works - it bundles individual units into a larger library file (.so, .dll).

  • A package manager is a tool that installs a package into your system so you can start using it. npm is popular for NodeJS, Bower is focused on client-side packages and tries to be very package format agnostic. component contains a package manager too, centered around CommonJS (and also the build tool I mentioned earlier).

  • A package registry is a system where packages can be registered so that others may find and download them. npm has an index, and so do Bower and component.

MantriJS

Another dependency system I ran into since my last post is MantriJS. MantriJS is built around the Google Closure Tools but hides them from the developer, except for the dependency system.

You define a module that depends on others like this in Mantri/Closure Tools:

goog.provide('obviel.template');
goog.require('obviel.util');

obviel.template.foo = function() { ... };

Here you say you are defining a module called obviel.template and that in this module obviel.util needs to be available. Once you require something you have that available to use in your module, so you can now do this:

obviel.util.blah();
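Conceptually, providing a dotted name means making sure a nested namespace object exists for it. Here is a minimal sketch of that idea; provide is a hypothetical helper, not the real goog.provide, and it takes an explicit root object instead of using the browser's global object.

```javascript
// Hypothetical helper illustrating the namespace idea behind goog.provide;
// this is not the Closure Library implementation.
function provide(root, dottedName) {
    var current = root;
    dottedName.split('.').forEach(function (part) {
        // Create each level only if it does not exist yet.
        if (!current[part]) {
            current[part] = {};
        }
        current = current[part];
    });
    return current;
}

var root = {};

provide(root, 'obviel.util');
root.obviel.util.blah = function () { return 'blah'; };

provide(root, 'obviel.template');
root.obviel.template.foo = function () { return root.obviel.util.blah(); };
```

Providing obviel.template after obviel.util reuses the existing obviel object, so both modules end up as properties of the same namespace.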

Mantri has a build step for development, but only to build a deps.js file and only when you've changed a dependency. The modules themselves are directly exposed to the browser during development, meaning you can debug them. In this it looks quite similar to browser-build, though browser-build does touch the individual modules in a minor way, something MantriJS does not.

MantriJS does offer a build step to generate a single file .js package from your modules, using the Closure Tools.

I tried to see whether MantriJS was easy to integrate with the Buster.JS test runner; I had to wrestle quite a bit to get RequireJS running properly before. It turned out to be very easy (it Just Worked ™!). See the jspakmantri example project and compare it with the original RequireJS-based jspak project if you like.

Thinking about MantriJS I realized something: MantriJS actually allows you to have modules the way many existing client libraries do it: create a namespace object and fill it with properties you want to expose. This is important to me because that's how Obviel does it now, and I'd prefer not to break that client API.

Global Namespace Modules

So what is this client library module definition pattern MantriJS supports? Everybody is familiar with it. It's what jQuery does for instance: it fills the jQuery object ($) with the jQuery API and exposes this.

For example, to make a module, you create an empty object, perhaps listed in another object to namespace it, and make it globally available:

var obviel = {};
obviel.template = {};

You then fill it with the API you want somehow, for instance like this:

obviel.template.foo = function() { ... };

or like this:

(function(module) {
   module.foo = function() { ... };
}(obviel.template));

To use a module from another one, you simply refer to it:

obviel.template.foo();

That's all that's needed, but there are also frameworks that help you declare and use modules like this, such as MantriJS mentioned earlier; YUI has another one. The primary benefit these add is the ability to express module dependencies better, avoiding the need to mess around with <script> tags.

So this pattern is neither CommonJS nor AMD. But it is very widely used on the client-side. Obviel uses it for instance, and Backbone too, and Ember, and Knockout, and Mustache, and YUI, and Google Closure Tools. To just list a few. Let's call it the Global Namespace Modules pattern (GNM).

GNM is not a module definition pattern like CommonJS or AMD. Instead it is defined by how modules are used: you refer to the API of a module using a global namespace that the module exposes (jQuery, obviel, Backbone, Mustache, etc).

GNM assumes that modules are loaded in a particular order, synchronously. You ensure this order by listing <script> tags in a particular order, or by using a smart module loader like MantriJS, or by bundling modules in order into a single .js package file.
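The order dependence is easy to demonstrate. In this minimal sketch each function stands in for one <script> file, and a plain object stands in for the browser's global scope; utilScript, templateScript and loadInOrder are hypothetical names used only for illustration.

```javascript
// Each function plays the role of one <script> file; g is the "global" scope.
function utilScript(g) {
    g.obviel = g.obviel || {};
    g.obviel.util = {
        shout: function (s) { return s.toUpperCase(); }
    };
}

function templateScript(g) {
    // Reads another module through the global namespace at load time.
    // This throws a TypeError if the util script has not run yet.
    var util = g.obviel.util;
    g.obviel.template = {
        title: function (s) { return util.shout(s); }
    };
}

// Run the "scripts" one after another, like <script> tags in a page.
function loadInOrder(scripts) {
    var g = {};
    scripts.forEach(function (script) { script(g); });
    return g;
}

var page = loadInOrder([utilScript, templateScript]);
var title = page.obviel.template.title('hello');
```

Calling loadInOrder([templateScript, utilScript]) fails instead, because obviel does not exist yet when the template script runs. That failure is what a loader or an ordered bundle protects you from.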

Getting this more clear for myself is quite important to me. It had been bugging me for a while after reviewing RequireJS: if I start using it for Obviel, do I need to break the Obviel API, which assumes GNM? Or do I tell all developers to start using AMD for their code that uses Obviel too?

[update: here is a post with more on this pattern; here's another]

Requirements

After thinking about all this, here are some varying requirements for a JavaScript module dependency system. Ideally Obviel can adopt one that has all of these properties, or as close as possible:

  • automated loader: no <script> tag management. (loader)

  • encourage fine-grained modules. (fine)

  • being able to use browser debuggers like Firebug or the Chrome Dev tools. (debug)

  • source maps not required: being able to use these debuggers without relying on new source map technology. (nosm)

  • no build step needed during development. (nobuild)

  • support for exposing modules using the GNM pattern. Is this really important? Yes, as it's a very popular pattern on the web. Dojo went the way of telling people to use AMD for their own code, and that does help with fine-grained reuse between packages... (gnm)

  • compilation tools: bundling, minification to deliver easy to use .js files. This way the browser can load a package efficiently and it becomes easy for people to start using the API the package exposes: just drop in a file. (comp)

  • inter-package reuse: being able to require just one module from another package without having to load all of them. (reuse)

    There is some tension here with the bundling into a single .js package approach - if there's a module in a package that I don't use, why does it still get shipped to a web browser? On the server installing a bit more JS code in a package is not a problem, but on browsers people tend to start counting bytes.

    This tension can be reduced in various ways: jQuery now offers various smaller builds with less code. Build tools can cleverly only include modules that are really required, though for inter-package reuse this can defeat the benefit of caching.

  • integration with BusterJS test runner. As this is the test runner I use for Obviel. Preferably with the least hassle. (bjs)

  • CommonJS everywhere: client definition of modules same as on server, so CommonJS packages can be used on the client too. There is after all potentially a lot of useful code available as a CommonJS package that can be used on the client too, and potentially some of my Obviel code can be run on the server too. (cjs)

Review

Let's review some of the systems mentioned in the light of these requirements. If I get it wrong, please let me know!

system          loader  fine  debug  nosm  nobuild  gnm  comp  reuse  bjs  cjs
manual          N       N     Y      Y     Y        Y    Y     Y      Y    N
RequireJS       Y       Y     Y      Y     Y        N    Y     Y      Y    N
browserify      Y       Y     Y      N     N        N    Y     Y?     Y?   Y
cjs-everywhere  Y       Y     Y      N     N        N    Y     Y?     Y?   Y
browser-build   Y       Y     Y      Y     N        N    Y     Y?     Y?   Y
uRequire        Y       Y     ?      ?     N        N    Y     Y?     ?    Y
MantriJS        Y       Y     Y      Y     N        Y    Y     N      Y    N

[update: source maps are also a browserify feature]

A few notes from the perspective of Obviel:

Nothing ticks all the boxes for Obviel from what I can see. RequireJS, MantriJS and browser-build come closest.

The manual system involves maintaining <script> tags yourself. That is what I'm doing with Obviel now. It involves no build step, so debugging is easy during development. It supports the popular global namespace modules pattern. If a framework exposes multiple modules that users are to include using <script> tags, like Obviel currently does, then inter-package reuse is possible. Compilation into a single .js file is not needed but there are tools that can do it for you. But it's not fine-grained at all, breaking a fundamental requirement for Obviel.

RequireJS is quite nice; script tag management goes away, no build step is needed but compilation to a .js file is still possible. It allows fine-grained reuse of modules in other RequireJS based packages, which is very nice. After some effort it integrates with BusterJS. But it doesn't offer Global Namespace Modules support out of the box. It shouldn't be too hard to make it do that, though, by simply exposing some modules myself, possibly during a build step.

The various CommonJS approaches are interesting. It is attractive to be able to use the same approach in the browser as on the server. But most tools require a bundling build step and I'd like to avoid having to rely on still uncommon source maps to do debugging. That's why browser-build is one of the more interesting ones, as it minimizes the build step required and makes debugging easier.

It's still a bit unclear to me whether fine-grained module reuse of other npm-installed packages is possible - do these modules get exposed to the browser too (in a bundle, or directly for browser-build)? From what I've read here and there I think so. I also haven't explored how easy it is to integrate these with client-side Buster (server-side Buster integration is supported by Buster), but I get the impression it's possible.

The CommonJS approaches don't offer Global Namespace Modules support so I'd have to hack that up as for RequireJS.

MantriJS was quite a revelation to me as it helped me clarify my thinking about the Global Namespace Modules pattern. I've contacted the author and he's very responsive to my questions, and also nice. It turned out to be dead-easy to integrate with Buster.JS. MantriJS assumes that external JS packages are bundled up in a single .js file for reuse however, so fine-grained module reuse of other packages is not possible.

Still overwhelmed

I'm still overwhelmed by the choices available as well as all the details. But I know a bit more about what's possible and what I want now. Are there any players in this field that I missed? Undoubtedly more will come out of the woodwork soon. What do you think about my requirements? Should I just give up on GNM, or forget about not having a build step during development? Am I missing an important requirement? Please let me know!

Obviel 1.0rc1 released!

I've just released Obviel 1.0rc1. This, or something very close to it, will be turned into 1.0 final.

What's new in Obviel 1.0rc1?

  • upgraded and tested with newer version of jQuery (1.9.x)

  • new Obviel Forms widget, passwordField

  • a few fixes and improvements in Obviel Template

  • a few fixes surrounding event handling

See the changelog for details (and ignore the (unreleased) bit -- I have too many things to update before a release and should automate it!). We've been getting more people contributing to Obviel lately, thank you and please do keep it up!

We're planning a few big changes for Obviel after 1.0:

  • transition code from bitbucket to github, as that seems to be the hub for JS related development and any exposure is good exposure.

  • finally finish Obviel Sync - a very configurable framework for syncing models with a backend (such as a HTTP server). We've made a huge leap forward with this recently, and there is also a lot of potential in this, so this will happen.

  • use more JavaScript tools to manage the codebase besides Buster.js which we already use for test running. The main aim right now is to split Obviel's code into smaller modules, and still generate a simple usable .js package so it's easy to include in a codebase. See this blog entry for some of my packaging explorations.

Overwhelmed by JavaScript Dependencies

Introduction

[UPDATE: This post has a new 2015 followup]

This is about managing dependencies in a well-tested client-side JavaScript codebase, how I got overwhelmed, and how I automated the pieces to make it Just Work ™.

The JavaScript world has grown a lot of tools for dependency management. I've dipped my toes into it in the past, but didn't really know much about it, especially on the client-side. Now I've done so, and I'm somewhat overwhelmed. I also found some solutions.

If you don't feel overwhelmed by JavaScript dependency management yet, this document can help you to become overwhelmed. And then perhaps it can help a little to become less so.

[update: I've created a followup to this post further analyzing the various options available]

A Client-side Codebase

The JavaScript client-side codebase I work on is fairly large; it's a JavaScript web framework called Obviel. Obviel is largely immaterial to this document however, so don't worry about it. I'm going to talk about tools like Grunt and RequireJS and Bower and Buster instead.

So far I've managed dependencies for Obviel manually by adding the right <script> tags on web pages. If something needs foo.js and bar.js I need to manually make sure it's being included. Even if bar.js is only needed indirectly by foo.js and not by the code I'm writing myself. Managing <script> tags by hand is doable but annoying.
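What I do by hand here is really a small graph problem: given which file needs which, work out an order in which every file comes after the files it depends on. Here is a minimal sketch of that, assuming a hypothetical dependency table; app.js and the loadOrder function are made up for illustration.

```javascript
// Hypothetical dependency table: app.js needs foo.js, which needs bar.js.
var deps = {
    'app.js': ['foo.js'],
    'foo.js': ['bar.js'],
    'bar.js': []
};

// Depth-first walk that lists each file after everything it depends on.
function loadOrder(entry, deps) {
    var order = [];
    var done = {};
    var visiting = {};

    function visit(name) {
        if (done[name]) {
            return;
        }
        if (visiting[name]) {
            throw new Error('Circular dependency at ' + name);
        }
        visiting[name] = true;
        (deps[name] || []).forEach(visit);
        visiting[name] = false;
        done[name] = true;
        order.push(name);
    }

    visit(entry);
    return order;
}

var scriptOrder = loadOrder('app.js', deps);
```

The resulting list is the order in which the <script> tags would have to appear. Note that bar.js shows up even though app.js never mentions it, which is exactly the indirect dependency that is easy to forget when doing this by hand.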

Obviel is extensively unit tested. For the tests the dependency situation is more involved. Obviel uses this nice JavaScript test runner called Buster.JS. It runs on the server and is based on Node.js. It features a command-line test runner that can be hooked up to actual web browsers so you can run a whole set of different tests automatically without having to go clicking about in web browsers.

Buster needs to be configured properly so that it knows what dependencies there are to run tests in the browser. This is done in a file called buster.js that you put in your project. You still need to explicitly manage dependencies by hand, though there are a few shortcuts such as including all .js files in a whole directory as dependencies at once.
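For readers who haven't seen one, a buster.js file looks roughly like the sketch below. The group name and all file paths are hypothetical and not taken from Obviel's actual configuration; the property names are the ones I remember from the Buster.JS configuration docs, so do check them against the version you use.

```javascript
// Sketch of a buster.js configuration file; paths and group name are made up.
var config = module.exports;

config['Browser tests'] = {
    rootPath: '../',
    environment: 'browser',
    // Third-party libraries, loaded first, in the order listed.
    libs: ['lib/jquery.js'],
    // The code under test; a glob pulls in a whole directory at once.
    sources: ['src/*.js'],
    // The test files themselves.
    tests: ['test/*-test.js']
};
```

The glob patterns are the shortcut mentioned above, but the order of libs and sources is still something you maintain by hand.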

The Goal

I've always been dissatisfied with all this explicit dependency management in Obviel's development. I want to use more fine-grained modules in Obviel. Even though it is decently modular already, I want to break up Obviel into smaller modules still so I can more easily manage the code during development. A hash table implementation that can have object keys, for instance, should be in its own module, and not in, say, obviel-session.js. I want smaller modules that do one thing well. Deployment of client-side JavaScript code favors large .js files, but during development I want smaller ones.

Moreover, I want everything to Just Work ™. If I use a new internal module in my project, or if I start depending on a new external library in my project, I should be able to use them right away without the hassle of config file editing. I should be able to hack on my code and write unit tests and depend on stuff in them without worrying.

My Python Background

Over the years, I've worked a lot with Python dependency management; PyPI, distutils, setuptools, pip etc. And build tools like buildout. And of course, the Python import statement. I learned a lot about JavaScript tools, and the Python tools I'm already familiar with helped me put them in the right context.

For JavaScript dependency management so far I've piggy-backed on Python's systems through Fanstatic, a Python HTTP middleware that I helped create. It can automatically insert needed dependencies (.js, .css, etc) into a web page on the server-side. Using Fanstatic-wrapped Obviel (the js.obviel Python library) works pretty well in a Python project, but it doesn't work for developing Obviel and it won't work for people who don't use Fanstatic or even Python.

Dependency management

What do I mean by dependency management anyway? Let's write down a few concepts:

  • a project is the codebase I'm hacking on. It could be an application or a framework or a library. It's in hackable form, checked out from a version control repository such as git. You can check out jQuery or the Linux Kernel or Obviel as a project. A project can typically be used to generate one or more packages.

  • a module is a source code unit (normally a file) that provides some functionality. It contains things like functions and classes. Usually it exposes an API. A project typically contains multiple modules. So foo.js is a module, and so is foo.py, foo.c, etc.

  • a module may depend on another module. A module dependency is expressed in the module itself, in source code.

    In Python and many languages this is done by importing the module by name somehow:

    import foo

    The C module system is bolted on through textual inclusion during the compilation phase:

    #include <stdio.h>

    In JavaScript there is no native way to express module dependencies, but people have created frameworks for it such as Node's module loading system and RequireJS, which I'll go into later.

  • a package is a collection of one or more modules that is published somewhere so others may use it (this may be published on the internet, or internal to a project). It has metadata that describes the package, its version number, who wrote it, and what other published packages it depends on. In Python this information is in a file called setup.py, in JavaScript it's... well, it depends.

    I'll note that Linux distributions also feature packages that can be installed. I'll call these deployment packages. Deployment packages for various reasons are not very convenient to develop against. This is why many languages such as Python or JavaScript or Ruby or Perl have language-specific package systems. I'm focusing on such development-oriented packaging systems here.

  • a module in package A may depend on another module in package B. In this case package A describes in its metadata that it depends on package B. This is an external dependency on a module in another package.

  • A package manager is a tool that installs a package into your system so you can start using it. It's distinct from a version control system which installs a project into your system so you can start hacking on it, though package managers can be built on top of version control system software.

  • A package registry is a system where packages can be registered so that others may find and download them. CPAN is the package registry for Perl code, for instance. Some of these systems allow manual download of packages through a web interface as well as automated downloads; Python's PyPI is the example I'm most familiar with. JavaScript has several package registries.

When I develop a project I want to be able to express in my metadata somewhere that it depends on some packages, and in my project's modules I want to express what other modules they depend on.

I don't want to have to worry about other config files for this, as that's only more code to maintain and more mistakes to make.

And if I check out a project, I want to start hacking on it as soon as I can. To get the project's external dependencies I want to be able to run a command that does that for me.

Again, it should Just Work ™.

server-side js packaging

I mentioned Buster.JS before. Using Buster.JS and its various plugins for linting (jshint) and code coverage and so on introduced me to the world of npm, the package manager for Node.js.

npm is built around the CommonJS specs for JavaScript package management. There's a file called package.json that describes what dependencies packages have on others. Like what setup.py does for Python packages.

There's also a registry of published packages for npm; this is like PyPI for Python, CPAN for Perl, etc. npm lets you download and install packages from this registry, either by hand or by reading a config file. This is much like easy_install or pip for Python.

So npm provides the equivalent of what pypi + distutils + distribute + pip is for Python. I haven't really studied npm in detail yet, but it seems nice. npm appears to be more coherent than the Python equivalents, which grew over the years layering on top of each other, sometimes glued together in hacky ways. npm looks cleaner.

server-side imports

Unlike Python and many other languages, JavaScript doesn't have a standard way to import modules. So people had to invent some!

In Node.JS, import works like this:

var otherModule = require('otherModule');

You can then use otherModule like a module, so call functions on it for instance:

otherModule.someFunction();

require() doesn't just take names, but paths, such as relative paths, so you can see stuff like this:

require('../otherModule');

to get a module from one directory higher up.

npm installs modules in a directory called node_modules which is often in the project's home directory. There are some lookup rules I don't quite fully grasp yet for getting modules with require() from other places.

client-side imports

But Obviel isn't a server-side JavaScript codebase; it's a client-side one that runs in the browser. So I need dependency management for browser JS.

The big fish here is RequireJS. It is a client-side library that implements a spec called Asynchronous Module Definition, or AMD. AMD modules look like this:

define(['jquery', 'obviel'], function($, obviel) {
   ...
   obviel.view({...});
   ...
   $('.foo').render(blah);
   ...
});

So, a module is wrapped in a function that's passed to define(). AMD also allows this:

define(function(require) {
   var otherModule = require('otherModule');
});

which starts to look a lot like CommonJS, except for this wrapper function around it all.

AMD was actually born originally as part of the CommonJS project, but there was some difference of opinion and the AMD folks went off on their own.

My understanding is that the main point of contention is that the AMD folks figured an explicit function wrapper was the way to go for client-side code to ensure maximum portability of JavaScript code (no preprocessing of JS code necessary), and because it's good practice anyway on the browser to avoid global variables from leaking out of your code. The CommonJS folks wanted the client-side code to look more like server-side modules. See the Why AMD? document for more on this from the AMD perspective.

How does RequireJS know where to find modules? You need to specify where it looks for modules in a config file. In that config file you can map one path to another, so you can tell it where to look for your project's modules, and where jQuery is and so on.

client-side packaging

That brings us to client-side packaging. One popular package manager for doing this is called Bower. It introduces this config file called bower.json which is like package.json (or setup.py) but then for declaring front-end package metadata and dependencies. Bower also introduces its own package registry and you have a command-line tool to install packages, which end up in a components directory, much like the way npm installs server-side packages into node_modules.

Overwhelmed yet?

So we have two different ways to define modules, and two different ways to do packaging. I am simplifying matters here - CommonJS does offer definitions to transport modules to the client too, and there are ways to manage client-side packages using npm too, and I'm sure other package managers exist too.

You may start to agree with me that this is all somewhat overwhelming, especially if you're new to this! But we're not there yet...

bower and RequireJS

bower is agnostic as to what's in packages and how they're used; RequireJS is just one possibility. So when I install a package into a project using bower, RequireJS has no clue that it is even there, let alone where to import it from. As a result, I cannot import any modules from that package in my project's modules. I need to tell RequireJS first.

So if I install jQuery using bower, I still need to manually tell RequireJS in its config file how to find it: look in components/jquery/jquery.js for it, please. Only after that can I depend on it in my module.

This doesn't Just Work ™. I want to install something with bower and start using it right away. We need something to help glue this together.

Grunt

To construct complex Python projects I use this system called buildout. It can be used to pull in dependencies, install scripts, and automate all sorts of other tasks too. Buildout is driven from a configuration file - it's kind of like Make and its Makefile configuration file.

So JavaScript has some build automation systems too. One popular one is called Grunt. It takes a config file called Gruntfile.js. It can be extended with plugins which you install with npm.

Grunt is pretty useful to automate jobs such as gluing bower and RequireJS together.

grunt-bower-requirejs

Using npm, you can install a grunt plugin called grunt-bower-requirejs. You configure it in Gruntfile.js. Now if you run grunt it will automatically make any dependencies installed using bower available for RequireJS. It does this by manipulating the RequireJS config to tell it where bower-installed packages are.

So now (at least after I run grunt), I can require() whatever bower-installed packages I like from my own JS code. Awesome!

Gluing up Buster

We're not all there yet. Remember the test runner I use, Buster? There is already a buster-amd plugin, which is needed to let Buster behave properly around RequireJS. Making this work did take somewhat tricky configuration featuring a pathMapper and a regex I don't quite understand, but okay.

There is also already a grunt-buster plugin. This can automatically start a PhantomJS based web browser to run the tests, and then run them, if I type grunt test. Pretty neat!

Is this enough to make things Just Work ™? After all I should be able to rely on RequireJS to declare the dependencies for my test modules. But no...

As mentioned before, Buster actually has a special requirement if you run tests against a web browser: it needs to know what JS resources to publish to the web browser it runs the tests in, so that they are even available for RequireJS at all. It is kind of what Fanstatic does, actually!

grunt-bower-busterjs

So now Buster needs to know what client-side packages are installed through Bower too, just like RequireJS.

Unfortunately there wasn't a grunt plugin for this yet that I could find. Balazs Ree, my friend from the Zope community who also is doing lots of stuff with JavaScript, suggested creating something like grunt-bower-requirejs to create bower integration for buster. Good idea!

It turned out grunt-bower-requirejs was extremely close to what was needed already, so I forked it and hacked it up into grunt-bower-busterjs. When plugged into grunt, this generates a bowerbuster.json file. Following Balazs' advice I then tweaked Buster's buster.js configuration file to load up bowerbuster.json into the test sources list.

And then, at last, everything started to Just Work ™!

jspak - a sample project

There is a good chance you're now as overwhelmed as I was. Hopefully I can help: I've pulled all this together into a sample project called jspak. It integrates bower and buster and grunt and RequireJS and seems to Just Work ™.

I will consult it myself when I start the job of converting Obviel to use it but perhaps it's useful for others too.

Thoughts

Here are a few thoughts concerning all this.

It would be nice if the JavaScript world could work out a system where I don't need 5 or 6 configuration files just to get a project going where I can install client-side packages and run the tests (Gruntfile.js, bower.json, bowerbuster.json, buster.js, package.json, rjs.js). I'm sure glad I got it working though!

Maybe such a system already exists; there just might be a parallel JavaScript ecosystem out there with yet another way to do packaging and imports that has Just Worked ™ all the time already. One never knows with JavaScript!

The Python packaging world feels a lot more comfortable to me than the JavaScript one. One obvious reason that doesn't really count is just because I'm used to Python packaging and am familiar with its quirks.

Another reason is that JavaScript actually runs in web browsers as well as on the server, while Python is used on the server only. This means JavaScript needs to solve problems that Python just doesn't have. (Though various projects exist that make something like Python run in the browser too. One wonders how packaging works for them.)

Finally an important reason is that Python actually has a freaking built-in import statement! People then kind of naturally gravitate towards using something that is already there, instead of creating several different ways. JavaScript clearly doesn't follow the Zen of Python: "There should be one-- and preferably only one --obvious way to do it." ("Although that way may not be obvious at first unless you're Dutch." -- I'm Dutch :) )

Finally, a funny thing about JavaScript project names: Buster.JS, Node.js, CommonJS, RequireJS - not being very consistent with the spelling of the JS bit, are we? I'm a programmer and I'm trained to pay attention to irrelevant things like that.

Powerful composition with Obviel data-render

Obviel is the client side web framework I've been hacking on for the past few years. It has among many other things a client-side template language called Obviel Template. Obviel Template has a bunch of nice features such as i18n support, about which I'll write more later. But the feature I want to talk about today is the data-render directive.

Obviel lets you define views for types of objects, and render them on elements. See my previous blog entry on the core of Obviel for a very short introduction to this concept (note: data-handler has been renamed to data-on). In web applications your Obviel views are mostly going to have small templates - each view doing its own thing: rendering a template for a model, handling user events, etc. Complex UIs are created by composing views together.

data-render is the tool that lets you do this easily from within a template. If we have the following model of a todo list:

{
  iface: 'todo-list',
  todos: [{ iface: 'todo-item', title: 'Item 1' },
          { iface: 'todo-item', title: 'Item 2' }]
}

Then we can create a view to render that todo-list like this:

obviel.view({
   iface: 'todo-list',
   obvt: '<ul><li data-repeat="todos" data-render="@."></li></ul>'
});

What, you may ask, is @.? It is Obviel Template's way to express "the current object", in this case the current item in the todos array that we are looping through using data-repeat.

And we also need to be able to render an individual todo-item:

obviel.view({
   iface: 'todo-item',
   obvt: '<div>{title}</div>'
});

Some benefits:

  • each view is doing one and only one thing

  • if todo-item has an event handler (for instance to handle click events), this will just work:

    obviel.view({
       iface: 'todo-item',
       obvt: '<div data-on="click|clicked">{title}</div>',
       clicked: function() {
         alert("You clicked!");
       }
    });

    [update: fixed bug in code example by renaming 'click' to 'clicked']

  • The todo-list view is agnostic about what individual entries of the todos array actually are. So if you add a special-todo-item to the todos array, and provide a view for it, it will just work too. This is pretty powerful if you are going to hook up entirely different events in it!

Have fun with Obviel and see you next time!

Obviel 1.0b6

I've just released Obviel 1.0b6!

There are backwards incompatible changes to Obviel Template in this release. I figured now was the time to do so, as it's not widely adopted yet. The general pattern is to use imperative verbs where possible for directives (though data-on is an exception):

  • data-each was renamed to data-repeat and @each to @repeat.

  • data-view becomes data-render, as it really calls Obviel's render() function to render an object.

  • data-func becomes data-call, as it really calls a function on the view.

  • data-handler becomes data-on as it's shorter and on typically implies event handling.

A simple search and replace in your templates for these names should be enough to update your templates.

There is also a bugfix in Obviel Template that was contributed by Daniel Havlik, thank you!

You can find out more about Obviel here:

http://www.obviel.org/en/1.0b6/

and here is the complete changelog:

http://www.obviel.org/en/1.0b6/CHANGES.html

Have fun and let me know what you think of it!

Looking for project!

I'm looking for a project to inspire me! I'm a creative developer with 15 years experience doing Python and the web. So much experience doing web applications does not mean I'm stuck in the past; I'm an accomplished client-side JavaScript developer too (who created a client-side web framework).

You can check out http://startifact.com for an overview of some of the stuff I've done.

If you're interested in hiring me to help you improve your project or create something new, please contact me at faassen@startifact.com

I hope to hear from you!

The Story of None: Part 6 - Avoiding It

part 1 part 2 part 3 part 4 part 5 part 6

Last time...

Last time we discussed guard clauses and when not to use them. We discussed the paranoia developers sometimes feel that causes them to write useless or even harmful guard clauses. The best way to reduce paranoia about None is to make sure it can't be there in the first place.

So let's talk about ways to accomplish this.

Date Validator Redux

The date validator in its last incarnation looked like this:

def validate_end_date_later_than_start(start_date, end_date):
    if start_date is None or end_date is None:
        return
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")

Here we want to validate two date values which may be missing, in which case we treat start_date as "indefinite past" and end_date as "indefinite future".

We could create special sentinel objects for "indefinite future" and "indefinite past":

import datetime
from datetime import date
INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)

where MINYEAR is 1 and MAXYEAR is 9999.

(Too bad datetime doesn't allow negative dates, or we could've used Bishop Ussher's date for the creation of the universe, date(-4004, 10, 23). Though that's in the proleptic Julian calendar, and I don't care to know what that is right now. Plus it's bogus. But it'd be amusing.)

If we can be sure that those are used instead of None before the validate_end_date_later_than_start function is called, we can simplify it to this:

def validate_end_date_later_than_start(start_date, end_date):
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")

which is what we started out with in the first place in Part 1 long ago, without any guards! Awesome!

Edge case

This handwaves the edge case where start_date and end_date are both equal to INDEFINITE_PAST or INDEFINITE_FUTURE, which arguably should not raise ValidationError. In software for a time machine it might be important to get this right, but in many applications not handling this edge case is fine.
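To make the edge case concrete, here is a minimal sketch using the plain sentinels. ValidationError is never defined in this series, so a stand-in exception class is assumed, and is_valid is a hypothetical helper added only for the demonstration.

```python
import datetime
from datetime import date


class ValidationError(Exception):
    """Stand-in for the application's validation exception (an assumption)."""


INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)


def validate_end_date_later_than_start(start_date, end_date):
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")


def is_valid(start_date, end_date):
    """Hypothetical helper: report whether validation passes."""
    try:
        validate_end_date_later_than_start(start_date, end_date)
    except ValidationError:
        return False
    return True


# Ordinary case: an open-ended start with a real end date passes.
normal_case = is_valid(INDEFINITE_PAST, date(2013, 5, 1))

# Edge case: the end date has the same value as the sentinel, so "<="
# is true and validation fails, although "indefinite past" is arguably
# earlier than any real date.
edge_case = is_valid(INDEFINITE_PAST, date(datetime.MINYEAR, 1, 1))
```

Only dates at the very ends of the supported range trigger this, which is why ignoring it is acceptable in most applications.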

Really avoiding the edge cases

If we insist on making the edge case go away, we could deal with it by subclassing the date class to construct these sentinels instead:

class IndefinitePast(date):
    def __lt__(self, other):
        return True

    def __le__(self, other):
        return True

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return False

class IndefiniteFuture(date):
    def __lt__(self, other):
        return False

    def __le__(self, other):
        return False

    def __gt__(self, other):
        return True

    def __ge__(self, other):
        return True

INDEFINITE_PAST = IndefinitePast(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = IndefiniteFuture(datetime.MAXYEAR, 12, 31)

This is a lot more code though, and therefore in many situations this would be overkill.

(As a puzzle for the reader: in this case one could safely skip implementing __le__ and __ge__ for these classes and still have it all work for any possible date. I kept them in for clarity.)
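As a rough check of what the subclasses buy us, the sketch below repeats the two sentinel classes and evaluates the comparison the validator performs (end_date <= start_date) against the earliest and latest real dates. The variable names are made up for the illustration.

```python
import datetime
from datetime import date


class IndefinitePast(date):
    """Sentinel that claims to be earlier than whatever it is compared to."""
    def __lt__(self, other):
        return True

    def __le__(self, other):
        return True

    def __gt__(self, other):
        return False

    def __ge__(self, other):
        return False


class IndefiniteFuture(date):
    """Sentinel that claims to be later than whatever it is compared to."""
    def __lt__(self, other):
        return False

    def __le__(self, other):
        return False

    def __gt__(self, other):
        return True

    def __ge__(self, other):
        return True


INDEFINITE_PAST = IndefinitePast(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = IndefiniteFuture(datetime.MAXYEAR, 12, 31)

earliest = date(datetime.MINYEAR, 1, 1)
latest = date(datetime.MAXYEAR, 12, 31)

# Missing start date, end date on the earliest real date:
# the validator's test "end_date <= start_date" is now false.
earliest_vs_past = earliest <= INDEFINITE_PAST

# Real start date on the latest real date, missing end date:
# again the validator's test is false.
future_vs_latest = INDEFINITE_FUTURE <= latest
```

The first comparison has a plain date on the left. It still ends up in the sentinel's code because Python gives the right operand's reflected method (__ge__ here) priority when the right operand's type is a subclass of the left operand's type.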

Normalization

So what have we done here? We've made sure that our input was normalized to a date before it even reached the validation function. This way we don't have to worry about our friend None when we deal with dates.

The idea is to normalize the input as soon as possible before it reaches the rest of our codebase, so we can stop worrying about non-normalized cases (such as None) everywhere else. In effect you put the guard clauses as far on the outside of the calling chain as possible.

In the case of our date input, somewhere in the input processing we'd call these functions:

def process_start_date(d):
    if d is None:
        return INDEFINITE_PAST
    return d

def process_end_date(d):
    if d is None:
        return INDEFINITE_FUTURE
    return d

None of those None's to worry about anymore after that!
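Here is a small sketch of what "normalize at the edge" could look like end to end. The raw input dictionary, normalize_period and period_length_in_days are assumptions made up for this illustration; only the two process_* functions and the sentinels come from the text.

```python
import datetime
from datetime import date

INDEFINITE_PAST = date(datetime.MINYEAR, 1, 1)
INDEFINITE_FUTURE = date(datetime.MAXYEAR, 12, 31)


def process_start_date(d):
    if d is None:
        return INDEFINITE_PAST
    return d


def process_end_date(d):
    if d is None:
        return INDEFINITE_FUTURE
    return d


def normalize_period(raw):
    """Hypothetical boundary function: raw is a dict whose values may be None."""
    return (process_start_date(raw.get("start_date")),
            process_end_date(raw.get("end_date")))


def period_length_in_days(start_date, end_date):
    # No None checks here: by the time we are called, both are dates.
    return (end_date - start_date).days


# The only place that ever sees None is the boundary.
start, end = normalize_period({"start_date": date(2013, 1, 1),
                               "end_date": None})
length = period_length_in_days(start, end)
```

Everything behind normalize_period can be written as if missing dates do not exist.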

Drawbacks

Normalization also has some potential drawbacks. Here are some that may apply to this case:

  • to understand how empty date fields are treated in the validation function, we need to read normalization code that may be somewhere else. Our validation function that worried about None was all in one place.

  • it's more code to understand and maintain, especially with the custom date subclasses.

  • normalization of None to a date may be nice during validation, but it may not be what we want to store in a database; we might want to store None there. If we have this requirement we'd need two code paths: one for storage and one for validation.

It all depends on the exact details of your project. If the project is going to compare a lot of dates in many places, it makes sense to normalize missing values to proper dates as soon as possible, and it's a much better approach than having to worry about None everywhere. But if the project only needs a single validation rule that can handle missing dates, then it makes more sense to write one that deals with None directly.

Conclusion

This concludes the Story of None! I hope you've enjoyed it! Perhaps you've learned something.

Let me know if you would like to see more stuff like this - discussions of fairly low-level patterns that happen during development.

The Story of None: Part 5 - More on Guarding

Last time...

Last time in the Story of None we discussed the concept of a guard clause. This is simply an if statement at the beginning of a function that returns early if a certain condition is true.

The guard clause pattern is applicable to more than just the None scenario. We could be writing a function where we need a specific treatment if an integer value is 0, for instance. A guard clause often does more than just a bare-bones return. Instead, it could return a value or raise an exception.

So let's discuss some other brief examples of guard clauses.

Raising exceptions

Raising an exception is good when the input really should be considered an error and the developer should know about it:

def divide(a, b):
    if b == 0:
        raise ZeroDivisionError("Really I don't know, man!")
    return a / b

In this case you know that the way to handle b being 0 is to not handle it and instead to loudly complain that there is some error, by raising an exception.

Complaining loudly is important: it is tempting to make up some return value and let the code proceed. We'll go into this in more detail later on.

Dictionary .get guard clause

Let's say we want to implement dictionary .get as a function ourselves, with a guard against a missing dictionary key:

def get(d, key, default=None):
    if key not in d:
        return default
    return d[key]

This guard clause returns a default value if the guard clause condition is true. As you can see here the guard clause can be dependent on multiple arguments, in this case d and key; if the function in question is a method, it can be dependent on object state as well.
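A quick usage sketch of this function, with a made-up dictionary, showing the guard returning the default and the normal path returning the stored value:

```python
def get(d, key, default=None):
    # Guard clause: missing key, so hand back the default right away.
    if key not in d:
        return default
    return d[key]


colors = {"apple": "red"}  # made-up example data

found = get(colors, "apple")              # normal path
missing = get(colors, "pear")             # guard fires, default is None
fallback = get(colors, "pear", "unknown")  # guard fires, explicit default
```

For these inputs it behaves the same way as the built-in colors.get(...).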

Complain loudly for possible input that you cannot handle; it makes debugging easier.

Guard clauses in recursion

Guard clauses often occur in recursive functions, where they guard against the stopping criterion.

Let's consider this rather contrived (as there are much better implementations without recursion in Python) example where we recursively add all numbers in a list:

def add(numbers):
    if len(numbers) == 0:
        return 0
    return numbers[0] + add(numbers[1:])

The main part of the function says: the sum of all the numbers in a list numbers is the first entry in that list added to the sum of the rest of the entries in the list.

But what if the list of numbers is empty? We cannot obtain the first entry in that case, so we need some kind of guard clause to handle this. We know that the resulting sum of an empty list should be 0, so we can simply return this. This is also the stopping criterion for the recursion, because we cannot recurse further into an empty list.
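To see the guard act as the stopping criterion, here is the same function with a short trace in the comments, next to the built-in sum, which is one of the better non-recursive ways to do this in Python.

```python
def add(numbers):
    # Guard clause and stopping criterion: nothing left to add.
    if len(numbers) == 0:
        return 0
    return numbers[0] + add(numbers[1:])


# add([1, 2, 3])
#   = 1 + add([2, 3])
#   = 1 + (2 + add([3]))
#   = 1 + (2 + (3 + add([])))   <- the guard returns 0 here
total = add([1, 2, 3])
empty_total = add([])

# The built-in does the same job without recursion.
builtin_total = sum([1, 2, 3])
```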

Don't be paranoid

Don't overdo it: don't put in guard clauses for cases that you don't actually know can happen, or that you don't know how to handle. Guard against exceptional forms of expected input, not against input that is unexpected altogether. Guarding against the expected is sensible, but guarding against the unexpected is paranoia.

So, in the case of None, we don't want to clutter our code with lots of guard clauses just to make sure that the input wasn't None if we don't even know that the input can be None in the first place.

Python tends to do the right thing in the face of the unexpected: its core operations tend to fail early with an exception if asked to do something which they cannot handle: dividing by zero, comparing a date with None, getting a non-existent key from a dictionary. Rely on this behavior, be happy Python is eager to tell you something is wrong, and avoid clutter in your code.
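The three operations mentioned can be tried out directly. In this sketch failure_name is a hypothetical helper that runs an operation and reports which exception it raised, so the results can be inspected without crashing the script.

```python
from datetime import date


def failure_name(operation):
    """Hypothetical helper: name of the exception raised, or None."""
    try:
        operation()
    except Exception as error:
        return type(error).__name__
    return None


# Each of these fails right at the offending operation, without any
# guard clause written by us.
division = failure_name(lambda: 1 / 0)
comparison = failure_name(lambda: date(2013, 5, 1) < None)
lookup = failure_name(lambda: {}["missing"])
```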

Other languages like JavaScript sometimes continue instead of failing, even in the face of the unexpected: they let you add a string to a number, and if a property is missing you don't get an exception but the value undefined. In my experience this makes JavaScript harder to debug than Python. But I still don't clutter my JavaScript code with all kinds of paranoid guard clauses, because I still don't expect these cases.

In statically typed programming languages such as Java you have to specify exactly what type of input arguments to expect, and the system will fail loudly if you do something that isn't expected. Languages like that are a bit paranoid by nature and you'll have to follow their rules. What they do in the case of failure is correct: fail loudly as early as possible. In dynamically typed languages such as Python or JavaScript you don't specify types for the sake of less cluttered, more generic code.

If you feel paranoid

We all feel paranoid sometimes. Sometimes we think we need to handle some type of input. If you feel inclined to handle something but aren't sure what to do, here is a list of things to consider:

  • Don't return from the function early. This is the worst thing you can do.

    If you handle an unexpected value by returning something you make up, you really are creating a bug. Made-up data is now propagating further through the codebase. You either end up with an exception deeper down in the call chain where it becomes harder to debug, or you end up with something worse: a seemingly functional program which delivers bogus data.

    You may think you can avoid returning something made-up by using a plain return statement. But if you do that in a function that really needs to return a result, you are implicitly returning None (in Python) or undefined (in JavaScript). This is the same as returning a made-up value: this value will likely be used later on, and you'll either get a harder to debug error later on, or bogus results.

    It might seem like we did such a plain return in case of the date validation function; we got a None and we handled it by returning early.

    But in the case of the date validation function, None was according to our requirements expected input, just an exceptional case where the normal case was date input. And a validation function like this has no return value at all, and can stop validating right away as soon as the input is judged valid, so returning early is fine.

  • Don't overuse print statements.

    You could use a print statement to print out arguments, so you can see whether they are unexpected by reading the output.

    Using print is a totally legitimate debugging strategy, but make sure the print is removed from your code after you're done debugging!

    You don't want to clutter up your code with a lot of print statements. You'll get a lot of output that will be hard to understand.

    While print is quick and appropriate sometimes, do consider learning a debugger tool for your language as it can help a lot for more complex cases. For Python the built-in debugger is pdb for instance.

  • Don't log everything.

    Logging for debugging purposes is the advanced form of using print statements. At least it doesn't clutter the standard output. But logging is still clutter in the code. And if you fill log files with a lot of debugging information, it will still be hard to find out what, if anything, is really wrong.

    Logging is very useful, but I prefer logging to be application-appropriate, not to help debug the program flow. If I want to debug program flow I use a debugger or a bunch of throw-away print statements.

    Of course there are exceptions to this rule; you might for instance want to log debugging information if a bug is hard to find and turns up in production. But use it in moderation, only when necessary.

  • Don't print or log and then return early with a made up value.

    You can print or log some diagnostic information (the value of arguments, say) and then return early with a made-up value.

    The impulse to want to get diagnostics is correct. The impulse to stop going further is also correct. But returning with a made-up value is still wrong -- and you aren't really stopping your program anyway.

    If you return with a made-up value, your program will continue and will likely log some more information. But since you've returned a made-up value, everything logged after this case is nonsense; there's no point in continuing.

    You could instead print diagnostics and do an early exit (sys.exit() in Python). But if your language lets you do it, it's better to just throw an exception.

  • Throw an exception.

    If you feel you need to handle an unexpected case, throw an exception. In Python you can throw a basic exception like this:

    if value == 0:
        raise Exception("Something went wrong")

    But of course it makes sense to use more specific exceptions when you can:

    if not isinstance(value, str):  # basestring on Python 2
        raise TypeError("Expected string, not %r" % value)

    Sometimes you need to make up new exceptions specific to your library or application as none of the built-in ones is appropriate:

    class WorkflowError(Exception):
        pass

    ...

    if invalid(workflow_state):
        raise WorkflowError("Invalid workflow state: %s" % workflow_state)

    Exceptions do the right stuff automatically:

    • bail out early when you can't handle something.

    • give diagnostic information in the form of a message and a traceback of the function call chain.

    • allow you to handle them after all if you want.

  • Do nothing.

    Doing nothing is often the right impulse.

    In the case of unexpected input, I can often rely on the language to fail with an exception anyway in the appropriate spot.
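To contrast the first and the last items of this list, here is a minimal sketch with a made-up averaging function. Both function names and the choice of ValueError are assumptions for the sake of illustration.

```python
def average_made_up(numbers):
    # Paranoid guard that invents a result for input it cannot handle.
    if len(numbers) == 0:
        return 0
    return sum(numbers) / len(numbers)


def average_strict(numbers):
    # Guard that complains loudly instead.
    if len(numbers) == 0:
        raise ValueError("Cannot average an empty list")
    return sum(numbers) / len(numbers)


# The made-up 0 flows on silently and looks like a real measurement.
bogus = average_made_up([])

# The strict version stops right where the problem is, with a message
# and a traceback, and the caller can still choose to handle it.
try:
    average_strict([])
except ValueError as error:
    message = str(error)
```

Doing nothing would have worked here too: without any guard, the division itself raises ZeroDivisionError for an empty list.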

Next time we'll consider a way to avoid having to scatter guard clauses throughout our codebase: normalization.

The Story of None: Part 4 - Guard Clauses

Last time...

Last time in the Story of None we ended up with this validation function:

def validate_end_date_later_than_start(start_date, end_date):
    if start_date is None and end_date is None:
        return
    if start_date is None:
        return
    if end_date is None:
        return
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")

So let's apply some boolean logic and rewrite all this to be slightly shorter (but still readable):

def validate_end_date_later_than_start(start_date, end_date):
    if start_date is None or end_date is None:
        return
    if end_date <= start_date:
        raise ValidationError(
            "The end date should be later than the start date.")

Guard Clauses

What we've written above with the if ... return statement is a guard clause: a check at the beginning of a function that returns early from the function if the condition is true.

Instead we could have written the function without guard clauses, like this:

def validate_end_date_later_than_start(start_date, end_date):
    if start_date is not None and end_date is not None:
        if end_date <= start_date:
           raise ValidationError(
              "The end date should be later than the start date.")

or like this:

def validate_end_date_later_than_start(start_date, end_date):
    if (start_date is not None and
        end_date is not None and
        end_date <= start_date):
           raise ValidationError(
              "The end date should be later than the start date.")

I think both alternatives are a lot less clear than the one with the guard clause in it.

With a function it's better if the main thing that it is doing, in this case comparing end_date with start_date, is also the main unindented block. This makes the function easier to read. If you make the main behavior of the function be inside some other if clause, it becomes slightly harder to read.

The second alternative doesn't create this second level of indentation, but still makes the comparison between start and end date rather hidden among the is not None clauses. This makes the expression a bit harder to think about compared to the version of the function that uses the guard clause.

Guard and Stop Worrying

The great advantage of a guard clause is that you can forget about the stuff you're guarding against in the code you write after the guard clause. So once we have passed the guards, we can entirely stop worrying about start_date or end_date being None; we've already guarded against those cases and handled them.

This is great! We can compare start_date and end_date freely. We can call methods on start_date and end_date safely. We can pass start_date and end_date to other functions and they don't need to worry about them being None either, unless of course such functions can be called from somewhere else where it could be None.

We worry about None so we can stop worrying about None when it matters. Ah, such a relief! Thank you guard clauses!
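As a small illustration of this, here is a sketch of a different function that uses the same guard. describe_period and its return strings are made up for the example; the point is that everything after the guard can subtract the dates and use the result without another None check.

```python
from datetime import date


def describe_period(start_date, end_date):
    # Guard clause: deal with the missing cases first and return early.
    if start_date is None or end_date is None:
        return "open-ended"
    # Past the guard both values are known to be dates, so subtracting
    # them and reading .days is safe.
    return "%d days" % (end_date - start_date).days


bounded = describe_period(date(2013, 1, 1), date(2013, 1, 31))
open_ended = describe_period(None, date(2013, 1, 31))
```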

We'll talk a bit more about guard clauses next.
