The Shoes of the Fisherman's Wife Are Some Jive-Ass Slippers

tpot (at) frungy . org

Wed, 26 Nov 2003

Programming Interview Question

What does the following code do?

FooClass::FooClass(BarClass* rep) : _rep(rep)
{
	assert(rep);
}
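
One reading, offered as a sketch rather than as the answer the interviewer wanted: the ": _rep(rep)" after the parameter list is a member initializer list. The constructor initializes the member _rep from the argument and then asserts that the pointer is not null. The class definitions below are hypothetical stand-ins, since the question shows only the constructor.

```cpp
#include <cassert>

// Hypothetical stand-ins: the question shows only the constructor,
// so these class definitions are assumptions for illustration.
class BarClass {
public:
    int value = 42;
};

class FooClass {
public:
    FooClass(BarClass* rep);
    BarClass* rep() const { return _rep; }
private:
    BarClass* _rep;
};

// ": _rep(rep)" is a member initializer list: _rep is initialized
// from the argument before the constructor body runs.
FooClass::FooClass(BarClass* rep) : _rep(rep)
{
    assert(rep);  // aborts (unless NDEBUG is defined) if rep is null
}
```
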
I had to ask someone wtf was going on here. (-:
posted at: 14:41 | path: /rants/c++ | permanent link to this entry

Tue, 25 Nov 2003

Samba 4

Tridge, and a handful of others on the Samba Team, have been working on a rewrite of Samba. Slashdot trolls notwithstanding, it's coming along very nicely and at a much greater speed than I had expected.

There are a number of interesting design patterns that have emerged.

  • Test-driven development

    Samba 4 started out as a rewrite of the lowest level protocol layer in CIFS, the SMB layer. Each SMB (there are seventy-three distinct SMB messages) was re-implemented from scratch, and tests were written to exercise every possible field of each SMB. For parts of the protocol that return the same information, such as the seventeen different ways of asking for the length of a file, that information is cross-checked against the other methods. Using this technique Tridge found a number of bugs in Windows 2003 Server, including some that would require the server to be rebooted or the operating system completely reinstalled to fix.

    Once there is a body of test code to be used, refactoring the code becomes a much more manageable task. This is one of the tenets of Extreme Programming. Having test code also encourages a culture of test case development, especially if the tests can be run easily. Contributors to the project can be confident that their change is good if the existing and any new tests pass.

  • Use of code generation tools

    With the low level SMB layer complete, focus has moved to the RPC layer. Again there is a suite of tests for all known RPC operations written in parallel with the code. All the RPC-related code (header files, marshaling, unmarshaling and debug code) is generated from IDL files using an IDL compiler written in Perl.

    Previously Samba had handwritten marshaling code that was painful to maintain and hard to write in the first place. The advantage of automatically generated code is that alignment bugs can be fixed in the compiler, so whole classes of bugs can be fixed at once instead of one instance at a time.

    Now the really neat thing is that there are tests that check the marshaling and unmarshaling code at the same time. When a blob of data is marshaled, it is also passed through the unmarshaler and the two blobs compared. If they are not equal then there is a bug somewhere.

  • Pool based memory allocation

    Allocating memory in pools is a nice technique for managing dynamic memory allocation in the face of complicated data structures. The idea is that all memory allocated is associated with a "pool" which can be freed with a single function call. This frees the programmer from having to iterate over elements of a list, array or other deep structure calling free() on memory blocks in the correct order. Samba uses the routines in talloc.c, or "trivial alloc", which is simply a structure that holds a linked list of pointers to allocated blocks. The talloc_free() function simply iterates over the list and frees each block.

    One participant on the #samba-technical IRC channel said that using talloc() was tantamount to "giving up on doing memory allocation properly". While there is something to be said for donning the hair shirt and making sure every single malloc() is matched with a corresponding call to free(), this rapidly becomes a difficult task, especially with large nested data structures. Being able to allocate memory and not have to worry about the consequences is almost like using a modern language with built-in garbage collection like Python or Perl. (-:

    Memory bugs in Samba 4, and to a lesser extent Samba 3, are now reduced to simply forgetting to free a talloc context, or allocating memory from the wrong context. The "correct context" is the talloc context with the shortest lifetime and is usually obvious from reading the code.
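
A minimal sketch of the pool idea, assuming nothing about the real talloc.c beyond the description above: a pool holds a linked list of the blocks it has handed out, and one call frees them all. The names Pool, pool_alloc, pool_strdup and pool_free are hypothetical and are not the talloc API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// One node per allocation; the pool is a singly linked list of these.
struct PoolBlock {
    PoolBlock* next;
    void* data;
};

struct Pool {
    PoolBlock* head = nullptr;
    std::size_t count = 0;  // number of live blocks, kept for illustration
};

// Allocate size bytes and remember the block in the pool.
// Returns nullptr if either allocation fails.
void* pool_alloc(Pool& pool, std::size_t size)
{
    PoolBlock* block = static_cast<PoolBlock*>(std::malloc(sizeof(PoolBlock)));
    if (!block) return nullptr;
    block->data = std::malloc(size);
    if (!block->data) {
        std::free(block);
        return nullptr;
    }
    block->next = pool.head;
    pool.head = block;
    ++pool.count;
    return block->data;
}

// Convenience helper: copy a C string into the pool.
char* pool_strdup(Pool& pool, const char* s)
{
    std::size_t len = std::strlen(s) + 1;
    char* copy = static_cast<char*>(pool_alloc(pool, len));
    if (copy) std::memcpy(copy, s, len);
    return copy;
}

// Free every block in one call; the caller never walks its own
// data structures calling free() on each piece.
void pool_free(Pool& pool)
{
    PoolBlock* block = pool.head;
    while (block) {
        PoolBlock* next = block->next;
        std::free(block->data);
        std::free(block);
        block = next;
    }
    pool.head = nullptr;
    pool.count = 0;
}
```

A caller would allocate everything belonging to one request from a single Pool and call pool_free once when the request is finished.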

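The round-trip check described under the code generation item can be sketched along these lines. The FileInfo record and its wire layout are invented for illustration and are not a real SMB or RPC structure; in Samba the equivalent code is generated from IDL rather than written by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record; not a real SMB or RPC structure.
struct FileInfo {
    std::uint32_t size;
    std::uint16_t mode;
};

using Blob = std::vector<std::uint8_t>;

// Marshal to a little-endian byte blob: 4 bytes of size, 2 of mode.
Blob marshal(const FileInfo& info)
{
    Blob out;
    for (unsigned i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>((info.size >> (8 * i)) & 0xff));
    for (unsigned i = 0; i < 2; ++i)
        out.push_back(static_cast<std::uint8_t>((info.mode >> (8 * i)) & 0xff));
    return out;
}

// Unmarshal a blob; returns false if the length is wrong.
bool unmarshal(const Blob& in, FileInfo& info)
{
    if (in.size() != 6) return false;
    info.size = 0;
    info.mode = 0;
    for (unsigned i = 0; i < 4; ++i)
        info.size |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    for (unsigned i = 0; i < 2; ++i)
        info.mode = static_cast<std::uint16_t>(info.mode | (in[4 + i] << (8 * i)));
    return true;
}

// Round trip: marshal, unmarshal the result, marshal again, and
// compare the two blobs. A mismatch means a bug in one direction.
bool round_trip_ok(const FileInfo& info)
{
    Blob first = marshal(info);
    FileInfo decoded{};
    if (!unmarshal(first, decoded)) return false;
    Blob second = marshal(decoded);
    return first == second;
}
```
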
Any discussion of patterns would be incomplete without some cool anti-patterns. There are still a number of things that annoy me about Samba.

  • Global prototype file

    Samba has a big honking automatically generated header file that contains the function prototypes for all non-static functions. While this is a quick way of keeping header file prototypes up to date, it encourages monolithic design because it's easy just to add a function to a random file, type make proto and continue on your way. Samba should have a small number of utility libraries that export interfaces to be used by other parts of Samba, or third party programs.

    Tridge is very much against removing the global header file for a number of reasons. I think the issues are a bit confused. I break them down like this:

    • Problem: It's too hard to manage the dependencies of system header files.

      Autoconf does a great job of working out which header files are where. Why not switch to a global include file that includes every system header from the right place and in the right order?

    • Problem: The global header file is needed to keep function prototypes automatically up to date.

      I think this argument is particularly bogus as gcc does plenty of checking at compile time to ensure the header and its implementation are consistent. It's a simple matter to cut and paste the prototype or just edit it by hand. Exactly how many functions are you going to be adding or changing at any one time anyway? The Ethereal project has header files maintained by hand and it is not really too much trouble to update the .h file if you change the .c file.

  • No header file dependencies in build system

    Traditionally, the lack of an accurate representation of header file dependencies has been one of the main failings of large build systems. Getting it right is hard: maintaining the dependencies by hand is next to impossible, so one is left with the various automatic solutions based on scripts or gcc compiler extensions. Broken header dependencies are usually a result of using recursive make (see my favourite discussion of the topic here), but in Samba's case they are the result of laziness encouraged by the global header include file.

    A symptom of bad dependencies is when a make clean is required before make will rebuild files that need to be recompiled. In Samba's case this problem is linked to the previous one about a global include file.

    This is a sign that the project is badly organised, with no separation between the application logic and the groups of utility functions needed to implement that logic. Samba 2/3 depends on a large set of files in the lib and libsmb directories, which in turn depend on random parts of each other. This makes the job of dividing the code into modular sections hard.

    My proposed solution is to use some automated generation of header file dependencies as seen in many other projects (cf. Ethereal). Unfortunately(?) most of these techniques require the use of GNU make. It would be nice to assert that a requirement for building Samba is that you must have GNU make. (Ha ha - can't compile GNU make on your system). Another solution is to only enable header file dependencies on systems that have GNU make installed. Samba development is primarily done on these systems anyway.

    My final comment is that fixing header file dependencies will require the global include file to be replaced with several smaller files. The reason is that, since everything depends on proto.h, changing anything at all in Samba will require every object file to be rebuilt.

Despite the above two gripes, Samba 4 is shaping up to be a major architectural and technical improvement over Samba 3.
posted at: 14:15 | path: /software/samba | permanent link to this entry

Mon, 24 Nov 2003

More nice things about Subversion

The more I use Subversion the more I like it. My favourite feature at the moment is the gentle learning curve when migrating from CVS. The command line and the output produced by commands are very similar to CVS and quite Unixy. The built-in help is consistent and useful.

There are even little hints when you use CVS syntax. For example, when diffing two repository versions, Subversion gently tells you that you can't use two -r options à la CVS but should use -r REVISION:REVISION2 instead.

Nice.
posted at: 13:02 | path: /software/subversion | permanent link to this entry

Seen on slashdot...

"Avoid the slashdot effect, don't read the articles!"
posted at: 10:29 | path: /internet/sigs | permanent link to this entry

Thu, 20 Nov 2003

The cost of operating system integration

I see several problems with integrating "non-core components" into an operating system. My example in this case is Internet Explorer.

  1. The component requires patching even if it is not being used. From the latest update for Windows 2003 server:
    "Security issues identified in Internet Explorer could allow an attacker to compromise systems with Internet Explorer installed (even if it not used as the Web browser)."
    It's a bit rich to use the phrase "systems with Internet Explorer installed" as if there is even a choice in the matter.

  2. Again, from Windows Update:
    "After installation, you may have to restart your computer."
    Excuse me? Rebooting after upgrading a web browser?

  3. I've heard Tridge say that making technical decisions for marketing or political reasons is nearly always a bad idea. I think integrating IE into the operating system as an anti-competitive measure is one of these bad ideas.

The problem that grates the most with me is the last one. Sacrificing design quality for marketing reasons is one thing, but doing so for political (read: antitrust) reasons is just insane.
posted at: 14:39 | path: /rants/microsoft | permanent link to this entry

Tue, 18 Nov 2003

The Microsoft Matrix

From http://satya.virtualave.net/msmatrix.html:

Like Keanu Reeves, most people's eyes will hurt when they first look at the real world, because they've never used those eyes before. But I've chosen that real world, because while the Matrix of Linux has rules and regs every bit as stern -- and often sterner -- as the Matrix of Windows, that Big Difference pops up: unlike the Microsoft Matrix, you can hack the Linux Matrix from the inside, change that reality if you don't like it, and no-one will stop you -- they'll even applaud. You can unplug the steel tubes, squelch out of the nutrient pod, and make your own way in the world. And having that option -- even if you never use it -- makes a huge difference.
posted at: 11:48 | path: /rants/microsoft | permanent link to this entry

Sat, 15 Nov 2003

Spam du jour

It's probably too much to expect spammers to learn about word wrapping.

From: "Larry Moore." 
To: tpot@samba.org
Subject: from Larry.

Hello,
This letter may come to you as a surprise due to the fact that we have
not
yet
met.
...
I have been diagnosed with prostate and esophageal cancer that was
discovered
very late due to my laxity in caring for my health. It has defiled all
form
of
medicine and right now, I have only about a few months to live
according to
medical experts.

Heh.
posted at: 15:12 | path: /internet/spam | permanent link to this entry

Fri, 07 Nov 2003

Microsoft is destroying email

Traditionally people have been saying that email is unusable because of spam. I have received more complaints from users who have subscribed (and presumably posted as well) to the samba mailing list and have immediately started receiving viruses. At least spammers are content to send you only a handful of emails about penis enlargement for each email address. Microsoft viruses just keep on sending you copies of themselves. There's no inherent rate limiting in the process.

My conclusion here is that spam produces less traffic than Microsoft viruses, hence it is the viruses, more than the spam, that are rapidly making email unusable.
posted at: 02:19 | path: /rants/microsoft | permanent link to this entry