Programming Interview Question
What does the following code do?
FooClass::FooClass(BarClass* rep) : _rep(rep)
{
    assert(rep);
}
I had to ask someone wtf was going on here. (-:
posted at: 14:41 | path: /rants/c++ | permanent link to this entry
Samba 4
Tridge, and a handful of others on the Samba Team, have been
working on a rewrite of Samba. Slashdot
trolls notwithstanding, it's coming along very nicely and at a
much greater speed than I had expected.
There are a number of interesting design patterns that have
emerged.
- Test-driven development
Samba 4 started out as a rewrite of the lowest level protocol layer in
CIFS, the SMB layer. Each SMB (there are seventy-three distinct SMB
messages) was re-implemented from scratch and tests written to
exercise every possible field of each SMB. For parts of the protocol
that return the same information, such as the seventeen different ways
of asking for the length of a file, that information is cross checked
with the other methods. Using this technique Tridge found a number of
bugs in Windows 2003 Server, including some that required a reboot of
the server, or even a complete reinstall of the operating system, to fix.
Once there is a body of test code to be used, refactoring the code
becomes a much more manageable task. This is one of the tenets of
Extreme Programming. Having test code also encourages a culture of
test case development, especially if the tests can be run easily.
Contributors to the project can be confident that their change is good
if the existing and any new tests pass.
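The cross-checking idea can be illustrated away from SMB entirely. What
follows is only a minimal sketch, not Samba test code: it asks the local
filesystem for the length of a file in three different ways and checks
that the answers agree. All of the function names are made up for the
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <string>

// Method 1: C stdio, seek to the end and ask for the offset.
long long length_via_stdio(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "rb");
    if (!f) return -1;
    long long result = -1;
    if (std::fseek(f, 0, SEEK_END) == 0) result = std::ftell(f);
    std::fclose(f);
    return result;
}

// Method 2: C++ stream opened with the position already at the end.
long long length_via_stream(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(std::streamoff(in.tellg()));
}

// Method 3: read the whole file and count the bytes.
long long length_via_read(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return -1;
    long long count = 0;
    char c;
    while (in.get(c)) ++count;
    return count;
}

// Write a scratch file of known length, then cross-check all three
// methods against the expected value and, implicitly, each other.
bool lengths_agree(const std::string& path, std::size_t length) {
    assert(!path.empty());
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        out << std::string(length, 'x');
    }   // stream is flushed and closed here
    const long long expected = static_cast<long long>(length);
    const bool ok = length_via_stdio(path) == expected &&
                    length_via_stream(path) == expected &&
                    length_via_read(path) == expected;
    std::remove(path.c_str());
    return ok;
}
```

A disagreement between any two methods points at a bug in one of them,
which is the same reasoning the SMB tests apply to a server.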
- Use of code generation tools
With the low level SMB layer complete, focus has moved to the RPC
layer. Again there is a suite of tests for all known RPC operations
written in parallel with the code. All the RPC related code (header
files, marshaling, unmarshaling and debug code) is generated from IDL
files using an IDL compiler written in Perl.
Previously Samba had handwritten marshaling code that was painful to
maintain and hard to write in the first place. The advantage of
automatically generated code is that alignment bugs can be fixed in
the compiler and thus whole classes of bugs can be fixed at once
instead of just one instance.
Now the really neat thing is that there are tests that check the
marshaling and unmarshaling code at the same time. When a blob of
data is marshaled, it is also passed through the unmarshaler and the
two blobs compared. If they are not equal then there is a bug
somewhere.
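One reading of this round-trip check, sketched in C++ rather than in
the generated C that Samba actually uses: marshal a value, unmarshal
the resulting blob, marshal the result again and compare the two blobs.
The FileInfo record and its wire layout (including the padding to a
four byte boundary) are invented for the example and do not correspond
to any real SMB or RPC structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

using Blob = std::vector<std::uint8_t>;

// Hypothetical record, not a real SMB or RPC structure.
struct FileInfo {
    std::uint16_t attributes;
    std::uint32_t size;
    std::string name;
};

// Append an integer in little-endian byte order.
static void put_uint(Blob& b, std::uint32_t v, int bytes) {
    for (int i = 0; i < bytes; ++i)
        b.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a little-endian integer, failing if the blob is too short.
static bool get_uint(const Blob& b, std::size_t& pos, int bytes,
                     std::uint32_t& v) {
    const std::size_t n = static_cast<std::size_t>(bytes);
    if (pos > b.size() || b.size() - pos < n) return false;
    v = 0;
    for (int i = 0; i < bytes; ++i)
        v |= static_cast<std::uint32_t>(b[pos + i]) << (8 * i);
    pos += n;
    return true;
}

Blob marshal(const FileInfo& f) {
    assert(f.name.size() <= 0xffffffffu);  // must fit the length field
    Blob b;
    put_uint(b, f.attributes, 2);
    while (b.size() % 4 != 0) b.push_back(0);  // pad to 4-byte boundary
    put_uint(b, f.size, 4);
    put_uint(b, static_cast<std::uint32_t>(f.name.size()), 4);
    b.insert(b.end(), f.name.begin(), f.name.end());
    return b;
}

bool unmarshal(const Blob& b, FileInfo& out) {
    std::size_t pos = 0;
    std::uint32_t attrs = 0, len = 0;
    if (!get_uint(b, pos, 2, attrs)) return false;
    out.attributes = static_cast<std::uint16_t>(attrs);
    pos = (pos + 3) & ~static_cast<std::size_t>(3);  // skip the padding
    if (!get_uint(b, pos, 4, out.size)) return false;
    if (!get_uint(b, pos, 4, len)) return false;
    if (b.size() - pos < len) return false;
    const auto first = b.begin() + static_cast<std::ptrdiff_t>(pos);
    out.name.assign(first, first + static_cast<std::ptrdiff_t>(len));
    return true;
}

// Marshal, unmarshal, marshal again: the two blobs must be identical.
bool round_trip_ok(const FileInfo& f) {
    const Blob first = marshal(f);
    FileInfo decoded{};
    if (!unmarshal(first, decoded)) return false;
    return marshal(decoded) == first;
}
```

If the marshaler and unmarshaler disagree about the padding, for
example, the second blob comes out different from the first and the
check fails without anyone having to write a test for that field.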
- Pool based memory allocation
Allocating memory in pools is a nice technique for managing dynamic
memory allocation in the face of complicated data structures. The
idea is that all memory allocated is associated with a "pool" which
can be freed with a single function call. This frees the programmer
from having to iterate over elements of a list, array or other deep
structure calling free() on memory blocks in the correct
order. Samba uses the routines in talloc.c, or "trivial alloc", which
is simply a structure that holds a linked list of pointers to
allocated blocks. The talloc_free() function simply iterates
over the list and frees each block.
One participant on the #samba-technical IRC channel said that
using talloc() was tantamount to "giving up on doing memory
allocation properly". While there is something to be said for donning
the hair shirt and making sure every single malloc() is
matched with a corresponding call to free() this rapidly
becomes a difficult task, especially with large nested data
structures. Being able to allocate memory and not have to worry about
the consequences is almost like using a modern language with built-in
garbage collection like Python or Perl. (-:
Memory bugs in Samba 4 and to a lesser extent Samba 3 are now
reduced to simply forgetting to free a talloc context, or allocating
memory from the wrong context. The "correct context" is the talloc
context with the shortest lifetime and is usually obvious from reading
the code.
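To make the pool idea concrete, here is a minimal sketch of a pool
built the way the description above suggests: a linked list of pointers
to allocated blocks, all released by one call. It is not the talloc
API. Pool, PoolBlock, NameNode, copy_string and build_and_free are
names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// One entry in the pool's list of allocated blocks.
struct PoolBlock {
    void* memory;
    PoolBlock* next;
};

class Pool {
public:
    Pool() = default;
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;
    ~Pool() { release(); }

    // Allocate a block and remember it in the pool's list.
    void* alloc(std::size_t size) {
        void* memory = std::malloc(size ? size : 1);
        if (!memory) return nullptr;
        auto* block = static_cast<PoolBlock*>(std::malloc(sizeof(PoolBlock)));
        if (!block) {
            std::free(memory);
            return nullptr;
        }
        block->memory = memory;
        block->next = head_;
        head_ = block;
        return memory;
    }

    // Duplicate a C string into pool-owned memory.
    char* copy_string(const char* s) {
        assert(s != nullptr);
        const std::size_t n = std::strlen(s) + 1;
        char* copy = static_cast<char*>(alloc(n));
        if (copy) std::memcpy(copy, s, n);
        return copy;
    }

    // Walk the list and free every block; returns how many were freed.
    std::size_t release() {
        std::size_t freed = 0;
        while (head_) {
            PoolBlock* next = head_->next;
            std::free(head_->memory);
            std::free(head_);
            head_ = next;
            ++freed;
        }
        return freed;
    }

private:
    PoolBlock* head_ = nullptr;
};

// A nested structure: each node owns a separately allocated string.
struct NameNode {
    char* name;
    NameNode* next;
};

// Build a list of n nodes from a short-lived pool, then drop the lot
// with one call instead of walking the list and freeing each piece.
std::size_t build_and_free(std::size_t n) {
    Pool request_pool;  // the context with the shortest lifetime
    NameNode* list = nullptr;
    for (std::size_t i = 0; i < n; ++i) {
        auto* node = static_cast<NameNode*>(
            request_pool.alloc(sizeof(NameNode)));
        if (!node) break;
        node->name = request_pool.copy_string("entry");
        node->next = list;
        list = node;
    }
    return request_pool.release();
}
```

Each node costs two pool allocations (the node and its string), so
build_and_free(3) releases six blocks without the caller ever walking
the list. Allocating from a pool that outlives the request would leak
until that longer-lived pool is released, which is the "wrong context"
bug in miniature.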
Any discussion of patterns would be incomplete without some cool
anti-patterns. There are still a number of things that annoy me about
Samba.
- Global prototype file
Samba has a big honking automatically generated header file that
contains the function prototypes for all non-static functions. While
this is a quick way of keeping header file prototypes up to date, it
encourages monolithic design because it's easy just to add a function
to a random file, type make proto and continue on your way.
Samba should have a small number of utility libraries that export
interfaces to be used by other parts of Samba, or third party
programs.
Tridge is very much against removing the global header file for a
number of reasons. I think the issues are a bit confused. I break
them down like this:
- Problem: It's too hard to manage the dependencies of system
header files.
Autoconf does a great job of working out which header files are where.
Why not switch to a global include file that includes every system
header from the right place and in the right order?
- Problem: The global header file is needed to keep function prototypes automatically up to date.
I think this argument is particularly bogus as gcc does
plenty of checking at compile time to ensure the header and its
implementation are consistent. It's a simple matter to cut and paste
the prototype or just edit it by hand. Exactly how many functions are
you going to be adding or changing at any one time anyway? The Ethereal project has header files
maintained by hand and it is not really too much trouble to update the
.h file if you change the .c file.
- No header file dependencies in build system
Traditionally, having an accurate representation of header file
dependencies is one of the main failings of large build systems. This
is a hard task as maintaining them by hand is next to impossible so
one is left with the various automatic solutions based on scripts or
gcc compiler extensions. Usually broken header dependencies are a
result of using recursive make (see my favourite discussion of the
topic here)
but in Samba's case it is laziness encouraged by the global
header include file.
A symptom of bad dependencies is when a make clean is
required before make will rebuild files that need to be
recompiled. In Samba's case this problem is linked to the previous
one about a global include file.
This is a sign that the project is badly organised with no
separation of the application logic and groups of utility functions
needed to implement that logic. Samba 2/3 depends on a large set of
files in the lib and libsmb directories, which in turn depend on
random parts of each other. This makes the job of dividing the code
into modular sections hard.
My proposed solution is to use some automated generation of header
file dependencies as seen in many other projects (cf. Ethereal).
Unfortunately(?) most of these techniques require the use of GNU make.
It would be nice to assert that a requirement for building Samba is
that you must have GNU make. (Ha ha - can't compile GNU make on your
system). Another solution is to only enable header file dependencies
on systems that have GNU make installed. Samba development is
primarily done on these systems anyway.
My final comment is that fixing header file dependencies will
require the global include file to be replaced with several smaller
files. The reason is that since everything depends on
proto.h, changing anything at all in Samba requires every
object file to be rebuilt.
Despite the above two gripes, Samba 4 is shaping up to be a major
architectural and technical improvement over Samba 3.
posted at: 14:15 | path: /software/samba | permanent link to this entry
More nice things about Subversion
The more I use Subversion the more I like
it. My favourite feature at the moment is the easy learning curve
when migrating from CVS. The command line and the output produced by
commands is very similar to CVS and quite Unixy. The built-in help is
consistent and useful.
There are even little hints when you use CVS syntax. For example
when diffing two repository versions, Subversion gently tells you
that you can't use two -r options à la CVS but must instead use -r
REVISION:REVISION2.
Nice.
posted at: 13:02 | path: /software/subversion | permanent link to this entry
Seen on slashdot...
"Avoid the slashdot effect, don't read the articles!"
posted at: 10:29 | path: /internet/sigs | permanent link to this entry
The cost of operating system integration
I see several problems with integrating "non-core components" into
an operating system. My example in this case is Internet Explorer.
- The component requires patching even if it is not being used.
From the latest update for Windows 2003 server:
"Security issues
identified in Internet Explorer could allow an attacker to compromise
systems with Internet Explorer installed (even if it not used as the
Web browser)."
It's a bit rich to use the phrase "systems with Internet Explorer
installed" as if there is even a choice in the matter.
- Again, from Windows Update:
"After installation, you may have
to restart your computer."
Excuse me? Rebooting after upgrading a web browser?
- I've heard Tridge say that making technical decisions for
marketing or political reasons is nearly always a bad idea. I think
integrating IE into the operating system as an anti-competitive
measure is one of these bad ideas.
The problem that grates the most with me is the last one. Sacrificing
design quality for marketing reasons is one thing, but doing so for
political (read antitrust) reasons is just insane.
posted at: 14:39 | path: /rants/microsoft | permanent link to this entry
The Microsoft Matrix
From http://satya.virtualave.net/msmatrix.html:
Like Keanu Reeves, most people's eyes will hurt when
they first look at the real world, because they've never used those
eyes before. But I've chosen that real world, because while the Matrix
of Linux has rules and regs every bit as stern -- and often sterner --
as the Matrix of Windows, that Big Difference pops up: unlike the
Microsoft Matrix, you can hack the Linux Matrix from the inside,
change that reality if you don't like it, and no-one will stop you --
they'll even applaud. You can unplug the steel tubes, squelch out of
the nutrient pod, and make your own way in the world. And having that
option -- even if you never use it -- makes a huge difference.
posted at: 11:48 | path: /rants/microsoft | permanent link to this entry
Spam du jour
It's probably too much to expect spammers to learn about word wrapping.
From: "Larry Moore."
To: tpot@samba.org
Subject: from Larry.
Hello,
This letter may come to you as a surprise due to the fact that we have
not
yet
met.
...
I have been diagnosed with prostate and esophageal cancer that was
discovered
very late due to my laxity in caring for my health. It has defiled all
form
of
medicine and right now, I have only about a few months to live
according to
medical experts.
Heh.
posted at: 15:12 | path: /internet/spam | permanent link to this entry
Microsoft is destroying email
Traditionally people have been saying that email is unusable because of
spam. I have received more complaints from users who have subscribed
(presumably posted as well) to the Samba mailing list and have immediately
started receiving viruses. At least spammers are content to send you only
a handful of emails about penis enlargement for each email address.
Microsoft viruses just keep on sending you copies of themselves. There's
no inherent rate limiting in the process.
My conclusion here is that spam produces less traffic than Microsoft
viruses, hence email is rapidly becoming more unusable because of
the viruses than because of spam.
posted at: 02:19 | path: /rants/microsoft | permanent link to this entry