Category: Architecture

The 10 Most Common Mistakes by Tech Job Interviewees (and How to Work Around Them)

The job market is very tight, and even if you are the smartest guy around, with tons of experience and no salary requirements to speak of, you might find it a tad difficult to find a new position. That’s particularly true for those who are just starting out, or who have been out of the game for a while. Unless someone is actively pushing your name somewhere, you’ll find it harder and harder to land that job.

Having witnessed literally hundreds of interviews, I can speak with some authority on what works and what doesn’t from an interviewer’s perspective. While there is outstanding advice out there for general interviewing (“Check your resume for spelling mistakes!”, “Nobody has ever been refused a job for wearing nice clothes!”, “Do not become confrontational!”), advice is scant for the technical portion of your interview. Here are my Top 10 mistakes, and how to avoid them.
1. Saying, “You Can Always Look That Up,” in Response to a Basic Question

If there is an abstruse question that requires detailed knowledge, you may well say you don’t know the details right now. But if someone asks you, “What character terminates a statement in C/C++/Java?” and the word “semicolon” isn’t shooting out of your mouth at the speed of a Lincecum pitch, you are unlikely to impress. That question is the programming language equivalent of, “What’s the letter after ‘C’ in the alphabet?” Surely, you wouldn’t answer that you can look that up online!

Is the question you got asked a basic question? It helps if you consider when it was asked. Interviewers typically follow a sequential strategy – one field after another. They either aim low and go high, or they go the other way around. If you are interviewing for a junior position, they’ll typically start low, asking you easy questions – which you have to answer easily. If you are interviewing for a senior position, they’ll typically start high, asking you tough ones. Depending on how you do, they will adjust their perceived level. If you don’t answer the way they expected, they will adjust down. If you do, they will aim higher.

How do you react if you are asked a basic question and you don’t know the answer? Tell the truth: you don’t know the answer. You may say that it slipped your mind, or that you are nervous. As long as there aren’t a lot of things you don’t know, it’s OK. And if there are lots of basic things you don’t know, there just isn’t a good save at all.

“You can always look that up” doesn’t work. Any interviewer will have heard that a few too many times from unprepared candidates, and the arrogance that comes with the statement – essentially accusing the interviewer of asking a frivolous question – reduces goodwill.

2. Lying About Padding Your Resume

So you made the cardinal mistake of not sending out a copy of your resume tailored to the job at hand, and the interviewer is amused by the list of 50 programming languages you have on there. Obviously, you are not an expert in all of them, and you have probably barely more than heard the names of some of them. Unfortunately, the interviewer happens to be an expert on the last language on your list. So (s)he is going to ask you a question about – I don’t know – the main distinguishing characteristic of FORTH.

Instead of acting surprised that FORTH is on your resume, explaining you made a mistake – that you had put it on there because you were going to take a mandatory FORTH class that got canceled – and taking the result in stride, you guess. Horrible things happen. You see, the interviewer asked the question because (s)he knows something about that language/technology. If you open your mouth, you are likely to say something terribly stupid to someone who cares. Very, very bad.

Instead, be honest. Say that you padded your resume (and think of a good reason ahead of time), and beg to move on to a topic that really matters.

3. Sending Out a Generic Resume

This one angers me when friends do it. If you are applying for a position, make sure that your resume works for it. Remove everything that is unrelated and emphasize the things that match. If you are applying for a job as a Java programmer, all those years of System Administration are not going to do you much good. If you are applying for a SysAdmin job, all those years of Java programming will make you look suspicious and likely to have your head in the clouds and your nose stuck in the air.

Take the time and send out a resume that works. You don’t have to write one resume per position – but you are going to apply to a set of job types, and if you are smart, you’ll find that you need only a dozen or so resumes at most.

Why do you have to do that? Because people are prejudiced. They are not prejudiced because they are evil or malicious, but because they have to read through thousands of resumes, and unless you match, you are out of the running. It’s really not the interviewer’s fault: what would you do if you had a day job and then the need to read hundreds of resumes on top of it? Surely you wouldn’t spend a lot of time interviewing the person that doesn’t sound like a perfect match when you have dozens of perfect matches available, no?

Additionally, if your resume is obviously tailored, it shows that you really care about this particular job. You can’t imagine how many “smart” people write scripts that send their resume to anyone that posts any job on any online board. We started posting on Craigslist and had the same person respond to a job ad for sales, programming, accounting, QA, and admin!

4. Never Giving Up

An interview has only so much time, and the interviewer typically has to gauge not only how much you know (actually, that’s usually the least important part), but also how you work and how you interact. The main question (s)he will want answered is, “Will hiring this person make my life more pleasant?”

When you get asked a technical question and you don’t know the answer, don’t dwell on it unless the interviewer demands it. Make a joke, say that you don’t know, and move on. Always have a joke handy, by the way.

I have seen applicants get stuck on a question and leave my mental presence, diving into a pool of wonder and amazement, trying to find the solution to the problem, while I was twiddling my thumbs, mentally trying to figure out the rest of my work day. I am a patient person, but even I would get a little annoyed after ten minutes of someone scribbling on paper and refusing to move on even after I said it wasn’t really important.

There really are very few instances in which the actual answer to a question asked during an interview is really important. “Are you allergic to antihistamines?” when you are stung by a hornet comes to mind. In most cases, both you and the interviewer can survive if you don’t know the answer, as long as you can find the answer to a lot of other questions.

5. Giving Up Too Easily

The opposite of Never Giving Up comes into play when your interviewer wants to know how you tackle a problem whose answer you don’t know. Sometimes the interviewer will wait until there is a question that stumps you, and sometimes the interviewer will ask a question specifically because (s)he thinks you don’t know the answer.

I used to ask all interviewees for Java positions the same question, to which nobody ever knew the answer: “Why is the String class in Java immutable and final?” Everybody knows that it is, since it is a major headache – whenever you want to perform any string operation, you either have to use a StringBuffer (which is NOT a string and needs to be made into one before passing it to anything else) or you have to copy the String as many times as required. Meanwhile, all major APIs use Strings as arguments. Painful!
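To make the headache concrete, here is a minimal sketch (the class and variable names are mine) of what immutability means in practice:

```java
public class StringDemo {
    public static void main(String[] args) {
        String s = "hello";
        // Every "modifying" operation on a String returns a NEW object;
        // the original is never touched.
        String t = s.toUpperCase();
        System.out.println(s);   // prints "hello" - unchanged
        System.out.println(t);   // prints "HELLO"

        // StringBuffer is the mutable companion, but it is not a String:
        // it has to be converted back before being handed to String-based APIs.
        StringBuffer buf = new StringBuffer("hello");
        buf.append(", world");
        String u = buf.toString();
        System.out.println(u);   // prints "hello, world"
    }
}
```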

I ask the question because it allows me to see how a candidate reacts to something well-known but not understood. The answer I’d like to see is, “You know, I never thought about that. Here are a couple of possibilities that come to my mind.” Off we go brainstorming. I know the actual answer, having heard it from the Java inventors, but the correct answer is not half as important as the process you use to get to one, and the creativity and knowledge of the field you display.

If you just tell me, “I don’t know and I don’t care to know,” you display lack of interest in the field of your work. A very troublesome sign.

On the other hand, if you ask the interviewer to explain it to you, you appeal to a very human instinct: parading around one’s skills. Typically, interviewers love to tell you what you don’t know and they do know.

6. Being Too Reactive

Interviewers, especially in technical fields, are frequently not trained in their function. As a result, they make frequent mistakes in their methodology even if their technical skills are top notch. You can suffer from that or use it to your advantage – it’s really your choice.

In the worst case scenario, you find an interviewer who probes superficially and then digs in as soon as (s)he detects you don’t know something. The problem with that is that you spend the vast majority of your interview talking about things you don’t know. As a result, the impression emerges that you don’t know much, since half the time you didn’t know what (s)he was talking about – despite the fact that you were knowledgeable about the vast majority of topics (which (s)he didn’t probe).

In the best case scenario, you succeed in focusing the interview on areas of your expertise that you sense the interviewer doesn’t know well. You also manage to avoid the opposite, areas in which the interviewer speaks with authority and you, well, don’t.

How do you manage to avoid the worst case scenario and get to the best case scenario? Well, for one, you structure your resume to emphasize topics that are related to the job, but unusual. You get a hook that many interviewers will latch on to, and you can get started there.

Even if your interviewer comes in with a list of topics to be discussed and doesn’t guess at what to talk about (which is the right way to do a technical job interview, by the way), you can avoid digging deep by simply stating that you feel unprepared on the particular topic at hand, and asking to please move on.

7. Not Knowing About the Company

Preparing for a job interview includes, as one of the essential items, learning about the company you are going to work for. Sometimes that’s easy, as when you are applying for a job with a large corporation; sometimes it’s harder. In any case, there is a ton of information you can glean about the questions you are going to get asked simply from the company’s work environment.

Focus on a few important items:

  • Do a search for the company on the Internet and see if there is any technical documentation listed – contributions to open source projects, forum posts from users with a domain email address, news articles about adoption of particular technologies
  • Learn and use (if possible) the products of the company. If you apply for a job at Google, don’t use a Yahoo! email address. If you apply for a job at Skype, make sure your phone number is a Skype number. 
  • Look at the technical staff listed on the web site and perform a search for them. That’s most useful with small companies and gives you a general idea of the technical background you can expect. 
  • Look at the job description again and again. It will not only tell you what technical requirements are necessary, but also how they are valued. Does the description list “source code control” generically, or is a particular SCCS listed? The difference is that between someone who may not care all too much and someone that wants you to hit the ground running.

If you want to see how this works, just go to a big company and look up one of their job listings. It will tell you a lot more about their engineering environment and priorities than anything else you could find.

8. Missing the Interviewers’ Skill Level

When your interviewer introduces h(im/er)self, when a question comes over, when a comment or correction is made, you gain valuable information about the interviewer. Some of the clues are human in nature – what mood is the interviewer in, are there time constraints, etc. From a technical perspective, the most important information is the skill level of the interviewer.

You have to try, as far as you can, to match the skill level of the person in front of you. If you detect a guru, try to speak as fluently and technically as possible. Throw in anecdotes of technical nature, pile on relevant and obscure acronyms, try to find something the interviewer doesn’t know that you do.

If the interviewer is technically unskilled, you gain no points by being too hard to understand. You can use any gobbledygook and achieve the same results. You do gain points, though, for explaining things plainly and not overly complicated. Don’t dumb things down – just make them clear.

At one of my jobs, I was interviewed by Marshall T. Rose, of IETF fame. We were on the phone for about five minutes and I got the job. How? He dragged me into a conversation about arcane details of HTTP and Tcl internals, things I knew extremely well. We conversed at the 100,000 foot level, and he really didn’t need a lot of information to know that I could do the job.

At a different job, I was interviewed by the VP of Marketing, who was acting Head of Engineering. He asked me about login processes and we started chatting about best practices in the login/registration process. I explained why things are done a certain way, why things are not done a different way, and what ways are particularly onerous on users. For instance, I told him, any restrictions on characters in a password indicate that the password is stored in a database in cleartext – a huge no-no. The man had an a-ha! moment, I got the job.

9. Ignoring the Basics

I am always shocked when people apply for senior level positions and don’t know the most basic things about their field of work. Yet it’s something that happens extremely frequently.

I’ll give you an example: HTTP is a text-based protocol. Requests and responses are formatted as text, newline delimited, with a “HEADER” colon “VALUE” format. The separator between headers and payload is a double newline. It’s all extremely basic, and is both what’s good (easy to debug) and bad (easy to forge) about HTTP. You can easily telnet into an HTTP server and send a well-formatted request and get a full response, both as text you could edit in vi.
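If you have never tried it, this is roughly what such a hand-typed session looks like (the host name, path, and cookie value here are, of course, made up):

```text
$ telnet www.example.com 80
GET /index.html HTTP/1.1
Host: www.example.com
Cookie: session=abc123

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 42

<html>...</html>
```

Everything, including the Cookie line, is plain text you typed yourself – which is exactly what makes the protocol so easy to debug, and so easy to forge.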

You will not believe how many people I interviewed who didn’t know cookies are headers in HTTP. The question is important, because many people do not understand why it is so easy to forge cookies, and hence why they cannot be relied upon.

Turns out object-orientation is a problem in this case. Many Java developers in particular have never had to concern themselves with Cookie headers. They never see them, as they are processed by the servlet engine/application server. They magically show up as Cookie objects.

It’s fine for a junior developer to miss the basics. You will learn them eventually. But there are a series of things you have to expect of yourself after a while – and one of them is that you need to know how things work internally. Not just the encapsulation of them you find in your programming environment.

10. Blimping by the Buzz

I cannot tell you how many times an interviewer asked a question based on the latest discussions on Slashdot. If the topic of the day was The Year of Linux on the Desktop, I would get a question asked about KDE vs. Gnome. If the topic was Firefox, there’d be something about browsers (or HTTP fundamentals).

Of course you cannot expect things to work out that way, and of course you cannot focus your prep work around Slashdot – but you get an easy start on an interview if you are aware of what’s going on in the world around you and are informed about the latest buzz. Besides, it shows you care about the technology world, which is something that is always good in a tech person. [The sad corollary is that if your interests revolve around partying and surfing, you probably don’t want to mention that.]

Also, try to create accounts on all the latest, coolest sites. Play a bit with each one, until you know what you like and don’t like about it. Look for things like speed, response time, technologies used, etc. It takes surprisingly little time to do that, and the benefits are huge. Besides, you cannot imagine what kind of “street cred” you get from a low registration ID number.

Follow the buzz, stay ahead of the curve, and keep current.


Femininity – the Missing Half of Science and Technology?

I am a man, in the most stereotypical way imaginable. I suffer from all the symptoms of the condition – the hair slowly starting to grow where it shouldn’t, the quick temper ready to flare up for virtually no reason, but most of all for the way I think.

You could possibly try to call me sexist for saying this, but there seem to be marked differences in the way the male and the female brains work. These differences seem to relate to evolutionary advantages, and they seem to confirm a stereotypical notion of gender roles in the incipient human community. I am at the butt end of evolution, but I can see how we got to me.

Let’s start with the unproven assumption that men were the hunters and foragers, while women were the nurturers. Let’s add the assumption that humans, like many primates, were naturally inclined to form societies. What does that yield?

The males’ main task, hunting, requires focus and concentration. It also requires strength and persistence, but those are bodily traits I don’t care about right now. For the hunter, it is imperative to lock on the prey and to ignore all else. For the hunter, the main rational skill is the ability to find a path to connect the weapon with the prey.

The females’ main task, creating a society, requires a lot more intellectual skill. It requires the balancing of self-interest with the self-interest of others and the interest of the society as a whole. It requires the ability to take the needs of others into account, to predict how far you can go, and to generate a harmonious whole. You win, as a society builder, if you can get others to do things for you without their endangering the welfare of all for you.

Fast-forward a few scores of thousands of years, and you get to the modern age. The age of science and technology. Two fields that are predominantly male, even now – after the Sexual Revolution and Women’s Liberation. It’s a mind-boggling mystery.

Or is it?

You see, I recall my days in college. I majored in Theoretical Physics in Aachen, Germany. A great school, if a little old-fashioned. What shocked me was that after we did our B.S. (I love saying that!), I realized that all my friends had left. About 50% of the class hadn’t made it through the exams, but I wondered why all my friends would be affected. Then I started noticing that my friends reported they had moved on to something else, that physics was not for them. One started playing guitar in a band, another became a journalist, a third had gone on to teach high school kids.

None of my friends had gone on to another technical field. All of them declared science “too dry.” Which I found interesting, because science was not dry to me. Not at all.

I did realize, though, that I had surrounded myself with creative people with lots of imagination. I also realized that they had buckled under the strain of exams, which are mostly there to weed out people that don’t have the ready ability to recollect facts in a short time frame.

Now, the assumption is that you don’t recollect facts because you didn’t learn enough. But there just are people that are not good at recollecting facts, no matter how much they learn. Of course, lack of focus would prevent you from memorizing facts. Being creative, on the other hand, almost presupposes lack of focus. Was I onto something?

Another incident happened shortly thereafter. I was finishing up my favorite class of the year, Theoretical Mechanics, and the last assignment was to compute the relativistic modification of the perihelion of Mercury. If you don’t know physics, just ignore what that is. The issue at hand was that I sat down with the equations in hand and solved the problem as if no physicist had ever looked at it before. That was just foolish: Einstein had solved the equation, and his result was one of the first experimental confirmations of the Theory of General Relativity.

Needless to say, everybody else had taken the solution from the text book and handed it in. When we received our grades, I had 0 out of 5 points – complete failure. I was annoyed, since I had spent so much time solving the problem. I looked at the paper, and interestingly it was completely unmarked, except for the big, red 0/5 up front and the comment, “I don’t have time to find the mistake in this pile of crap, but it’s all wrong,” at the end.

Well, I wasn’t going to give up 5 points for nothing, so I went to the TA and asked. He sighed, looked annoyed, and asked the class whether they wanted to talk about it. He misjudged them (after all, we had gotten rid of the artsy and creative types) and we went on to look for the mistake. After an hour, we realized that I hadn’t solved the problem wrong at all: at the point where Einstein used a first-order approximation, I had gone on to second order. Technically, my solution is more accurate than Einstein’s – although nobody cares, because the first-order approximation is more than accurate enough.
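For the record, the first-order result in question – the per-orbit advance of the perihelion – is:

```latex
\Delta\varphi = \frac{6\pi G M}{c^2\, a\, (1 - e^2)}
```

where $M$ is the mass of the Sun, $a$ the semi-major axis, and $e$ the eccentricity of Mercury’s orbit; for Mercury it works out to the famous 43 arc seconds per century.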

That left me with a strange impression. I went to the library, where they kept copies of Einstein’s original published papers. I read the articles about relativity, and I was shocked. There was virtually no math in there.

Now, this is not about the math being missing – it was all well-known. It’s that Einstein did something totally different from what other theoretical physicists do: instead of going from a premise to a consequence, he simply explained differently what was known already. He took something universally known (at least in the case of Special Relativity) and explained it in a way that made sense, no matter how unintuitive it was at first.

In short, he had taken a creative approach to the idea of science. He was the opposite of my colleagues, who were all about the method.

This continued to puzzle me, until one day I realized that mathematical and theoretical papers are all styled in a particular format: they set out a chain of reasoning to get to a conclusion. That’s what is called a proof. Typically, you start out by saying what you want to prove, and you go on proving it. You go step after step (theorem after theorem and lemma after lemma) until you have enough evidence to make a final statement.

That’s all nice and good. It has, though, a fundamental flaw: it teaches you nothing about how you came up with the thing you want to prove in the first place. It’s as if you took a camera to record how you climbed down a mountain, but had turned it off while going up.

And that’s the problem with the male-dominated fields of science and technology: they focus on the male part of the process, the linear, follow-through, goal-oriented, focused proof; they teach nothing about the female part of the process, the creative, intuitive, non-linear, random idea. The issue is of enormous importance, as is easily seen in the case of Einstein’s theory: it took decades for the world of physics to come up with an explanation of the phenomenon.

The female part of science and technology is frequently pooh-poohed within the field, too. It is common for people to make fun of a hypothesis because it’s unproven. Even with hypotheses that have been around for centuries and have stood the test of time, one feels a tinge of ugliness and disrespect. The idea, I think, is that, “Everybody can come up with a hypothesis. It’s the proof that makes it science.”

That is, of course, true – but it doesn’t give the hypothesis enough credit. More accurately, it doesn’t give those that come up with good hypotheses enough credit.

Some of the best science in the world has remained a hypothesis for the longest time, and the scientists who came up with the original ideas are some of the best minds in the field. Riemann, for instance, came up with his famous hypothesis, unproven to this day. My own thesis was about the Kaplan-Yorke conjecture – a brilliant piece of unproven theory.

So, what’s the point? I think I derive two important conclusions from my thinking:

1. Science and technology need to focus on giving more information not just about how one proves or implements something, but also about the fuzzier aspects of how one came up with the idea in the first place. That’s going to be a problem for the scientific and technology communities, because we tend to be embarrassed to reveal our thought process.

2. The study of science and technology needs to change. Receiving a degree of any form needs to include a part that tests the creativity of the individual. Those tests are not as easy to administer in a standardized fashion, but they are crucial to the success of science and progress. In particular, multiple choice tests are the worst enemies of creativity in the current methodology of study.


The Role and Importance of Quality Assurance (QA)

There is a moment when the young and enthusiastic learn that seat-of-the-pants development is quick but eventually leads to catastrophe. You can tell which stage engineers are at by asking them what they think of QA: if they think it’s an occupation for the lesser divinities of programming, they aren’t there yet; if they have enough experience, they will think of QA engineers as demi-gods whose verdict makes and breaks months of coding.

Having been at this for decades, I am of course a very, very strong proponent of mandatory QA. To me, this last step in the development process fulfills three main goals:

  1. Interface stability and security: Making sure that the code does what it is supposed to do, especially in boundary conditions that developers typically overlook. The most common scenario is that of empty data (null pointers, etc.) where code assumes there to be an object, but testing code for SQL injections is another, perfectly invaluable example. This has nothing to do with the functionality of the code, but with its ability to behave properly in unusual conditions.
  2. Performance and stress testing: Checking how the code behaves under realistic scenarios and not in the simple case the developer faces. Instead of 5 users, make 500,000 run concurrently on the software and see what it does. Instead of 100 messages, see what the system does with 100,000,000. Instead of running on a souped-up developer machine with a 25″ display, look at your software from the point of view of a user with a $200 netbook.
  3. User experience and acceptance: Ensuring the flows make sense from the end user’s perspective. Put yourself in the user’s shoes and try performing some common tasks. See what happens if you try doing something normal, but atypical. For instance, try adding an extension to a phone number and see whether the software rejects the input.

We have gone a long way towards understanding how these three goals help the development process. What is just as important, though, is to see how they (a) have to be implemented, and (b) what the downsides are of not implementing them.


The modern trend is towards implementing interface tests at the developer level. The basic idea is that there is a contract between developers, and that each developer has to write a series of tests that verify the code they wrote actually performs as intended. The upside is that the code will do as desired and that it is fairly easy to verify what kind of input is tested. The downside is that the testing code almost doubles the amount of programming that needs to be done.
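As a sketch of what such a developer-level contract test can look like – the parsing function and its contract here are hypothetical, and no test framework is assumed:

```java
// A minimal, framework-free developer test: verify the happy path
// and the boundary conditions the contract promises to reject.
public class ParserTest {
    // Hypothetical code under test: parse a non-negative integer.
    static int parsePositiveInt(String s) {
        if (s == null || s.isEmpty()) throw new IllegalArgumentException("empty input");
        int n = Integer.parseInt(s.trim());
        if (n < 0) throw new IllegalArgumentException("negative input");
        return n;
    }

    public static void main(String[] args) {
        // Happy path
        if (parsePositiveInt(" 42 ") != 42) throw new AssertionError("expected 42");
        // Boundary condition: null must be rejected, not crash later
        try {
            parsePositiveInt(null);
            throw new AssertionError("null must be rejected");
        } catch (IllegalArgumentException expected) { }
        System.out.println("all tests passed");
    }
}
```

In a real project the same assertions would live in a JUnit (or similar) suite, but the contract idea is identical.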

Agile methods, with their quick iterations, are particularly emphatic about code testing. Each developer is required to provide testing code commensurate with the main deliverable. At first, it seems odd that people willing to throw out code regularly would be so adamant about testing it. Closer inspection, though, shows that if there is no complete set of tests, the time saved by not implementing them is paid back in having to find and remove inconsistencies and incompatible assumptions.

Stress and performance tests usually have to be separated from the interface tests, because they require a complex setup. Performing a pure stress test without a solid data set leads to false negatives (you think the code is OK, but as soon as the real data is handled, it breaks where you didn’t think it would). A good QA department will have procedures to create a data set that is compatible with the production data and will test against it.

There are two goals to this kind of test: (a) characterization and (b) profiling. Characterization tells the department how the code performs as load increases. Load is a function of many factors (e.g. size of database, number of concurrent users, rate of page hits, usage mix) and a good QA department will analyze a series of these factors to determine a combined breaking point – a limit beyond which the software either doesn’t function anymore or doesn’t perform sufficiently well.
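A toy characterization run might look like the following sketch; the “operation” is a stand-in for a real request, and the load levels are arbitrary:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class LoadSketch {
    // Stand-in for a real request: burn a little CPU.
    static void operation() {
        long x = 0;
        for (int i = 0; i < 100_000; i++) x += i;
        if (x < 0) System.out.println(x); // keep the loop from being optimized away
    }

    public static void main(String[] args) throws Exception {
        // Ramp the number of concurrent "users" and record elapsed time;
        // in a real characterization you would also vary data-set size,
        // hit rate, and usage mix to find the combined breaking point.
        for (int users : new int[]{1, 10, 100}) {
            ExecutorService pool = Executors.newFixedThreadPool(Math.min(users, 16));
            long start = System.nanoTime();
            for (int i = 0; i < users; i++) pool.submit(LoadSketch::operation);
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            System.out.printf("users=%d elapsed=%dms%n",
                    users, (System.nanoTime() - start) / 1_000_000);
        }
    }
}
```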

Profiling, on the other hand, helps the developers. The software profile gives the developers an idea of where the code breaks down. Ironic, considering that a software profile is a breakdown of where the processor spent its time. Profiling needs very active interaction between QA and development, but it is a very powerful tool for both.

Finally, user acceptance tests are performed by domain experts or using scripts provided by domain experts. This is the most delicate function of QA, because the testers become advocates or stand-ins for users. In this capacity, they test how the software “feels”. They have to develop a grasp of what the user will ultimately think when faced with the software.

It is here that the tension between developers and testers is at its worst. The attitude of many developers is that the software performs as intended, and they are frequently upset when a tester complains about something immaterial, forcing them to do a lot of work for something that seems minor, like splitting a page in two or reversing the logic with which information is gathered.

It is also here that the engineering manager has to be most adamant and supportive of the testers. Ultimately, the users will perform the same tasks many, many times. To them, an extra click may translate to wasted hours on a daily basis, something that would infuriate anyone.

Not Implementing QA

What is the downside of not implementing Quality Assurance? If you are a cash-strapped, resource-strapped Internet startup, the cruel logic of time and money almost forces you to do without things like QA, regardless of the consequences. So let’s look at what happens when you don’t follow best practices.

First, you can easily do without unit tests in the beginning. I know, you wouldn’t have expected to hear that from me, but as long as your application is in flux and the number of developers is small, unit tests are very inefficient. You see, the more you change the way your application operates, the more likely you are to have to toss your unit tests overboard. On the other hand, the fewer developers you have, the less they are going to have to use each other’s code.

Problems start occurring later on, and you certainly want to have unit tests in place after your first beefy release. What I like to do is to schedule unit test writing for the time period right after the first beta release – the time allocated to development is near nothing, and you don’t want to push the developers onto the next release, potentially causing all sorts of issues with development environments out of sync. So it’s a good time to fill in the test harnesses and write testing code. Since the developers already know what features are upcoming, they will tend to write tests that will still function after the next release.

Second, performance tests are a must before the very first public release. As an architect, I have noted how frequently the best architecture is maligned because of a stupid implementation mistake that manifests itself only under heavy load. You find the mistake, fix it, and everything works fine, but the window between discovery and fix throws you off.

Performance and scalability problems are very hard to catch and extremely easy to create. The only real way to be proactive about them is to do performance and load testing, and you should really have a test environment in place before anything goes public.

There are loads of software solutions that allow you to emulate browser behavior, pretending to be thousands or millions of users. Some of them are free and open source, many are for-pay and extremely expensive. Typically, the high-end solutions are for non-technical people, while the open source solutions are designed by and for developers.
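
To make the idea concrete, here is a minimal load-test harness sketched in Python. The `handle_request` function is a hypothetical stand-in for a real HTTP call to your application; the point is the shape of the harness: spawn concurrent users, collect latencies, report statistics.

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def handle_request(user_id):
    """Stand-in for a real HTTP request to your application (hypothetical)."""
    start = time.perf_counter()
    time.sleep(0.01)  # simulate server work; replace with an actual request
    return time.perf_counter() - start

def load_test(num_users, requests_per_user):
    """Run many concurrent 'users' and summarize the observed latencies."""
    with ThreadPoolExecutor(max_workers=num_users) as pool:
        futures = [pool.submit(handle_request, u)
                   for u in range(num_users)
                   for _ in range(requests_per_user)]
        latencies = [f.result() for f in futures]
    return {
        "requests": len(latencies),
        "mean": statistics.mean(latencies),
        "p95": sorted(latencies)[int(len(latencies) * 0.95) - 1],
    }

stats = load_test(num_users=20, requests_per_user=5)
```

Even this toy version surfaces the question real tools answer for you: what happens to the 95th percentile when the number of concurrent users doubles?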

Finally, lack of final acceptance testing will have consequences mostly if your organization is not able to cope quickly with user feedback. In an ideal world, you would release incremental patches on a frequent basis (say, weekly). Then you can take actual user input and modify the application accordingly.

The discipline required to do this, though, is a little beyond most development shops. Instead, most teams prefer to focus on the next release once one is out the door, and fixing bugs on an ongoing basis is nobody’s idea of a fun time. So you are much better off putting in some sort of gateway function that has a final say in overall product quality.

Many engineering teams have a formal sign-off role. Unless the responsible person in the QA department states that the software is ready for consumption, it isn't shipped. I have found that to be too constricting, especially because of the peculiar form of tunnel vision that is typical of QA: since all they ever see of the software is bugs, they tend to think of the software as buggy.

Instead, I think it more useful to have a vote on quality and release: in a meeting chaired by the responsible person in QA, the current state of the release is discussed and then a formal vote is taken, whose modality is known ahead of time. Who gets to vote, with what weight, and based on what information – that's up to you. But putting the weight of the decision on the shoulders of a person whose only responsibility is detecting issues is unfair.


Security Matters

When I was little, I recall watching this popular science program in which Peter Ustinov popularized the theory of relativity. There was a rocket, a man on the rocket, a launch of the man and the rocket, and a transmission from the really fast man on the really fast rocket. The man on the really fast rocket saw earthlings slow down. Conversely, the earthlings saw the man talk really fast. Makes sense?

No, it doesn’t. A cursory understanding of relativity tells you that the other’s time must slow down or accelerate for both observers, not just for one. So both the man on the rocket and the earth station would have observed an apparent slowdown in the other’s time. Of course, that seems confusing, since once the man on the really fast rocket returns to earth, time will have passed much faster on earth than for the man – but the reason for it is not speed, but acceleration.

This intro serves to explain something that has been bothering me for a while: the way people misunderstand information security concepts and continually use the wrong thing for the right purpose. It's really not hard, since there are only a very few and very distinct concepts – yet people get them wrong all the time. It's a little as if people treated "security" as a one-size-fits-all umbrella, where doing something secure means doing everything under the umbrella.

I was reading this Slashdot article this morning. Apparently, people were using TOR routing to send confidential information back and forth, not realizing that TOR anonymizes connections (that is, correlation between information source and destination) but not content. Anyone with access to a TOR node can snoop all the data passing through it, and if the data is not protected, it’s fair game.

So, what are the fundamental things you can do in security? Here is a partial list:

  • Protect content from eavesdropping (encryption in transit)
  • Protect content as stored (encryption at rest)
  • Ensure the content received is the content you sent (signing and private key encryption)
  • Ensure only the intended recipient can read the content you sent (public key encryption)
  • Ensure nobody knows sender and recipient are talking to each other (anonymization)
  • Ensure the content you received is really from the sender you expect (PKI, certificates)
  • Ensure the person connecting with you is who they say they are (login)
  • Ensure the connection made is from a person you have logged in (authentication)
  • Ensure the person who is requesting an action is allowed to perform it (authorization)

At first, all of this may seem to be one and the same thing, but it really isn't. If you try to accomplish one of the tasks on the list by applying the solution to another (for instance, anonymizing a connection and expecting the content to be protected), you gain nothing and most likely make things worse than if you had done nothing at all.

Protecting Content – Encryption

When you don't want other people to see the content you are sending or receiving, you encrypt it. Encryption comes in many forms, but the most important distinction for us is whether you want to encrypt something permanently or only during transmission.

Here you have to understand that "transmission" is a technical term and means "exchange of messages between two end points." An email, for instance, is not a "transmission," because it is handled by as many intermediate "end points" as necessary. Hence, to the technical user, an email is something that needs to be treated as requiring permanent encryption.

Now, how do you encrypt your message? You have a lot of options – some better, some worse. In general, you want to think of encryption as a lock box into which you put the message. The better the box, the safer the message. The lock is also really important: if it is easy to replicate or fake the key, your safety is gone.

The nerd distinguishes between two types of encryption: symmetric, in which the same key is used to lock and unlock the box, and asymmetric, in which different keys do the same trick. In general, symmetric encryption is easier to handle, but asymmetric encryption is much more powerful.

Think of the standard you use as the lock box itself, and the particular password as the key to the box. Sometimes the password is called a passphrase or key, or, more generally, the secret. It has the same function as the key to the box: it ensures that only the person who owns this particular box can open it.

Typically, you do not have to consciously choose a standard. The software you use to encrypt data will choose an appropriate standard, and if you keep it up to date, it will also change standards as security improves.
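
To see why "the same key locks and unlocks" matters, here is a toy symmetric cipher in Python: a keystream derived from a shared secret, XORed with the message. This is a teaching sketch only, not a real standard; actual software would use something like AES.

```python
import hashlib

def keystream(secret, length):
    """Derive a repeatable byte stream from the shared secret (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(secret + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def xor_crypt(secret, data):
    """Symmetric: the exact same call both encrypts and decrypts."""
    stream = keystream(secret, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))

message = b"meet me at noon"
ciphertext = xor_crypt(b"shared-password", message)
plaintext = xor_crypt(b"shared-password", ciphertext)  # same key unlocks the box
assert plaintext == message
```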

Asymmetric Encryption – Private and Public Keys

With symmetric encryption, both end points use the same key to encrypt and decrypt data. For instance, you would use a password on a ZIP file, and the recipient uses the same password. In asymmetric encryption, though, the encryption occurs with one password, the decryption with another. How is this possible? Well, the two passwords are not chosen randomly. Instead, they are generated together as a mathematically matched pair: what one encrypts, only its counterpart can decrypt.

Just like in a Sudoku puzzle, where you have plenty of leeway in placing the numbers but ultimately face strict constraints, in asymmetric encryption you cannot choose just any random pair of passwords. You actually cannot even choose one of the two. Instead, the two passwords are generated for you, and they are made up of gobbledygook that only security software is really happy with.

The two passwords, or keys, have slightly different functions. One of the two can be used to generate the other, but not vice versa. Because of this, the first one is more valuable and must be kept safe at all times. The other has no importance independent of the first, and you can handle it in a much less strict way. That's how they got their respective names: the first one is called the private key, the other the public key.

Private keys are so important that you usually encrypt them with symmetric encryption to ensure nobody can use them even if they get to them – at least not use them quickly. So, when you generate a key pair, the software will ask you to assign a passphrase to the private key (not to the public key).

While the asymmetry is fundamental, there is one thing in which the key pair is symmetric: what is encrypted with either key can only be decrypted with the matching other. Since only you have the private key, that makes for all sorts of interesting applications.
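
Here is a toy illustration of that matched-pair property in Python, using RSA with absurdly small primes. Real keys are thousands of bits long; this is for illustration only, never for actual use.

```python
# Toy RSA with tiny primes; the math is real, the key size is a joke.
p, q = 61, 53
n = p * q                  # the modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent
d = pow(e, -1, phi)        # private exponent, derived from e and phi

def crypt(value, exponent):
    """Both encryption and decryption are the same modular exponentiation."""
    return pow(value, exponent, n)

message = 42
# Encrypt with the public key, decrypt with the private key:
assert crypt(crypt(message, e), d) == message
# And the other way around, which is exactly what signing exploits:
assert crypt(crypt(message, d), e) == message
```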

1. Sending a Message Only You Can Read

You are the only one with the private key. Anyone that encrypts a message with your public key ensures that nobody can read it but you. Not even they can read it once it’s encrypted!

2. Sending a Message That’s Certainly Yours

When you send a message encrypted with your private key, anyone with the public key can decrypt it. But since they have to use your public key, they know the message came from you, since only you can encrypt with the private key.


Signing Content

As we just saw, when you send a message encrypted with your private key, you essentially state that the message came from you. What if you want anyone to read the message, but also want to ensure they know it's really from you?

Decades of unreliable connections have left us with the concept of a checksum. That's a number that is computed on the content of a message/file and is a summary or digest of the message itself. The message can be any length; the digest is typically only a few bytes. Its only function is to tell you whether the message was received accurately.

To give you a rough idea of how that works, imagine that you take the value of each letter of the alphabet (A = 1, B = 2, etc.) and add them all up. You tack the sum onto the end of a message, and that's your digest. Then the digest of this paragraph is 7997.
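
That toy checksum is easy to sketch in Python (a real system would use a cryptographic hash such as SHA-256 instead):

```python
def toy_digest(text):
    """Sum the alphabet position of every letter (A=1, B=2, ...); toy checksum only."""
    return sum(ord(c) - ord("a") + 1 for c in text.lower() if c.isalpha())

assert toy_digest("abc") == 6   # 1 + 2 + 3
```

Note how easy this digest is to forge: two messages with the same letters in a different order collide. That is precisely what cryptographic hash functions are designed to prevent.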

Now, imagine you create a digest of your message, but you encrypt it with your private key before tacking it onto the message. Suddenly, only you could have created that digest, and anyone with your public key can decrypt and check it. This way, they will know that the message is yours and that it was received as sent. That's quite brilliant, because it allows you to send something you don't mean to hide from view, while making sure the people who care have a good way of verifying it's really from you.
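
Here is a sketch of that sign-and-verify flow in Python, combining a real hash (SHA-256) with a toy RSA key pair. The tiny key is for illustration only.

```python
import hashlib

# Toy RSA key pair (tiny primes, illustration only): modulus, public and
# private exponents. A real pair would be generated by security software.
n, e, d = 3233, 17, 2753

def sign(message):
    """Hash the message, then encrypt the digest with the PRIVATE key."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(digest, d, n)

def verify(message, signature):
    """Recompute the digest, decrypt the signature with the PUBLIC key, compare."""
    digest = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    return pow(signature, e, n) == digest

msg = b"release notes v1.2"
sig = sign(msg)
assert verify(msg, sig)
```

The message itself travels in the clear; only the digest is encrypted, which is why this proves authorship without hiding anything.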

Certificates and Chain of Trust

Now, imagine you wanted to send something to someone who doesn't have your public key. You want it to be encrypted or signed; in either case you want them to know it's from you and only from you. Well, to do that, you would have to send them your public key, no? Easy!

But wait: if you send them your public key, how do they know it's your public key? Imagine a rogue government that listens to message exchanges and inserts its own public key whenever a different one is detected. Once it does that, it controls all encrypted traffic. And it doesn't even take a government – imagine the WiFi network at your coffee shop!

Fortunately, there is a solution to that, in the form of certificates. The idea here is that there is someone you trust in the world, because they are known to be trustworthy and because they have a process in place that ensures their word is worth your trust. You get their public key and whenever they send you a message encrypted with the corresponding key, you know it’s them.

Imagine now they sent you a message saying something like, "Yes, I know XYZ, and if they tell you abc is their public key, then that's right." Why, then you could trust the public key "abc"! Note that you have to extend a lot of trust here: you have to trust the verifier both to keep their list of good public keys safe and not to snoop on the messages you send using "abc". After all, they told you "abc" is good!

In practice, that’s done all the time without your knowing it. Browsers connect to secure sites (the ones whose URL starts with https:// instead of http://) by doing just that: the site shows a certificate that has some basic information, including someone trustworthy that can verify the information on the certificate. That someone is trusted by someone else, who is trusted by someone else again, who is trusted by someone you trust. In the end, you trust that first certificate because of all the other people’s/sites’ trust.

When your browser connects to a secure URL, it immediately demands a certificate. Once it verifies all the data and the certificate chain, it talks to the server and trusts it. Browsers usually display that in a very non-obvious way, for instance by showing a little lock at the bottom of the screen. If you click on the lock, you'll probably get the certificate information on screen. When you look at it, you'll notice that the fields in there all make sense now.
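
The chain walk itself is simple to sketch. Here is a toy Python model with hypothetical names, keeping only the "who vouches for whom" part and leaving out the keys and signatures a real chain carries:

```python
# A toy certificate store: maps a certificate's subject to the issuer that
# vouched for it. All names are hypothetical.
issued_by = {
    "shop.example.com": "Regional CA",
    "Regional CA": "Intermediate CA",
    "Intermediate CA": "Root CA",
}
trusted_roots = {"Root CA"}

def chain_is_trusted(subject):
    """Walk from issuer to issuer until we reach a root we already trust."""
    seen = set()
    while subject not in trusted_roots:
        if subject in seen or subject not in issued_by:
            return False   # a loop, or nobody vouches for this certificate
        seen.add(subject)
        subject = issued_by[subject]
    return True

assert chain_is_trusted("shop.example.com")
assert not chain_is_trusted("unknown.example.org")
```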


Anonymization

Sometimes it's just as important to know that two parties talked as it is to know what they said. You probably remember how big a deal it was when rumors surfaced that an Al Qaeda operative had been in secret talks with the Iraqi government during the build-up to the War in Iraq. It didn't really matter what they had said: the mere fact that they had talks was the story.

Frankly, on the Internet the problem is less one of spies and terrorists and more one of not having certain parties know you are using certain services. Maybe you don't want your Internet provider to know you are surfing for porn, or maybe you don't want your government to know that you are reading up on a massacre it perpetrated.

Anonymization provides that level of protection. It takes a message and bounces it around so that neither the end point nor anyone untrusted in the middle knows where it came from.

Today, there are two main forms of anonymization available: trusted proxies and TOR. The former is mostly used to bypass restricted networks, the latter… oh, well, look it up yourselves, I am not going to get into a controversy here.

Basically, in both cases the idea is to connect through an encrypted tunnel. Your requests and the responses may be in the clear at either end, but the tunnel through which they flow is encrypted.

Let’s consider the easier case of a trusted proxy, since it is conceptually the same as in the other case. Assume you are in a country that doesn’t allow you, I don’t know, to search on Google. It does, though, allow you to connect to secure sites. Now, since the connection to secure sites cannot be monitored, what if there was a secure site that just goes out and searches for you on Google?

Well, that’s what a secure proxy does: you connect to it, and it connects for you to the site you really want to go to. When the response comes back, the site sends it back as its very own response, but through the secure channel that only it and you can penetrate.

The downside to this is that the proxy will see everything you are sending and receiving, including passwords and content – whatever you didn’t want others to see. That’s why you need to establish that the proxy is trusted: if it betrays you, it will know everything there is to know about you.

Lastly: Login, Authentication, Authorization

One of the questions that comes up regularly is, “Why is it that the server has to prove who it is with a certificate, and the user doesn’t?”

Indeed, the web would be a much better place if client certificates were required. Unfortunately, when the whole security apparatus came up, trusted certificates were expensive and really hard to come by, so there was no way to ensure everybody used one. Still more damningly, it is hard to set up a server that authenticates using certificates, so even if you had one, it would be of little use.

Instead, the web moved to a model in which you first prove who you are by providing credentials. You essentially come to a web server as an unknown entity, but you provide a set of required data items and the web server accepts you. Usually, that’s a user name and password combination, but some sites request more (like a rotating security question).

The point of this login phase is to establish that you are who you say you are and to provide you with “something” that tells the servers on the next try that it’s you. This is to avoid having to send user name and password back and forth all the time.

When you connect to the server again, for instance because you clicked on a form, you will present this “something” (usually a combination of encrypted cookie and encrypted form field) and the server makes sure that those credentials are valid. Notice that we replaced something you provided (user name and password) with something the server provided. Since the server owns the new credentials, it can do with them as it pleases, including declaring them invalid at any point.
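
One common way to implement such a server-owned credential is an HMAC-signed token. Here is a minimal Python sketch; the secret and user names are hypothetical, and a real system would also embed an expiry time in the token.

```python
import hmac
import hashlib

SERVER_SECRET = b"rotate-me-regularly"   # hypothetical; never hardcode a real secret

def issue_token(username):
    """After login, replace the user's password with a server-owned credential."""
    mac = hmac.new(SERVER_SECRET, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}:{mac}"

def check_token(token):
    """On every later request, verify the credential really came from us."""
    username, _, mac = token.partition(":")
    expected = hmac.new(SERVER_SECRET, username.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac, expected)

token = issue_token("alice")
assert check_token(token)
assert not check_token("alice:forged-mac")
```

Because only the server knows `SERVER_SECRET`, only the server can mint valid tokens, and it can invalidate them all at once simply by rotating the secret.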

Every web service must include this verification step on any request that touches user data. Failure to do so causes the worst class of security issues, which gives this step its fundamental importance in web security. It's called authentication.

Once you are authenticated, the server still has to decide whether you can do something. Some users may be allowed to do things that others are not allowed to, for instance perform administrative tasks. Sometimes that’s handled by creating separate applications for different tasks, and have separate user accounts on each service. But more and more frequently, applications are merged for ease of development and access control lists are used. This final step is called authorization.
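
An access control list check can be sketched in a few lines of Python; the users and actions here are hypothetical.

```python
# A minimal access control list: per-user sets of allowed actions.
acl = {
    "alice": {"read", "write", "admin"},
    "bob": {"read"},
}

def authorize(user, action):
    """Authentication told us WHO this is; authorization decides WHAT they may do."""
    return action in acl.get(user, set())

assert authorize("alice", "admin")
assert not authorize("bob", "write")
assert not authorize("mallory", "read")   # unknown users get nothing
```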


At this point, you should understand the fundamental differences between the terms used in security and should be able to make informed choices among a variety of options. Please comment with omissions and requests for clarification.


Web 3.0

People have been thinking about the next generation of the Web ever since Web 2.0 landed and found its incarnation in Friendster, MySpace, Facebook, and Twitter. No conclusive Web 3.0 road map has ever convinced me, though, so I started thinking about my own.

I started looking at what made Web 1.0 and then Web 2.0 and decided that the trend could be extrapolated from there. The approach, I thought, should be Hegelian: every manifestation of the Web should solve a problem, create a new problem, and find its own solution in the next one.

What made Web 1.0? The problem we were having was information, mainly its availability anywhere. People were paid for information back in the day. There was a huge industry of information gathering, sifting, sorting, and selling. Web 1.0 was all about that information – easier ways to distribute it, easier ways to connect with it, easier ways to share it.

If the problem was information and access to it, the solution was given by new ways of consuming it. What was available in print was now online. What was a catalog became an e-commerce site. Newspapers made themselves obsolete, Craigslist stole the classifieds, and eBay the for-sale signs.

The process is not linear. Long after we started seeing the first issues with Web 1.0, we still had the old economy in places, especially where there was regulation or a monopoly. The best examples are probably the banking and insurance sectors, as well as real estate. This is an important realization, because the pressures generated by these hold-overs are going to push towards Web 3.0.

The antithesis (in Hegelian terms) to the access/consumption thesis became the flood of information: spam, search engine rigging, low-quality sites and information polluting the value of the Web. We now had access to a lot of information, but there was no quality control. That's still an improvement over no information at all, but the novel problem still needed to be solved.

Web 2.0 sounds like it's about content generation by individuals, and the sites mentioned above are all in that category. But content generation would be the solution to a problem orthogonal to information quality, and it would explain only a tiny fraction of the development of the Web.

Instead, I believe that Web 2.0 is all about relevance. We now had tons of information, and we had to find out what was relevant. Part of it was finding content about people we care about – which is what social media was all about. But the majority of it was sifting through information by quality. That's what Google was all about.

The crown jewel of Web 2.0, though, and the one that points the way to Web 3.0, is neither Facebook nor Google. It’s Wikipedia. Wikipedia is to the 21st century what the automobile was to the 20th, the key invention from which all other developments would derive.

Wikipedia has an obvious Web 1.0 function: it makes content available in massive quantities. The amount of information that can be gathered by a lookup of a keyword is staggering, and the online links make the information infinitely relevant to the task at hand. People that don’t use or don’t trust Wikipedia as a matter of principle in 2010 are not just wrong, they are in the wrong century.

Less obvious are the Web 2.0 implications of Wikipedia. The information on it is hugely relevant for no specific reason. There are no automated algorithms that need to be constantly perfected, as is the case with Google. There are no walled gardens and privileged information, as is the case with Facebook. Wikipedia is relevant because it is.

Nowhere is this more obvious than in the interaction between Google and Wikipedia. In a vast array of searches for non-products, Wikipedia is the first available link or at least near the top. Try Constantinople, lung cancer, string theory: Wikipedia is there to help you with (mostly) accurate information.

What does that have to do with Web 3.0? Well, the problem we increasingly face now is that relevance is starting to be used against us. More and more, relevant information is available, but we cannot shield ourselves from its availability: privacy, identity theft, all the consequences of information availability.

How does Wikipedia point in the direction of a solution? The main revolutionary thing it does is organize people. The organization is phenomenally unexpected, as the same things that make much of the Internet unusable (trolling, faking, lying) are miraculously only a side aspect of Wikipedia life.

At the same time, when we look at the hold-overs from the pre-web era, the information-processing giants in finance and insurance, we see that they continue to broker information and make tons of money with it. The reason is that to make use of the information they have, you have to be organized.

What do I mean? A bank doesn’t do anything but connect people with money to people in need of it. It is an honest broker, and the thing it sells is not money (although that’s what you buy) but the information of who has it and who needs it.

Insurance companies behave in much the same way: they insure by giving a value to a risk, which is something that comes from lots of information. They figure out how likely it is that something happens and give you a premium that will cancel out the risk because it is distributed.

Banks and insurance companies, furthermore, share the same methodologies. Financial firms, for instance, need to insure themselves from the risk of default, which is properly an insurance risk.

Web 3.0 is about organizing the individuals into entities that can replace information brokers. The first inklings of it are already visible: there are micro-banks that pool people into borrowing money, there is Wikipedia with its collaborative approach to content relevance improvement.

What we will see in the future, though, is going to be more powerful: sites that organize people into political movements for specific causes; sites that aggregate and pool decision making; sites that provide for much of the information distribution that we are used to buying right now; finally, sites that are organized by users, instead of being provided by companies.

That's the one thing that Wikipedia didn't get right, and the source of much of the frustration with it: it is run by a dedicated group of people whose opinions do not always match the community's. If you believe in a community generating an encyclopedia, you probably should also believe in the same community running it.


Open Source and Interfaces

One of the things that complicates the development of software in the open source world is the enormous variety of different interfaces you have to deal with. This is eminently an architectural problem: interfaces need to be defined ahead of coding, and if you just start developing your own project, you have no need for uniformity across projects.

Initially, of course, there is no standard and hence no interface. Later, separate projects come up and make a point of having different interfaces, the better to spread and create incompatible ecosystems. That was the whole enmity between KDE and Gnome, or the many different brands of SQL server implementations.

There is nothing wrong with competing projects, especially nowadays. It dawned on people that copying from successful efforts is a good idea, and the availability of source code makes it easy to get not only ideas, but also implementations moved. It so came to be that the outstanding networking code in BSD became the forefather of the networking layers in many UNIX-ish operating systems.

On the other hand, there are plenty of places where work is unnecessarily complicated by incompatible implementations – even within a single framework. Getting these incompatible layers realigned should be an important task for Linux and open source developers.

Instead, each project seems to think that being incompatible is a goal, because it highlights the capabilities of your system. KDE, for instance, has this widget system called Plasma that allows you to embed all sorts of things in your applications (especially the desktop). Only KDE has that, so if you like Plasma, you have to use that. Nice.

On the other hand, KDE also has so-called KIO slaves that are plugins that allow you to connect to sources and destinations that are treated as files. They work fine in pretty much all KDE applications, but nowhere else.

Even when the different technologies put their minds together and come up with a common interface, it seems to be dumbed down just because. Such was the result of the unification of DCOP (KDE) and Gnome’s CORBA into DBUS. At least on the KDE end, the implementation of DBUS interfaces is primitive compared to the former power and glory of DCOP.

The main problem I see is that the open source community is empowered by diversity. For everything, there are dozens of different ways of solving the same problem. Some are good, some are not so good. All of them have one thing in common: they are open.

Interestingly, UNIX had its forte in the small utilities you could combine together to achieve enormous power. In the old days, you could either use a Windows tool that could not be automated or scripted, or put together a pipe of shell utilities that did what you wanted automatically.

The new Linux doesn’t do that any more. It has become entirely a commingling of ecosystems that are independent and happy to be so.

For a different (and better) approach, look at one particular GNU project: GNUCash. It’s a “Quicken clone,” a personal finance manager. Its application logic is separated from the presentation logic, and the KDE people could re-use the app layer in their own KMyMoney project.

From an architectural perspective, you would want to separate application logic from presentation logic. You would also want to separate glue code and user interface from the core application. In particular, you would want to script the glue and interface, so that users can change both as they see fit.

I'll give you an example: one of the things I commonly have to do is copy a line in a shell and paste it as a new command. There was one shell on DOS that allowed me to do that: copy and paste in one shortcut. Strangely, that would be an enormous improvement over the current state – it seems incredible, but in Linux it's quite unpredictable whether the application you are using actually holds the clipboard. So you copy, but when you paste you get some old clipboard content.

KDE has a shortcut functionality that confuses a great many beginners. [Note: that's mostly because every KDE application has a Configure Shortcuts menu item next to Configure Application, which seems stupid – you'd expect the shortcut configuration just to be part of the app config.] The basic idea is to connect the functionality embedded in the application to keyboard interaction. Adding scripting logic to this would be simple (as a matter of fact, some KDE applications embed a scripting engine, QtScript). But, for the most part, scripting is an afterthought.

Instead, all menus, widgets, buttons, user interface elements should be laid out using a scripting engine, and the user should have liberty to rearrange things and logic as she sees fit.

Why? Because that way, someone else can reuse your widget and embed it in their own application. There is no reason to say they shouldn’t – it’s open source, so what they want, they can steal. But it’s far better if they can steal the thing instead of the code.

The same is true in a variety of other cases. In general, the power of an application is directly proportional to its flexibility. The more inflexible an application, the less likely it is to be reused and usable.

Finally, the one huge advantage of the proposed approach is that you could finally take those applications that have outstandingly horrible user interfaces but equally outstandingly powerful functionality and rationalize them. Sometimes I wonder what would happen to Blender, GIMP, or Kdenlive if someone with a little love of UI could go and change them at will.



NoSQL

There is this whole movement afoot trying to declare SQL bankrupt and do without it. Instead of SQL, one hears, there are going to be much better databases in the future. Dozens of projects are floating around, each with a different notion of what "better storage" means, all aiming at being better data stores for the Internet.

Now, it is clear that SQL databases have their supreme annoyances, and the need for reform is clear. What pretty much all NoSQL projects have in common, though, is that they look at the wrong problems and try to solve them with an approach more theological than philosophical or architectural.

Let’s look at the deficiencies of SQL first. There are three main classes of problems:

  1. SQL the language itself
  2. SQL the data store format
  3. SQL database scalability

Each of these classes brings a completely different set of considerations into play – and different reasons why SQL needs to change.


SQL the Language

The main problem I have found with SQL as a language has always been the dynamic, interpreted nature of its command strings. You don't issue orders to a database; you pass it a string, and the database interprets it. As everyone who has dealt with interpretation in Poetry 101 knows, that's dangerous and rarely unambiguous.

The main issue is that it’s very easy to get the escaping rules for content wrong and end up with a query that is not what you meant. In particular, that’s dangerous when you mindlessly send user input to a database, which is precisely what a SQL injection is all about.

The forms of remedy that databases have used so far are meant to mitigate, not to prevent, the issue. I think the problem is that most database developers think of developers who expose themselves to SQL vulnerabilities as fundamentally incompetent, and are not willing to admit that the problem lies in the language itself, which makes such exploits so easy.

The solution to the problem, instead, comes from changing the language itself. Instead of sending a string that needs to be interpreted, the database should require a structure that needs to be executed.
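To make the contrast concrete, here is a minimal sketch in Python with the standard sqlite3 module (table and data are hypothetical): string interpolation leaves the statement open to interpretation, while parameter binding ships the structure and the data separately – which is the spirit of the change proposed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

# Dangerous: the user input becomes part of the string the database interprets.
evil = "' OR '1'='1"
unsafe = conn.execute(
    "SELECT name FROM users WHERE name = '%s'" % evil
).fetchall()
print(len(unsafe))  # the injected clause matches every row in the table

# Safer: the statement is fixed; the input travels as data, never as syntax.
rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (evil,)
).fetchall()
print(len(rows))  # no row has that literal name
```

Parameter binding is still a mitigation bolted onto a string-based language, but it shows what an execution-of-structure interface looks like from the caller’s side.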

SQL the Data Store Format


SQL is a little like the “C” of databases: complete, in that it can do everything you need, but needlessly complicated for the standard things you want to do. In many respects, SQL constructs remind me of the absurd complexity of C “for” statements. When you want to iterate over a collection, it is silly to use a control structure that allows you all sorts of fancy footwork.

There are two immediately obvious issues with SQL:

  1. You can have only a single column of a given name
  2. You cannot use a table as a data type

OK, I probably lost most of you now. What do these two mean?

Imagine your typical application. You have a user table, and it needs to store the user’s email address. Now, users can have multiple email addresses. What do you do in SQL? You have two main options:

  1. You create a column for as many email addresses as you want to allow
  2. You create a separate table for email addresses and link it back to the user table

In the first case, you end up with tables that have columns like EMAIL1, EMAIL2, etc. In the second, whenever you want to add email to your searches, you have to perform a join with a table from which you match the email address by user ID. Something like:
SELECT user.username, email.address FROM user, email WHERE user.first = 'Marco' AND = email.userid;
Notice that you have to do this every single time you want user information, and the email table becomes an adjunct of the user table.

Instead, the rational thing to do would be to tell the database to worry about this crap, and to just allow the user to specify that a particular structure is present multiple times, or that a particular other table contains records that we are interested in.

For instance, there could be a field attribute ‘hasmany’ that allows you to specify that a table has a particular attribute multiple times. When a query comes in for this attribute, all the different values are considered. Instead of searching for an email address in the many fields, like this:
SELECT * FROM user WHERE email1 = '' OR email2 = '' OR email3 = '';
you search in the field email that has multiple possible occurrences.

At the same time, consider the case of an address. That is a complex field, made up of multiple subfields (ZIP code, street address, city, state, apartment number, etc.). Instead of creating those in the user table (and then again in every other table that requires addresses), we can create an address table and link to it. Again, though, when we search, we don’t want to have to do a join, we’d like to have the database do that for us.

There is a FOREIGN KEY constraint in SQL, but it is just that – a constraint. Instead, we’d like to have the external table as a data type. Right now, we specify the same type as the key (typically some form of int) in the foreign key column, and the database allows only values that correspond to keys, or NULL. Instead, we’d like to simply say that the type of the column is the other table. The column definition would change from:
address_id INT FOREIGN KEY,
to:
address TABLE addresses,

The new type is structured, so that when you ask for it, it is clear what you mean.

Hierarchical Data

Another thing that SQL is notoriously bad at is the storage and retrieval of hierarchies. That’s a problem with SQL and recursion, I presume, and it could easily be fixed.

Suppose you have an employee table in which you store the hierarchical relationships between employees. Every employee has a manager (except for the CEO), and reports. The only functionality that SQL offers is the foreign key constraint into the same table, which is way too weak to be of help. We cannot ask questions like, ‘who all is under the CFO?’ Instead, we have to ask who is under the CFO, then who is under those who are under the CFO, then who is under this last set, etc. We need to repeat that as many times as the hierarchy is deep.

If SQL databases were aware of hierarchies, they could do the work without bothering us with complex queries. Even better, since the data they’d have to look at is such a small subset of the total data in the table, a specialized hierarchy index would speed up queries enormously. At the same time, it’s really easy to figure out what hierarchical data means – if you have a foreign key into the same table, then it’s hierarchical. Real easy to do.

Full Text Search

Another common annoyance of the SQL data store is that it doesn’t provide a full view of the contents. When you want to know something, you have to ask every single column. Now, the murmur in the crowd will tell me, ‘Why, Marco, but that’s precisely what SQL databases are meant to be! If you want full text search, go create a full text index!’

Things are not as simple as that. The fact is that you want, and have, a structured store, but sometimes you just want a complete view of the record.

So far, if you wanted to do a full text search, you had to do one of the following:

  1. Create a full-text index on the database or table, using an engine that allows it
  2. Dump the database or table and create an index of the dump
  3. Create queries that joined all the possible fields in the database

All of these are inadequate. The first one limits you to specific engines. The second one relies on a laborious process, without guarantee that you can actually find the record associated with a particular location in the dump. The third one is too labor intensive.

The amazing issue is that databases know exactly where data is stored. For them to find a particular piece of information in the raw data is fairly easy, and it’s a real shame that they don’t allow for full text searching and indexing as a matter of course.
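The metadata is indeed sitting right there. As a stopgap, option 3 can at least be automated from the engine’s own catalog; a sketch in Python with sqlite3, using PRAGMA table_info to build the “ask every single column” query so the caller doesn’t have to (table and data are hypothetical):

```python
import sqlite3

def search_everywhere(conn, table, needle):
    """Search every column of a table -- the query the database
    could be building for us as a matter of course."""
    cols = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    where = " OR ".join(f"{c} LIKE ?" for c in cols)
    pattern = f"%{needle}%"
    return conn.execute(
        f"SELECT * FROM {table} WHERE {where}", [pattern] * len(cols)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (username TEXT, city TEXT, notes TEXT);
    INSERT INTO user VALUES ('marco', 'San Francisco', 'likes databases'),
                            ('alice', 'Boston', 'likes compilers');
""")
print(search_everywhere(conn, "user", "databases"))  # finds marco via notes
```

This is still a full scan in disguise; only the engine itself could back it with a proper full-text index.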

SQL Scalability

In my experience on the web, the first thing that causes headaches from a scalability perspective is the database server. Those, correspondingly, require the highest level of attention, the most effort, and the biggest hardware cost. Given that they don’t really do that much, it’s a real shame that they would be the consistent bottleneck. Fortunately, Internet architectures have several possibilities for improvement on current designs.

First, though, it should be noted that most developers are not very proficient with SQL and database design. As a matter of fact, databases are a mode of thinking so far removed from software development that it makes perfect sense for someone to specialize in them.

Unfortunately, this database administrator is also the person least likely to understand the shortcomings of database designs, and the least likely to have the key insights into the changes that are required and desirable. All of them, indeed, stem from the difference between the enterprise design of databases and their Internet usage. Follow me for a minute.

ACID Semantics

The greatest care in databases has traditionally been given to ACID semantics (atomicity, consistency, isolation, durability). Essentially, they mean that at any precise point in time, a database will always give the same answer to a specific request. That’s extremely important when money is concerned, and relatively important in enterprise settings where you want to know for sure that nobody is going to get two different answers to the same request.

On the Internet, for virtually all applications that don’t involve buying and selling, you are much better off relaxing that requirement. In general, it’s not particularly important if an update arrives instantly or in two minutes, and it’s not tragic if you happen to get a different response if you happen to hit a different server.

Strangely, ACID is one of the highest cost factors in databases, and it’s only inertia that has kept this kind of semantic effort afloat for all types of data. Giving up on it, even partially, allows for a series of performance improvements that can make scalability much easier and cheaper.

Read/Write Mixes and Replication

Depending on the application and the implementation, on the Internet people typically read data much more frequently than they write it. Since writing is more expensive (to a database) than reading, it makes perfect sense for anyone that wants to scale up to distribute the reads and concentrate the writes.

To do this, database servers have to be set up in replication mode. One server is the master, and a set of read replicas receives updates from it. MySQL does replication in a particularly transparent manner: it creates a binary log of transactions and replays that log on every slave. In essence, the master tells all slaves what it did, and the slaves copy it verbatim.

Some replication systems are more focused on speed and transmit file differences instead of transactions. Other systems focus on efficiency and transmit the smallest possible set of changes. In any case, the result is always the same: replicas and master are generally not synchronized, which means data can be different on the master and the replicas. This is the so-called replication delay.

Query Caching

Indexes speed up standard queries by factors, and the availability of full-text indexes does the same for that kind of query. Individual queries, though, would benefit from their own caching. Most database servers implement query caching, but they cannot know which queries should be cached, so they keep in memory everything that has been asked – which means data that is likely to be asked again can be evicted from the cache to make room for useless data.

Instead, queries should allow for a cache hint that tells the server they should be kept around, since the results are going to be required again. What kind of queries would that be? Typically, those that involve large numbers of items that need to be scrolled through. If your query needs to be paginated, then it needs to be cached, since the user will ask for more than you are presenting.

Optionally, the hint could be formulated in a negative fashion, like in the C compiler hint discardable.
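A sketch of what a hint-driven cache could look like, here implemented client-side in Python with sqlite3 (the cache_hint flag is a hypothetical extension, not a feature of any existing server):

```python
import sqlite3

class HintedQueryCache:
    """Cache only the results the caller explicitly marked as worth keeping."""

    def __init__(self, conn):
        self.conn = conn
        self._cache = {}

    def query(self, sql, params=(), cache_hint=False):
        key = (sql, tuple(params))
        if key in self._cache:
            return self._cache[key]  # served from cache, no round trip
        rows = self.conn.execute(sql, params).fetchall()
        if cache_hint:
            # Paginated result sets are the prime candidates: the user
            # will almost certainly ask for the next page.
            self._cache[key] = rows
        return rows

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE item (n INTEGER); INSERT INTO item VALUES (1), (2), (3);"
)
cache = HintedQueryCache(conn)
page = cache.query("SELECT n FROM item ORDER BY n LIMIT 2", cache_hint=True)
print(page)  # first page, now pinned for the follow-up request
```

Done server-side, with the hint travelling in the query itself, the cache would keep exactly the results that pagination is about to ask for again.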


Wow, this was quite a lengthy blog post. In summary, I believe that the movement to drop SQL in favor of different types of databases is unnecessary and destructive. Instead, a set of incredibly important improvements and extensions to the SQL language and to the semantics of databases would serve the new use case of Internet databases very well.

MVC I: Hierarchical Views

I thought I’d start this architecture blog with a post on one of the things that, traditionally, have given me the most heartburn: the implementation of views in MVC architectures.

As a refresher, MVC is by now the standard architecture for most applications, web or not. It stands for Model, View, Controller. The model is the abstract representation of the data you are handling (for instance, invoices). The view is a particular representation of that data (for instance, the invoice edit screen). The controller is what ties the two together and adds user input: it fetches the data by instantiating the model, and passes it to the view, which turns the data into a page.
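The refresher boils down to something like this minimal Python sketch (all names are illustrative, not from any particular framework):

```python
# Model: the abstract representation of the data (invoices, say).
class InvoiceModel:
    def fetch(self, invoice_id):
        # A real model would hit a data store; this one is a stub.
        return {"id": invoice_id, "total": 99.0}

# View: one particular representation of that data.
class InvoiceEditView:
    def render(self, invoice):
        return f"<form>Invoice #{invoice['id']}: {invoice['total']}</form>"

# Controller: ties the two together and adds the user's input.
class InvoiceController:
    def __init__(self):
        self.model = InvoiceModel()
        self.view = InvoiceEditView()

    def edit(self, invoice_id):
        invoice = self.model.fetch(invoice_id)  # fetch via the model
        return self.view.render(invoice)        # hand the data to the view

page = InvoiceController().edit(7)
print(page)
```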

The problem I have seen over and over is that MVC implementations tend to be controller-centric. The controllers compose the activities of the application. They tell the models which data to fetch, how to alter the data, and how to instantiate the session and other user parameters. They then select the view to display and invoke it, passing it the data they found along the way.

As a result of this mode of thinking, views are typically considered subservient to the controllers. Every controller has its own set of views, and when a particular view isn’t working precisely for the problem at hand, the template is copied, a new folder created, and the copy manipulated.

The main problem with this approach is that this is not how graphic designers and users think. These two groups cannot see the underlying structures; they think in terms of objects on pages. To them, the navigation bar at the top is a constant – an object – and seeing it change for no reason is surprising. Which, in applications, always means: bad.

Some view implementations try to take that into account by adding “snippets” of various kinds to the implementation. They can be subtemplates, macros, shortcuts, include files, or something of the sort. But, really, what the implementation must look like is object-oriented: if the user thinks of a portion of the page as a navigation bar, then that’s exactly what the code should say.

The advantage of going about things this way is that when the graphic designer thinks it would be better to have a vertical instead of a horizontal navigation bar, you change the implementation only in the file that defines the position of objects in the area of the navigation bar. This way, one change to the views (from the users’ perspective) is reflected in one change in the code.
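A sketch in Python of what “the code should say navigation bar” could mean (class names hypothetical): pages compose view objects, and the designer’s change lands in exactly one place.

```python
class View:
    def render(self):
        raise NotImplementedError

class NavigationBar(View):
    def __init__(self, items):
        self.items = items

    def render(self):  # horizontal by default
        return " | ".join(self.items)

class VerticalNavigationBar(NavigationBar):
    def render(self):  # the one change the designer asked for
        return "\n".join(self.items)

class Page(View):
    def __init__(self, nav, body):
        self.nav, self.body = nav, body

    def render(self):
        return self.nav.render() + "\n----\n" + self.body

nav = NavigationBar(["Home", "Invoices", "Reports"])
print(Page(nav, "content").render())
# Switching the whole site to a vertical bar touches only the line
# that instantiates the navigation bar, not every template that uses it.
```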

There is no penalty for doing so except a little planning. Modern processors are perfectly capable of doing the composition at incredibly high speed, and the page generation can be cached for performance (even though the caching is typically unnecessary, since fetching the data is usually far more expensive). The code is both easier to write and easier to maintain, and subclassing gives you a good feel for how inconsistent your application is becoming.

Which is why I wonder why I have yet to encounter a popular framework that starts its business with class hierarchies for views.

printf(“Hello, World!”);

I’ve been working on the architecture of Internet systems since 1994. It’s been a wonderful time, one that has seen software architects move collectively from obscure geekdom to running the development departments of the biggest Internet juggernauts. The best of times and the worst of times, frequently very close to each other, sometimes even coinciding.

I find that the world of software needs a lot of architectural attention. You won’t agree with all my ideas of how things should work, but I guess you’ll always have an opinion about them. The more people think about architecture, the more they actually do architecture, and that can only be a good thing.

So, comment liberally, disagree or agree with me, just use these humble thoughts as a springboard for your own imagination. That’s all I want out of this, and it would be a lot.
