How to Rock an Algorithms Interview (palantir.com)
435 points by harrisreynolds on Sept 26, 2011 | 163 comments



>Given a whiteboard and one hour, determine whether the person across from you is someone you’d like to work with, in the trenches, for the next n years.

If we ignore the requirements of one hour...

Brute force: Hire a random candidate that hasn't been hired by you before. Fire them when you get fed up with them. Hire a different candidate. Repeat.

Greedy Algorithm: Develop a model of the ideal candidate, give that model a numerical representation, and then compute the distance from the current candidate to the ideal candidate.* Hire the candidate with the lowest computed distance. If a candidate comes along with a lower computed distance, fire the old candidate and hire the new one.

Simulated Annealing: As with the greedy algorithm, but instead of hiring based solely on lowest computed distance, include a small % chance that you will randomly fire the candidate and hire a candidate that is slightly worse. (An optional extension to the algorithm is to keep the best candidate so far cached somewhere, potentially a transporter buffer.)

Genetic Algorithms: From your pool of candidates, select the top candidates. Breed these candidates, randomly introducing mutations. Repeat again with the progeny of the previous pool, until one of the candidates passes a satisficing threshold 'distance' score.

* = Notice that this is actually the most difficult part of (almost any useful application of) these algorithms, but I managed to hand-wave right through it.


Divide and conquer: Assign all applicants randomly to teams. Run competitive coding competition. Reject all but the winning team. Split the winning team into smaller ones, and repeat.


Recursive: require the candidate to come up with the perfect hiring strategy.


As an interviewer, I have asked candidates what questions they like to ask when interviewing and then proceeded to ask them that question. :)

Or ask the candidate "what question should I ask you that I haven't?" These are somewhat softball questions, but they let the candidate drive the interview and highlight what they think is important about their experience.


I'm willing to believe that the ideal candidate is a fixed point under your operator, but I think it's going to be tricky to prove convergence :)


And there might be other fixed points, perhaps with even greater catchment areas.


Now, would the old candidate be pushed down the stack, or would there be some TCO (tailing candidate optimization) going on where only the most recent one exists?


Two more approaches:

1) There's the well-known Secretary (or Marriage, or Sultan's Dowry) Problem (http://en.wikipedia.org/wiki/Secretary_problem). It has three caveats, though:

* Solution assumes that you have a lot of (e.g. ~20) candidates.

* It is assumed that you can sum up each candidate's ability in a single number. This is the hardest part, since the human mind works relationally and assigning absolute numbers to anything is hard.

* The probability of picking the best candidate is only 1/e.

But at least there's a guaranteed bound on the probability of success, which some of the other methods lack.
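A quick Monte Carlo sketch of the 1/e stopping rule (scores and constants are made up, and it bakes in the single-number caveat above): skip the first n/e candidates, then hire the first one who beats everyone seen so far.

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        std::mt19937 rng(42);
        const int n = 20, trials = 100000;
        std::uniform_real_distribution<double> u(0.0, 1.0);
        int bestHired = 0;
        for (int t = 0; t < trials; ++t) {
            std::vector<double> score(n);
            for (double& s : score) s = u(rng);      // candidate "quality"
            int cutoff = (int)(n / std::exp(1.0));   // observe first ~n/e, hire none
            double bestSeen = *std::max_element(score.begin(), score.begin() + cutoff);
            int hired = n - 1;                       // forced to take the last one
            for (int i = cutoff; i < n; ++i)
                if (score[i] > bestSeen) { hired = i; break; }
            if (score[hired] == *std::max_element(score.begin(), score.end()))
                ++bestHired;
        }
        std::printf("hired the best %.1f%% of the time (1/e is ~36.8%%)\n",
                    100.0 * bestHired / trials);
    }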

2) (If possible) Hire the engineer and HR person at the same time and have the HR candidates evaluate the engineering candidates. Then run Knuth's stable marriage algorithm (http://en.wikipedia.org/wiki/Stable_marriage_problem) to pick the best candidates for each.
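A compact sketch of that matching step (pedantically, the algorithm is Gale and Shapley's; Knuth later wrote a whole book analyzing it). The preference data here is invented; engineers "propose" to HR evaluators in order of preference:

    #include <cstdio>
    #include <queue>
    #include <vector>

    int main() {
        const int n = 3;
        int engPref[n][n] = {{0, 1, 2}, {1, 0, 2}, {0, 2, 1}}; // engineer's ranked choices
        int hrRank[n][n]  = {{2, 0, 1}, {0, 1, 2}, {1, 2, 0}}; // hrRank[h][e]: lower = better

        std::vector<int> matchOfHr(n, -1), nextChoice(n, 0);
        std::queue<int> freeEng;
        for (int e = 0; e < n; ++e) freeEng.push(e);

        while (!freeEng.empty()) {
            int e = freeEng.front(); freeEng.pop();
            int h = engPref[e][nextChoice[e]++];     // propose to next-preferred evaluator
            if (matchOfHr[h] == -1) {
                matchOfHr[h] = e;                    // evaluator was free: accept
            } else if (hrRank[h][e] < hrRank[h][matchOfHr[h]]) {
                freeEng.push(matchOfHr[h]);          // evaluator trades up
                matchOfHr[h] = e;
            } else {
                freeEng.push(e);                     // rejected: try next choice later
            }
        }
        for (int h = 0; h < n; ++h)
            std::printf("evaluator %d <-> engineer %d\n", h, matchOfHr[h]);
    }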


> Hire the engineer and HR person at the same time and have the HR candidates evaluate the engineering candidates.

Demetri Martin's "Important Things" TV show had a comedy sketch about two interview candidates inadvertently interviewing each other, each thinking the other candidate was the hiring manager. :)

http://www.comedycentral.com/videos/index.jhtml?videoId=2686...


1) The secretary problem does NOT assume a large number of candidates. In fact, your odds go down as the number of candidates goes up.

2) The secretary problem DOES assume you can't go back once you've evaluated a candidate. This is not the case for prospective employees.

Edit: s/passed on/evaluated/


I meant that the neat 1/e solution (where the summation is replaced by an integral) assumes the large number. Otherwise, the sum can be evaluated directly, but no closed-form solution exists.

I don't get your second statement, though: what do you mean by the "odds"? The 1/e solution is asymptotic.
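For reference, the standard calculation: skip the first r of n candidates, then hire the first one better than all of them. The success probability is

    P(r) = (r/n) * sum_{k=r}^{n-1} 1/k
         -> -x ln(x)              as n -> infinity, with x = r/n held fixed
    d/dx(-x ln(x)) = -(ln(x) + 1) = 0  =>  x = 1/e,  P = 1/e ~ 0.368

For finite n you evaluate the sum directly; replacing it with the integral is what gives the clean 1/e.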


Epic, really. That's one for the 'best of HN'.

I'd hate to work for you though ;)


I dunno, it might be fun to work at a company run with algorithm #4.


Didn't Bill Gates search for a mate among the employees?


It's why Microsoft Bob was released, IIRC.


Have the applicants interview each other. The applicant that gets the most 'hires' wins.


Nash Equilibrium: Nobody gets hired.


FTA:

    You should know these data structures inside and out.
    What are the insertion/deletion/lookup characteristics?
    (O(log n) for a balanced binary tree, for example.)
How does one achieve this? Not just being familiar with data structures and algorithms (I am), but being fluent in them, to the extent that you can produce the Big O complexity for different operations off the top of your head.

I didn't have a traditional CS education in college. Maybe there's some kind of magic that happens in Algorithms [1-3]01 that permanently etches these characteristics into your brain. But whatever it is, I missed out.

My main problem is that I don't seem to need this information to do what I do. I work predominantly with server-side, interpreted languages, and I just don't run into the bottlenecks and performance problems that Big O-knowledge seems to mitigate. Maybe I've been lucky. Maybe it's because none of my personal projects have gotten popular enough to be subjected to the kind of load that reveals such problems. Maybe it's because I haven't worked on the kind of project that needs this sort of attention. Whatever the reason, I simply have never hit a point in my work (which I assure you has been non-trivial at times) when I sat back and said to myself, "Gosh, if only I were more familiar with the complexity breakdown of the classic data structures, I would be able to solve this problem much more effectively or efficiently."

The thing is, I know that I'm perfectly capable of nurturing greater fluency in algorithms, so it seems like a waste to inevitably flub the more demanding portions of algorithm-based interviews. So what should I do? Is the answer as simple as biting the bullet and using flash cards until I know it all by rote (which feels like cheating), or am I "doing it wrong" somehow? Is my lack of fluency preventing me from understanding just how important it is (a la Dunning-Kruger)?


The good thing is that most operations on most data structures (as well as many of the more common algorithms) don't require solving recurrence equations to determine their asymptotic growth; usually our intuition based on counting tends to be pretty close.

Sketch out your basic data structures on a piece of paper, then just use your finger to trace out how you would do insert, delete, search, etc. You can usually tell if something is taking O(lg n), O(n), O(n lg n), etc. just by counting the steps you take. After you're comfortable with this, sketch it in your head. You can't forget something that you understand.

The point is that if figuring out what the Big O is seems like something you have to memorize to recall quickly, then you very likely don't really understand the underlying data structure/algorithm (and as such you rightfully should do poorly on algorithms interviews). Take the time to make sure you really understand what's going on, and everything else should fall into place.
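For example, here's a sketch that makes the counting literal (instrumented and untested, but the idea carries): count steps for a linear scan and a binary search over the same sorted array, and watch one column grow linearly and the other logarithmically.

    #include <cstdio>
    #include <vector>

    int linearSteps(const std::vector<int>& v, int target) {
        int steps = 0;
        for (size_t i = 0; i < v.size(); ++i) { ++steps; if (v[i] == target) break; }
        return steps;                                // O(n) in the worst case
    }

    int binarySteps(const std::vector<int>& v, int target) {
        int steps = 0, lo = 0, hi = (int)v.size() - 1;
        while (lo <= hi) {
            ++steps;
            int mid = lo + (hi - lo) / 2;
            if (v[mid] == target) break;
            if (v[mid] < target) lo = mid + 1; else hi = mid - 1;
        }
        return steps;                                // O(lg n)
    }

    int main() {
        for (int n = 16; n <= 16384; n *= 4) {
            std::vector<int> v(n);
            for (int i = 0; i < n; ++i) v[i] = i;
            std::printf("n=%6d  linear: %6d steps  binary: %3d steps\n",
                        n, linearSteps(v, n - 1), binarySteps(v, n - 1)); // worst case
        }
    }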


Your post needs to be not only upvoted, but also put on the syllabus of every data structures class.


Our interviews at Palantir test for these for two reasons:

We evaluate a lot of people coming straight out of school in CS. They don't have much experience in development. Thus, the best way to test if a candidate is smart is to see if he's learned his course material well. We do a lot of heavy algorithms and distributed systems, so we ask that, but it also happens to be what students learn.

Running time and big-O analysis, however, in addition to systems knowledge, are extremely important to creating good software in general. If you can't determine how your software will scale with big data, and you can't understand why efficient code that runs quickly on small inputs does so, you simply can't write good software.

I realize a lot of people on HN take offense at these claims because they've written programs without this knowledge and claim they are self taught. The reality, however, is that they aren't good programmers. You don't need a good programmer to write a prototype that works and determines market fit. You do need good programmers to scale your product, add features efficiently, and architect solutions without making a crippling mess of your code.

Edit: I realize the above comment sounds a little condescending. It wasn't meant to be. I have met self taught programmers that are amazing. Most people just don't have the self discipline to fully learn a subject matter (I many times don't). So a lot of "self taught" programmers have only taught themselves 10% of what they need to know and never bother learning the other 90%.


I'm self-taught. I also wrote the world's then-largest mail system, which eventually handled 4,000 TPS at zero downtime on servers about as powerful as my iPhone. I also don't know algorithms.

E-mail, at the time, didn't need them - there was no search, no deduplication, no anything but "do a bunch of deterministic, bounded operations very quickly." Sure, some hash tables and plenty of linked lists, but that's about it.

Am I just a big fat outlier there, condemned to cram Skiena before every interview for the rest of my life?


You may be an outlier, or you may just not be in the group of self taught programmers that was described. As I said, most self taught programmers don't fully understand what they do. Many, however, do. I was just explaining the most common case.

Judging by the fact that you were able to do many deterministic bounded operations quickly, and understand why they were bounded and why they were quick, I get the feeling that you'd do much better at an algorithm interview than you give yourself credit for.

And I'd also cram CLRS before an interview :P I know I'm good at what I do compared to the average coder, but compared to the average coder at Palantir, I have a long way to go. I'm in no way satisfied with my ability where it is today, and it's always smart to take a refresher before an interview.


As a self taught programmer, I fully agree with you. Forcing myself to work through CLRS was the most important thing I did on my way towards where I am.

What you said isn't condescending at all. If you are a self taught programmer and you haven't read up and gotten a good handle on big O notation, data structures, and algorithms, do yourself a favor and start.


I upvoted you for the good explanation (and the refreshingly self-aware edit). However, I think it is rather presumptuous to paint the broad strokes of "good" and "bad" without adding so many caveats that it would no longer be germane to this conversation.

By your definition, I am currently a "bad programmer" because I do not have fluency in algorithms.

(I will leave the objective truth of this statement as an exercise to the reader; God knows it might be so. However...)

I contend that I am capable of learning new material quickly enough that, if I were suddenly called upon at my job to write code which handles all of the slings and arrows of algorithmic complexity, I would be able to do so with very little friction. To phrase it differently: I (probably) don't know enough to write gorgeous, algorithmically sophisticated code on the first pass, but I know enough to know when it's time for me to hit the books, and where to start.

Obviously, it would take me longer than someone with prior experience in writing such code! If the overarching question is "does algorithmic fluency constitute a dealbreaker when hiring at certain positions", then I suppose the answer depends on how much longer it would take. I can only speculate on the difference in a flagrantly biased manner, as I would egotistically prefer to believe that I ought to be hired in more cases than not. :-)

Or, if we look through the Dunning-Kruger lens, we may conclude that lack of algorithmic fluency always constitutes a dealbreaker at certain positions, because even if the developer in question can learn on the job, he or she will still make errors and oversights with far-reaching implications that are invisible--unknown unknowns--to the algorithmically illiterate. This feels less plausible to me than the simpler consideration that "learning on the job" may simply be an undesirable speed bump.

I don't really have any concrete points or refutations to make; I'm just fascinated by the issue in general and enjoy exploring the possible angles.

(Also I shall likely just cut the Gordian knot and go play with algorithms until it simply isn't an issue anymore. Go figure)

EDIT: Actually, there is a point I forgot to make:

In your edit you wrote, "So a lot of "self taught" programmers have only taught themselves 10% of what they need to know and never bother learning the other 90%." I think this is the crux of the "good/bad" problem: "what they need to know" is very subjective and dependent on context.

I think it's reasonable to assert that there are programmers out there who have long and satisfying careers, but never learned X, or didn't learn it for many years. Who's to say whether they are "good" or "bad"? It's a judgment, and judgments are meaningless without context: "good programmers need to know X" doesn't really say anything, but "programmers need to know X if they want to Y" does.

You certainly went down this road in your comment, but your Y was "You do need good programmers to scale your product, add features efficiently, and architect solutions without making a crippling mess of your code," and I feel that lack of fluency in algorithms is just not a surefire indicator of bad architecture skills (or of the other abilities you listed).


Based on 20 years of experience, lack of fluency in algorithms _is_ a surefire indicator of bad architecture skills. Yes, it gives a few false positives, but the amount of false negatives is pretty much 0.

And since actually getting a new person on-board is an expensive process, you aim for criteria that are a bit too stringent, if you can afford it.

There's also the issue that making a bad algorithmic decision at the center of the problem turns scaling into an insanely hard problem, sometimes. It's not so much about learning about it when you need it, but about avoiding issues in the first place.

True, there are many places where it truly doesn't matter - but if scaling/performance matter for your company, you don't want to head down the garden path because your developers didn't know better.

Not knowing algorithms won't exclude you from all jobs (or even many of them), but there are some jobs where it _will_ bite you. It's your choice if you want to go for those jobs. Personally, I think they are where the cool stuff is happening, but YMMV.


I agree with jchonphoenix and groby_b. I'm sure there are a lot of great self-taught geniuses who don't have the formal training but do a much better job than most, but there are orders of magnitude more in the camp of "half-taught" mediocre programmers, with shiny formal training, who can easily be filtered out using this approach.


Thank you, this is an illuminating response! Hopefully I'll remember to look back on this thread when I'm enlightened so I can see if I've switched sides on the debate. :-)


"I contend that I am capable of learning new material quickly enough that, if I were suddenly called upon at my job to write code which handles all of the slings and arrows of algorithmic complexity, I would be able to do so with very little friction. To phrase it differently: I (probably) don't know enough to write gorgeous, algorithmically sophisticated code on the first pass, but I know enough to know when it's time for me to hit the books, and where to start. Obviously, it would take me longer than someone with prior experience in writing such code! If the overarching question is "does algorithmic fluency constitute a dealbreaker when hiring at certain positions", then I suppose the answer depends on how much longer it would take. I can only speculate on the difference in a flagrantly biased manner, as I would egotistically prefer to believe that I ought to be hired in more cases than not. :-)"

You are SERIOUSLY underestimating the amount of time required to learn that stuff. In essence, the full 4 years of a CS degree deal with data structures, algorithms and math. Good luck finding a project where the stakeholders are willing to wait that long. Also, unless you are a very unique snowflake, you will not have the required discipline.

On the other hand, learning a new language is accomplished in a matter of days. Learning how to code well requires experience and willingness to learn from one's mistakes.

Which a candidate with a good CS background may already have.


When talking of learning in this sense, no one talks of reading the book from the first page to the last page.

The learning is often just sufficient to serve the needs of the moment. If I have to quickly fix my juicer, I don't go take electrical and mechanical engineering courses for the next 4 years.

Instead, what I do is define the problem and search for solutions on the internet. The solution requires me to understand some theory and some practical aspects of things, with some tool usage. That is more than sufficient to solve the problem for the moment.

That is how most software work is; it's practically impossible to know everything from the book. What is more important today is: given the body of knowledge, can a person study and solve the problems? A person can be productive and deliver the goods with the above approach, with a little discipline.

That's how most knowledge based work happens in the 21st century.

I also see a lot of people around me trading stocks, although they understand virtually nothing about economics in detail. For the most part they can make a 10-15% profit on their investment by just using some basic analytical skills.


Search for the solutions on the internet

This is the problem with kids these days.


This is the problem with kids these days.

We can argue as much as we like about generational differences. These days you can do a lot of things without knowing much about them beforehand. That's the kind of advantage the internet offers you these days.

If a person is unable to leverage this to his advantage, I feel sorry for him. And I see no reason why others shouldn't do this just because he can't.


I disagree with your first sentence. When I had a month of downtime between graduating college and interviews, I read CLRS (nearly) cover-to-cover, and did approximately 1/4-1/3 of the problems (with an emphasis on chapters and sections I had a weaker grasp on). I can't be the only one.


We once hired a guy with a PhD in computer science; he was a (great) lecturer & researcher for many years. We got him to do some "complicated stuff". I'm not going to go further into the details in case he recognises this.

After he left I went back to learn math for interest and got really into it. I now realise that literally six months of the work he did, I could now do in an afternoon, as I have a deeper understanding of the mathematics behind it than he did. I'm not smarter, far from it; I just have more tools to call on. And this was a guy that taught the algorithms course at a good university.

We are all capable of learning almost anything, but a lack of knowledge about the subject matter constrains your solution search space to a large degree. I believe strongly that fluency in algorithms is a minimal prerequisite to being a decent developer. It's like learning calculus without knowing trig: you can do it, but you'll be reinventing (poorly) a tonne of work that came before, and you'll be slower and less accurate. It's not like it's rocket science either; you could get a basic understanding in a concentrated day of work, so there's no reason not to.


I don't have enough information to make any conclusions here. But let me add to the discussion that more often than not, it is an order of magnitude (or more!) easier to look at someone else's solution and understand it than it is to arrive at the solution in the first place. I am even willing to believe that, perhaps if the problem were very well-defined, you might be able to solve it in an afternoon. But if you were in his situation, how long would it have taken you to characterize the problem so that you could solve it? Have you had successes doing "complicated stuff" for which there wasn't already something existing to key off of?

In computer science, there is the notion of the NP-complete problem, for which there is no known efficient solution. But if a solution is found, it can be verified very quickly. I think the analogy holds here.


It wasn't this. The example was trying to show that a lack of knowledge can be catastrophic in terms of productivity, even for the gurus. He spent more than six months on this particular problem, and if he had known more math (a gap that is unfortunately very typical of people in comp sci) he would have been enormously more productive. The problem was formulated before he started any work on it; it really was his lack of (graduate-level) math and statistics that caused this particular delay. I'm not blaming him; it's just that he didn't have the tools in his toolbelt to understand the best way to solve the problem, and nor would anyone who didn't have a strong math background.

Math is unreasonably effective. Just like knowledge of basic algorithmic analysis. I don't think that should be controversial.


To be fair, what he needed was actually DOMAIN knowledge, in this case some specific math. Unfortunately, unlike other domain knowledge such as business workflow, stuff like math and physics is kind of hard-core and cannot be quickly picked up by self-learning. He should have had some domain experts (mathematicians) to help him.


That's my point. Algorithms are the domain knowledge of cs.


"I contend that I am capable of learning new material quickly enough that, if I were suddenly called upon at my job to write code which handles all of the slings and arrows of algorithmic complexity, I would be able to do so with very little friction."

How do you know whether or not this is true? Until you know it, how do you go about estimating how long it will take to you to learn it? How do you even assess when you've learned "enough" algorithmic complexity for the task at hand? Can you give any reasons someone looking to hire you should believe that you can learn the ins and outs of algorithmic complexity with "very little friction"?


The thing is, it's very easy to end up using a bad algorithm without noticing. If you're doing a linear search over a collection of n elements, fine, but if you end up having to do it n-ish times, suddenly you're up to O(n^2), and if n is ever a few million in reality, your app will enter a loop that's practically infinite. Meanwhile, sorting them would be O(n log2 n) and intersecting two sorted lists is O(n), and n hash table searches is only O(n) if you have enough memory to avoid collisions.
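To make that concrete, here's a sketch of one such task done both ways (the details are invented): intersecting two collections with a nested linear search versus sorting first.

    #include <algorithm>
    #include <iterator>
    #include <vector>

    // O(n^2): for each element of a, linear-search all of b
    std::vector<int> intersectNaive(const std::vector<int>& a, const std::vector<int>& b) {
        std::vector<int> out;
        for (int x : a)
            if (std::find(b.begin(), b.end(), x) != b.end())  // O(n) per lookup
                out.push_back(x);
        return out;
    }

    // O(n lg n): sort both, then one merge-style pass
    std::vector<int> intersectSorted(std::vector<int> a, std::vector<int> b) {
        std::sort(a.begin(), a.end());
        std::sort(b.begin(), b.end());
        std::vector<int> out;
        std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                              std::back_inserter(out));
        return out;
    }

    int main() {
        std::vector<int> a{5, 3, 9, 1}, b{9, 7, 3};
        // same answer either way; only the growth rate differs
        return intersectNaive(a, b).size() == intersectSorted(a, b).size() ? 0 : 1;
    }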


This algorithm fetish is just stupid. If a function / program isn't slow then it doesn't matter and if it's slow the dev will notice. I mean, come on -- is the dev not going to notice a practically infinite loop?


It wouldn't be slow on "reasonable" testing datasets.

Do you time your app at various loads (including well beyond expectations), graph the performance and look for non-linear increases?


>> Edit: I realize the above comment sounds a little condescending. It wasn't meant to be. I have met self taught programmers that are amazing. Most people just don't have the self discipline to fully learn a subject matter (I many times don't). So a lot of "self taught" programmers have only taught themselves 10% of what they need to know and never bother learning the other 90%.

It was condescending to my eyes, nothing more nothing less.


What I've found is that I do not remember that, say, a tree map or heap has logarithmic insertion time, but rather have a vague mapping of general concepts to behavior: something like "tree" => logarithms, "list" => lines (linear), "array" => magic (constant). Then I think something like "a hashmap is like an array, so it can be accessed in constant time".

If you're a visual thinker, the "shape" of an algorithm or data structure can be very helpful as well. Imagine it geometrically and think about how, say, the height of a tree compares to the length of a line with the same number of elements.

Of course, practice also helps. I'm lucky enough to get plenty of practice in school, but I think doing various condensed but nontrivial problems (think programming contests/project Euler) is probably both more fun and more practical for somebody out of school.

Ultimately, knowing running time of algorithms and behavior of data structures isn't really what's important; rather, it is just a "symptom" (in a good way) of having a solid grasp of what's going on. Just memorizing the times with flashcards is basically faking it--it might not be completely useless, but you'd be better off actually understanding everything.

I used to not have any idea about this sort of thing. My code worked fine, and I had no problems--I also never thought "hmm, I could have made this more efficient if only I knew about tree maps!". However, I now realize that the reason I never saw this was because I didn't know enough--I didn't even know enough to know I had a problem!

In short: you should learn how the algorithms and data structures work, not just for running times or interview questions, but to make you a better programmer. It definitely won't make you worse!


Thank you, this is a great answer that confirmed some of my suspicions. I'd be interested if anyone could recommend other good resources such as Project Euler. Perhaps the book Data Structures and Algorithms ought to be higher up on my reading list? I would feel a lot more enthusiastic about getting a copy and diving into it if someone with the context of my OP question could confirm that it's a relevant resource.


I haven't read Data Structures and Algorithms, but I can point you to a draft of the book for my current Algorithms class, creatively called Algorithms[1]. I think content-wise the draft is almost identical to the published versions; the main differences are in the exercises.

[1]: http://www.cs.berkeley.edu/~vazirani/algorithms.html

It's fairly in-depth but narrative in style and readable; I rather like it. One particularly nice touch is that it goes over the practical applications of the various algorithms at a high level, giving you a good idea of where they are used and why.

If you do not do much math and do not remember Linear Algebra/Discrete Mathematics, it might be best to skip the first couple of chapters and look at graphs first instead. That said, the first two chapters cover some math which is probably good to know.

The book is primarily about algorithms. It touches on data structures where needed to implement or describe an algorithm, but not much more besides (at least in the parts I've gotten through so far). I have not read any good data structures book, but have found Wikipedia to be a very good resource. I suggest avoiding overly language-specific books on the matter--I think that Java books in particular are insidious, but I am a little biased in that regard.

The best approach would probably be to read through a section about some algorithm, look at what sorts of problems it can solve and then write a simple program to solve those problems. For example, when reading about path-finding algorithms, you could write a simple navigation program. I find this helps keep the material interesting and also aids retention.

Additionally, I have heard good things about Introduction to Algorithms[2] (another creative name)--it was suggested as a reference for the course, although I did not get it (Wikipedia is good enough for me).

[2]: http://en.wikipedia.org/wiki/Introduction_to_Algorithms


I took the algorithms class at UCSD a few years ago from Sanjay Dasgupta, one of the authors of that book. At the time the book wasn't finished but we used a draft as the lecture notes. One of the best classes I took. I use that book to this day as a reference for some algorithms.


The Facebook puzzles are pretty good, especially for graph algorithms.


I'll give you my point of view. I'm young (21), I'm in my third year as an informatics student, and I feel greatly intimidated by all this hard stuff. I can barely write a working web app, even with all the hand-holding modern frameworks give me. I don't consider myself smart (I'm bright enough to be a programmer, but nothing more).

Recently I sat down and thought about where I want to be in 5-10 years. I don't want to be writing django apps in 10 years, I don't want to write software that sort of works some of the time, and I don't want to be a replaceable commodity programmer working on outsourced software that the real programmers didn't feel like working on, so they sent it to India or Eastern Europe (which is where I'm from). Sure, I can make a comfortable living that way, but what I want to be in 10 years is a competent expert.

I want to work in diverse fields, with people much smarter than me, and the little knowledge and experience I gain in my career I want to pass down to younger programmers. I want to wake up in the morning and be proud that I've built things that actually matter, and not just the latest social networking pop-culture crap that many of us are dealing with now. In short, I want to be a hacker!

To do this, I decided to dedicate myself to actually learning the hard stuff that most people shy away from, just because they can get by with only some python/ruby/php and jquery knowledge. Math, CS (algorithms, data structures, languages, compilers, architectures, networks, operating systems), science, philosophy, history, art, etc. All hard stuff that I'll never need to know in order to build a web store site for a small business that barely gets the web, get paid, and go wash the car on the weekend. Even if I can never learn all of it, even a fraction is better than writing the same simple web apps for 10 years.


Thank you for such a candid post! I know how you feel.

I'm only a little farther along in my journey than you; at 26, I finally feel confident in my abilities, but am still humbled by how much I don't know. Last month I spent hours poring over Linux network programming resources so I could implement a performant, scalable, thread-safe game server [1]. After weeks, I finally got it working--replete with tests, sane coding practices and other Good Things--and it felt so good.

Like you said: "All hard stuff that I'll never need to know in order to build a web store site..." I didn't need to reinvent the wheel--there are tons of pre-existing solutions and frameworks out there that I could pick up and use to hit the ground running. But, like most people on this site, you and I were bit by the same bug: the desire to understand how and why it works.

For me, the hard part isn't dedication to learning or writing dime-a-dozen web apps to pay the bills. The hard part is finding the right balance. I think that's probably something I'll keep learning more about until I finally kick the bucket.

If I can presumptuously offer advice, from barely a half-decade down the line: Don't give up, and learn to see the intimidation for what it is: thrill! Do as the sagacious Derek Sivers commands: whatever scares you or excites you, go do it [2]. If you come upon something that seems insurmountably complex (it could be anything: compilers, emulators, algorithms, low-level network programming...), do it. You don't have to learn it in a day, a month, or even a decade, but never fool yourself into believing that it's too hard. It isn't. Take breaks to keep yourself sharp, but never give up.

[1] - http://stackoverflow.com/questions/6849239/how-should-i-arch...

[2] - http://sivers.org/scares-excites-do-it


I would say that you will feel this way only until you see someone doing half the difficult job you do who, because of his productivity and problem-solving skills, takes huge strides thanks to the lack of good folks in the 'outsourcing quality' work that you describe.

The day you see such a guy, you suddenly wonder what the hell you are doing being average among the crowd of algorithm writers, when you could actually be a leader earning great money with a big designation.

As we age we take on a lot of financial responsibilities, and money will play a huge role in the choices you make. You may give a great interview and go on to work with the brightest engineers at Google, and get lost in the brilliant crowd. At that point, when you start reviewing your choices, you will get the feeling that you could have been unmatchable if only you had chosen an easier career path.

In short,

There is no point being an average guy among a crowd of brilliant people.


I have to disagree. How did those people become brilliant? Hanging around smarter people probably had something to do with it. That's why I joined HN in the first place; 1207 days ago I was probably among the most inexperienced and stupid people on this site, and probably still am, even though I've grown so much since. I think there is no shame in choosing the simpler route, but I can't make this choice for myself. I care too much about doing good work, even if I suck at it at the moment.


I am all for hanging out with smart people, learning and working on tough problems. But if you're even half serious about doing something big, those learnings have to happen quickly. You will then have to go on by yourself: do something on your own, lead and manage something big.

My point was that there is no benefit in being the average among the best. But you can do wonders by being the best among the averages.

Of course you can disagree with this. But more often than not, all you will need is certain guiding principles in life (which you need to learn from smart people); after that you will have to do the grind work on your own. The sooner you learn those principles the better. There is never going to be a situation where you are carbon-copying from those smart people all the time.

The other sort of learning is incremental learning, which happens all the time, regardless of where you are or what work you do. As for specific life-changing learnings, there are only going to be very few of those in your whole life.


Such moments of clarity are rare and brief, but to a surprising extent they define our lives. I would bet on your success.

Back on topic: it's actually pretty easy to intuitively estimate the asymptotic time performance of most data structures if you sit down with a good algorithms book for a while and read through until you understand it. If you don't find everything simple the first time, go back to the beginning a few days later and read it again. This will give you big-O-estimation skills for things that aren't mentioned in the book, plus thorough understanding of the things that are mentioned.

I prefer Robert Sedgewick's Algorithms books, because the pictures are amazingly useful and abundant, but there are other good options. After a while it just starts to make sense.


This seems to be the path taken by Paul Graham and Robert Morris (just to name an obvious example most readers here will know). Because they knew a lot about Computer Science (and Lisp), they were able to take an approach to building their software and business that their competitors would have never considered. I'm guessing their knowledge of "science, philosophy, history, art, etc." played a big part in their approach, as well.

Which is to say, I think you're on the right track if you don't just want to follow the popular trends, but want a chance to do something new in a way that no one has even considered yet. Which certainly sounds a lot more fun, if nothing else.


"I work predominantly with server-side, interpreted languages, and I just don't run into the bottlenecks and performance problems that Big O-knowledge seems to mitigate."

I tend to think of programming tasks being split into two major categories: applications programming and systems programming.

My definition of these terms: applications programming is writing software to be used by a normal person. Systems programming is writing software to be used by another programmer. (These definitions may be a little different from how these terms are commonly used.)

From your self description, you seem to land pretty squarely on the application programming side of that divide. For most application programming tasks, getting good performance consists mostly of selecting libraries, databases, web servers, languages, etc. with the performance characteristics your application requires, and building your application on top of them.

The system programmer is the one writing those libraries, etc. and so those performance requirements fall squarely on them.

So I would say that as long as you stick to applications programming, algorithmic ignorance probably isn't a major problem. But a company like Palantir needs systems programmers (per my definition above, even if the "clients" are just other developers within the company) which is why they need to ask algorithms questions in their interviews.


I didn't do CS as a major in undergrad (I found it too easy... that sounds pretentious, but the program at my college simply wasn't difficult enough for me), but I did do an MS in CS afterwards.

It helped a lot with the basics of algorithms (terminology, so you can understand what you are reading), but it didn't really teach me all of the data structures and algorithms that I know. (There are really too many to cover everything properly/thoroughly in the extent of a course or two, even at graduate level.) Also, I would have been sunk if I hadn't known a lot of it before applying to the program.

So I learned a lot of it through my own natural curiosity.

Everything concrete that I learned about data structures and algorithms started by reading Wikipedia. The CS articles are fairly accurate (it is hard to gain anything by spreading misinformation in them) and there are often times links at the bottom to the actual papers and code describing and implementing them. Start with the overview article and then read the paper. Look up any terms you don't know. Try implementing it in a language you like.

You can also learn about data structures in languages that you like (looks like ruby/python for you; you are in luck... those are both open source). Figure out how they are implemented, and their performance characteristics. Go ahead and download the source tree and read the code.

Implement naive versions of various data structures/algorithms yourself. See if you can play with them and improve them for various tasks. (Can I make this hash table use less memory? Give it a faster hash function? Make it a game.)

I think flash cards would be a bad move, because then it starts to feel like a chore. Programming and reading and learning shouldn't be taken to like chores, they should be enjoyable and rewarding.


Thank you for the Wikipedia suggestion; I hadn't really considered using it as a learning resource because I've had bad experiences trying that with other fields. (For example, I think Wikipedia is an abhorrent place to learn mathematics, though it clearly functions well as a reference for people who already know what the hell is going on.) Perhaps being already fluent in the fundamental concepts of programming and computer science will make the Wikipedia articles on data structures & algorithms more useful from a learning perspective.

Tinkering with the source code for $language is also a great idea! That meshes with my empirical experience that the best way to learn is to "play". These days, it's rare for me to write code that isn't related to some specific personal project or business goal; maybe bringing back a bit of the "play" aspect is one of the keys to improving fluency.


To be clear, Wikipedia is really just a jumping off point. You can only put so much information on a topic into a Wikipedia article.

You really start learning when you read the references, and have 5 or 6 browser windows open with web searches to things that either interested you and weren't referenced, or to terms that you didn't understand.

>That meshes with my empirical experience that the best way to learn is to "play".

It has actually been shown in educational studies that this is one of the best ways to learn; so social science seems to match up with your experience as well.


I've been doing this exact same thing to learn. I read an interview question that asked how to sort a linked list. I tried implementing it on my own. Later I looked online to see other implementations and compared with my version. I was a little bummed to see my solution wasn't ideal but it only made me dig deeper.

Later on I found some Stanford lectures, particularly Programming Abstractions by Julie Zelenski (around lecture 14). After watching that lecture I was able to classify my linked list sorting attempt as a selection sort. I was pretty excited that I wrote a selection sort without even knowing what kind of sorting I was doing. I have yet to use my newly acquired knowledge at work but I'm now confident I can spot certain sorting algorithms and I'm much more aware of when to use each.
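The shape of my attempt was roughly this (a sketch from memory, not the original code): walk the list, find the minimum of the unsorted remainder, and move it to the front of that remainder, which is O(n^2) comparisons overall.

    #include <cstdio>

    struct Node { int val; Node* next; };

    // Selection sort on a singly linked list, swapping values instead of
    // relinking nodes: repeatedly select the minimum of the remainder.
    void selectionSort(Node* head) {
        for (Node* i = head; i != nullptr; i = i->next) {
            Node* min = i;
            for (Node* j = i->next; j != nullptr; j = j->next)
                if (j->val < min->val) min = j;
            int tmp = i->val; i->val = min->val; min->val = tmp;
        }
    }

    int main() {
        Node c{9, nullptr}, b{1, &c}, a{4, &b};
        selectionSort(&a);
        for (Node* p = &a; p; p = p->next) std::printf("%d ", p->val); // 1 4 9
    }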

If anyone is aware of any other resources like books, online tutorials or videos I would love to hear about them!


The Art of Computer Programming (TAOCP), Knuth

http://www-cs-faculty.stanford.edu/~uno/taocp.html


That's pretty heavy going, though. I remember my algorithms lecturer suggesting we should only get those tomes for the geek points. Meanwhile, Cormen, Leiserson, Rivest and Stein's Introduction to Algorithms is relatively cheap, has good explanations, a more modern presentation of pseudocode, and a breadth of material that covers all the algorithms you could be expected to know as a general programmer, plus some more.


For a significant number of gigs, you don't need to know algorithms and data structures. This continues to surprise me.

I think for interviewers, this approach to quizzing on algorithms and data structures is probably just a well-known practice and probably lacks correlation with the work of a specific job role most of the time.

If you aspire to work somewhere that has a demand for deeper algorithmic work, you do need to understand algorithms and data structures. I think this is probably a smaller number of potential jobs than many of us on HN would like to admit - there is simply a large number of mundane IT/dev jobs where the naive imperative solution will be just fine.

I would skip flash cards, but try to find small projects at work that require algorithmic and data structure thinking. Then you get some applied experience that will "tell a good story" at an interview plus build your analysis skills.

I think for a lot of jobs, even being able to talk about an algorithm and data structure design that you personally did will put you way out in front of a number of candidates.


Author of the post here. I don't think you're "doing it wrong" or missing anything because of a cognitive blindspot. Algo skills are important for certain types of problems (scaling, number crunching), but useless for others (interaction design, API design, ...). It's just one of the skills we're looking for, since a lot of our meatiest problems don't require algorithms at all (beyond a simple hashtable or two). If you've looked at your own coding history and don't see the need to be more fluent with algorithms, then you're probably right.


> I work predominantly with server-side, interpreted languages, and I just don't run into the bottlenecks and performance problems that Big O-knowledge seems to mitigate.

Algorithmic complexity isn't about performance, it's about scalability. A hand-written implementation in assembly that's O(N^2) will be slower than an implementation in Basic that's O(N) (or even O(N log N)) for a sufficiently large N.
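To put rough numbers on it (the constant factors here are invented: give the assembly version a 1000x per-step advantage), the crossover still shows up around N in the tens of thousands:

    #include <cmath>
    #include <cstdio>

    int main() {
        for (long n = 1000; n <= 1000000; n *= 10) {
            double asmQuadratic = 1.0 * n * n;              // O(N^2) at 1 ns/step
            double basicNLogN = 1000.0 * n * std::log2(n);  // O(N log N) at 1000 ns/step
            std::printf("N=%7ld  asm O(N^2): %14.0f ns   basic O(N log N): %14.0f ns\n",
                        n, asmQuadratic, basicNLogN);
        }
        // By N = 100,000 the "slow" language wins by ~6x; by N = 1,000,000, ~50x.
    }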

I suggest picking up Programming Pearls; it has an example of a BASIC program on a C64 vs. a Fortran program on a mainframe (the book is from the 80s) demonstrating this.

That said, if building basic web apps is all you want to do (e.g., internal enterprise applications) then you don't need this knowledge. However, the harder stuff (whether algorithmic, systems, etc...), like (but not limited to) what Palantir works on, is what I find more interesting.


The problem with this is that in most cases you're not dealing with a sufficiently large N; plus, the high-level language of choice will already come with an efficient sorting algorithm for whatever list type it provides.

Whenever you do start to approach values of N for which you need to fine-tune, you're going to be researching and profiling and optimising the hot code anyway.

I'm not saying you don't need to know algorithms, but you don't need to know the big O for every sorting algorithm, or the insertion cost for each type of tree, off the top of your head. A rough idea of what's terribly inefficient is fine for most tasks.


Maybe. But it is terribly easy in a high-level language to use a built-in function that is O(N) and put it in a loop, and now you're at O(N^2). Maybe it works in reasonable time for N=10,000. Maybe you test it on small data sets to see that it works. Now turn it loose on an N=10^7 dataset. Oh.
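For instance (a sketch; the duplicate-counting task is just an arbitrary stand-in), the quadratic version looks completely innocent:

    #include <algorithm>
    #include <unordered_set>
    #include <vector>

    int countDupsQuadratic(const std::vector<int>& v) {
        int dups = 0;
        for (size_t i = 0; i < v.size(); ++i)
            // std::find scans up to i elements each time: O(N^2) overall
            if (std::find(v.begin(), v.begin() + i, v[i]) != v.begin() + i)
                ++dups;
        return dups;
    }

    int countDupsLinear(const std::vector<int>& v) {
        std::unordered_set<int> seen;
        int dups = 0;
        for (int x : v)
            if (!seen.insert(x).second) ++dups;  // amortized O(1) per lookup
        return dups;
    }

    int main() {
        std::vector<int> v{1, 2, 3, 2, 1, 2};
        // same count (3) either way; at N=10^7 only one of these finishes today
        return countDupsQuadratic(v) == countDupsLinear(v) ? 0 : 1;
    }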


That said, if building basic web apps is all you want to do (e.g., internal enterprise applications) then you don't need this knowledge.

Watch it before deploying those big joins that worked so nicely on your development data set, then.


There aren't too many things to remember in order to be useful:

- Operations that take the same amount of time regardless of the input are O(1). I.e., 1+1 takes just as long as 10000+10000 (okay, not really, but it's a reasonable assumption.)

- Doing a small sequence of operations runs in O(maximum of all operations). Doing 5 adds (which are O(1)) runs in O(1). Doing an O(n) operation followed by an O(k) (where k > n) runs in O(k) time.

- Looping n times over O(k) operations runs in O(n*k) time. (Assume n is fairly big.)

- Due to the structure of a tree, accessing a node is generally O(depth of the node). Max depth = log_m(number of nodes) for a full tree. (Where m is the number of children each node has. For a binary tree, m=2)

- From there, you can pretty much combine these rules to analyze many algorithms. These aren't by any means all of the rules, but they are some of the most useful in my experience.


Sorry, I should have been more specific: I understand the rules of Big O complexity, which you aptly summarized; the part I'm really curious about is "From there, you can pretty much combine these rules to analyze many algorithms." It's not that the concepts are mysterious to me, but that making them a part of my learning habits has not come naturally.


> the part I'm really curious about is "From there, you can pretty much combine these rules to analyze many algorithms."

I'm not sure if I'm still misunderstanding you, but here's an example of what I meant.

Say, for whatever reason, you want to populate a binary tree with k items of random data:

    for(int i = 0; i < k; ++i) {
        int data = rand(); // Let's assume this is O(1)
        tree.insert(data); // O(log(n))
    }
The inside of the loop runs in O(log n) time because O(log n) > O(1). We're doing it k times, so the total time is O(k log n).

Now, if you wanted to do this for m trees:

    for(int j = 0; j < m; ++j) {
        for(int i = 0; i < k; ++i) {
            int data = rand();
            tree[j].insert(data);
        }
    }
We already know that the inner loop runs in O(k log n). Since we've just added a loop around that, it's easy to see that the whole thing runs in O(m k log n) time.


We already know that the inner loop runs in O(k log n). Since we've just added a loop around that, it's easy to see that the whole thing runs in O(m k log n) time.

This is not necessarily true for all cases. (It is for yours.) It's possible for interaction to exist between a loop and its contents, such that the total time is not m times the running time of the interior of the loop.

  for (int i = 0; i < data.length(); i++)
      bubble_sort(data);
That's a trivial case but sufficient to prove the point. Bubble sort runs in O(n^2), and it looks like we're doing that n times, so the whole thing runs in O(n^3). But that's not true, since bubble sort runs in O(n) when the array is already sorted, so the whole thing is still O(n^2). In other words, the first iteration of the loop induces a side effect that affects all later iterations.
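(That best case assumes the common early-exit variant, a sketch of which follows: a pass that makes no swaps ends the sort, so an already-sorted array costs a single O(n) sweep.)

    #include <utility>
    #include <vector>

    void bubbleSort(std::vector<int>& a) {
        bool swapped = true;
        for (size_t pass = 0; swapped && pass + 1 < a.size(); ++pass) {
            swapped = false;
            for (size_t i = 0; i + 1 < a.size() - pass; ++i)
                if (a[i] > a[i + 1]) {
                    std::swap(a[i], a[i + 1]);
                    swapped = true;            // any swap forces another pass
                }
        }
    }

    int main() {
        std::vector<int> a{3, 1, 2};
        bubbleSort(a);   // does real work: O(n^2) in general
        bubbleSort(a);   // already sorted: one pass, no swaps, done in O(n)
        return a.front() == 1 ? 0 : 1;
    }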


That is true, but one generally doesn't care about the best case. It's a good point though.

(Also, if data is a linked-list, length() could be an O(n) operation. That would make this O(n^3) or O(n^4) ;))


1+1 does really take the same amount of time as 10000+10000 on nearly all CPUs.


> Maybe it's because none of my personal projects have gotten popular enough to be subjected to the kind of load that reveals such problems.

That's probably the case, and there's no shame in that. The companies who ask these questions do so because their engineers deal with these kinds of scaling challenges every day. They can't afford to have people casually slip in an O(n^2) algorithm when an O(n lg n) or O(n) algorithm would do. If you don't have the experience of working in this kind of environment, it's not a dealbreaker; you just need to show an awareness of the big-O implications of your algorithms, which will most likely come from some combination of study and practice.

> Is my lack of fluency preventing me from understanding just how important it is

Probably not -- if you're working at a small scale, then big-O probably truly is one of your lesser concerns (unless you write an O(2^n) algorithm, which is less likely to happen by accident).
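The classic accidental O(2^n), sketched: naive recursive Fibonacci, which recomputes the same subproblems exponentially many times, versus the O(n) loop.

    long fibSlow(int n) {                 // O(2^n)-ish: hundreds of millions of
        return n < 2 ? n : fibSlow(n - 1) + fibSlow(n - 2);   // calls by n = 40
    }

    long fibFast(int n) {                 // O(n)
        long a = 0, b = 1;
        for (int i = 0; i < n; ++i) { long t = a + b; a = b; b = t; }
        return a;
    }

    int main() {
        return fibFast(10) == fibSlow(10) ? 0 : 1;  // both 55; try n = 45 to feel it
    }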


The companies who ask these questions do so because their engineers deal with these kinds of scaling challenges every day

I'm skeptical of that statement. Some of the engineers at, say, Google, deal with code complexity and scaling every day. Most probably don't. Certainly, most at the non-Googlish companies hardly ever deal with this. Yet the majority of interviews I've gone to over 12+ years have involved significant questioning about algorithmic complexity.

I agree with your statement about awareness of Big-O, but often companies seem to be looking for over-stressed obsession, even for jobs which don't utilize it.


> I'm skeptical of that statement. Some of the engineers at, say, Google, deal with code complexity and scaling every day. Most probably don't.

I am an engineer at Google, and basically all of the engineers I work with have to scale their software in at least one dimension such that big-O complexity is a significant (and daily) concern. What I mean by "daily concern" is that the check-ins that happen on a daily basis would be noticeably inefficient if they used an O(n^2) algorithm instead of an O(n) algorithm, for example.

> Certainly, most at the non-Googlish companies hardly ever deal with this.

My main experience is Google and Amazon, so I couldn't say. I was meaning to talk mainly about Google/Amazon/Facebook/Twitter-like companies. I'm betting Palantir is also one such place.


And financial companies, banks, hedge funds etc. Even the most "boring" financial companies (actuaries, insurance companies) process huge amounts of data where performance becomes critical.

We have a database that gathers about 5GB of market data a day, and we have to run algorithms to analyse the data and analyse the behaviour of our system. Efficient algorithms become very important.


I've worked at a long series of companies that deal with these at least every week or so. Maybe not daily. But I'm writing code where it matters most days.

Maybe I'm unusually hardcore? I'm definitely not all about the algorithms, but I work on a lot of systems programming.

The other thing is that often you don't realize that what you're doing requires thinking of this until somebody else has to go fix what you wrote. The fix will even look very simple. The insight going into the fix isn't as simple, though.


I've worked at a long series of companies that deal with these at least every week or so

And I've worked at a long series of companies that don't. That's the thing: different strokes for different folks, as it were.

Maybe I'm unusually hardcore?

I think it's less about degree-of-core-ness and more about platform and domain knowledge. Personally, if I were hiring for what I consider to be a fairly typical company, I would be looking mostly for ability to consistently get functional, bug-free code running. I would likely seek stronger algorithmic knowledge in some small ratio of employees.

I guess I feel strongly about this because I think it's fundamentally disrespectful to all coders, and ignores the tremendous variety of tasks and skills out there. A self-taught hacker who can get a great product going in a short time may be just as important and valuable to a company as the algorithm-strong coder who catches scaling problems before they come up. When it comes down to it, the goal is to create product and value, and that happens in many, many different ways.

The fix will even look very simple. The insight going into the fix isn't as simple, though.

That's true of the majority of bugs I've written or come across, whether it's related to performance, crashing, or unpredictable behavior. In fact, looking at fixes that coworkers make to your own code can be a tremendously enlightening (and humbling) experience.


My preference would be that any developer for any job should be able to write functional, and (fairly) bug-free code.

Then I would hope they have a good sense of usability, if they are writing user facing code. And I would hope they have a good sense of algorithms and other aspects of scalability, if they are writing server side or systems code.


People ask algorithms questions not because they are relevant, but because they are a good proxy for your abilities. Well, at least they are perceived to be a good proxy ...


Right, that's my point. It becomes sort of a cargo cult of candidate selection by people who don't actually understand what they want or need.


I suspect a "cargo cult" for hiring is not what rguzman meant. Rather, algorithm skills are one of the best signals you can extract from a short interview. I've always found them to be much better at predicting a candidate's ability to do the job than anything else I could ask in 45 minutes. That may mean I'm a poor interviewer, but until I can figure out better questions to ask, I'm happy to use this signal since in my experience it works well.


That may mean I'm a poor interviewer

Not at all. It means that humans are complex and ranking them is extremely difficult.

My issue is with the fact that I've met too many very smart people who can prattle on about Big-O and optimal architectures for hours, but when it comes time to get dirty and finish a job, they lose interest, produce unmaintainable, over-engineered nightmares, and/or turn out not to actually understand the platform well enough to execute a clean solution.


>My main problem is that I don't seem to need this information to do what I do. I work predominantly with server-side, interpreted languages, and I just don't run into the bottlenecks and performance problems that Big O-knowledge seems to mitigate. Maybe I've been lucky.

If you don't know how long it really should take, then you just don't know whether you have a performance problem. Dunning-Kruger applied to software engineering. Too typical a situation in the industry.


I'm an engineer. I'll worry about the time my code takes once it's noticeably slow. If everything I write runs in a fraction of a second, why would I waste my brain optimizing it to make it faster, when all I'd achieve is introducing tricky bugs?

Coding is an engineering discipline: it's all about tradeoffs. Bugs and reliability. Performance. I know to focus on what is important to get a working product.


>I'll worry about the time my code takes once it's noticeably slow.

Dunning-Kruger is exactly about the ability to notice.

>If everything I write runs in a fraction of a second,

There are different fractions of a second out there. Some are slow, some aren't. One either knows which are which or one doesn't.

>why would I waste my brain optimizing it to make it faster, when all I'd achieve is introduce tricky bugs.

That is one of the main points where effective implementation knowledge [incl. algorithms] comes into play. It's the typical "blackmail" in the industry: if you go for performance, you'll get bugs.

>Coding is an engineering discipline: it's all about tradeoffs. Bugs and reliability. Performance. I know to focus on what is important to get a working product.

Man, you sound like a typical PM when the schedule is pressing and things start getting pushed out of the release, and a pile of slow sh!t gets shipped as a result.


Rather, I think his point aligns with points Jonathan Blow makes in this presentation: http://the-witness.net/news/2011/06/how-to-program-independe...

One of his points - for which he uses an id game as an example - is that sometimes the naive implementation is the best implementation. His example has to do with loading levels at startup. The code was, from an algorithmic perspective, inefficient. It was written that way to optimize for developer time. His point was that in this circumstance, that was the correct thing to do, because this was not a performance-critical part of the code, and naive implementations are easier to maintain.


>I'll worry about the time my code takes once it's noticeably slow.

Dunning-Kruger is exactly about the ability to notice.

It seems like you are saying that every programmer ought to go out of their way to become fluent in algorithms, because without fluency they will never notice when their code is running slowly.

If so, I disagree. It requires no skill to notice whether code is running slowly--you needn't look further than my techno-skeptical dad muttering, "Jeez, this thing takes forever to boot up" as he regards his desktop with disappointment. I think this is what the grandparent commenter was talking about: if I'm not fluent in algorithms, but my dad doesn't think it's slow, then where's the problem?

"Premature optimization is the root of all evil."


>"Jeez, this thing takes forever to boot up"

this "forever" consists of myriad of "a fraction of a second"-s, with each fraction individually not being "notieably slow" and thus not "prematurely optimized".

And if it's not optimized "prematurely" (i.e. written efficiently from the start), then after-the-fact optimization, if it happens at all, will shave only a fraction off the "forever" - thus it frequently doesn't happen at all because of the low projected ROI.


> And if it's not optimized "prematurely" (i.e. written efficiently from the start), then after-the-fact optimization, if it happens at all, will shave only a fraction off the "forever" - thus it frequently doesn't happen at all because of the low projected ROI.

My personal experience completely contradicts what you're saying.

Despite the cushy comforts of server-side scripting languages, I have experienced the occasional performance or scalability problem, and in every single case, I wrote inefficient code on the first pass, discovered unacceptable slowness at a later date, then found the bottleneck and optimized it.

Perhaps I am a brilliant super-coder (unlikely), but I have never ignored a performance problem because I didn't think I could improve it enough to be worth the effort.


Man, I'm talking about complex systems (2 GB+ of source code, for example) that are already well past several low-hanging-fruit passes, component- and system-wise. Their performance after that reflects the overall average emphasis on performance and the skills of the developers through the life of the project. And no amount of after-the-fact optimization would turn Windows into Linux performance-wise.


I agree.

2011 Programmer Cost/Benefit reality:

1) The chance my app will be so heavily used that it requires deep knowledge of many algorithms is pretty low.

2) The chance my app has serious marketing or business model problems is pretty high.

3) The extra cost of using a scalable platform (PaaS) like Appengine, Heroku or plain EC2 is less than both the cost of my time to learn or relearn all of those algorithms and the cost of setting up my own scalable platform.

I see far more business model/marketing problems than scalability/performance ones.


I think you're correct that for certain types of apps, this sort of knowledge is pretty useless.

However, once the number of things you're dealing with gets up to, say, the millions, algorithmic complexity can really bite you in the ass and no matter how much hardware you throw at it (rented or otherwise), if the work that a single node needs to do is unreasonably complex, your whole app will be slow for every user, even if you have enough capacity to handle many simultaneous users.
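
A back-of-the-envelope sketch of that gap (Python, numbers purely illustrative):

    n = 1000000
    ops_quadratic = n * n   # 10**12 steps: hours even at 10**9 simple ops/sec
    ops_nlogn = n * 20      # n * log2(n), and log2(10**6) ~ 20: ~2*10**7 steps, well under a second
    print(ops_quadratic, ops_nlogn)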

In the mobile space, you can think of algorithmic complexity being a proxy for battery life - if you can make it cheaper to compute, you do less overall work, and the battery lasts longer.


I think you're correct as well.

I'm just saying most paid programmers are not dealing with the problem of how to handle processes that involve millions of users, but they do have problems with marketing their app or monetising it. Therefore spending their weekend learning something like Dijkstra's Algorithm may not be the best use of their time, but finding ways to better understand the needs of their paying users probably would be time well spent.


> I don't seem to need this information to do what I do

Are you sure about that? Have you never tried to do something, found it was taking forever, and gave up? Maybe you concluded it was impossible, or you found some way to approximate what you needed.

I am another self-taught programmer, and believe me you are going to be kicking yourself once you do learn a few good algorithms and data structures. Your favorite accomplishments -- intricate beasts that you felt you had tamed with much sweat and effort -- are going to turn into five or six lines. Or maybe two or three small, easy to understand programs. Impossibly large amounts of data are going to look easily tameable.

> How does one achieve this?

Not so hard - read a good book. There are lots.


Related: What books or other resources would HNers recommend for someone with limited algorithm and data structure experience who wants to beef up via self-study?

I'm specifically interested in books/other resources that lend themselves to self-study. It's easy enough to look up what MIT is using for their Intro to Algorithms course, but it's harder to gauge if a book or other resource is suitable for usage outside of a classroom setting. Bonus points if it has a practical focus so that I can quickly add the gleaned knowledge to my real-world toolkit.


The standard reference (in the "that's what MIT uses" sense) is Introduction to Algorithms[1], as you probably know. It is a wonderful book, and I have a copy that is wearing out because I use it so much.

As for "outside of a classroom setting", the best book I can recommend is "The Algorithm Design Manual" [2]. It is worth its weight in gold. Very user friendly, as I never though an algorithms book could be. It contains dense material, but at the same time reading it feels like reading a good fiction book. I first heard of it while reading Steve Yegge's post "get that job at google"[3], bought a copy and, wow, what an amazing book.

[1] http://mitpress.mit.edu/algorithms/ [2] http://www.algorist.com/ [3] http://steve-yegge.blogspot.com/2008/03/get-that-job-at-goog...


For anyone trying to buy "The Algorithm Design Manual" (or at least add it to their Amazon wish-list with a dreamy sigh), it seems like the "correct" version is here:

http://www.amazon.com/Algorithm-Design-Manual-Steven-Skiena/...

I was a bit flummoxed at first because algorist.com doesn't tell you how to buy the thing, and the Kindle edition appears to be out of date (based on the first edition, obsoleted by the second edition from 2010).


Sorry about that.

I should have linked to the Amazon page in the first place (who am I kidding?). You linked to the correct version.


Another good option is Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani. A draft is available at http://www.cs.berkeley.edu/~vazirani/algorithms.html.

I find this book readable and interesting; it does tie the algorithms to practical applications at a high level. It is language-agnostic, so it should be a good fit for almost anybody.


http://www.amazon.com/Algorithms-Nutshell-OReilly-George-Hei...

The book was written as a companion to Introduction to Algorithms by CLRS and is comparatively lighter on theory, but still a handy desk reference.

disclaimer: it was written by my undergraduate advisors


I have found "Foundations of Computer Science" by Aho and Ullman approachable. The book is available online at: http://infolab.stanford.edu/~ullman/focs.html


Just read the lecture notes for a CS course that covers this already. It will almost always be a course that also covers the sorting algorithms and basic data structures (what is a linked list, tree, graph, etc.). It won't take that long.

As for coming up with the Big-O complexity for a routine on the spot, most job interviews will throw something at you where it should be intuitive if you know the canonical algorithms. I have yet to read about someone having to solve a recurrence relation on the spot (everyone's already forgotten the cookbook approach they learned in school, if they ever learned it at all). It's not a lot to remember, so no big deal if the only thing you use it for is to impress someone.

If you want true "Who gives a shit?" material from academia, try Automata Theory, or whatever the course where you learn about the pumping lemma is called. I've never read about that coming up in an interview.


How about the obvious thing, which is to look at the curriculum of undergrad and grad level algorithms courses, pick up some textbooks, and learn it on your own? Flashcards won't help, because it's not a problem of memorization. You need to own the knowledge that lookup in a balanced binary tree is a logarithmic operation.


On like the very first day of algorithms class there was a table with a bunch of data structures down the side and insert, lookup, delete across the table with a bunch of n's and lg n's in the cells. There was also a table very much like it on the first exam, except the cells were empty.


The sad thing is, I bet a lot of people memorized those instead of learning enough about the data structures to be able to fill out the table on the fly.
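
For reference, the table being described looks roughly like this (typical average-case bounds for value lookup; the exact cells depend on the variant):

    structure        insert     lookup     delete
    unsorted array   O(1)*      O(n)       O(n)       (* amortized append)
    sorted array     O(n)       O(lg n)    O(n)
    linked list      O(1)       O(n)       O(n)       (O(1) given a node reference)
    balanced BST     O(lg n)    O(lg n)    O(lg n)
    hash table       O(1)       O(1)       O(1)       (expected; O(n) worst case)
    binary heap      O(lg n)    O(n)       O(lg n)    (O(1) to peek the min/max)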


For your next side project, read CLRS and force yourself to complete a reasonable number of the problems (say every other problem, or every nth problem).

Then I think it will stick.

If you really don't need this info, then don't waste your time. To work at a place like Palantir, you do need this info.


It is unfortunate that instead of turning into a discussion about how to excel in an algorithms interview (which is what the original post is about) this thread has wandered into whether it is useful to even learn algorithms.

What a waste!


Maybe 50% of the replies to my OP involve the importance of algorithmic fluency; the rest of the posts in this sub-thread are all delightful suggestions about which books to read and how to fit algorithms into one's learning habits even if they weren't already there.

And what is wrong with one sub-thread dipping into an utterly related topic? Are you suggesting that no discussion should ever shift from one aspect of a topic to another? Or are you suggesting that every programmer, anywhere, ever, should obviously learn algorithms until they are completely fluent? Both of those are baffling to me, as the former would be a completely unstated expectation in the entirety of internet discussion forums, and the latter is what we are discussing in this sub-thread!

This is one of the most interesting threads I have ever read on Hacker News. It contains a fantastic breadth of perspectives and has remained admirably civil and candid. I've had 2 people e-mail me just based on this thread, and I've purchased two books just based on this thread. I take umbrage that you would call it a waste.


The collections data structures are more or less designed to achieve certain performance characteristics. So knowing those performance characteristics means knowing what those data structures were designed to do. This seems like a pretty critical part of familiarity to me.


You could do some problems at TopCoder or some ACM problems. They require knowledge of algorithms and data structures, so it's a good way to learn them.


This rings true. It illustrates how essential computational complexity is--you're totally fucked if you have no clue--yet how rarely you actually bump into the question of complexity in real life.

I'm self-taught and I've read and learned about computational complexity several times from various CS books and later Internet since I was maybe fifteen years old. I know how O(1), O(lg n), O(n lg n), O(n), O(n^x), and O(x^n) scale and which kind of algorithms roughly belong to which order of complexity.

However, keyword being several times.

I so rarely bump into these that I don't actively remember if heapsort is lg n or n lg n, or how the best-case, average-case, and worst-case complexities differ exactly between heapsort, quicksort, and mergesort. (I do remember that mergesort had a good worst-case behaviour in comparison to the others, at least quicksort, but I'd really have to check the bounds separately.) To mitigate this, I've guiltily re-learned them several times--I like reading Knuth or other CS books per se, just for fun--but like anything you don't need you do have a hard time remembering.

However, I do recognize them all being lg-something, and looking up the various bounds is just a Google search away. Further, I do recognize the complexity orders in the sense that I might prototype something and initially stash items into a linked list but make a mental note that accessing the list can potentially become a bottleneck because it's O(n), so I shall reserve some mind power to change it into some O(lg n) tree or ordered array at some point.

But at the other end the truth is that you just have to have some idea about complexity. You generally don't need more than that, and when you do you can just re-study that portion of the subject in detail from a huge number of web pages.

For example, it's useful to recognize the obvious bottlenecks of O(n^x) and O(x^n) but if you do, you don't bump into them anymore. You might bump into O(n) bottlenecks because you left them there yourself, probably on purpose, but since you recognize what linear complexity can do you will generally just change the relevant data structures into O(lg something) and be done with it. And if O(lg something) code becomes a bottleneck, you usually have stupid access patterns elsewhere in the code; I don't remember ever seeing a binary tree alone having become the real bottleneck in any program with the rest of the code being optimal.

What's interesting is that I more often bump into problems with the physical representation of the data structure than with computational complexity. A big naive hash table can cause lots of cache misses unless you deliberately work to align your items within as few cachelines as possible. The difference between accessing five cachelines and five hundred can be dramatic. I know this kind of optimization doesn't generally scale towards n⇒∞, but it can have a big effect on real-world programs.
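
(The linked-list-versus-something-better point above is easy to check empirically; a quick Python sketch, purely illustrative:)

    import random, timeit

    n = 100000
    items = list(range(n))
    members = set(items)
    probe = random.randrange(n)

    # O(n) linear scan vs O(1) expected hash lookup
    print(timeit.timeit(lambda: probe in items, number=100))
    print(timeit.timeit(lambda: probe in members, number=100))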


Disclaimer: I work at Palantir Technologies. These are my own opinions and not my company's.

The Palantir post is great for how to handle yourself when you are already there. Steve Yegge's "Get That Job at Google" is a great how-to-really-prepare piece. http://steve-yegge.blogspot.com/2008/03/get-that-job-at-goog...

A couple other suggestions:

* Find sample questions that similar companies use. Work through them. Discover what you don't know. Find if you're weak on dynamic programming, graph algorithms, algorithm analysis, etc. Then get strong on those. The less you want to do sample problems, the more important they are for you.

* When practicing interview questions, whiteboard at least some. Talk through them. Really go through a question in every detail.

* Before you interview at your dream company, interview elsewhere with similar-difficulty questions. In my opinion, it should be another company that you'd like to work at but don't covet. If that doesn't work for you, a technical or even a non-technical friend could play the role of interviewer (you'll have to give the non-technical friend questions!). Nerves have been a giant issue for me in interviews - and this has significantly helped.

* Calm down as much as you can and forgive yourself when you make errors. The questions at good companies are calibrated to have non-obvious solutions; interviewers expect you to struggle a bit. Candidates who freeze up and attach their minds to one (usually wrong-track) solution fail interviews. I've done it. If you've practiced with the first three bullet points, this should be less likely.


You mean the anti-Wikileaks Palantir?



For me, this blog post represents a step backwards. They open up saying that the 1-hour interview provides absolutely no indication of how well a prospective candidate will perform. Hopefully most people will agree here. Then they go ahead and state that their staple interview will be an on-the-spot problem-solving screening, and then list the steps they expect the candidate to take in solving said problem. What does this say about their own problem solving skills? I will leave this up to the reader.

Let me throw an alternative out. Why don't you give a prospective candidate a written test/problem and give them as much time as they want, using methods and environments they are comfortable with, and have them return to you a solution of their choosing? Wouldn't this best capture how a person breaks down a problem, solves it, and finally presents a solution? Leave the onsite to better understand someone's personality, likableness, and workability. Maybe even ask them to walk you through how they came up with the solution and how they worked the problem. The options here are plentiful.


Thank you! I came here to post this. I was really disappointed when an alternative wasn't presented after they declared algorithm quizzes useless.

I'm all for code walk-throughs and explanations. You can't bullshit your problem solving process, but you can memorize (or forget) algorithm trivia.

For what it's worth I've failed and succeeded at these kinds of algorithms on the whiteboard interviews. Even after nailing it I felt uncomfortable and disrespected. I think there is a real problem with continuing to do these interviews because it's the cool thing to do, and these smart people don't feel like turning their problem solving skills towards the hiring problem.


That does seem to be an interview technique used at some companies. It comes with its own set of problems, though.

What if someone asks a friend for help or pays someone to solve the problem in its entirety? If your problems aren't unique enough, they could also google the answers.

With that said, I actually prefer your approach. Its pitfalls have to be weighed carefully; hiring someone who knew enough to fake it could be a costly setback for a small company. However, it does remove some unnecessary pressure from the interview process.


That would only be a risk if we were talking about a typical high school multiple-choice test. I would think that a cheater would be filtered out by this point in their career (this point being applying to a high-tech company). However, collaborating on a solution is not necessarily a bad thing. I think the goal here is to find candidates who can deliver unique, valuable, and insightful problem solving skills. Bottling that process up into predetermined steps or a 1-hour drill doesn't serve your goals very well. Who knows, maybe these guys have a recipe that works for them. From my own experience, I would disagree.


You'd be surprised at what people will do to game an interview. I've had people give me the correct answer to a question I didn't ask (but from our list of questions) - clearly having gotten it from someone else who had interviewed earlier.

Note that algorithms interviews are just one part of a larger inventory that one takes of a candidate, one dimension. Take-home and/or whiteboard coding can be a part of that process as well, but each has its downsides.

What we look for, in particular, is methods that have a very low false-positive rate. In-person algorithm interviews, where the point is not the "right" answer but hearing how someone thinks, are in that category. You can't really fake it.


Apple does something very similar (or at least used to). They give a difficult algorithmic coding problem and a few days to complete it. Then they grill you on it at the on site interview. There is no way you could fake explaining it - you have to really understand what you did.


One thing I noticed in recent interviews is that the problems often test whether the candidate can think to use two structures or algorithms to solve a problem. Early on, I made the mistake of thinking of interview questions more as homework problems from college, where there was often only one type of data structure or algorithm involved in a problem. That's good when teaching a single concept, but real-world problems are more complex. My advice is to find problems that use multiple structures or concepts together and study and code those.

Also, learn to code the basics so well that you can do them by hand in a basic text editor or on a whiteboard. Get so good at it that you can write code that has a very good chance of running if it were actually compiled/run. I found that coding on a whiteboard is very different from coding in an IDE or text editor. When my first interview came up, I struggled to write down with a dry-erase marker things that would have taken me seconds on-screen.

One strategy that I liked to see an employer use was homework. That might make some job seekers cringe, but it gave me a chance to write real code like I would for a job. It was also a filter for how badly I wanted a job. There were jobs I didn't complete the homework for just because the location or employer wasn't something I was genuinely interested in.

I did homework for the position I landed, and they did a code review as part of my interview. It went well: I got to see how my future co-workers would handle giving feedback, and it gave me some comfort that this employer actually cared enough to do code reviews.


The interview process at this point, at least in Silicon Valley, is broken.

There's an arms race between interview "gamers" who memorize the answers to every single question out there, and interviewers who are asking increasingly ridiculous questions. It's naive to think these days that not knowing the answer to a common question but figuring it out on the spot will work. It won't. If 9 candidates know the answer right away, and you don't, but you figure it out, do you honestly think the interviewer will care? I've talked to plenty of interviewers at my company and that's the unwritten rule.

For example, over the phone, one of my friends was asked to design and give an algorithm to solve a maze. It took 10+ mins just to understand exactly what the interviewer wanted, and at that point, time was up, the interviewer was frustrated, and my friend didn't get the job, even though he's better than me, technically.

The only way to interview these days is to know everything.


It was much cooler to be jaded about the interview process last year.

But in all seriousness, I'm sure there are companies like the one you described and there are many that aren't that way. I'm certain that your blanket characterization of 'the interview process, at least in Silicon Valley' being broken is incorrect.

You don't need to come up with weirder and harder questions. You just want to give them something in a form that they haven't seen before. It makes them listen and think, which is what you're looking for.

I always look at an interview to be more like a code review than a unit test: it's not about getting the right answer, it's how you get there that tells me what kind of an engineer you are. In fact, if you get the answer too quickly or jump to it, even if I don't think you're cheating, I need to ask another question or make you explain yourself in depth so I can probe how you got there.


Frankly, I find this type of interview insulting. Let's face it: unless you work with "algorithms" on a daily basis, I'm willing to bet that most who don't cannot regurgitate a red-black tree at the drop of a hat. I work in this field off and on as an embedded developer, and I'm always going to various texts on the subject to align my thinking with the code I'm writing (C++ in my case). It's one thing to talk through these scenarios in an interview, and yet another to judge someone's ability to solve problems by how well they come up with a solution to an interview question. I give merit to the "talking through", and this is what I do. I find that if a candidate can carry on an intelligent conversation about a particular problem domain, they're hirable material, and in 15 years of running my own company I've had only 2 out of 43 people leave.


I don't think anybody's asking you to write a red-black tree off the top of your head, but it would be nice if everybody knew what a balanced tree was and why they might be useful.


It would be nice, but I bet the former question is asked far more frequently than the latter.


>Given a whiteboard and one hour,

We, hiring folks, aren't limited to just that anymore. There's github, linkedin, hn/reddit posts, random google stalking, etc.

I can find out a lot about you: your attitudes, opinions, ability to communicate, style, etc. That, or, for whatever reason (paranoid, on the lam, not passionate), you have zero online presence.

The face to face interview is mostly to confirm or refute what I've already learned about you.


I am a strong believer that hackers need a portfolio of work for potential employers.

I am not convinced that such a portfolio must exist in an online and easily discoverable package for potential employers.

Having a strong and trusted personal network is probably a stronger signaling system.

Of course, having both a meaningful online presence and a strong referral network is best.


What about people who don't use their real name online? Do people really give recruiters their reddit username?


More than one application I've handled recently has included links to HN and LinkedIn profiles. Most list a github account.


YC uses HN usernames internally.


Speaking of the value of online anonymity :)


This is a great, meaty blog post. I really like the first point about always coming up with a brute-force solution so that you at least have something, and so that you have a chance to think about the problem a little. I've seen candidates skip the brute-force approach too many times so that they can immediately work on a more optimal solution. At the end, they often end up without any solution at all. I guess Knuth's advice about premature optimization applies to interviews as well.

There's also a good Quora page that talks about the (complementary) coding side of algorithm questions: http://www.quora.com/What-is-needed-to-write-flawless-code-d...
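
To make the brute-force-first point concrete, here's how it tends to play out on a classic question (two-sum; a hypothetical example, not from the post):

    def two_sum_brute(nums, target):
        # O(n^2): the "at least you have something" version
        for i in range(len(nums)):
            for j in range(i + 1, len(nums)):
                if nums[i] + nums[j] == target:
                    return i, j
        return None

    def two_sum_fast(nums, target):
        # O(n): the refinement you talk your way toward
        seen = {}  # value -> index
        for j, x in enumerate(nums):
            if target - x in seen:
                return seen[target - x], j
            seen[x] = j
        return None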


... or you could just forget about O(n) questions in interviews. I know those got really popular with Google, but as an R&D software developer, I probably had to worry about complexity maybe about 0.1% of my time.

How clean is your code? Do you have good coding habits? Do you get lost inside complex data structures? Do you understand concurrency issues?


"as an R&D software developer, I probably had to worry about complexity maybe about 0.1% of my time."

Yes, but does it scale? If you deal with 10x, or 100x, or 1000x the data, that % will grow. Fast.


Sign up at topcoder.com (https://www.topcoder.com/reg/) and do a BFS (Breadth-First Search) of their Algorithm SRMs (Single Round Matches) (http://apps.topcoder.com/wiki/display/tc/Algorithm+Overview).

Oh and if you want to brush up on your Algorithm Basics (http://community.topcoder.com/tc?module=Static&d1=tutori...).
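
(And since BFS itself shows up constantly in SRMs, a minimal sketch of the real thing, in Python:)

    from collections import deque

    def bfs(graph, start):
        # shortest hop counts from start in an unweighted graph,
        # where graph maps node -> iterable of neighbors
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph.get(node, ()):
                if neighbor not in dist:
                    dist[neighbor] = dist[node] + 1
                    queue.append(neighbor)
        return dist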


I'm seeing a lot of great discussion on this thread, but I'd like to point out that you should NEVER rely on memorization to know the asymptotic bounds of operations on data structures.

You should understand how that data structure works and then derive the running time from that.

People seem to know that hash sets have O(1) lookup, trees O(log n), and so forth. In reality, the ones who really know have an in-depth understanding of the underlying implementation and are re-deriving their answers on the spot with simple mental calculations.
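
For example, the tree bound falls out of a single observation: each comparison in a balanced tree discards half the remaining nodes. A sketch of the mental calculation (Python, illustrative):

    import math

    def lookup_steps(n):
        # halve the candidate set until one node remains
        steps = 0
        while n > 1:
            n //= 2
            steps += 1
        return steps

    # ~20 comparisons for a million items, since 2**20 ~ 10**6
    print(lookup_steps(10**6), math.ceil(math.log(10**6, 2)))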


You must hire people for two things to win big in today's world: productivity and analytical skills. If you are searching for people with specific factual knowledge, sure, that is important, but often it leads only to mediocre or average results.

If you know how to do a thing beforehand, that helps only in solving that kind of problem specifically. And that, too, only if the person is productive enough to do it in time. Any new problem or a change in paradigm of thought will require you to undergo the same regimen of work as it would for a newbie. They don't call learning a never-ending process for nothing.

The biggest winning point in today's world is not knowing something, but discovering things quickly and acting on them in time. If you are not hiring people for this, you will end up with a workforce that looks good but doesn't necessarily translate into doing good.

This is the biggest problem with education systems around the world currently. They impart facts and leave the person just there. It is then the individual's responsibility to take that forward.

The most brilliant folks around me who have done big things have hardly been algorithm experts. In fact, if you are one, you are not likely to take big risks. You are more than happy with the addiction to that monthly salary, and with what helps you pay off those college loans.

The most successful around me are the ones who do a great thing, think about the next step, research it, work on it with full productivity, deliver... next step, and the process goes on.

If you hire people for factual knowledge you may hire the best, but you are sure to miss out on the exceptional. This is probably the reason why big companies fail to come up with game-changers.

You get folks who are great at the regular day-to-day operations. But anything else, and you see the problems clearly.


"In fact if you are [an algorithm expert], you are not likely to take big risks. You are more than happy with the addiction towards that monthly salary."

Please elaborate on the evidence for a correlation between knowledge of algorithms and risk-taking.

Seriously....


I have seen most people with degrees, and especially brilliant ones, hesitate to take risks.

Probably that's because they feel that after all the hard work it's better to have some safer alternatives than risks that don't necessarily guarantee returns.

Now don't ask me to come up with a survey/research paper. No one goes out and spends time/money/energy to prove each debating point.

But I agree with you as well: there is no correlation between them. Unfortunately, truth is often stranger than fiction.


I don't agree with Aaron Swartz on very much, but we agree on hiring practices:

http://www.aaronsw.com/weblog/hiring


I think the part that's missing from the discussion here is: "First: make sure you understand the problem". Sure, for strictly algorithmic questions, where there is one mathematically demonstrable optimal answer (or at least a way to compare solutions), this may seem less important, but for more open-ended questions (e.g., "How many barbers are there in [$LOCATION]?"), clarifying the why is essential, because definitions are important and vary by context. It's also important to ask whether a "good enough" solution is acceptable, or if high optimization is important. (Related: what are the memory/processing/load/runtime constraints, if any?) These are the sorts of questions that, in my mind, demonstrate a candidate's willingness to reflect on his/her own solutions.
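
For instance, a back-of-the-envelope pass at the barber question, where every number is an assumption to be stated out loud and checked with the interviewer:

    population = 1000000       # assumed metro size
    cuts_per_person_year = 8   # a haircut roughly every 6-7 weeks
    cuts_per_barber_day = 10
    working_days = 250

    demand = population * cuts_per_person_year   # 8,000,000 cuts/year
    supply = cuts_per_barber_day * working_days  # 2,500 cuts per barber per year
    print(demand / supply)                       # ~3,200 barbers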


Abstracting away the specific content about algorithms, these suggestions are helpful for any kind of technical oral exam, e.g. qualifying exams in grad school. The more clearly you can communicate your thought process to your interviewers, the more accurately they can assess your work.


For an excellent discussion of the approaches outlined here and problem solving heuristics in general, check out Polya's How to Solve It; it's a classic (http://en.wikipedia.org/wiki/How_to_Solve_It).


Thanks for the link to the article - I appreciate it.


Sincere request to HN readers: if you have ethics, try to avoid working for Palantir. Palantir is an unethical company. They were involved in the HBGary scandal. They tried to smear Glenn Greenwald. They might have the best and hardest interview process and may have the smartest people. But all this technical wizardry is moot when they don't have ethics.

* http://www.salon.com/news/opinion/glenn_greenwald/2011/02/15...

* http://www.reuters.com/article/2011/02/17/idUS12186607112011...


If you're struggling with algorithms, I'd suggest reading the book Programming Pearls (2nd edition). It's a little dated, but not enough to matter too much. The big thing it helps with is what I think of as "algorithmic thinking". The various chapters go through different approaches to solving problems and help you get a handle on why a particular approach might be O(n^2) versus O(log n) and why that's important. It's very helpful stuff.
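
The book's maximum-subarray column is the canonical example: the same problem solved at O(n^3), O(n^2), O(n log n), and finally O(n). Roughly, the final scanning version:

    def max_subarray(xs):
        # track the best sum ending at the current position
        # and the best sum seen anywhere (empty subarray counts as 0)
        best_here = best = 0
        for x in xs:
            best_here = max(0, best_here + x)
            best = max(best, best_here)
        return best

    print(max_subarray([31, -41, 59, 26, -53, 58, 97, -93, -23, 84]))  # 187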


"Error establishing a database connection". heh.


Whoops - yeah, we're working on that problem right now - at the moment it's back up.


Was that an O(n!) query killing the DB? :)


It looks like it should stay up now - the traffic from this is much more than we usually get, but it's doing fine now that the server has more resources available to it.


You sure there couldn't have been an algorithm you could have optimized instead of just throwing more resources at the problem? Your solution seems awfully real world.


Really good advice. I totally agree with the brute-force thing; I usually resort to that first just to put something down. On a related note, Palantir was at the MIT Career Fair recently, and when talking to prospective candidates, they asked them a little brain teaser as a filter on the spot.


I recognize the name because Palantir sponsored one of the Super Happy Dev Houses. A rather insightful talk was given. It seems like you guys do a lot of number crunching, which involves a lot of algorithms work.


Great article about this: http://ycombinator.com/munger.html


Step 1: Send your resume and skills to wikileaks or another whistle-blowing operation.

Step 2: Palantir assists HBGary Federal in the hacking of wikileaks and wikileaks supporters.

Step 3: Get hired when they read through your skills.


Why those quiz interviews suck is simple: anyone can look up the questions and learn them in 2 weeks (going slowly) if they have any kind of basic programming knowledge. So basically the questions are wasting the interviewed person's time.

If the person fails that test, it means they just didn't care much about that job. But I'm pretty sure any skilled interviewer can figure that out through a much quicker process.


I had an interview back in 2009 where the CTO of the startup asked me a question; I answered it, after which he very confidently corrected me with the wrong answer (!)

Needless to say I didn't join them. The startup also went belly up about 6 months after I interviewed there. How do guys like this get funded?


This is a well-written article. You've acknowledged an interview process, and you've provided very good advice for candidates who may encounter a similar one.

Ideally, both hirer and hiree would obtain perfect insight during the interview, and an alignment of skills, personalities and potential could be determined and made obvious to both parties.

We cannot obtain such insight, and yet hiring decisions are still made every day. Some work out, some don't.

As in most things, companies must optimize toward some strategy that produces results consistent with their priorities. If each of your programmers must be able to define - not just implement, but actually recall - breadth-first, depth-first, big-O notation, and the rest, I suppose your strategy will have to include such questions. What better way to determine whether your candidate can memorize definitions than to ask them for them?

But some parts of the common technical interview exist simply because the interviewers themselves aren't aware of any better strategy to obtain the insight that really matters. Is your culture easy-going? Do you need a certain kind of creative influence on your team? Is it really true that your programmers need to program algorithms at all?

My point is really quite simple: to alleviate the feeling that we're passing over good people, we should alter the interview process to give the candidate ample opportunities to show us they're a good fit (and that we're a good fit for them. For the record, I never accept offers from companies that ask me lexical structure or algorithm questions, especially if those questions are posed by peer-level interviewers - it suggests a non-performing competitive rather than cooperative atmosphere). We should identify, if we can, the priorities of our team, and then attempt to find a strategy that is optimized for presenting the candidates we're seeking in a favorable light.

(I'm leaving this comment because you've written a post that is less about improving the recall or application of algorithms, and more about helping a candidate show you they're capable.)



