Monday, June 23, 2008
At last, finals are over and the semester is complete! The last final was not really an exam at all. It was a presentation. We all had to do a final project, and then give a talk about it. And because this is my favorite class, I decided to overachieve like a bandit.
Instead of doing a final project, I did five final projects. One of them was writing a paper. One of them was modifying a program from the University of Manchester to work properly with Quartus II. Then I made a random number generator in hardware, and some more hardware for 16-bit cyclic redundancy checking. And a simulator for sets of production rules, which can also use IRSIM to do transistor-level simulations. You might say that I went a little nuts.
The reason I did all this is because this class was actually very, very interesting. It was aimed at graduate students (and I think I'm technically enrolled as a grad student for some reason) and the material was only dumbed down slightly! This has some good points and bad points.
Let's get the bad part out of the way first: a lot of people had trouble. For example, a homework project might be assigned a week or two in advance, yet when it came due I was sometimes the only person who had finished it. The class included practical work and reading real research papers, and I guess that can be a little intimidating.
The good part is that it's something useful and interesting that you can really sink your teeth into. Some research papers are remarkably accessible, and the practical work is at least manageable. Let me give an example.
Here's a paper on how we might be making computers in five years. It talks about a neat new way of fabricating electronics called a nanowire crossbar array, and while it's stupendously tiny, it's also not suited to the way we design processors today. Imagine electrical signals as water slowly spreading through wires: you need to make sure the water reaches a thousand faucets at the same time, but you don't control the plumbing. How can you do it? You can't. This is essentially the problem of clock distribution: every conventional processor requires that a really fast signal be sent to all parts of the chip, keeping everything synchronized like a coxswain on a rowboat. You can't do that on a crossbar array. But using asynchronous techniques, you can still make a processor on one. It will just be very different from almost every other processor ever made. Exciting to be on the forefront of a big new thing, isn't it?
When I came here I decided that I would use some of my Copious Free Time to go overboard on something worthwhile. This is it, and I'm glad I put in the extra effort.
Thursday, June 19, 2008
Final exams, part 3: Integrated Circuit Technology
I just finished up the final exam for IC tech. It went okay. On the one hand, I was able to answer most of the questions in a decent way. On the other hand, I forgot my notes on epitaxy! No!
At the beginning of the semester I thought that this class would eat me. It started with a bang, talking about wave functions and Fermi levels and similar quantum crazy stuff. After a lot of very intense studying I managed to get a basic understanding of the early stuff.
And then the learning curve showed its true nature: you start off by banging into a brick wall with the semiconductor physics, and then it's smooth sailing. Later topics were things like oxidation of silicon, or doping. These are complicated subjects in their own right, but much easier to grasp. And to top it off, there aren't really that many formulas that you'll ever have to use, and they're all in the lecture notes, which you can use on exams.
Overall, I think that I now have a much better understanding of how transistors work, how chips are made, and what the big deal is with integrated circuit layout. Nobody learned more than a fraction of the material -- the professor knew a lot, but the information just came too fast and too disjointed to hold on to all of it -- but it was good stuff. I think we were tested not so much on our retention of the material, but on how good we were at understanding the lecture notes well enough to look up the bit of information that we needed. And that's actually a very clever approach to testing.
Three down, one to go.
Tuesday, June 17, 2008
Final exams, part 2: Operating Systems
I've finished the final exam for my Operating Systems class, and I must say, I'm glad to see the end of it. It wasn't a particularly difficult class -- while I probably got my lowest grade in this class, I would describe it as almost comically easy by engineering standards. The problem is that the class was poorly taught and I already knew most of the material in more detail than we were ever given.
I'll give you an example: filesystems. A hard drive can be thought of as a bunch of blocks of data. The drive can fetch any block for you, but jumping to a different block is much slower than reading sequentially within one. And since these blocks are small, we need to work out how to divide up files among them. It's as if you had a blackboard divided into small squares, and you needed to compose several essays on this blackboard. How do you use the little squares?
You could just start in the upper-left corner of the blackboard and use each square in order, and write down your essays. This is called contiguous allocation. It has problems, of course. What if you need to add something to the beginning or end of an essay? Maybe there will be extra space at the beginning or end; no such luck if you want to add a paragraph in the middle, because then you have to recopy a bunch of words from one block to another -- in the worst case, you might even end up rewriting everything on the whole blackboard just to add a single word.
You could give each block a number and write the number of the next block on it. Then when you want to go to the next part of the essay you can just look to see the number of the next block, and go there. And so on: just follow the chain. This is called linked allocation. It makes adding paragraphs and such much easier: just change a few numbers around and add a new block to the chain. The same goes for deleting data. Finding an arbitrary place in your essays can be a little tricky, though: you have to follow a bunch of links, and each jump from block to block takes time.
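If you like code better than blackboards, here's a toy sketch of linked allocation in Matlab (the pretend disk and its contents are made up purely for illustration):

% Pretend the disk is a cell array of little blocks. Under linked allocation,
% each block remembers the number of the next one (0 means end of file).
disk = cell(1, 16);
disk{3}  = struct('next', 7,  'data', 'Once upo');
disk{7}  = struct('next', 12, 'data', 'n a time');
disk{12} = struct('next', 0,  'data', ', the end.');

text = '';
blk = 3;                          % the directory only knows the first block
while blk ~= 0
    text = [text disk{blk}.data]; % read this block...
    blk = disk{blk}.next;         % ...then follow the chain to the next one
end
disp(text)
% To read just the last piece, you still have to walk the whole chain.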
Another strategy is to write the order of blocks on its own block -- a table of contents, if you will. This is called indexed allocation, and it's the most practical system of the lot. It still has issues, and it can be sped up with something called a B+ tree. But we never got to that -- we spent too much time obsessing over minute details of broken implementations of things like linked indexing. I'm not kidding.
And if you ever asked a question, you would get an unrelated answer. Always.
So, as I said before, I'm very happy that this class is no longer a part of my life.
Final exams, part 1: Algorithms
I just finished the first of my final exams. This was for Algorithms, which has a reputation as a nasty class. I didn't find it to be particularly nasty, but I'm sure part of that is because I've had a book on algorithms for years and so already knew about half the material.
The really neat thing about spending this semester abroad is the way grades transfer over, or more accurately the way they do not. As long as we get passing grades, we get credit for the class -- and the classes don't affect our GPA. This is very pleasant. While I'm not going to neglect my education (perish the thought!) I love being able to shrug off the work that doesn't teach you anything.
For example, consider optimal matrix parenthesization: you want to multiply a bunch of matrices together, calculating something like ABCDEF (where A, B, C, ... are matrices). Matrix multiplication is associative, so you can do the multiplications in any order and get the same result, but how much calculating you need to do depends very much on the order. Maybe A(B(C(D(EF)))) takes five minutes and (A(BC))((DE)F) takes five seconds -- this could be a very valuable thing to know ahead of time.
To simply go through and solve this by brute force and ignorance for n matrices means checking exponentially many parenthesizations (they grow like the Catalan numbers, roughly 4^n) -- probably more trouble than it's worth. But there's a key observation you can make: any part of an optimal solution is an optimal solution to a subproblem. So in the example I gave above, A(BC) is going to be the optimal way to parenthesize ABC. The optimal solution for DEF is going to be (DE)F. And at the highest level, we just need to know where to split the multiplication into the product of two subproblems. This leads to a recursive solution, but it repeats a lot of subcalculations, so by remembering the solutions to subproblems and not recomputing them, we can make it dramatically faster.
Now, that's the high-level understanding. It's also sufficient to actually write a program to solve the problem, and it will be pretty simple and zippy. But for various (mostly historical) reasons, we have to learn to do this calculation using a system of filling in these weird triangle-shaped tables of numbers. This is more efficient if you really want to write a fast program, and I suppose it's easier to do on paper -- but it's completely unnecessary from the standpoint of understanding what's going on. Indeed, I don't think our lectures ever really mentioned that the whole point of the tables was just efficient memoization.
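To show what I mean, here's roughly what that memoized recursion looks like in Matlab. This is my own sketch, not anything from the class: dims lists the matrix dimensions, so matrix k is dims(k)-by-dims(k+1).

function cost = chaincost(dims)
% Minimum number of scalar multiplications needed to compute A1*A2*...*An,
% where matrix k is dims(k)-by-dims(k+1). A sketch, not the lecture version.
    n = numel(dims) - 1;
    memo = -ones(n, n);                 % -1 means "haven't solved this yet"
    cost = best(1, n);

    function c = best(i, j)
        if i == j
            c = 0;                      % a single matrix needs no multiplying
            return
        end
        if memo(i, j) >= 0
            c = memo(i, j);             % we already solved this subproblem
            return
        end
        c = inf;
        for k = i:j-1                   % try every place to split the product
            c = min(c, best(i, k) + best(k+1, j) ...
                     + dims(i) * dims(k+1) * dims(j+1));
        end
        memo(i, j) = c;                 % remember it so we never redo the work
    end
end

If I've copied the classic textbook example right, chaincost([30 35 15 5 10 20 25]) comes out to 15125 scalar multiplications. No triangle tables required.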
I never did learn exactly how to use those tables. My grades suffered a little because of it, but I understood the algorithm and managed to keep my morale up by avoiding the triangle tables. I consider that a good bargain.
There is too much emphasis, here and everywhere else I've been, on memorizing how to do some particular kind of problem that will be on the test. If you spend too much time focusing on the details of one particular implementation of one variant of the concept, you lose sight of why the textbook even gave you that example in the first place: as something concrete to generalize from. We had quizzes which ostensibly tested our knowledge of Quicksort, but the thing they tested was very different from what computer scientists mean when they say "Quicksort". Quicksort refers to a whole family of related algorithms, and the thing they all have in common is that they pick a pivot element, put everything smaller on one side and everything larger on the other, then recursively sort each side. This can cover everything from a simple textbook quicksort on up through complicated things like randomized median-of-three partitioning adaptive Quicksort that uses insertion sort for small subarrays and can revert to heapsort on perverse inputs. What we were quizzed on was an example program in the book, and we were expected to memorize everything about it, right down to how it rounded an insignificant number and the order of the arguments to the function. Anything slightly different from this was "wrong", even though this criterion would make a lot of other algorithms textbooks "wrong" as well. Such decisions have to be made in any implementation of the algorithm, but for learning how Quicksort works they are useless details.
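Just to show how small that common core really is, here it is in a few lines of Matlab (a toy version for row vectors; my own sketch, not anybody's official code):

function a = quicksort(a)
% The essence of Quicksort: pick a pivot, partition around it, recurse.
    if numel(a) <= 1
        return                 % zero or one element is already sorted
    end
    pivot = a(1);              % any pivot rule works; this is the simplest
    rest = a(2:end);
    a = [quicksort(rest(rest < pivot)), pivot, quicksort(rest(rest >= pivot))];
end

Everything beyond that skeleton -- pivot choice, in-place partitioning, when to switch to another sort -- is an implementation decision.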
Complaints aside, this was a pretty good class. The material was interesting, and it's a pity we didn't get to more of it. The textbook was friendly, or at least as friendly as a thousand-page clunker can be. The professor spoke English well and seemed like he would probably feel more at home in an American classroom. (For all the reputation Americans have for being dumb and lazy, we engineering students in the US compare very nicely with our Taiwanese counterparts.) The homework assignments were few, although I hear that this is because copying homework solutions is utterly rampant here.
In any case, it's nice to have this done. One down, three to go.
Saturday, June 14, 2008
McCain and habeas corpus (warning: polemic)
The Supreme Court has just ruled 5-4 that detainees in Guantanamo Bay have the right of habeas corpus: they cannot be held indefinitely, without being charged with a crime and without legal representation, in military prisons beyond the reach of the law. Naturally, Obama is cheering about this and John McCain is vehemently against the decision. Forgive me if I get a bit polemical, but this deserves to be replied to paragraph by paragraph in an appropriately angry manner:
Read the article here. It's fairly short, but I'll only be quoting part of it.
I used to have some residual shred of respect for John McCain left over from when he was running as a less conservative opponent to Bush back in 2000, back when he seemed to have some principles.
That respect is gone. McCain is a pitiful excuse for a presidential candidate and an unprincipled shill for the Bush administration.
“We are now going to have the courts flooded with so-called … habeas corpus suits against the government, whether it be about the diet, whether it be about the reading material. And we are going to be bollixed up in a way that is terribly unfortunate because we need to go ahead and adjudicate these cases,” he said at a town hall meeting in New Jersey.

Frivolous lawsuits like "Stop holding me without trial." Frivolous lawsuits like "I've been in here for six years and have yet to be charged with a crime." Frivolous lawsuits like "Please stop torturing me."
McCain said he has worked hard to ensure the U.S. military does not torture prisoners but that the detainees at Guantanamo are still “enemy combatants.”

McCain has repeatedly defended and voted for torture and spouted the administration's "what we do is not torture because we're doing it and we don't torture" circlespeak. He's been lying about this for years now, in the most contemptible ways possible.
“These are people who are not citizens. They do not and never have been given the rights that citizens in this country have,” he said. “Now, my friends, there are some bad people down there. There are some bad people.”

No, you manipulative fascist, they are people. Ever heard of human rights? Hell, haven't you ever read the part of the Declaration of Independence that talks about "inalienable rights" and the "self-evident" idea that all men are created equal?
Contrary to what McCain says, US citizens are not the only people who matter. US citizens are not the only people with rights. The rest of the world is not filled with dirty subhumans who can be stepped on at the whims of fearmongering demagogues.
This paragraph is just so fractally wrong that I'm going to go back and quote parts of it again:
“These are people who are not citizens. They do not and never have been given the rights that citizens in this country have,”

PEOPLE ARE NOT GIVEN RIGHTS! Rights are not yours to give or take; they belong to every person, and they morally follow from personhood -- not from the mercy of the government. This is one of the fundamental principles upon which the United States was founded.
“Now, my friends, there are some bad people down there. There are some bad people.”

And I would sooner have all those alleged bad people go free than have a single innocent person detained indefinitely without charges in your illegal and immoral system of military prisons. There's a damn good reason that "innocent until proven guilty" is so deeply enshrined in our legal system: without it, innocence and guilt cease to matter, and the "law" depends entirely on what the people in power want to do to you. Letting a few guilty people escape from justice is a small price to pay for this fundamental protection, because it is what lets us have rights at all.
And now McCain is desecrating all of our rights by declaring the prisoners guilty and claiming that his opinion on this is above the law. He is a natural authoritarian, a J. Edgar Hoover for the new millennium.
Sunday, June 8, 2008
New EE 185 lesson idea: debugging and testing
In fall semesters I have a TA gig for the introduction to EE class at ISU, EE 185. In the labs we have to teach the freshmen how to program a computer in Matlab and a little bit of C. It's a fun job, but very challenging: most of the students are complete newbies who have trouble with the simplest of programming concepts, and we've got to do whatever we can to help them out.
One huge meta-problem that always gives them grief is debugging. When their programs don't work -- and they usually don't work at first -- they don't know how to go through the program and figure out why it doesn't work.
I came across a fascinating article recently by Chris Okasaki about how program testing can help students learn. The idea is that, when you give out a problem for students to do, you also give them a program to test their solution. They write a solution, and the test code you give them runs their program on 50 or so test cases and complains whenever the solution fails one. This gives instant feedback on their mistakes, so debugging is reduced to squashing a series of clearly identified bugs. He's had good results with this in the classes he teaches, and it seems like a natural fit for EE 185: our students tend to have a lot of glaring bugs precisely because they only test their programs against one or two makeshift cases.
So I'm thinking of doing a lesson where we explicitly walk them through common debugging techniques: printf debugging and stepping through your program with a debugger. Both of these are absolutely basic techniques that would solve about 30% of the headaches I see in that class. So, mostly for my benefit, here's a lesson plan that I'd like to use in some form next semester. The prerequisites for doing this lesson are loops, arrays, and functions. So let's start with a title:
Debugging and testing
When the programs you write don't work, you need to find out what you did wrong and fix it. This is called debugging. There are some techniques that can make debugging a lot easier. Let's look at a bit of example code. It calculates the sum of the numbers from 1 to 10:
n = 10;
sum = 0; i = 0;
while i < n
sum = sum + i;
i = i + 1;
end
disp(sum);
This code doesn't work. The result is supposed to be 1+2+3+4+5+6+7+8+9+10 = 55, but instead we get 45. How do we find the mistake?
Very simple debugging
The first method is to have your code print out the value of some variables while it's running. This can be as easy as taking away some semicolons:
n = 10;
sum = 0; i = 0;
while i < n
sum = sum + i
i = i + 1;
end
disp(sum);
This isn't very helpful so far, but it does show that the code seems to be stopping early. Let's try looking at the progress of another variable, by printing i each time through the loop:
n = 10;
sum = 0; i = 0;
while i < n
i
sum = sum + i;
i = i + 1;
end
disp(sum);
Run this and look at the output. What is the problem? Fix the code and include it in your lab report.
Stepping through your code
Another good way to find mistakes is to go through your program, line by line, and see what it's doing. Figure out what all the variables will be, and check whether this matches your idea of what should happen.
Matlab has a way of doing this for you, called stepping. There's a "Step" command in the Debug menu, which you can also use by pressing F10. Try it out: open an M-file and step through it. Each time you step, Matlab runs one line of code. If you look at the variables in your workspace, they will show the values of the variables right now. This can really help you understand what a program is doing.
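If you prefer typing commands to clicking through menus, Matlab also has command-line equivalents (type help dbstop for the details). Here myprogram.m just stands for whichever M-file you happen to be debugging:

dbstop in myprogram at 3   % pause whenever line 3 of myprogram.m is reached
dbstep                     % run one line, then pause again
dbcont                     % keep going until the next breakpoint
dbquit                     % give up and leave debug mode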
Here is more information on debugging M-files.
The assignment, part 1:
Write a function called splitfirst that will take an array and split it around the first occurrence of a given element. For example, if your function is given the array [3,1,4,1,5,9] and is told to split it around 4, then the function should return this matrix:
> splitfirst([3,1,4,1,5,9], 4)
ans =
3 1 0
1 5 9
Notice that the extra space was filled with a 0. Here's another example. If the function is given the array [1,2,3,4,5,6,7,8] and told to split it around 6, then it should return this matrix:
> splitfirst([1,2,3,4,5,6,7,8], 6)
ans =
1 2 3 4 5
7 8 0 0 0
The 6 has been removed, and everything after it has been moved into the next row. One final requirement: no empty rows! If you try to split at the beginning or end of an array, this should not produce a result with a row of all zeros. Instead, it should do something like this:
> splitfirst([1,2,3,4], 1)
ans =
2 3 4
Focus on getting your code working at all, then worry about this.
Writing the code for this problem is a little tricky, so you've been given a function called "above" in above.m which will help. The above function will take one matrix and put it on top of another, filling the remaining spaces with 0. To produce the first answer, you could do this:
> above([3,1], [1,5,9])
ans =
3 1 0
1 5 9
And for the second answer, you could do this:
> above([1,2,3,4,5], [7,8])
ans =
1 2 3 4 5
7 8 0 0 0
The "above" function also works with arrays that have more than one row. For example:
> above([1, 2; 3, 4], [5,6,7,8])
ans =
1 2 0 0
3 4 0 0
5 6 7 8
Empty rows will be removed, which should be helpful:
> above([], [1,2,3])
ans =
1 2 3
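For the curious, here is a simplified sketch of what an above function could look like inside -- the above.m you've actually been given may be fancier:

function m = above(top, bottom)
% Stack one matrix on top of another, padding the narrower one with zeros
% so the columns line up. Empty inputs are simply dropped.
    if isempty(top),    m = bottom; return, end
    if isempty(bottom), m = top;    return, end
    width = max(size(top, 2), size(bottom, 2));
    top(:, end+1:width) = 0;       % pad extra columns with zeros
    bottom(:, end+1:width) = 0;
    m = [top; bottom];
end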
Look at splitfirst.m and add code to do the splitting. Good luck.
Important note: you can test your code with the "testsplitfirst" function. Check it out:
> testsplitfirst
Testing splitfirst...
Failed: splitfirst([1,2,3,4], 1) should be [2,3,4]
Failed: splitfirst([1,2,3,4], 4) should be [1,2,3]
Passed 4/6 tests.
Your goal is to get it to pass all the tests. Include your test results in your lab report.
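By the way, there is nothing magical about the test function -- it's just a loop over inputs whose answers we already know. Something along these lines would do the job (a simplified sketch, not the actual testsplitfirst.m, just to show the idea):

function testsplitfirst
% Run splitfirst on known inputs and report any answers that don't match.
    cases = { {[3,1,4,1,5,9], 4},       [3 1 0; 1 5 9]
              {[1,2,3,4,5,6,7,8], 6},   [1 2 3 4 5; 7 8 0 0 0]
              {[1,2,3,4], 1},           [2 3 4]
              {[1,2,3,4], 4},           [1 2 3] };
    disp('Testing splitfirst...')
    passed = 0;
    for k = 1:size(cases, 1)
        args = cases{k, 1};
        expected = cases{k, 2};
        result = splitfirst(args{1}, args{2});
        if isequal(result, expected)
            passed = passed + 1;
        else
            fprintf('Failed: splitfirst(%s, %d) should be %s\n', ...
                    mat2str(args{1}), args{2}, mat2str(expected));
        end
    end
    fprintf('Passed %d/%d tests.\n', passed, size(cases, 1));
end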
The assignment, part 2:
If you got this far, then congratulations on completing part 1. Part 2 is to split an array by every occurrence of a given element, not just the first one. For example:
> splitevery([3,1,4,1,5,9], 1)
ans =
3 0
4 0
5 9
Here, every time a 1 appears in the sequence, everything following it is moved down another row. As before, extra space is filled with zeros. Here's another example:
> splitevery([1,2,3,4,5,6,7,8], 4)
ans =
1 2 3 0
5 6 7 8
Since the 4 only appears once, this has exactly the same effect as using splitfirst. Here's an even more extreme case:
> splitevery([1,2,3,4,5], 9)
ans =
1 2 3 4 5
Since 9 doesn't occur anywhere in the array, it doesn't get split.
As in part 1, there should be no rows filled with just zeros. Here's an example that illustrates what's supposed to happen:
> splitevery([5, 4, 3, 5, 5, 5, 5, 1, 2, 5], 5)
ans =
4 3
1 2
Look at the file splitevery.m and add code. You can test it with the "testsplitevery" function:
> testsplitevery
Testing splitevery...
Passed 4 out of 4 tests.
Include your test results in your lab report.
Monday, June 2, 2008
Earthquake!
The ground started to vibrate! Nothing major, just a little alarming. And very, very cool.
I'm glad I wasn't asleep when it happened. I've never felt an earthquake before.