Friday, August 29, 2008

Matlab failure #22994

I'm an engineer and spend a bit of time using engineering tools, some good, some bad. It seems that the bad tools are always the ones that get used the most. I'll put up a post later dealing with the worst tools ever. There are definitely more than enough to fill up a page or two!

Today, I found myself using Matlab at work (actually GNU Octave, but whatever...). I was trying to do what would have been a fairly simple task in any normal programming language (C, Java, Python, etc.), and probably even in some really crappy or hard-to-use languages (I could list some here... but I didn't): comparing two strings. Yep, looking at a few characters to see if they are identical. Just four characters, thirty-two bits.

There's a pretty standard way to compare strings: strcmp, the string compare function. It's pretty simple: it looks at two strings and tells you if they are different from one another. If you take the difference of two identical things, what's the difference? Well, if they are truly identical, there should be zero difference, right? So comparing two identical strings gives you zero. And if they are different, you should get something else (if the function works), right? How about a non-zero difference for two non-identical things? Does that make sense?

MathWorks has indeed reinvented the wheel. They decided to make it square this time. After all, square wheels must be better: they keep cars from rolling away when parked, and cars are parked most of the time, right? So square wheels are clearly better?

Well, not in our world. Most cars have been designed for round wheels. And most programmers, and hundreds of programming languages, think that comparing two strings with no difference between them merits a zero response. But not MathWorks. They like square wheels. They go against the flow. They ignore millions of man-years of programmer experience and design things their own way, including the historical strcmp.

So, to summarize, a sane language (and most non-sane languages) does this:

const char *s1 = "heres a sentence";
const char *s2 = "heres a more different sentence";
const char *s3 = "heres a sentence";

strcmp(s1, s2); // is nonzero, because the strings differ
strcmp(s1, s3); // is zero, because the strings are identical


But Matlab does this:
s1 = 'heres a sentence';
s2 = 'heres a more different sentence';
s3 = 'heres a sentence';

strcmp(s1, s2) % is zero, even though these are different
strcmp(s1, s3) % is nonzero (1), even though these are identical?? Why??


Note: this Matlab code won't work like you think it might. It's simplified to make a point. Real string stuff in Matlab is a bit more complicated due to its totally non-native string implementation, which is even crappier than I have conveyed here...
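
For anyone who wants to see the two conventions side by side, here's a rough sketch in C. The names strings_equal and check_conventions are made up for this post (they aren't part of any library). strings_equal just wraps the standard strcmp so that it answers "are these equal?" with true or false, which is the convention Matlab's strcmp follows (1 for identical strings, 0 otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not a library function: returns true when the two
   strings are identical, mirroring the true/false answer Matlab's strcmp gives. */
bool strings_equal(const char *a, const char *b)
{
    /* C's strcmp returns 0 for identical strings, so test against 0. */
    return strcmp(a, b) == 0;
}

/* Hypothetical self-check that runs both conventions on the strings above. */
void check_conventions(void)
{
    const char *s1 = "heres a sentence";
    const char *s2 = "heres a more different sentence";
    const char *s3 = "heres a sentence";

    assert(strcmp(s1, s2) != 0);     /* C: different strings give nonzero */
    assert(strcmp(s1, s3) == 0);     /* C: identical strings give zero */
    assert(!strings_equal(s1, s2));  /* equality style: different gives false */
    assert(strings_equal(s1, s3));   /* equality style: identical gives true */
}
```

Calling check_conventions() from a main function should run quietly if both conventions behave as described. The only real difference between the two is which case gets the zero.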
