Most of you have probably seen some of these before. They’re QR Codes, right?
Well, not exactly. Technically only this one is a QR Code.
The rest are different codes that you may never have heard of, like this one, which is called a Data Matrix code.
You’ll see this one in some places, especially in electronics. For example, if I open up an old HP laptop, they’re all over the place; I think I counted about seven of them. You may have even tried scanning one of these with your phone out of curiosity, thinking it was a QR Code, and when it didn’t work you never really thought anything more about it. All of these different types of codes are called either matrix barcodes or just 2D barcodes
and they’re used for all sorts of stuff: holding URLs, storing product information, even allowing you to connect to a Wi-Fi hotspot. Today I’m going to turn you into an expert on all of the most popular ones, and I’m also going to show you how they’re actually encoded, at least for some of them, and it should be pretty interesting.
First, let’s go over a quick introduction to these matrix barcodes. Like the original, regular barcodes that are one-dimensional, matrix or 2D barcodes are designed to be easily and quickly read by computers or machines. One-dimensional barcodes like UPC usually store the information in the different widths of lines and spaces, which translate to different numbers. The advantage of this is that they can be read very easily by a laser just sweeping across, because they’re one-dimensional.
Two-dimensional barcodes, on the other hand, can store way more data than one-dimensional ones, which mostly store numbers; you can tell that just by looking at them. But the methods for storing information in two dimensions can vary wildly. There are some that are pretty widely used by all sorts of different people and companies, such as QR Code and Data Matrix. I’ll get into more detail about them later, but there are others that may really only be used by a single company. For example, MaxiCode,
which was invented and used by UPS. This one’s actually public domain now, but it was basically made to prioritize speed over capacity, which is important if a package is flying across a conveyor belt.
There’s also Aztec code which is apparently used in a lot of countries for train tickets and such,
or the PDF417 code,
which looks reminiscent of the original barcodes.
It can be seen on things like airplane boarding passes, US postal shipping labels, and the back of most, if not all, US IDs.
There are also some more exotic ones such as ShotCode,
which is actually circular, colored codes like JAB Code
or even the High Capacity Colored Barcode,
which uses colored triangles for storing data and this one was actually developed by Microsoft.
Another probably obvious but major difference between 1D and 2D barcodes is that while 1D barcodes can be read with just a simple linear laser, almost all 2D barcodes have to be read with a camera that can see the whole thing at once.
Why don’t we first go over QR Code,
which is probably the most well-known, and it actually stands for Quick Response code. It can easily be recognized by the three big squares in three of the corners only,
not the fourth. These are known as the finder pattern, and they basically just tell the camera the orientation of the code, so it knows which way is up and down, left and right.
There are many different possible sizes of QR Codes, 40 in fact, and each version number tells you the size: the number of squares across for that code is four times the version number plus 17. For example, a Version 1 QR Code is 21×21 and a Version 40 is 177×177.
A Version 40 QR Code can store 4,296 alphanumeric characters at the lowest error correction level. To put that in perspective, about 2-3 of these could store basically the entire script of the average 4-minute video.
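If you want to check the size rule yourself, here is a minimal Python sketch of it. The function name `qr_size` is just something I made up for illustration.

```python
def qr_size(version):
    """Return how many squares wide (and tall) a QR Code of this version is."""
    if not 1 <= version <= 40:
        raise ValueError("QR Code versions run from 1 to 40")
    # Four times the version number, plus 17.
    return 4 * version + 17


for version in (1, 2, 40):
    side = qr_size(version)
    print(f"Version {version}: {side}x{side}")
```

Every step up in version adds four squares to each side, which is why the sizes go 21, 25, 29 and so on.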
Now depending on the size of the QR Code, it may also have more squares in the middle
and this is called the alignment pattern. And there’s also a variant called Micro QR,
which has one finder square, and that can go down to 11×11 squares.
One really interesting feature of QR Codes, and actually of a lot of 2D barcodes, is the built-in error correction, which can reconstruct a certain amount of missing data. Depending on the error correction level, it can rebuild data that is missing from any part of the code, up to a certain limit, and still read it. These different levels are low, medium, quartile, and high, which can recover roughly 7%, 15%, 25%, and 30% of the data, respectively.
Of course, higher levels of error correction do have a trade-off in capacity, but this is actually the reason you may see some QR Codes that have really fancy designs like cool symbols in the middle or whatever. And you might be wondering, how is that in there if it’s covering up half the code? Well, that’s because the error correction still allows the reader to read all the data even if part of it’s just covered up.
Now, you might be wondering how QR Codes are actually encoded and the data’s put in there, and I am going to explain that but you better buckle up because it does get a little bit complicated. So in any QR Code, you’re always going to see the three squares,
which are the finder patterns in three of the corners. Each one is going to be a black square surrounded by a white border, then a black border, and then another white border.
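To make that layering concrete, here is a small Python sketch that draws one finder pattern as text. It assumes the usual dimensions (a 3×3 black core inside a 7×7 outline), and the function name `finder_pattern` is only for illustration. The outermost white border is left out, since it is just blank space around the 7×7 part.

```python
def finder_pattern():
    """Build the 7x7 finder pattern as rows of 1 (black) and 0 (white)."""
    grid = []
    for r in range(7):
        row = []
        for c in range(7):
            # How many rings out from the centre cell this position is.
            ring = max(abs(r - 3), abs(c - 3))
            # Rings 0-1 form the 3x3 black core, ring 2 is the white border,
            # ring 3 is the black outline.
            row.append(0 if ring == 2 else 1)
        grid.append(row)
    return grid


for row in finder_pattern():
    print("".join("#" if cell else "." for cell in row))
```

Printed out, you get a solid block in the middle with a white ring and then a black ring around it, which is the shape a scanner looks for.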
Then this part of the code always contains formatting info,
so this includes the error correction level and the masking pattern, which I’ll get to later.
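For the curious, here is a hedged Python sketch of how those formatting bits are put together, as I understand the QR standard (ISO/IEC 18004). None of these constants come from the explanation above, so treat them as assumptions to check against the spec: two bits for the error correction level, three bits for the mask number, ten extra check bits, and a fixed XOR pattern applied at the end. The names `EC_BITS` and `format_bits` are made up for this example.

```python
# Two-bit codes for the error correction levels (assumed from the standard).
EC_BITS = {"L": 0b01, "M": 0b00, "Q": 0b11, "H": 0b10}
GENERATOR = 0b10100110111     # polynomial used to compute the 10 check bits
XOR_MASK = 0b101010000010010  # fixed pattern so the result is never all zeros


def format_bits(ec_level, mask_id):
    """Return the 15-bit format information as an integer."""
    if not 0 <= mask_id <= 7:
        raise ValueError("mask_id must be between 0 and 7")
    data = (EC_BITS[ec_level] << 3) | mask_id  # 5 bits: level, then mask
    remainder = data << 10
    # Binary long division by GENERATOR; what is left over is the check bits.
    for shift in range(4, -1, -1):
        if remainder & (1 << (shift + 10)):
            remainder ^= GENERATOR << shift
    return ((data << 10) | remainder) ^ XOR_MASK


print(format(format_bits("M", 5), "015b"))
```

The check bits are there so the reader can still work out the level and mask even if a few of these 15 squares are damaged.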
This part is called the timing pattern,
and this is always going to just be an alternating line of black and white squares that extends from the black finder outline. The point of the timing pattern is that it can be used if the code is warped, for example, to help the camera determine the true shape from the perspective.
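Here is a tiny Python sketch of one timing line. It assumes the usual placement, where the line sits on row (or column) index 6 and fills the gap between two finder patterns; that detail and the function name `timing_pattern` are my additions, not something spelled out above.

```python
def timing_pattern(size):
    """Cells of one timing line between two finders: 1 = black, 0 = white."""
    # The gap between the finders runs from index 8 to size - 9.
    # Even positions are black, so the line starts and ends on black.
    return [1 if i % 2 == 0 else 0 for i in range(8, size - 8)]


print(timing_pattern(21))  # a Version 1 code, 21 squares across
```

On the smallest code the line is only five squares long, but it grows with every version, giving the reader a ruler to count rows and columns against.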
And also there’s always this black square
in this particular spot no matter what, but I’m not exactly sure why.
So now that we have all that basic stuff added in, we’re finally ready to add in the data we want to store, well, almost at least. First I’ll just show you how the data is laid out,
basically. So as you probably guessed, each square, which is called a cell, equals one bit, and it can be either a one or a zero. Normally a black square is a one and a white square is a zero, though it could be the reverse. I believe that’s indicated in the encoding info so the camera knows.
The information stored starts at the bottom right in blocks, and then goes up, over, and down, then back up and continues in this zigzag pattern.
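That zigzag can be sketched in Python as a list of cell positions in the order they get filled. This is a simplified version: it walks two-cell-wide columns from the bottom right, and it skips the column holding the vertical timing pattern (index 6, an assumption from the standard), but it ignores the finder squares and formatting areas that a real encoder would also have to step around. The function name `zigzag_order` is hypothetical.

```python
def zigzag_order(size):
    """Return (row, col) positions in zigzag fill order, bottom right first."""
    order = []
    col = size - 1
    upward = True
    while col > 0:
        if col == 6:
            col -= 1  # step over the vertical timing pattern column
        rows = range(size - 1, -1, -1) if upward else range(size)
        for row in rows:
            order.append((row, col))      # right-hand cell of the pair
            order.append((row, col - 1))  # then the left-hand cell
        upward = not upward  # turn around at the edge
        col -= 2             # move over to the next pair of columns
    return order


path = zigzag_order(21)
print(path[:4])
```

The first four positions make up the little 2×2 square in the bottom right corner, and the path then keeps climbing that same pair of columns before turning around.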
At the start, though, at the very bottom right (Enc), you’re going to see a four-square block, which tells the computer the encoding “mode” of the following data. There are several different possibilities for this. It could be numeric, so it’s all numbers; alphanumeric, so uppercase letters, numbers, and a handful of symbols only; byte encoding, which covers basically any ASCII character; or Kanji, for Japanese characters, and more.
The next block (bottom right Len) after the encoding block is the character count indicator. This is going to tell the reader how many characters are in the total message, and therefore it basically tells you how many blocks of data it’s going to have to read.
Now, this part I’m not a hundred percent clear on, because every example I’ve come across shows each character as being one byte or eight bits or eight squares. But apparently, depending on the encoding mode and the code version, this block might be anywhere from 8 to 16 bits. I’m just going to show byte encoding, which is ASCII, where it’s just eight bits, because that’s the easiest example.
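Here is a hedged Python sketch of those first two blocks: the four-bit mode and the character count. The bit patterns and count widths below are what I understand the QR standard to specify; they are not from the examples above, so treat them as assumptions. The names `MODE_INDICATOR`, `COUNT_BITS`, `count_bits` and `header_bits` are made up for illustration.

```python
# Four-bit mode indicators (assumed from the QR standard).
MODE_INDICATOR = {
    "numeric": "0001",
    "alphanumeric": "0010",
    "byte": "0100",
    "kanji": "1000",
}

# Width of the character count, for versions 1-9, 10-26 and 27-40.
COUNT_BITS = {
    "numeric": (10, 12, 14),
    "alphanumeric": (9, 11, 13),
    "byte": (8, 16, 16),
    "kanji": (8, 10, 12),
}


def count_bits(mode, version):
    """How many bits the character count takes for this mode and version."""
    if not 1 <= version <= 40:
        raise ValueError("QR Code versions run from 1 to 40")
    group = 0 if version <= 9 else 1 if version <= 26 else 2
    return COUNT_BITS[mode][group]


def header_bits(mode, version, length):
    """Mode indicator followed by the character count, as a bit string."""
    width = count_bits(mode, version)
    if not 0 <= length < (1 << width):
        raise ValueError("message too long for this mode and version")
    return MODE_INDICATOR[mode] + format(length, f"0{width}b")


print(header_bits("byte", 1, 1))  # one character, byte mode, Version 1
```

So for small codes in byte mode the count really is eight bits, and it only grows to sixteen once you get to Version 10 and up.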
So now we finally are really ready to start adding in the encoded message. Now again, depending on the encoding mode, each little cell in the block is going to be assigned a number like this, 1, 2, 4, 8, 16, 32, 64, 128.
Now why those numbers in particular, you may ask? Well, that’s because using those numbers, you can add any combination of them up to make any number between 0 and 255. If you make all of them black, for example, to trigger a 1 and count all of them, it’s 255. If you make them all white, then none of them count and it’s zero. And again, depending on the combination, you can make any number between those.
And the reason that’s important is because that range covers every character in the ASCII character table if you include the extended table.
So that means you can use one block or eight bits or one byte to refer to any character in the ASCII table, which includes a lot of uppercase and lowercase letters, numbers, and even some symbols. So I’ll do one quick example.
Let’s say you want to encode one capital letter X. So looking at the ASCII table we can see that that is a decimal 88 corresponding to capital X.
So that means that using the table number we have, we’re going to add up 64, 16, and 8 and mark all those as black,
meaning 1, which means just “count these”. And then for a computer, it does this very quickly. It’s very easy. It just adds those all up, matches the total to the ASCII table number, and writes that character down. And then the blocks just continue like that. They go up and over and down and zigzag like that, finally stopping with a 4-bit (another square) end code.
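The capital X example can be sketched in a few lines of Python. The helper names `char_to_cells` and `cells_to_char` are just for illustration, and the cells are listed from the 128 cell down to the 1 cell, which is not necessarily how they are physically arranged in a block.

```python
WEIGHTS = (128, 64, 32, 16, 8, 4, 2, 1)


def char_to_cells(ch):
    """Turn one character into eight cells: 1 = black (count it), 0 = white."""
    code = ord(ch)
    if code > 255:
        raise ValueError("only single-byte characters fit in one block")
    return [1 if code & weight else 0 for weight in WEIGHTS]


def cells_to_char(cells):
    """Add up the weights of the black cells and look up the character."""
    return chr(sum(weight for weight, cell in zip(WEIGHTS, cells) if cell))


cells = char_to_cells("X")
print(cells)                 # [0, 1, 0, 1, 1, 0, 0, 0] -> 64 + 16 + 8 = 88
print(cells_to_char(cells))  # X
```

Reading is just the same trick in reverse, which is why the scanner can do it pretty much instantly.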
After the end code, it’s not actually done because after that is where the error-correcting information starts.
And that again continues in the same eight-bit or block sizes and again, it goes up and down until the very end from right to left.
Now I am absolutely not prepared or qualified to explain how the actual error correction works. It is some serious hardcore math. I had no idea what I was even looking at when I was looking it up. If you want to look it up yourself, it’s called Reed-Solomon Error Correction. It’s used in a lot of different stuff, so you can just look up more information about it yourself. But I will point out that one really awesome thing about this type of error correction is that any part of the data can be missing, and any block of error correction can be used to rebuild any part of the missing data. I mean, it’s pretty crazy to think about. If you have all the error correction missing except one error correction block, and there’s one block of actual data missing, you can replace the one with the other. Actually, I’m not a hundred percent sure if that one-to-one ratio is true, but that’s the basic idea of it.
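On that one-to-one question, the general rule for Reed-Solomon codes, as far as I understand it, is this: a block the reader knows is missing costs one error correction block to rebuild, while a block that is silently wrong costs two, because the reader first has to work out where the damage is. Here is a rough Python sketch of that budget. The function name and the numbers are hypothetical, and real QR readers may hold back a little of the budget as a safety margin.

```python
def can_recover(ec_blocks, missing_known, damaged_unknown):
    """Rough Reed-Solomon budget check.

    missing_known:   blocks the reader knows are unreadable (cost 1 each)
    damaged_unknown: blocks that are wrong without the reader knowing (cost 2 each)
    """
    return missing_known + 2 * damaged_unknown <= ec_blocks


# Hypothetical code with 10 error correction blocks:
print(can_recover(10, missing_known=10, damaged_unknown=0))  # True
print(can_recover(10, missing_known=0, damaged_unknown=5))   # True
print(can_recover(10, missing_known=0, damaged_unknown=6))   # False
```

So the one-to-one ratio holds in the best case, where the reader already knows which blocks are gone, and it drops to one-to-two when it has to find the bad blocks itself.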
Now, there is one more thing about this whole encoding method as if things weren’t complicated enough, and that is the mask pattern.
There are several different possible black and white patterns, which are called “masks”, and these are overlaid over all the data. Depending on what parts of the mask are black and white, the black ones, for example, are going to flip any pixels in the data and make them reversed, while the white parts won’t. Apparently, the purpose of these masks is to break up any possibly confusing parts of the code. For example, any large white spaces, or large black spaces, that might make it hard for the scanner to distinguish how many squares are actually in there, and also to break up patterns that may look like a finder pattern and confuse the scanner. Now you might think, oh my gosh, why would they complicate it so much? What’s the point? Well, remember the scanner and the computer can reverse this and decode it all in an instant. It just reads from the format code what type of mask it’s using, simply flips the pixels back, and then does the decoding like it normally would. It doesn’t take the computer any time at all.
So now you know the very basics of this, but I do want to point out that I gave a very simplified explanation, because there are a lot more complications I wouldn’t want to get into. For example, how the numbers in the blocks are going to change depending on the orientation,
it’s not like all the numbers just flip with the blocks. And also how larger versions of QR Codes are going to have an alignment pattern in there,
which actually changes the shapes of some blocks and how a block cuts off on one end and continues on the other. So it just makes things way more complicated to explain, but again, remember that the computer has no problem decoding this. It’s all just built-in, preset with the rules, and it does it instantly. Try using our free QR Code generator and test it out.
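Going back to the mask pattern for a second, here is a small Python sketch of the flipping idea. It uses a simple checkerboard rule, which I believe is the first of the standard mask patterns, and it flips every cell in a tiny grid rather than only the data cells of a real code. The names `checkerboard_mask` and `apply_mask` are made up for illustration.

```python
def checkerboard_mask(row, col):
    """True where the mask is 'black', meaning the cell gets flipped."""
    return (row + col) % 2 == 0


def apply_mask(cells, mask=checkerboard_mask):
    """Flip every cell that the mask covers; leave the rest alone."""
    return [
        [bit ^ 1 if mask(r, c) else bit for c, bit in enumerate(row)]
        for r, row in enumerate(cells)
    ]


data = [
    [0, 0, 0, 0],  # a big white area that could confuse a scanner
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
masked = apply_mask(data)
for row in masked:
    print(row)

# Applying the same mask a second time undoes it.
print(apply_mask(masked) == data)  # True
```

That last line is the whole trick: flipping the same cells twice gets you back where you started, so the reader only needs to know which mask was used.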
So that was QR Codes but what about Data Matrix?
It looks similar to QR Codes, but actually it’s encoded pretty differently. So I am going to explain that too, though I will keep it a lot shorter than the QR Code explanation. A Data Matrix code is going to look like this,
and you can always tell because it’s going to have a solid black line on the bottom and left, and this is its finder pattern. These Data Matrix codes may be used for a lot of purposes, but you’ll often see them for electronics. So for example, I opened up this old HP laptop
and you can see they’re all over the place in here. They’re on the Wi-Fi chips, on the RAM, all sorts of stuff. And one thing about Data Matrix is it can actually be arranged as a square
or alternatively, a rectangle
and, as you can see in this HP one, you can also stack some Data Matrix codes, this one’s four by four, to store some more information.
The biggest Data Matrix codes can store up to 1,556 bytes or 2,335 alphanumeric characters.
So not as much as the biggest QR Codes; however, a Data Matrix code can actually store more information in the same amount of space than a QR Code. That’s why you typically see Data Matrix codes on very, very small things like little electronics, because it can store about 50 characters in as little as 2-3 square millimeters. Now you may know that most phone cameras these days have a built-in QR Code reader in the camera app, but a lot of times these phones cannot actually scan Data Matrix, so if you’ve ever tried it, it probably didn’t work.
However, I did actually find one app, it’s called Scandit, which is available on both iOS and Android. It’s completely free, it has no ads, and it can scan almost every kind of barcode in use, not just 1D but also 2D barcodes, including all the ones I mentioned. It even has an option called “Any Code” where you basically just point it and it’ll look for any kind of code in there. So that seems to be a pretty good app to have on your phone: if you ever come across a 2D barcode you have never seen before, it might be able to scan it.
As for how information is actually encoded in Data Matrix, again, it’s a lot different from QR Codes, but I’ll give a super simplified explanation. Instead of rectangular blocks, like in QR Code, the Data Matrix has these weird L-shaped code blocks, and also instead of going up and down, it kind of goes in this weird diagonal zig-zag pattern.
In the diagram, the colors mean:
- Green color – Data,
- Yellow color – Padding,
- Red color – Error Correction,
- Purple color – Finder & Timing,
- Orange color – Unused.
And what’s also strange is that for some of these blocks, they actually get partially cut off and then wrap around and get continued on a completely separate edge. So you might have half of a letter on one side, then you have to look at the other side to continue reading it and it’s going to depend on what the shape of the cutoff part is. So kind of complicated, but again, remember the computer knows this set of rules and it has no problem decoding it pretty much instantly.
Like a QR Code, at the end of the message there is going to be an ending block signal,
and then the rest of it also has error-correcting information, sort of like QR Codes, just arranged differently.
Now there are several more 2D barcodes we can go over, but I’m not going to explain how each one of them is encoded; that would take too long. Like I mentioned before, there’s the UPS MaxiCode,
which actually stores data in sort of a hexagonal arrangement, and these are apparently all going to be about one inch in size. I guess it’s just standardized to make it much easier for the camera to read, so it doesn’t have to calculate the size. You’ll also see that these MaxiCodes always have this circular pattern in the middle. This kind of code can only store about 93 characters of data, but they can be chained together to store more. And some of the information encoded is basically structured data, so it’s standardized as being some package information like the postal code, maybe the addressee, the country it’s going to, all sorts of stuff mostly for shipping.
Another code I also mentioned is PDF417,
and this is used pretty commonly in both government and commercial applications. You can usually recognize this because it always has the same pattern for starting and ending on the left and right. Now one interesting advantage of this code, unlike the other 2D barcodes we’ve talked about, is that it doesn’t necessarily have to be scanned with a camera. You can actually do a linear sweep with a laser, and instead of reading it all at once, it goes line by line. This works because it has different code blocks on the left and right, which tell the scanner which row it’s on while it’s scanning. So it goes down and it makes sure, “okay, I read row one, two, three, four, yes, let’s add it all together” and then it knows the complete message.
One interesting code I haven’t mentioned before is called AR Code,
and this is apparently used in a lot of augmented reality applications for things like location tracking of the headset. So if you’ve ever seen, for example, the prototype Valve Software VR headset room,
you may have noticed that it had a bunch of codes on the wall. You might’ve thought they were QR Codes, but I believe these are actually AR codes, so that was one use in the early days. Now obviously VR headsets use other methods these days, but it’s interesting to see that way back then that was the thing they used.
A final pretty common code we can mention is called Aztec code,
which is used for train tickets in Germany. And apparently at least 12 other rail companies use Aztec codes on their tickets, so they can be scanned by the people checking tickets and that sort of thing. I’ve also read that some companies in Canada have used Aztec code on bills sent to customers, I guess so they can more easily pay them or something. I don’t know, but it’s used there too.
Now I’ve only just scratched the surface. There are plenty of other 2D barcodes out there, probably just less common. If you want, you can look up the Wikipedia page for barcodes, which lists pretty much all of them in existence, and look them up yourself. So let me know what you think down in the comments. Has your mind been blown now that you finally know what all these different codes are for? Let me know. And if you liked the blog, be sure to share it and come back for new blog posts every week. If you want to create a QR Code, you can use one of our free tools; it is called QR code generator. So thanks for reading, guys, and I’ll see you in the next one.