Integer ASCII code: | 0 |
Binary code: | 0000 0000 |
Octal code: | 0 |
Hexadecimal code: | 00 |
Group: | control |
Seq: | ^@ |
C/C++ notation: | \0 or '\0' |
Unicode symbol: ␀, int code: 9216 (html &#9216;) hex code: 2400 (html &#x2400;)
At first, it was used to leave gaps on paper tape for later edits. Later it served a second purpose: padding after a code that might take the device some time to process (e.g. a carriage return or line feed on a printing terminal). Nowadays it is frequently used as a string terminator, which is particularly convenient in the C programming language.
Let's move on to some descriptive facts about this character. The null character (also null terminator or null byte), abbreviated NUL, is a control character with the value zero. Many character sets include it, among them ISO/IEC 646 (or ASCII), the C0 control codes, the Universal Coded Character Set (or Unicode), and EBCDIC. Practically all popular programming languages support the null character.
What was the original meaning of NUL? It acted like a NOP: when sent to a printer or a terminal, it does nothing (although some terminals incorrectly display it as a space). When electromechanical teleprinters were used as computer output devices, one or more null characters were sent at the end of each printed line to give the mechanism time to return to the first printing position on the next line. On punched tape, the character is represented with no holes at all, so a fresh, unpunched tape is initially filled with null characters. Text could therefore often be "inserted" into a reserved run of null characters by punching the new characters into the tape over the nulls.
Nowadays the null character is of primary importance in C and its derivatives, as well as in many data formats. There it serves as a reserved character that marks the end of a string, which is then called a null-terminated string. This way the string can be as long as needed with an overhead of just one byte. The alternative, storing a count, requires either a string length limit of 255 or an overhead of more than one byte (other advantages and disadvantages are described under null-terminated string).
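The trade-off between a terminator and a stored count can be sketched in C. This is a minimal illustration, not a definitive implementation; the helper names `scan_length`, `cstr_storage`, and `counted_storage` are hypothetical and introduced only for this example, and the one-byte length prefix is an assumed layout.

```c
#include <stddef.h>
#include <string.h>

/* Finds the length the way a null-terminated string is read:
   scan forward until the NUL byte is reached. */
size_t scan_length(const char *s) {
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Bytes needed to store s as a null-terminated string:
   the characters plus exactly one terminating NUL byte. */
size_t cstr_storage(const char *s) {
    return strlen(s) + 1;
}

/* Bytes needed to store s with a one-byte length prefix (assumed layout).
   Returns 0 when the string does not fit, because a single byte can only
   count up to 255 characters. */
size_t counted_storage(const char *s) {
    size_t len = strlen(s);
    if (len > 255) {
        return 0;
    }
    return len + 1;
}
```

For a short string such as "hello" both schemes need six bytes, but only the null-terminated form can hold a 300-character string without widening the count field.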
The null character is commonly represented in source code string literals or character constants by the escape sequence \0. In many languages (such as C, with which this article began), this is not a separate escape sequence but an octal escape sequence with a single octal digit 0. As a consequence, \0 must not be followed by any of the digits 0 through 7; otherwise it is treated as the beginning of a longer octal escape sequence. Other escape sequences used in various languages include \000, \x00, \z, and the Unicode representation \u0000. A null character can be placed in a URL with the percent code %00.
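The octal pitfall is easiest to see by comparing literal sizes. The sketch below assumes a C99 or later compiler and an ASCII execution character set; the array and function names are hypothetical, chosen only for this illustration.

```c
#include <stddef.h>

/* "\0" followed directly by the digits 1 and 2 is read as the single
   octal escape \012 (decimal 10, the line feed character in ASCII). */
const char merged[] = "\012";

/* Splitting the literal ends the escape after \0. Adjacent string
   literals are concatenated only after escapes have been processed. */
const char split[] = "\0" "12";

/* An octal escape takes at most three digits, so \000 is a complete
   NUL and the following 1 and 2 are ordinary characters. */
const char padded[] = "\00012";

size_t merged_size(void) { return sizeof merged; } /* one char + terminator */
size_t split_size(void)  { return sizeof split;  } /* NUL, '1', '2' + terminator */
size_t padded_size(void) { return sizeof padded; } /* NUL, '1', '2' + terminator */
```

Here `merged` holds a single line feed, while `split` and `padded` both start with a real null character followed by the characters 1 and 2.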
Being able to represent a null character does not guarantee that the resulting string will be interpreted correctly, because many programs treat the null as the end of the string. The ability to supply one through unchecked user input is therefore a weakness known as null byte injection, and it can lead to security exploits.
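A rough sketch of how such a mismatch arises, and one way to detect it, is shown below. The scenario (a file name validated by its suffix) and the helper names `has_embedded_nul` and `ends_with` are assumptions made for illustration, not a description of any particular program.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if the len-byte buffer contains a NUL byte, meaning that
   C string functions would see a shorter string than the sender supplied. */
bool has_embedded_nul(const char *buf, size_t len) {
    return memchr(buf, '\0', len) != NULL;
}

/* Checks a suffix against the full, length-delimited input, the way a
   length-aware validation layer (hypothetical here) might do it. */
bool ends_with(const char *buf, size_t len, const char *suffix) {
    size_t slen = strlen(suffix);
    return len >= slen && memcmp(buf + len - slen, suffix, slen) == 0;
}
```

With the ten-byte input `a.php`, NUL, `.jpg`, the call `ends_with(input, 10, ".jpg")` succeeds, yet `strlen(input)` reports 5 and any C string API would operate on `a.php`. Rejecting input for which `has_embedded_nul` returns true closes that gap.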
In caret notation the null character is written ^@. On some keyboards, a null character can be entered by holding down Ctrl and pressing @ (which usually also requires holding ⇧ Shift and pressing another key such as 2 or P).
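The reason NUL is written as ^@ is that caret notation conventionally shows each ASCII control code as the character whose code differs from it by 0x40. A small sketch of that mapping follows; the function name `caret_symbol` is hypothetical.

```c
/* Returns the character printed after '^' for an ASCII control code
   (0 through 31, or 127 for DEL), computed by flipping bit 0x40.
   Returns '\0' for codes that have no caret form. */
char caret_symbol(unsigned char code) {
    if (code < 32 || code == 127) {
        return (char)(code ^ 0x40);
    }
    return '\0';
}
```

For code 0 this yields `@` (0x40), and for carriage return (code 13) it yields `M`, matching the familiar ^M.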
In documentation, the null character is sometimes represented as a single-em-width symbol containing the letters "NUL", although this is not common. Unicode has a character with a corresponding glyph for visual representation of the null character, "symbol for null", U+2400 (␀). It is important not to confuse it with the actual null character, U+0000.
In all character sets in use today, the null character has a code point value of zero. In most encodings, this is translated to a single code unit with a zero value. For example, in UTF-8 it is a single zero byte. In Modified UTF-8, however, the null character is encoded as two bytes: 0xC0, 0x80. This allows the byte with the value of zero, which is then not used to encode any character, to serve as a string terminator.
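The two-byte form is what the ordinary UTF-8 two-byte pattern produces when applied to code point zero. The following is a minimal sketch limited to code points below U+0800; the function name `mutf8_encode_small` is hypothetical, and a complete Modified UTF-8 encoder would also have to handle larger code points.

```c
#include <stddef.h>

/* Encodes one code point below U+0800 in Modified UTF-8.
   out must have room for 2 bytes. Returns the number of bytes written,
   or 0 if the code point is outside the range this sketch covers. */
size_t mutf8_encode_small(unsigned int cp, unsigned char *out) {
    if (cp == 0) {
        /* NUL is deliberately written in the two-byte form so that
           no encoded character ever contains a zero byte. */
        out[0] = 0xC0;
        out[1] = 0x80;
        return 2;
    }
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    return 0;
}
```

Because the encoded output never contains a zero byte, a single 0x00 can still be appended as the terminator.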
input value | base | MD5 hash |
---|---|---|
NUL | char | 93B885ADFE0DA089CDF634904FD59F71 |
0 | dec | CFCD208495D565EF66E7DFF9F98764DA |
00000000 | bin | DD4B21E9EF71E1291183A46B913AE6F2 |
0000 0000 | bin | E7E9990CC7667AC4CA751FD99679A592 |
0 | oct | CFCD208495D565EF66E7DFF9F98764DA |
00 | hex | B4B147BC522828731F1A016BFA72C073 |
0x00 | hex | 83D6AD7F1C5045AB8112A8411C8091F2 |

input value | base | SHA-256 hash |
---|---|---|
NUL | char | 6E340B9CFFB37A989CA544E6BB780A2C78901D3FB33738768511A30617AFA01D |
0 | dec | 5FECEB66FFC86F38D952786C6D696C79C2DBC239DD4E91B46729D73A27FB57E9 |
00000000 | bin | 7E071FD9B023ED8F18458A73613A0834F6220BD5CC50357BA3493C6040A9EA8C |
0000 0000 | bin | D6D6FFF452F1A53B96210A59E6E274CD10A32BC22826A90A806385FC2C9B26AC |
0 | oct | 5FECEB66FFC86F38D952786C6D696C79C2DBC239DD4E91B46729D73A27FB57E9 |
00 | hex | F1534392279BDDBF9D43DDE8701CB5BE14B82F76EC6607BF8D6AD557F60F304E |
0x00 | hex | C4DD67368286D02D62BDAA7A775B7594765D5210C9AD20CC3C24148D493353D7 |

input value | base | Base64 encoding |
---|---|---|
NUL | char | AA== |
0 | dec | MA== |
00000000 | bin | MDAwMDAwMDA= |
0000 0000 | bin | MDAwMCAwMDAw |
0 | oct | MA== |
00 | hex | MDA= |
0x00 | hex | MHgwMA== |
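
The values in the last table are consistent with standard Base64 applied to the bytes of each input: the NUL row encodes a single zero byte, and the other rows encode the input text as ASCII characters. As a hedged sketch of how such values can be reproduced, here is a small encoder; the names `B64` and `base64_encode` are hypothetical and this is not necessarily how the table was generated.

```c
#include <stddef.h>

/* Standard Base64 alphabet. */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encodes len bytes from in as Base64 text in out, adding '=' padding.
   out must have room for 4 * ((len + 2) / 3) + 1 chars.
   Returns the number of characters written, not counting the terminator. */
size_t base64_encode(const unsigned char *in, size_t len, char *out) {
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        size_t rem = len - i; /* bytes left, including in[i] */
        unsigned long v = (unsigned long)in[i] << 16;
        if (rem > 1) v |= (unsigned long)in[i + 1] << 8;
        if (rem > 2) v |= (unsigned long)in[i + 2];
        out[o++] = B64[(v >> 18) & 0x3F];
        out[o++] = B64[(v >> 12) & 0x3F];
        out[o++] = rem > 1 ? B64[(v >> 6) & 0x3F] : '=';
        out[o++] = rem > 2 ? B64[v & 0x3F] : '=';
    }
    out[o] = '\0';
    return o;
}
```

Encoding the one-byte buffer `{0x00}` gives `AA==`, while encoding the character `0` (byte 0x30) gives `MA==`, which shows that the table's NUL row refers to the actual null byte rather than the digit zero.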