Integer ASCII code: | 26 |
Binary code: | 0001 1010 |
Octal code: | 32 |
Hexadecimal code: | 1A |
Group: | control |
Seq: | ^Z |
Unicode symbol: ␚, int code: 9242 (html &#9242;) hex code: 241A (html &#x241A;)
The substitute character was originally a transmission control character, sent to indicate that garbled or invalid characters had been received. It has since often been put to other uses where this in-band signaling of errors is unnecessary, particularly where robust methods of error detection and correction are already in place, or where errors are expected to be rare enough that using the character for another purpose is acceptable. It is used in DOS, Windows, and CP/M derivatives to indicate the end of a file, both when typing at the terminal and, sometimes, in text files stored on disk.
The substitute character (␚) is a control character. It is displayed in place of a character that is recognized as invalid or erroneous, or that cannot be represented on the device in use. It is also used as an escape sequence in some programming languages.
This character is encoded by the number 26 (1A hex) in the ASCII character set. Standard keyboards send this code while simultaneously pressing the Ctrl and Z keys. Ctrl+Z, is traditionally often described as ^Z). Unicode encodes this character either, but recommends to use the replacement character (, U+FFFD) instead of representing un-decodable inputs, in cases when the output encoding is compatible with it.
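The numeric forms of the code and the Ctrl+Z mapping can be checked with a short Python sketch. The masking rule used here (a control code is the letter's ASCII code with only its low five bits kept) is the conventional terminal behavior rather than something stated in this text, and the helper name `ctrl` is hypothetical.

```python
SUB = 0x1A  # ASCII code 26


def ctrl(letter: str) -> int:
    """Return the code conventionally sent for Ctrl+<letter> (hypothetical helper)."""
    # Keep only the low five bits of the letter's ASCII code.
    return ord(letter.upper()) & 0x1F


print(format(SUB, "08b"))  # binary form
print(format(SUB, "o"))    # octal form
print(format(SUB, "X"))    # hexadecimal form
print(ctrl("Z") == SUB)    # does Ctrl+Z map to SUB?
print("\u241a", format(ord("\u241a"), "X"))  # visible symbol for SUB and its code point
```

The same mask gives the other ^-notation codes, for example `ctrl("C")` for ^C.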
Historically, under the CP/M 1 and 2 operating systems (and derivatives like MP/M), the end of a file (EOF) had to be marked explicitly. The reason was that the CP/M filesystem could not record the exact file size by itself: files were allocated in records of a fixed size, which usually left some allocated but unused space at the end of each file. Under CP/M this extra space was filled with 1A (hex) characters. The extended CP/M filesystems used by CP/M 3 and higher (and derivatives like Concurrent CP/M, Concurrent DOS, and DOS Plus) supported byte-granular files, so the marker was no longer a physical requirement but a simple convention (particularly for text files) kept to ensure backward compatibility.
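The padding convention can be illustrated with a minimal Python sketch. The record size of 128 bytes is an assumption (the text above says only that records had a fixed size), and the function names `pad_to_records` and `strip_soft_eof` are hypothetical.

```python
RECORD_SIZE = 128  # assumption: the text only says records have "a fixed size"
SUB = b"\x1a"


def pad_to_records(data: bytes, record_size: int = RECORD_SIZE) -> bytes:
    """Fill the unused tail of the last record with SUB bytes."""
    remainder = len(data) % record_size
    if remainder == 0:
        return data  # the data already ends on a record boundary
    return data + SUB * (record_size - remainder)


def strip_soft_eof(stored: bytes) -> bytes:
    """Return the content up to, but not including, the first SUB byte."""
    end = stored.find(SUB)
    return stored if end == -1 else stored[:end]


text = b"HELLO, WORLD\r\n"
stored = pad_to_records(text)
print(len(stored))                      # size on disk, a whole number of records
print(strip_soft_eof(stored) == text)   # the reader recovers the original text
```

A reader that stops at the first SUB byte recovers the text without knowing its exact length, which is the information the early filesystem could not store.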
In CP/M, 86-DOS, MS-DOS, PC DOS, DR-DOS, and their various derivatives, the SUB character was also used to mark the end of a character stream, and therefore to terminate user input in an interactive command-line window. It was frequently used this way to finish console input redirection, for example as started by COPY CON: TYPEDTXT.TXT.
Many text editors and other programs still support this convention, even though marking the end of a file is no longer technically necessary. Some can be configured to insert the character at the end of a file when editing, or at least to handle it correctly in text files. In such cases it is often called a "soft" EOF, because it does not necessarily mark the physical end of the file; it is better treated as a marker meaning "there is no useful data beyond this point". In practice, more data may exist beyond this character, up to the actual end of the data in the file system. The character can therefore be used to hide file content when the file is typed at the console or opened in certain editors. Many file format standards (e.g. PNG or GIF) include the SUB character in their headers for exactly this purpose. Some modern text file formats (e.g. CSV-1203) still recommend adding a trailing EOF character as the last character in the file. Note, however, that typing Control+Z does not embed an EOF character into a file in either MS-DOS or Microsoft Windows, nor do the APIs of those systems use the character to denote the actual end of a file.
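As a rough illustration of the header trick, the sketch below looks at the eight-byte PNG signature. The signature bytes come from the PNG specification rather than from this text, and the helper `visible_before_soft_eof` is a hypothetical stand-in for a tool that honors the "soft" EOF convention.

```python
# Eight-byte PNG file signature (from the PNG specification).
PNG_SIGNATURE = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
SUB = 0x1A


def visible_before_soft_eof(data: bytes) -> bytes:
    """Return what a reader that stops at the first SUB byte would show."""
    end = data.find(bytes([SUB]))
    return data if end == -1 else data[:end]


print(PNG_SIGNATURE.index(SUB))  # position of SUB within the signature
print(visible_before_soft_eof(PNG_SIGNATURE + b"binary image data"))
```

A tool that stops at the soft EOF shows only the short readable prefix and none of the binary data that follows.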
Some programming languages (e.g. Visual Basic) cannot read past a "soft" EOF when using their built-in text file reading primitives (INPUT, LINE INPUT, etc.), so alternative methods must be used, such as opening the file in binary mode or using the File System Object to read beyond it.
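The binary-mode workaround can be sketched in Python (a stand-in here, not Visual Basic itself). The file name and contents are hypothetical; the point is only that a binary read applies no end-of-file interpretation to the SUB byte.

```python
import os
import tempfile

SUB = b"\x1a"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "typedtxt.txt")  # hypothetical file name
    with open(path, "wb") as f:
        f.write(b"visible text\r\n" + SUB + b"data after the soft EOF")

    # Binary mode returns every stored byte, including those after SUB.
    with open(path, "rb") as f:
        raw = f.read()

# Split the raw bytes at the first SUB to separate the two parts.
before, marker, after = raw.partition(SUB)
print(before)
print(after)
```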
Character 26 was used to mark "End of file". ASCII defines it as Substitute and provides other characters for this purpose, but character 26 still works well. Character 28, called "File Separator", has also been used for this purpose.
In Unix operating systems, this character is typically used to suspend the currently executing interactive process. The suspended process can later be resumed in foreground (interactive) mode, resumed in background mode, or terminated. When a user enters Ctrl+Z at the terminal, the currently running foreground process is sent a "terminal stop" (SIGTSTP) signal, which by default causes the process to suspend its execution. The user can continue the process later with the "foreground" command (fg) or the "background" command (bg).
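The signal itself can be observed with a small Python sketch, assuming a Unix-like system and code running in the main thread. It replaces the default action (suspending the process) with a hypothetical handler, `on_tstp`, and sends the signal to itself to imitate what the terminal driver does on Ctrl+Z.

```python
import os
import signal

received = []


def on_tstp(signum, frame):
    # Record the signal instead of suspending, so the demo keeps running.
    received.append(signal.Signals(signum).name)


# Install the handler in place of the default "suspend" action.
previous = signal.signal(signal.SIGTSTP, on_tstp)

# Imitate the terminal driver: deliver SIGTSTP to this process.
os.kill(os.getpid(), signal.SIGTSTP)

print(received)

# Restore whatever handler was installed before the demo.
signal.signal(signal.SIGTSTP, previous)
```

Without the handler, the same `os.kill` call would stop the process until it received SIGCONT, which is what `fg` and `bg` send from the shell.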
The Unicode Security Considerations report recommends this character as a safe replacement for characters that cannot be mapped during character set conversion.
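One way to apply this during a conversion is a custom codec error handler, sketched below in Python. The handler name `sub_replace` is hypothetical; `codecs.register_error` is the standard-library hook it relies on. The last lines show the decoding direction, where Python's built-in `replace` handler emits U+FFFD.

```python
import codecs


def sub_replace(exc):
    """Replace each unencodable character with SUB (U+001A)."""
    if isinstance(exc, UnicodeEncodeError):
        # One SUB per unmappable character, then resume after the bad run.
        return ("\x1a" * (exc.end - exc.start), exc.end)
    raise exc


codecs.register_error("sub_replace", sub_replace)

encoded = "naïve café".encode("ascii", errors="sub_replace")
print(encoded)

# Decoding undecodable bytes with the built-in handler yields U+FFFD.
decoded = b"caf\xff".decode("utf-8", errors="replace")
print(decoded == "caf\ufffd")
```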
In many GUIs and applications, Control+Z (⌘ Command+Z on Mac OS) undoes the last action performed. In many applications, earlier actions can be undone as well by pressing Control+Z several times. Control+Z was one of a handful of keyboard sequences chosen by the program designers at Xerox PARC to control text editing. These key combinations were perhaps chosen for their location on a standard QWERTY keyboard: the Z (undo), X (cut), C (copy), and V (paste) keys sit close to each other at the left end of the bottom row.

MD5 hash (32 hexadecimal digits):

input value | base | output hash |
---|---|---|
SUB | char | BEBE43A13D6320B4C6751958BF5398A7 |
26 | dec | 4E732CED3463D06DE0CA9A15B6153677 |
00011010 | bin | 665F5462D2E4E2D70E2B81614E2FE174 |
0001 1010 | bin | A4CB49911E3BAC8E3D964DECF3E3E4DE |
32 | oct | 6364D3F0F495B6AB9DCF8D3B5C6E0B01 |
1A | hex | 0723DFD10075AEE37A1804A728349DC3 |
0x1A | hex | CDE0CA77DD54C00E672F34CD314410F9 |

SHA-256 hash (64 hexadecimal digits):

input value | base | output hash |
---|---|---|
SUB | char | 58F7B0780592032E4D8602A3E8690FB2C701B2E1DD546E703445AABD6469734D |
26 | dec | 5F9C4AB08CAC7457E9111A30E4664920607EA2C115A1433D7BE98E97E64244CA |
00011010 | bin | 2AF1C2A51C6AC57DD385308C7C675CC0CDB21B46E47BA5FBBBE886CA450FB7DC |
0001 1010 | bin | D4F171BC5A3FC94C60D9D92925479720722369FED616052E4149C63DA21189D2 |
32 | oct | E29C9C180C6279B0B02ABD6A1801C7C04082CF486EC027AA13515E4F3884BB6B |
1A | hex | CF3D835A30188F3D1566497DF9EA64CB8074AACF0A83D6506953258D6EC76F24 |
0x1A | hex | 76312A7EC79DA909DE8699916CDA4C8A9E5D614558D83AD7A37F72A743C037C9 |

Base64 encoding:

input value | base | Base64 output |
---|---|---|
SUB | char | Gg== |
26 | dec | MjY= |
00011010 | bin | MDAwMTEwMTA= |
0001 1010 | bin | MDAwMSAxMDEw |
32 | oct | MzI= |
1A | hex | MUE= |
0x1A | hex | MHgxQQ== |
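The tables can be recomputed with a short Python sketch. It rests on two assumptions not stated in the text: that the 32-digit values are MD5 digests and the 64-digit values are SHA-256 digests, and that each input is hashed as its ASCII bytes, with the "SUB" row being the single byte 0x1A. The variable names are illustrative only.

```python
import base64
import hashlib

# Assumption: each table row hashes or encodes these exact ASCII byte strings.
inputs = [b"\x1a", b"26", b"00011010", b"0001 1010", b"32", b"1A", b"0x1A"]

rows = {}
for value in inputs:
    rows[value] = {
        "md5": hashlib.md5(value).hexdigest().upper(),        # assumed algorithm
        "sha256": hashlib.sha256(value).hexdigest().upper(),  # assumed algorithm
        "base64": base64.b64encode(value).decode("ascii"),
    }

for value, result in rows.items():
    print(repr(value), result["md5"], result["sha256"], result["base64"])
```

The Base64 column is a reversible encoding rather than a hash, so `base64.b64decode("Gg==")` returns the original SUB byte.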