17.1. GLOSSARY

最后更新于:2022-04-01 04:43:26

ALC Apple Lossless Coding, a lossless compression method for audio. Algorithm Algorithm a process for achieving an outcome, normally for a general problem such as searching, sorting, finding an optimal path through a map and so on. Algorithm analysis Algorithm analysis working out the complexity of an algorithm. Algorithm complexity Algorithm complexity how long the algorithm takes to run (or how much memory it uses). These are almost always specified in terms of the size of input. Alphabet Alphabets In formal languages, this is the set of characters that might be processed. For many compilers and text processing systems the alphabet is the set of all ASCII characters, but for example, for a finite state automaton controlled by an “up” and “down” button, the alphabet is just the two symbols “up” and “down”. For systems processing binary numbers, the alphabet would usually be “0” and “1”. Many of the small examples just use a small alphabet of a few characters (typically “a”, “b”, “c” etc.) to keep things simple. ASCII ASCII the commonly used code for representing characters as 8-bit numbers (although only 7 of the 8 bits are usually used). Attack Gaining access to or decrypting a file that is using encryption, without having the key. There are several types of attacks, some of which are also defined in this list. Binary Number System The base 2 number system, i.e. numbers only made up of the digits “0” and “1”. All numbers that can be represented in the decimal number system can be uniquely represented in the binary number system. Binary search Binary search searching a sorted list by looking at the middle item, and then searching the appropriate half recursively (used for phone books, dictionaries and computer algorithms). Bit Bit short for “binary digit” - a digit that is either 0 or 1. Brute force attack A type of attack that is carried out by trying every possible key. Bubble sort Bubble sort a sorting algorithm based on swapping adjacent items that are out of order. It is not a good method, but serves as an example of a slow method in contrast to others like quicksort. Byte Byte a group of 8 bits, able to represent numbers from 0 to 255, can store one ASCII character (also known as an octet). Caesar Cipher A very simple cipher that offsets each letter in the alphabet by a certain amount, specified by the key. It is no longer used in practice due to being very easy to attack. Chatterbot An AI system that has text conversations with the user, typically based on simple pattern matching. Check digit An extra digit that is added onto the end of a number such as an ISBN, credit card number, or barcode number. This digit is calculated using a formula based on the other digits in the number. Error detection works by using the check equation to determine whether or not the check digit is as expected. Check equation An equation that is used to check whether or not the check digit for a number is correct. Chomsky hierarchy A hierarchy of types of languages ranging from the simple “regular expression” through to unrestricted grammars. Each level of the hierarchy can describe more complex rules, but is also harder to implement. It is named after the linguist Noam Chomsky. Cipher An algorithm used to encrypt a piece of plain text. Cipher text Text which has been encrypted. Compiler Compiler translates an entire program written in a high level language to machine language in advance before running it. Complexity Complexity how long it takes to solve a problem. A problem has an inherent complexity (minimum time needed to solve it); any algorithm to solve the problem will have a higher complexity (take at least that long). See also algorithm complexity. Compression Compression making a file smaller by removing redundant information (typically using standards like zip, jpeg, mpeg, mp3). Decimal Number System The standard base 10 number system that is used in everyday math, using the digits “0”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, and “9”. Decrypt Decryption Decipher Getting the plain text for a piece of cipher text by either using the key or an attack. Encryption Encryption changing the representation of data so it can’t be read by an eavesdropper who doesn’t have the encryption key. Encryption key Encryption key the password or secret code that will unlock an encrypted file. Error correction Correcting an error that has been detected in some data. This can be demonstrated in the Parity trick, where a person is able to flip the changed bit back over so it is correct again (after they have “detected” which bit was incorrect). Not all error control schemes are able to correct errors; some are only able to detect them. Error detection Detecting when an error has occurred in some data, such as a number getting typed incorrectly or a bit getting flipped. Some simple examples of this are parity bits or a check digit. Feature A function available on a digital device or software, such as copy/paste, autofocus, voice dialling or undo. Features are often used to sell a device, but having features (functionality) should not be confused with people being able to use the device effectively (usability). Feedback Responding to or acknowledging a user action. Users find the devices hard to use if the feedback is slow, confusing, or non-existent. Finite state automaton FSA A simple notation for processing input symbols to determine if they obey some specified. An FSA has a starting state, transitions between states based on the next input symbol, and “accepting” states, which indicate that the input is accepted if the processing ends up in one. Frequency Analysis Attack An attack on substitution ciphers that takes advantage of the fact that some letters are generally more common than others in a piece of text (e.g. in English, the letter “e” is usually the most common letter) by looking at which letters appear the most in the cipher text and guessing that they must be the substitutions for the most common letters. GIF A lossless image compression system typically used for small images with few colours in them (in practice it can be lossy because it has only 256 colours, and if the original has more colours then some will be lost). Gigabyte About 1000 megabytes (1,000,000 kilobytes and 1,000,000,000 bytes). This is 8,000 million individual bits (i.e. 0’s and 1’s). [Like a kilobyte, there are other definitions, such as 1024x1024x1024 bytes, but usually this level of accuracy isn’t important]. Commonly referred to as a “GB”. Grammar Rules that specify a language, typically used for defining programming languages. Graphics Graphics in computer science, designing algorithms that can produce images on a computer. HCI HCI human computer interaction; an area of computer science looking at how people interact with a digital device, with an emphasis on the quality of the experience to complete tasks. Heuristic A heuristic is rule or guideline, usually devised from experience. The term is used in both HCI and algorithms. In **HCI**, heuristics are often used as a benchmark to evaluate interfaces — they aren’t strict rules, but usually highlight issues with designs. A very common set is given at www.useit.com . In **algorithms**, and heuristic is an approximate solution to a problem; it doesn’t guarantee to give the best possible answer (such as the shortest route on a map), but by using simple rules the calculation can be done quickly, and the solution is hopefully good enough for practical use. Hexadecimal The base 16 number system. Uses the digits “0”, “1”, “2”, “3”, “4”, “5”, “6”, “7”, “8”, “9”, “A”, “B”, “C”, “D”, “E”, and “F”. All numbers that can be represented in decimal can be uniquely represented in hexadecimal (just like binary). It is most often used as a shorthand notation for binary, by assigning 1 hexadecimal digit to each 4 bit pattern (the assigning is done in numeric order). Hexadecimal colour codes A representation for colours that tells the computer how much red, blue, and green light to display in a pixel (to make the desired colour). Uses 1 byte for each of these 3 primary colours, which is 3 bytes (24 bits) in total. These 24 bits are often written as 6 hexadecimal digits to make them easier for humans to read, which is why they are called “Hexadecimal colour codes”. They are commonly encountered when specificying colours in HTML for web pages. High level language High level language a programming language that is designed for humans to read and write (e.g. Java, Python, C, C#, Basic, Scratch…) as opposed to machine languages. Insertion sort Start with an empty list, and insert each item in the correct place; this is a relatively slow method, usually between selection sort and quick sort in speed. Intelligent Systems Intelligent systems an area of computer science that investigates ways to simulate or approximate human intelligence on computers; often referred to as artificial intelligence (AI). Interface The part of a computer, software, or electronic device that a human interacts with, whether this is by sight, hearing, or touch. Interpreter Interpreter runs a programming language by translating each line of code as it is execute. ISBN Stands for International Standard Book Number. Every published book has one of these numbers on the back of it. ISBN is significant to error control coding because it uses a check digit for error detection. JPEG A lossy image compression system typically used for photographs. Key (in algorithms) It is an item of data that is being searched for or sorted, and therefore will be compared with other data. Key (in cryptography) The password or secret value that is used to encrypt and decrypt an encrypted file (without having to use an “attack”). Some widely used methods have different keys for encryption and decryption. Kilobyte About 1000 bytes. This is 8,000 individual bits (i.e. 0’s and 1’s). [We say “about” 1000 bytes because the term is ambiguous and it is often taken as 1024 bytes; however, rounding it to 1000 is close enough for most calculations]. Commonly referred to as a “KB”. Known plain-text Attack Working out the key or method of encryption (cipher) based on having access to both the original plain-text and its encrypted form. Language A set of strings, typically obeying some rules defined by a regular expression or grammar e.g. all strings containing the letter “a” exactly twice, or all strings that are legal Java programs. Lexical analysis When compiling a computer program, working out what the components of the program are e.g. identifiers, keywords, integers. Linear Complexity Linear complexity grows in proportion to the size of the problem - if the problem is twice as big, it will take roughly twice as long to solve. Logarithm Logarithm is a very slow growing mathematical function written as \log n. In computer science logarithms are usually in base 2, that is, \log_2 n, which is the inverse of the incredibly fast growing exponent function 2^n. Logarithms are not needed to understand the material in this book, but they are used a lot in computer science and are a useful concept to understand. Logarithms happen to come up a lot with algorithms, and the two words are often confused. The value \log_2 n is just the number of times you can halve n until you get down to 1; for example, \log_2 32`is 5, and :math:log_2 1024` is 10\. Binary search takes \log_2 n steps to search *n*items; storing the number *n* in binary takes \log_2 n bits. Lossless A compression method that does not cause any loss of data. This means that the uncompressed file will be identical to the original file that was compressed (which is important for text). In the case of images and sound, it means they will be of the same quality before and after compression. For example, ZIP and ALC use lossless compression. Lossy A compression method that trades off quality for file size. Lossy compression methods can make files smaller than lossless compression methods can, but the quality of the resulting file will be lower. For example, MP3 and JPEG use lossy compression. Machine language The native language for instructions for a computer, not very easy for humans to read and write. Megabyte About 1000 kilobytes (1,000,000 bytes). This is 8 million individual bits (i.e. 0’s and 1’s). [Like a kilobyte, there are other definitions, such as 1024x1024 bytes, but usually this level of accuracy isn’t important]. Commonly referred to as a “MB”. MP3 A lossy audio compression system. Nibble 4 bits (half a byte), sometimes called a nybble. Nielson’s Heuristics A widely used set of heuristics for evaluating computer interfaces that was devised by Jakob Nielson (available from [http://useit.com](http://useit.com/)). Octal The base 8 number system. Like hexadecimal, it is significant to computer scientists as it allows a shorthand notation for writing binary numbers. Octal assigns a digit to each possible 3 bit pattern. Note: You probably don’t need to know this for the achievement standard, although it is included here in case you come across the term. Parity Adding an extra bit to a set of bits to make it so that there is an even number of 1’s. Storing the parity makes it possible to detect and correct errors later. [This is known as an even parity bit; an odd parity bit is also possible where the extra bit ensures there is an odd number of 1’s] Parse tree The structure derived by parsing some input. Parsing Reading some input (typically a computer program) and making sense of it by breaking it into parts according to their function. Pattern matching Finding strings of characters that match simple rules, typically based on a regular expression. Plain Text Text before it has been encrypted or after it has been decrypted (so essentially text in plain language, without any encryption). PNG A lossless image compression system typically used for small images with few colours in them. Quadratic complexity Quadratic complexity grows with the square of the size of the problem - if the problem is twice as big, it will take roughly 4 times as long to solve. Quick sort Quick sort pick an item at random, put all the smaller items in a group on its left and the larger items in a group on its right. Now do quick sort on the two groups. This is one of the better sorting algorithms, and is good for comparing with others. Students don’t need to understand how it works, but some may be curious. Redundant Bits Extra bits that are not part of the actual data but instead have been added for error detection and possibly error correction. Regular expression A simple expression used for pattern matching, typically using characters combined with “*” (repetition), “|” (selecting one or the other) and parenthesis (to group operations). Some systems allow more complex patterns such as ”.” (matches any character), “{n}” (repeated n times), and “\d” (digit). Search Find a key in a large amount of data. Selection sort Selection sort select the smallest item, then the second smallest, and so on. This is not a very fast algorithm, but it’s not as bad as bubble sort, and provides a good contrast with quick sort. Sort Sort puts keys (numbers, names or other values) in order from smallest to largest (outside computer science this is usually called ordering). String Strings A sequence of characters or symbols from an alphabet. For example, the two-character strings that can be made from the alphabet {“a”,”b”} are “aa”, “ab”, “ba” and “bb”. Substitution Cipher A type of cipher that works simply by replacing each letter or combination of letters in a plain text with a certain other letter or combination of letters to make up the cipher text. The result of this is that each unique letter combination of letters in the plain text (e.g. “t”) is represented by the same unique letter combination of letters in the cipher text (e.g. “y”) Caesar Cipher is a simple example of a substitution cipher. Substitution ciphers are vulnerable to Frequency Analysis Attacks, so are not used in practice. Syntactically correct A string is syntactically correct if it matches the specifications for a formal language. For example, the string “()(())” is correct for a grammar that gives the rules for balanced parentheses. In a computer program, a syntax error is when a character occurs in the input which isn’t allowed, and the program is therefore not syntactically correct. Syntax Syntax rules about what text can appear in a programming language, used by a compiler or interpreter and therefore need to be followed by a programmer to avoid syntax errors. Syntax diagram Also known as railway (or railroad) diagrams, these are a graphical representation of a grammar using arrows (the “train tracks”) to show the options for each component of a language. Task Something a user might do with a piece of software or electronic device to achieve a goal. In the case of a cellphone this might be “send a text message” or in the case of a microwave it might be “heat up yesterday’s leftovers”. Interfaces are best evaluated when considering how they help a user to perform a task. Time complexity Time complexity the usual meaning of the complexity of an algorithm; this makes it clear that we’re talking about the time taken. Normally it’s expressed in terms of steps, not real time on a particular computer, as different computers are different speeds. Unicode Unicode an extension of ASCII; it supports characters from multiple languages, using 16 bits per character. Usability heuristic See [*Heuristic*](http://csfieldguide.org.nz/appendices/Glossary.html#term-heuristic). User The human using the computer system or electronic device. Visual computing Visual computing designing systems that can perceive, process, understand and generate images; images typically come from scanners and cameras, and may be displayed on monitors, head mounted displays, or as movies.
';