# $Revision: 11028 $ # $Author: saulius $ # $Date: 2024-01-06 16:47:05 +0000 (Sat, 06 Jan 2024) $ Reverse Polish Notation calculator for matrices =============================================== Saulius Gražulis Vilnius, 2024 CONVENTIONS =========== The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [1]. PROGRAM ======= Write a Perl (or other permitted language) program that reads in matrices [2] and operation specification on them in Reverse Polish Notation [3], performs all operations specified in the input stream and writes the result to STDOUT. Implementation rules -------------------- As usual in assignments of this kind, you are *not* supposed to use any third party libraries, unless explicitly permitted. Code reuse is an important topic but this assignment is targeted at developing a completely different set of skills. Any modules necessary for the implementation should be developed by the student and presented in the code repository. The use of any libraries other than the standard language libraries and built-in functions of the language should be discussed with the professor and agreed upon before submission of the solution. For this particular assignment ("Reverse Polish Notation calculator for matrices"), it is explicitly allowed to use the following libraries: - libraries for specific number arithmetic (complex numbers, reals, quaternions, etc.); - a library function for matrix inversion (when a complex algorithm like Gauss-Jordan elimination is necessary). PROGRAM FUNCTION ================ Individual assignment requirements ---------------------------------- This text specifies common requirements for a set of related assignments that will differ by the kind of matrices the programs are supposed to process, the type of numbers they should support and the set of operations they need to perform. Each individual assignment will additionally contain a short indication of the following: -- the exact name of the program. If the name of the program is not given, it SHOULD follow the Unix conventions: it must contain only lowercase letters and digits, underscores or hyphens; it must start with a letter and consist of one word or group of words separated by hyphens or underscores. The name MUST NOT contain the file extension, especially not the extension indicating the programming language; -- the languages and/or programming systems to be used with this assignment; -- the type of matrices the program is supposed to process (square, non-singular, orthogonal, Hermitian, etc.); -- the type of matrix elements (integers (ℤ), positive numbers (ℕ\{0}), non-negative numbers (ℕ∪{0}), finite fields (e.g. ℤ/13ℤ), rational numbers (ℚ), real number (ℝ), complex numbers (ℂ), quaternions (ℍ), etc.); -- the set of operations ('+', '-', '*', 'inv', etc.); -- additional notes (e.g. for diagnostics of certain special conditions). When a matrix is read from the input stream, the program MUST check that the matrix and its elements belong to the required class, and issue and ERROR message if they do not. The list of individual assignment requirements will be given by you professor. Common function requirements ---------------------------- The program MUST read its input stream (from a file or STDIN), skip comments and empty lines, and process all input data. Input data elements MAY be either: a) a matrix; b) operation codes, specified as operation symbols (e.g. '+', '-' (without the quotes)) or function names (e.g. 'inv', 'dup' (without the quotes)). When a matrix is encountered in the input stream, it is supposed to be pushed onto stack. The program MAY limit the maximum depth of the stack; if this is the case, the maximum permitted stack depth MUST be at least 64 numbers. When an operation is encountered in the input stream, the number of operands necessary for performing this operation is supposed to be popped from the stack, the operation performed in the matrix arithmetic, and the result should be pushed back onto the stack. At the end of the data stream (all input files and/or STDOUT) the contents of the stack MUST be printed to STDOUT in the same syntax in which it is read in. Supported operations -------------------- The exact list of the operations that the program MUST support is given in an additional assignment file as a short list of items. If the operation is listed there, its syntax and semantics MUST conform to the following specifications (operations that are not listed my be left unimplemented). Operation definitions should be taken from any classical linear algebra text, e.g. [2]. binary: -- matrix addition: + -- matrix subtraction: - -- matrix multiplication: * unary: -- negation (multiplication by -1): neg -- transposition: transp -- inversion (multiplicative inverse): inv stack maintenance: -- remove the top stack element: drop -- duplicate the top stack element: dup -- exchange the two top stack elements: swap -- rotate the top N elements of the stack: rot N -- show the top element of the stack without removing it: show -- print the top element of the stack and remove it: print The positive rotation number N (e.g. 3) moves the top element to the position 3 (1-based) on the stack, and at the same time moves all elements from depth N to depth 2 one stack position *up* (towards the top of the stack; i.e. the 2nd element from the top becomes the topmost). The negative rotation number (e.g. -3) rotates the stack in the opposite direction. Examples -------- The examples below assume that 'x', 'y', 'z' and 't' are names of matrices pushed onto the stack. Example 1: Stack: x <- Top of the stack (TOS) y z t 'rot 3' yields: y <- TOS z x t 'rot -3' yields: z <- TOS x y t Example 2: 'rot 1' and 'rot -1' are equivalent to 'swap'. Number syntax ------------- Integer numbers processed and output by the program should be written in decimal number system and follow the usual number notations. Leading zeroes are insignificant. The numbers MUST match the following Extended Regular Expression (ERE) or Perl Compatible Regular Expression: -- ERE: [-+]?([0-9]+) -- PCRE:[-+]?(\d+) Real numbers processed and output by the program MUST conform to the usual real number syntax of programming languages, i.e. it MUST match the following Extended Regular Expression (ERE) or Perl Compatible Regular Expression: -- ERE: [-+]?([0-9]+(\.[0-9]\*)?|\.[0-9]+)([eE][-+]?[0-9]+)? -- PCRE:[-+]?(?:\d+(\.\d*)?|\.\d+)(?:[eE][-+]?\d+)? Note that numbers that look like integers (e.g. '132') are permitted as real number representations. Efficiency considerations ------------------------- The program MUST NOT "slurp" the whole file into RAM; only the values currently placed onto the stack SHOULD be kept in RAM. INPUTS AND OUTPUTS ================== The program MUST read all files provided on the command line and process them. If a file name is a lone hyphen ('-'), the program MUST read from STDIN when processing this file name (to specify the file that has a name consisting of the lone hyphen, prepend it with the current directory name, i.e. "./-"). The program SHOULD NOT assume that STDIN or any other file is rewindable. If the files are not provided, the program MUST read from STDIN. The program MUST write its results to STDOUT. INPUT ===== Program input consists of text lines with the following permissible items: - Comments. Lines that start with the '#' (hash, ASCII code 35) character on the first line MUST be treated as comments and ignored; - Empty lines (i.e. lines containing no characters or only white space characters – space, TAB or new line characters) MUST be ignored; - Lines containing a matrix definition. The matrix definition starts with two integers, N and M, at the beginning of the line, separated by white space characters, optionally prepended by white space characters. The first integer contains the number of rows in the matrix. The second integer is the number of columns in the matrix. These two integers are followed by N×M white space separated numbers on the same and/or on the subsequent lines. Comments in the middle of matrices MUST NOT be used. The type of numbers that must be read from the inputs are specified in the individual assignment requirements. Example of two real matrices: 2 2 12.4 17.8 20.1 0.12 3 3 1 0 1.23 10 17. 1E-12 4.0 8 1.2E7 The program MAY accept the N and M numbers on separate lines, but it MUST always output them on a single line. New line breaks in the middle of the matrix are not significant. For example, all following tree matrix examples are equivalent descriptions of the 2 by 2 unity matrix: 2 2 1 0 0 1 2 2 1 0 0 1 2 2 1 0 0 1 In other words, matrices may be *folded* by inserting new line characters between the matrix numbers. - lines, containing operation codes ('+', '-', 'transp', ...), as described in the "Supported operations" subsection. If the operation requires additional argument, that argument MUST be provided Example: -------- #-----------8<---------------------- # Input operation stream 2 2 1 0 0 1 2 2 0 1 1 0 * # End input stream #----------->8---------------------- OUTPUTS ======= The output of the program must contain: - On the first line, it MUST contain a comment (starting by a hash sign '#' on the first position) with the program version identifier, generated automatically by the version control or the build system. A permissible program identifier is the one automatically generated by the Subversion 'Id' keyword. The output of the program, however, MUST NOT contain the dollar symbols ('$'), to avoid confusion with the Subversion keywords in the output files. - The subsequent lines MUST contain the resulting matrices, either printed by the 'show' and 'print' operations or output automatically at the end of the program operation. The program MAY choose a single format for the output matrices; e.g. it may choose to output each matrix in a single line format, or in a folded format. If a folded format is used, it is RECOMMENDED that the lines in the output do not exceed 80 characters in length, and an empty line is inserted between the distinct matrices for readability. In all cases, the program MUST be able to read its own output. Example of the output: ---------------------- # Id: matcalc 1238 2009-09-03 07:14:51Z author 3 3 1 0 0 0 1 0 0 0 1 OPTIONS ======= For this assignment, the program does not need to support any options. ERROR DIAGNOSTICS ================= All error messages MUST be output to STDERR. If an error is detected, a suitable non-zero status (exit) code MUST be returned. The program MUST detect abnormal situation and report them. An informative user-friendly error reporting should be used. IN particular, the error message MUST contain: the program name, the input file name where the error was detected (use '-' without the quotes as a file name if your input is STDIN), input line number where the error was detected; quoted example of the erroneous input. The program MAY output error position in the first character where the error was detected. The error messages should be informative and permit user to fix the error. Perl functions 'die' and 'warn', invoked with and without the "\n" character at the end of the message are examples of acceptable error reporting. The native Perl diagnostics (switched on by "use warnings") and file conditions detected by "while(<>) {...}" construct SHOULD be used for error reporting. The program SHOULD attempt to recover from errors and SHOULD continue its operation as long as possible. The program must support at least two error severity levels, ERROR and WARNING. The program MAY support the third error level; if supported it should be called NOTE. Error severity level ("ERROR", "WARNING" or "NOTE") SHOULD be selected based on the validity of the program output. The project MUST mandate the "ERROR" severity level for all conditions that do not allow to compute the correct and correctly formatted result. The "WARNING" error level SHOULD be mandated for those conditions when the output is physically, logically and syntactically correct but might be unusual or not expected by the user (e.g. empty or degenerate output when such output is permitted by the task and the output format). The "NOTE" level SHOULD be mandated for by conditions that do not interfere withe the generation of correct and expected results, but might be of interest for the user when interpreting the output data. The following conditions MUST be detected and reported: -- missing, unreadable, unreachable files; -- system read errors; -- wrong input number format; -- number of input numbers does not match the matrix size definition (integers N and M); -- incompatible matrix size and structure for the requested operations (such as different sizes of matrices for addition or the number of columns for the first matrix not equal for the number of rows in the second matrix for multiplication); -- not enough values on the stack to perform the requested operation (e.g. less than 2 values for '+' or '*'; less than N values for 'rot N'; ERROR); -- not enough values on the stack for printing at the end of the program, i.e. empty stack (ERROR); If the program detects wrong format of at least one input number which participates in arithmetic, the output result MUST be the "NaN" string (IEEE 754 Not-a-Number) [4]; If the program encounters floating point overflow condition while processing a matrix, its output on that line MUST be "Inf" (IEEE 754 Infinity) with a relevant sign on the corresponding output line. Error message content --------------------- Error messages SHOULD contain at least: -- the name of the program that diagnosed the error; -- the name of the file that was being processed when the error happened (if appropriate); use the "-" string (with quotation marks) or the name "STDIN" (without the quotes) if STDIN was being processed when the error happened; -- the line of the file and the character position within the line (the column) where the error was detected (if appropriate); -- a short (10–40 characters) citation of the context where the error happened (if appropriate), if the actual value is longer than the maximum number of cited characters, the remaining characters should be replaced by three dots ('...', the ellipsis sign); -- a short but informative message indicating the cause of the error and possible actions to rectify situation. Exclamatory marks SHOULD NOT be used in the messages. Example: -------- Presented with the input stream from the file "data.rnp": #--------------- 3 3 0 0 0 0 0 0 0 0 0 inv #--------------- the program tasked to process any real square matrices should output an error message to the STDERR similar to an example below: matcalc: data.rnp(4): ERROR, matrix '3 3 0 0 0 ...' is singular, cannot perform 'inv' or matcalc: ERROR, matrix '3 3 0 0 0 ...' in file "data.rnp", line 4 is singular, cannot perform 'inv' NOTE: an error message for a single error SHOULD be output as one line (i.e. should not contain new line characters '\n' in the message). It is permitted to output the error message longer than 80 characters, but the message length should be limited to some reasonable length (e.g. one should not quote whole matrices with the size 100x100). NOTE: for this assignment, an exact syntax of the error messages is not specified, although a serious data processing package should specify the syntax of the error messages using formal grammars (EBNF) or format specifications (XML, JSON and the corresponding schemata). EXIT STATUS =========== The program MUST return the 0 exit status if all computations were performed correctly, 1 if some results were NaNs, 2 if there were WARNING level issues reported, and 3 if a fatal errors (of the level ERROR) occurred. QUESTIONS TO THINK ABOUT ======================== Are there matrices that are non-zero but give zero product when multiplied together? (i.e. are there zero divisors?) Is the multiplication commutative? Is the multiplication associative? What subset of matrices has multiplicative inverses? What is the most efficient algorithm to computer the inverse matrix for the matrices of the kind you are tasked to process? In what practical applications are these numbers used? In STEM? In Bioinformatics? References ========== 1. S. Bradner "Key words for use in RFCs to Indicate Requirement Levels", RFC 2119, URI: https://tools.ietf.org/html/rfc2119 2. Wikipedia. Matrix (mathematics) (2023) URL: https://en.wikipedia.org/wiki/Matrix_(mathematics) [accessed 2024-01-06T11:55+02:00; permalink: https://en.wikipedia.org/w/index.php?title=Matrix_(mathematics)&oldid=1192680220] 3. Wikipedia. Reverse Polish notation (2024) URL: https://en.wikipedia.org/wiki/Reverse_Polish_notation [accessed 2024-01-06T11:12+02:00; permalink: https://en.wikipedia.org/w/index.php?title=Reverse_Polish_notation&oldid=1193885306] 4. Wikipedia. IEEE 754 (2023) URL: https://en.wikipedia.org/wiki/IEEE_754 [accessed 2023-02-13T15:27+02:00; permalink: https://en.wikipedia.org/w/index.php?title=IEEE_754&oldid=1135219367] Colophon ======== $Id: matrix-rnp-calculator.txt 11028 2024-01-06 16:47:05Z saulius $ $URL: svn+ssh://saulius.grazulis.lt/home/saulius/svn-repositories/paskaitos/VU/bioinformatika-III/u%C5%BEduotys-praktikai/skai%C4%8Di%C5%B3-u%C5%BEduotis-RNP-calculator/en/05-split-quaternion-RPN-calculator.txt $