# $Revision: 11028 $
# $Author: saulius $
# $Date: 2024-01-06 16:47:05 +0000 (Sat, 06 Jan 2024) $

Reverse Polish Notation calculator for matrices
===============================================
             Saulius Gražulis
              Vilnius, 2024

CONVENTIONS
===========

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1].

PROGRAM
=======

Write a Perl (or other permitted language) program that reads in
matrices [2] and operation specification on them in Reverse Polish
Notation [3], performs all operations specified in the input stream
and writes the result to STDOUT.

Implementation rules
--------------------

As usual in assignments of this kind, you are *not* supposed to use
any third party libraries, unless explicitly permitted. Code reuse is
an important topic but this assignment is targeted at developing a
completely different set of skills. Any modules necessary for the
implementation should be developed by the student and presented in the
code repository.

The use of any libraries other than the standard language libraries
and built-in functions of the language should be discussed with the
professor and agreed upon before submission of the solution.

For this particular assignment ("Reverse Polish Notation calculator
for matrices"), it is explicitly allowed to use the following
libraries:

- libraries for specific number arithmetic (complex numbers, reals,
  quaternions, etc.);

- a library function for matrix inversion (when a complex algorithm
  like Gauss-Jordan elimination is necessary).

PROGRAM FUNCTION
================

Individual assignment requirements
----------------------------------

This text specifies common requirements for a set of related
assignments that will differ by the kind of matrices the programs are
supposed to process, the type of numbers they should support and the
set of operations they need to perform.

Each individual assignment will additionally contain a short
indication of the following:

-- the exact name of the program. If the name of the program is not
   given, it SHOULD follow the Unix conventions: it must contain only
   lowercase letters and digits, underscores or hyphens; it must start
   with a letter and consist of one word or group of words separated
   by hyphens or underscores. The name MUST NOT contain the file
   extension, especially not the extension indicating the programming
   language;

-- the languages and/or programming systems to be used with this
   assignment;

-- the type of matrices the program is supposed to process (square,
   non-singular, orthogonal, Hermitian, etc.);

-- the type of matrix elements (integers (ℤ), positive numbers
   (ℕ\{0}), non-negative numbers (ℕ∪{0}), finite fields (e.g. ℤ/13ℤ),
   rational numbers (ℚ), real number (ℝ), complex numbers (ℂ),
   quaternions (ℍ), etc.);

-- the set of operations ('+', '-', '*', 'inv', etc.);

-- additional notes (e.g. for diagnostics of certain special
   conditions).

When a matrix is read from the input stream, the program MUST check
that the matrix and its elements belong to the required class, and
issue and ERROR message if they do not.

The list of individual assignment requirements will be given by you
professor.

Common function requirements
----------------------------

The program MUST read its input stream (from a file or STDIN), skip
comments and empty lines, and process all input data. Input data
elements MAY be either:

a) a matrix;

b) operation codes, specified as operation symbols (e.g. '+', '-'
   (without the quotes)) or function names (e.g. 'inv', 'dup'
   (without the quotes)).

When a matrix is encountered in the input stream, it is supposed to be
pushed onto stack. The program MAY limit the maximum depth of the
stack; if this is the case, the maximum permitted stack depth MUST be
at least 64 numbers.

When an operation is encountered in the input stream, the number of
operands necessary for performing this operation is supposed to be
popped from the stack, the operation performed in the matrix
arithmetic, and the result should be pushed back onto the stack.

At the end of the data stream (all input files and/or STDOUT) the
contents of the stack MUST be printed to STDOUT in the same syntax in
which it is read in.

Supported operations
--------------------

The exact list of the operations that the program MUST support is
given in an additional assignment file as a short list of items. If
the operation is listed there, its syntax and semantics MUST conform to
the following specifications (operations that are not listed my be
left unimplemented). Operation definitions should be taken from any
classical linear algebra text, e.g. [2].

binary:

-- matrix addition:       +
-- matrix subtraction:    -
-- matrix multiplication: *

unary:

-- negation (multiplication by -1): neg
-- transposition: transp
-- inversion (multiplicative inverse): inv

stack maintenance:

-- remove the top stack element: drop
-- duplicate the top stack element: dup
-- exchange the two top stack elements: swap
-- rotate the top N elements of the stack: rot N
-- show the top element of the stack without removing it: show
-- print the top element of the stack and remove it: print

The positive rotation number N (e.g. 3) moves the top element to the
position 3 (1-based) on the stack, and at the same time moves all
elements from depth N to depth 2 one stack position *up* (towards the
top of the stack; i.e. the 2nd element from the top becomes the
topmost).

The negative rotation number (e.g. -3) rotates the stack in the
opposite direction.

Examples
--------

The examples below assume that 'x', 'y', 'z' and 't' are names of
matrices pushed onto the stack.

Example 1:

Stack:

x <- Top of the stack (TOS)
y
z
t

'rot 3' yields:

y <- TOS
z
x
t

'rot -3' yields:

z <- TOS
x
y
t

Example 2:

'rot 1' and 'rot -1' are equivalent to 'swap'.

Number syntax
-------------

Integer numbers processed and output by the program should be written
in decimal number system and follow the usual number
notations. Leading zeroes are insignificant. The numbers MUST

match the following Extended Regular Expression (ERE) or Perl
Compatible Regular Expression:

-- ERE: [-+]?([0-9]+)
-- PCRE:[-+]?(\d+)

Real numbers processed and output by the program MUST conform to the
usual real number syntax of programming languages, i.e. it MUST
match the following Extended Regular Expression (ERE) or Perl
Compatible Regular Expression:

-- ERE: [-+]?([0-9]+(\.[0-9]\*)?|\.[0-9]+)([eE][-+]?[0-9]+)?
-- PCRE:[-+]?(?:\d+(\.\d*)?|\.\d+)(?:[eE][-+]?\d+)?

Note that numbers that look like integers (e.g. '132') are permitted
as real number representations.

Efficiency considerations
-------------------------

The program MUST NOT "slurp" the whole file into RAM; only the values
currently placed onto the stack SHOULD be kept in RAM.

INPUTS AND OUTPUTS
==================

The program MUST read all files provided on the command line and
process them. If a file name is a lone hyphen ('-'), the program MUST
read from STDIN when processing this file name (to specify the file
that has a name consisting of the lone hyphen, prepend it with the
current directory name, i.e. "./-").

The program SHOULD NOT assume that STDIN or any other file is
rewindable.

If the files are not provided, the program MUST read from STDIN.

The program MUST write its results to STDOUT.

INPUT
=====

Program input consists of text lines with the following permissible
items:

- Comments. Lines that start with the '#' (hash, ASCII code 35)
  character on the first line MUST be treated as comments and ignored;

- Empty lines (i.e. lines containing no characters or only white space
  characters – space, TAB or new line characters) MUST be ignored;

- Lines containing a matrix definition.

  The matrix definition starts with two integers, N and M, at the
  beginning of the line, separated by white space characters,
  optionally prepended by white space characters. The first integer
  contains the number of rows in the matrix. The second integer is the
  number of columns in the matrix.

  These two integers are followed by N×M white space separated numbers
  on the same and/or on the subsequent lines. Comments in the middle
  of matrices MUST NOT be used. The type of numbers that must be read
  from the inputs are specified in the individual assignment
  requirements.
  
  Example of two real matrices:

  2 2   12.4 17.8 20.1 0.12

  3 3
  1 0 1.23
  10 17. 1E-12  4.0 8 1.2E7

  The program MAY accept the N and M numbers on separate lines, but it
  MUST always output them on a single line.

  New line breaks in the middle of the matrix are not significant. For
  example, all following tree matrix examples are equivalent
  descriptions of the 2 by 2 unity matrix:

  2 2 1 0 0 1

  2 2
  1 0
  0 1

  2 2 1
  0 0
  1

  In other words, matrices may be *folded* by inserting new line
  characters between the matrix numbers.

- lines, containing operation codes ('+', '-', 'transp', ...), as
  described in the "Supported operations" subsection. If the operation
  requires additional argument, that argument MUST be provided 

Example:
--------

#-----------8<----------------------
# Input operation stream

2 2
1 0
0 1

2 2   0 1   1 0

*

# End input stream
#----------->8----------------------

OUTPUTS
=======

The output of the program must contain:

- On the first line, it MUST contain a comment (starting by a hash sign
  '#' on the first position) with the program version identifier,
  generated automatically by the version control or the build system.

  A permissible program identifier is the one automatically generated
  by the Subversion 'Id' keyword. The output of the program, however,
  MUST NOT contain the dollar symbols ('$'), to avoid confusion with
  the Subversion keywords in the output files.

- The subsequent lines MUST contain the resulting matrices, either
  printed by the 'show' and 'print' operations or output automatically
  at the end of the program operation.

  The program MAY choose a single format for the output matrices;
  e.g. it may choose to output each matrix in a single line format, or
  in a folded format.

  If a folded format is used, it is RECOMMENDED that the lines in the
  output do not exceed 80 characters in length, and an empty line is
  inserted between the distinct matrices for readability.

In all cases, the program MUST be able to read its own output.

Example of the output:
----------------------

# Id: matcalc 1238 2009-09-03 07:14:51Z author
3 3
1 0 0
0 1 0
0 0 1


OPTIONS
=======

For this assignment, the program does not need to support any options.

ERROR DIAGNOSTICS
=================

All error messages MUST be output to STDERR. If an error is detected,
a suitable non-zero status (exit) code MUST be returned.

The program MUST detect abnormal situation and report them. An
informative user-friendly error reporting should be used. IN
particular, the error message MUST contain: the program name, the
input file name where the error was detected (use '-' without the
quotes as a file name if your input is STDIN), input line number where
the error was detected; quoted example of the erroneous input. The
program MAY output error position in the first character where the
error was detected. The error messages should be informative and
permit user to fix the error.

Perl functions 'die' and 'warn', invoked with and without the "\n"
character at the end of the message are examples of acceptable error
reporting. The native Perl diagnostics (switched on by "use warnings")
and file conditions detected by "while(<>) {...}" construct SHOULD be
used for error reporting.

The program SHOULD attempt to recover from errors and SHOULD continue
its operation as long as possible.

The program must support at least two error severity levels, ERROR and
WARNING. The program MAY support the third error level; if supported
it should be called NOTE. Error severity level ("ERROR", "WARNING" or
"NOTE") SHOULD be selected based on the validity of the program
output. The project MUST mandate the "ERROR" severity level for all
conditions that do not allow to compute the correct and correctly
formatted result.

The "WARNING" error level SHOULD be mandated for those conditions when
the output is physically, logically and syntactically correct but
might be unusual or not expected by the user (e.g. empty or degenerate
output when such output is permitted by the task and the output
format).

The "NOTE" level SHOULD be mandated for by conditions that do not
interfere withe the generation of correct and expected results, but
might be of interest for the user when interpreting the output data.

The following conditions MUST be detected and reported:

-- missing, unreadable, unreachable files;

-- system read errors;

-- wrong input number format;

-- number of input numbers does not match the matrix size definition
   (integers N and M);

-- incompatible matrix size and structure for the requested operations
   (such as different sizes of matrices for addition or the number of
   columns for the first matrix not equal for the number of rows in
   the second matrix for multiplication);

-- not enough values on the stack to perform the requested operation
   (e.g. less than 2 values for '+' or '*'; less than N values for
   'rot N'; ERROR);

-- not enough values on the stack for printing at the end of the
   program, i.e. empty stack (ERROR);

If the program detects wrong format of at least one input number which
participates in arithmetic, the output result MUST be the "NaN"
string (IEEE 754 Not-a-Number) [4];

If the program encounters floating point overflow condition while
processing a matrix, its output on that line MUST be "Inf" (IEEE 754
Infinity) with a relevant sign on the corresponding output line.

Error message content
---------------------

Error messages SHOULD contain at least:

-- the name of the program that diagnosed the error;

-- the name of the file that was being processed when the error
   happened (if appropriate); use the "-" string (with quotation
   marks) or the name "STDIN" (without the quotes) if STDIN was being
   processed when the error happened;

-- the line of the file and the character position within the line
   (the column) where the error was detected (if appropriate);

-- a short (10–40 characters) citation of the context where the error
   happened (if appropriate), if the actual value is longer than the
   maximum number of cited characters, the remaining characters should
   be replaced by three dots ('...', the ellipsis sign);

-- a short but informative message indicating the cause of the error
   and possible actions to rectify situation. Exclamatory marks SHOULD
   NOT be used in the messages.

Example:
--------

Presented with the input stream from the file "data.rnp":

#---------------
3 3
0 0 0  0 0 0  0 0 0
inv
#---------------

the program tasked to process any real square matrices should output
an error message to the STDERR similar to an example below:

matcalc: data.rnp(4): ERROR, matrix '3 3 0 0 0 ...' is singular, cannot perform 'inv'

or

matcalc: ERROR, matrix '3 3 0 0 0 ...' in file "data.rnp", line 4 is singular, cannot perform 'inv'

NOTE: an error message for a single error SHOULD be output as one line
(i.e. should not contain new line characters '\n' in the message). It
is permitted to output the error message longer than 80 characters,
but the message length should be limited to some reasonable length
(e.g. one should not quote whole matrices with the size 100x100).

NOTE: for this assignment, an exact syntax of the error messages is
not specified, although a serious data processing package should
specify the syntax of the error messages using formal grammars (EBNF)
or format specifications (XML, JSON and the corresponding schemata).

EXIT STATUS
===========

The program MUST return the 0 exit status if all computations were
performed correctly, 1 if some results were NaNs, 2 if there were
WARNING level issues reported, and 3 if a fatal errors (of the level
ERROR) occurred.

QUESTIONS TO THINK ABOUT
========================

Are there matrices that are non-zero but give zero product when
multiplied together? (i.e. are there zero divisors?)

Is the multiplication commutative?

Is the multiplication associative?

What subset of matrices has multiplicative inverses?

What is the most efficient algorithm to computer the inverse matrix for
the matrices of the kind you are tasked to process?

In what practical applications are these numbers used? In STEM? In
Bioinformatics?

References
==========

1. S. Bradner "Key words for use in RFCs to Indicate Requirement
   Levels", RFC 2119, URI: https://tools.ietf.org/html/rfc2119

2. Wikipedia. Matrix (mathematics) (2023) URL:
   https://en.wikipedia.org/wiki/Matrix_(mathematics) [accessed
   2024-01-06T11:55+02:00; permalink:
   https://en.wikipedia.org/w/index.php?title=Matrix_(mathematics)&oldid=1192680220]

3. Wikipedia. Reverse Polish notation (2024) URL:
   https://en.wikipedia.org/wiki/Reverse_Polish_notation [accessed
   2024-01-06T11:12+02:00; permalink:
   https://en.wikipedia.org/w/index.php?title=Reverse_Polish_notation&oldid=1193885306]

4. Wikipedia. IEEE 754 (2023) URL: https://en.wikipedia.org/wiki/IEEE_754
   [accessed 2023-02-13T15:27+02:00; permalink:
   https://en.wikipedia.org/w/index.php?title=IEEE_754&oldid=1135219367] 

Colophon
========

$Id: matrix-rnp-calculator.txt 11028 2024-01-06 16:47:05Z saulius $
$URL:
svn+ssh://saulius.grazulis.lt/home/saulius/svn-repositories/paskaitos/VU/bioinformatika-III/u%C5%BEduotys-praktikai/skai%C4%8Di%C5%B3-u%C5%BEduotis-RNP-calculator/en/05-split-quaternion-RPN-calculator.txt
$