# $Author: saulius $
# $Revision: 9766 $
# $Date: 2022-02-08 10:24:14 +0000 (Tue, 08 Feb 2022) $

Vector file format check
========================

INTRO
=====

Words MUST, MAY, SHALL, SHOULD, written in all capitals, should be
interpreted as specified in RFC 2119 [1].

Input data validation is an important part of all computational
research. One part of input validation involves checking input file
formats. Although any program that uses files of a given format should
in principle be able to perform the check, sometimes its checks are not
stringent or fast enough. A standalone tool for format checking is thus
very welcome.

PROGRAM
=======

Write a Perl program that checks the format of vector input files.

Program name: vcheckdim

Program invocation: vcheckdim vectors*.dat

Input files are optional; if they are not provided, the program MUST
read from STDIN (i.e. its design must follow the Unix filter pattern).

DATA FORMATS
============

Input data
==========

-- All lines that start with a hash symbol ("#", ASCII HEX 23) MUST be
   treated as comments and ignored;

-- All empty lines MUST be ignored;

-- All data lines MUST contain whitespace-separated floating-point
   numbers following the conventional programming language (Perl,
   Pascal, C) syntax (e.g. 6.02e+23);

-- All other lines SHOULD be treated as errors.

Output data
===========

All data output MUST proceed to STDOUT.

-- The first line of the output MUST contain the program Id, as
   generated by Subversion keywords (or the equivalent generated by
   other version control systems); the line MUST start with the hash
   character ("#") and MUST NOT contain dollar characters ("$", ASCII
   HEX 24);

-- The remaining lines MUST contain the same vector component lines as
   in the input. Only correct lines must be output to STDOUT.

ERROR HANDLING
==============

All error messages MUST be output to STDERR. If an error is detected, a
suitable non-zero status (exit) code MUST be returned.
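
The input rules under DATA FORMATS, together with the STDOUT/STDERR
split, might be sketched as below. This is only an illustration, not a
reference solution: the subroutine name classify_line, the pattern in
$float, the sample lines and the message wording are all assumptions.
In particular, the pattern accepts only plain decimal literals with an
optional exponent, and lines holding nothing but whitespace are treated
as empty; the text above fixes neither choice.

```perl
use strict;
use warnings;

# Assumed pattern for one floating-point literal such as 6.02e+23.
# The task text does not fix the exact grammar; this is one guess.
my $float = qr/[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?/;

# Classify one input line as 'comment', 'empty', 'data' or 'error'.
sub classify_line
{
    my ($line) = @_;
    return 'comment' if $line =~ /^#/;
    # Assumption: whitespace-only lines count as empty lines.
    return 'empty'   if $line =~ /^\s*$/;
    return 'data'    if $line =~ /^\s*${float}(?:\s+${float})*\s*$/;
    return 'error';
}

# Illustrative input; a real filter would read lines with while(<>).
my @sample = ( "# a comment\n", "\n", "1.0 2.5 6.02e+23\n", "1.0 abc\n" );

my $error_count = 0;
for my $line (@sample) {
    my $kind = classify_line( $line );
    if( $kind eq 'data' ) {
        print $line;                          # correct lines go to STDOUT
    } elsif( $kind eq 'error' ) {
        chomp( my $shown = $line );
        warn "incorrect line '$shown'\n";     # diagnostics go to STDERR
        $error_count++;
    }
}
```

A complete program would additionally turn a non-zero $error_count into
a non-zero exit status; the sketch stops short of calling exit so that
it can be run and inspected as is.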
The program MUST detect abnormal situations and report them.
Informative, user-friendly error reporting should be used. In
particular, the error message MUST contain: the program name; the input
file name where the error was detected (use '-' without the quotes as a
file name if your input is STDIN); the input line number where the
error was detected; a quoted example of the erroneous input. The
program MAY output the position of the first character where the error
was detected. The error messages should be informative and permit the
user to fix the error.

Perl functions 'die' and 'warn', invoked with and without the "\n"
character at the end of the message, are examples of acceptable error
reporting. The native Perl diagnostics (switched on by "use warnings")
and file conditions detected by the "while(<>) {...}" construct SHOULD
be used for error reporting.

The program SHOULD attempt to recover from errors and SHOULD continue
its operation as long as possible.

The following conditions MUST be detected and reported:

-- missing, unreadable, unreachable files;

-- system read errors;

-- wrong input number format;

-- dimensions of the vectors that do not match the dimension of the
   first encountered vector.

The program MUST return zero status if no errors were found, and
non-zero status if some errors were detected. Status 1 should indicate
that there were lines with a mismatched number of components; higher
values MUST be used to indicate that other errors were present. The
program SHOULD indicate each type of error with a different status
code.

REFERENCES
==========

1. S. Bradner "Key words for use in RFCs to Indicate Requirement
   Levels" (1997) URL: https://tools.ietf.org/html/rfc2119

2. Wikipedia. IEEE 754 (2022) URL:
   https://en.wikipedia.org/wiki/IEEE_754 [accessed
   2022-02-08T09:01:44 EET].
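
The reporting requirements from ERROR HANDLING (program name, file
name, line number, quoted input) and the dimension check against the
first vector might be sketched as follows. The subroutine names
error_message and check_dimensions, the message layout and the sample
data are assumptions made for illustration; number format validation is
deliberately left out, so the sketch covers only the condition that
maps to status 1.

```perl
use strict;
use warnings;

# Build one message carrying the four mandatory parts: program name,
# file name, line number and the quoted erroneous input.
# The layout "prog: file(line): ERROR, ..." is an assumed convention.
sub error_message
{
    my ($program, $file, $line_no, $text, $reason) = @_;
    chomp $text;
    return "$program: $file($line_no): ERROR, $reason: '$text'\n";
}

# Compare the component count of every data line with that of the
# first data line; return the list of messages for mismatching lines.
sub check_dimensions
{
    my ($program, $file, $lines) = @_;
    my ($expected, @messages);
    my $line_no = 0;
    for my $text (@$lines) {
        $line_no++;
        # Comments and empty lines are skipped but still counted.
        next if $text =~ /^#/ || $text =~ /^\s*$/;
        my @components = split ' ', $text;
        $expected = scalar @components unless defined $expected;
        if( @components != $expected ) {
            my $found = scalar @components;
            push @messages, error_message(
                $program, $file, $line_no, $text,
                "expected $expected components, found $found" );
        }
    }
    return @messages;
}

# Illustrative input, as if read from STDIN (file name '-').
my @input = ( "# vectors\n", "1 2 3\n", "4 5\n", "6 7 8\n" );
my @messages = check_dimensions( 'vcheckdim', '-', \@input );
warn $_ for @messages;

# Status 1 marks mismatched dimensions, as required by the text above.
my $status = @messages ? 1 : 0;
```

A complete program would finish with "exit $status" and would use
higher values for the other error types; the sketch only computes the
value so that it can be run and inspected as is.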