asgn
- assignment files - do not touchfmwkng.hpp
- generic testing frameworkmacroasgn.hpp
- data generator for the Macroprocessormacroasgn.cpp
- data generator for the Macroprocessormacromain.cpp
- testing pattern for the Macroprocessormacrogold*.cpp
- checksum and timing data for various configurationssol
- solution skeleton files - edit your solution heremacrosol.hpp
- the Macroprocessormacrosol.cpp
- the Macroprocessormachine | author | commit | commit date | platform | time | checksum | check | rel time | bonus |
---|
Your task is to implement a class representing the state of a macroprocessor with the following interface:
template< typename policy> class macroprocessor { public: bool process(const std::string& s, std::string& output); };The function
process
shall analyze the string s
containing an input line (without any line delimiters).
false
.
output
and return true
.
The class shall be a template parameterized with one of the policy classes describing the vectorization instruction set supported.
For this class, vectorization is probably useless, therefore, you may avoid some template-related problems by rewriting the interface as follows:
class macroprocessor_impl { public: bool process(const std::string& s, std::string& output); }; template< typename policy> class macroprocessor : public macroprocessor_impl {};
The macroprocessor class will not be copied or moved, i.e. copy/move support may be disabled.
All input lines are composed of words, white-space, and (in the case of macro definition) the character '='
.
Words are sequences of letters, i.e. [a-zA-Z]+
.
White-space is a sequence of spaces, i.e. [ ]+
.
Adjacent words must be separated by white-space, in addition,
there may be white-space at the start/end of a line and next to the '=' sign.
Lines will never contain a newline character, nor any other character not mentioned above.
The output lines shall contain only words, separated by single spaces. (The correctness is verified by hashing the output lines.)
The definition of a macro is a line containing a '='
after the first word (optionally interleaved by whitespace).
The first word is the name of the macro being defined, the (possibly empty) remaining sequence of words is the body of the macro.
Macro names are case-significant.
The same name may be used more than once, i.e. macro definitions may be changed (but not deleted).
Macro expansion, i.e. the replacement of macro names by macro bodies, is done only during the (recursive) processing of an input line which is not a macro definition. Macro definitions themselves must be stored without expansion, because a subsequent macro definition may change the expansion of a word in a macro body. The expansion process logically works with sequences of words, i.e. words are never glued together and the amount of white-space between them is irrelevant.
The inputs will never contain recursive macros, i.e. no detection of/protection against recursion is required.
The following input
EF = GO G GO ABC = abc EF def BFLM ABC D EF G = GAGA BFLM ABC D EF G = BFLM ABC D EFshall produce the following output
BFLM abc GO G GO def D GO G GO BFLM abc GO GAGA GO def D GO GAGA GO BFLM abc GO GO def D GO GO
The interface of std::[unordered_]map<K,T>
contains the function
iterator find( const K &);When used as
std::[unordered_]map<std::string,T>
, it means that the sequence of characters being searched must be converted to std::string, invoking allocation and copying of the characters.
std::[unordered_]map<std::string_view,T>
;
however, the ownership of the character sequences (i.e. macro names) stored in the map must be solved somehow because string_view
does not act as the owner.
nonmacro_weight indirectly sets the probability that a word is a macro name, iterated through the set { 3, 12, 48 } which roughly corresponds to probabilities { 0.5, 0.2, 0.06 }. Note that being a macro name does not necessarily mean that the macro was already defined because the macro definitions are randomly scattered among the lines; consequently, the true frequency of defined macros is lower, especially within the first lines. In Debug mode, only { 3, 48 } is used.
exp_width is the expected number of words in a line (or the right-hand side of a macro), iterated through the set { 3, 12 }. The numbers of words are randomly generated according to an exponential distribution which means that there is significant variability with both small and large outliers.
lines is an auto-adjusted parameter, determining the number of lines, i.e. calls to process. The range of the parameter is set so that the expected run time ranges from fractions of a second to seconds (stopped by the auto-adjustment mechanism after exceeding a second).
With the command-line parameter --dump=on
,
the testing framework will produce a number of input.*.txt
and output.*.txt
files in the current directory
(a pair of files for each pair of the nonmacro_weight and exp_width parameters and for each thread).
These files contain the first 1000 input lines generated by the framework
and the corresponding output generated by your solution.
(The files are generated before the actual performance measurement starts,
the generator is restarted and the macroprocessor
class recreated before the actual measurement.)
Some generated inputs and their reference outputs may be found here: macro-gold.zip