Retriever Details
Listing of retriever.h
RetrieverPtr;
public:
/**
* The factory will call the constructor with a string. The string
* specifies where to locate the data (e.g. a filename), but
* interpreting the string is left up to the implementing code.
*/
//Retriever(const std::string &);
/**
* The destructor must be virtual to make sure the right one is
* called in the end.
*/
virtual ~Retriever()=0;
/**
* This is the method for retrieving data from a file. The whole
* tree will be written to the new file immediately after being
* called. Interpreting the string is left up to the implementing
* code.
*/
virtual void getData(const std::string &, tree<Node> &)=0;
/**
* This method is to be used for debugging purposes only. While the
* string can be anything, most useful is "[mime_type] source".
*/
virtual std::string toString() const=0;
/**
* Factory method to create new retrievers.
*/
static RetrieverPtr getInstance(const std::string &, const std::string &);
};
The listing above shows the Retriever abstract base class. In
addition to these methods, there are a few assumptions made about
classes that implement this interface:
Other constraints
The copy constructor and
assignment operator will not be used. It is suggested that they be
declared private.
There is a static const string called
MIME_TYPE which will be used to determine if that particular
Retriever should be created by the factory. Care must be taken to
select a unique MIME_TYPE to prevent name clashes.
The destructor will properly
deallocate all resources allocated in the constructor. Specifically,
if a file is opened in the constructor, it should be closed in the
destructor.
If anything goes wrong during the
course of the Retriever's operation, a std::exception (or a class
derived from it) will be thrown.
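The constraints above can be sketched in a few lines. This is only an illustration, not code from the distribution: the class name ExampleRetriever, the MIME type "example/x-demo", and the vector-of-strings stand-in for the tree are all hypothetical, and the base class is abbreviated from the listing of retriever.h.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Assumption: a plain vector of strings stands in for the real tree type.
typedef std::vector<std::string> tree;

// Abbreviated copy of the abstract interface (factory method omitted).
class Retriever {
public:
  virtual ~Retriever() = 0;
  virtual void getData(const std::string &, tree &) = 0;
  virtual std::string toString() const = 0;
};
// A pure virtual destructor still needs a definition.
Retriever::~Retriever() {}

// Hypothetical retriever that follows the constraints listed above.
class ExampleRetriever : public Retriever {
public:
  explicit ExampleRetriever(const std::string &src) : source(src) {
    // problems are reported by throwing something derived from std::exception
    if (source.empty())
      throw std::invalid_argument("source must not be empty");
  }
  ~ExampleRetriever() {}  // nothing was allocated, so nothing to release
  void getData(const std::string &location, tree &tr) {
    if (location.empty())
      throw std::invalid_argument("cannot parse empty string");
    tr.push_back(source + ":" + location);
  }
  std::string toString() const { return "[" + MIME_TYPE + "] " + source; }
  static const std::string MIME_TYPE;  // consulted by the factory
private:
  // declared private and left undefined: copying is not supported
  ExampleRetriever(const ExampleRetriever &);
  ExampleRetriever &operator=(const ExampleRetriever &);
  std::string source;
};

// Must not clash with the MIME_TYPE of any other retriever.
const std::string ExampleRetriever::MIME_TYPE = "example/x-demo";
```

Declaring the copy operations private without defining them turns an accidental copy into a compile-time or link-time error rather than a silent duplication of the underlying resource.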
The rest of this chapter describes how to write the header and
source code for an example implementation.
The Simple ASCII Retriever as an Example
The simplest retriever is the one for the MIME type
text/plain. Because of this, it makes a good
example of how to create your own retriever. The files are located in
the text_plain subdirectory as
retriever.h and
retriever.cpp.
Listing of text_plain/retriever.h
// this is not intended to be inherited from
class TextPlainRetriever: public Retriever{
public:
TextPlainRetriever(const std::string &);
~TextPlainRetriever();
void getData(const std::string &, tree<Node> &);
std::string toString() const;
static const std::string MIME_TYPE;
private:
TextPlainRetriever(const TextPlainRetriever&);
TextPlainRetriever& operator=(const TextPlainRetriever&);
std::string source;
int current_line;
std::ifstream infile;
};
None of the methods are explicitly declared virtual, signaling that
this class is not intended to be derived from. That being said, you
may want to copy the header and code as a working starting point for
your own retriever. In this example the copy constructor and
assignment operator are made private, as specified in Other
constraints. The private data consists of the name of the file, the
current line number, and the file handle that is open for reading.
The file name and MIME type are used in toString to identify the
retriever uniquely for debugging, as seen in the toString listing.
Listing of simple ascii toString
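The body of this listing is not reproduced here. As a minimal sketch, assuming the "[mime_type] source" convention suggested in retriever.h, toString could look like the following. The class is a reduced stand-in that holds only what toString needs, and the value "text/plain" for MIME_TYPE is taken from the description above.

```cpp
#include <string>

// Reduced stand-in for TextPlainRetriever: only the parts toString uses.
class TextPlainRetriever {
public:
  explicit TextPlainRetriever(const std::string &str) : source(str) {}
  std::string toString() const;
  static const std::string MIME_TYPE;
private:
  std::string source;  // name of the file being read
};

const std::string TextPlainRetriever::MIME_TYPE = "text/plain";

// Build the debugging string in the form "[mime_type] source".
std::string TextPlainRetriever::toString() const {
  return "[" + MIME_TYPE + "] " + source;
}
```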
The first non-trivial function to write is the constructor, which
is not very complicated. The source and the bookkeeping for the
current position in the file (current_line) are initialized in
line 1. Line 3 opens the file, and line 6 confirms that it was opened
without error. If there is a problem, an exception is thrown, as
required by Other constraints. The constructor is very
brief because the C++ fstream library provides the
ifstream object, which does most of the
work.
Listing of the simple ascii constructor
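The body of this listing is not reproduced here, so the line numbers in the description refer to the original source. The steps it describes (initialize the members, open the file, throw on failure) might be sketched as follows. The initial value of 0 for current_line, the exception type, the error message, and the isOpen helper are assumptions made for illustration.

```cpp
#include <fstream>
#include <stdexcept>
#include <string>

// Reduced stand-in for TextPlainRetriever: only the constructor's concerns.
class TextPlainRetriever {
public:
  explicit TextPlainRetriever(const std::string &str);
  // hypothetical helper so the result can be inspected
  bool isOpen() const { return infile.is_open(); }
private:
  std::string source;    // where the data lives (a file name)
  int current_line;      // which line the stream is positioned at
  std::ifstream infile;  // the open file
};

TextPlainRetriever::TextPlainRetriever(const std::string &str)
    : source(str), current_line(0) {
  // open the file named by the source string
  infile.open(source.c_str());

  // report failure as an exception, as Other constraints requires
  if (!infile.is_open())
    throw std::invalid_argument("Could not open file: " + source);
}
```

Throwing from the constructor means a caller never holds a retriever whose file failed to open.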
The destructor for the Retriever is even simpler, since all
it has to do is close the file. There were no calls in the constructor
(or anywhere else) to new or
malloc, so the destructor does not need to call
delete or free.
Listing of the simple ascii destructor
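The body of this listing is not reproduced here. A minimal sketch of a destructor that only closes the file, again using a reduced stand-in class, might be:

```cpp
#include <fstream>
#include <string>

// Reduced stand-in for TextPlainRetriever: only the stream and its cleanup.
class TextPlainRetriever {
public:
  explicit TextPlainRetriever(const std::string &str) : infile(str.c_str()) {}
  ~TextPlainRetriever();
private:
  std::ifstream infile;
};

TextPlainRetriever::~TextPlainRetriever() {
  // the only resource acquired in the constructor is the open file
  if (infile.is_open())
    infile.close();
}
```

std::ifstream closes its file when it is destroyed, so the explicit close is not strictly required; it is written out here to mirror the description above.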
Next is the getData function, which is
simple as well. All that getData does is grab a
line of text from the file and create a node. Lines 3-4 are error
checking, and line 7 converts the location
string into an integer. Line 10 moves to the appropriate place in the
file, while line 12 gets the string on that line. Since every
getData must put a node
into the provided tree, line 15 creates a
node to be filled with data. Lines 18-20 update
the generic node with the string read in from
the source file. Finally, line 21 adds the single node to the supplied
tree.
Listing of the simple ascii getData
void TextPlainRetriever::getData(const string &location, tree<Node> &tr){
// check that the argument is not an empty string
if(location.size()<=0)
throw invalid_argument("cannot parse empty string");
// check that the argument is an integer
int line_num=string_util::str_to_int(location);
// set stream to the line before
skip_to_line(infile,current_line,line_num);
// read the line from the file
string text=read_line(infile);
// create an empty node
Node node("empty","empty");
// put the data in the node
vector<int> dims;
dims.push_back(text.size());
update_node_from_string(node,text,dims,Node::CHAR);
tr.insert(tr.begin(),node);
}
The getData listing is brief because it
leverages existing functionality. The ifstream
object does all of the work of getting information out of a
file. skip_to_line and
read_line are very short functions that scan to a
point in an ASCII file and read from a point to the next end-of-line
character, respectively. Finally, the function
update_node_from_string already existed in NXtranslate
to assist node creation while reading
the translation file. The interested reader can look at the source of
node_util.cpp and
text_plain/retriever.cpp to see the body of the
functions.