Retriever Details

Listing of <filename>retriever.h</filename>

    RetrieverPtr;

    public:
      /**
       * The factory will call the constructor with a string. The string
       * specifies where to locate the data (e.g. a filename), but
       * interpreting the string is left up to the implementing code.
       */
      //Retriever(const std::string &);

      /**
       * The destructor must be virtual to make sure the right one is
       * called in the end.
       */
      virtual ~Retriever()=0;

      /**
       * This is the method for retrieving data from a file. The whole
       * tree will be written to the new file immediately after being
       * called. Interpreting the string is left up to the implementing
       * code.
       */
      virtual void getData(const std::string &, tree<Node> &)=0;

      /**
       * This method is to be used for debugging purposes only. While the
       * string can be anything, most useful is "[mime_type] source".
       */
      virtual std::string toString() const=0;

      /**
       * Factory method to create new retrievers.
       */
      static RetrieverPtr getInstance(const std::string &, const std::string &);
    };

The listing above shows the Retriever abstract base class. In addition to these methods, a few assumptions are made about classes that implement this interface.

Other constraints

- The copy constructor and assignment operator will not be used. It is suggested that they be made private methods.
- There is a static const string called MIME_TYPE, which is used to determine whether that particular Retriever should be created by the factory. Care must be taken to select a unique MIME_TYPE to prevent name clashes.
- The destructor will properly deallocate all resources allocated in the constructor. Specifically, if a file is opened in the constructor, it should be closed in the destructor.
- If anything goes wrong during the course of the Retriever's operation, a std::exception will be thrown.

The rest of this chapter describes how to create the header and the body of code for an example implementation.
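The constraints above can be illustrated with a small, self-contained sketch. It is not taken from the NXtranslate sources: RetrieverSketch, DemoRetriever, DemoTree, makeRetriever and the MIME type "text/x-demo" are all hypothetical names chosen for illustration, and the tree of nodes is replaced by a plain vector of strings so that the example compiles on its own. How the real factory method getInstance selects a retriever is not shown in this chapter; comparing the requested type against each MIME_TYPE is only one plausible approach.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for the tree type passed to getData (assumption: the real
// interface uses a tree of nodes, which is not reproduced here).
typedef std::vector<std::string> DemoTree;

// Simplified stand-in for the Retriever abstract base class.
class RetrieverSketch {
public:
  virtual ~RetrieverSketch() {}
  virtual void getData(const std::string &location, DemoTree &tr) = 0;
  virtual std::string toString() const = 0;
};

// Hypothetical retriever that follows the listed constraints.
class DemoRetriever : public RetrieverSketch {
public:
  explicit DemoRetriever(const std::string &src) : source(src) {
    // Constraint: problems are reported by throwing a std::exception.
    if (source.empty())
      throw std::invalid_argument("source must not be empty");
  }

  // Constraint: release whatever the constructor acquired. Nothing
  // was acquired here, so the body is empty.
  ~DemoRetriever() {}

  void getData(const std::string &location, DemoTree &tr) {
    if (location.empty())
      throw std::invalid_argument("cannot parse empty string");
    tr.push_back(source + ":" + location);
  }

  // Follows the "[mime_type] source" convention for debugging output.
  std::string toString() const { return "[" + MIME_TYPE + "] " + source; }

  // Constraint: a unique string the factory can match against.
  static const std::string MIME_TYPE;

private:
  // Constraint: copying is not used, so both members are private and
  // intentionally left undefined.
  DemoRetriever(const DemoRetriever &);
  DemoRetriever &operator=(const DemoRetriever &);

  std::string source;
};

const std::string DemoRetriever::MIME_TYPE = "text/x-demo";

// Hypothetical factory: picks a retriever by comparing MIME types.
std::unique_ptr<RetrieverSketch> makeRetriever(const std::string &source,
                                               const std::string &mimeType) {
  if (mimeType == DemoRetriever::MIME_TYPE)
    return std::unique_ptr<RetrieverSketch>(new DemoRetriever(source));
  throw std::invalid_argument("no retriever for mime type " + mimeType);
}
```

Calling makeRetriever("data.txt", "text/x-demo") in this sketch yields an object whose toString returns "[text/x-demo] data.txt", while an unknown MIME type results in an exception rather than a null pointer.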
The Simple ASCII Retriever as an Example

The simplest retriever is the one for the MIME type text/plain. Because of this it makes a good example of how to create your own retriever. The files are located in the text_plain subdirectory as retriever.h and retriever.cpp.

Listing of <filename>text_plain/retriever.h</filename>

    // this is not intended to be inherited from
    class TextPlainRetriever: public Retriever{
     public:
      TextPlainRetriever(const std::string &);
      ~TextPlainRetriever();
      void getData(const std::string &, tree<Node> &);
      std::string toString() const;
      static const std::string MIME_TYPE;
     private:
      TextPlainRetriever(const TextPlainRetriever&);
      TextPlainRetriever& operator=(const TextPlainRetriever&);
      std::string source;
      int current_line;
      std::ifstream infile;
    };

Note that none of the methods are declared virtual here, which signals that the class is not intended to be derived from. That being said, you may want to copy the header and code as a working basis for your own retriever. In this example the copy constructor and assignment operator are made private, as specified in Other constraints. The private data consists of a file handle, the name of the file that is open for reading, and the current line position. The file name and MIME type are used in toString to identify the retriever uniquely for debugging, as seen in the toString listing.

Listing of simple ascii <function>toString</function>

The first non-trivial function to write is the constructor. The constructor is not very complicated or insightful. The source and the record of where reading is positioned in the file (current_line) are initialized in line 1. Line 3 opens the file, and line 6 confirms that it was opened without error. An exception is thrown if there is a problem, in keeping with Other constraints. The constructor is very brief because the C++ fstream library provides the ifstream object that does most of the work.

Listing of the simple ascii constructor

The destructor for the retriever, in the destructor listing, is even simpler, since all it has to do is close the file.
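As a rough illustration of the pattern described so far (initialize the members, open the file, check that it opened, close it again on destruction), here is a minimal sketch. It is not the contents of text_plain/retriever.cpp: the class name PlainFileSketch, the accessor currentLine, the starting value of 0 for current_line and the choice of std::invalid_argument are assumptions made for the example.

```cpp
#include <cassert>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in; not the text_plain/retriever.cpp source.
class PlainFileSketch {
public:
  explicit PlainFileSketch(const std::string &str)
      : source(str), current_line(0) {  // starting at 0 is an assumption
    // open the file for reading
    infile.open(source.c_str());

    // confirm that the file was opened without error
    if (!infile.is_open())
      throw std::invalid_argument("could not open file: " + source);
  }

  ~PlainFileSketch() {
    // nothing came from new or malloc, so closing the file is enough
    if (infile.is_open())
      infile.close();
  }

  // "[mime_type] source" identifies the object in debugging output
  std::string toString() const { return "[" + MIME_TYPE + "] " + source; }

  // hypothetical accessor, present only so the counter can be inspected
  int currentLine() const { return current_line; }

  static const std::string MIME_TYPE;

private:
  // copying is not supported: declared private and left undefined
  PlainFileSketch(const PlainFileSketch &);
  PlainFileSketch &operator=(const PlainFileSketch &);

  std::string source;
  int current_line;
  std::ifstream infile;
};

const std::string PlainFileSketch::MIME_TYPE = "text/plain";
```

Because std::ifstream closes its file when it is destroyed, the explicit close in the destructor is not strictly required in this sketch; it is written out to mirror the constraint that whatever the constructor opens, the destructor closes.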
There were no calls in the constructor (or anywhere else) to new or malloc, so the destructor does not need to call delete or free.

Listing of the simple ascii destructor

Next is the getData function, which is simple as well. All that getData does is grab a line of text from the file and create a node. Lines 3-4 are error checking, and line 7 converts the location string into an integer. Line 10 moves to the appropriate place in the file, while line 12 gets the string on that line. Since every getData must put a node into the provided tree, line 15 creates a node to be filled with data. Lines 18-20 update the generic node with the string read in from the source file. Finally, line 21 adds the single node to the supplied tree.

Listing of the simple ascii <function>getData</function>

    void TextPlainRetriever::getData(const string &location, tree<Node> &tr){
      // check that the argument is not an empty string
      if(location.size()<=0)
        throw invalid_argument("cannot parse empty string");

      // check that the argument is an integer
      int line_num=string_util::str_to_int(location);

      // set stream to the line before
      skip_to_line(infile,current_line,line_num);
      // read the text on that line
      string text=read_line(infile);

      // create an empty node
      Node node("empty","empty");

      // put the data in the node
      vector<int> dims;
      dims.push_back(text.size());
      update_node_from_string(node,text,dims,Node::CHAR);
      tr.insert(tr.begin(),node);
    }

The getData function is brief because it leverages existing functionality. The ifstream object does all of the work of getting information out of a file. skip_to_line and read_line are very short functions that scan to a point in an ASCII file and read from a point to the next end-of-line character, respectively. Finally, the function update_node_from_string already existed in NXtranslate to assist node creation while reading the translation file. The interested reader can look at the source of node_util.cpp and text_plain/retriever.cpp to see the body of the functions.
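For readers who do not have the source at hand, the two file helpers might look roughly like the sketch below. These are not the functions from text_plain/retriever.cpp: the names skip_to_line_sketch and read_line_sketch are hypothetical, lines are assumed to be counted from zero, and, unlike the read_line call in the listing, the read helper here also receives the line counter so that the bookkeeping stays consistent between calls. The real helpers may handle that differently.

```cpp
#include <cassert>
#include <istream>
#include <limits>
#include <sstream>
#include <stdexcept>
#include <string>

// Position the stream so that the next read starts at line `target`.
// `current` is the line the stream is positioned at and is updated in
// place. Lines are counted from zero (an assumption of this sketch).
void skip_to_line_sketch(std::istream &in, int &current, int target) {
  if (target < 0)
    throw std::invalid_argument("line number must not be negative");

  // going backwards requires rewinding to the start of the stream
  if (target < current) {
    in.clear();
    in.seekg(0, std::ios::beg);
    current = 0;
  }

  while (current < target) {
    // discard everything up to and including the next newline
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    if (!in || in.eof())
      throw std::out_of_range("file has fewer lines than requested");
    ++current;
  }
}

// Read from the current position to the next end-of-line character
// and record that the stream has moved on by one line.
std::string read_line_sketch(std::istream &in, int &current) {
  std::string text;
  if (!std::getline(in, text))
    throw std::runtime_error("could not read a line");
  ++current;
  return text;
}
```

Both helpers take a std::istream rather than a std::ifstream, so the same code can be exercised against a std::istringstream without touching the file system.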