Iterators are used to abstract moving sequentially through containers, from the details of the container. This also allows the many STL template algorithms to be independent of the container type.Containers classes supply suitable typedef typenames for their iterator types. Iterators are used with a very similar syntax to C pointers. In fact pointers can generally be supplied as iterators to most STL functions where legacy C arrays are being used. To get at the object pointed to by the iterator, just dereference it like a pointer. Pointer-like arithmetic can be done with the iterators of random access containers (e.g. vector). Other iterators (e.g. for maps) may only supply ++ and -- operators.
Most STL containers supply begin() and end() methods to access the first element, and one past the last. Note end() is not a valid element, but because STL algorithms are generally [), end() is usually the correct loop terminator.
For example consider a loop to reset every AAM of a vector of AAM sub-models.
vcl_vector< vaxm_mr_active_model_instance > subModels;
....... and fill up the vector of model instances somehow, then
............
//Reset all
vcl_vector< vaxm_mr_active_model_instance >::iterator submIter = subModels.begin();
vcl_vector< vaxm_mr_active_model_instance >::iterator submIterEnd = subModels.end();
while(submIter != submIterEnd)
{
(submIter++)->reset();
}
If a container is supplied as a const reference, use const_iterator. e.g. copy out a subset of points defined by an index vector.
vcl_vector< vgl_point_2d< double > > points ;
vcl_vector< unsigned > index ;
//... fill in points index somehow...
vcl_vector< vgl_point_2d< double > > subset(0) ;
//The index should not be changed so use a const_iterator
vcl_vector< unsigned >::const_iterator indexIter = index.begin();
vcl_vector< unsigned >::const_iterator indexIterEnd = index.end();
subset.reserve(index.size());
while(indexIter != indexIterEnd)
{
subset.push_back( points[*indexIter++]);
}
Note the use of reserve to pre-allocate the necessary memory. This is more efficient than relying on the automatic reallocation of push_back. Also typically more efficient than using resize, as no construction of default objects occurs.
You can also use a back_insertion iterator. Assigning into this in effect does the push_back, but back_inserters are generally more useful in STL algorithms (see later). Note there are similar front_insertion iterators, and a general insertion iterator. The latter is particularly useful when the size of the output container is not known in advance of the algorithm (e.g. set difference or set intersection). The push_back in the previous example can be replaced by a back_insertion iterator thus.
subset.reserve(index.size());
back_insert_iterator< vcl_vector< unsigned > > bi(subset);
while(indexIter != indexIterEnd)
{
*bi++ = points[*indexIter++]);
}
You can also iterate backwards from end to beginning using a reverse_iterator. e.g. copy out a subset of points defined by an index vector and reverse the ordering.
vcl_vector< vgl_point_2d< double > > points ;
vcl_vector< unsigned > index ;
//... fill in points and index somehow...
vcl_vector< vgl_point_2d< double > > subset(0) ;
vcl_vector< unsigned >::const_reverse_iterator rindexIter = index.rbegin();
vcl_vector< unsigned >::const_reverse_iterator rindexIterEnd = index.rend();
subset.reserve(index.size());
while(rindexIter != rindexIterEnd)
{
subset.push_back( points[*rindexIter++]);
}
You can get the associated normal iterator from a reverse_iterator using the base method.
//Get the first element equal to 3 counting back from the end
vcl_vector< unsigned >::reverse_iterator ri=find(v.rbegin(),v.rend(),3) ;
vcl_vector< unsigned >::iterator i=ri.base();
Then i points to the element AFTER 3, viewing in the forward sense. So ((++ri).base()) is a forward iterator which points to 3.
Suppose we use dynamic programming to find the shortest route through a graph connected via a sequence of stages.The recursion in effect gives us the route from end back to start, so we typically need to reverse it.
vpdp_pt2d_node::node_vec best_route;
vpdp_pt2d_node::stage_vec graph;
.... somehow add nodes to each stage of graph
if(vpdp_execute(graph,best_route)
vcl_copy(best_route.rbegin(), //print reversed solution to cout
best_route.rend(),
vcl_ostream_iterator< vpdp_pt2d_node >(vcl_cout,"\n"));
Note the use of stream iterator in copy. Assigning to an ostream_iterator is like doing operator<< to the linked stream. There is a similar istream_iterator. End of stream is indicated in a slightly odd way - a default constructed iterator in effect means end of stream. For example to read an entire text file into a string, use the iterator range constructor with a istreambuf_iterator.
ifstream infile("config.txt");
vcl_string strConfig(istreambuf_iterator< char >(infile),
istreambuf_iterator< char >()); //This means up to end of stream
NB whereas istream_iterator performs a formatted read, an istreambuf_iterator does unformatted character reads on increment - so is more efficient. Also it preserves whitespace by default.
As find-like algorithms return "end" on failure, a search within a file will return a default constructed istream_iterator if it fails. For example to scan a file stream until a keyword is located just use find, checking the return:
//Keep reading past any header or comments line until the keyword $KEY is found
vcl_istream_iterator< vcl_string > ifs_end; //default stream iterator means end of stream condition
if(vcl_find(vcl_istream_iterator< vcl_string >(infile),ifs_end,"$KEY") == ifs_end)
{
throw myException(vcl_string("Could not locate start of config data "));
}
//.. Then read the data from here using >> on infile stream.
Note if the sought keyword is a single character then istreambuf_iterator is more efficient.
STL containers normally have insert and remove methods which take an iterator range as well as for individual elements. These are normally faster than looping over multiple insertions (especially if you forget to reserve).
//Identify the sub-range within ptsIndex and insert it into a segment-specific sub-vector
vcl_vector< unsigned> ptsIndex;
..... Fill up ptsIndex somehow in parts connectivity order and select two IDs for a sub-segment
unsigned firstId,secondId;
....
vcl_vector< unsigned> ptsIndexSeg;
vcl_vector< unsigned>::const_iterator startIter = vcl_find(ptsIndex.begin(),ptsIndex.end(),firstId);
vcl_vector< unsigned>::const_iterator endIter = vcl_find(ptsIndex.begin(),ptsIndex.end(),secondId);
if(startIter != ptsIndex.end() && endIter != ptsIndex.end())
{
vcl_vector< unsigned> ptsIndexSeg; //The sub-segment if indices between first and second Ids
if(startIter < endIter) //we iterate in same sense as parts
{
//Use an iterator range insert method where possible
ptsIndexSeg.insert(ptsIndexSeg.end(),startIter,endIter+1); //Note the +1 to include endIter itself
}
else //reverse order to parts - need to reverse the ordering
{
//Unfortunately member insert does not work with reverse iterator so use reverse_copy
ptsIndexSeg.reserve(1+unsigned(vcl_distance(endIter,startIter)));
vcl_reverse_copy(endIter,startIter+1,vcl_back_inserter(ptsIndexSeg));
//NB the reserve is not needed but makes it probably more efficient by avoiding re-allocations as
//ptsIndexSeg increases in size.
}
}
So note vector (and other random access) iterators are ordered and vcl_distance gives the number in the sequence between two iterators [first,last). Note also the use of back_inserter as a shorthand for constructing a templated back_insert_iterator. This is a global template function and works out its return type from the input type.
By our standards "algorithm" might sometimes seem a slight misnomer for STL algorithms. In this context an algorithm essentially means a templated function which operates on an iterator range, often an entire container. Note pointers are still iterators and so STL algorithms can also be used on C-style arrays via pointers. There are around 100 STL algorithms, this section just gives you a feel for some of the more common. Generally they are more efficient, less bug-prone, and more succinct than writing your own loop. Also the algorithm name announces the intention immediately, in a way that for(i=0;i < N;i++) does not!. You can probably replace around 50% of the loops in your code with STL algorithm calls.
Algorithms all take iterator ranges, often this will be from begin() to end(). NB the range is always [a,b) - the last element is NOT acted on (just as .end() is one beyond the last valid element). The simplest algorithms just do something to each element of a container. Examples are fill, replace, generate. There are also copy and reverse_copy as seen above, and numerous copy variants (replace_copy, remove_copy_if etc etc).
For example to set all point states to free, and associated weights to zero.
vcl_fill(pointStates.begin(),pointStates.end(),vsml_PtFree); vcl_fill(pointWeights.begin(),pointWeights.end(),0.0);
To replace all occurrences of a value, use replace, e.g.:
vcl_replace(pointStates.begin(), pointStates.end(),
vsml_PtNormal,
vsml_PtConstrained); //make all normal points constrained
//Replace any normal point weight (i.e. weight < 10.0) with a notional weight of INITIAL_WEIGHT
vcl_replace_if(pointWeights.begin(),pointWeights.end(),
vcl_bind2nd(vcl_less< double >(), 10.0),
INITIAL_WEIGHT);
See later in the Functor section (Adapters) for the meaning of bind2nd.
To count up the number of times a value occurs use count, e.g. for number of fixed points:
unsigned nfixed = vcl_count(pointStates.begin(), pointStates.end(), vsml_PtFixed);
Many algorithms perform an operation on the objects pointed to by the iterators. The most common are for_each and transform.These can take a normal function, but it is often more useful to use a function-like object known as a functor. A functor is a class which overrides the function call operator (operator()). Suppose we want to realign a vector of points to their closest point on a bezier defined by some other "special" points.
// Create a Bezier curve from these points
vsml_bezier bezier(markedPoints);
bezier.smooth();
//Calculate the located points, displaced to align to their nearest point on the "true" bezier
vcl_vector< vgl_point_2d< double> > alignedPoints(shapePoints);
vcl_for_each(alignedPoints.begin(),alignedPoints.end(),
align_to_bezier_functor(bezier,vsml_part::CFOpen));
Here align_to_bezier_functor would be passed a reference to the bezier at construction, the class would provide: operator()(vgl_point_2d< double> & pt) to realign the input point onto its closest point on the curve.
As functors are typically short lived stack objects they are frequently constructed anonymously as above. Also as temporary stack objects, it often makes sense to use reference members which are perforce set at construction. For example the bezier align functor class would look something like this (note the & on the bezier).
//Align a point to its closest point on a bezier
class align_to_bezier_functor
{
const vsml_bezier& bezier_;
bool isOpen_ ;
public:
align_to_bezier_functor(const vsml_bezier& bezier,vsml_part::CurveForm form):
bezier_(bezier),isOpen_(form==vsml_part::CFOpen ? true : false) {}
void operator()(vgl_point_2d< double>& pt);
};
Sometimes it is necessary to have some additional data in the functor to accumulate some total over the range for example. In this context note that although for_each is often treated as "void" it actually returns a copy of the functor, so you can use this returned object to access the final functor state. Also note that all STL algorithms operate on a copy of the functor- not the one you supply. Therefore keep functors lightweight. If they do need lots of data, use something like a bridge pattern with a separate class to hold the data, and let the functor have a pointer to it. Use of a bridging "helper" class like this is also usually the best way to use polymorphism with functors.
Another common type of loop is to loop over an input range, perform some calculation for each element, and then assign the result into the corresponding element of an output range. For this the STL provides transform. Actually there are two variants for either a unary or binary transform function.The first version of transform takes a single input range, and a unary operation. To transform from world to image coordinates over a vector of points, via a vimt_transform:
vcl_transform(worldPoints_.begin(),worldPoints_.end(),
imagePoints_.begin(),
image_.world2im());
The output range of transform can be the same as the input range. So for example to perform a warp on a vector of points:
mbl_thin_plate_spline_2d warper;
warper.build(modelPoints,initialPoints);
vcl_transform(rawPoints.begin(),rawPoints.end(), //input range
rawPoints.begin(), //output range - just overcopy warped point
warper);
Suppose now (referring to the bezier alignment example of for_each) we want to calculate the distance between our original points and their realignment on the bezier. Here, the output range is a different container than the input range so for_each is no good. Instead we use transform again, this time with a double input range and a binary functor.
//Then calculate the error distances
vcl_vector< double > errors;
errors.reserve(bezierIndices.size());
vcl_transform(markedPoints.begin(),markedPoints.end(), //First input range
alignedPoints.begin(), //second input range
vcl_back_inserter(errors), //output onto end
point2D_distance_functor()); //distance from point 1 to point 2
The distance functor would look something like this:
//: return the distance between two points
struct point2D_distance_functor
{
double operator()(const vgl_point_2d< double>& pt1, const vgl_point_2d< double>& pt2)
{return vgl_distance(pt1,pt2);}
};
There are obscure compiler reasons why using template functions like vgl_distance often fails to compile as an input to an STL algorithm, whereas template functors are fine. There can be situations where the name of an instantiation of a function template is not equivalent to the name of a function. This is why vgl_distance was not used directly. Functors are also more likely to lead to successful inline-ing of code in STL algorithms, because functors are passed as objects, whereas functions are passed to STL constructs as pointers to the function, and on many compilers this precludes inline-ing. This is why STL sort usually runs significantly faster than the old C qsort for example.
A better implementation of the distance functor would be to make it a template and derive from vcl_binary_function to make it adaptable (see later). A "proper" version would be :
template< class T>
struct distance_functor : public vcl_binary_function< vgl_point_2d< T >,
vgl_point_2d< T >, T >
{
T operator()(const vgl_point_2d< T >& pt1, const vgl_point_2d< T >& pt2)
{return vgl_distance(pt1,pt2);}
};
Our loop above would then become:
vcl_transform(markedPoints.begin(),markedPoints.end(), //First input range
alignedPoints.begin(), //second input range
vcl_back_inserter(errors), //output onto end
distance_functor< double >()); //distance from point 1 to point 2
The template functor is entirely free of the obscure compiler problems which plague the template function (see Effective STL by Scott Meyers for an explanation), and you can be pretty sure the code is inlined too.
Transform-type functors often also work well with reference members. For example the early iterator loop example extracting a subset of points from a vector via an index vector, can be performed via an indexing functor, which is constructed with a reference to the required point vector. Then the loop to extract the subset is replaced by transform.
subset.reserve(index.size());
vcl_transform(index.begin(),index.end(),
vcl_back_inserter(subset),
index_functor< vgl_point_2d< double > >(points));
//:Given a vector of objects, select an indexed reference to one
template< class T >
class index_functor
{
//This functor copies out element vec_[index]
private:
//:reference to vector used to store objects to be sub-indexed
const vcl_vector< T >& vec_;
public:
index_functor(const vcl_vector< T >& v): vec_(v) {}
const T& operator()(unsigned index) const
{return vec_[index];}
};
When faced with the choice of a hand-written loop for a simple operation, or for_each or transform with a functor, the hand-written loop almost always seems quicker to write to STL novices. However I find if I take a one-off hit with the functor, I almost always reuse it. I also make far fewer errors, like forgetting to increment the iterator, or going out of bounds etc. Furthermore the STL loops are almost always faster, and may use library writers' optimisations (like direct pointer arithmetic for vector iterators). They will avoid multiple calls to container member functions like size() and end() on loop bound checks. And if you are altering the container in the loop (e.g. erasing or inserting elements), then you are far less likely to get obscure memory errors by invalidating an iterator. Obviously complicated operations should be in a separate function anyway, then you usually may as well make it a functor (or if it's a class member invoke it via the mem_fun adapters - see below).
The big advantage to my mind is that the STL construct tends to focus the reader's attention on the high level operation being performed on the container as a whole, rather than the low level detail. It is usually obvious from the supplied range what is being looped over, in a which it is not given a loop like:
for(int i=0;i< n;i++)
This traditional loop announcement not only gives no clue about what is happening in
it; it often does not even give any clue as to what is being looped
over, or what the bound means. Such loops are also quite error prone
when there are multiple index variables around (usually called i,j and
the like, with loop bounds called n,m etc). It is very easy to be
confused by what i and n are, or use i when you meant j. The confusion is added to when for
efficiency the code within the loop never even addresses the original
container by name; but instead uses a pointer to the container's internal data block. At least when you
see:
for_each(v.begin(),v.end(),do_something_functor())
you know immediately that an operation is being appled to the entire range of
container v; and the do_something name should give you a very large
hint as to what! Moreover you do not need to set up pointers to
internal data blocks for efficiency, as unless your STL came from an
exceptionally incompetent source, the iterator arithmetic concealed in
for_each should be just as good.
The trickier choices are with loops that would have say just a few
lines of code in them, and it hardly seems worth bothering with a
little functor class. I usually base my judgement on the likelihood of
reusing the functor if I do write it; and whether the function
containing the loop is itself getting oversize and in need of some decomposition.
Note that STL algorithms can be used with ordinary functions (or class scope static methods), but a functor is often a better choice unless there is no associated data other than the object being looped over (and you don't want inline code).
Three other advantages of functors over functions are: 1) They facilitate inlining of code (see sort discussion below) 2) When templated, there are far less obscure compiler problems than with templated functions. 3) Being able to use adapters such as bind2nd (see later) to set constant arguments ,particularly on binary operations.
Sometimes a "main" class can have an associated namespace of "helper" functors, which provide encapsulated bits of intelligence. It may be appropriate to keep some of these private to the c file as logically they are really private internal classes. It may also be appropriate to privately nest functor classes within their main class client. However some older C++ compilers do not support nested classes properly (e.g I suspect MSVC 6). Nested classes seems more common as a Java approach, so perhaps the namespace of associated helper functors is preferable.
Note some common automatic code generators such as Ansi C++ with Rational Rose is liable to create code using nested classes for functors if your UML model defines the functor as an internal class.
The vcl_binary_function gobbledegook is not strictly needed in the above template distance functor example. However the STL provides a rich set of so-called adapters, which are themselves template functors. With logical predicate functors (next section) there are the usual logical operations. Another useful adapter is bind2nd, which can be used to adapt a binary function for use over a single input range, when one argument is a constant. So if we want the distance of a vector of points from some fixed point:
vgl_point_2d< double> fixedPoint(10.0,20.0);
vcl_transform(points.begin(),points.end(),
vcl_back_inserter(distances), //output distance from special point onto end
vcl_bind2nd(distance_functor< double >(), fixedPoint)); //distance from iterated point to fixedPoint
Or if we want to replace all weight values in a vector of weights with some constant value, if the weight is below a magic value (say 10.0):
//Replace any normal point weight (i.e. weight < 10.0) with a notional weight of INITIAL_WEIGHT
vcl_replace_if(pointWeights.begin(),pointWeights.end(),
vcl_bind2nd(vcl_less< double>(), 10.0),
INITIAL_WEIGHT );
A functor can only be used with a binder if it is "adaptable", which means it must be derived from vcl_unary_function or vcl_binary_function. The template arguments are the first argument to operator(), the second argument if a binary_function, and the result of operator(). The base structs unary_function and binary_function have no members, but do supply these typedefs to the binders. less is an example of one of the STL's built-in functors. Most simple arithmetic and logical operation have built in functors.
Another useful adapter is mem_fun (or mem_fun_ref). These allow object member functions to be called rather than functors. For example to call reset on all models in a vector of pointers to model instances (see first example in iterators):
vcl_vector< vaxm_mr_active_model_instance* > subModelPtrs;
....... and fill up the vector of model instances somehow, then
............
//Reset all
vcl_for_each(subModelPtrs.begin(),subModelPtrs.end(),
mem_fun(&vaxm_mr_active_model_instance::reset));
The syntax is slightly obscure - you pass a member pointer giving the address of the invoked member function. If you have a container of objects rather than object pointers use mem_fun_ref instead.
Warning - unless you got your STL from Boost, mixing bind2nd and mem_fun almost always causes obscure compiler errors due to the reference to reference problem.
Effective STL by Scott Meyers (Addison-Wesley)
Martin Roberts