template: titleslide
???
We've pretty much only talked about working on single things at once
Simulation/analysis usually require us to deal with many things: many data points, many atoms in your MD simulation, many images to recognise objects in etc.
So we have some intuition about what this means
In C++ a container:
-
Holds objects of a uniform type
-
Owns the objects that it contains
-
Provides access to its elements
The means of access and the performance of these operations depends on the container.
The standard library has 13 container template classes, but we'll only touch on a few.
-
vector
- a dynamically sized contiguous array -
array
- a statically sized contiguous array -
list
/forward_list
- a doubly/singly linked list -
(
unordered_
)set
/ (unordered_
)map
The size of the vector is determined at run time (and hence memory is allocated then)
The elements are contiguous in memory, so it is fast to jump to any element by index and to iterate though them.
#include <vector>
#include <iostream>
void ShowData() {
std::vector<int> data = GetData();
for (int x: data) {
std::cout << x << "\n";
}
}
![:thumb](Use by default - data locality often wins over algorithmic complexity)
???
Poorly named, but we're stuck: std::vector
is not a vector in either
the 3D spatial sense nor the vector space sense...
We put the type of the elements inside the angle brackets
Here we show a for loop that will iterate through every element of the vector
Why does contiguous memory => fast? Because main memory is slow, chips have caches which bring lines of memory into small high speed bits of memory on chip. They also notice when you are running through a chunk of memory and pre-fetch (or the compiler does this)
Supports:
-
copy and move (if element type is
noexcept
moveable) -
random element access by index
-
resize (and pre-reserving memory)
-
element insertion
-
element deletion
Note that it owns its elements: when it reaches the end of its lifetime, contained elements will also be destroyed.
Be aware that resizing operations may force reallocation and copying!
std::vector<unsigned> first_n_primes(unsigned n) {
std::vector<unsigned> ans;
unsigned maybe_prime = 2;
while (ans.size() < n) {
if (is_prime(maybe_prime)) {
ans.push_back(maybe_prime);
}
maybe_prime += 1;
}
return ans;
}
??? Default constructor creates an empty vector
push_back
adds a new value to the end of the vector
This might require a re-allocation of memory and copying the contents -
Discuss capacity.
-
Contiguous in memory but the size is fixed at compile time.
-
Almost like a vector, but you can't change the size.
-
Only difference is construction - list the values inside the braces
#include <array>
using GridPoint = std::array<int, 3>;
auto p1 = GridPoint{1,2,3};
std::cout << p1.size() << std::endl;
// Prints 3
-
Almost always implemented as a doubly (singly) linked list.
-
Elements are allocated one by one on the heap at run time.
-
Traversal requires following pointers from one end
-
Fast element insertion and deletion (if you don't have to look for the element!)
![:thumb](Use when you will be adding and removing from the ends a lot and the contained objects are expensive to copy/move.
Consider converting to vector
if you have a build/access pattern.)
-
These are associative containers implemented as sorted data structures for rapid search (typically red-black trees).
-
set
is just a set of keys,map
is a set of key/value pairs (types can differ). -
You must have defined a comparison function for the key type.
-
You may want to use the
unordered
versions which use a hash table (requires a hash function)
![:thumb](Use if you either
-
have a large key space that mostly lacks value, or
-
will be looking up unpredictable values a lot or frequently adding/removing values. )
In some structural simulation each parallel processs might own a part of the domain.
So after starting all the processes and reading the domain we might have code like this:
auto rank2comms = std::map<int, BoundaryComm>{};
for (auto p = 0; p < MPI_COMM_SIZE; ++p) {
if (ShareBoundaryWithRank(p)) {
rank2comms[p] = BoundaryComm(my_rank, p);
}
}
// later
for (auto [rank, bc]: rank2comm) {
bc->SendData(local_data);
}
???
Map takes two type parameters in the angle brackets: the key type and value type
What's with the square brackets? It's a structured binding similar to python's tuple unpacking
Each container has its traits
That define the places where they are great
Particularly vector
You don't need a lecture
Just use vector
Where choosing a container, remember vector is best
Leave a comment to explain if you choose from the rest
Credit - Tony van Eerd @ CppCon 2017
template: titleslide
Three types:
-
while
-
range-based
for
loop -
C-style
for
loop
while (boolean expression) {
// Code
}
???
The expression inside the parens can be anything that is convertable to bool
int
- zero => false, anything else => true
Consider computing some prime numbers
std::vector<unsigned> first_n_primes(unsigned n) {
std::vector<int> ans;
unsigned maybe_prime = 2;
while (ans.size() < n) {
if (is_prime(maybe_prime)) {
ans.push_back(maybe_prime);
}
maybe_prime += 1;
}
return ans;
}
???
condition can be a variable or a more complex condtion
#include <vector>
#include <iostream>
void ShowData(std::vector<int>& data) {
for (int x: data) {
std::cout << x << "\n";
}
}
???
Syntax
Read it as "for x in data"
Compare to python
Point out we are copying each element into a local variable x
#include <vector>
void DoubleInPlace(std::vector<int>& data) {
for (int& x: data) {
x *= 2;
}
}
???
We are making x
a reference now
Changes made through it will update the value inside the container
What can you iterate over here?
-
Any type with
begin()
andend()
member functions -
All the STL containers have this
-
Any type where you have overloads of the free functions
begin
andend
accepting your object
Familiar to C programmers
Fortran programmers will know this as a DO
loop
void DoubleInPlace(std::vector<int>& data) {
for (int i = 0; i < data.size(); ++i) {
data[i] *= 2;
}
}
???
This is the canonical form for a counting loop, but you can implement all sort of things
Remind people that C counts from zero (because it's how many elements to advance past the first one) up to but not including the end
Note pre-increment - do not post increment without good reason
Sometime you really need this - but should usually have a range for loop for the simple cases. "What you say when you say nothing"
template: titleslide
Let's think about a C-style for loop:
void DoubleInPlace(std::vector<int>& data) {
for (int i = 0; i < data.size(); ++i) {
data[i] *= 2;
}
}
???
In this function we are mixing things up
- The way in which we iterate through
data
- The operation that we perform on it
- The way in which we access an element
We are also requiring that data
supports random access (cos
operator[]
). Not all containers do
void DoubleInPlace(std::vector<int>& data) {
for (std::vector<int>::iterator it = data.begin();
it != data.end(); ++it) {
*it *= 2;
}
}
This is the "old fashioned C++" way
void DoubleInPlace(std::vector<int>& data) {
for (auto it = data.begin(); it != data.end(); ++it) {
*it *= 2;
}
}
This uses auto
to let the compiler deduce the type for you
???
Discuss pre-increment (optimiser is not perfect)
Discuss operator*
Discuss operator==
To define your own iterator, you need to create a type with several operator overloads (exactly which ones depends on the category of iterator you need). Basics are:
-
dereference operator (
*it
) - you have to be able to get a value (either to read or write) -
pre-increment (
++it
) - you have to be able to go to the next one ![:fn](Why not post-increment? Because this has to return the value ofit
from before it was incremented. This usually means a copy.) -
assigment - you need to be able to bind it to name
-
inequality comparison (
it != end
) - you need to know when you are done
template: titleslide
In your clone of this repository, find the containers
exercise and list
the files
$ cd APT-CPP/exercises/containers
$ ls
Makefile test.cpp vector_ex.cpp vector_ex.hpp
As before, test.cpp
holds some basic unit tests.
vector_ex.cpp
/.hpp
hold some functions that work on std::vector
- provide the implementations
The tests require that you implement, in a new header/implementation
pair of files, a function to add a string to a std::map
as the key,
the value being the length of the string.
You will want to find the documentatation for map
on https://en.cppreference.com/
You can compile with make
as before.