Economics Everywhere -- Huanren Zhang's Blog: January 2014

Saturday, January 18, 2014

Headers and Includes: Why and How

Organize your directories so that each class has its own header file (.hpp) with the class declaration and its own implementation file (.cpp) with the source code for the class methods.

Your main() function goes in its own .cpp file. Each .cpp file is compiled into an object file (.obj or .o, depending on the compiler), and the linker then combines the object files into a single program.

http://www.cplusplus.com/forum/articles/10627/

The #include statement is basically a copy/paste operation: before the file is compiled, the preprocessor replaces the #include line with the actual contents of the file you're including.

The difference between Header files and Source files? Basically, header files are #included and not compiled, whereas source files are compiled and not #included. Files with header extensions might be ignored by the compiler if you try to compile them.

Two basic rules help avoid include problems:
1) Only #include things you need to include.
2) Guard against incidental multiple includes with include guards.

An Include Guard is a technique which uses a unique identifier that you #define at the top of the file. Here's an example:

//x.h

#ifndef X_H_INCLUDED   // if x.h hasn't been included yet...
#define X_H_INCLUDED   //   #define this so the compiler knows it has been included

class X { };

#endif

This works by skipping over the entire header if it was already included. X_H_INCLUDED is #defined the first time x.h is included, and if x.h is included a second time, the compiler will skip over the header because the #ifndef check will fail. The guard name deliberately avoids double underscores and a leading underscore followed by a capital letter, because such names are reserved for the implementation in C++.

Always guard your headers. Always always always. It doesn't hurt anything to do it, and it will save you some headaches.

C++ Programming (from the book Thinking in C++)

Identifier names

The first letter of an identifier is only capitalized if that identifier is a class. If it is a function or variable, then the first letter is lowercase. The rest of the identifier consists of one or more words, run together but distinguished by capitalizing each word.

So a class looks like this:
class FrenchVanilla : public IceCream {

an object identifier looks like this:
FrenchVanilla myIceCreamCone(3);

and a function looks like this:
void eatIceCreamCone();
(for either a member function or a regular function).

The one exception is for compile-time constants (const or #define), in which all of the letters in the identifier are uppercase.

The value of the style is that capitalization has meaning – you can see from the first letter whether you’re talking about a class or an object/method. This is especially useful when static class members are accessed.

Order of header inclusion
Headers are included in order from “the most specific to the most general.” That is, any header files in the local directory are included first, then any of my own “tool” headers, such as require.h, then any third-party library headers, then the Standard C++ Library headers, and finally the C library headers.

If the order of header inclusion goes “from most specific to most general,” then it’s more likely that if your header doesn't parse by itself, you’ll find out about it sooner and prevent annoyances down the road.

Include guards on header files
Include guards are always used inside header files to prevent multiple inclusion of a header file during the compilation of a single .cpp file. The include guards are implemented using a preprocessor #define and checking to see that a name hasn’t already been defined. The name used for the guard is based on the name of the header file, with all letters of the file name uppercase and replacing the ‘.’ with an underscore. For example:
// IncludeGuard.h
#ifndef INCLUDEGUARD_H
#define INCLUDEGUARD_H
// Body of header file here...
#endif // INCLUDEGUARD_H

Use of namespaces
In header files, any “pollution” of the namespace in which the header is included must be scrupulously avoided. That is, if you change the namespace outside of a function or class, you will cause that change to occur for any file that includes your header, resulting in all kinds of problems. No using declarations of any kind are allowed outside of function definitions, and no global using directives are allowed in header files.

Accessor Method
As a general rule of design, you should keep the data members of a class private. To access private data in a class, you must create public functions known as accessor methods.

A public accessor method is a class member function used either to read the value of a private class member variable or to set its value. This practice enables you to separate the details of how the data is stored from how it is used. You can later change how the data is stored without having to rewrite any of the other functions in your programs that use the data.

const Member Functions
If you declare a class method const, you are promising that the method won't change the value of any of the members of the class. It is good programming practice to declare as many methods to be const as possible.

Class declarations and method definitions
Each function that you declare for your class must have a definition (the function implementation). The convention is to put the declaration into what is called a header file: most of the time, clients of your class don't care about the implementation specifics, and reading the header file tells them everything they need to know.

The declaration of a class is called its interface because it tells the user how to interact with the class. The function definition tells the compiler how the function works.

Pointers
The address-of operator (&) returns the address of an object in memory.
Pointers are used mainly for three tasks:
1. Managing data on the free store
2. Accessing class member data and functions
3. Passing variables by reference to functions

The free store (heap) is not cleaned until your program ends, and it is your responsibility to free any memory that you've reserved when you are done with it. The advantage to the free store is that the memory you reserve remains available until you explicitly state you are done with it by freeing it.

You allocate memory on the free store in C++ by using the new keyword. new is followed by the type of the object that you want to allocate. The return value from new is a memory address.
int *pPointer = new int;

Remember to delete a pointer after you are done using it. For every time in your program that you call new, there should be a call to delete.

When you call delete on a pointer to an object on the free store, that object's destructor is called before the memory is released.

Objects on the free store persist after the return of a function. The capacity to store objects on the free store enables you to decide at runtime how many objects you need, instead of having to declare this in advance.

The this Pointer
Each non-static class member function has a hidden parameter: the this pointer, which points to "this" individual object, that is, the object on which the function was called.

If you declare a pointer to a const object, the only methods that you can call with that pointer are const methods.

Protect objects passed by reference with const if they should not be changed. Set pointers to nullptr rather than leaving them uninitialized or dangling.

The C++ clients of classes and functions can rely on the header file to tell all that is needed: it acts as the interface to the class or function. The actual implementation is hidden from the client. This enables the programmer to focus on the problem at hand and to use the class or function without concern for how it works.

Passing by Reference
Passing by value is like giving a museum a photograph of your masterpiece instead of the real thing. Passing by reference is like sending your home address to the museum and inviting guests to come over and look at the real thing. Passing by reference avoids the cost of a copy, but it also exposes the original object to change. The solution is to pass a pointer to a constant object OR a reference to a constant object. Doing so prevents calling any non-const method on it, and thus protects the object from change.

Whenever possible, pass parameters by reference. Don't use pointers if references will work.
Whenever possible, use const to protect references and pointers.

It is safer to build your functions so that they delete the memory they allocate.

Arrays
The American National Standards Institute (ANSI) standard limits the scope of a variable declared in the header of a for loop to the for loop itself.

Think of the index of an array as the offset.

In C++, an array name converts, in most expressions, to a constant pointer to the first element of the array. In the declaration
Cat Family[500];
Family converts to &Family[0], which is the address of the first element of the array Family. For most practical purposes, you can use a pointer to the first element in the same way as the name of the array. An array declared this way is released automatically when it goes out of scope. Only an array that you allocated on the free store, as in Cat *Family = new Cat[500];, has memory that you must free yourself. Arrays and pointers are closely related in C++, but they are not the same type.

If Family was allocated with new Cat[500], deleting it with delete [] Family returns all the memory set aside for the array. Because of the square brackets, the compiler destroys each object in the array and returns its memory to the free store. If you leave the brackets off, the behavior is undefined; in practice, often only the first object in the array is destroyed.

When you create an item on the heap by using new, you always delete that item and free its memory with delete. Similarly, when you create an array by using new <class>[size], you delete that array and free all its memory with delete []. The brackets signal the compiler that this array is being deleted.

The biggest advantage of being able to allocate arrays on the heap is that you determine the size of the array at runtime and then allocate it.

Monday, January 6, 2014

Python Trivia

Any command you can run in the terminal window can be run inside IPython when you start the command with !

Shallow copy means copying references and deep copy implies copying the complete contents of an object (roughly speaking). The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
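
The difference can be seen with the standard copy module. This is a minimal sketch; the variable names are made up for illustration.

```python
import copy

original = [[1, 2], [3, 4]]          # a compound object: a list of lists

shallow = copy.copy(original)        # new outer list, same inner lists
deep = copy.deepcopy(original)       # new outer list, new inner lists

original[0].append(99)               # mutate an inner list in place

print(shallow[0])                    # [1, 2, 99]: shares the inner list
print(deep[0])                       # [1, 2]: has its own copy
print(shallow is original)           # False: the outer list itself is new
```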

A trailing comma turns an expression into a one-tuple, and "()" represents a zero-tuple. Apart from the empty tuple, it is the commas, not the parentheses, that define the tuple.
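
A short sketch of the tuple rule, with illustrative variable names:

```python
empty = ()            # zero-tuple: the one case where parentheses alone suffice
single = 1,           # a trailing comma makes a one-tuple
not_a_tuple = (1)     # parentheses only group; this is the int 1
pair = 1, 2           # commas alone build a tuple

print(type(empty).__name__)        # tuple
print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
print(pair)                        # (1, 2)
```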

Like Java, Python uses automatic garbage collection. In CPython, an object is released as soon as its reference count drops to zero.

Lists and References:

Creating Lists

Python creates a single new list every time you execute the [] expression. No more, no less. And Python never creates a new list if you assign a list to a variable.

A = B = [] # both names will point to the same list

A = []
B = A # both names will point to the same list

A = []; B = [] # independent lists
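
The three cases above can be checked with the is operator, which tests whether two names refer to the same object. A small sketch, using lowercase names that are not in the original:

```python
a = b = []        # one [] expression, so one list with two names
a.append(1)
print(b)          # [1]: the change is visible through both names
print(a is b)     # True

c = []
d = []            # two [] expressions, so two independent lists
c.append(1)
print(d)          # []
print(c is d)     # False
```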

Note that the for-in statement maintains an internal index, which is incremented for each loop iteration. This means that if you modify the list you’re looping over, the indexes will get out of sync, and you may end up skipping over items, or process the same item multiple times. To work around this, you can loop over a copy of the list:

    for item in L[:]:
        if not condition:
            L.remove(item)

Alternatively, you can create a new list, and append to it:
    out = []
    for item in L:
        if condition:
            out.append(item)
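
To make the pitfall concrete, here is a small runnable sketch. The is_even function and the list contents are hypothetical stand-ins for the condition and L above.

```python
def is_even(n):
    # Hypothetical condition, used only for this illustration.
    return n % 2 == 0

# Pitfall: removing items from the list being looped over skips elements.
buggy = [2, 4, 6, 7]
for n in buggy:
    if is_even(n):
        buggy.remove(n)
print(buggy)      # [4, 7]: 4 was skipped because the items shifted left

# Fix 1: loop over a copy and remove from the original.
fixed = [2, 4, 6, 7]
for n in fixed[:]:
    if is_even(n):
        fixed.remove(n)
print(fixed)      # [7]

# Fix 2: build a new list holding only the items to keep.
source = [2, 4, 6, 7]
kept = [n for n in source if not is_even(n)]
print(kept)       # [7]
```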

>>> l1=list(range(3))    # list() is needed in Python 3, where range() is not a list
>>> l2=list(range(20,23))
>>> l3=list(range(30,33))
>>> l1[len(l1):]=[l2]    # equivalent to 'append' for subscriptable sequences
>>> l1[len(l1):]=l3      # same as 'extend'
>>> l1
[0, 1, 2, [20, 21, 22], 30, 31, 32]

By wrapping l2 in a list, as in l1[len(l1):]=[l2], or by calling l1.append(l2), you store a reference to l2 itself in l1. If you change l2 in place, the change shows up through l1 as well. The appended sequence occupies a single element of l1: the reference to l2.

Without the wrapping list, as in l1[len(l1):]=l3, you make a shallow copy of the elements of l3 into l1.

So here, changing an element of l2 changes the corresponding element of the nested list in l1, and vice versa. Changing an element of l3 does not change the corresponding element of l1, and changing that element of l1 does not change l3, because "extend" works using a shallow copy and the elements of l3 are immutable integers: assigning to a position rebinds that position only.

>>> l1=list(range(3))
>>> l2=list(range(20,23))
>>> l3=[l2]
>>> l1[len(l1):]=l3      # same as 'extend'
>>> l1
[0, 1, 2, [20, 21, 22]]

In this case, however, changing an element of the inner list through l1, l2, or l3 changes what all three show. Even a shallow copy stores references, and here the copied element is a mutable list shared by all of them.
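
Both cases can be reproduced with append and extend directly. This is a sketch with made-up values; l4 and nested are names introduced only for the illustration.

```python
l1 = [0, 1, 2]
l2 = [20, 21, 22]
l3 = [30, 31, 32]

l1.append(l2)        # stores a reference to l2 itself
l1.extend(l3)        # copies the references held by l3 (ints here)

l2[0] = -1           # visible through l1, which holds l2 itself
l3[0] = -1           # rebinding a position in l3 does not touch l1

print(l1)            # [0, 1, 2, [-1, 21, 22], 30, 31, 32]
print(l1[3] is l2)   # True

# Extending with a list whose element is mutable shares that element.
nested = [l2]
l4 = []
l4.extend(nested)
l2.append(23)
print(l4)            # [[-1, 21, 22, 23]]
```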

The list object consists of two internal parts; one object header, and one separately allocated array of object references. The latter is reallocated as necessary.

The list has the following performance characteristics:

The list object stores pointers to objects, not the actual objects themselves. The size of a list in memory depends on the number of objects in the list, not the size of the objects.
The time needed to get or set an individual item is constant, no matter what the size of the list is (also known as “O(1)” behaviour).
The time needed to append an item to the list is “amortized constant”; whenever the list needs to allocate more memory, it allocates room for a few items more than it actually needs, to avoid having to reallocate on each call (this assumes that the memory allocator is fast; for huge lists, the allocation overhead may push the behaviour towards O(n*n)).
The time needed to insert an item depends on the size of the list, or more exactly, on how many items are to the right of the inserted item (O(n)). In other words, inserting items at the end is fast, but inserting items at the beginning can be relatively slow, if the list is large.
The time needed to remove an item is about the same as the time needed to insert an item at the same location; removing items at the end is fast, removing items at the beginning is slow.
The time needed to reverse a list is proportional to the list size (O(n)).
The time needed to sort a list varies; the worst case is O(n log n), but typical cases are often a lot better than that.
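
Two of these points can be observed with sys.getsizeof. This is a rough sketch that assumes CPython; exact byte counts vary by version and platform, so none are shown.

```python
import sys

# The list stores references, so its own size ignores how big the items are.
small_items = ["a"]
large_items = ["a" * 10000]
same_size = sys.getsizeof(small_items) == sys.getsizeof(large_items)
print(same_size)

# Appending over-allocates: the list's size changes only on some appends.
growing = []
sizes = []
for i in range(100):
    growing.append(i)
    sizes.append(sys.getsizeof(growing))
print(len(set(sizes)), "distinct sizes over", len(sizes), "appends")
```

The second print shows far fewer distinct sizes than appends, which is the "room for a few items more" behaviour described above.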