cpp - Ye Zheng's Blog

1. What is the size of an empty class, and why is it 1? 2. Hand-write pseudocode for the producer-consumer pattern.

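In an array, exactly one number appears once and every other number appears twice; find that number (bitwise XOR). If two numbers appear once and every other number appears twice, find those two numbers (bitwise XOR combined with AND).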
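A minimal sketch of both XOR tricks, to make the hints above concrete. The function names `find_single` and `find_two_singles` are made up for illustration, and the second function assumes the input really does contain exactly two distinct singletons.

```cpp
#include <utility>
#include <vector>

// XOR of all elements: equal pairs cancel (x ^ x == 0), leaving the unique value.
int find_single(const std::vector<int>& nums) {
    int acc = 0;
    for (int n : nums) acc ^= n;
    return acc;
}

// Two singletons a and b: XOR of everything is a ^ b. Any set bit of a ^ b is a
// bit where a and b differ, so AND with that bit splits the input into two
// groups, each holding exactly one singleton plus complete pairs.
std::pair<int, int> find_two_singles(const std::vector<int>& nums) {
    unsigned mixed = 0;
    for (int n : nums) mixed ^= static_cast<unsigned>(n);
    unsigned low_bit = mixed & (~mixed + 1u);  // lowest bit where a and b differ
    int with_bit = 0, without_bit = 0;
    for (int n : nums) {
        if (static_cast<unsigned>(n) & low_bit) with_bit ^= n;
        else without_bit ^= n;
    }
    return {with_bit, without_bit};
}
```

Both run in O(n) time with O(1) extra space, which is usually the point of the question.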

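1. Overloading versus overriding 2. Differences between pointers and references 3. Differences between #define and const 4. Struct byte alignment 5. The C++ memory allocation model

Explain what #ifdef and #endif do in C

Pointer to an array versus array of pointers: concepts and differences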

double a = 1/3; a = ?
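A small sketch of what that last question is getting at: both operands of `1/3` are `int`, so the division is done in integer arithmetic before the result is converted to `double`. The two function names are hypothetical.

```cpp
// 1 / 3 is integer division and yields 0; only then is 0 converted to 0.0.
double integer_division() {
    double a = 1 / 3;
    return a;
}

// Making either operand a double forces floating-point division.
double floating_division() {
    double a = 1.0 / 3;
    return a;
}
```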
#define

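It is a macro, handled by the preprocessor. Preprocessing is a pass that runs before compilation and performs simple textual substitution. It does not follow the grammar of the language; it only follows its own rules.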
const
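- const is short for "constant". Applied to a variable, it says the variable cannot be modified;
- applied to a pointer's target type, it says the object pointed to cannot be modified through that pointer;
- applied to a member function, it says the function does not change the object.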
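To illustrate the difference between textual substitution and a typed constant, here is a small sketch. The macro `SQUARE_MACRO` and the names `kLimit`, `square`, `macro_result` and `function_result` are invented for the example.

```cpp
// Pure textual substitution: no parentheses, no types, no scope.
#define SQUARE_MACRO(x) x * x

// Typed, scoped alternatives that the compiler checks.
constexpr int kLimit = 10;
constexpr int square(int x) { return x * x; }

// Expands to 1 + 2 * 1 + 2, which is 5, not 9.
int macro_result() { return SQUARE_MACRO(1 + 2); }

// The argument is evaluated first, then squared: 9.
int function_result() { return square(1 + 2); }

static_assert(square(kLimit) == 100, "constexpr values are usable at compile time");
```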
What static means in different contexts

static has three completely different meanings:
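- On a global variable, it means every translation unit gets its own separate instance:

      // foo.h
      static int a = 123;

      // foo.cpp
      #include "foo.h"
      int foo_func() { return a++; }

      // bar.cpp
      #include "foo.h"
      int bar_func() { return a++; }

  If you compile foo.cpp and bar.cpp separately and then link them together, there are two copies of the global a, and the two functions operate on different instances of a.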
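- On a local variable inside a function, it means the variable has the lifetime of a global but is visible only inside the function:

      int get_global_id() {
          static int seed = 0;
          return seed++;
      }

  Each call to this function returns a different int. The = 0 runs only on the first call; it is an initialization, not an assignment.

  The fact that a function-local static is initialized on the first call matters a great deal in C++, and together with destruction order it forms a long-standing pitfall.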
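- On a class member, it means the data member or member function belongs to the class, not to any object instance.

      struct Foo {
          int a = 0;
          static int aaa;                       // declaration only
          static int bbb() { return 123456; }
      };
      int Foo::aaa = 0;                         // definition, in exactly one .cpp file

  Each Foo instance contains only a single int a. The bbb member function is called as Foo::bbb().

  The static in front of a static data member makes it a declaration, not a definition. A static data member normally has to be initialized outside the class definition, that is, in the .cpp file rather than the .h file. In-class initialization is allowed only when the member is const of integral or enumeration type, is constexpr, or (since C++17) is declared inline.

  Also, defining class objects in global/static storage is usually dangerous: the order in which such variables are initialized across translation units is unspecified, which easily causes bugs that are hard to track down. The Google C++ Style Guide forbids objects with static storage duration unless they are trivially destructible. One common approach is to define them as pointers and initialize them with C++11's std::call_once. Arguably call_once is not good either, and one should insist on constructing and destroying all global pointer variables in order inside main.

  (C needs static inline functions in headers in order to inline them; in C++ use just inline, without static. static functions are best placed at the top of the implementation file.)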
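The second and third meanings can be exercised in one small file. This is only a sketch: the `Counter` class and `live_counters_inside_scope` are invented for illustration, while `get_global_id` is the function from the text.

```cpp
// Function-local static: initialized once, on the first call, then kept alive.
int get_global_id() {
    static int seed = 0;
    return seed++;
}

struct Counter {
    int value = 0;            // one copy per object
    static int instances;     // declaration: one copy for the whole class
    Counter() { ++instances; }
    ~Counter() { --instances; }
    static int count() { return instances; }   // callable without an object
};
int Counter::instances = 0;   // the single definition (normally in one .cpp file)

// Number of live Counter objects while two locals exist.
int live_counters_inside_scope() {
    Counter first, second;
    return Counter::count();
}
```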
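Differences between references and pointers

Both a pointer and a reference give you access to an object.

Differences:
- A pointer generally holds the address of a block of memory, through which that memory can be reached; a reference is an alias for an existing variable.
- A pointer is an object and occupies memory; a reference is not an object and does not necessarily occupy memory. (Consider how the compiler can optimize away int a; int& r = a;, whereas a reference used as a function parameter generally does need storage.)
- A reference must be initialized; a pointer need not be.
- A reference cannot be rebound after initialization; a pointer can be changed to point at a different object.
- There is no such thing as a null reference, but there are null pointers.
- A reference is used without dereferencing; a pointer must be dereferenced.
- sizeof on a reference gives the size of the referred-to object, while sizeof on a pointer gives the size of the pointer itself. (Arrays and pointers to arrays complicate this.)
- Increment and decrement mean different things: on a pointer they move the pointer, on a reference they change the referred-to object.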
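A few of the points in the list above can be checked directly. The sketch below uses a made-up struct `Big` and made-up function names; it is only meant to show rebinding, sizeof, and increment behaving differently.

```cpp
#include <cstddef>

struct Big { char bytes[64]; };

// sizeof applied to a reference reports the size of the referred-to type.
std::size_t size_via_reference() {
    Big big{};
    Big& r = big;
    return sizeof(r);
}

// sizeof applied to a pointer reports the size of the pointer itself.
std::size_t size_via_pointer() {
    Big big{};
    Big* p = &big;
    return sizeof(p);
}

int rebind_demo() {
    int a = 1, b = 2;
    int* p = &a;   // a pointer may also start out null or uninitialized
    p = &b;        // reseating: p now points at b
    int& r = a;    // a reference must be initialized and stays bound to a
    r = b;         // assigns b's value to a; it does not rebind r
    ++r;           // increments a (now 3), not an address
    return a * 10 + *p;   // a == 3, *p == 2
}
```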

What is the difference between vector and list?

std::vector	std::list
Contiguous memory.	Non-contiguous memory.
Pre-allocates space for future elements, so extra space required beyond what’s necessary for the elements themselves.	No pre-allocated memory. The memory overhead for the list itself is constant.
Each element only requires the space for the element type itself (no extra pointers).	Each element requires extra space for the node which holds the element, including pointers to the next and previous elements in the list.
Can re-allocate memory for the entire vector any time that you add an element.	Never has to re-allocate memory for the whole list just because you add an element.
Insertions at the end are constant, amortized time, but insertions elsewhere are a costly O(n).	Insertions and erasures are cheap no matter where in the list they occur.
Erasures at the end of the vector are constant time, but for the rest it’s O(n).	It’s cheap to combine lists with splicing.
You can randomly access its elements.	You cannot randomly access elements, so getting at a particular element in the list can be expensive.
Iterators are invalidated if you add or remove elements to or from the vector.	Iterators remain valid even when you add or remove elements from the list.
You can easily get at the underlying array if you need an array of the elements.	If you need an array of the elements, you’ll have to create a new one and add them all to it, since there is no underlying array.

In general, use vector when you don’t care what type of sequential container that you’re using, but if you’re doing many insertions or erasures to and from anywhere in the container other than the end, you’re going to want to use list. Or if you need random access, then you’re going to want vector, not list. Other than that, there are naturally instances where you’re going to need one or the other based on your application, but in general, those are good guidelines.
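A short sketch of three rows from the comparison: list iterators surviving an erase, vector elements being tracked by index across reallocation, and list splicing. The function names are hypothetical.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Erasing or inserting elsewhere in a list leaves other iterators valid.
int list_iterator_survives() {
    std::list<int> l{1, 2, 3};
    auto last = std::prev(l.end());   // refers to the element 3
    l.erase(l.begin());               // remove 1
    l.push_front(0);                  // add a new front element
    return *last;                     // still refers to 3
}

// push_back may reallocate a vector, so keep an index, not an iterator.
int vector_index_survives() {
    std::vector<int> v{1, 2, 3};
    std::size_t idx = 2;
    for (int i = 0; i < 100; ++i) v.push_back(i);
    return v[idx];                    // random access by index
}

// splice relinks nodes from one list into another without copying elements.
std::size_t splice_demo() {
    std::list<int> a{1, 2}, b{3, 4};
    a.splice(a.end(), b);
    return a.size() * 10 + b.size();  // a has 4 elements, b has 0
}
```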

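Differences between new and malloc

malloc and free are standard library functions in C and C++; new and delete are C++ operators. For objects of non-built-in types, malloc/free alone cannot meet the needs of dynamically created objects: an object must have its constructor run automatically when it is created and its destructor run automatically before it dies. Because malloc/free are library functions rather than operators, they are outside the compiler's control, so the job of running constructors and destructors cannot be imposed on them.

One difference: when new fails it calls a handler (the new-handler) to try to recover; by default that handler is nullptr.

Another: when new fails it throws std::bad_alloc (unless nothrow is specified).

Another: C++ does not allow objects of size 0, so a zero-size request still yields a distinct non-null pointer (implementations typically treat it as a request for 1 byte).

malloc: a C function. It simply allocates a block of the given size and returns a void*; on failure it returns a null pointer.

new: a C++ operator that can be understood as three steps: 1. allocate raw memory (conceptually a malloc); 2. if needed, initialize the object in that memory (by assignment or by calling the constructor); 3. return a pointer of the specific type.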

new / delete

Allocate/release memory
1. Memory allocated from ‘Free Store’
2. Returns a fully typed pointer.
3. new (standard version) never returns a NULL (will throw on failure)
4. Are called with Type-ID (compiler calculates the size)
5. Has a version explicitly to handle arrays.
6. Reallocating (to get more space) not handled intuitively (because of copy constructor).
7. Whether they call malloc/free is implementation defined.
8. Can add a new memory allocator to deal with low memory (set_new_handler)
9. operator new/delete can be overridden legally
10. constructor/destructor used to initialize/destroy the object

malloc/free

Allocates/release memory
1. Memory allocated from ‘Heap’
2. Returns a void*
3. Returns NULL on failure
4. Must specify the size required in bytes.
5. Allocating array requires manual calculation of space.
6. Reallocating larger chunk of memory simple (No copy constructor to worry about)
7. They will NOT call new / delete.
8. No way to splice user code into the allocation sequence to help with low memory.
9. malloc/free can NOT be overridden legally

Table comparison of the features:

Feature	`new` / `delete`	`malloc` / `free`
Memory allocated from	‘Free Store’	‘Heap’
Returns	Fully typed pointer	`void*`
On failure	Throws (never returns `NULL`)	Returns `NULL`
Required size	Calculated by compiler	Must be specified in bytes
Handling arrays	Has an explicit version	Requires manual calculations
Reallocating	Not handled intuitively	Simple (no copy constructor)
Calls the other family	Implementation defined	No
Low memory cases	Can add a new memory allocator	Not handled by user code
Overridable	Yes	No
Use of constructor / destructor	Yes	No

Technically, memory allocated by new comes from the ‘Free Store’ while memory allocated by malloc comes from the ‘Heap’. Whether these two areas are the same is an implementation detail, which is another reason that malloc and new cannot be mixed.
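To make the constructor/destructor and failure-handling rows concrete, here is a small sketch. `Widget` and the three function names are invented for illustration; the malloc version shows the extra manual steps (placement new and an explicit destructor call) needed to get the same effect.

```cpp
#include <cstdlib>
#include <new>
#include <string>
#include <utility>

struct Widget {
    std::string name;
    explicit Widget(std::string n) : name(std::move(n)) {}
};

// new allocates and constructs; delete destroys and releases.
std::string with_new() {
    Widget* w = new Widget("from new");
    std::string s = w->name;
    delete w;
    return s;
}

// malloc only returns raw bytes; construction and destruction are manual.
std::string with_malloc() {
    void* raw = std::malloc(sizeof(Widget));
    if (raw == nullptr) return "allocation failed";
    Widget* w = new (raw) Widget("from malloc");   // placement new runs the constructor
    std::string s = w->name;
    w->~Widget();                                  // explicit destructor call
    std::free(raw);
    return s;
}

// The nothrow form returns nullptr on failure instead of throwing std::bad_alloc.
bool nothrow_new_works() {
    int* p = new (std::nothrow) int(7);
    bool ok = (p != nullptr) && (*p == 7);
    delete p;
    return ok;
}
```

Note that memory from malloc is released with free and memory from new with delete; the two families are never crossed.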

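How do you tell a const pointer from a pointer to const?

int x = 0;
const int p1 = 0;            // p1 is a const int
const int* p2 = &x;          // p2 is a pointer to const int
int const* p3 = &x;          // same type as p2
int* const p4 = &x;          // p4 is a const pointer to int
const int* const p5 = &x;    // p5 is a const pointer to const int
int const* const p6 = &x;    // same type as p5

The key is whether const sits to the left or to the right of the *. Read the declaration from right to left, and read each * as "pointer to".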

2.4.3.Top-Level const

As we’ve seen, a pointer is an object that can point to a different object. As a result, we can talk independently about whether a pointer is const and whether the objects to which it can point are const. We use the term top-level const to indicate that the pointer itself is a const. When a pointer can point to a const object, we refer to that const as a low-level const.

A const used in declaring a reference is always a low-level const.
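A sketch of how the two kinds of const behave when an object is copied, assuming the definitions quoted above. The names `g_value`, `low_level`, `top_level` and `copy_demo` are hypothetical.

```cpp
#include <type_traits>

int g_value = 0;
const int* low_level = &g_value;   // low-level const: the pointee is treated as const
int* const top_level = &g_value;   // top-level const: the pointer itself is const

// std::is_const looks only at the top level of a type.
static_assert(!std::is_const<decltype(low_level)>::value, "pointer itself is not const");
static_assert(std::is_const<decltype(top_level)>::value, "pointer itself is const");

// Copying ignores top-level const but must preserve low-level const.
int copy_demo() {
    int* p = top_level;           // fine: top-level const is dropped on copy
    const int* q = low_level;     // fine: low-level const is kept
    // int* bad = low_level;      // error: would discard low-level const
    *p = 5;                       // writes g_value through the non-const path
    return *q;                    // reads g_value through the const path
}
```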
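What are the basics of OOP? What is polymorphism?

Polymorphism can be divided into variable polymorphism and function polymorphism. Variable polymorphism means that a variable of a base type (in C++, a reference or pointer) can be bound to an object of the base type or to an object of a derived type. Function polymorphism means that the same function-call interface (function name and argument list), applied to an object variable, can behave differently depending on the type of the object the variable refers to. Variable polymorphism is therefore the basis of function polymorphism.

Polymorphism can also be divided into:
- Dynamic polymorphism: takes effect at run time through class inheritance and virtual functions. It handles heterogeneous collections of objects elegantly, as long as their common base class defines the virtual function interface. It is also called subtype polymorphism or inclusion polymorphism. In object-oriented programming this is what is simply called polymorphism.
- Static polymorphism: templates also allow different specific behaviors to be associated with a single generic notation. Because this association is resolved at compile time rather than at run time, it is called "static". It can be used to implement type-safe, efficient operations on homogeneous collections of objects. The C++ STL, which does not rely on dynamic polymorphism, is an example. Function overloading, operator overloading, and macros with parameters are also forms of static polymorphism.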
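The two flavors can be put side by side in a short sketch. All type and function names here (`Shape`, `Circle`, `Square`, `Triangle`, `describe_all`, `describe`) are invented for the example.

```cpp
#include <memory>
#include <string>
#include <vector>

// Dynamic polymorphism: a common base class with a virtual interface.
struct Shape {
    virtual ~Shape() = default;
    virtual std::string name() const = 0;
};
struct Circle : Shape { std::string name() const override { return "circle"; } };
struct Square : Shape { std::string name() const override { return "square"; } };

// A heterogeneous collection; the call is resolved at run time.
std::string describe_all() {
    std::vector<std::unique_ptr<Shape>> shapes;
    shapes.push_back(std::make_unique<Circle>());
    shapes.push_back(std::make_unique<Square>());
    std::string out;
    for (const auto& s : shapes) out += s->name() + " ";
    return out;
}

// Static polymorphism: no common base; the call is resolved at compile time.
struct Triangle { std::string name() const { return "triangle"; } };

template <typename T>
std::string describe(const T& t) { return t.name(); }
```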
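How are virtual functions implemented? What is a pure virtual function?
Why use a virtual destructor, and why is there no virtual constructor?

(1) From the storage perspective

The pointer to the vtable (the vptr) is stored in the object's memory. If a constructor were virtual, it would have to be called through the vtable, but the object has not been instantiated yet, so its memory does not exist; how could the vtable be found? Therefore a constructor cannot be virtual.

(2) From the usage perspective

Virtual functions exist so that, when type information is incomplete, the overriding function still gets called. A constructor's whole job is to initialize an instance whose exact type is already known, so making it virtual would serve no purpose. Therefore a constructor does not need to be virtual.

(3) From the role of virtual functions

A virtual function, when called through a pointer or reference to the parent class, resolves to the subclass's member function. A constructor is called automatically when an object is created and cannot be called through a parent-class pointer or reference, so the language specifies that a constructor cannot be virtual.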
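The other half of the question, why a destructor should be virtual, is easiest to see by deleting a derived object through a base pointer. The sketch below uses invented names (`trace`, `Resource`, `FileResource`, `destroy_through_base`) and records which destructors run.

```cpp
#include <string>

// Shared log of destructor calls (hypothetical helper for the demo).
std::string& trace() {
    static std::string log;
    return log;
}

struct Resource {
    virtual ~Resource() { trace() += "~Resource "; }   // virtual: derived part is destroyed too
};

struct FileResource : Resource {
    ~FileResource() override { trace() += "~FileResource "; }
};

std::string destroy_through_base() {
    trace().clear();
    Resource* p = new FileResource;
    delete p;   // runs ~FileResource first, then ~Resource
    return trace();
}
```

If `~Resource` were not virtual, `delete p` would be undefined behavior; in practice the derived destructor is typically skipped and whatever it was meant to release leaks.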
How to define clone to implement a "virtual constructor"
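One common way to answer this is the clone idiom: a virtual member function that copy-constructs the dynamic type. The sketch below is one possible shape, with invented names (`Animal`, `Dog`, `clone_sound`); returning `std::unique_ptr` is a design choice here, and covariant raw-pointer return types are the older alternative.

```cpp
#include <memory>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    // The "virtual constructor": each derived class copies itself.
    virtual std::unique_ptr<Animal> clone() const = 0;
    virtual std::string sound() const = 0;
};

struct Dog : Animal {
    std::unique_ptr<Animal> clone() const override {
        return std::make_unique<Dog>(*this);   // copy-construct the real type
    }
    std::string sound() const override { return "woof"; }
};

// Works without knowing the dynamic type of the argument.
std::string clone_sound(const Animal& original) {
    std::unique_ptr<Animal> copy = original.clone();
    return copy->sound();
}
```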
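What the inline keyword does, and its pros and cons

The benefit of an inline function is that it removes the cost of pushing and popping the stack frame and of the jump, which improves performance. However, the CPU has an I-cache and a D-cache. If a function's code sits entirely in the I-cache, performance is naturally high; but if inlining makes the code too large to fit in the cache, the cache has to be refilled over and over during execution, which lowers performance as well.

Advantages:

1) For a function defined inline, the compiler keeps the body available and substitutes it at the point of use (expanded much like a macro), which is efficient.

2) An inline function of a class is still a function. When the compiler handles a call to an inline function, it first checks the arguments to make sure the call is correct, just as for a real function, which removes the hazards and limitations of macros.

3) An inline function can be a member function of a class, and can therefore use the class's protected and private members.

Disadvantages:

Inlining trades code duplication for the saved function-call overhead.

1) If the function body is long, inlining consumes too much memory.

2) If the function body contains a loop, executing the body takes far longer than the call overhead, so inlining gains little.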

Advantages
- By inlining your code where it is needed, your program will spend less time in the function call and return parts. It is supposed to make your code go faster, even as it goes larger (see below). Inlining trivial accessors could be an example of effective inlining.
- By marking it as inline, you can put a function definition in a header file (i.e. it can be included in multiple compilation unit, without the linker complaining)
Disadvantages
- It can make your code larger (i.e. if you use inline for non-trivial functions). As such, it could provoke paging and defeat optimizations from the compiler.
- It slightly breaks your encapsulation because it exposes the internal of your object processing (but then, every “private” member would, too). This means you must not use inlining in a PImpl pattern.
- It slightly breaks your encapsulation 2: C++ inlining is resolved at compile time. Which means that should you change the code of the inlined function, you would need to recompile all the code using it to be sure it will be updated (for the same reason, I avoid default values for function parameters)
- When used in a header, it makes your header file larger, and thus will dilute interesting information (like the list of a class's methods) with code the user doesn't care about (this is the reason that I declare inlined functions inside a class, but define them in a header after the class body, and never inside the class body).
Inlining Magic
- The compiler may or may not inline the functions you marked as inline; it may also decide to inline functions not marked as inline at compilation or linking time.
- Inline works like a copy/paste controlled by the compiler, which is quite different from a pre-processor macro: The macro will be forcibly inlined, will pollute all the namespaces and code, won’t be easily debuggable, and will be done even if the compiler would have ruled it as inefficient.
- Every method of a class defined inside the body of the class itself is considered as "inlined" (even if the compiler can still decide to not inline it).
- Virtual methods are not supposed to be inlinable. Still, sometimes, when the compiler can know for sure the type of the object (i.e. the object was declared and constructed inside the same function body), even a virtual function will be inlined because the compiler knows exactly the type of the object.
- Template methods/functions are not always inlined (their presence in a header will not make them automatically inline).
- The next step after "inline" is template metaprogramming. I.e. by "inlining" your code at compile time, sometimes the compiler can deduce the final result of a function. So a complex algorithm can sometimes be reduced to a kind of return 42; statement. This is for me extreme inlining. It happens rarely in real life, it makes compilation time longer, will not bloat your code, and will make your code faster. But like the grail, don't try to apply it everywhere because most processing cannot be resolved this way. Still, this is cool anyway.
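The contrast with macros drawn above can be shown by counting how often an argument expression is evaluated. This is a sketch with invented names (`call_count`, `next_value`, `twice`, `TWICE_MACRO`); the inline function is written as it might appear in a header.

```cpp
// Counts how many times next_value() has been called (demo helper).
int& call_count() {
    static int n = 0;
    return n;
}
int next_value() { return ++call_count(); }

// An inline function: a real function with typed parameters, so its
// argument is evaluated exactly once before the body runs.
inline int twice(int x) { return x + x; }

// A macro: the argument text is pasted in wherever x appears.
#define TWICE_MACRO(x) ((x) + (x))

int evaluations_with_function() {
    call_count() = 0;
    int r = twice(next_value());
    (void)r;
    return call_count();   // next_value() ran once
}

int evaluations_with_macro() {
    call_count() = 0;
    int r = TWICE_MACRO(next_value());
    (void)r;
    return call_count();   // next_value() ran twice
}
```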

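Virtual function tables (inheritance and polymorphism), and the drawbacks of virtual functions

First, the principle that "the memory layouts of a subclass and its parent class are consistent". Take class Base and class Derived: public Base as an example: the layout of a Derived object should place the member variables inherited from Base exactly as they are laid out in Base; otherwise an adjustment is needed (see "virtual inheritance" below). The reason is as follows: given Base *p = new Derived(), a call to p->foo() (where foo is a member function of Base, not a virtual function) calls Base's foo, which therefore operates on the memory p points to according to Base's memory layout.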

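#include <iostream>
using namespace std;

class A{
public:
    int a;
    virtual void foo(){ cout << "A::foo()" << endl; }   // virtual: dispatched through the vtable
    void bar(){ cout << "A::bar()" << endl; }           // non-virtual: bound at compile time
};
class B: public A{
public:
    int b;
    void foo(){ cout << "B::foo()" << endl; }           // overrides A::foo
};
class C: public B{
public:
    int c;
    void foo(){ cout << "C::foo()" << endl; }           // overrides B::foo
};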

(gdb) p /x a
$1 = {_vptr.A = 0x8048990, a = 0x8048879}
(gdb) x/2a &a
0xbffff2d4:  0x8048990 <_ZTV1A+8>  0x8048879 <__libc_csu_init+9>
(gdb) p /x b
$2 = {<A> = {_vptr.A = 0x8048980, a = 0xb7d6e225}, b = 0xb7fed280}
(gdb) x/3a &b
0xbffff2c8:  0x8048980 <_ZTV1B+8>  0xb7d6e225 <__cxa_atexit+53>  0xb7fed280
(gdb) p /x c
$3 = {<B> = {<A> = {_vptr.A = 0x8048970, a = 0x80488c2}, b = 0x1}, 
  c = 0xbffff384}
(gdb) x/4a &c
0xbffff2b8:  0x8048970 <_ZTV1C+8>  0x80488c2 <__libc_csu_init+82>    0x1 0xbffff384

The memory layouts are, respectively:

a:   vptr.A | a
b:   vptr.A | a | b
c:   vptr.A | a | b | c

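vptr.A is a pointer to the virtual function table; it occupies only one pointer's worth of space.

Every class has its own virtual function table (provided the class has virtual functions), so the tables referenced by objects a, b, and c are all different, as shown below: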

(gdb) x/a 0x8048990
0x8048990 <_ZTV1A+8>: 0x804877a <A::foo()>
(gdb) x/a 0x8048980
0x8048980 <_ZTV1B+8>: 0x80487a6 <B::foo()>
(gdb) x/a 0x8048970
0x8048970 <_ZTV1C+8>: 0x80487fe <C::foo()>

What do the virtual function pointers look like in memory under multiple inheritance?

class A{
    ...
    virtual void foo(){ cout << "A::foo()" << endl; }
};
class B{
    ...
    virtual void bar(){ cout << "B::bar()" << endl; }
};
class C: public A, public B{
    ...
    void foo(){ cout << "C::foo()" << endl; }
    void bar(){ cout << "C::bar()" << endl; }
};

Memory layout:

a:   vptr.A | a
b:   vptr.B | b
c:   vptr.A | a | vptr.B | b | c

Layout of C:

(gdb) p /x c
$4 = {<A> = {_vptr.A = 0x80489b8, a = 0x1}, <B> = {_vptr.B = 0x80489c8, b = 0x1}, c = 0xbffff384}
(gdb) x/5a &c
0xbffff2b4:  0x80489b8 <_ZTV1C+8>  0x1 0x80489c8 <_ZTV1C+24> 0x1
0xbffff2c4:  0xbffff384
(gdb) x/2a 0x80489b8
0x80489b8 <_ZTV1C+8>: 0x80487e2 <C::foo()>    0x8048816 <C::bar()>

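The vtable that the first vptr in c points to contains all of C's virtual functions. References or pointers of type A or C therefore need no pointer adjustment when they call a virtual function, and they do not need to know that vptr.B exists. For B *pb = &c; pb->bar();, pb has to be adjusted to point at vptr.B, so that the memory pb sees is the same whether it points into &c or at &b.

So the address stored in pb is shifted to where vptr.B lives.

The compiler implicitly passes the this pointer as the first argument and then adds to or subtracts from that address. The value of pb is the adjusted value &c + 8, which points at vptr.B, but when bar is called the function has to operate on the object c. Only then is polymorphism complete: c's function is called, and the object it works on is also c.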
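The pointer adjustment can be observed without a debugger. The sketch below reuses the class names A, B and C from the example, but fills in the elided members with made-up ones and returns strings instead of printing; the helper names are hypothetical. Exact offsets depend on the ABI, so the code only checks that the two base subobjects sit at different offsets.

```cpp
#include <cstddef>
#include <string>

struct A { int a = 1; virtual std::string foo() const { return "A::foo"; } };
struct B { int b = 2; virtual std::string bar() const { return "B::bar"; } };
struct C : A, B {
    int c = 3;
    std::string foo() const override { return "C::foo"; }
    std::string bar() const override { return "C::bar"; }
};

// Byte offset of the Base subobject inside a C object.
template <typename Base>
std::ptrdiff_t offset_of_base() {
    C obj;
    Base* base = &obj;   // implicit conversion applies the adjustment
    return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&obj);
}

// Calling through the adjusted B* still reaches C's override.
std::string call_bar_via_b() {
    C obj;
    B* pb = &obj;
    return pb->bar();
}

// dynamic_cast<void*> undoes the adjustment and finds the whole object.
bool finds_whole_object() {
    C obj;
    B* pb = &obj;
    return dynamic_cast<void*>(pb) == static_cast<void*>(&obj);
}
```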

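Mainstream C++ implementations (gcc/clang/VC++) all do much the same thing for the simplest case, single non-virtual inheritance where the base class has virtual functions:

- There is one vtable per class type, not one per object. The size (length) of the vtable grows with the total number of virtual functions of the class (inherited plus newly added).
- There is one vptr per object. The size of the vptr is fixed and independent of the number of virtual functions.
- An object's vptr may change during construction and destruction, pointing at different vtables. Once the object is fully constructed it no longer changes.
- "Using only the leading part" of the vtable can happen, if the derived class adds new virtual functions (for example virtual void B::print()).

Virtual functions have costs: in run time (an indirect function call, which usually cannot be inlined) and in space (a vptr per instance, a vtable per class).

These costs may not matter for some software, but they do matter when performance and memory requirements are tight.

In short, use virtual functions only when you really need run-time polymorphism.

An ordinary member function can be viewed as a global function in class scope; it is not part of the storage allocated for the object. Only virtual functions cause the object to hold a pointer, which leads to the addresses of the virtual functions and related information. The addresses of member functions are fixed at compile time, and calls are bound to the object either statically or dynamically. For a non-virtual member function, the compiler already knows the function's address at compile time and completes the call by passing the this pointer and the other arguments, so there is no need to store member-function information in the object.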
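The space cost described above shows up in sizeof. The sketch below uses three invented structs; the results are what mainstream implementations (gcc, clang, VC++) typically give, not something the standard guarantees.

```cpp
// Non-virtual member functions add nothing to the object.
struct Plain {
    int x;
    void f() {}
};

// One virtual function: the object gains a vptr.
struct Poly {
    int x;
    virtual void f() {}
};

// More virtual functions lengthen the per-class vtable, not the object.
struct PolyMany {
    int x;
    virtual void f() {}
    virtual void g() {}
    virtual void h() {}
};

bool non_virtual_functions_are_free() { return sizeof(Plain) == sizeof(int); }
bool vptr_costs_space() { return sizeof(Poly) > sizeof(Plain); }
bool vptr_size_independent_of_count() { return sizeof(Poly) == sizeof(PolyMany); }
```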

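What is an allocator about?

The technique used by the allocator depends on the particular STL implementation. In fact, in many STL implementations the allocator is simply new/delete, with no pool. As for pools, they may never free their memory: it is reclaimed by the OS when the program exits. Every successful call to allocate() returns a pointer, through which the user manages the memory it points to. deallocate(), construct(), and destroy() all take [a pointer returned by allocate()] (it can also be treated as an iterator) as their first argument, which shows that an allocator object does not itself own a block of memory.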
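The four operations mentioned above can be sketched with `std::allocator` and `std::allocator_traits`, which is how containers are expected to call them in modern C++. The function name `allocator_demo` is made up; a real container would also need exception handling between the steps.

```cpp
#include <memory>
#include <string>

std::string allocator_demo() {
    using Alloc = std::allocator<std::string>;
    using Traits = std::allocator_traits<Alloc>;

    Alloc alloc;
    std::string* p = Traits::allocate(alloc, 2);   // raw storage for 2 strings, no constructors run
    Traits::construct(alloc, p, "hello");          // construct objects in that storage
    Traits::construct(alloc, p + 1, "world");

    std::string result = p[0] + " " + p[1];

    Traits::destroy(alloc, p + 1);                 // run destructors, storage still allocated
    Traits::destroy(alloc, p);
    Traits::deallocate(alloc, p, 2);               // hand the storage back
    return result;
}
```

Separating allocation from construction is what lets a vector reserve capacity without constructing elements.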
RAII

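RAII stands for Resource Acquisition Is Initialization. It is an idiom used in some object-oriented languages. RAII requires that the validity period of a resource be strictly tied to the lifetime of the object that holds it: the object's constructor acquires (allocates) the resource, and its destructor releases it. Under this requirement, as long as objects are destroyed correctly, resources cannot leak.

Only objects that were constructed successfully (whose constructor did not throw) have their destructors called on exit[4], and destructors run in exactly the reverse order of construction[5]. This guarantees both that multiple resources (objects) are released correctly and that dependencies among those resources are respected.

RAII really is pleasant to use: concise, robust, and powerful. Concise: once the RAII class is defined, using it only takes defining an object, and releasing the resource needs no further code. Robust: release is guaranteed to happen, no matter how many return statements you add or remove or what exceptions are thrown. Powerful: RAII can manage not only memory but any resource that needs releasing. Mutexes, database connections, file handles all work, and the code looks the same. You can think of it as using C++'s automatic destruction at end of scope to guarantee that an operation is always executed, most commonly an unlock or a delete.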
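A minimal sketch of the idiom, with an invented `ScopedTrace` class standing in for a real resource so that the order of acquisition and release can be observed even when an exception is thrown. `locked_increment` shows the same pattern with the standard `std::lock_guard`.

```cpp
#include <mutex>
#include <stdexcept>
#include <string>
#include <utility>

// Acquires in the constructor, releases in the destructor.
class ScopedTrace {
public:
    ScopedTrace(std::string& log, std::string name)
        : log_(log), name_(std::move(name)) {
        log_ += "acquire " + name_ + ";";
    }
    ~ScopedTrace() { log_ += "release " + name_ + ";"; }
    ScopedTrace(const ScopedTrace&) = delete;
    ScopedTrace& operator=(const ScopedTrace&) = delete;

private:
    std::string& log_;
    std::string name_;
};

// Both guards are released in reverse order during stack unwinding,
// before the handler runs.
std::string raii_demo() {
    std::string log;
    try {
        ScopedTrace a(log, "A");
        ScopedTrace b(log, "B");
        throw std::runtime_error("boom");
    } catch (const std::exception&) {
        log += "caught;";
    }
    return log;
}

// The mutex is unlocked on every path out of the function.
int locked_increment(std::mutex& m, int& counter) {
    std::lock_guard<std::mutex> lock(m);
    return ++counter;
}
```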
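Memory management

From malloc/free to new/delete

From new/delete to the memory allocator (allocator).

STL containers do not manage memory with new/delete directly; they go through an allocator instead.

An allocator object alloc is itself a statically allocated memory allocator.

It can manage a large amount of memory, and the way it manages it can be dynamic allocation.

So alloc itself is static, while alloc.allocate and alloc.deallocate work dynamically.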

shared_ptr and unique_ptr
memory alignment and padding
Common STL implementations and operations
What extern "C" is used for

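extern mainly solves the problem of sharing one variable among several source files when the .cpp files are linked together. When a .cpp file is being compiled and needs a variable whose definition is not found in the current scope, an extern declaration tells the compiler that the definition lives in another file.

An extern variable can be declared many times, but it can be defined (initialized) only once.

An extern variable has to be initialized at global scope, so inside a local scope both declaring it with an initializer and declaring it and then defining it separately make the compiler report an error.

Once an extern variable has been declared and used, the linker looks for its definition whether or not it was initialized.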
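The declaration/definition split can be shown in a single file, as a stand-in for the usual arrangement where the declaration lives in a header and the definition in exactly one .cpp file. The names `shared_counter` and `bump` are hypothetical.

```cpp
// Declarations only: no storage is reserved, and repeating them is allowed.
extern int shared_counter;
extern int shared_counter;

// Code can use the variable having seen only the declaration;
// the linker connects it to the definition.
int bump() { return ++shared_counter; }

// The single definition, at namespace scope, with its initializer.
int shared_counter = 41;
```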
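Memory layout of derived-class objects
Recognizing complex function-pointer declarations
String-related operations
auto, lambda, decltype
Implement the move assignment operator of vector
How std::shared_ptr is implemented, and where the reference count is defined
How unordered_map works, how the hash table is implemented, and what happens as you keep adding elements starting from an empty table?
What final and override do, and when to use them. Why use them, and what are the consequences of not using them?
Complexity of the basic operations of each standard library container, and of the standard library algorithms. Average and worst case.
The data structure behind each standard library container, and how vector's capacity grows.
How map is implemented
Why vector::push_back() is amortized O(1)
Solving problems with lower_bound / upper_bound
Implement set_intersection(), set_union(), or merge()
Implement word count: count how many times each word occurs and print the results sorted by count.
Iterator invalidation
Thread safety of the standard library
Automated object lifetime management, smart pointers, circular references, weak_ptr
Which is faster, list's insert() / erase() or vector's? (It is not that simple.)
How inheritance, multiple inheritance, virtual function tables, and so on affect sizeof
Write a factorial with templates
Pros and cons of throwing exceptions versus returning error codes.
Lambda functions: the difference between capture by reference and capture by value
How the new expression, operator new, and malloc relate to each other
How is dynamic_cast implemented?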
