Smart Pointers and Memory Ownership in C++ Libraries

January 17, 2020

I don’t often get deep into technical minutiae on this blog, preferring to write for a more general audience. Despite that, programming computers is still my day job, and the articles that do get into technical detail tend to get people interested: the article that dove deep into the type system I developed for my thesis got some recent attention from Hacker News, the article about how to extract messages from WeChat is consistently one of the pages with the most hits (presumably because it’s useful to some people), and I’ve even gotten some good feedback and attention to my unpolished and unedited programming languages notes from university, which I only barely ever publicized.

I’ve even gotten a bit of a reputation at work as someone who can explain things to people in a way that’s understandable. I’ve been trying to lean into that strength at my job, and while I’m doing that, I may as well bring some of that knowledge to the outside world as well, where appropriate.

All this to say: this article was adapted from something I wrote at work to explain some subtleties of how smart pointers in modern C++ [1] work to people who only had experience with older C++. More specifically, it gets into how the way you use smart pointers should shape the design of libraries and their APIs.

What are the issues with memory ownership in older C++?

Let’s say that your library has a class ThingManager that creates CoolThing objects, which are then stored and used by the caller. This is probably what it would look like in older C++:

class ThingManager {
 public:
  CoolThing* makeCoolThing(...) {
    CoolThing* thing = new CoolThing();
    // ... Do some setup for thing ...
    return thing;
  }
  // ...
};

The problem is that this leaves out a lot of important information about this API:

  • If I delete the ThingManager, are the things it created still valid?
  • Is the ThingManager ever going to delete this CoolThing itself?
  • Should I, as the caller, be deleting the CoolThing myself? If so, when?

Effectively, unless the person who wrote this library left you some thorough documentation, you have no idea what the answers to these questions are. You don’t have a good sense of memory ownership, of what part of the code is responsible for managing the lifetime of this CoolThing.

In modern C++, we have more tools available to us. If you use shared_ptr and unique_ptr properly, you can clarify a lot of the answers to these questions and have the compiler enforce them.

What are shared_ptr and unique_ptr?

Put simply, they’re smart pointer types introduced in C++11 to help make these ownership semantics clearer. They act like pointers, but they automatically manage the lifetime of the object they point to, deleting it for you when it’s no longer owned by anything.

Here’s how shared_ptr works:

  • Every time a shared_ptr is copied, an internal reference count to the object is incremented by one.

  • Every time a shared_ptr is destroyed, that internal reference count is decremented.

  • When the reference count hits zero, the object it points to is deleted.

// Pretend we have this function defined somewhere.
void passByValueFunction(std::shared_ptr<CoolThing> thing);

{
  std::shared_ptr<CoolThing> ptrToThing(new CoolThing());
  // Our object now has a reference count of 1.
  
  {
    std::shared_ptr<CoolThing> secondPtrToThing = ptrToThing;
    // Our object now has a reference count of 2.
    
    passByValueFunction(secondPtrToThing);
    // The function call will copy its argument, meaning
    // that the reference count becomes 3 for the duration
    // of the function call, then drops to 2 when the
    // function returns and the copy is destroyed.
  }
  // At the end of this scope, secondPtrToThing is destroyed,
  // and the reference count is now 1.
}
// At the end of this scope, ptrToThing is destroyed, and
// the reference count hits 0, triggering the deletion of
// our CoolThing.
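
If you’d rather watch those counts change than take the comments’ word for it, shared_ptr exposes the current count through use_count(). Here’s a minimal sketch of the same walkthrough with the counts checked by assert. CoolThing is just an empty stand-in here, and countDuringCall and observeCounts are hypothetical helper names I made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Empty stand-in for CoolThing; any class would behave the same way.
struct CoolThing {};

// Takes its argument by value, so calling it copies the shared_ptr.
long countDuringCall(std::shared_ptr<CoolThing> thing) {
  return thing.use_count();
}

// Hypothetical helper: repeats the walkthrough and returns the final count.
long observeCounts() {
  auto ptrToThing = std::make_shared<CoolThing>();
  assert(ptrToThing.use_count() == 1);
  {
    std::shared_ptr<CoolThing> secondPtrToThing = ptrToThing;
    assert(ptrToThing.use_count() == 2);

    // The by-value parameter is a third owner while the call runs.
    assert(countDuringCall(secondPtrToThing) == 3);
    assert(ptrToThing.use_count() == 2);
  }
  // secondPtrToThing is gone, so only ptrToThing is left.
  return ptrToThing.use_count();
}
```

Keep in mind that use_count() is mostly a debugging aid. In multithreaded code the value can be out of date by the time you read it, so it’s not something to build logic on.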

unique_ptr is a bit different. It’s a pointer to an object, same as shared_ptr, but it cannot be copied. That is, if you try to do the following, you’ll get a compile error:

void passByValueFunction(std::unique_ptr<CoolThing> thing);

std::unique_ptr<CoolThing> ptrToThing(new CoolThing());

std::unique_ptr<CoolThing> secondPtrToThing = ptrToThing;
// Not allowed! Compile error!

passByValueFunction(ptrToThing);
// Also not allowed! Compile error!

However, you can still pass it by reference:

void passByReferenceFunction(std::unique_ptr<CoolThing>& thing);

std::unique_ptr<CoolThing> ptrToThing(new CoolThing());

passByReferenceFunction(ptrToThing); // This is allowed!

And while you can’t copy a unique_ptr, you can move it, which doesn’t do any additional copying or allocation, but instead just moves the unique_ptr’s internal CoolThing* from one unique_ptr to another. Note that this invalidates the pointer being moved from, making it null!

std::unique_ptr<CoolThing> ptrToThing(new CoolThing());

std::unique_ptr<CoolThing> secondPtrToThing = 
  std::move(ptrToThing); // Allowed!
// But ptrToThing is null from now on!
// Only secondPtrToThing is valid.
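
The same trick gets you past the pass-by-value compile error from earlier: a function that takes a unique_ptr by value can’t be handed a copy, but it can be handed a moved pointer, at which point the function owns the object. A small sketch, assuming a stand-in CoolThing with a single field and hypothetical function names of my own choosing:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in CoolThing with one field so there is something to read.
struct CoolThing {
  int value = 42;
};

// Taking a unique_ptr by value means "I am taking ownership of this".
int consume(std::unique_ptr<CoolThing> thing) {
  return thing->value;
}  // thing is destroyed here, which deletes the CoolThing.

// Hypothetical helper: returns true if the caller's pointer is null
// after ownership has been moved into consume().
bool moveIntoFunction() {
  auto ptrToThing = std::make_unique<CoolThing>();
  int seen = consume(std::move(ptrToThing));
  assert(seen == 42);
  return ptrToThing == nullptr;
}
```

This is a handy way to make a function signature say “give me this object and don’t expect it back.”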

How should I create smart pointers to objects?

In the examples above, I was creating those smart pointers by directly allocating memory with new and passing the result of that allocation to the constructor of shared_ptr or unique_ptr. This actually isn’t the best idea.

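Instead, you should use the std::make_shared and std::make_unique functions. (One caveat: std::make_shared has been around since C++11, but std::make_unique only arrived in C++14.) Just pass the same arguments to those functions as you would to the constructor of the object itself: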

class MyClass {
 public:
  MyClass(int a, float b, std::string c) {...}
};

std::shared_ptr<MyClass> sharedThing
  = std::make_shared<MyClass>(1, 3.14, "hi");

std::unique_ptr<MyClass> uniqueThing
  = std::make_unique<MyClass>(2, 6.28, "hello");

// It's less verbose with "auto", which infers the type.
auto sharedThing2 = std::make_shared<MyClass>(3, 2.71, "bye");

This prevents you from having to futz around with new yourself, which makes it less likely that you’d accidentally break the guarantees these smart pointers give you (such as by giving them an object that was created elsewhere, that then gets deleted elsewhere as well).

However, make_unique and make_shared are not always usable. For example, if the constructor you’re invoking is private, they won’t work. For these situations, you can always just fall back on new as I did above. But if you’re doing this, be sure never to delete the object yourself. Let the smart pointer handle that work; otherwise, you’re going to end up double-freeing your object.
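
To make the private-constructor case concrete, here’s one way it might look: a class that can only be created through a static factory function, which wraps a plain new in a unique_ptr right away. Widget and create are hypothetical names for this sketch, not anything from a real library.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class Widget {
 public:
  // The only way to get a Widget from outside the class.
  static std::unique_ptr<Widget> create(std::string name) {
    // std::make_unique<Widget>(...) would not compile here: the
    // constructor call would happen inside make_unique, which is
    // neither a member nor a friend of Widget. Calling new from a
    // member function is fine, and the result goes straight into a
    // unique_ptr so nobody ever has to delete it by hand.
    return std::unique_ptr<Widget>(new Widget(std::move(name)));
  }

  const std::string& name() const { return name_; }

 private:
  explicit Widget(std::string name) : name_(std::move(name)) {}
  std::string name_;
};
```

Calling code then just writes auto w = Widget::create("gadget"); and never sees a raw pointer at all.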

Can I convert between the two?

You can easily convert from unique_ptr to shared_ptr. Just use std::move:

auto uniqueThing = std::make_unique<CoolThing>();

std::shared_ptr<CoolThing> sharedThing
  = std::move(uniqueThing);

As usual with std::move, this invalidates the previous unique_ptr, making it null. The object is now under shared ownership inside the shared_ptr.

However, you can’t statically convert from shared_ptr to unique_ptr. If you think about what those objects represent, this makes sense: unique_ptr says “This object is only owned by this one pointer, and that’s final.” shared_ptr, on the other hand, says “This object is referenced in any number of different places with shared ownership.” If something’s potentially referenced in many places, there’s no way to statically convert it back to single ownership.

Do I have to use std::move every time I return a unique_ptr from a function?

Nope! You can easily do this:

std::unique_ptr<CoolThing> createCoolThing() {
  auto thing = std::make_unique<CoolThing>();
  // ... Do some setup for thing ...
  return thing;
}

std::unique_ptr<CoolThing> uniqueThing = createCoolThing();
// Allowed!
std::shared_ptr<CoolThing> sharedThing = createCoolThing();
// Also allowed!

The reason for this is that, wherever possible, C++ does a move rather than a copy here. Inside the function, return thing; names a local variable that is about to go out of scope, so the compiler treats it as a temporary and moves it (or skips the move entirely by constructing the result directly in the caller’s variable). At the call site, the function’s return value is itself a temporary that can’t be used again, so initializing your local variable from it is also a move rather than a copy. std::move is only required when you want to force a move, where you’re explicitly telling the compiler “Trust me, I’m not going to be using this variable again, feel free to invalidate it.”

If you want to look further into this, the keywords to look for are rvalue references and move constructors.
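
If you want the compiler itself to confirm the copy and move rules from the last few sections, the standard <type_traits> header lets you ask it directly. This is just a sketch with an empty stand-in CoolThing; returnValueCanBecomeShared is a hypothetical helper name.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct CoolThing {};

using Unique = std::unique_ptr<CoolThing>;
using Shared = std::shared_ptr<CoolThing>;

// unique_ptr can be moved but not copied.
static_assert(!std::is_copy_constructible<Unique>::value,
              "unique_ptr must not be copyable");
static_assert(std::is_move_constructible<Unique>::value,
              "unique_ptr must be movable");

// shared_ptr can be copied, and can be built from a moved unique_ptr...
static_assert(std::is_copy_constructible<Shared>::value,
              "shared_ptr must be copyable");
static_assert(std::is_constructible<Shared, Unique&&>::value,
              "unique_ptr must convert to shared_ptr");

// ...but there is no conversion in the other direction.
static_assert(!std::is_constructible<Unique, Shared&&>::value,
              "shared_ptr must not convert to unique_ptr");

Unique createCoolThing() {
  auto thing = std::make_unique<CoolThing>();
  return thing;  // Moved (or elided) automatically; no std::move needed.
}

// Hypothetical helper: the returned temporary converts to shared_ptr.
bool returnValueCanBecomeShared() {
  Shared shared = createCoolThing();
  assert(shared.use_count() == 1);
  return shared != nullptr;
}
```

If any of those static_asserts were wrong, the file wouldn’t compile, so this doubles as a cheap sanity check.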

When do I use shared_ptr vs unique_ptr?

The most important difference between the two of these, particularly in an API, is what they say about memory ownership.

Let’s bring back our ThingManager example, but this time we’ll return a unique_ptr. Returning unique_ptr communicates to the calling code, “Hey, you can do what you want with this and delete it wherever you like, since this is guaranteed to be the only owning reference to this object.” This is generally preferred in most cases, if you don’t need to hold a reference to the object internally.

class ThingManager {
 public:
  // This function signature says "I won't be holding
  // onto this object. It's yours to deal with now.
  // Have at it."
  std::unique_ptr<CoolThing> makeCoolThing(...) {
    auto thing = std::make_unique<CoolThing>();
    // ... Do some setup for thing ...
    return thing;
  }
  //...
};

By doing this, you’re explicitly saying that the caller owns this object now and the object is guaranteed to live for as long as the caller holds onto that variable.

If you do need to hold a reference to the object, then return shared_ptr. Presumably you’re already holding the object as a shared_ptr internally, and you can just return a copy of that.

class ThingManager {
 public:
  // This function signature says "I'm going to hold onto a
  // reference to this object somewhere. Even when you're
  // done with it, that object will probably still be
  // living somewhere inside the guts of this library."
  std::shared_ptr<CoolThing> makeCoolThing(...) {
    lastCreatedThing_ = std::make_shared<CoolThing>();
    // ... Do some setup for it ...
    return lastCreatedThing_;
  }
  //...
 private:
  std::shared_ptr<CoolThing> lastCreatedThing_ = nullptr;
};

By doing this, you’re saying that the object is going to live either as long as the ThingManager lives, or as long as the caller holds onto its variable, whichever one’s longer. The object has shared ownership and neither ThingManager nor the caller can unilaterally delete it.
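
Here’s a rough sketch of that “whichever one’s longer” behavior in action. I’ve given the stand-in CoolThing a counter of live instances so we can see when it actually gets destroyed; the counter and the aliveAfterManagerDestroyed helper are my own additions for illustration.

```cpp
#include <cassert>
#include <memory>

// Stand-in CoolThing that tracks how many instances currently exist.
struct CoolThing {
  static int alive;
  CoolThing() { ++alive; }
  ~CoolThing() { --alive; }
};
int CoolThing::alive = 0;

class ThingManager {
 public:
  std::shared_ptr<CoolThing> makeCoolThing() {
    lastCreatedThing_ = std::make_shared<CoolThing>();
    return lastCreatedThing_;
  }

 private:
  std::shared_ptr<CoolThing> lastCreatedThing_;
};

// Hypothetical helper: returns how many CoolThings are alive once the
// manager is gone but the caller still holds its pointer.
int aliveAfterManagerDestroyed() {
  std::shared_ptr<CoolThing> callerCopy;
  {
    ThingManager manager;
    callerCopy = manager.makeCoolThing();
    assert(callerCopy.use_count() == 2);  // manager + caller
  }
  // The manager has been destroyed, but the object lives on.
  assert(callerCopy.use_count() == 1);
  return CoolThing::alive;
}
```

Once callerCopy itself goes out of scope, the last owner is gone and the CoolThing is finally destroyed.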

If, on the other hand, you know that ThingManager, and the object it holds on to, is going to outlive the calling code, consider just returning a raw reference:

class ThingManager {
 public:
  // This can be a bit dangerous, but is okay to do if
  // you know for sure that ThingManager is going to
  // outlive whatever's calling it.
  CoolThing& getCoolThing(...) {
    return internalThing_;
  }
  //...
 private:
  // When doing this, you don't need shared_ptr anymore,
  // since ThingManager has exclusive, non-transferable
  // ownership of this object.
  CoolThing internalThing_;
};

When writing a library, though, you don’t usually have this degree of control over the calling code. This is a slightly riskier API design in general, but it can be appropriate in some places, since returning a raw reference implies that ThingManager retains ownership of the object and that the caller shouldn’t expect it to outlive the ThingManager.

But hey, what about weak_ptr?

weak_ptr is a kind of smart pointer that you can construct from a shared_ptr. It keeps a reference to the object, but doesn’t increase the reference count, so it doesn’t keep the object alive. You also can’t dereference a weak_ptr directly: it has to be converted to a shared_ptr by calling lock() first. Once the object has been deleted, lock() returns a null shared_ptr instead.

void checkForNull(std::weak_ptr<CoolThing> weakThing) {
  std::shared_ptr<CoolThing> strongThing = weakThing.lock();
  // If this works, it will increase the reference count.
  if (strongThing != nullptr) {
    std::cout << "I got a cool thing!\n";
  } else {
    std::cout << "I got a nullptr...\n";
  }
}

// Elsewhere...
std::weak_ptr<CoolThing> weakThing;
{
  auto ptrToThing = std::make_shared<CoolThing>();
  // The reference count is now 1
  
  weakThing = ptrToThing;
  // The reference count is still 1!
  
  checkForNull(weakThing); // Prints "I got a cool thing!"
}
checkForNull(weakThing); // Prints "I got a nullptr..."

This can be very useful for breaking reference cycles to avoid memory leaks. In short: the problem with shared_ptr is that if you have, for example, two objects that hold shared_ptrs to each other, those two objects will never be freed because each of them is keeping an owning reference to their counterpart. Changing one of those two objects to use a weak_ptr instead can alleviate this issue.
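
As a sketch of what breaking a cycle looks like, picture an owner that holds a shared_ptr to its pet, while the pet only holds a weak_ptr back to its owner. Owner, Pet, and the objectsAlive counter are all hypothetical names made up for this example.

```cpp
#include <cassert>
#include <memory>

// Counts how many Owner and Pet objects currently exist.
int objectsAlive = 0;

struct Owner;

struct Pet {
  Pet() { ++objectsAlive; }
  ~Pet() { --objectsAlive; }
  // Non-owning back-reference. If this were a shared_ptr<Owner>, the
  // two objects would keep each other alive forever.
  std::weak_ptr<Owner> owner;
};

struct Owner {
  Owner() { ++objectsAlive; }
  ~Owner() { --objectsAlive; }
  std::shared_ptr<Pet> pet;  // Owning reference.
};

// Hypothetical helper: builds the pair, lets it go out of scope, and
// returns how many objects are still alive afterwards.
int aliveAfterScope() {
  {
    auto owner = std::make_shared<Owner>();
    owner->pet = std::make_shared<Pet>();
    owner->pet->owner = owner;

    assert(objectsAlive == 2);
    // The weak_ptr inside Pet does not count as an owner.
    assert(owner.use_count() == 1);
  }
  // Owner was destroyed, which released the last reference to Pet.
  return objectsAlive;
}
```

The usual rule of thumb is to pick one direction as the owning one and make the back-reference weak.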

But the fact that you can trivially convert a weak_ptr to a shared_ptr means that it doesn’t really win you a whole lot in terms of actual lifetime control. The client can always do something like this:

class ThingManager {
 public:
  std::weak_ptr<CoolThing> makeCoolThing() {
    lastCreatedThing_ = std::make_shared<CoolThing>();
    // ... Do some setup for it ...
    return lastCreatedThing_;
  }
  //...
 private:
  std::shared_ptr<CoolThing> lastCreatedThing_ = nullptr;
};

// And in the calling code, given a ThingManager named manager...
std::shared_ptr<CoolThing> ptrToThing
  = manager.makeCoolThing().lock();
// Hahaha! Now this object will live for as long as I
// want it to!

This makes weak_ptr less useful in terms of communicating lifetime to the calling code.

Does using smart pointers guarantee that you can’t mess up lifetimes?

No, there’s no guarantee for things like that in C++. You could always, for example, hold onto a raw CoolThing& reference to an object, and then return it via unique_ptr to the calling code. But by writing your API using smart pointers to communicate lifetime, you at least can get a sense that you shouldn’t do things like that, because the very act of passing a unique_ptr means that you are explicitly handing over control of its lifetime to the calling code and you won’t know when that reference will be invalidated.

The point is: this isn’t going to guarantee safety the way, say, Rust’s borrow checker would. But designing your API this way is at least highly suggestive of the correct way of handling the objects that your API returns, and that’s already well beyond what you get with old-style C++.

  1. For the purposes of this article, we’ll assume that “modern C++” means C++11 and later, whereas “older C++” means everything before C++11.