| # What’s Up With Pointers |
| |
| This is a transcript of [What's Up With |
| That](https://www.youtube.com/playlist?list=PL9ioqAuyl6ULIdZQys3fwRxi3G3ns39Hq) |
| Episode 1, a 2022 video discussion between [Sharon |
| and Dana](https://www.youtube.com/watch?v=MpwbWSEDfjM). |
| |
| The transcript was automatically generated by speech-to-text software. It may |
| contain minor errors. |
| |
| --- |
| |
| Welcome to the first episode of What’s Up With That, all about pointers! Our |
| special guest is C++ expert Dana. This talk covers smart pointer types we have |
| in Chrome, how to use them, and what can go wrong. |
| |
| Notes: |
| |
| - https://docs.google.com/document/d/1VRevv8JhlP4I8fIlvf87IrW2IRjE0PbkSfIcI6-UbJo/edit |
| |
| Links: |
| |
| - [Life of a Vulnerability] |
| - [MiraclePtr] |
| |
| --- |
| |
| 00:00 SHARON: Hi, everyone, and welcome to the first installment of "What's Up |
| With That", the series that demystifies all things Chrome. I'm your host, |
| Sharon, and today's inaugural episode will be all about pointers. There are so |
| many types of types - which one should I use? What can possibly go wrong? Our |
| guest today is Dana, who is one of our Base and C++ OWNERS and is currently |
| working on introducing Rust to Chromium. Previously, she was part of bringing |
| C++11 support to the Android NDK and then to Chrome. Today, she'll be telling |
| us what's up with pointers. Welcome, Dana! |
| |
| 00:31 DANA: Thank you, Sharon. It's super exciting to be here. Thank you for |
| letting me be on your podcast thingy. |
| |
| 00:36 SHARON: Yeah, thanks for being the first episode. So let's just jump |
| right in. So when you use pointers wrong, what can go wrong? What are the |
| problems? What can happen? |
| |
| 00:48 DANA: So pointers are a big cause of security problems for Chrome, and |
| that's what we mostly think about when things go wrong with pointers. So you |
| have a pointer to some thing, like you've pointed to a goat. And then you |
| delete the goat, and you allocate some new thing - a cow. And it gets stuck in |
| the same spot. Your pointer didn't change. It's still pointing to what it |
| thinks is a goat, but there's now a cow there. And so when you go to use that |
| pointer, you use something different. And this is a tool that malicious actors |
| use to exploit software, like Chrome, in order to gain access to your system, |
| your information, et cetera. |
| |
| 01:39 SHARON: And we want to avoid those. So what's that general type of attack |
| called? |
| |
| 01:39 DANA: That's a Use-After-Free because you have freed the goat and |
| replaced it with a cow. And you're using your pointer, but the thing it pointed |
| to was freed. There are other kinds of pointer badness that can happen. If you |
| take a pointer and you add to it some number, or you go to an offset off the |
| pointer, and you have an array of five things, and you go and read 20, or minus |
| 2, or something, now you're reading out of bounds of that memory allocation. |
| And that's not good. These are both memory safety bugs that occur a lot with |
| pointers. |
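
As a minimal sketch of the out-of-bounds case described above (an array of five things, read at index 20 or minus 2), the helper below checks the index before reading. `CheckedRead` is a hypothetical name used only for illustration, not a Chromium API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical helper (not a Chromium API): reads `values[index]` only when
// the index is inside the array, and returns std::nullopt otherwise.
std::optional<int> CheckedRead(const std::array<int, 5>& values,
                               std::ptrdiff_t index) {
  if (index < 0 || static_cast<std::size_t>(index) >= values.size()) {
    return std::nullopt;  // This would have been an out-of-bounds read.
  }
  return values[static_cast<std::size_t>(index)];
}

// Usage: indices 20 and -2 are rejected instead of reading stray memory.
void CheckedReadExample() {
  std::array<int, 5> values = {10, 11, 12, 13, 14};
  assert(CheckedRead(values, 4) == 14);
  assert(!CheckedRead(values, 20).has_value());
  assert(!CheckedRead(values, -2).has_value());
}
```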
| |
| 02:23 SHARON: Today, we'll be mostly looking at the Use-After-Free kind of |
| bugs. We definitely see a lot of those. And if you want to see an example of |
| one being used, Dana has previously done a talk called, "[Life of a |
| Vulnerability]." It'll be linked below. You can check that out. So that being |
| said, should we ever be using just a regular raw pointer in C++ in Chrome? |
| |
| 02:41 DANA: First of all, let's call them native pointers. You will see them |
| called raw pointers a lot in literature and stuff. But later on, we'll see why |
| that could be a bit ambiguous in this context. So we'll call them a native |
| pointer. So should you use a native pointer? If you don't want to |
| Use-After-Free, if you don't want a problem like that, no. However, there is a |
| performance implication with using smart pointers, and so the answer is yes. |
| The style guide that we have right now takes this pragmatic approach of saying |
| you should use raw pointers for giving access to an object. So if you're |
| passing them as a function parameter, you can share it as a pointer or a |
| reference, which is like a pointer with slightly different rules. But you |
| should not store native pointers as fields in objects because that is a place |
| where they go wrong a lot. And you should not use a native pointer to express |
| ownership. So before C++11, you would just say, this is my pointer, use a |
| comment, say this one is owning it. And then if you wanted to pass the |
| ownership, you just pass this native pointer over to something else as an |
| argument, and put a comment and say this is passing ownership. And you just |
| kind of hope it works out. But then it's very difficult. It requires the |
| programmer to understand the whole system to do it correctly. There is no help. |
| So in C++11, the type called `std::optional_ptr` - or sorry, `std::unique_ptr` - |
| was introduced. And this is expressing unique ownership. That's why it's |
| called `unique_ptr`. And it's just going to hold your pointer, and when it goes |
| out of scope, it gets deleted. It can't be copied because it's unique |
| ownership. But it can be moved around. And so if you're going to express |
| ownership to an object in the heap, you should use a `unique_ptr`. |
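
The ownership rules described above can be sketched with the standard `std::unique_ptr`. The `Goat` type and the function names are made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Goat {
  int age = 3;
};

// Taking std::unique_ptr by value documents that the callee takes ownership.
// The Goat is deleted when `goat` goes out of scope at the end of the call.
int AdoptGoat(std::unique_ptr<Goat> goat) {
  return goat->age;
}

// Taking a native pointer only lends access; the caller keeps ownership.
int PeekAtGoat(const Goat* goat) {
  return goat->age;
}

void OwnershipExample() {
  std::unique_ptr<Goat> owner = std::make_unique<Goat>();
  assert(PeekAtGoat(owner.get()) == 3);  // Still owned by `owner`.

  // std::unique_ptr<Goat> copy = owner;  // Does not compile: not copyable.
  int age = AdoptGoat(std::move(owner));  // Ownership moves into the call.
  assert(age == 3);
  assert(owner == nullptr);  // A moved-from unique_ptr is null.
}
```

Compared with a comment that says "this passes ownership", the transfer here is visible in the type and checked by the compiler.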
| |
| 04:48 SHARON: That makes sense. And that sounds good. So you mentioned smart |
| pointers before. You want to tell us a bit more about what those are? It sounds |
| like `unique_ptr` is one of those. |
| |
| 04:55 DANA: Yes, so a smart pointer, which can also be referred to as a |
| pointer-like object, perhaps as a subset of them, is a class that holds inside |
| of it a pointer and mediates access to it in some way. So `unique_ptr` |
| mediates access by saying I own this pointer, I will delete this pointer when I |
| go away, but I'll give you access to it. So you can use the arrow operator or |
| the star operator to get at the underlying pointer. And you can construct them |
| out of native pointers as well. So that's an example of a smart pointer. |
| There's a whole bunch of smart pointers, but that's the general idea. I'm going |
| to add something to what a native pointer is, while giving you access to it in |
| some way. |
| |
| 05:40 SHARON: That makes sense. That's kind of what our main thing is going to |
| be about today because you look around in Chrome, you'll see a lot of these |
| wrapper types. It'll be a `unique_ptr` and then a type. And you'll see so many |
| types of these, and talking to other people, myself, I find this all very |
| confusing. So we'll cover some of the more common types today. We just talked |
| about unique pointers. Next, let's talk about `absl::optional`. So why don't |
| you tell us about that. |
| |
| 06:10 DANA: So that's actually a really good example of a pointer-like object |
| that's not actually holding a pointer, so it's not a smart pointer. But it |
| looks like one. So this is this distinction. So `absl::optional`, also known as |
| `std::optional`, if you're not working in Chromium, and at some point, we will |
| hopefully migrate to it, `std::optional` and `absl::optional` hold an object |
| inside of it by value instead of by pointer. This means that the object is held |
| in that space allocated for the `optional`. So the size of the `optional` is |
| the size of the thing it's holding, plus some space for a presence flag. |
| Whereas a `unique_ptr` holds only a pointer. And its size is the size of a |
| pointer. And then the actual object lives elsewhere. So that's the difference |
| in how you can think about them. But otherwise, they do look quite similar. An |
| `optional` is a unique ownership because it's literally holding the object |
| inside of it. However, an `optional` is copyable if the object inside is |
| copyable, for instance. So it doesn't have quite the same semantics. And it |
| doesn't require a heap allocation, the way `unique_ptr` does because it's |
| storing the memory in place. So if you have an `optional` on the stack, the |
| object inside is also right there on the stack. That's good or bad, depending |
| what you want. If you're worried about your object sizes, not so good. If |
| you're worried about the cost of memory allocation and free, good. So this is |
| the trade-off between the two. |
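
A small sketch of the storage difference, using `std::optional` as a stand-in for `absl::optional`. The `Cow` type is made up, and the size checks reflect what mainstream standard libraries do rather than a guarantee of the C++ standard.

```cpp
#include <cassert>
#include <memory>
#include <optional>

struct Cow {
  char bytes[64];
};

// The optional stores the Cow inline, so in practice it is the size of a Cow
// plus some room for the "is present" flag.
static_assert(sizeof(std::optional<Cow>) > sizeof(Cow));

// On mainstream standard libraries a unique_ptr with the default deleter is
// exactly one pointer wide; the Cow itself lives in a separate heap block.
static_assert(sizeof(std::unique_ptr<Cow>) == sizeof(Cow*));

void StorageExample() {
  std::optional<Cow> inline_cow;  // Empty: no Cow has been constructed yet.
  assert(!inline_cow.has_value());
  inline_cow.emplace();  // Constructed in place, no heap allocation.
  assert(inline_cow.has_value());

  std::optional<Cow> copy = inline_cow;  // Copyable because Cow is copyable.
  assert(copy.has_value());

  std::unique_ptr<Cow> heap_cow = std::make_unique<Cow>();  // Heap allocation.
  assert(heap_cow != nullptr);
}
```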
| |
| 07:51 SHARON: Can you give any examples of when you might want to use one |
| versus the other? Like you mentioned some kind of general trade-offs, but any |
| specific examples? Because I've definitely seen use cases where `unique_ptr` is |
| used when maybe an `optional` makes more sense or vice versa. Maybe it's just |
| because someone didn't know about it or it was chosen that way. Do you have any |
| specific examples? |
| |
| 08:14 DANA: So one place where you might use a `unique_ptr`, even though |
| `optional` is maybe the better choice, is because of forward declarations. So |
| because an `optional` holds the type inside of it, it needs to know the type |
| size, which means it needs to know the full declaration of that type, or the |
| whole definition of that type. And a `unique_ptr` doesn't because it's just |
| holding a pointer, so it only needs to know the size of a pointer. And so if |
| you have a header file, and you don't want to include another header file, and |
| you just want to forward declare the types, you can't stick an optional of that |
| type right there because you don't know how big it's supposed to be. So that |
| might be a case where it's maybe not the right choice, but for other |
| constraining reasons, you choose to use a `unique_ptr` here. And you pay the |
| cost of a heap allocation and free as a result. But when would you use an |
| `optional`? So `optional` is fantastic for returning a value sometimes. I want |
| to do this thing, and I want to give you back a result, but I might fail. Or |
| sometimes there's no value to give you back. Typically, before C++ - what are |
| we on now, was it came in 14? I'm going to say it wrong. That's OK. Before we |
| had `absl::optional`, you would have to do different tricks. So you would pass |
| in a native pointer as a parameter and return a bool as the return value to say |
| did I populate the pointer. And yes, that works. But it's easy to mess it up. |
| It also generates less optimal code. Pointers cause the optimizer to have |
| troubles. And it doesn't express as nicely what your intention is: I return |
| this thing, sometimes. And so in place of using this pointer plus bool, you can |
| put that into a single type, return an `optional`. Similar for holding |
| something as a field, where you want it to be held inline in your class, but |
| you don't always have it present, you can do that with an `optional` now, where |
| you would have probably used a pointer before. Or a `union` or something, but |
| that gets even more tricky. And then another place you might use it is as a |
| function argument. However, that's usually not the right choice for a function |
| argument. Why? Because the `optional` holds the value inside of it. |
| Constructing an `optional` requires constructing the whole object inside of it. |
| And so that's not free. It can be arbitrarily expensive, depending on what your |
| type is. And if your caller to your function doesn't have already an |
| `optional`, they have to go and construct it to pass it to you. And that's a |
| copy or move of that inner type. So generally, if you're going to receive a |
| parameter, maybe sometimes, the right way to spell that is just to pass it as a |
| native pointer, which can be null, when it's not present. |
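
The three patterns mentioned above can be sketched side by side: bool plus out-pointer, an `optional` return value, and a nullable native pointer for a parameter that is only sometimes present. This uses `std::optional` as a stand-in for `absl::optional`; the function names and the `example.test` host are invented for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Older style: the bool return value says whether `*out` was populated.
bool FindPortOld(const std::string& host, int* out) {
  if (host == "example.test") {
    *out = 8080;
    return true;
  }
  return false;
}

// The same lookup with the "sometimes" folded into the return type.
std::optional<int> FindPort(const std::string& host) {
  if (host == "example.test") {
    return 8080;
  }
  return std::nullopt;
}

// For a parameter that is only sometimes present, a nullable native pointer
// avoids making the caller construct an optional (a copy or move of the value).
int PortOrDefault(const int* maybe_port) {
  return maybe_port ? *maybe_port : 80;
}

void OptionalReturnExample() {
  int port = 0;
  bool found = FindPortOld("example.test", &port);
  assert(found);
  assert(port == 8080);

  assert(FindPort("example.test") == 8080);
  assert(!FindPort("unknown.test").has_value());

  assert(PortOrDefault(nullptr) == 80);
  assert(PortOrDefault(&port) == 8080);
}
```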
| |
| 11:29 SHARON: Hopefully that clarifies some things for people who are trying to |
| decide which one best suits their use case. So moving on from that, some people |
| might remember from a couple of years ago that instead of being called |
| `absl::optional`, it used to be called `base::optional`. And do you want to |
| quickly mention why we switched from `base` to `absl`? And you mentioned even |
| switching to `std::optional`. Why this transition? |
| |
| 11:53 DANA: Yeah, absolutely. So as the C++ standards come out, we want to use |
| them, but we can't until our toolchain is ready. What's our toolchain? It's our |
| compiler, our standard library - and unfortunately, we have more than one |
| compiler that we need to worry about. So we have the NaCl compiler. Luckily, we |
| just have Clang for the compiler choice we really have to worry about. But we |
| do have to wait for these things to be ready, and for a code base to be ready |
| to turn on the new standard because sometimes there are some non-backwards |
| compatible changes. But we can forward port stuff out of the standard library |
| into base. And so we've done that. We have a bunch of C++20 backports in base |
| now. We had 17 backports before. We turned on 17, now they should hopefully be |
| gone. And so `base::optional` was an example of a backport, while `optional` |
| was still considered experimental in the standard library. We adopted use of |
| `absl` since then, and `absl` had also, essentially, a backport of the |
| `optional` type inside of it for presumably the same reasons. And so why have |
| two when you can have one? That's a pretty good rule. And so we deprecated the |
| `base` one, removed it, and moved everything to the `absl` one. One thing to |
| note here, possibly of interest, is we often add security hardening to things in |
| `base`. And so sometimes there is available in the standard library something. |
| But we choose not to use it and use something in `base` or `absl`, but we use |
| it in `base` instead, because we have extra hardening checks. And so part of |
| the process of removing `base::optional` and moving to `absl::optional` was |
| ensuring those same security hardening checks are present in `absl`. And we're |
| going to have to do the same thing to stop using `absl` and start using the |
| standard one. And that's currently a work in progress. |
| |
| 13:48 SHARON: So let's go through some of the `base` types because that's |
| definitely where most of these kinds of wrapper types live. So let's just |
| start with one that I learned about recently, and that's a `scoped_refptr`. |
| What's that? When should we use it? |
| |
| 13:59 DANA: So `scoped_refptr` is kind of your Chromium equivalent to |
| `shared_ptr` in the standard library. So if you're familiar with that, it's |
| quite similar, but it has some slight differences. So what is `scoped_refptr`? |
| It gives you shared ownership of the underlying object. And it's a smart |
| pointer. It holds a pointer to an object that's allocated in the heap. When all |
| `scoped_refptr`s that point to the same object are gone, it'll be deleted. So |
| it's like `unique_ptr`, except it can be copied to add to your ref count, |
| basically. And when all of them are gone, it's destroyed. And it gives access |
| to the underlying pointer in exactly the same ways. Oh, but why is it different |
| than `shared_ptr`? I did say it is. `scoped_refptr` requires the type that is |
| held inside of it to inherit from `RefCounted` or `RefCountedThreadSafe`. |
| `shared_ptr` doesn't require this. Why? So `shared_ptr` sticks an allocation |
| beside your object and then puts your object here. So the ref count is |
| externalized to your object being stored and owned by the shared pointer. |
| Chromium took this position to be doing intrusive ref counting. So because we |
| inherit from a known type, we stick the ref count in that base class, |
| `RefCounted` or `RefCountedThreadSafe`. And so that is enforced by the |
| compiler. You must inherit from one of these two in order to be stored and |
| owned in a `scoped_refptr`. What's the difference? `RefCounted` is the default |
| choice, but it's not thread safe. So the ref counting is cheap. It's the more |
| performant one, but if you have a `scoped_refptr` on two different threads |
| owning the same object, their ref counting will race, can be wrong, you can end |
| up with a double free - which is another way that pointers can go wrong, two |
| things freeing the same thing - or you could end up with potentially not |
| freeing it at all, probably. I guess I've never checked if that's possible. But |
| they can race, and then bad things happen. Whereas, `RefCountedThreadSafe` |
| gives you atomic ref counting. So atomic means that across all threads, they're |
| all going to have the same view of the state. And so it can be used across |
| threads and be owned across threads. And the tricky part there is the last |
| thread that owns that object is where it's going to be destroyed. So if your |
| object's destructor does things that you expect to happen on a specific thread, |
| you have to be super careful that you synchronize which thread that last |
| reference is going away on, or it could explode in a really flaky way. |
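
To make "intrusive" ref counting concrete, here is a deliberately tiny sketch. `RefCountedSketch` and `RefPtrSketch` are hypothetical stand-ins written for this example; they are not Chromium's `base::RefCounted` or `scoped_refptr`, which do considerably more.

```cpp
#include <cassert>

// Hypothetical stand-in for base::RefCounted: the count lives inside the
// object itself (intrusive). A plain int is cheap but not thread safe; a
// thread-safe variant would use an atomic counter instead.
class RefCountedSketch {
 public:
  void AddRef() { ++ref_count_; }
  // Returns true when the last reference was just dropped.
  bool Release() { return --ref_count_ == 0; }
  int ref_count() const { return ref_count_; }

 protected:
  RefCountedSketch() = default;
  ~RefCountedSketch() = default;

 private:
  int ref_count_ = 0;
};

// Hypothetical stand-in for scoped_refptr<T>: copying shares ownership, and
// the object is deleted when the last RefPtrSketch pointing at it goes away.
template <typename T>
class RefPtrSketch {
 public:
  explicit RefPtrSketch(T* ptr) : ptr_(ptr) {
    if (ptr_) ptr_->AddRef();
  }
  RefPtrSketch(const RefPtrSketch& other) : RefPtrSketch(other.ptr_) {}
  RefPtrSketch& operator=(const RefPtrSketch&) = delete;  // Kept minimal.
  ~RefPtrSketch() {
    if (ptr_ && ptr_->Release()) delete ptr_;
  }
  T* operator->() const { return ptr_; }

 private:
  T* ptr_;
};

// T must inherit from RefCountedSketch, or the AddRef() call will not compile.
struct Texture : RefCountedSketch {
  ~Texture() { ++destroyed; }
  static inline int destroyed = 0;  // Counts destructor runs for the demo.
};

void RefCountExample() {
  Texture::destroyed = 0;
  {
    RefPtrSketch<Texture> first(new Texture);
    {
      RefPtrSketch<Texture> second = first;  // The copy bumps the count to 2.
      assert(first->ref_count() == 2);
    }
    assert(first->ref_count() == 1);
    assert(Texture::destroyed == 0);  // Still owned by `first`.
  }
  assert(Texture::destroyed == 1);  // Last reference gone, object deleted.
}
```

Note that the object is destroyed wherever the last reference happens to go away, which is the property that makes the cross-thread case tricky.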
| |
| 17:02 SHARON: This sounds useful in other ways. What are some kind of more |
| design things to consider, in terms of when a `scoped_refptr` is useful and |
| does help enforce things that you want to enforce, like relative lifetimes of |
| certain objects? |
| |
| 17:15 DANA: Generally, we recommend that you don't use ref counting if you can |
| help it. And that's because it's hard to understand when it's going to be |
| destroyed, like I kind of alluded to with the thread situation. Even in a |
| single thread situation, how do you know which one is the last reference? And |
| is this object going to outlive that other object? Maybe sometimes. It's not |
| super obvious. It's a little more clear with a `unique_ptr`, at least local to |
| where that `unique_ptr`'s destruction is. But there's usually no |
| `scoped_refptr` where you can say this is the last one, so I know it's gone |
| after this thing is gone. Maybe it is, maybe it's not, often. So it's a bit tricky. |
| However, there are scenarios when you truly want a bunch of things to have |
| access to a piece of data. And you want that data to go away when nobody needs |
| it anymore. And so that is your use case for a `scoped_refptr`. It is nicer |
| when the thing with shared ownership is not doing a lot of interesting |
| things, especially in its destructor because of the complexity that's involved |
| in shared ownership. But you're welcome to shoot yourself in the foot with this |
| one if you need to. |
| |
| 18:33 SHARON: We're hoping to help people not shoot themselves in the foot. So |
| use `scoped_refptr` carefully, is the lesson there. So you mentioned |
| `shared_ptr`. Is that something we see much of in Chrome, or is that something |
| that we generally try to avoid in terms of things from the standard library? |
| |
| 18:51 DANA: That is something that is banned in Chrome. And that's just |
| basically because we already have `scoped_refptr`, and we don't want two of the |
| same thing. There's been various times where people have brought up why do we |
| need to have both? Can we just use `shared_ptr` now? And nobody's ever done the |
| kind of analysis needed to make that kind of decision. And so we stay with what |
| we're at. |
| |
| 19:18 SHARON: If you want to do that, there's someone that'll tell you what to |
| do. So something that when I was using `scoped_refptr`, I came across that you |
| need a WeakPtrFactory to create such a pointer. So weak pointers and WeakPtr |
| factories are one of those things that you see a lot in Chrome and one of these |
| base things. So tell us a bit about weak pointers and their factories. |
| |
| 19:42 DANA: So WeakPtr and WeakPtrFactory have a bit of an interesting history. |
| Their major purpose is for asynchronous work. Chrome is basically a large |
| asynchronous machine, and what does that mean? It means that we break all of |
| the work of Chrome up into small pieces of work. And every time you've done a |
| piece, you go and say, OK, I'm done. And when the next piece is ready, run this |
| thing. And maybe that next thing is like a user input event, maybe that's a |
| reply from the network, whatever it might be. And there's just a ton of steps |
| in things that happen in Chrome. Like, a navigation has a request, a response, |
| maybe another request - some redirects, whatever. That's an example of tons of |
| smaller asynchronous tasks that all happen independently. So what goes wrong with |
| asynchronous tasks? You don't have a continuous stack frame. What does that |
| mean? So if you're just running some synchronous code, you make a variable, you |
| go off and you do some things, you come back. Your variable is still here, |
| right? You're in this stack frame and you can keep using it. You have |
| asynchronous tasks. You make a variable, you go and do some work, and you are |
| done your task. Boop, your stack's gone. You come back later, you're going to |
| continue. You don't have your variable anymore. So any state that you want to |
| keep across your various tasks has to be stored and what we call bound in with |
| that task. If that's a pointer, that's especially risky. So we talked earlier |
| about Use-After-Frees. Well, you can, I hope, imagine how easy it is to stick a |
| pointer into your state. This pointer is valid, I'm using it. I go away, I come |
| back when? I don't know, sometime in the future. And I'm going to go use this |
| pointer. Is it still around? I don't own it. I didn't use a `unique_ptr`. So |
| who owns it? How do they know that I have a task waiting to use it? Well, |
| unless we have some side channel communicating that, they don't. And how do I |
| know if they've destroyed it if we don't have some side channel communicating |
| that? I don't know. And so I'm just going to use this pointer and bad things |
| happen. Your bank account is gone. |
| |
| 22:06 SHARON: No! My bank account! |
| |
| 22:06 DANA: I know. So what's the side channel? The side channel that we have |
| is WeakPtr. So a WeakPtr and WeakPtrFactory provide this communication |
| mechanism where WeakPtrFactory watches an object, and when the object gets |
| destroyed, the WeakPtrFactory inside of it is destroyed. And that sets this |
| little bit that says, I'm gone. And then when your asynchronous task comes back |
| with its pointer, but it's a WeakPtr inside of it and tries to run, it can be |
| like, am I still here? If the WeakPtrFactory was destroyed, no, I'm not. And |
| then you have a choice of what to do at that point. Typically, we're like, |
| abandon ship. Don't do anything here. This whole task is aborted. But maybe you |
| do something more subtle. That's totally possible. |
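
The side channel can be sketched in a few lines. `WeakHandle` and `WeakHandleFactory` below are hypothetical stand-ins written for this example, not the real `base::WeakPtr` and `base::WeakPtrFactory`, and the task queue is just a vector of callbacks run on one thread.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for base::WeakPtr. The shared bool is the side
// channel: the factory flips it to false when the owner is destroyed.
template <typename T>
class WeakHandle {
 public:
  WeakHandle(T* ptr, std::shared_ptr<const bool> alive)
      : ptr_(ptr), alive_(std::move(alive)) {}
  // Returns the object if it still exists, or nullptr if it was destroyed.
  T* get() const { return *alive_ ? ptr_ : nullptr; }

 private:
  T* ptr_;
  std::shared_ptr<const bool> alive_;
};

// Hypothetical stand-in for base::WeakPtrFactory.
template <typename T>
class WeakHandleFactory {
 public:
  explicit WeakHandleFactory(T* owner) : owner_(owner) {}
  ~WeakHandleFactory() { *alive_ = false; }  // "I'm gone."
  WeakHandle<T> GetWeakHandle() const { return WeakHandle<T>(owner_, alive_); }

 private:
  T* owner_;
  std::shared_ptr<bool> alive_ = std::make_shared<bool>(true);
};

struct Downloader {
  int bytes_received = 0;
  // Declared last so it is destroyed first, before the other fields.
  WeakHandleFactory<Downloader> weak_factory{this};
};

// Queues a task bound to a weak handle, optionally destroys the owner, then
// runs the queue. Returns how many tasks actually touched the Downloader.
int RunTasks(bool destroy_before_running) {
  std::vector<std::function<void()>> task_queue;
  int tasks_run = 0;
  auto downloader = std::make_unique<Downloader>();

  WeakHandle<Downloader> weak = downloader->weak_factory.GetWeakHandle();
  assert(weak.get() == downloader.get());
  task_queue.push_back([weak, &tasks_run] {
    Downloader* self = weak.get();
    if (!self) return;  // The owner is gone: abandon the task.
    self->bytes_received += 100;
    ++tasks_run;
  });

  if (destroy_before_running) downloader.reset();
  for (auto& task : task_queue) task();
  return tasks_run;
}
```

With a native pointer bound into the task instead of the weak handle, the `destroy_before_running` case would be a Use-After-Free.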
| |
| 22:59 SHARON: I think the example I actually meant to say that uses a |
| WeakPtrFactory is a SafeRef, which is another base type. So tell us a bit about |
| SafeRefs. |
| |
| 23:13 DANA: WeakPtr is cool because of the side channel that you can examine. |
| So you can say are you still alive, dear object? And it can tell you, no, it's |
| gone. Or yeah, it's here. And then you can use it. The problem with this is |
| in places where you as the code author want to believe that this object is |
| actually always there, but you don't want a security bug if you're wrong. And |
| it doesn't mean that you're wrong now, even. Sometime later, someone can change |
| code, unrelated to where this is, where the ownership happens, and break you. |
| And maybe they don't know all the users of a given object and change its |
| lifetime in some subtle way, maybe not even realizing they are. Suddenly you're |
| eventually seeing security bugs. And so that's why native pointers can be |
| pretty scary. And so SafeRef is something we can use instead of a native |
| pointer to protect you against this type of bug. It's built on top of WeakPtr |
| and WeakPtrFactory. That's its relationship, but its purpose is not the same. |
| So what SafeRef does is it says - SafePtr? |
| |
| 24:31 SHARON: SafeRef. |
| |
| 24:31 DANA: SafeRef. |
| |
| 24:31 SHARON: I think there's also a safe pointer, but there - |
| |
| 24:38 DANA: We were going to add it. I'm not sure if it's there yet. But so two |
| differences between SafeRef and WeakPtr then, ref versus ptr, it can't be null. |
| So it's like a reference wrapper. But the other difference is you can't observe |
| whether the object is actually alive or not. So it has the side channel, but it |
| doesn't show it to you. Why would you want that? If the information is there |
| anyway, why wouldn't you want to expose it? And the reason is because you are |
| documenting that you as the author understand and expect that this pointer is |
| always valid at this time. It turns out it's not valid. What do you do? If it's |
| a WeakPtr, people tend to say, we don't know if it's valid. It's a WeakPtr. |
| Let's check. Am I valid? And if I'm not, return. And what does that result in? |
| It results in adding a branch to your code. You do that over, and over, and |
| over, and over, and static analysis, which is what we as humans have to do - |
| we're not running the program, we're reading the code - can't really tell what |
| will happen because there's so many things that could happen. We could exit |
| here, we could exit there, we could exit here. Who knows. And that makes it |
| increasingly hard to maintain and refactor the code. So SafeRef gives you the |
| option to say this is always going to be valid. You can't check it. So if it's |
| not valid, go fix that bug somewhere else. It should be valid here. |
| |
| 26:16 SHARON: So what kind of - |
| |
| 26:16 DANA: The assumptions are broken. |
| |
| 26:16 SHARON: So what kind of errors happen when that assumption is broken? Is |
| that a crash? Is that a DCHECK kind of thing? |
| |
| 26:22 DANA: For SafeRef and for WeakPtr, if you try to use it without checking |
| it, or write it incorrectly, they will crash. And crashing in this case means a |
| safe crash. It's not going to lead to a security bug. It's literally just |
| terminating the program. |
| |
| 26:41 SHARON: Does that also mean you get a sad tab as a user? Like when the |
| little sad file comes up? |
| |
| 26:47 DANA: Yep. It would. If you're in the render process, you take it down. |
| It's a sad tab. So that's not great. It's better than a security bug. Because |
| your options here are don't write bugs. Ideal. I love that idea, but we know |
| that bugs happen. Use a native pointer, security problem. Use a WeakPtr, that |
| makes sense if you want it to sometimes not be there. But if you want it to |
| always be there - because you have to make a choice now of what you're supposed |
| to do if it's not, and it makes the code very hard to understand. And you're |
| only going to find out it can't be there through a crash anyhow. Or use a |
| SafeRef. And it's going to just give you the option to crash. You're going to |
| figure out what's wrong and make it no longer do that. |
| |
| 27:38 SHARON: I think wanting to guarantee the lifetime of some other things |
| seems like a pretty common thing that you might come across. So I'm sure there |
| are many cases for many people to be adding SafeRefs to make their code a bit |
| safer, and also ensure that if something does go wrong, it's not leading to a |
| memory bug that could be exploited in who knows how long. Because we don't |
| always hear about those. If it crashes, and they can reliably crash, at least |
| you know it's there. You can fix it. If it's not, we're hoping that one of our |
| VRP vulnerability researchers finds it and reports it, but that doesn't always |
| happen. So if we can know about these things, that's good. So another new type |
| in base that people might have been seeing recently is a `raw_ptr` which is |
| maybe why earlier we were saying let's call them native pointers, not raw |
| pointers. Because the difference between `raw_ptr` and raw pointer, very easy |
| to mix those up. So why don't you tell us a bit about `raw_ptr`s? |
| |
| 28:40 DANA: So `raw_ptr` is really cool. It's a non-owning smart pointer. So |
| that's kind of like WeakPtr or SafeRef. These are also non-owning. And it's actually |
| very similar in inspiration to what WeakPtr is. So it has a side channel where |
| it can see if the thing it's pointing to is alive or gone. So for WeakPtr, it |
| talks to the WeakPtrFactory and says "am I deleted?" And for `raw_ptr`, what it |
| does is it keeps a reference count, kind of like `scoped_refptr`, but it's a |
| weak reference count. It's not owning. And it keeps this reference count in the |
| memory allocator. So Chrome has its own memory allocator for `new` and `delete` |
| called PartitionAlloc. And that lets us do some interesting stuff. And this is |
| one of them. And so what happens is as long as there is `raw_ptr` around, this |
| reference count is non-zero. So even if you go and you delete the object, the |
| allocator knows there is some pointer to it. It's still out there. And so it |
| doesn't free it. It holds it. And it poisons the memory, so that just means |
| it's going to write some bit pattern over it, so it's not really useful |
| anymore. It's basically re-initialized the memory. And so later, if you go and |
| use this `raw_ptr`, you get access to just dead memory. It's there, but it's |
| not useful anymore. You're not going to be able to create security bugs in the |
| same way. Because when we first started talking about a Use-After-Free - you |
| have your goat, you free it, a cow is there, and now your pointer is pointing |
| at the wrong thing - you can't do that because as long as there's this |
| `raw_ptr` to your goat, the goat can be gone, but nothing else is going to come |
| back here. It's still taken by that poisoned memory until all the `raw_ptr`s |
| are gone. So that's their job, to protect us from a Use-After-Free being |
| exploitable. It doesn't necessarily crash when you use it incorrectly, you just |
| get to use this bad memory inside of it. If you try to use it as a pointer, |
| then you're using a bad pointer, you're going to probably crash. But it's a |
| little bit different than a WeakPtr, which is going to deterministically crash |
| as soon as you try to use it when it's gone. It's really just a protection or a |
| mitigation against security exploits through Use-After-Free. And then we |
| recently just added `raw_ref`, which is really the same as `raw_ptr`, except |
| addressing nullability. So smart pointers in C++ have historically all allowed |
| a null state. That's representative of what native pointers did in C and C++. |
| And so this is kind of just bringing this along in this obvious, historical |
| way. But if you look at other languages that have been able to break with |
| history and make their own choices kind of fresh, we see that they make choices |
| like not having null pointers, not having null smart pointers. And that |
| increases the readability and the understanding of your code greatly. So just |
| like for WeakPtr, how we said, we just check if it's there or not. And if it's |
| not, we return, and so on. It's every time you have a WeakPtr, if you were |
| thinking of a timeline, every time you touch a WeakPtr, your timeline splits. |
| And so you get this exponential timeline of possible states that your |
| software's in. That's really intense. Whereas every time you can not do that, |
| say this can't be null, so instead of WeakPtr, you're using SafeRef. This can't |
| be not here or null, actually - WeakPtr can just be straight up null - this is |
| always present. Then you don't have a split in your timeline, and that makes it |
| a lot easier to understand what your software is doing. And so for `raw_ptr`, |
| it followed this historical precedent. It lets you have a null value inside of |
| it. And `raw_ref` is our kind of modern answer to this new take on nullability. |
| And so `raw_ref` is a reference wrapper, meaning it holds a reference inside of |
| it, conceptually, meaning it just can't be null. That is just basically - it's |
| a pointer, but it can't be null. |
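| |
| As an illustration (not from the talk), the "timeline split" can be sketched |
| in plain C++. This minimal example uses a native pointer and |
| `std::reference_wrapper` as stand-ins for `raw_ptr` and `raw_ref`, so that it |
| compiles without Chromium's headers. `Goat`, `DescribeNullable` and |
| `DescribeNonNull` are hypothetical names chosen for this sketch. |

```cpp
#include <functional>
#include <string>

struct Goat {
  std::string name;
};

// Nullable: a stand-in for a `raw_ptr<Goat>`. Every use has to consider the
// "nothing here" state, so the reader's timeline splits into two branches.
std::string DescribeNullable(const Goat* goat) {
  if (!goat) {
    return "no goat";
  }
  return goat->name;
}

// Non-null: a stand-in for a `raw_ref<Goat>`. There is only one state, so
// there is no extra branch to write or to reason about.
std::string DescribeNonNull(std::reference_wrapper<const Goat> goat) {
  return goat.get().name;
}
```

| Calling `DescribeNullable(nullptr)` takes the extra branch, while |
| `DescribeNonNull` has no such case to handle. |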
| |
| 33:24 SHARON: So these do sound the most straightforward to use. So basically, |
| if you're not sure - for your class members at least - any time you would use a |
| native pointer or an ampersand, basically you should always just put those in |
| either a `raw_ptr` or a `raw_ref`, right? |
| |
| 33:45 DANA: Yeah, that's what our style guide recommends, with one nuance. So |
| because `raw_ptr` and `raw_ref` interact with the memory allocator, they have |
| the ability to be like, turned on or off dynamically at runtime. And there's a |
| performance hit on keeping this reference count around. And so at the moment, |
| they are not turned on in the renderer process because it's a really |
| performance-critical place. And the impact of security bugs there is a little |
| less than in the browser process, where you just immediately get access to the |
| whole system. And so we're working on turning it on there. But if you're |
| writing code that's only in the renderer process, then there's no point to use |
| it. And we don't recommend that you use it. But the default rule is yes. Don't |
| use a native pointer, don't use a native reference. As a field to an object, |
| use a `raw_ptr`, use a `raw_ref`. Prefer `raw_ref` - prefer something with |
| fewer states, always, because you get fewer branches in your timeline. And |
| then you can make it `const` if you don't want it to be able to be rebound to |
| a new object, if you don't want the pointer to change. Or you can make it |
| mutable if you want it to be able to. |
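| |
| A minimal sketch of the field rule described here (not from the talk), under |
| the same assumption that `std::reference_wrapper` stands in for `raw_ref` so |
| the code builds without Chromium's headers. `Widget` and `Observer` are |
| hypothetical names. |

```cpp
#include <functional>

struct Widget {
  int id;
};

class Observer {
 public:
  Observer(Widget& fixed, Widget& current)
      : fixed_(fixed), current_(current) {}

  // Rebinds the non-const field to a different object.
  void Retarget(Widget& next) { current_ = next; }

  int fixed_id() const { return fixed_.get().id; }
  int current_id() const { return current_.get().id; }

 private:
  // In Chromium this would be a `const raw_ref<Widget>`: it can never be
  // rebound after construction.
  const std::reference_wrapper<Widget> fixed_;
  // In Chromium this would be a `raw_ref<Widget>`: non-null but rebindable.
  std::reference_wrapper<Widget> current_;
};
```

| Both fields always refer to some `Widget`; only `current_` can be pointed at |
| a different one later. |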
| |
| 34:58 SHARON: So you did mention that these types are ref counted, but earlier |
| you said that you should avoid ref counting things. So - |
| |
| 35:04 DANA: Yes. |
| |
| 35:11 SHARON: So what's the balance there? Is it because with a |
| `scoped_refptr`, you're a bit more involved in the ref counting, or is it just, |
| we've done it for you, you can use it, this is OK? |
| |
| 35:19 DANA: No, this is a really good question. Thank you for asking that. So |
| there's two kinds of ref counts going on here. I tried to kind of allude to it, |
| but it's great to make it clear. So `scoped_refptr` is a strong ref count, |
| meaning the ref count owns the object. So the destructor runs, the object is |
| gone and deleted when that ref count goes to 0. `raw_ref` and `raw_ptr` are a |
| weak ref count. They could be pointing to something owned in a |
| `scoped_refptr` even. So they can exist at the same time. You can have both |
| kind of ref counts going at the same time. A weak ref count, in this case, is |
| holding the memory alive so that it doesn't get re-used. But it's not keeping |
| the object in that memory alive. And so from a programming state point-of-view, |
| the weak refs don't matter. They're helping protect you from security bugs. |
| When things go wrong, when a bug happens, they're helping to make it less |
| impactful. But they don't change your program in a visible way. Whereas, strong |
| references do. That destructor's timing is based on when the ref count goes to 0 |
| for a strong reference. So that's the difference between these two. |
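| |
| The strong versus weak distinction can be approximated with the standard |
| library (an analogy only, not from the talk): `std::shared_ptr` plays the |
| strong count and `std::weak_ptr` the weak one. With `std::make_shared`, |
| implementations typically keep the allocation around until the last weak |
| reference is gone, even though the destructor has already run. The mechanism |
| behind `raw_ptr` is different; `Tracked`, `Outcome` and `DropStrongKeepWeak` |
| are hypothetical names. |

```cpp
#include <memory>

// Counts destructor runs through a caller-owned counter.
struct Tracked {
  explicit Tracked(int* destroyed) : destroyed_(destroyed) {}
  ~Tracked() { ++*destroyed_; }
  int* destroyed_;
};

struct Outcome {
  int destroyed_while_strong;  // Destructor runs while a strong ref exists.
  int destroyed_after_strong;  // Destructor runs after the strong ref is gone.
  bool weak_expired;           // Whether the weak ref sees the object as gone.
};

Outcome DropStrongKeepWeak() {
  int destroyed = 0;
  auto strong = std::make_shared<Tracked>(&destroyed);
  std::weak_ptr<Tracked> weak = strong;

  Outcome out{};
  out.destroyed_while_strong = destroyed;

  // Dropping the last strong reference runs the destructor immediately,
  // regardless of the weak reference that is still alive.
  strong.reset();
  out.destroyed_after_strong = destroyed;
  out.weak_expired = weak.expired();
  return out;
}
```

| The weak reference never delays the destructor; only the strong count |
| decides when the object dies. |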
| |
| 36:46 SHARON: So when you say don't use ref counting, you mean don't use strong |
| ref counting. |
| |
| 36:46 DANA: I do, yes. |
| |
| 36:51 SHARON: And if you want to learn more about the raw pointer, `raw_ptr`, |
| `raw_ref`, that's all part of the [MiraclePtr] project, and there's a talk about |
| that from BlinkOn. I'll link that below also. So in terms of other base types, |
| there's a new one that's called `base::expected`. I haven't even really seen |
| this around. So can you tell us a bit more about how we use that, and what |
| that's for? |
| |
| 37:09 DANA: `base::expected` is a backport from C++23, I want to say. So the |
| proposal for `base::expected` actually cites a Rust type as inspiration, which |
| is called `std::result::Result` in Rust. And it's a lot like `optional`, so |
| it's used for return values. And it's more or less kind of a replacement for |
| exceptions. |
| So Chrome doesn't compile with exceptions enabled even, so we've never relied |
| on exceptions to report errors. But we have to do complicated things, like with |
| `optional` to return a bool or an enum. And then maybe some value. And so this |
| kind of compresses all that down into a single type, but it's got more state |
| than just an option. So `expected` gives you two choices. It either returns |
| your value, like `optional` can, or it returns an error. And so that's the |
| difference between `optional` and `expected`. You can give a full error type. |
| And so this is really useful when you want to give more context on what went |
| wrong, or why you're not returning the value. So it makes a lot of sense in |
| stuff like file IO. So you're opening a file, and it can fail for various |
| reasons, like I don't have permission, it doesn't exist, whatever. And so in |
| that case, the way you would express that in a modern way would be to return |
| `base::expected` of your file handle or file class. And as an error, some |
| enumerator, perhaps, or even an object that has additional state beyond just I |
| couldn't open the file. But maybe a string about why you couldn't open the file |
| or something like this. And so it gives you a way to return a structured error |
| result. |
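| |
| A rough sketch of the file-opening example (not from the talk). To stay |
| self-contained it uses `std::variant` as a stand-in for `base::expected`, and |
| an in-memory map as a fake file system. `OpenError`, `FileHandle`, |
| `FakeEntry`, `OpenFile` and `Describe` are all hypothetical names, not |
| Chromium APIs. |

```cpp
#include <map>
#include <string>
#include <variant>

enum class OpenError { kNotFound, kPermissionDenied };

struct FileHandle {
  std::string contents;
};

// Stand-in for `base::expected<FileHandle, OpenError>`: holds either the
// value or a structured error, never both.
using OpenResult = std::variant<FileHandle, OpenError>;

// One entry in the fake, in-memory file system used by this sketch.
struct FakeEntry {
  std::string contents;
  bool readable;
};

OpenResult OpenFile(const std::map<std::string, FakeEntry>& fs,
                    const std::string& path) {
  auto it = fs.find(path);
  if (it == fs.end()) {
    return OpenError::kNotFound;
  }
  if (!it->second.readable) {
    return OpenError::kPermissionDenied;
  }
  return FileHandle{it->second.contents};
}

// The caller inspects which alternative is present and reacts to the
// specific error instead of a bare "it failed".
std::string Describe(const OpenResult& result) {
  if (const auto* file = std::get_if<FileHandle>(&result)) {
    return "ok: " + file->contents;
  }
  switch (std::get<OpenError>(result)) {
    case OpenError::kNotFound:
      return "error: not found";
    case OpenError::kPermissionDenied:
      return "error: permission denied";
  }
  return "error: unknown";
}
```

| Compared with returning an `optional`, the caller learns why there is no |
| value, and the error type could carry more state such as a message string. |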
| |
| 39:05 SHARON: That sounds useful in lots of cases. So all of these types are |
| making up for basically what is lacking in C++, which is memory safety. C++, it |
| does a lot. It's been around for a long time. Most of Chrome is written in it. |
| But there are all these memory issues. And a lot of our security bugs are a |
| result of this. So you are working on bringing Rust to Chromium. Why is that a |
| good next step? Why does that solve these problems we're currently facing? |
| |
| 39:33 DANA: So Rust has some very cool properties to it. Its first property |
| that is really important to this conversation is the way that it handles |
| pointers, which in Rust would be treated pretty much exclusively as references. |
| And what Rust does is it requires you to tell the compiler the relationships |
| between the lifetimes of your references. And the outcome of this additional |
| knowledge to the compiler is memory safety. And so what does that mean? It |
| means that you can't write a Use-After-Free bug in Rust unless you're going |
| into the unsafe part of the language, which is where scariness exists. But you |
| don't need to go there to write a normal program. So we'll ignore it. And so |
| what that means is you can't write the bug. And so that doesn't just mean I |
| also like to believe I can write C++ without a bug. That's not true. But I |
| would love to believe that. But it means that later, when I come back and |
| refactor my code, or someone comes who's never seen this before and fixes some |
| random bug somewhere related to it, they can't introduce a Use-After-Free |
| either. Because if they do, the compiler is like, hey - it's going to outlive |
| it. You can't use it. Sorry. And so there's this whole class of bugs that you |
| never have to debug, you never ship, they never affect users. And so this is a |
| really nice promise, really appealing for a piece of software like Chrome, |
| where our basic purpose is to handle arbitrary and adversarial data. You want |
| to be able to go on some web page, maybe it's hostile, maybe not. You just get |
| a link. You want to be able to click that link and trust that even if it's |
| really hostile and wanting to destroy you, it can't. Chrome is that safety net |
| for you. And so Rust is that kind of safety net for our code, to say no matter |
| how you change it over time, it's got your back. You can't introduce this kind |
| of bug. |
| |
| 42:03 SHARON: So this Rust project sounds really cool. If people want to learn |
| more or get involved - if you're into the whole languages, memory safety kind |
| of thing - where can people go to learn more? |
| |
| 42:09 DANA: So if you're interested in helping out with our Rust experiment, |
| then you can look for us in the Rust channel on Slack. If you're interested in |
| C++ language stuff, you can find us in the CXX channel on Slack, as well. As |
| well as the [cxx@chromium.org] mailing list. And there is, of course, the |
| [rust-dev@chromium.org] mailing list if you want to use email to reach us as |
| well. |
| |
| 42:44 SHARON: Thank you very much, Dana. There will be notes from all of this |
| also linked in the description box. And thank you very much for this first |
| episode. |
| |
| 42:52 DANA: Thanks, Sharon. This was fun. |
| |
| [Life of a Vulnerability]: https://www.youtube.com/watch?v=HAJAEQrPUN0 |
| [MiraclePtr]: https://www.youtube.com/watch?v=WhI1NWbGvpE |
| [cxx@chromium.org]: https://groups.google.com/a/chromium.org/g/cxx |
| [rust-dev@chromium.org]: https://groups.google.com/a/chromium.org/g/rust-dev |