
Pybites Podcast
The Pybites Podcast is a podcast about Python Development, Career and Mindset skills.
Hosted by the Co-Founders, Bob Belderbos and Julian Sequeira, this podcast is for anyone interested in Python and looking for tips, tricks and concepts related to Career + Mindset.
For more information on Pybites, visit us at https://pybit.es and connect with us on LinkedIn:
Julian: https://www.linkedin.com/in/juliansequeira/
Bob: https://www.linkedin.com/in/bbelderbos/
#082 - Annotate all the things! Why you should care about Python type hints ...
This week we have Will Frey on the podcast: ML engineer, Python "knowledge dictionary" and type hints fan & geek.
We talk about his background, how he learns / keeps up with Python's fast moving ecosystem and of course we look at Python's type hints in-depth: why care and some of his favorite tricks.
We hope you enjoy this episode.
Links:
- typing docs
- mypy docs
- PEP 484 - Type Hints
- PEP 483 - Theory of Type Hints
- PEP 526 - Syntax for Variable Annotations
- PEP 544 - Protocols: Structural subtyping (static duck typing)
- PEP 561 - Distributing and Packaging Type Information
- typing notes (unmentioned, but useful)
- grep.app
(We told you, he lives and breathes this stuff haha)
Correctness with mypy helps you from a testing perspective because it allows you to relax. Without having to overly specify isinstance and check things in your Python code, you can proceed Pythonically and just try things. But you've said what you expect to be given and what you expect to return, and then mypy can hopefully catch those for you and take care of a huge part of all of the things you need to test. It takes care of a huge chunk in Python that was previously very hard to test, I feel like, because you didn't have a compiler to yell at you that something was the wrong type or that it doesn't know what it is.
Hello and welcome to the Pybites Podcast, where we talk about Python, career and mindset. We're your hosts. I'm Julian Sequeira. And I am Bob Belderbos. If you're looking to improve your Python, your career, and learn the mindset for success, this is the podcast for you. Let's get started.
Welcome back, everybody, to the Pybites Podcast. I'm Bob Belderbos and I'm here with Will Frey. Hey, Will, nice to have you on the podcast. Welcome.
I'm honored to be here. Thanks for having me.
Yeah. How are you doing today?
Doing well. It is heating up, as it has been the past couple of weeks, but otherwise things are good. It's Monday.
Yeah, it is a Monday, right? Start of the week. Strong. Okay. So, yeah, I have you on the podcast because you've been with Pybites for quite a while in the community and we really value your input. You're quite a type hints fan, and I wanted to pick your brain on that today and some more things. So to start it off, maybe you can introduce yourself to the audience. What's your background and how do you use Python day to day?
Sure. So I'm a machine learning engineer, whatever that means. These days it can mean a lot of things, but I enjoy that part of it.
I've been using Python really since I dabbled with it in grad school, used it towards the tail end of my first job, and then really dove in, I'd say, starting around 2017. And I've just really come to love the language and I've loved where it's been going with the more recent 3.9, 3.10 and 3.11 versions, with pattern matching, things like that. Day to day with Python, it's tons of stuff. It's machine learning work. Sometimes it's data science-y work, doing a little bit of exploration. Sometimes it's scraping data, or setting up a web server to host a machine learning model. Really just a little bit of everything, I'd say, with the exception of front end. No front end work, but that's it.
Yeah. Cool. That's awesome. And you were with us in PDM, and one thing people often tell me is: Will knows a lot about Python. He's actually like a walking dictionary of facts and lesser-known features and really the depth of Python.
There's still so much that I get to learn, but thank you.
Yeah, that's kind of the irony, right? The more you know, the more you find that you don't know.
You learn that you don't know. Right, exactly.
But, yeah, indeed. So I was really curious about how you developed that body of knowledge and how you keep up. Because you said you started learning Python more in depth in 2017. That's not a relatively long period. So how did you develop that body of knowledge and how do you stay current? How do you learn?
I think the catalyst for me going "whoa" with Python was the first edition of Fluent Python, and I really like the second edition, too. I get a shout-out, or like, I'm in the acknowledgements, because I gave an edit. But that book really did a great job of basically working through.
If you go onto the python.org website and look at the docs, there's a very informative but dry treatment of the Python data model: what happens when you create class Something:, or class Something(a bunch of parent classes), __new__, and everything that goes on. You can do so much. But Fluent Python really just opened my eyes to what you could do and how you could do it well, as opposed to just doing it right, and cleanly in Python. What motivated me also comes from a sense of, I don't want to say self-loathing, but in science and research, a lot of times the code is cobbled together because that's the only thing it needs to be. But I quickly started wondering: there has to be a better way. And I think the combination of that book plus that interest got me into digging into the depths of Python. And then from there, when data classes and typing came out with 3.6 (typing was provisional in 3.6), that's where I went from, like, what is this thing? Whenever I'm working with a third-party package, the standard library, things like that, the typing just... I know it was controversial at the time. Some people thought it was like an anti-feature and some people loved it. I loved it. It really helped me start to write down: what do I need? It makes you take a step back and think about the problem, but you also don't have a compiler that you have to appease. You can still say, I don't know what this is going to be yet, but I'm going to make progress, as opposed to having to fight that fight right then and there. And I think there's been progress made where, I think it's mypyc, if you have accurately annotated code, it will compile it down for you into C code and you get a lot more performance. I think you asked about how I keep up now.
I just end up reading a lot of code and trying to find examples. I have an idea of, hey, can I do it this way? Or have other people done this? Or how do other people use X, Y or Z? And I just try to find examples and graciously borrow from the community, and sort of vet my own ideas against what's already out there from places, authors, developers that I've come to appreciate and trust from afar.
Nice. So you follow certain people, or is it that when you have a feature you're actively searching for something similar? I remember you introduced us to grep.app.
It's a regular expression search, or you can use just plain text. And then there's another one, I have no affiliation with either, but there's Sourcegraph, which is like a startup. I think grep.app is just a guy or a couple of guys who run it because they're good people. But yeah, I'll go and search for code or try to find similar code using grep.app. I end up finding a lot of things in the PSF, the Python Software Foundation, which is where Black is now; Requests and I think a few other things have their home there. And the Python Packaging Authority, so PyPA. I think I've mentioned this to you before, but a lot of times it's side projects from Python core developers, like Łukasz Langa, he's a core developer. And I found the Black code base is great. Python core devs doing side projects, because then they're not beholden to just using the standard library. They can expand a little bit, but they still keep it very concise without being overly verbose.
Yeah, thanks. Those are some great resources. So, as we said before, you're a big fan of type hints. Can you tell us why you think they're important?
Yeah, they do a lot of things.
I'd say if you have accurate type hints that are flexible, like not too restrictive for what you're going to accept or expect, and then well defined in what you give back when you call a method or make an instance of a class, things like that, it helps anyone using your code. It is self-documenting, but it's not restricting. It doesn't have to be. Right there you're adding documentation, and you end up with a lot of additional tools that are made available to you: IDEs, text editors, even good old Vim like you like. There are plugins to get completions based off of the type hints, things like that. But for you, working on it yourself as the developer, it makes you stop and think about what do I really need? I think it was Brandon Rhodes who said in some talk that I can't remember, but I quite liked it: just because you can in Python doesn't mean you should. And a lot of times people writing Python take for granted, when they're writing a function and they accept an argument, that really, it's dynamic, you can do anything to it. It might crash at runtime, but whatever. And quickly you're like, okay, just add this method, just add this method, just make sure it has this attribute, and you end up with this Frankenstein of a thing, this instance of something that you need this function to accept. But if you type ahead of time, especially using interfaces or really shallow protocols, you make it very concise. And I just think you end up with a lot better code.
Yeah, I think you raise a good point, because Python leaves you pretty free, so it also lets you make a mess out of it. So getting a bit more strictness back with the type hints, which by the way, as you said, are an add-on, right? So they're not enforced unless you enforce them with a type checker, right?
Well, sort of.
There are libraries that are fantastic, and I'm sure we'll get to this later, that make strong use of, or are completely reliant upon, your type annotations at runtime: pydantic, and by extension FastAPI, Typer, things like that. So I think they've grown beyond their initial scope, and I mean, why not use them? You can do that sort of introspection at runtime and you don't have to repeat yourself with things.
Yes, I think it's good to get a bit of strictness back into what otherwise is a language that leaves you very free.
Right, exactly.
And I'm glad you bring up pydantic and FastAPI. As you know, we had Sebastian on the show, and those libraries are so cool. They heavily use that stuff. Pretty cool. All right, so what are some of your five favorite typing tricks and things you can do apart from annotating the input and output parameters?
Sure. Five, man, put me on the spot here. So I think number one, and for listeners, you can read about it if you go to the docs for the Python typing module and look up Protocol. It's a way to define lightweight protocols. Like if some type annotation just needs a couple of methods or it needs to have a couple of attributes. I think protocols are fantastic because, without having to get into a mess with importing something, or if you're relying on another library and their types are too heavy, you can define your own protocol. And I have not written any Go, but I guess that's very popular there. The flexibility of that is that you just have these really thin protocols to define your interface. It's structural subtyping, I think is what it's called. And that's great for documenting what you expect. And I think you can go down the rabbit hole with the variance of type variables, which I think is also really interesting. But protocols, just to say what you mean without having to get into the guts of it, are fantastic.
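To make the structural-subtyping idea above concrete, here is a minimal sketch. The names SupportsClose, TempResource and shutdown are made up for illustration; only typing.Protocol itself comes from the standard library (Python 3.8 and later).

```python
from typing import Protocol


class SupportsClose(Protocol):
    """Anything with a close() method satisfies this protocol."""

    def close(self) -> None: ...


class TempResource:
    # No inheritance from SupportsClose: the match is purely structural.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def shutdown(resource: SupportsClose) -> None:
    # A type checker accepts any object whose shape matches the protocol.
    resource.close()


res = TempResource()
shutdown(res)
```

The function only states the one method it needs, so callers are free to pass a file, a socket wrapper, or a test double without importing anything from this module.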
So that's one. That was a lot. I think making use... So the standard library has moved, as of 3.9, a lot of the interfaces that were defined in the typing module back to their original home, like collections.abc or contextlib, things like that. But I think making use of all the containers in collections.abc to annotate your functions is great, because you're accepting something and you think it needs to be a list. It doesn't really need to be a list, it just needs to be a mutable sequence, any mutable sequence. Or maybe it just needs to be a sequence, or maybe it just needs to be an iterable, or maybe it just needs to be a collection because you're checking if things are in it. And you can really make your code a lot more flexible by using what's in the standard library, which maybe isn't immediately obvious. You're going to reach for a dictionary or a list right away, but maybe that's a little too heavy-handed for what you expect. Geez. What else?
Do one more if you want.
Sure, I'll do one and a half. I like the explicit TypeAlias annotation now for when you're defining aliases for things. Like, just today, I have an ID for something and it's a string, but I really want to say it's an ID. It's too heavy-handed just to subclass str, make an ID and then call it that. And then there is NewType in typing that you can define. It'll be treated like its own distinct class, and it will yell at you if you're just passing a string. Literally, you'd have to cast it or wrap it in that. But if you just say, I don't know, ObjectId: TypeAlias = str, that's a nice way to explicitly say that it's a type alias. Now it's explicit, not implicit, what's a type alias and what's maybe an alias for another class where you're actually going to instantiate it. So that's great.
And I love the new union syntax, I believe it's 3.10 and later, where you can actually use a pipe in order to define unions. And the fun thing is with isinstance and issubclass checks. Say you're defining a union, maybe it can be a string or an int. You'd have it defined like str | int. And you can actually use that inside of an isinstance or issubclass check, as opposed to having to provide a tuple of those classes, as long as none of the definitions inside of that union are themselves generic. You couldn't have it be a string, then pipe, and then a dictionary with string keys or something like that. Then it would yell at you. But I think that's really nice because you're starting to see the type aliases and type definitions become more useful for narrowing types at runtime.
Nice.
Yeah, they're just making it a lot more ergonomic, which is great.
Nice. Yeah. The other example is dict and list and tuple, as of 3.9. You can use the built-ins, right?
Exactly. You don't need the typing imports anymore. Oh, and I think it's fussy right now, even in 3.10 with mypy, but in 3.11 they're introducing the Self type, because oftentimes if you have a fluent interface or you have some sort of alternative constructor that's defined as a class method and you annotate it, right now, if you subclass that, the annotation is going to say you're going to get an instance of the parent class, not necessarily the subclass. You can annotate it right now to make it generic, but it's verbose and kind of a pain. And they've introduced the Self type, which just allows you to say this is going to return an instance of self, like the type of the class that you're in, and subclasses will just handle it automatically. And that's quite nice. So I'm really glad they did that.
Nice. That's coming soon.
And we're going to provide some links in the description if people want to read more, because those are more advanced.
Just read the PEPs for typing, anything related to typing. I think they do a great job. And then go read the docs for the typing module. If you want to start getting your feet wet with this, that's a great place to start.
Cool. Yeah. And a related question is when you want to start using it, right? So you're completely new to it. Apart from adding the types in the code, in terms of type checking and tooling, what do you recommend?
mypy is the gold standard and I'd say start there. If you're using VS Code, it's very easy to get Pylance set up, which is based off Pyright, which is another type checker. Facebook has Pyre, Google has another one I'm not super familiar with. So just turn on something. But I recommend mypy if you can get it. So I'm going to speak specifically to mypy now. Make sure you have check-untyped-defs on, which is an option you can provide on the command line or in the config file. That way code that you haven't fully annotated, or that is completely unannotated, it will start checking for you. And then if you want to scare yourself you can turn on strict, but you can see what it's going to yell at you about and then just start working from there and start trying to learn. What is mypy telling you? Why is it telling you that? Think about it. I think mypy will give you hints. Okay, you expect a dictionary, but I think it'll suggest, how about you use a Mapping instead, so it's not mutable, because it's going to be a lot more flexible that way.
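As a rough illustration of what the check-untyped-defs option changes, consider the sketch below (parse_port and main are invented for the example). By default mypy does not check the body of main because the function has no annotations; running mypy with --check-untyped-defs, or setting check_untyped_defs = True under [mypy] in the config file, makes it report the wrong argument type.

```python
def parse_port(raw: str) -> int:
    return int(raw)


def main():
    # No annotations on main(), so mypy skips this body by default.
    # With check_untyped_defs enabled, the int passed to a str parameter is reported.
    return parse_port(8080)


result = main()  # runs fine at runtime, which is why the mistake is easy to miss
```

The code happens to work because int() accepts an int, so no test would fail; only the type checker points out that the call contradicts the declared signature.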
And yeah, just start trying to work through all of the mypy errors and then, as you feel more comfortable, slowly start ratcheting up the strictness, because you can see, when you run mypy with strict, all the options it turns on. So start making it a little stricter at a time. The advice mypy gives is to start gradually; they say that in their docs. And also I'd say don't start using it in a huge code base that you've already been working on. I mean, maybe if you can get a little corner of it that's your little playground, do it there, but start working on it with a new project. And also don't get discouraged when a lot of other packages maybe don't have great type hints or don't have type hints at all. You'll end up finding that you start running towards the ones that do. And then lastly, and I've alluded to this before, when you're starting to get comfortable with type hints and writing annotations, start playing a game with yourself. Say you're writing a method or a function, some def in a class or a standalone function, some callable. Start seeing what the weakest, not in a bad way, but what's the minimum amount of information that you need to accept to do what you need to do inside this method. Like you don't need a list, maybe you just need an iterable, things like that. See how flexible you can be and what the minimum is that you need. You can go crazy trying to make it outrageously flexible, but start trying to do that, and then start trying to figure out what's the strongest result you can give back. Don't just say you're going to give back an iterator.
Maybe you're going to give back a list, maybe a tuple, maybe a deque from the collections module, something like that. But the one thing to keep in mind there is that if you're going to return a list and you have other people relying on your code, and all of a sudden you change it so it's not a list, you may have unhappy users of your library, so proceed with caution. But I think as long as... go ahead.
Build in some flexibility there, right?
Right. Especially on the return types. I mean, try to make what you accept as flexible as possible. But do bear in mind that being very strict about what you're returning could, if you decide you need to change that later, cause some problems, but hopefully not. I think if you're sticking to returning things that are in the standard library, it should be okay.
So yeah, the mypy docs are accessible to start using it?
Absolutely. I think the mypy docs are great. I think between the mypy docs, the typing PEPs, and then the Python docs, that's a great place to start.
And soon, hopefully, a learning path as well. We're working on a typing learning path.
And I think if you want to start digging into code, in the python organization on GitHub there's typeshed, which ships with mypy, and I think all the other type checkers usually ship with it. They have files that are .pyi, which are Python interface files. You can write annotations inside of your Python file, inside the actual .py file, and that's what I do. Those would be called inline. But then also, for older libraries that you don't want to annotate inside of the code itself, for whatever reason, you can distribute these Python interface files. You can also distribute them side by side with your code. Just name them the same thing as the module they describe and they'll work out.
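The "accept the weakest type, return the strongest" game described above might look like this in practice. The function unique_sorted is a made-up example, not something from the episode.

```python
from collections.abc import Iterable


def unique_sorted(values: Iterable[int]) -> list[int]:
    # Accept the weakest useful type: anything we can loop over.
    # Promise the strongest useful type: callers get a real list back.
    return sorted(set(values))


from_generator = unique_sorted(n % 3 for n in range(10))
from_tuple = unique_sorted((5, 1, 5))
```

Callers can pass a generator, a tuple or a set, and they can index, slice or append to the result. The trade-off raised in the conversation still applies: once list[int] is promised, changing the return type later is a breaking change for users.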
But you can start reading those to see how they annotate things like the actual standard library and a number of the third-party packages that they've added to typeshed. And I think that's another good place to start getting your feet wet with this stuff. And especially, the typeshed .pyi file has a bunch of those lightweight protocols that I mentioned before. They'll have one that's just SupportsKeysAndGetItem. A lot of times that's all you need. You don't even need a Mapping, you just need SupportsKeysAndGetItem and that's all you need.
Awesome. Do you manually run mypy, or do you run it as part of pre-commit, or how do you have that wired up?
So I have it turned on in my IDE. Given the way things are set up at my work, it's hard to run it as a CI job, things like that. I think they're starting to introduce more type checking at work, but I usually just run it locally right now because there are no checks for anything when you commit things at work. But I'd recommend having it running in the background. Sometimes I silence it because I just need to get work done, because I will gladly go down the rabbit hole trying to appease mypy, because it's interesting to do. So sometimes you can turn it off, but then I'll turn it back on as I'm working and just make sure things are good to go. But yeah, run it. You can run it as part of your tests, you can run it as part of your linting, you can put it in pre-commit, you can run it as a GitHub Actions job when you commit. Put it in lots of places, because I think correctness with mypy helps you from a testing perspective because it allows you to relax.
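The typeshed protocol mentioned above is only visible to type checkers, so the sketch below defines a look-alike by hand to show the idea. HasKeysAndGetItem, to_pairs and Env are hypothetical names, and the definition is an approximation written for this example rather than a copy of typeshed's.

```python
from collections.abc import Iterable
from typing import Protocol, TypeVar

K = TypeVar("K")
V_co = TypeVar("V_co", covariant=True)


class HasKeysAndGetItem(Protocol[K, V_co]):
    def keys(self) -> Iterable[K]: ...
    def __getitem__(self, key: K, /) -> V_co: ...


def to_pairs(source: HasKeysAndGetItem[K, V_co]) -> list[tuple[K, V_co]]:
    # Only keys() and item lookup are needed, not the full Mapping interface.
    return [(key, source[key]) for key in source.keys()]


class Env:
    """Not a Mapping: it has no __len__ or __iter__, just the two methods."""

    def keys(self) -> list[str]:
        return ["HOME", "USER"]

    def __getitem__(self, key: str) -> str:
        return key.lower()


pairs = to_pairs(Env())
```

A regular dict satisfies the same protocol, so to_pairs({"a": 1}) works too; the annotation simply stops demanding more than the function uses.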
Without having to overly specify isinstance and check things in your Python code, you can proceed Pythonically and just try things, but you've said what you expect to be given and what you expect to return, and then mypy can hopefully catch those for you and take care of a huge part of all of the things you need to test. It takes care of a huge chunk in Python that was previously very hard to test, I feel like, because you didn't have a compiler to yell at you that something was the wrong type or that it doesn't know what it is. So yeah, treat it as part of your testing, so that you can get feedback as you go.
That's a great point, because it actually leads to less code. For example, regarding input arguments, you don't really have to check in the body of the function anymore, like is it this or that, and raise ValueError. That can now be enforced by the typing, knowing that mypy is going to catch that for you.
Right. The one thing you can do with isinstance is type narrowing, which you can read about in the Python docs. Like you can say assert isinstance. Say you're given an object and it's literally annotated object, so you have the bare minimum to work with. You can use isinstance, like assert isinstance(obj, something). And if that passes, then mypy will know, or the type checkers will know, that now you've asserted that it's going to be that. You can also do that without an assert and isinstance, with the cast function inside of typing. The assert adds a little bit of overhead if you keep asserts on, but you can actually turn them off when you're running Python on the command line: python3, and then if you do a dash and then uppercase O, that turns on the first level of optimization, which will take out your asserts.
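Here is a short sketch of the narrowing options just described: an assert with isinstance, typing.cast, and, as an addition for comparison, an explicit check that does not depend on asserts being enabled. The function names are invented for the example.

```python
from typing import cast


def shout(obj: object) -> str:
    # After this assert, a type checker treats obj as str in the rest of the body.
    assert isinstance(obj, str)
    return obj.upper()


def shout_unchecked(obj: object) -> str:
    # cast() performs no runtime check; it only tells the type checker what to assume.
    text = cast(str, obj)
    return text.upper()


def shout_validated(obj: object) -> str:
    # An explicit check also narrows the type and survives `python -O`,
    # which strips assert statements.
    if not isinstance(obj, str):
        raise TypeError(f"expected str, got {type(obj).__name__}")
    return obj.upper()
```

All three satisfy the type checker. They differ in what happens at runtime when the assumption is wrong: the assert fails only when asserts are on, the cast never checks, and the explicit test always raises.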
So, tangential lesson: don't rely on catching assertion errors and things like that, because you can turn them off. That breaks a ton of stuff. A lot of times people aren't strict about that. But yeah, if you're using those isinstance checks and you're worried about runtime overhead, you can actually just run your Python without assertions.
Minus O, right? Or something like that.
Yeah, dash capital O. There's also an environment variable that you can specify, I believe, that does the same thing. So you don't have to sit there and do all this checking, because you've said what you mean. And it stays flexible because at runtime, if something satisfies it, okay, cool. That's the flexibility that you want. That's why you're using Python, because it's just that flexible. But you've said what you mean, and mypy will catch the errors, hopefully.
Cool, cool. So yeah, thanks for sharing all that. Very interesting. And I invite everybody to read up on those resources and start using it. Those are nice suggestions. Any final shout-out? We always wrap up with a win, a book or a shout-out, or all of them. Pick whatever you want.
I'm reading through the second edition of Fluent Python, which I think does a great job, because Python has grown up a lot, even since the first edition of Fluent Python. And it's dense. It's a big book, but it is great if you really want to get into how the language ticks and works and really see how far you can go with it. It's fantastic. That's a great book. He gets into some typing. Yeah. So that would be my recommendation.
Awesome. Yeah, it's 900 pages. I'm reading it on the Kindle. I need to get back to it, but I'm 22% in. But yeah, I like that it's from the inside out. He goes straight into the data model and not like, I'm going to bore you with five chapters of variable names.
It goes straight into the core of Python.
Right. And you can read it straight through. You can also, if you already have a decent amount of familiarity with Python, just jump right into chapters, and you might have to backtrack a little bit to reference something from a previous chapter, but you can just jump right in and use it almost like a reference book, or just read little chapters at a time, out of order, for what you're interested in. And I think it's just a great, great, great book. It's a masterpiece.
And he also covers typing.
He does, he does, yeah.
Awesome. Well, there's a lot more we can talk about. Endless. We can do a follow-up when I figure typing out a little bit more and I can ask you some more questions. But, yeah, thanks for sharing all that insight today. And thanks also for all your contributions to Pybites and PDM. I know people really enjoy when you chime in and share your Python knowledge.
Well, thank you for having me. This was great. I'm honored to be here and I'm glad people appreciate when I go "actually" during the code clinic sessions. So, yeah, this was fun. So thank you.
Yeah, thanks, Will.
Thanks, Bob.
We hope you enjoyed this episode. To hear more from us, go to pybit.es/friends, that is pybit.es forward slash friends, and receive a free gift just for being a friend of the show. And to join our thriving Slack community of Python programmers, go to pybit.es/community, that's pybit.es forward slash community. We hope to see you there and catch you in the next episode.