#131 - Write more maintainable Python code, avoid these 15 code smells

Pybites Podcast

The Pybites Podcast is a podcast about Python Development, Career and Mindset skills.

Hosted by the Co-Founders, Bob Belderbos and Julian Sequeira, this podcast is for anyone interested in Python and looking for tips, tricks and concepts related to Career + Mindset.

For more information on Pybites, visit us at https://pybit.es and connect with us on LinkedIn:

Julian: https://www.linkedin.com/in/juliansequeira/
Bob: https://www.linkedin.com/in/bbelderbos/

September 15, 2023 • Julian Sequeira & Bob Belderbos

This week we talk about code smells. 💡

Code smells are characteristics in the code that might indicate deeper issues or potential problems. While they're not necessarily bugs, they can be a sign of poor code quality or maintainability issues. 😱

We distilled 15 common smells ranging from generic programming to Python-specific issues. We hope it will make you more conscious of your own code as well as code you'll review. 🐍 💪

If you have any feedback, hit us up on:
- LinkedIn
- X
- Email
(Also for any podcast topic requests ...)

Mentioned Dictionary Dispatch Pattern video

And to write cleaner, more maintainable code, in the context of (complex) real world applications, check out our 1:1 coaching options.

Chapters:
00:00 Intro music
00:20 What are code smells?
01:11 1. Long functions or classes
01:46 2. Duplicated code
02:25 3. Data Clumps
03:13 4. Using the global space
03:52 5. Magic numbers
04:38 6. Primitive obsession
05:06 7. Overusing comments
06:23 8. Too deep nesting
07:36 9. Switch statement or long if-elif-elif-else chains
08:41 10. Too deep inheritance
09:45 11. Dead code
10:21 12. Misusing (nested) listcomps
11:03 13. Single letter variable names
12:03 14. Mutable Default Arguments
13:05 15. Error Silencing
14:04 Wrap up
14:56 Outro music

Thanks for tuning in as always 🙏 and next week we'll be back with a brand new episode ... 🎧

0:00

Hello, and welcome to the Pybites podcast, where we talk about Python, career, and mindset. We're your hosts. I'm Julian Sequeira. And I am Bob Belderbos. If you're looking to improve your Python, your career, and learn the mindset for success, this is the podcast for you. Let's get started.

Welcome back to the Pybites podcast. This is Bob Belderbos, and it's only me this week, and I want to talk about an important topic in software development, and that is code smells. What does it mean when your code smells? It kind of sounds weird, right? But then also we get very practical, and I look at 15 code smells.

So first of all, what is a code smell? Code smells are characteristics in the code that might indicate deeper issues or potential problems, but they're not necessarily bugs. They can be a sign of poor code quality or maintainability issues. So your code is probably working. It's just not ideal. So that's where you want to refactor and get rid of those code smells, so that in the longer term you have a healthier codebase and an easier time maintaining it.

All right, let's get practical and dive into the code smells. Number one, long functions or classes, right? We spoke about this many times at Pybites, but basically large units of code trying to do too many things, those are a problem, right? For example, you have a function that calculates interest, sends an email, and logs data. You really should split that into three separate functions, each handling one task. So functions should do one thing, and once you start to have functions or classes that do many, many things, that's a code smell.

Number two, duplicated code. So repeating the same logic in multiple places. This might seem pretty obvious, but little duplications tend to sneak in. For example, if you have some date formatting logic in various locations, it can actually be hard to spot. But every time you see duplication, centralize it.
Make sure you have one copy, because when you update the code and there are multiple copies, you're inevitably going to forget to update one or more of them. So in the date formatting logic example, just make a little helper, format_date, abstract the logic there, and have it in one place.

Number three, data clumps: repeated groups of variables, data that's often used together. For example, if you're passing street, city, and zip separately into functions, you're also increasing the number of parameters passed into each function. You can make an Address class, or a data class in Python, or a namedtuple, and pass that in, right? Then, instead of three arguments (and it could even be more), the function has one argument, which is the object, and you have attribute access to street, city, and zip on the object. So it reduces your interface, and you group related data together, which is more maintainable.

Number four, using the global space. Relying heavily on the global space can lead to unpredictable side effects, right? We have spoken about this before. Encapsulation is a good thing; it protects the scope. So for example, you have a mutable list globally at the module level, and various functions are mutating that list. Now these functions are causing a side effect in the outer scope. So it's much better to pass in the list as an argument, so it becomes scoped to the function and we're not mutating the outer global space, which prevents unintended side effects.

Number five, magic numbers. These are basically hard-coded values that lack context. For example, if you use 3.14 directly: instead of having that 3.14 hard-coded, it's very easy to put it in a variable, for which we often use a constant. So PI, uppercase, equals 3.14, and then use that constant PI.
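
A minimal sketch of two of the fixes described so far, the data clump and the magic number. The names Address, format_label, and circle_area are hypothetical and chosen only for illustration; the street/city/zip grouping and the PI constant are the parts that come from the episode.

```python
from dataclasses import dataclass

# Named constant instead of a bare 3.14 scattered through the code.
PI = 3.14


@dataclass
class Address:
    """Groups the street/city/zip clump into one object (hypothetical name)."""
    street: str
    city: str
    zip_code: str


def format_label(address: Address) -> str:
    # One parameter instead of three; fields are read via attribute access.
    return f"{address.street}, {address.city} {address.zip_code}"


def circle_area(radius: float) -> float:
    # Reads as "PI times radius squared" rather than an unexplained number.
    return PI * radius ** 2


home = Address(street="1 Main St", city="Springfield", zip_code="12345")
label = format_label(home)
area = circle_area(2)
```

For this particular value, math.pi from the standard library would be the more precise choice in real code; the sketch keeps 3.14 only to mirror the spoken example.
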
So instead of 3.14, you now read PI, and it's immediately clear from the variable name what that means, whereas with just a random number, somebody who reads your code has to guess what it means. So using more variables makes the code more readable. So replace magic numbers with variables, and often those are, as in this case, constants.

Number six, primitive obsession. This is over-relying on primitives instead of more expressive types. Rather than accepting any string for an email, validate it with a dedicated function or method in your class, or use a library that has validation embedded, like Pydantic for example. So instead of relying on just primitive strings, for example, be more strict with your types.

Number seven, overusing comments. It's a code smell to heavily rely on comments to explain what code does. Now, a bit of a contrived example here is if you have a statement like x = x + 1, and then an inline comment "increment x by one". That's of course totally redundant, because it merely restates the code. But you can have more examples where a whole block of code is preceded by a comment that might just describe what the Python does. So it's a redundant comment. Also, in this case of x = x + 1, you can give it a more meaningful name, like counter = counter + 1, right? The x variable is now counter, which is way more explicit, which also makes the comment redundant. But overall, comments can be a code smell because they might indicate that the code is not expressive enough. One exception for me is when complexity is necessary and you want to explain the rationale, why you did it. Then a comment can really help clarify that and can save you and your colleagues a lot of time when they refer back to that code. But overall, too many comments: code smell. You probably should refactor your code to make it more expressive.

Code smell number eight, too deep nesting. Deeply nested code means complexity.
We've all seen the notorious arrow shape, where code takes the form of an arrow going inwards and outwards. That's deeply nested, and it's often associated with a high cyclomatic complexity. And cyclomatic complexity basically means the number of branches in your code, right? Every if, for, else, it all adds different paths that the code can take. Hence it's more complex. You need to write more test cases. So if you see deeply nested code, and the Zen of Python of course says flat is better than nested, then it's time to refactor, break out more logic into helper functions, and reduce that often unnecessary nesting. More practically, for example, if you find multiple nested if statements, you can use guard clauses or sometimes reverse the conditionals to have the if statements return early, and then after the early return, the code can be dedented. And if you do that a couple of times, then you can easily gain three levels of nesting back. Your code is much flatter. It's more readable and more maintainable.

Code smell number nine, a switch statement or long if-elif-elif-else chain. Last week we spoke about this on our YouTube channel. So if you have a long chain like that, it becomes harder to maintain. Like, you have to keep updating those conditionals. What you can do is use the dictionary dispatch pattern. For example, if you have user roles like admin, user, guest, and a whole bunch of roles, and they have different functions associated with them, you can actually build up a dictionary where the keys are those role strings and the values are those callables or functions. And then you can just look up the role key in the dictionary, get the callable, and then call the callable, right? Run the function. Having that all in your dictionary makes it easier to read and easier to maintain. I will link the YouTube video below so you can see that in action.
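
The dictionary dispatch idea described above can be sketched roughly as follows. The role names come from the episode; the handler functions, ROLE_HANDLERS, and handle_role are hypothetical names, and the linked video may structure it differently.

```python
def show_admin_dashboard() -> str:
    return "admin dashboard"


def show_user_home() -> str:
    return "user home"


def show_guest_landing() -> str:
    return "guest landing"


# Keys are role strings, values are the callables to run for that role.
ROLE_HANDLERS = {
    "admin": show_admin_dashboard,
    "user": show_user_home,
    "guest": show_guest_landing,
}


def handle_role(role: str) -> str:
    # Look up the callable for the role, then call it.
    handler = ROLE_HANDLERS.get(role)
    if handler is None:
        # Fail loudly on an unknown role instead of silently doing nothing.
        raise ValueError(f"unknown role: {role!r}")
    return handler()
```

Adding a role then means adding one dictionary entry, rather than another elif branch.
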
But basically, long if-elif-elif-else chains are a code smell.

Code smell number ten, too deep inheritance. So overly deep or complex inheritance structures are a code smell. And personally I've struggled with this in Django, where the class-based views start off simple and are nice for CRUD apps. But once you start to do more complex things, you're wondering: what method should I override? You bring in mixins, and it's becoming quite complex, and I think it's a code smell. So then when I use function-based views, I have way more clarity on what I'm actually doing. It's way more transparent. Another example: if you have a bunch of animal classes and you have a Bird that inherits from Animal, a Penguin that inherits from Bird, and a Rockhopper that inherits from Penguin, again, that structure gets pretty complex. So you should consider composition. So maybe a Penguin has a species attribute instead of further subclassing. So again, deep inheritance trees: code smell. There are ways to simplify that in Python and in coding overall.

Code smell eleven, dead code. I think this is a pretty obvious one, but let's talk about it. We have unused segments of code that clutter the codebase, so it could be old unused code or commented-out sections. Just remove them. If it's not used, it has no business still sticking around in your codebase. What I often see, for example, is a whole bunch of code commented out, but actually you can remove it, because you still have it in version control, right? So you still have it. If you write a meaningful commit message, you should be able to retrieve that code. It's just causing clutter if you leave it around.

Code smell number twelve, misusing nested list comprehensions. We love list comprehensions. I think it's one of my favorite features in Python, but I only use them for a single level, a single for loop. Once you start to do multiple for loops in a list comprehension, you basically put too much logic on one line or in one statement, and it becomes unreadable.
For example, if you have a list comprehension that reads like [x + y for x in range(3) for y in range(3)], you're doing a nested loop inside a list comprehension. That to me is a code smell. It's too complex, and you're probably better off refactoring that using classic for loops.

Code smell number 13, single-letter variable names. Names that lack context make code harder to read. We've all seen i, j, k. Like, i can work in a tight for loop, but overall avoid single-letter variable names. Even in a loop, if you do "for i in l", well, you know that i is the loop variable, but what is l? Right? So "for item in items", that reads very nicely. That's English, and it doesn't cost you much more effort to use variable names that are slightly longer. Another thing with single-character variable names is that if you drop into the debugger, they often start to clash with debugger commands, so it makes your code harder to debug, but there are ways around it. The principal concern here is that your code becomes harder to read. Again, you have to be expressive in your code.

Code smell number 14. And by the way, there are many more code smells. I just distilled it down to 15, so two more left. Number 14 is mutable default arguments. Default arguments in Python persist across function calls, so that can lead to unexpected behavior. So if you see a function definition like add_item, where it takes an item and it takes a list argument equal to square brackets, meaning an empty list, then that's going to be a problem, because that list will now grow between function calls, right? So if you want to use a default value, use None, and then in the body of the function, check against None and set an empty list inside the function, so it's properly scoped and there won't be side effects when you call the function multiple times.
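
A small sketch of the mutable default argument problem and the None fix just described. The add_item name comes from the episode; add_item_buggy and the sample values are made up for the comparison.

```python
from typing import Optional


def add_item_buggy(item, items=[]):
    # Smell: the default list is created once, when the function is defined,
    # and the same object is reused on every call that omits `items`.
    items.append(item)
    return items


def add_item(item, items: Optional[list] = None) -> list:
    # None is the default; a fresh list is created inside each call.
    if items is None:
        items = []
    items.append(item)
    return items


first_buggy = add_item_buggy("a")
second_buggy = add_item_buggy("b")  # same list object as first_buggy

first = add_item("a")
second = add_item("b")  # independent of first
```

After these calls, the buggy version has accumulated both items in one shared list, while each call to add_item returns its own one-element list.
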
So if you ever see a list equal to an empty list, an empty dictionary, or any mutable object as a default argument, refactor that to the singleton None.

Code smell number 15, error silencing. The Zen of Python says errors should never pass silently, and it makes a great point, because silencing them can lead to insidious bugs and long and obscure debugging sessions. So broadly, catching and ignoring exceptions without handling them is a code smell. The most infamous one is try/except/pass. Now any exception that might occur will be silenced, even if you try to break the program with a keyboard interrupt. So instead of try/except/pass, do try, except, and then the name of the exception, and then handle the exception, for example by logging it, right? So if you do except ValueError, then you can do a logging.error with a log message. You can even do a logging.exception to log the whole stack trace. But don't let errors pass silently, so you don't cause bigger issues down the line.

Alright, that's it. 15 code smells, with a good mix of general programming concepts as well as Python-specific ones. I hope you liked this. If you want to see more of that, feel free to reach out to me, Bob at Pybites. Also follow our YouTube channel, which I will link below, for more videos on Python refactoring. We posted a lot of good stuff there. I hope this makes you more aware of your own code and the code you might be reviewing. And again, if you have any feedback, hit me up through the channels; I'm happy to discuss. I'm really passionate about not only Python, but also the concept of clean code and how to keep code maintainable. Thanks for tuning in to our Pybites podcast, as always, and next week we shift gears a little bit and dedicate an episode to productivity.

Until then, we hope you enjoyed this episode. To hear more from us, go to pybit.es/friends, that is pybit dot es forward slash friends, and receive a free gift just for being a friend of the show. And to join our thriving Slack community of Python programmers, go to pybit.es/community.
That's pybit dot es forward slash community. We hope to see you there and catch you in the next episode.
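
Going back to smell number 15, a minimal sketch of the difference between try/except/pass and catching a named exception and logging it. The parse_age functions are hypothetical examples; logging.getLogger and Logger.exception are standard library calls.

```python
import logging

logger = logging.getLogger(__name__)


def parse_age_silently(raw: str):
    # Smell: a bare except swallows every error, including KeyboardInterrupt,
    # and the caller never learns that anything went wrong.
    try:
        return int(raw)
    except:
        pass


def parse_age(raw: str):
    try:
        return int(raw)
    except ValueError:
        # Catch only the expected exception; logger.exception records the
        # message together with the stack trace.
        logger.exception("could not parse age from %r", raw)
        return None
```

Both versions return None for bad input, but only the second leaves a trace in the logs and lets unrelated exceptions propagate.
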