
Pybites Podcast
The Pybites Podcast is a podcast about Python Development, Career and Mindset skills.
Hosted by the Co-Founders, Bob Belderbos and Julian Sequeira, this podcast is for anyone interested in Python and looking for tips, tricks and concepts related to Career + Mindset.
For more information on Pybites, visit us at https://pybit.es and connect with us on LinkedIn:
Julian: https://www.linkedin.com/in/juliansequeira/
Bob: https://www.linkedin.com/in/bbelderbos/
#122 - Using Python (and FastAPI) to support PFAS research
In this podcast episode, Robert Young, the director of an analytical chemistry lab at New Mexico State University, shares his unique journey from being a lawyer to becoming a chemist and a Python programmer.
He explains how his passion for environmental causes led him to study chemical analysis and mass spectrometry, initially focusing on the breakdown of endocrine disrupting chemicals in the environment.
Robert discusses the challenges of analyzing complex data sets with thousands of molecules and how he learned to use Python and FastAPI to make the analysis more efficient.
He also introduces his current project, studying Per- and Polyfluoroalkyl Substances (PFAS), also known as Forever Chemicals, which do not degrade easily and have adverse health effects.
Robert's goal was to develop an app using FastAPI and SQLModel that allows non-programmers to explore PFAS data and filter molecules based on specific criteria.
He achieved that goal with us in our Pybites Developer Mindset (PDM) program, in which he brought the app to MVP status, enhanced his coding skills, and found a supportive community.
He mentions the guidance he received from his coach in architectural design, project planning, and best practices for developer collaboration.
Robert plans to deploy his app soon and hopes to involve more contributors in the future.
Last but not least, Robert's project showcases the power of Python, FastAPI, and PDM in solving real-world scientific challenges (Forever Chemicals!) and making data analysis accessible to a broader audience.
Enjoy our interview with Robert Young!
Chapters:
00:00 Intro snippet and music
00:42 Guest and episode intro
01:20 Introducing Robert Young
04:08 Current research field
08:12 PFAS or "forever chemicals"
09:05 The effects of PFAS
12:00 PDM journey and PFAS project
16:36 FastAPI + SQLModel
19:44 Other wins and takeaways from project + PDM
23:24 Tutorial paralysis / Pybites approach
24:50 Using same approach for new tech / next steps app
28:28 How to reach out to Robert
30:00 Book: Manufacturing Consensus
32:00 How do we get good information (social media)
35:00 Thanks for joining us today
35:33 Outro music
Links:
- Reach out to Robert via email
- PFAS (Per- and Polyfluoroalkyl Substances)
- EPA website
- ECHA website
- Australian government website
- SERDP podcast (sponsors of Robert's PFAS research)
- Book mentioned: Manufacturing Consensus
- The PDM program
So for that, I felt like Python will be great. I wanted to use FastAPI to create an API that would allow people to search a PFAS database and match against their own data, or just explore a PFAS database.
Hello, and welcome to the Pybites Podcast, where we talk about Python, career, and mindset. We're your hosts. I'm Julian Sequeira. And I am Bob Belderbos. If you're looking to improve your Python, your career, and learn the mindset for success, this is the podcast for you. Let's get started.
Welcome back to the Pybites Podcast. This is Bob Belderbos. And today we have a very special guest and episode. So with me today is Robert Young. He's a chemist by profession, and his passion and specialty is PFAS, or forever chemicals, a pretty scary thing, as we will find out throughout this episode. But he joined PDM, and throughout his journey in PDM, he built an app that is helping him, his colleagues, and researchers in the field to solve this problem. So without further ado, enjoy the interview with Robert Young.
Hello, everybody, and welcome back to the Pybites Podcast. This is Bob Belderbos, and I'm not here with Julian this week. I have a very special guest with me, Robert Young. Robert, welcome to the show. How are you doing?
Oh, very well, and thank you very much for having me.
Yeah, it's a pleasure to have you on. So today we're going to talk about your background, PDM, and a couple more things. So, yeah, maybe to start it off, do you want to introduce yourself to our audience?
Yes. So, my name is Robert Young. I'm actually the director of an analytical chemistry lab at New Mexico State University, where I mostly do chemical analysis and basically run mass spectrometry experiments, which I can describe more later. So I'm really not your typical Python programmer. And even to take it further, this is a sort of second career for me. Before this, I was a lawyer, which is also not your typical Python programmer.
Wow.
Well, maybe I have to ask how you went from lawyer into your current field. That's already interesting.
Yes. So I think I loved going into law initially because I've always been very into learning different things and understanding how things work. And law was kind of an interesting field for me. I started studying business, which is really what my father did. Then I got to a point where I thought, okay, my two options are a Master of Business Administration or law school. And I chose law school because I thought it was broader. I was interested in environmental law, but then learned that, you know, so much of that has to do with permitting and things like that, or just litigation, that I wasn't so interested in. So I ended up becoming a business lawyer with sort of a tax planning emphasis, which is all very exciting for most people. And then I think somewhere along the way, I just, I've always been interested in environmental causes, and I've followed many. And I used to complain about things that were policy matters or different things that, you know, I didn't like. And my ex-wife used to always say, if you really cared, you'd do something about it. And at some point when we split, I thought, okay, now's my big chance. I'm going to go back to school and learn how to do this. And I didn't want to be a policy person. I wanted to understand the science. So I started studying basically soil and water and got really into the chemical analysis part.
Awesome. Wow. Yeah. So maybe you also want to expand a bit on your current field and what you're researching, because, yeah, I'm not a chemist either, actually. I switched from chemistry to other, more economics things, but that's another story. But, yeah, maybe expand a bit on what you do these days and maybe also lead into how you then came up with the project you came to us for, because that's all related to these chemicals that never go away.
But anyway, maybe I'm getting ahead of ourselves. But, yeah, I will let you kick it off.
Yeah. How it really started for me going this way: when I went back to school, the group that I joined was studying what they call endocrine disrupting chemicals. In some cases, that's steroid hormones, whether natural or even synthetic hormones, like with birth control drugs, and in some cases, it's some plasticizers and other things. But my project became basically learning how these things break down in the environment. And so we were studying mostly how chemicals break down in sunlight. And the idea would be that they get discharged from wastewater treatment plants, they get into the water, and does the sunlight remove them? Does it transform them into something worse? Or does it do nothing at all, and they just kind of hang around? And in the course of studying that, that's where I started learning how to do the chemical analysis part. And I had seen CSI and NCIS and things like that, and on those shows, they take a sample, put it in the instrument, get back immediate results, and it works great. My first experience of performing a chemical analysis was I put the sample in and we saw nothing, and we should have seen something. So it's like, okay, this is already harder than it looks on TV. And from there, I think my project kind of developed. There was a real interest in how the natural organic matter in water influences these reactions. So I wanted to study that, and I learned that natural organic matter is really just a general term for mixtures of tens or hundreds of thousands of different chemicals that are all there because plants, and especially plant material, break down to different degrees, and microbes die and break down. And so there's all this sort of leftover molecular soup, and it's complex to study, but we have some really fancy instruments that can look at really complex mixtures and see not everything, but a lot of it.
So once I started working on samples like that, I was getting back spreadsheets with, like, 20,000 or 30,000 individual molecules that were detected, and was then asked to compare different samples. And I started doing that just in Excel spreadsheets, and it's a monstrous undertaking. And so I knew somebody who was an R programmer, and I decided I'm going to learn to do that so that I'd be able to analyze the kind of data we get more easily. And that's where the coding thing took off. And at some point, I felt like I'd heard a lot about Python. I was also maybe attracted to it because, you know, I was always a Monty Python fan, so even the way it got its name made me laugh and kind of attracted me to it. So our new project is really studying poly- and perfluoroalkyl substances, which is a long name for a complex mixture also, but the abbreviation is PFAS, and the nickname is forever chemicals. And the thing is, the fluoro part means there are a lot of fluorine molecules, and carbon-fluorine bonds are very strong, so they don't degrade very well. That's also a reason why they're used so popularly. But when they get into the environment, they stick around, and there's a lot of things that we could go into, if you want, about that, but that's what we're studying. And some of the people involved were Python coders. So I thought, okay, I've always been interested. Maybe this is now my chance. And I do really love working with the language. So Python has kind of become my thing.
Awesome. Yeah. I mean, it sounds pretty bad, but not being an expert, what is actually the effect of these? The fact that they don't dissolve or break down? What's the impact on the environment?
One thing is that we can now find them almost everywhere. They find them in the Arctic, they find them in the Antarctic, they found them on top of Mount Everest, possibly because of climbers though, and they found them in deep ocean trenches.
More importantly, some of them are known to have adverse health effects, and they find them in the blood of animals that they test, including polar bears or seals in the Arctic. I think I saw a study a while back where they studied the blood of I don't remember how many thousand people, but maybe like 6000. I should have looked at that, and I could tell you about it later. And some of them, the most common ones, were in 99 or 100% of all samples: men and women, children, all races. And when they did a study on breast milk, kind of around the same time or along the same lines, but with a lot lower number of people sampled, again, there were some of the same ones that ran like 98, 99, 100% of all samples that they looked at. So we've been exposed for a while, and they've been accumulating for a while kind of stealthily, because I think I just read the other day that they started manufacturing them in 1949, but it's really only become an issue now. That kind of goes to an interesting part about chemistry too: it used to be that the way we did everything was we would know what we wanted to look for, we would buy a sample of it, we would see what it looks like on our instrument, and then we would go take samples and look for that signal. So it's pretty easy, if we didn't know what to look for, to miss things. Now the instruments can acquire data so rapidly and effectively that we can take almost the opposite approach: collect as much data as we can first, and then go try to figure out what's there. But that then becomes a monstrous data analysis problem. And for people who are busy studying just how to run the instruments and what kind of chemical reactions they might see in the environment, to all of a sudden become part data scientist is tricky, but that's pretty important in my field these days.
Yeah. Wow, that's pretty shocking. So we're all exposed to it, and it has these adverse effects.
So, yeah, maybe you want to tell me a bit about that. So you came to us, you were in PDM, you did a project related to this. So what was the goal of the project? And maybe you also want to highlight some of the technical features, how you did it, and how it's now helping you in your field.
Yep. So I think I originally came to you guys partly because what I do in learning coding stuff, and just learning about different packages and things that are out there, is listen to podcasts. And I found yours. And I really liked, first of all, the idea that this program was available, because I felt like I was sort of operating on my own, doing a lot of self-teaching, which I've always been a big believer in anyway. But I felt like I would love to interact with some other people who are maybe even more experienced than me and who could teach me a little bit about best practices and things like that. And then, I feel like I've always been fine on the mindset side, you know, I can go in and learn things, and I don't expect to know everything in the beginning. I can kind of, you know, go through the process and all that. But I did like all of the conversations around mindset, too, because I think that's so important to learning. So these are all things that attracted me to want to join the program. And if you work in academia, you can't always just say, oh, I need, you know, money to do this kind of training or whatever. There's often not a lot of money for that. But in this case, there was a program that we work in called the INBRE program that ultimately is something that the National Institutes of Health supports, and they have some money available for data-oriented things. So I got support from them to participate in the program, and it's been a very good experience.
My goal has always been this: I've been able to do sort of the data science part for a while, but I've worked in groups where some of us knew a little coding and some didn't. And the same tools are not accessible to people who don't code. So my goal with this has been, let's take some of the things that we've done before, but ultimately build an app where people who are not Python programmers can explore the data and can upload their own data. Like, if they're studying PFAS, they can upload data about what they've detected on their instrument and get back, you know, what are possible PFAS that you want to explore more carefully. And I wanted it to be something that anybody could use. So for that, I felt like Python will be great. I wanted to use FastAPI to create an API that would allow people to search a PFAS database and match against their own data, or just explore a PFAS database. To explore it, well, what would be an example? It might be I want to look for all the PFAS molecules that contain nitrogen, because there are certain structures that might be interesting to people. So I wanted somebody who's not a programmer to be able to go to a website, ultimately pull up a bunch of PFAS, and then filter by nitrogen and see what they all are. What's interesting about them too, from our perspective, is they can be very related to each other. Like, they can basically have this chain of carbon-fluorine molecules that can vary in length. So if you have a whole bunch of them that occur together that all just keep varying in different lengths, then they all differ from each other by a very characteristic mass value. And if you know one, you can say, pull up all the ones that relate to it. So that was another thing I wanted it to do. So, sort of things like this. I just wanted it to be accessible. And I thought that FastAPI seems like an exciting new framework.
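As a rough illustration of the "filter by nitrogen" idea Robert describes, here is a minimal sketch in plain Python. It is not code from his app: the helper names (`element_counts`, `contains_element`) and the three example records are assumptions made up for this note, and the parser only handles simple formulas without brackets or charges.

```python
import re


def element_counts(formula):
    """Count atoms per element in a simple formula such as 'C8H2F17NO2S'."""
    counts = {}
    # Each match is an element symbol followed by an optional count.
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def contains_element(formula, element):
    """Return True if the formula contains at least one atom of the element."""
    return element_counts(formula).get(element, 0) > 0


# Hypothetical records, for illustration only (not taken from Robert's database).
records = [
    {"name": "PFOA", "formula": "C8HF15O2"},
    {"name": "PFOS", "formula": "C8HF17O3S"},
    {"name": "FOSA", "formula": "C8H2F17NO2S"},
]

# Keep only the entries whose formula contains nitrogen.
with_nitrogen = [r["name"] for r in records if contains_element(r["formula"], "N")]
print(with_nitrogen)
```

Parsing the formula into element counts, rather than searching for the letter "N" in the string, avoids false matches on two-letter symbols such as Na.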
It creates an API that people who are coders can query and that a front end could ultimately query too. And it's a very cool package. So I really like using it. I was using SQLModel with it also, with a SQL database, and SQLModel's new enough that there were sometimes a few little things where it was tricky to find examples that I could use. But I loved it. And then you don't have a separate schema for your API and for SQLAlchemy or for your database. So I'm a big fan of that. I'm a big fan of FastAPI. I think the API is in pretty good shape, mostly due to participating in PDM. And now we're learning a little bit of front end things.
Awesome. Yeah. And now you have deployed it as well, right? So people can use it.
Not yet. At least the API is deployable, so maybe that'll be soon. I've kind of been adding a couple of other things, and then I think I'm going to deploy it so we can use it on a project. That's another side of this: a project we've been awarded. We haven't actually received all the funding yet, but that's about to change. All the contracts have been signed, which means that program is going to kick into high gear.
Nice.
And my colleagues are already working on some things, but now they're going to start sending me data. So we'll need this soon. And maybe I can mention too, my colleagues are the people at CSIRO in Australia, which is the national science agency there, and at Florida State University. They have a facility called the National High Magnetic Field Laboratory, with the most amazing mass spectrometer in the world, in my opinion. And they're already getting ready to run samples and produce data that I'm going to get.
I think that's awesome. Massive project, and also, yeah, it goes to show, right? Like, it's a real-world thing. And through PDM you built it out, and now this important thing in chemistry becomes available to the world. And it's powered by FastAPI, with SQLModel for the database. Yeah, I agree with you.
Like, when you use FastAPI, it's very good to use SQLModel as well. Same author, same creator, great documentation. And you don't have that problem of defining Pydantic schemas and SQLAlchemy models in two places. You can do that now in one place. So it's a really nice package. Yeah, no, exciting. And thanks for sharing, and yeah, we're excited to see how you will release that and how it will get used and all that stuff. So, yeah, major win.
Yeah. I even have an example where I was kind of separating the models into multiple files, and it seemed to make sense to me to do that. But in some cases each model was using components from the other. So I was having sort of these circular definition problems. And my PDM coach, Hugh Tipping, kind of recommended that, you know, as long as it's not going to be this giant, voluminous file with models, just put it all in one, and I can avoid that problem. Who knows how long it would have taken me to try that, because I wasn't finding clear guidance on this problem when I was doing, like, Google searches. Now I guess there's also ChatGPT, which is another thing. I'm loving using that for questions like this. But his suggestions probably saved me a lot of time. And if I were to go further on the PDM program, I did want to learn a little bit of best practices. I learned to use, you know, projects in GitHub and, you know, basically create new branches and do pull requests and all those kinds of things. And my goal was always, I want other people to be involved in this. I don't want it to be a solo effort. So I want it to be set up in a way where people could join, maybe even at some point people who have no interest in chemistry but just want to contribute to a project, and who have skills that I don't. And learning how to run it in GitHub like this, like a real development team would do, was kind of a goal of mine, and I think he really helped with that.
And you initially as well, when you were kind of advising me how the program works and how we would set it up and all that. So I'm very happy with those experiences.
Happy to hear. Yeah. So it's not only the code per se, but also the approach and the tooling that developers use in the workflow, right? Working as a team.
And planning a little bit more, including in the project. I think my plan was always to architect whatever I feel like at the moment. And so it's made me think a lot more and make it more concrete by putting to-dos and things like that up there and thinking through some of the stuff. We did the mind mapping thing as well to kind of envision the project. And I also think that was very powerful for me.
Yeah. I mean, the technical code and stuff is valuable, the code reviewing, but sometimes what really is the zero-to-one component is the design and architecture discussions, right? To really step back and think it through together with a coach and brainstorm, instead of just jumping in and cowboy coding. And, you know, that was my way. Put some thought into it right up front, ideally.
Yeah. And I do like that you guys always encourage, just get it on paper. I have a real sort of perfectionist mentality, too, even though I do feel like, you know, I don't expect too much initially. Like, I don't beat myself up for not being the world's greatest Python expert, because I don't have that level of exposure to it. But I do like to be very concise and efficient, and, you know, I like a lot of order and all of that. So I can be a bit of a perfectionist. And I think always remembering to get it to work first, and then you can go back and refactor and improve and all that, is valuable, because that is one of the ways you can get a little bit paralyzed and take too long. I know you guys talk a lot about tutorial paralysis and that, too. I definitely have taken a lot of them.
I feel like at some point I started just trying to learn some and then code with what I learned, and at least not end up with 100 to-do apps and no chemistry-oriented things. So I think I was learning to get beyond that a little bit even before I started the program. But the emphasis on that helps a lot with me, too, on just moving ahead.
Yeah, yeah. Put the thing upside down, right? Like, code in the context of your project, and use the resources as you go instead of consuming whole tutorials. We kind of flipped that on its head, right? So, yeah. And it sounds like you're still doing that after PDM. In how you approach things day to day, you just keep using that technique. How has it changed your approach?
Yep. And ultimately, we have a paper on PFAS-oriented work where we were using a technique that involves basically network graphs of the chemicals. And they're connected by these characteristic mass differences between them. So, you know, what would be an example? Well, with PFAS, a lot of these formulas differ by CF2, and that weighs about 50. So we look for patterns of differences of 50 in the data that we collect. And then we can say, okay, we want to look for multiple patterns, so we want to look for differences of 50, and we also are interested in knowing if there is sulfur present. So we want to look at differences between sulfur-32 and sulfur-34, which are different isotopes. And if we see, like, this big chain that all differs by 50, and then a bunch of them are connected to something that differs by that other amount, then we can say, okay, this is probably a PFAS that contains sulfur, and you can expand that to look at even more. So not only do I want to be able to produce that data, but I want to be able to plot it. And that's where I want to be able to get a little fancier on the front end.
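The mass-difference matching described here can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the method from Robert's paper or app: the function name `find_edges`, the 0.002 Da tolerance, and the list of detected masses are all hypothetical. The two constants come from standard monoisotopic atomic masses, so one CF2 unit is about 49.9968 Da ("about 50") and the sulfur-34 to sulfur-32 gap is about 1.9958 Da.

```python
from itertools import combinations

# Monoisotopic mass differences in daltons (standard atomic masses).
CF2 = 12.000000 + 2 * 18.998403   # one CF2 repeat unit, about 49.9968
S34_S32 = 33.967867 - 31.972071   # sulfur-34 minus sulfur-32, about 1.9958


def find_edges(masses, delta, tol=0.002):
    """Return (lighter, heavier) pairs whose mass difference matches delta within tol."""
    edges = []
    for a, b in combinations(sorted(masses), 2):
        if abs((b - a) - delta) <= tol:
            edges.append((a, b))
    return edges


# Hypothetical detected masses, for illustration only.
masses = [212.9792, 262.9760, 312.9728, 398.9366, 448.9334, 498.9302, 500.9260]

cf2_edges = find_edges(masses, CF2)          # links within a homologous series
sulfur_edges = find_edges(masses, S34_S32)   # links to a sulfur-34 isotope peak

# A CF2 chain that also touches a sulfur edge hints at a sulfur-containing series.
sulfur_nodes = {m for edge in sulfur_edges for m in edge}
flagged = [edge for edge in cf2_edges if sulfur_nodes & set(edge)]
print(cf2_edges, sulfur_edges, flagged)
```

The edge lists are exactly what a network graph needs: each mass is a node, and each matched pair is a connection that a plotting front end could draw.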
And I have been spending a little bit of time learning to use Svelte and D3 now as well, Svelte because it's new and it's supposed to be relatively easy to learn. And I think it has been. So I'm doing exactly what you said again, but now with this framework. And then at some point I'm going to approach somebody who's already really good at this and ask them, can you jump in and at least help on this network graph interface? Because that's going to be complex, and I would like to deploy it soon rather than a year from now.
Yeah, you can't be an expert in everything, but, yeah, it's good to hear that. New framework, new library, same approach to learning. Yeah. And I guess it's also easier to plug in any front end now that you've made an API. It's more flexible, right?
Yep. And I think for some of us, we may or may not even use it. I think it's nice to be able to, you know, like, it's still only on my computer, but I already use the FastAPI interface just in the docs to, like, oh, I need to do a quick search on a data set, so I'll upload it and get back all of the, you know, potential PFAS suspects in it. I'm already using it effectively. And in my group, some of us are real coders, so it might be easier just to work in a Python notebook and query the API directly. But like, in the group I came from, my boss wasn't a coder, and he wanted to see the data in images. I want to be able to give that to people, too.
Awesome. Yeah, no, I cannot wait for this to go live. And maybe we can feature it on the PDM projects page as well, if you're up for it, or have a quick demo somewhere, like, spread the word, right?
Absolutely.
Proudly produced. Yeah. Nice. Yeah. Thanks for sharing. Yeah. Where can people reach out to you if they want to talk further about this with you? What's your preferred way?
Well, probably the easiest way is by email, which is, and you can put it in the notes, but it's rb youngmsu.edu, so that's basically my initials and my university, New Mexico State University. I guess once we do go live, it'll be easy to contact us through the website as well. But I don't have anything like that set up currently. And New Mexico State University, I guess you could search for me that way as well.
That works. Yeah. And email works, and I can always go back to the description and update it with the link to the API or whatnot later. Three main, or whenever that is.
Yeah, that works. The API will be called PFAS KImDB, and then the user interface ultimately is going to be ppathchem.net, partly because it's doing molecular networking stuff too, and partly because it was available. But those will be the two site names when it's fully up and running.
Awesome. Yeah. And again, really cool project. Kudos, and really cool seeing you build this with us in PDM. So we're running a bit short of time. But do you lastly want to share what you're reading?
Well, I am reading right now a nonfiction book that I picked up at the Strand Bookstore in New York last week, called Manufacturing Consensus, which is basically a book about how people have used social media in a very propagandist way. Ultimately, Manufacturing Consensus is a follow-up to a book by other people called Manufacturing Consent, which was about propaganda before the age of social media. And that book was based on some theory in that field where somebody talked about this idea of manufacturing consent and what it means. And it's just been interesting to me because, you know, I got into science in part because I felt like one of the goals of my work is to give people good data that they can use for decision making and that we can use to make good policy and all that.
But then you see, like, a lot of conversation around vaccines, you know, whether they're safe or not, even though the seminal study that questioned whether vaccines were causing autism has been debunked many times over at this point. But people still believe that because it got out there. There's all this debate about climate change, despite very strong consensus among scientists. And you saw it in COVID, too, where there was a lot of information going around that was suspect. And so I got interested in that topic: how do we get good information to people, and what is going on that it's so easy to spread bad information sometimes? That's ultimately what this book is about. And the use of bots and the use of fake accounts and all that. I think there was one example in the book where one group controls like 40,000 fake accounts, and they're a business; people hire them to generate buzz, basically, in social media. And, I don't know, it's easy to fool people with all that. So the book is very fascinating because it kind of goes into a little bit of how this is happening and who's doing it, and it interviews some of the people who run these platforms and things like that. It's just an interesting book.
Yeah. Yeah. From the title, I wasn't sure if I'd be interested, but I'm happy that you expanded on it, because that sounds pretty fascinating. It becomes increasingly difficult these days, especially with AI and the amount of news that's being put out, to really see what's real and what's not, you know? So it's definitely something to spend some time thinking about, right? How accurate our data is and how we can be insidiously manipulated. It's very dangerous.
Yeah. And, you know, the way they describe propaganda, it's got a negative connotation. But originally, you know, it was really just trying to persuade people, I think, was the way they describe it in the book.
But it's obviously often done with false information these days to some end. And that part does concern me. Another side of it is sometimes it's used to harass people that have viewpoints that somebody wants to challenge, until they'll be quiet. And in other cases, it's kind of used to pollute the good information with the bad and make it so voluminous that nobody wants to, you know, engage. And it is kind of a problem of our time. You know, I think all of the social media and the ability to connect to people and all the information that's out there is awesome. But this is one of the downsides.
Yeah. This can be another podcast episode topic, for sure, in and of itself. Yeah. Well, Robert, thanks so much for coming on the show today and sharing your background, your PDM experience, and that interesting book finally. Yeah. And it was great having you. So thanks for sharing.
It was my pleasure. Thanks very much for having me. And if anybody does want to get involved in any of our PFAS projects, especially once we go live, don't be shy. I would love to have more people contributing.
And if they want to know about what you did with FastAPI as well, right?
Yep.
Okay. Well, thank you so much.
Thank you too, and have a great day.
Cheers. You too.
We hope you enjoyed this episode. To hear more from us, go to pybit.es/friends, that is pybit.es forward slash friends, and receive a free gift just for being a friend of the show. And to join our thriving Slack community of Python programmers, go to pybit.es/community, that's pybit.es forward slash community. We hope to see you there, and catch you in the next episode.