Pybites Podcast

#205: Building reactive Python notebooks with Marimo

Julian Sequeira & Bob Belderbos Episode 205

Marimo is redefining what a Python notebook can do—bringing structure, version control, and interactivity together. In this episode, we chat with Akshay Agrawal, co-founder and CEO of Marimo, about how their reactive Python notebook fixes hidden state, keeps outputs in sync, and makes reproducible, reviewable code the norm.

Akshay shares Marimo’s origin story, how its reactive DAG turns notebooks into clean, Git-friendly tools, and why teams are ditching Jupyter-to-Streamlit pipelines for simpler, reactive workflows. We also dive into performance, data handling with pandas/Polars via Narwhals, and SQL reactivity with DuckDB.

Join us in this insightful episode as we talk with Akshay about reproducibility, data workflows, and turning prototypes into shareable apps.

For more info on Marimo, reach out to Akshay:

Website: https://www.akshayagrawal.com/

Github: https://github.com/akshayka

LinkedIn: https://www.linkedin.com/in/akshayka/

X: https://x.com/akshaykagrawal

______

If you found this podcast helpful, please consider following us!

Start Here with Pybites: https://pybit.es

Developer Mindset Newsletter: https://pybit.es/newsletter 💡

Pybites Books: https://pybitesbooks.com/

Bob LinkedIn: https://www.linkedin.com/in/bbelderbos/

Julian LinkedIn: https://www.linkedin.com/in/juliansequeira/

Twitter: https://x.com/pybites

Apple Podcasts: https://podcasts.apple.com/us/podcast/pybites-podcast/id1545551340

Spotify: https://open.spotify.com/show/1sJnriPKKVgPIX7UU9PIN1

Welcome And Guest Introduction

Akshay

We get companies coming to us, bigger companies, saying, oh, Marimo is so cool. And they're like, how do I use it effectively in my organization to connect with all my private data sources and my heterogeneous compute engines? How do I have multi-user authentication, RBAC? How do I spin up compute on my Kubernetes-based cluster and spin it down? That's the kind of stuff that we could conceivably commercialize, like helping people interact with their own data and their compute through Marimo in their organization.

Advert

Hello and welcome to the Pybites Podcast, where we talk about Python, career, and mindset. We're your hosts. I'm Julian Sequeira.

What Is Marimo And Why It Exists

Bob

And I am Bob Belderbos. If you're looking to improve your Python, your career, and learn the mindset for success, this is the podcast for you. Let's get started. Hey, and welcome back, everybody, to the Pybites Podcast. I'm Bob Belderbos and I have a very special guest today, Akshay Agrawal. Welcome to the show. Thanks, Bob. It's a pleasure to be here. Yeah, it's a pleasure to have you on. We're going to talk about Marimo, of course, which you built and co-founded, and of course your story. So maybe we can dive straight into it. For our audience, can you quickly introduce yourself? Definitely.

Akshay

Yeah, so I'm Akshay. I'm the co-founder and CEO of Marimo. Day-to-day is a little bit of everything. So Marimo is, maybe to step back, a new kind of open source Python notebook. It's a reactive Python notebook. The key thing about it is that it feels like a notebook, but it's actually stored as a pure Python file that you can share as an interactive web app, or run as a script, or reuse as a module. So it's a notebook, but it's also not just a notebook, which we can get into in many different ways. And day-to-day, it changes. I spend my time wherever I'm most needed. We're a team of about seven folks, so sometimes I do engineering, sometimes I do product, sometimes I help spread the word about Marimo.

Win Of The Week: Restaurant Dashboards

Bob

Yeah, awesome. I only recently heard about it, but it kind of turns the whole Jupyter notebook design upside down, right? So I definitely want to dive into it and get a bit into the core design. You've spoken about it at conferences, and there are great articles on the Marimo blog as well. But before we do that, do you have a win of the week? We always start with a win of the week.

Akshay

I actually do have one that we can get into later, too, because it's kind of interesting. The win of the week is that I helped onboard the director of operations of a local California restaurant chain, called Mama Hoohoo, onto Marimo. He had just started learning Python, probably in the last few weeks, and he was using Jupyter. I was like, oh, wait, have you heard of Marimo? And he's like, what's that? So I showed it to him. There were a lot of things that wowed him, such as our built-in data frame table, which is far richer; of course the reactivity, where when you run a cell in Marimo, other things update; and then the ability to build data applications, which was really useful for him because the whole point of him getting into coding was so that he could make dashboards and internal web tools for his restaurant team to look at to analyze his profit and loss data. He told me that I saved him from going and buying a Windows machine to run Power BI, because he wanted something more interactive. It was really inspiring for me that the tool we made, which was originally designed in some sense for quite technical users, is now at the point where we can get folks who are really not that comfortable with Python at all using it to make their restaurants run more efficiently.

Origins And Pain Points With Jupyter

Bob

So that was really inspiring. That's such an awesome story. It's a really cool story. Yeah. Nice. So let's dive a bit more into the backstory. What inspired you to build Marimo? I think it was two years ago, more or less. So tell us a bit about the pains and what made you spend two years now, if I have that correctly, on building Marimo.

Reactive DAG And Execution Model

Akshay

Yeah, definitely. So it's actually almost four years. It's been a long time, yeah. So okay, maybe a little bit about my own background, because it might help explain the inspiration. I have a background in computer systems, but also machine learning and data work. At different times I wear different hats, but I really like working on software and building tools for other people. I also like working with data. I used to work at Google on the TensorFlow team, where I worked with data flow graphs a lot. Then I went and did a PhD at Stanford in machine learning and convex optimization. There at Stanford, I wrote a lot of library code, but I also spent a lot of time in Jupyter notebooks because I needed to interact with my data while I worked on it, like visualizing embeddings or testing an optimization algorithm. That interactivity was really useful. But at the same time, maybe from being trained as a software engineer, there were a lot of issues I encountered while working with Jupyter notebooks that I felt slowed me down. These issues are documented widely by others on the internet. I'm not the first to notice them. One big one was reproducibility. When you're looking at a Jupyter notebook, it kind of looks like a program. You have code arranged from top to bottom. But when you interact with it, you run one cell at a time and mutate this hidden mutable workspace. Because of that, you can accumulate hidden state, to the point where you come back the next day, run your notebook, and may not get the same results.
So there's this famous study from 2019 where some academics at NYU, I believe, downloaded like 10 million notebooks from GitHub, or a million notebooks from GitHub, tried to run them all, and found that only a quarter of them could run, and some small fraction of them actually ran in the same way, producing the same outputs as in the notebook file. Those are the statistics, but I also just experienced this day-to-day in my research. Your co-author gives you a notebook and says, here's how I generated the plots for the final chapter. And I just can't run them, or I don't get the same results. So that was frustrating. Other things that were frustrating: with Jupyter notebooks, because of the file format, you can't version them with Git, you can't reuse them as scripts, you can't reuse the logic in them. So me and other people would just copy the notebook instead of using the notebook as a module. You kind of duplicate it, and you have Untitled1.ipynb, Untitled2.ipynb, all the way up to Untitled42.ipynb in your folder, and it's just kind of a mess. And then the other thing that I really wanted to solve for was working with data more interactively. Traditional notebooks like Jupyter are interactive in that you can write code, run a cell, see outputs. But they don't have great integrations with widgets. Those are very hard to use. I wanted to basically do something where I could make a selection with my mouse, like on a scatter plot, and then immediately see the selected points as a data frame and keep on going from there. So I wanted really, really interactive workflows, and that was something that was missing from Jupyter as well. So these were all ideas rattling around in my brain.
And then the timing was really great. I decided, okay, I want to start a company around this as a way to sustainably work on a new kind of notebook, because I realized that tons of people in industry start their work in notebooks. So I'm like, okay, this is probably something valuable to build. I wanted it to be open source because I truly believe in the virtues of open source. Next I needed a way to fund myself to start working on it. And I got kind of lucky: I linked up with some scientists at Stanford, and I was explaining these ideas to them, and they were like, oh, this is amazing. We use Jupyter a lot, but for all the reasons you mentioned, it's been really hindering our scientific reproducibility, and also the shareability of our analyses. So we set up a contract where they funded my company as a subcontractor to work on Marimo. That kickstarted the whole thing. There's a lot of stuff that happened between almost four years ago and today, but that's the origin story.
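The hidden-state problem Akshay describes with traditional notebooks can be sketched in plain Python. This is only a toy model, not how Jupyter is implemented: the `namespace` dictionary stands in for the kernel's shared workspace, and the cell names and contents are made up for illustration.

```python
# Toy model of a traditional notebook kernel: one shared, mutable namespace
# that every cell execution reads from and writes to.
namespace = {}

# Hypothetical notebook cells, stored as source text.
cells = {
    "load": "data = [1, 2, 3]",
    "scale": "data = [v * 10 for v in data]",
    "report": "total = sum(data)",
}


def run(cell_name):
    """Execute one cell against the shared namespace."""
    exec(cells[cell_name], namespace)


# Interactive session: the user accidentally runs "scale" twice.
for name in ["load", "scale", "scale", "report"]:
    run(name)
interactive_total = namespace["total"]

# Fresh top-to-bottom run of exactly the same saved code.
namespace.clear()
for name in ["load", "scale", "report"]:
    run(name)
fresh_total = namespace["total"]

print(interactive_total, fresh_total)  # 600 60
```

The saved code is identical in both runs; only the execution history differs, which is why the output seen on screen during the session cannot be reproduced from the file alone.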

Bob

Wow, that's amazing. Yeah. So congrats on accomplishing that. That's a big feat, and I see it as a very valuable tool. And it definitely resonates: if you have a notebook with Jupyter and it gets out of order, you get the typical stack trace, a variable is not set, or it's even clobbered, right? And then it's kind of sneaky, as you say, that your graphs might show something entirely different while it still runs. So I think you solve that problem with the directed acyclic graph, or DAG. You make sure that one thing runs before the other, so you create this dependency graph, right? Can you talk a bit more about the internals and some of the design decisions?

File Format: Pure Python, Git Friendly

Akshay

Yeah, definitely. That's absolutely right. That's the main architectural difference between a Marimo notebook and a Jupyter notebook. In a Marimo notebook, before running your code, we statically parse all the code in each cell and then build a data flow graph on the cells. For each cell, we know what variables the cell declares or defines and what variables it reads. From that, we build a graph where there's an edge from one cell to another if the second cell reads any of the variables that the first one defines. The effect of this is that in our runtime, if you run a cell that says, say, x = 0, defining some variable x, Marimo knows that, okay, that cell defines the variable x, and now in order to keep the notebook consistent, I need to run all other cells that read the variable x. So our runtime will pick that up from the graph that we've already parsed, run all dependent cells, and then transitively run all their dependent cells. The effect of that is that your code and your visual outputs stay in sync, solving the problem that you mentioned: in Marimo, it's difficult to clobber variable state or to accumulate hidden state for this reason. So that's the core thing that's different. Just as a sidebar, sometimes when I mention this, people think that, oh, if my notebook is really expensive, this might not work because it'll automatically run things that I'm not yet ready to run. We do have a lazy execution mode that'll just mark dependent cells as stale in the user interface, and then you can click one button to bring them up to date. That's the sidebar. But yeah, this is the main engine that Marimo has, which we build everything on top of. And we can get into this, but this reactive engine is what we call it.
Reactivity is what enables you to create UI elements with one line of code that are automatically synchronized between the front end and the back end, no callbacks required. It's what enables Marimo to run any notebook as an interactive web app. And it's also what enables Marimo to run notebooks as Python scripts or reuse them as modules.
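The graph construction described in this answer can be sketched with the standard library's `ast` module. This is a minimal illustration under simplifying assumptions, not Marimo's actual implementation: the cell names and sources are hypothetical, every name in a cell is treated as global, and a real runtime would execute the affected cells in topological order.

```python
import ast
from collections import deque


def defs_and_refs(source):
    """Return (names assigned, names read) anywhere in a cell's source."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)
            else:
                refs.add(node.id)
    # Simplification: a name the cell defines itself is not an external read.
    return defs, refs - defs


# Hypothetical notebook: cell name -> cell source.
cells = {
    "a": "x = 0",
    "b": "y = x + 1",
    "c": "z = y * 2",
    "d": "w = 100",
}


def build_graph(cells):
    """Add an edge u -> v when cell v reads a variable that cell u defines."""
    info = {name: defs_and_refs(src) for name, src in cells.items()}
    graph = {name: set() for name in cells}
    for u, (u_defs, _) in info.items():
        for v, (_, v_refs) in info.items():
            if u != v and u_defs & v_refs:
                graph[u].add(v)
    return graph


def cells_to_rerun(graph, changed):
    """Return the cells that transitively depend on `changed`."""
    order, seen, queue = [], {changed}, deque([changed])
    while queue:
        for child in sorted(graph[queue.popleft()]):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


graph = build_graph(cells)
print(cells_to_rerun(graph, "a"))  # ['b', 'c']
print(cells_to_rerun(graph, "d"))  # []
```

Because the graph comes from parsing the code rather than from execution history, the set of cells to re-run is known before anything executes. That is also what makes a lazy mode possible: the same list can be marked stale instead of being run.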

Bob

Yeah, interesting. And the mutability, right? We're in Python, we can just overwrite anything, unless we do type checking. Kind of reminds me of Rust, right? There, if you want to mutate something, you have to explicitly say mut. It just goes to show how mutability makes things so easy and flexible, but it's also your biggest curse, right? This also shows, and I had a similar conversation with Charlie on the podcast recently about ty, how important it was for building a static type checker to really get the design right from the start. It feels like with this, there was also a lot of thinking up front. I think you documented this well in your lessons post. And I think you even linked to a bunch of design notebooks or HTML files, with a ton of thinking you did up front, right?

Akshay

Yeah, there was a lot of upfront thinking before really any substantial lines of code were written. Most of this upfront thinking was in 2022, so a long time ago. I wrote a bunch of these markdown files on how the runtime should work, conventions for the file format, things like that. One thing that made it a lot easier, and I think this is true for many kinds of new products or even research, was that there was good prior art to draw a lot of inspiration from. Of course there was Jupyter, but then there were also these really cool notebooks in other languages. The one that I drew the most inspiration from is called Pluto.jl. It's a notebook for the Julia programming language. I think the original creator's name is Fons. It has a lot of these ideas in there. It has reactive execution. It's also stored as a pure, in their case, Julia file. And it also has these interactive elements that you can create without callbacks. They don't go as far as letting you share your Pluto notebooks as web apps, but that was sort of a small conceptual leap. Then I saw tools like Streamlit and I'm like, oh, okay, people like to make web apps with Python. A reactive notebook is basically a web app. Just let them. Anyway, so yeah, there was a lot of studying of prior art, and from that, design doc writing and back and forth with some of the folks at Stanford, as well as just friends, brainstorming what the design should be.

Bob

Yeah, get get users on it as soon as possible. But it was also your vision of like, well, this can be a notebook, um, but this should also really be an executable script, right? And it should also do like more interactive stuff, like almost bordering like web app style, right? So I think you had that vision from the start that it should be three things in one, right? That's exactly right, yeah.

Akshay

Yeah, not not just the notebook.

Bob

Yeah, yeah. Nice. Um, yeah, anything else on the design, or uh shall we move on to uh integrations, adoption, migrations?

Data Frames, Narwhals And DuckDB

Akshay

I think another thing about the design that is probably worth calling out, and that's probably less obvious to see, is the file format. Jupyter notebooks are stored as JSON, right? With the code and outputs serialized in them. This is good for viewing a notebook on GitHub, say, but it's not good for treating your notebook as software. So Marimo notebooks are stored as a pure Python file. But unlike, say, the format some of your listeners may have seen, I think it's called the py:percent format, where a notebook is stored as a flat script, in our case the file format is structured so that each cell is wrapped in a function, and there's a little decorator that collects these functions. Each cell is a function that maps the variables it reads to the variables it defines. Then at the bottom there's an if __name__ == "__main__" guard and an app.run(), which runs the notebook in topologically sorted order. The reason this is relevant is, one, we designed it so that a small change to the notebook yields a small diff, so you can version it well with Git. But the other reason this matters is that because we wrap each cell in a function, you can actually import your notebook as a regular module without running the whole notebook. And then one of our engineers, Dylan Madisetti, did some really cool work where if any of your cells just declares a single function that is, roughly speaking, pure, with some exceptions to increase the set of functions, that function is actually just saved top level in the file.
So if you have some function, silly example, def print_hello_world, and it prints hello world, it'll be saved top level in the file, and then in another Python module you can just say from my_notebook import print_hello_world and use it as a function. It's a small thing, but it gives you composability in a way that people have never really had before with notebooks.
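To make the file layout described here concrete, the following is a toy, standard-library-only sketch of the idea: cells as functions whose parameters are the variables they read, a decorator that collects them, and a `run()` that executes them in topologically sorted order. `ToyApp` and its `defines=` argument are hypothetical stand-ins invented for this sketch; they are not Marimo's classes or its real file format.

```python
import inspect
from graphlib import TopologicalSorter


class ToyApp:
    """Hypothetical stand-in for a notebook app object (not Marimo's class)."""

    def __init__(self):
        self.cells = []  # list of (function, names the function defines)

    def cell(self, defines=()):
        """Decorator that registers a function as a notebook cell."""
        def register(func):
            self.cells.append((func, tuple(defines)))
            return func
        return register

    def run(self):
        """Run cells in topologically sorted order; return all variables."""
        producer = {name: func for func, names in self.cells for name in names}
        # A cell's parameters are the variables it reads from other cells.
        deps = {
            func: {producer[p] for p in inspect.signature(func).parameters}
            for func, _ in self.cells
        }
        defines = dict(self.cells)
        variables = {}
        for func in TopologicalSorter(deps).static_order():
            args = [variables[p] for p in inspect.signature(func).parameters]
            result = func(*args)
            variables.update(zip(defines[func], result or ()))
        return variables


app = ToyApp()


# Cells are deliberately registered out of dependency order.
@app.cell(defines=["total"])
def _(prices, quantity):
    total = sum(prices) * quantity
    return (total,)


@app.cell(defines=["prices", "quantity"])
def _():
    prices = [2.5, 4.0]
    quantity = 3
    return prices, quantity


if __name__ == "__main__":
    print(app.run()["total"])  # 19.5
```

Because nothing executes at import time except the decorators, another module can import this file without running any cell. That is the property that lets a notebook in this style double as a module.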

Bob

It's pretty cool. Like it's hybrid, right? Like you can either then still run it as a notebook, but you can also treat it as a module. So you have two things in one, right?

Adoption, Use Cases And Switching

Akshay

Yeah, exactly. And in general, we found that there have been a lot of unexpected benefits of choosing a Python file format. We have this blog post at marimo.io/blog/python-not-json. I'm not going to remember all of them, but here's another one. You mentioned Charlie Marsh was on the podcast. So uv, I think, was one of the first package managers to implement support for PEP 723, a standard around inline script metadata that allows you to serialize the package requirements of a script at the top of the file. We quickly built an integration around this. We realized that the Marimo editor could actually track your packages for you if you run with a special flag, it's opt-in, and then save the dependencies at the top of the file using just inline script metadata. So now we can just use Python standards, and you can uv run your notebook as a script, and it'll load all the dependencies. This is not something that we anticipated when we designed the file format, because uv didn't even exist then, and I don't know if PEP 723 was around. But that's just one example of how choosing a standard file format comes with a lot of benefits.
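For listeners who have not seen inline script metadata, here is a small sketch of what a PEP 723 block looks like and how a tool could read it. The regular expression and the comment-stripping logic follow the reference implementation given in PEP 723; the script text and its package list are made up for illustration, and the function returns the raw TOML text rather than parsing it.

```python
import re

# Example script text; the dependency list is illustrative only.
SCRIPT = '''\
# /// script
# requires-python = ">=3.11"
# dependencies = [
#     "marimo",
#     "polars",
# ]
# ///

import marimo
'''

# Regular expression from the PEP 723 reference implementation.
REGEX = r"(?m)^# /// (?P<type>[a-zA-Z0-9-]+)$\s(?P<content>(^#(| .*)$\s)+)^# ///$"


def read_script_metadata(script):
    """Return the TOML text of the `script` metadata block, or None."""
    matches = [m for m in re.finditer(REGEX, script) if m.group("type") == "script"]
    if not matches:
        return None
    # Strip the leading "# " (or bare "#") comment prefix from each line.
    return "".join(
        line[2:] if line.startswith("# ") else line[1:]
        for line in matches[0].group("content").splitlines(keepends=True)
    )


print(read_script_metadata(SCRIPT))
```

Since the metadata lives in ordinary comments, the file remains valid Python, which is why a notebook stored as a Python file can carry its own dependency list without any change to the format.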

Bob

It's kind of cool, right? Python evolves, and then uv is being developed super fast, and it's open source, so we can all see it, and then we start to adopt these things in new tools. Kind of the power of open source as well, right? It's very cool. Yeah, it raises all boats, definitely. Yeah, and thanks for keeping up with uv. Yeah, uv's amazing. I've never looked back.

Akshay

Yeah, the uv integration. That was my co-founder Myles's idea. I think the day or very shortly after it came out, he's like, oh wow, this is awesome. He implemented support for inline script metadata in Marimo, and yeah, it's a pretty cool feature.

Bob

Yeah. I also really like what you said about the diffs, because it seems like with Jupyter notebooks on GitHub, it's always the mix of code and output, right? And that just blows it up and makes it very hard to manage. When you look at a Marimo file, it's just Python code, right? Very lean, and you manage to separate the two, which actually makes a lot of sense, to not store the generated stuff in the diff. Right.

Akshay

Yeah, and some people still want to store the output, so you can choose. Marimo has a setting: you can store outputs alongside your notebook in a separate file, like an actual ipynb or a Markdown file or HTML. But by default, we just produce that one Python file.

Bob

Yeah. And it's also nicely functional programming, right? A lot of functions, decorators. So that's also nice.

Akshay

Marimo does encourage you in general to write more functional code, uh immutable, immutability, right?

Bob

Yeah, yeah. Nice. So um talking about integrations, uh you also integrate with Polars, DuckDB. So do you want to talk a bit about that? How that uh came about and uh how that uh makes it even more powerful.

AI Features, Copilot And Claude Code

Akshay

Yeah, definitely. So let's see, there are a few different ways we integrate with data frames and DuckDB. One is that when you output any data frame in Marimo as the visual output of a cell, whether it's pandas or Polars or even a PyArrow table, you get a really rich data frame viewer that lets you actually page through the entire data set, no matter how many rows it is, because the data is streamed from the back end. There's also a GUI for sorting columns, filtering columns by value, or doing full text search. Whenever you do that on the front end, what's happening is the front end issues an RPC to the back end, which then performs the query using whatever data frame library is being used, Polars or pandas, and sends the update back to the front end, so it's performant. In order to support all these different data frame libraries well, we use a library called Narwhals. It's a very cool Python library that basically recognized that people like us, the Marimo team and others, want to generically support all the data frame libraries, but pandas and Polars and the rest all have slightly different APIs. So it's this compatibility layer that we can go through, and it's been really, really helpful for us. That's one way we integrate with the data frame ecosystem. And as an aside, whenever you have a data frame in memory in a Marimo notebook, there's this little side panel, data sources, where we introspect your variables and say, here are all your data frames. Then you can click into one and see summary statistics and so on. So we really tried to design our editor for working with data. And then for DuckDB, the integration is a little bit deeper.
So Marimo is a Python notebook and it is stored as pure Python, but we also have SQL cells, which in the front end are sort of UI sugar for some Python code that gets generated. You can have a SQL cell, and by default, the SQL engine or database is DuckDB. DuckDB is, for any listeners who might not be aware, an analytical database. The thing that's special about it is that it runs entirely in process, so it's kind of like SQLite for analytical workloads. It's very fast and lightweight. So we default our SQL cells to DuckDB. What that lets you do is query all kinds of different data sources that you may have on hand, whether it's Parquet or CSV or remote storage, but you can also write SQL queries against data frames that you have in memory. So in a Marimo notebook, the reactivity can actually flow from Python into SQL, and then the SQL cell emits another data frame that can be named and that you can use in another Python cell. So we secretly have two DAGs: one through Python, but then also one into SQL and back out into Python. Yeah. Interesting.
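The Python-to-SQL-and-back flow described here can be sketched with the standard library alone. One substitution should be stated plainly: this sketch uses `sqlite3` as a stand-in for DuckDB, since Akshay likens DuckDB to SQLite for analytical workloads, and it copies rows into a table instead of querying a data frame directly. The data, table, and function names are all made up, and none of this reflects how Marimo's SQL cells are generated.

```python
import sqlite3

# "Python cell": in-memory rows of (item, revenue, cost).
sales = [
    ("noodles", 1200.0, 700.0),
    ("dumplings", 900.0, 300.0),
    ("tea", 300.0, 50.0),
]


def sql_cell(rows, min_profit):
    """Stand-in for a SQL cell: query Python data in-process, return rows."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (item TEXT, revenue REAL, cost REAL)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = con.execute(
        "SELECT item, revenue - cost AS profit FROM sales "
        "WHERE revenue - cost >= ? ORDER BY profit DESC",
        (min_profit,),
    ).fetchall()
    con.close()
    return result


# Downstream "Python cell": consumes the named result of the SQL cell.
profitable = sql_cell(sales, min_profit=300.0)
best_item = profitable[0][0]

print(profitable)  # [('dumplings', 600.0), ('noodles', 500.0)]
print(best_item)   # dumplings
```

In dependency terms, the SQL step reads `sales` and defines `profitable`, so it slots into the same kind of graph as any Python cell: changing `sales` makes both the query and everything downstream of `profitable` stale.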

Bob

Yeah, that that must be a lot of complexity to get that all working and integrated, right?

Akshay

Yeah, there was definitely quite a bit of complexity. Narwhals made our life a lot easier on the data frame side. And then for the SQL cells and the reactivity through DuckDB, there's a library we use called SQLGlot for parsing the SQL grammar, or the SQL itself, and that made it a lot easier as well.

Bob

Yeah, nice. So we have another cell type, right? Originally we had Python and Markdown, but now you also have SQL cells. Yeah. Quick break for a note from our sponsor this week, which is Pybites. Pybites. That's us. I'm here, Bob. What are we talking about this week? Well, we have a new coaching program, PDC, or Pybites Developer Cohort. We thought it was never going to happen because we have been doing one-to-one for five years, but now we can do group coaching as well. We're going to build a real-world app in six weeks in an exciting cohort. We're going to learn with one of our Pybites coaches the whole journey, but also work together and learn together. And yeah, no more tutorial paralysis. Build, build, build.

Open Source Model And Commercial Focus

Advert

It's wonderful. It's not something you want to miss out on. So please check out the link below, pybitescoaching.com. This is a program that bridges real building with a cohort environment, learning with other people, building with engineers. It's a wonderful thing. Check it out now, pybitescoaching.com, and back to the episode.

Bob

So, how do you balance being open source um with running a venture-backed company and shaping a community-driven roadmap?

Lessons From TensorFlow And CVXPy

Akshay

Yeah, so I would say the way that we think about our future commercialization plans is that from the beginning we've always said that we're not going to sell the notebook, in the sense that Marimo is Apache 2.0. We want it to be the standard for working with data. And it doesn't make sense for us to then go and, I don't know, have premium features or sell a slightly better version of it. That would just not work at all, and it would kind of ruin the whole project. So we're never going to sell the notebook. The thing that we could sell, or whatever we do sell in the future, would be complementary to the notebook, or almost orthogonal. Say we get companies coming to us, bigger companies, saying, oh, Marimo is so cool. And they're like, how do I use it effectively in my organization to connect with all my private data sources and my heterogeneous compute engines? How do I have multi-user authentication, RBAC? How do I spin up compute on my Kubernetes-based cluster and spin it down, all that kind of stuff. That's the kind of stuff that we could conceivably commercialize, like helping people interact with their own data and their compute through Marimo in their organization, say in their VPC. We like this approach because then there's no tension between what goes in the open source and what's commercialized. We're incentivized to just make the open source as good as possible, which gets more enterprises interested in using it, but they still need help with a long list of things that are not relevant for the open source. So that's how we keep the two priorities aligned and not in tension.

Bob

Cool. And how has adoption been, and how do you promote it? Although it's pretty straightforward to use, there's actually a lot happening when you go into a Marimo notebook for the first time, a lot of buttons and stuff. So how do you train people in that transition? For example, I can imagine people wondering, I'm so used to Jupyter or even Streamlit, right? When should we migrate? And maybe you have stories or more use cases?

Molab, Roadmap And VS Code

Books, Links And Closing

Akshay

Yeah, so adoption. We've been working on Marimo for a while, but our first big public launch was January 2024 on Hacker News, you know, you do your Show HN type of thing. And that went really well. At the time, we were the second most upvoted Python Show HN of all time. Now we're the fifth most, last I checked. Since then we have, I think, five million cumulative downloads across PyPI and Conda. When I last checked, we were at 480 PyPI downloads a month. We have a lot of GitHub stars, like 16,000. And then there's the true market adoption. We don't have user-level telemetry in the open source, so I can't tell you how many people are using it. But one thing that we've been doing as part of running the company is reaching out to people who've interacted with our GitHub and learning how they're using it in industry. It turns out that there are a lot of companies using Marimo. We actually just published three case studies on our blog. One is Anthony Goldbloom, who's the former founder of Kaggle and also an investor in our company. But he used Marimo before he invested in our company. He runs his entire analytics stack, all of his dashboards and internal tools, on Marimo notebooks, and he's replaced Jupyter with Marimo internally. And there are a couple of other case studies. But the question is how do you train people to come on to Marimo? When is it the right tool? If they're already using Jupyter or Streamlit, is there friction? I can answer that with a few examples of people I know who made the switch and how they made it. For different people, there are different pulls and different minimum bars of parity you need to clear, I think.
So for Anthony Goldblum, he previously used like a mix of Jupyter and Streamlet to um you know analyze the internal data produced at his company to run his company more efficiently. Honestly, not so different from the um restaurant chain Mama Hoohoo use case I mentioned in the beginning, uh run your business more efficiently. And so he previously used a mix of Jupyter and Streamlet. Um the reason he ended up going to Marimo is that like uh in the beginning, he so when he was using Jupyter and Streamlit, he would first create his prototype his application in a Jupyter notebook and then translate it to a Streamlet application. Now with Maremo, what he realized is that like any notebook, every any any notebook that he makes, with really minimal effort, really just you change one variable to a UI element, replace yeah, import Marima's mo into your notebook and replace you know your variable uh with like you know, say x equals mo.ui.slider, and then boom, you have an application. Um so he realized, wow, it it it took whereas maybe he spent four hours or two hours porting from Jupyter Streamlet, he spends two minutes with Marimo. The thing that I needed to do to get him to actually try Marimo was at GitHub Copilot support. He's like, I will not try it until you have that. So there was like a sort of a list of parity things that we had to do in early days, and and now I think we're beyond parity. Like we have like LLMs sort of built into the editor in various locations for generating code, or you can even like use quad code in conjunction, like in a sidebar to write your Maremo notebook. We have a whole lot of that stuff going on. Yeah, but but for him, he understood the appeal. Notebook to web app sounds great. I just need GitHub Copilot. Other users actually, like more technical users, they'll hear like um maybe they they lead the data team or maybe even like CTO of a startup, and they'll hear Git friendly Python notebooks. 
And they're like, okay, we just have to use this because like we're training like important machine learning models in our notebooks, and like it's just not sustainable to do them in a file format that I can't even version with Git. So there's a company called Bunker Hill. Um, they train foundation models for healthcare, and they originally decided to adopt Maremo because we said Git friendly Python notebooks, and then they stayed for the ability to make data apps that like their machine learning engineers now make for their radiologists. Um and still other companies, there's another very large public company, an e-commerce company, that told us that they originally adopted Marimo because it turned out it's pure Python file format, worked really well with Claude Code. And so now they have tons of engineers, but also non-engineers across the organization generating bespoke notebooks and data applications from just English using Cloud Code. So there's a lot of different entry points into Maremo. In terms of like the learning curve, um, the biggest thing you got to get used to is like the DAG, because it is like a different way of working with notebooks. Um we made the UI feel like, you know, it does feel like a notebook, right? You have selves and then you have code and then outputs and markdown. Um to get people used to the learning curve, we have like tutorials built into the CLI. So you pip install Marimo and then type Marimo tutorial intro, and then opens an intro notebook in your browser that you can kind of work through. And there's a couple of other tutorials. Um we also have several like example notebooks on our gallery, Maremo I.O. gallery. And then we have a thriving YouTube channel that um Vincent Wormer Daum of Maremo uh runs. Um so tons of YouTube shorts and tons of YouTube videos that sort of show you how to use Maremo and what you can use it for.
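
For readers who want a feel for the DAG Akshay mentions, here is a minimal sketch in plain Python. It is a toy model and not Marimo's implementation: the helper names (cell_defs_and_refs, build_edges) and the three example cells are made up for illustration, and the analysis is only a rough approximation of which names a cell defines and reads.

```python
import ast


def cell_defs_and_refs(source: str) -> tuple[set[str], set[str]]:
    """Roughly approximate the names a cell defines and the names it reads."""
    defs, refs = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs.add(node.id)   # e.g. the `data` in `data = ...`
            else:
                refs.add(node.id)   # any name that is loaded (or deleted)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                defs.add((alias.asname or alias.name).split(".")[0])
    # Names the cell defines itself are not treated as external reads.
    return defs, refs - defs


def build_edges(cells: dict[str, str]) -> set[tuple[str, str]]:
    """Edge (a, b) means cell b reads a variable that cell a defines."""
    info = {name: cell_defs_and_refs(src) for name, src in cells.items()}
    edges = set()
    for a, (defs_a, _) in info.items():
        for b, (_, refs_b) in info.items():
            if a != b and defs_a & refs_b:
                edges.add((a, b))
    return edges


# Hypothetical notebook: three cells keyed by made-up names.
cells = {
    "load": "data = [3, 1, 2]",
    "sort": "ordered = sorted(data)",
    "report": "summary = f'max={ordered[-1]}'",
}
edges = build_edges(cells)
print(sorted(edges))
```

In this sketch the order in which the cells are written does not matter; the edges come only from which variables each cell defines and reads.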

Bob

Yeah, yeah, a lot of awesomeness. One of our coaches is always raving about the channel and the Shorts, because they're very short, right? Very concise mini tutorials. So that has been going great. But yeah, that's a lot of stuff to unpack. So, first the autocomplete: it would do autocomplete in cells, Copilot style?

Akshay

Yeah, yeah. So there's Copilot-style autocomplete. And then you can also generate multiple cells using your LLM of choice. There's a little drop-down where you choose your LLM and bring your keys or your subscription. There's a chat sidebar, and there's an experimental agent sidebar. So there are a lot of experiments. There's experimental MCP support. We recognize that developers are moving in this direction, and so there's a lot that we're...

Bob

Yeah, and it turned out that the format, separating the Python from the output, or kind of getting rid of the output, made it more AI-friendly as well, right?

Akshay

Yeah, that's definitely a thing that I didn't anticipate in the beginning, because there was no Claude Code or anything. But yeah, the company I mentioned, what they told us is that Claude can just run the notebook as a script to see if what it's doing is correct. And we also have a linter now for Marimo notebooks. It's a CLI command, marimo check. The point is that Claude can use that too, to make sure the DAG is well formed and then continue on. So yeah, unexpected tailwind for sure.

Bob

How many developers are you now? That that sounds like a lot of development.

Akshay

Yeah, we ship a lot. How many developers? So our team is seven, but we have, let's see, I don't want to get this wrong. We have one, two, three, four full-time developers of that seven. I develop sometimes. I used to develop a lot more, so maybe four and a half. I'm trying to do a little bit more again, just because that's what gives me joy. But we ship a lot, and we also have a really large community of contributors. We have something like close to 180 contributors on GitHub. And I think over a hundred of those made their first contributions this year. So we get help.

Bob

Yeah, similar to FastAPI, right? With these big open source projects, it's not only the core committers, it's also the community. It's amazing. Nice. Yeah, I've got three more questions if we have time, so we can do them relatively quickly. Do you want to talk a bit about Molab and the cloud features of Marimo, and maybe a bit more about LLM integration? Because that seems pretty interesting to me. And maybe also what's coming next. What's planned?

Akshay

Yes, definitely. So Molab is our answer to Google Colab. Marimo Molab, "Mo" for Marimo. It's a free online service. You go to molab.marimo.io and you can create notebooks and share them with others. Every notebook you create is public but unlisted by default. So this is really just a tool for the community. It makes it really easy if you want to make a tutorial and host it as an application, or if you just want to link to it in your docs. It makes that seamless. That is for the community, so it is free. You do need to make an account. And we plan to continue investing in that. Right now you can experience Marimo in two ways there, as a notebook or as a web application. There's no pipeline or script execution there yet. In terms of LLM integrations, one thing I didn't mention that is really unique about our editor, and that makes it more powerful for AI-generated code, is that you can tag variables or data frames in memory and pass them to the LLM when you're generating code. So you can say something like "generate a scatter plot of @my_dataframe using whatever columns," and then we'll introspect the data frame, look at the column names, and give some sample values to the LLM so it actually knows what to code against. And that can make you a lot more productive. (Bob: Sounds amazing.) And then in terms of future plans, we're pretty heavily sprinting on a lot of these AI features in the editor. The one experimental thing that I mentioned is that you can have an agent like Claude Code or Gemini in your sidebar, programmatically adding cells and running them.
For this we implemented what's called the Agent Client Protocol, which was introduced by Zed, which is a modern IDE. That's experimental right now, but we're building it out further. MCP is another thing we're building out. And then a big thing that I think people will be really excited for is that we are close to shipping a native-feeling VS Code extension, so it'll feel just like Jupyter notebooks feel in VS Code. It's been a big project. Trevor Manz on the Marimo team has been leading it, and I think that'll be really big for people who say, I don't want to leave VS Code, or I don't want to leave Cursor. It'll work in any VS Code fork as well. So yeah, be on the lookout for that.
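
The idea of tagging an in-memory table and handing its columns and sample values to the LLM can be sketched in a few lines of plain Python. This is only an illustration of the idea as described in the conversation, not Marimo's code: describe_table and build_prompt are hypothetical helpers, the table is a plain list of dicts rather than a real data frame, and the prompt layout is an assumption.

```python
def describe_table(name: str, rows: list[dict], samples: int = 2) -> str:
    """Summarize column names and a few sample values as plain text."""
    columns = list(rows[0]) if rows else []
    lines = [f"Table `{name}` with {len(rows)} rows:"]
    for col in columns:
        values = [repr(row[col]) for row in rows[:samples]]
        lines.append(f"- {col}: e.g. {', '.join(values)}")
    return "\n".join(lines)


def build_prompt(request: str, tables: dict[str, list[dict]]) -> str:
    """Prepend a description of every table mentioned as @name in the request."""
    context = [
        describe_table(name, rows)
        for name, rows in tables.items()
        if f"@{name}" in request
    ]
    return "\n\n".join(context + [request])


# Made-up data standing in for a data frame held in notebook memory.
sales = [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": 95.5},
    {"region": "east", "revenue": 101.2},
]
prompt = build_prompt(
    "Generate a scatter plot of @sales using region and revenue",
    {"sales": sales},
)
print(prompt)
```

The point of the sketch is that the model sees real column names and example values instead of guessing them, which is what Akshay describes as knowing "what to code against."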

Bob

Yeah. Anything for Vim or is that unrealistic?

Akshay

You know, I do all my non-Marimo coding in Vim, but that's probably lower on the priority list.

Bob

You need a richer interface, more GUI-like, right? Yeah, awesome. Thanks for sharing, there's so much good stuff. But maybe we can go back a bit to your TensorFlow and Google days, only if there are additional lessons you want to share about how you think about building these days. I mean, of course you already went quite a bit into the design, but are there any lessons from that time that you use over and over again when you design software and think about the ideal user experience?

Akshay

Yeah, I guess a couple of lessons. So, one main one. I was on TensorFlow when we made the transition from TensorFlow 1 to 2, so I joined the TensorFlow 2 team before it was released. TensorFlow 2 made TensorFlow more imperative, like PyTorch is today. There were efforts at TensorFlow to automatically try to convert imperative code to dataflow graphs, and to automatically inspect code that was mutating lists and doing complicated iterations, etc. Basically, there was this effort to magically map imperative code to a graph, in a way that turned out to be really difficult and not possible to do with 100% reliability. And this ended up being a big usability problem, because it wasn't clear to the user when this translation from imperative code to graph would work and when it wouldn't, because they were trying to cover the whole universe of Python code, which was just clearly not possible. So that uncanny valley of not knowing, as a user, if something will work is something that I took away, something that I never wanted to have in one of the products that I made. So in Marimo, the way that reactivity works is really simple. It's not exhaustive, but it's clear. When you run a cell, for any variables that that cell defines, all of the cells that read those variables run, and that's it. We don't introspect list mutations; basically, we discourage mutations. It would be possible to try to trace mutations and then do weird reactive things at runtime, but we would never be able to cover all of those edge cases, and then there would be situations where a user would run a cell and be like, I don't know what's going to run, and I don't know what's not going to run. And I did not want to have that experience.
So yeah, that's one learning: have a simple contract that's easy for users to understand. It's okay if it's not exhaustive. Users just need to understand it so they can work in your system.
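
The contract Akshay describes can be written down as a small toy model in plain Python. This is a sketch under stated assumptions, not Marimo's scheduler: the CELLS table and the cells_to_rerun helper are hypothetical, each cell simply declares the variables it defines and reads, and the sketch also follows readers of readers, which is an assumption about how the rule is applied beyond the first step.

```python
from collections import deque

# Hypothetical cells: the variables each one defines and the ones it reads.
CELLS = {
    "a": {"defs": {"x"}, "refs": set()},
    "b": {"defs": {"y"}, "refs": {"x"}},
    "c": {"defs": {"z"}, "refs": {"y"}},
    "d": {"defs": {"w"}, "refs": set()},
    # Stands in for a cell like `x.append(1)`: it reads x but defines nothing.
    "mutate": {"defs": set(), "refs": {"x"}},
}


def cells_to_rerun(started: str, cells=CELLS) -> list[str]:
    """Cells triggered by running `started`, listed in discovery order."""
    order, seen, queue = [], {started}, deque([started])
    while queue:
        current = queue.popleft()
        for name, cell in cells.items():
            # A cell is triggered when it reads something `current` defines.
            if name not in seen and cells[current]["defs"] & cell["refs"]:
                seen.add(name)
                order.append(name)
                queue.append(name)
    return order


print(cells_to_rerun("a"))       # ['b', 'mutate', 'c']
print(cells_to_rerun("d"))       # []
print(cells_to_rerun("mutate"))  # []
```

The last call is the interesting one: because the rule only looks at variable definitions, a cell that merely mutates x triggers nothing in this model. The rule is not exhaustive, but a user can predict what will run.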

Bob

That goes back to the design of Python, right? I think yesterday I took a screenshot, because we had this remote demo in the coaching, and it just couldn't go beyond the state. I think it was again a variable issue and the DAG not being whatever, right? But yeah, this is exactly like "In the face of ambiguity, refuse the temptation to guess." It kind of goes back to that, right? Yeah, exactly. Interesting. Cool.

Akshay

I do have a second lesson, but it's not from TensorFlow. It's actually from my PhD, where I also worked on a lot of open source software. There I was the second maintainer on a project called CVXPY, which is a Python library for specifying optimization problems, like linear programs and convex optimization problems, and then the problems get solved. It's kind of like a compiler or DSL for optimization problems. The lesson there was kind of similar, actually. CVXPY will never be the fastest way to solve a linear program or a convex optimization problem. But it's fast enough, and it makes it way easier to solve, as opposed to using a solver directly. It really reduces the friction to even get a solution. Maybe it's 80% as fast, or 50% as fast, even, sometimes. And that's okay. In fact, that was our goal, because our main thing was to make it easy for users to do things, kind of like how Python is the second best language for everything. It just needs to be good enough. And that's another lesson that I take to heart at Marimo. If your goal is to build a really fully featured web application, a really complicated data application, Marimo might not be the best choice, right? You might want, I don't know, Reflex, that Python library, or something that's exclusively designed for making data apps. Or if you're making a really complicated data pipeline, maybe you reach for Prefect first or something like that. You could maybe get a little bit farther with Prefect. But the point is that through a notebook, we really reduce the friction to even getting to a data app in the first place, or getting to a script or a pipeline. And I think that's actually a good thing.
We don't need to be the best XYZ thing for making really complicated data apps. We just need to be good enough and still delightful to use. And we need to really reduce the friction to even getting to a useful artifact. So I think that's another lesson that I took away.

Bob

Yeah. You achieved that. So yeah, that's a great lesson.

Advert

A quick break from the episode to talk about a product that we've had going for years now. This is the Pybites Platform, Bob. What's it all about?

Bob

Now with AI, I think there's a bit of a sentiment that we're eroding our skills because AI writes so much code for us. But actually, I went back to the platform the other day, solved 10 Bites, and I'm still secure in my skills, because it's good to be limited in your resources. You really have to write the code, and it really makes you think about the code. It's really helpful.

Advert

Definitely helpful, as long as you don't use AI to solve the problems. If you do, you're just cheating. But in reality, this is an amazing tool to help you keep fresh with Python, keep your skills strong, and keep you sharp, so that when you are on a live stream like Bob over here, you can solve exercises live with however many people watching you code at the exact same time. So please check out Pybitesplatform.com. It is the coding platform that beats all other coding platforms and will keep you sharper than you could ever have imagined. Check it out now, Pybitesplatform.com, and back to the episode.

Bob

The last question is always the books. We love reading, and we think books are important at all levels. So, what are you currently reading? And if you're not reading right now, is there a book you can recommend to our audience?

Akshay

Yeah, so I just finished reading a book, The Design of Everyday Things. It's a classic book on, I guess, design thinking. But there are a lot of lessons for software as well, about how to design usable systems and how humans think about errors and things like that. I think it was a really good read. One piece that stood out to me was this concept of how errors should be local in a system. You shouldn't do one thing and then have an error propagate really far, to some other place in your system. And that's a lesson that we've tried to incorporate in Marimo as well, in various locations. But I would recommend it.

Bob

Interesting. Yeah, I've got it somewhere on the stack. I might bump it up. It seems like a design book, but it also seems very generically applicable to everything, like software, right? So it's probably something anybody who makes open source libraries or web applications should read, right?

Akshay

Yeah, yeah. It's fundamentally about designing for humans, and if you're building software or building libraries, you're doing that. So yeah, exactly.

Bob

Cool. Well, thanks for sharing. Lastly, where should people go? I mean, obviously to the website and the YouTube channel. And maybe a final shout-out. It would be great to hear where people can find and follow you as well.

Akshay

Yeah. So, our website is marimo.io. If you want to get started quickly with Marimo without installing it, you can try it at Molab, so molab.marimo.io. And to install locally, you can pip install marimo in a virtual environment, or uv add marimo, or uvx marimo if you want to use it as a tool, which many people do. So there are lots of different ways. The definitive getting started guide is in our docs, docs.marimo.io. People can find me on socials. I'm on LinkedIn, and I'm also on Twitter and Bluesky. And was there another part to the question?

Bob

No, that's awesome. Just if you had a final shout-out.

Akshay

Well, thank you, Bob, for having me on the podcast. And then a shout-out to the Marimo team. I know you mentioned, like, wow, how many engineers do you have? The Marimo team works hard and they're always shipping. So yeah, I just want to thank the team too.

Bob

Awesome. Well, thanks again for coming on. Very interesting insights, and I hope a lot of people are going to check it out, because it's an impressive tool and it can make your workflow a lot more efficient.

Akshay

So thanks, Bob. It was a lot of fun.

Bob

Bye, thanks.

Advert

Hey everyone, thanks for tuning into the Pybites podcast. I really hope you enjoyed it. A quick message from me and Bob before you go: to get the most out of your experience with Pybites, including learning more Python, engaging with other developers, learning about our guests, discussing these podcast episodes, and much, much more, please join our community at pybites.circle.so. The link is on the screen if you're watching this on YouTube, and it's in the show notes for everyone else. When you join, make sure you introduce yourself and engage with myself, Bob, and the many other developers in the community. It's one of the greatest things you can do to expand your knowledge, reach, and network as a Python developer. We'll see you in the next episode, and we will see you in the community.