The Perfect Python Package

... does not exist. But can you fool me?

Nov 12, 2025

This article does not reflect the preferences or perspectives of my employer; the views here are mine and mine alone.

Of course, the perfect package does not exist. Reviewing job applicants’ GitHub profiles makes this abundantly clear. But beyond the “meat” of the package that actually does, you know, the stuff that it says it will do, the “smell” of the package is what I gravitate towards. I don’t really care so much about factory patterns or complicated abstractions; I care that you’ve put in the time and effort to understand that writing code isn’t as simple as slinging some slick one-liners to do nested dictionary comprehensions and throwing them up for the world to admire. Which came first, the egg or the wheel (and which are we building today)? Do I see some pesky bytecode in those module directories? Is the evidence of your macOS Finder still hanging around in the root of the directory to share your disdain for terminals with the whole world?

I’m being facetious of course, but in truth these things matter. When it comes to making production-quality code I need to know that you can handle a .gitignore (otherwise do I trust you with secrets?) and having done the work to understand the arcane incantations in a CI/CD pipeline means that you can bootstrap real distributions. When I’m asked about these things in interviews I usually give a hand-wavy answer about good practices. What are those good practices? I’m glad you asked.

Principle 1: Your time is valuable

The very first thing that I look for in a package is an admission that anything you do by hand in a terminal is hacky by definition. If it’s not repeatable after you’ve left the package to sit for a year and some enterprising student finds a bug, it’s bad.

In practice, this boils down to two core technical beliefs. Firstly, you need automation. Secondly, you need to proactively think about dependency resolution issues.

Automation breeds confidence. When I see things like a mechanism for handling releases (either trunk-based or Git flow is fine, but you have to make a decision! Tools like release-please are helpful here) I feel more confident that features and bug fixes are being integrated at regular intervals and made available to developers. Even better when the release mechanism ties directly to a distribution mechanism that incorporates testing (more later). GitHub Actions, CircleCI, Jenkins, whatever. The core principle that a short-lived ephemeral VM can test, build, and publish your code means that you’ve put in the effort to ensure that your code can be run anywhere. It also means that you’ve figured out what is needed to run and publish your code (including handling secrets like PyPI keys), that you will spot issues as they arise and tests fail, and that you might even have some idea of coverage. Most of all though, this is a strong signal that someone else can run your code. And that is hugely important because of the second technical belief.
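
As a rough sketch of what "nothing by hand" can look like before any CI vendor gets involved, here is one scripted entry point that a human and a pipeline could both call. The step list is an assumption for illustration: it presumes pytest and the build package are installed, and the names DEFAULT_STEPS and run_steps are made up for this example.

```python
import subprocess
import sys
from collections.abc import Sequence

# Hypothetical pipeline: the same steps a CI job would run, in order.
# Assumes `pytest` and `build` are installed in the current environment.
DEFAULT_STEPS: list[list[str]] = [
    [sys.executable, "-m", "pytest"],
    [sys.executable, "-m", "build"],
]


def run_steps(steps: Sequence[Sequence[str]]) -> int:
    """Run each command in order and stop at the first failure.

    Returns 0 if every step succeeded, otherwise the exit code of the
    step that failed.
    """
    for command in steps:
        print("running:", " ".join(command))
        result = subprocess.run(command)
        if result.returncode != 0:
            # Fail fast: later steps (such as building) never run on red tests.
            return result.returncode
    return 0
```

Calling run_steps(DEFAULT_STEPS) locally and from the CI job keeps the two from drifting apart, which is most of the point.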

If you’re not careful, dependencies can take up more time than the code you’re writing. I’m serious. Have you ever had to migrate a compiled torch model from Pydantic 1 to 2, sweet summer child? This isn’t an issue you can entirely get around with clever engineering. Things are going to change, and you don’t know how. The only thing in your power is to ensure that i) you are using a packaging mechanism that makes dependencies explicit, and ii) you are intentionally auditing your dependencies to make sure they are reasonable. To address the first, something like uv can help to keep your pyproject.toml up to date with your actual environment (do you ever go into the terminal and run pip install? Stop that) and mamba is a necessary evil when going beyond the nicely defined space of Python distributions. A second-order effect of uv or pixi is making it easier for others to run your code in the same environment that you developed it in. The ultimate test here is containerization, and while I love to see a .devcontainer in a repo, it’s a high bar that few manage to reach. If you’re one of the few, enjoy the view from your lofty heights of reproducibility.

If your package seems well designed for reproducible packaging and I can actually get it running with minimal finicking, congratulations, you have passed the first test.

Principle 2: My time is valuable, too

My first question when opening a piece of code is, “what does this do?”. Which of the many holes in my heart (or tech stack) are you plugging with your invention? If you don’t communicate this within one (that’s 1, with a capital 1) paragraph then you have lost me and likely most others. Assuming that you have retained my goldfish-like attention span, the next hurdle for you to cross is accurately and succinctly communicating i) what your package does, ii) how I can do it too. The first of these is accomplished with documentation, while the latter is accomplished with examples and tests.

Documentation sounds boring. I’m not debating that. The mental shift that I would recommend is going from “my code tells its own story” (the myth of self-documenting code) to “my code solves a problem, the documentation explains what that problem is and how we are solving it”. The “pitch” in the README should have already established why that problem is worth solving. That’s not to say that you shouldn’t use in-line documentation (you should), just that comments and snappy variable names are not a necessary and sufficient condition for success. If it’s a function, it’s not sufficient to say “Create embeddings of single-cell expression data” because i) I don’t know what “single-cell expression data” is, and ii) I don’t know how you’re “creating” the embeddings. Compare that with “Use a given pre-trained UCE model to embed expression vectors of log-RPKM 10x single-cell RNA-seq data. Defaults to using 128 dimensions for the latent space.” You might not have used this model before, but now you know a lot more about what I’m trying to do. Of course, your intentions are meaningless if I can’t replicate them for my own purposes. And for that, you need tests.
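
Here is roughly how that second description might look once it lands in a docstring. Everything about the function is hypothetical: the name embed_cells, the idea of passing the model in as a plain callable, and its (vector, n_latent) signature are stand-ins for illustration, not the interface of any real UCE library.

```python
from collections.abc import Callable, Sequence

Vector = Sequence[float]


def embed_cells(
    expression: Sequence[Vector],
    model: Callable[[Vector, int], Vector],
    n_latent: int = 128,
) -> list[Vector]:
    """Embed single-cell expression vectors with a pre-trained model.

    Each row of ``expression`` is one cell: a vector of log-RPKM values
    from 10x single-cell RNA-seq, one entry per gene. Every cell is passed
    through ``model`` (for example a pre-trained UCE model wrapped as a
    callable) and mapped to a point in a latent space.

    Args:
        expression: One log-RPKM vector per cell.
        model: Callable taking (vector, n_latent) and returning the
            embedding. Hypothetical interface, for illustration only.
        n_latent: Dimensionality of the latent space. Defaults to 128.

    Returns:
        One embedding per input cell, in the same order as the input.

    Raises:
        ValueError: If ``n_latent`` is not positive.
    """
    if n_latent <= 0:
        raise ValueError(f"n_latent must be positive, got {n_latent}")
    return [model(cell, n_latent) for cell in expression]
```

The docstring answers the two questions from above: what the input is, and what "creating" an embedding means here.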

Treat tests as you would a confessional. They’re a place to put your deepest, darkest fears (in the vein of, “this really shouldn’t happen…”) in a judgement-free way. They’re also, fundamentally, a way to communicate. You communicate your expectations in a machine-verifiable way, you communicate the core functionality of your code to me with examples, and you communicate that you are actually thinking about these things proactively. When I want to use a package, the first place I go is to the tests (why? Because they’re the only things that I know will work). If my eager mouse can’t find an aptly named folder in the repository that you are currently worrying about, then go write some tests!
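
A small sketch of tests doing both jobs at once, usage example and confessional. The function normalize_counts is invented for this example, and the tests use bare assert statements so they run under pytest or on their own.

```python
def normalize_counts(counts: list[float]) -> list[float]:
    """Scale non-negative counts so they sum to 1 (hypothetical example)."""
    if any(value < 0 for value in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must not all be zero")
    return [value / total for value in counts]


def test_normalized_counts_sum_to_one():
    # Core functionality, doubling as a usage example for the reader.
    result = normalize_counts([2, 6, 2])
    assert result == [0.2, 0.6, 0.2]


def test_all_zero_counts_are_rejected():
    # "This really shouldn't happen...", so write down what happens when it does.
    try:
        normalize_counts([0, 0, 0])
    except ValueError:
        return
    raise AssertionError("expected ValueError for all-zero counts")
```

The second test is the confession: it records a fear (an empty sample) and the decision about how the code reacts to it.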

However, I do acknowledge that not everything should be a test. There are tasks that require elaboration and explanation to communicate the intention behind a particular pattern in your package. I’ve always used the term “cook book”, in the sense of a set of “recipes” that I can try out to do different things. I like that analogy rather than a “tutorial” because really, I’m not interested in learning everything about your package (except in some rare cases), I’m interested in taking it and doing something with it. A cookbook tells me what I can do, and how I can do it. Ideally, these recipes are fast enough that they can be run while the documentation is being knit and therefore are guaranteed to work.
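
One way to get recipes that are checked every time the documentation is built is Python's standard doctest module, sketched below. The recipe itself (top_words) and the helper failed_examples are made up for illustration; a real project might lean on its documentation tool's doctest integration instead.

```python
import doctest
from collections import Counter


def top_words(text: str, n: int = 2) -> list[str]:
    """Recipe: list the most frequent words in a piece of text.

    >>> top_words("spam eggs spam ham spam eggs")
    ['spam', 'eggs']
    >>> top_words("spam eggs spam ham spam eggs", n=1)
    ['spam']
    """
    return [word for word, _ in Counter(text.split()).most_common(n)]


def failed_examples(func) -> int:
    """Run the doctest examples in ``func`` and return how many failed."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    name = func.__name__
    # module=False skips module lookup; globs makes the function visible
    # to its own examples.
    for test in finder.find(func, name, module=False, globs={name: func}):
        runner.run(test)
    return runner.failures
```

If failed_examples(top_words) ever returns something other than 0, the recipe in the docs has gone stale and the build can say so.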

A repository is an idea, and an idea that is not properly communicated is worthless. Code communicates with the machine, but I’m not a machine (yet). I need words, diagrams, concrete use cases to show me that this package is worth my ever-vanishing time. Breathe sweet snippets into my ear and show me why my time is best spent using your spaghetti rather than cooking my own.

Principle 3: You don’t know me, I don’t know you

Look, you probably think you do know me from the tone of this article (and I don’t blame you). But really, I’m just coming off a Dungeon Crawler Carl kick and like this writing style right now. In practice, you have no idea what I want to do, and I almost never know why you made this package in the first place. So don’t make assumptions. They make an ass of you and me (but in this case, I’m reviewing your code, so guess who the ass is?).

Assumptions abound in code. The most egregious is a hard-coded path to your home directory, as in pd.read_csv("/home/alice/the_only_data_that_works.csv"), but there are all sorts of less obvious sins. Parameter values are almost never documented, especially for private functions, and in a lot of cases they matter. Let’s say you’ve made a topic modelling thingy that uses latent Dirichlet allocation: great, but you have at least two really important parameters to consider. Do you expect me to just… figure that out? You don’t know what I’m doing. I could be doing topic modelling on migratory bird pit stops. But, at the same time, I should be able to use your package quickly without having to read your eighteen-page conference paper. So, how do you balance these conflicting priorities?
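
For the hard-coded path, the least dramatic fix is to let the caller say where the data lives. The sketch below uses the standard library's csv module rather than pandas so that it runs anywhere; load_rows is a hypothetical name.

```python
import csv
from pathlib import Path
from typing import Union


def load_rows(path: Union[str, Path]) -> list[dict[str, str]]:
    """Read a CSV file into a list of row dictionaries.

    The caller supplies the path, so nothing about the author's machine
    is baked into the function.
    """
    path = Path(path)
    if not path.is_file():
        # Fail with a message that tells the user what to do next.
        raise FileNotFoundError(f"No CSV file at {path}; pass the path to your own data.")
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

The same idea carries over to a pandas-based loader: the path becomes an argument, and the error for a missing file points at the user's data rather than at Alice's home directory.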

First of all, what I’m looking for primarily is an acknowledgement of where things are easy and where they are hard. Setting parameter values for an LDA algorithm is hard, and I don’t expect you to fix that. But I do expect you to tell me that this is not a simple decision and give me a reasonable place to start. Being the statistics nerd that I am, I get concerned when your algorithm fails to converge and I see hard-coded priors four levels deep in the stack trace. That’s…. bad…. But, what’s worse is not even giving me the option to change them.

So, second of all, your API has two customers. The first is who you think I am (someone with no time and even less patience). I just want to click run and get some nice tqdm progress bars. But the second customer is a PhD student with more time, ambition, and caffeine than common sense. They want to poke around and find the weak spots (usually because they want to fix them and claim they’ve cured cancer). And in truth, it’s better that you let them. Your API needs to have a “dumb person” mode and a “smart person” mode. And how do you do that without overwhelming people? Set sensible defaults (and explain them) but don’t hide important objects in non-user-facing functions. An elegant way to do this is to use something like jsonargparse or tyro to generate your CLI and pass in complex configuration options which are well documented and clearly accessible.
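
A minimal sketch of the two modes, using a plain dataclass as a stand-in for what jsonargparse or tyro would consume. Everything here is hypothetical: the class and function names, the choice of number of topics and the two Dirichlet priors as the parameters that matter, and the default values, which are illustrative starting points rather than recommendations. The fitting itself is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TopicModelConfig:
    """Settings for a hypothetical LDA-based topic model."""

    # Number of topics to fit. Highly corpus-dependent; try several values.
    n_topics: int = 10
    # Dirichlet prior on per-document topic mixtures. Smaller values push
    # each document towards fewer topics.
    doc_topic_prior: float = 0.1
    # Dirichlet prior on per-topic word distributions.
    topic_word_prior: float = 0.01
    # Upper bound on fitting iterations before giving up on convergence.
    max_iter: int = 100

    def __post_init__(self) -> None:
        if self.n_topics < 1:
            raise ValueError("n_topics must be at least 1")
        if self.doc_topic_prior <= 0 or self.topic_word_prior <= 0:
            raise ValueError("Dirichlet priors must be positive")


def fit_topics(documents: list[str], config: Optional[TopicModelConfig] = None) -> dict:
    """Call with no config for the defaults, or pass one to tune everything."""
    if config is None:
        config = TopicModelConfig()
    # Placeholder result: the fitting is omitted. The point is that every
    # prior reaches the caller instead of living four levels down the stack.
    return {"config": config, "n_documents": len(documents)}
```

The impatient customer calls fit_topics(docs) and moves on; the PhD student passes TopicModelConfig(n_topics=50, doc_topic_prior=0.5) and gets to find the weak spots.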

If I understand that what I’m doing is probably bad (but, I’m able to try it anyway) and you give me a way to do it better in the future, you’ve acknowledged your assumptions and actually done something about it. And by doing that, you’ve gained my trust.

Congratulations, you’ve passed the sniff test

Now what – have you made the perfect Python package? Yes! No. Maybe? I don’t know, why are you asking me? Go write some tests.

Kyle Stiers

Nov 15

> But really, I’m just coming off a Dungeon Crawler Carl kick and like this writing style right now.

Okay, but which one?

Great article! Definitely relatable and gave me some new things to consider in my template for starting projects.

Dionizije Fa

Nov 13

Absolutely. And with the ever-growing abundance of software, if I can't run your stuff out of the box in 10 minutes, I'm moving on to the 2nd and 3rd options. And then later I also want to configure it as my use case grows.
