Testing is a critical activity in all software projects, but one that is often neglected in data pipelines. The complexities introduced by the inherent statefulness of the problem domain and the interdependencies between systems make pipeline testing difficult to manage. To make this endeavor more manageable, Abe Gong and James Campbell have created Great Expectations. In this episode they discuss how you can use the project to create tests in the exploratory phase of building a pipeline and leverage those to monitor your systems in production. They also discuss how Great Expectations works, the difficulties associated with pipeline testing and managing associated technical debt, and their future plans for the project.
Most applications require data to operate on in order to function, but sometimes that data is hard to come by, so why not just make it up? Mimesis is a library for randomly generating data of different types, such as names, addresses, and credit card numbers, so that you can use it for testing, anonymizing real data, or for placeholders. This week Nikita Sobolev discusses how the project got started, the challenges that it has posed, and how you can use it in your applications.
One of the draws of Python is how dynamic and flexible the language can be. Sometimes, that flexibility can be problematic if the format of variables at various parts of your program is unclear or the descriptions are inaccurate. The growing middle ground is to use type annotations as a way of providing some verification of the format of data as it flows through your application and enforcing gradual typing. To make it simpler to get started with type hinting, Carl Meyer and Matt Page, along with other engineers at Instagram, created MonkeyType to analyze your code as it runs and generate the type annotations. In this episode they explain how that process works, how it has helped them reduce bugs in their code, and how you can start using it today.
The importance of testing your software is widely talked about and well understood. What is not as often discussed is the different types of testing, and how end-to-end tests can help your team ensure that your application functions properly when it is released to production. This week Luciano Puccio shares the work that he has done on Golem, a framework for building and executing an automation suite to exercise the entire system from the perspective of the user. He discusses his reasons for creating the project, how he thinks about testing, and where he plans on taking Golem in the future. Give it a listen and then take it for a test drive.
We write tests to make sure that our code is correct, but how do you make sure the tests are correct? This week Ned Batchelder explains how coverage.py fills that need, how he became the maintainer, and how it works under the hood.
The venerable ‘if’ statement is a cornerstone of program flow and business logic, but sometimes it can grow unwieldy and lead to unmaintainable software. One alternative that can result in cleaner and easier-to-understand code is a state machine. This week Glyph explains how Automat was created and how it has been used to upgrade portions of the Twisted project.
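To illustrate the idea of replacing nested ‘if’ statements with a state machine, here is a minimal sketch in plain Python. It does not use Automat's API; the states, events, and the `Connection` class are hypothetical names chosen for the example.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CONNECTING = auto()
    CONNECTED = auto()


# Hypothetical transition table: (current state, event) -> next state.
# Any pair that is missing from the table is treated as an invalid input.
TRANSITIONS = {
    (State.IDLE, "connect"): State.CONNECTING,
    (State.CONNECTING, "success"): State.CONNECTED,
    (State.CONNECTING, "failure"): State.IDLE,
    (State.CONNECTED, "disconnect"): State.IDLE,
}


class Connection:
    """Illustrative object whose behavior is driven by the table above."""

    def __init__(self):
        self.state = State.IDLE

    def handle(self, event):
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"{event!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]
        return self.state


conn = Connection()
conn.handle("connect")
conn.handle("success")
```

Because every legal transition lives in one table, adding a state means adding rows rather than threading another branch through existing conditionals.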
We all know that testing is an important part of software and systems development. The problem is that as our systems and applications grow, the amount of testing necessary increases at an exponential rate. Cris Medina joins us this week to talk about some of the problems and approaches associated with testing these complex systems and some of the ways that Python can help.
As Python developers we are fond of the dynamic nature of the language. Sometimes, though, it can get a bit too dynamic and that’s where having some type information would come in handy. Mypy is a project that aims to add that missing level of detail to function and variable definitions so that you don’t have to go hunting 5 levels deep in the stack to understand what shape that data structure is supposed to be. This week we spoke with David Fisher and Greg Price about their work on Mypy and its use within Dropbox and the broader community. They explained how it got started, how it works under the covers, and why you should consider adding it to your projects.
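As a small illustration of the kind of detail that type annotations add, here is a sketch using only the standard `typing` module. The function and its data are hypothetical and are not taken from the episode.

```python
from typing import Dict, List, Tuple


# Hypothetical function: the annotations document the shape of the data,
# so a reader does not need to trace callers to learn what `orders` holds.
def total_by_customer(orders: List[Tuple[str, float]]) -> Dict[str, float]:
    totals: Dict[str, float] = {}
    for customer, amount in orders:
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals


totals = total_by_customer([("ada", 10.0), ("bob", 5.5), ("ada", 2.5)])
```

Python does not enforce these annotations at runtime. A static checker such as Mypy reads them and reports mismatches, for example a call that passes a list of plain strings, before the code is run.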
When you have good tools it makes the work you do even more enjoyable. Russell Keith-Magee has been building up a set of tools that aim to let you write graphical interfaces in Python and run them across all of your target platforms. Most recently he has been working on a capstone project called Toga that targets the Android and iOS platforms with the same set of code. In this episode we explored his journey through programming and how he has built and designed the BeeWare suite. Give it a listen and then try out some or all of his excellent projects!
Making sure that your code is secure is a difficult task. In this episode we spoke to Eric Brown, Travis McPeak, and Tim Kelsey about their work on the Bandit library, which is a static analysis engine to help you find potential vulnerabilities before your application reaches production. We discussed how it works, how to make it fit your use case, and why it was created. Give the show a listen and then go start scanning your projects!