The web has spawned numerous methods for communicating between applications, including protocols such as SOAP, XML-RPC, and REST. One of the newest entrants is GraphQL which promises a simplified approach to client development and reduced network requests. To make implementing these APIs in Python easier, Syrus Akbary created the Graphene project. In this episode he explains the origin story of Graphene, how GraphQL compares to REST, how you can start using it in your applications, and how he is working to make his efforts sustainable.
Real-time communication over the internet is an amazing feat of modern engineering. The protocol that powers a majority of video calling platforms is WebRTC. In this episode Jeremy Lainé explains why he wrote a Python implementation of this protocol in the form of AIORTC. He also discusses how it works, how you can use it in your own projects, and what he has planned for the future.
The need to process unbounded and continually streaming sources of data has become increasingly common. One of the popular platforms for implementing this is Kafka along with its streams API. Unfortunately, this requires all of your processing or microservice logic to be implemented in Java, so what’s a poor Python developer to do? If that developer is Ask Solem of Celery fame then the answer is, help to re-implement the streams API in Python. In this episode Ask describes how Faust got started, how it works under the covers, and how you can start using it today to process your fast moving data in easy to understand Python code. He also discusses ways in which Faust might be able to replace your Celery workers, and all of the pieces that you can replace with your own plugins.
Michael Foord has been working on building and testing software in Python for over a decade. One of his most notable and widely used contributions to the community is the Mock library, which has been incorporated into the standard library. In this episode he explains how he got involved in the community, why testing has been such a strong focus throughout his career, the uses and hazards of mocked objects, and how he is transitioning to freelancing full time.
Twisted is one of the earliest frameworks for developing asynchronous applications in Python and it has yet to fulfill its original purpose. It can be used to build network servers that integrate a multitude of protocols, increase the performance of your I/O bound applications, serve as the full web stack for your WSGI projects, and anything else that needs a battle tested and performant foundation. In this episode long time maintainer Moshe Zadka discusses the history of Twisted, how it has evolved over the years, the transition to Python 3, some of its myriad use cases, and where it is headed in the future. Try it out today and then send some thanks to all of the people who have dedicated their time to building it.
Pandas is a swiss army knife for data processing in Python but it has long been difficult to customize. In the latest release there is now an extension interface for adding custom data types with namespaced APIs. This allows for building and combining domain specific use cases and alternative storage mechanisms. In this episode Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today.
One of the challenges of machine learning is obtaining large enough volumes of well labelled data. An approach to mitigate the effort required for labelling data sets is active learning, in which outliers are identified and labelled by domain experts. In this episode Tivadar Danka describes how he built modAL to bring active learning to bioinformatics. He is using it for doing human in the loop training of models to detect cell phenotypes with massive unlabelled datasets. He explains how the library works, how he designed it to be modular for a broad set of use cases, and how you can use it for training models of your own.
Most applications require data to operate on in order to function, but sometimes that data is hard to come by, so why not just make it up? Mimesis is a library for randomly generating data of different types, such as names, addresses, and credit card numbers, so that you can use it for testing, anonymizing real data, or for placeholders. This week Nikita Sobolev discusses how the project got started, the challenges that it has posed, and how you can use it in your applications.
Determining the best way to manage the capacity and flow of goods through a system is a complicated issue and can be exceedingly expensive to get wrong. Rather than experimenting with the physical objects to determine the optimal algorithm for managing the logistics of everything from global shipping lanes to your local bank, it is better to do that analysis in a simulation. Ruud van der Ham has been working in this area for the majority of his professional life at the Dutch port of Rotterdam. Using his acquired domain knowledge he wrote Salabim as a library to assist others in writing detailed simulations of their own and make logistical analysis of real world systems accessible to anyone with a Python interpreter.