Being able to understand the context of a piece of text is generally thought to be the domain of human intelligence. However, topic modeling and semantic analysis can be used to allow a computer to determine whether different messages and articles are about the same thing. This week we spoke with Radim Řehůřek about his work on GenSim, which is a Python library for performing unsupervised analysis of unstructured text and applying machine learning models to the problem of natural language understanding.
Are you struggling with trying to manage a series of related, interdependent batch jobs? Then you should check out Airflow. In this episode we spoke with the project’s creator Maxime Beauchemin about what inspired him to create it, how it works, and why you might want to use it. Airflow is a data pipeline management tool that will simplify how you build, deploy, and monitor your complex data processing tasks so that you can focus on getting the insights you need from your data.
Dag Brattli is an engineer with Microsoft and in his spare time he created the ported the Reactive Xtensions framework to Python in the form of the RxPy library. In this episode we had the opportunity to speak with Dag and learn more about what ReactiveX is, why it is useful and how you can use it in your Python programs. It is definitely a very powerful programming patern when manipulating data streams which is becoming increasingly common in modern software architectures.