The notebook format that has been exemplified by the IPython/Jupyter project has gained in popularity among data scientists. While the existing formats have proven their value, they are still susceptible with difficulties in collaboration and maintainability. Scott Ernst created the Cauldron notebook to be testable, production ready, and friendly to version control. This week we explore the capabilities, use cases, and architecture of Cauldron and how you can start using it today!
What’s the weather tomorrow? That’s the question that meteorologists are always trying to get better at answering. This week the developers of MetPy discuss how their project is used in that quest and the challenges that are inherent in atmospheric and weather research. It is a fascinating look at dealing with uncertainty and using messy, multidimensional data to model a massively complex system.
Pandas is one of the most versatile and widely used tools for data manipulation and analysis in the Python ecosystem. This week Jeff Reback explains why that is, how you can use it to make your life easier, and what you can look forward to in the months to come.
As the amount of text available on the internet and in businesses continues to increase, the need for fast and accurate language analysis becomes more prominent. This week Matthew Honnibal, the creator of SpaCy, talks about his experiences researching natural language processing and creating a library to make his findings accessible to industry.
Housing is something that we all have experience with, but many don’t understand the complexities of the market. This week Travis Jungroth talks about how House Canary uses data to make the business of real estate more transparent.
We’re delving into the complex workings of your mind this week on Podcast.__init__ with Jonathan Peirce. He tells us about how he started the PsychoPy project and how it has grown in utility and popularity over the years. We discussed the ways that it has been put to use in myriad psychological experiments, the inner workings of how to design and execute those experiments, and what is in store for its future.
MP3 Audio [32 MB]DownloadShow URL Summary Open source has proven its value in many ways over the years. In many companies that value is purely in terms of consuming available projects and platforms. In this episode Zalando describes their recent move to creating and releasing a number of their internal …
MP3 Audio [65 MB]DownloadShow URL Summary Dave Beazley has been using and teaching Python since the early days of the language. He has also been instrumental in spreading the gospel of asynchronous programming and the many ways that it can improve the performance of your programs. This week I had …
Being able to understand the context of a piece of text is generally thought to be the domain of human intelligence. However, topic modeling and semantic analysis can be used to allow a computer to determine whether different messages and articles are about the same thing. This week we spoke with Radim Řehůřek about his work on GenSim, which is a Python library for performing unsupervised analysis of unstructured text and applying machine learning models to the problem of natural language understanding.
Ian Ozsvald and Emlyn Clay are co-chairs of the London chapter of the PyData organization. In this episode we talked to them about their experience managing the PyData conference and meetup, what the PyData organization does, and their thoughts on using Python for data analytics in their work.