Open Source

Analyzing Satellite Image Data Using PyTroll - Episode 193

Summary

Every day there are satellites collecting sensor readings and imagery of our Earth. To help make sense of that information, developers at the meteorological institutes of Sweden and Denmark worked together to build a collection of Python packages that simplify the work of downloading and processing satellite image data. In this episode one of the core developers of PyTroll explains how the project got started, how that data is being used by the scientific community, and how citizen scientists like you are getting involved.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Martin Raspaud about PyTroll, a suite of projects for processing earth observing satellite data

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what PyTroll is and how the overall project got started?
  • What is the story behind the name?
  • What are the main use cases for PyTroll? (e.g. types of analysis, research domains, etc.)
  • What are the primary types of data that would be processed and analayzed with PyTroll? (e.g. images, sensor readings, etc.)
  • When retrieving the data, are you communicating directly with the satellites, or are there facilities that fetch the information periodically which you can then interface with?
  • How do you locate and select which satellites you wish to retrieve data from?
  • What are the main components of PyTroll and how do they fit together?
  • For someone processing satellite data with PyTroll, can you describe the workflow?
  • What are some of the main data formats that are used by satellites?
  • What tradeoffs are made between data density/expressiveness and bandwidth optimization?
  • What are some of the common issues with data cleanliness or data integration challenges?
  • Once the data has been retrieved, what are some of the types of analysis that would be performed with PyTroll?
  • Are there other tools that would commonly be used in conjunction with PyTroll?
  • What are some of the unique challenges posed by working with satellite observation data?
  • How has the design and capability of the various PyTroll packages evolved since you first began working on it?
  • What are some of the most interesting or unusual ways that you have seen PyTroll used?
  • What are some of the lessons that you have learned while building PyTroll that you have found to be most useful or unexpected?
  • What do you have planned for the future of PyTroll?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Building GraphQL APIs in Python Using Graphene with Syrus Akbary - Episode 192

Summary

The web has spawned numerous methods for communicating between applications, including protocols such as SOAP, XML-RPC, and REST. One of the newest entrants is GraphQL which promises a simplified approach to client development and reduced network requests. To make implementing these APIs in Python easier, Syrus Akbary created the Graphene project. In this episode he explains the origin story of Graphene, how GraphQL compares to REST, how you can start using it in your applications, and how he is working to make his efforts sustainable.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Syrus Akbary about Graphene, a python library for building your APIs with GraphQL

Interview

  • Introductions
  • How did you get introduced to Python?
  • What is GraphQL and what is the benefit vs a REST-based API?
    • How does it compare to specifications such as OpenAPI (formerly Swagger) or RAML?
  • Can you explain what Graphene is and your motivation for building it?
    • In addition to the Python implementation there is also a JavaScript library. Is that primarily for use as a client or can it also be used in Node for serving APIs?
  • What is involved in building a GraphQL API?
    • What does Graphene do to simplify this process?
  • How is Graphene implemented and how has that evolved since you first started working on it?
    • Is there a set of tests for verifying the compliance of Graphene or a specific API with the GraphQL specification?
  • What are some of the most complex or confusing aspects of building a GraphQL API?
  • What are some of the unique capabilities that are offered by building an application with GraphQL as the communication interface?
  • While reading through documentation in preparation for our conversation I noticed the Quiver project. Can you explain what that is and how it fits with the other Graphene projects?
    • What is it doing under the hood to optimize serving of the API?
  • For someone who is interested in adding a GraphQL interface to an existing application, what would be involved?
  • The documentation mentions creation of a schema, as well as defining queries. Is it possible for a client to craft queries that don’t match directly with those defined in the server layer?
  • What are some of the most interesting or surprising uses of Graphene and GraphQL that you have seeen?
  • What are some cases where it would be more practical to implement an API using REST instead of GraphQL?
  • What are some references that you would recommend for anyone who wants to learn more about GraphQL and its ecosystem?
  • What are your plans for the future of Graphene?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

AIORTC: An Asynchronous WebRTC Framework with Jeremy Lainé - Episode 191

Summary

Real-time communication over the internet is an amazing feat of modern engineering. The protocol that powers a majority of video calling platforms is WebRTC. In this episode Jeremy Lainé explains why he wrote a Python implementation of this protocol in the form of AIORTC. He also discusses how it works, how you can use it in your own projects, and what he has planned for the future.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Jeremy Lainé about AIORTC, an asynchronous implementation of the WebRTC and ObjectRTC protocols in Python

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what the WebRTC and ObjectRTC protocols are?
    • What are some of the main use cases for these protocols?
  • What is AIORTC and what was your motivation for creating it?
    • How does it compare to other implementations of the RTC protocols?
    • Why do you think there haven’t been any other Python implementations?
  • What are some of the benefits of having a Python implementation of the RTC protocol?
  • How is AIORTC implemented?
    • What have been some of the most difficult or challenging aspects of implementing a WebRTC compliant library?
    • What are some of the most interesting or useful lessons that you have learned in the process?
  • What is involved in building an application on top of AIORTC?
    • What would be required to integrate AIORTC into an existing application built with something such as Flask or Django?
  • What are some of the most interesting uses of AIORTC that you have seen?
  • What are some of the projects that you would like to build with AIORTC?
  • What are some cases where it would make more sense to use a different library or framework for your WebRTC projects?
  • What are your plans for the future of AIORTC?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Polyglot: Multi-Lingual Natural Language Processing with Rami Al-Rfou - Episode 190

Summary

Using computers to analyze text can produce useful and inspirational insights. However, when working with multiple languages the capabilities of existing models are severely limited. In order to help overcome this limitation Rami Al-Rfou built Polyglot. In this episode he explains his motivation for creating a natural language processing library with support for a vast array of languages, how it works, and how you can start using it for your own projects. He also discusses current research on multi-lingual text analytics, how he plans to improve Polyglot in the future, and how it fits in the Python ecosystem.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Rami Al-Rfou about Polyglot, a natural language pipeline with support for an impressive amount of languages

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Polyglot is and your reasons for starting the project?
  • What are the types of use cases that Polyglot enables which would be impractical with something such as NLTK or SpaCy?
  • A majority of NLP libraries have a limited set of languages that they support. What is involved in adding support for a given language to a natural language tool?
    • What is involved in adding a new language to Polyglot?
    • Which families of languages are the most challenging to support?
  • What types of operations are supported and how consistently are they supported across languages?
  • How is Polyglot implemented?
  • Is there any capacity for integrating Polyglot with other tools such as SpaCy or Gensim?
  • How much domain knowledge is required to be able to effectively use Polyglot within an application?
  • What are some of the most interesting or unique uses of Polyglot that you have seen?
  • What have been some of the most complex or challenging aspects of building Polyglot?
  • What do you have planned for the future of Polyglot?
  • What are some areas of NLP research that you are excited for?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Gnocchi: A Scalable Time Series Database For Your Metrics with Julien Danjou - Episode 189

Summary

Do you know what your servers are doing? If you have a metrics system in place then the answer should be “yes”. One critical aspect of that platform is the timeseries database that allows you to store, aggregate, analyze, and query the various signals generated by your software and hardware. As the size and complexity of your systems scale, so does the volume of data that you need to manage which can put a strain on your metrics stack. Julien Danjou built Gnocchi during his time on the OpenStack project to provide a time oriented data store that would scale horizontally and still provide fast queries. In this episode he explains how the project got started, how it works, how it compares to the other options on the market, and how you can start using it today to get better visibility into your operations.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute.
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Julien Danjou about Gnocchi, an open source time series database built to handle large volumes of system metrics

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Gnocchi is and how the project got started?
    • What was the motivation for moving Gnocchi out of the Openstack organization and into its own top level project?
  • The space of time series databases and metrics as a service platforms are both fairly crowded. What are the unique features of Gnocchi that would lead someone to deploy it in place of other options?
    • What are some of the tools and platforms that are popular today which hadn’t yet gained visibility when you first began working on Gnocchi?
  • How is Gnocchi architected?
    • How has the design changed since you first started working on it?
    • What was the motivation for implementing it in Python and would you make the same choice today?
  • One of the interesting features of Gnocchi is its support of resource history. Can you describe how that operates and the types of use cases that it enables?
    • Does that factor into the multi-tenant architecture?
  • What are some of the drawbacks of pre-aggregating metrics as they are being written into the storage layer (e.g. loss of fidelity)?
    • Is it possible to maintain the raw measures after they are processed into aggregates?
  • One of the challenging aspects of building a scalable metrics platform is support for high-cardinality data. What sort of labelling and tagging of metrics and measures is available in Gnocchi?
  • For someone who wants to implement Gnocchi for their system metrics, what is involved in deploying, maintaining, and upgrading it?
    • What are the available integration points for extending and customizing Gnocchi?
  • Once metrics have been stored, aggregated, and indexed, what are the options for querying and analyzing the collected data?
  • When is Gnocchi the wrong choice?
  • What do you have planned for the future of Gnocchi?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Using Calibre To Keep Your Digital Library In Order with Kovid Goyal - Episode 187

Summary

Digital books are convenient and useful ways to have easy access to large volumes of information. Unfortunately, keeping track of them all can be difficult as you gain more books from different sources. Keeping your reading device synchronized with the material that you want to read is also challenging. In this episode Kovid Goyal explains how he created the Calibre digital library manager to solve these problems for himself, how it grew to be the most popular application for organizing ebooks, and how it works under the covers. Calibre is an incredibly useful piece of software with a lot of hidden complexity and a great story behind it.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Kovid Goyal about Calibre, the powerful and free ebook management tool

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what Calibre is and how the project got started?
  • How are you able to keep up to date with device support in Calibre, given the continual release of new devices and platforms that a user can read ebooks on?
  • What are the main features of Calibre?
    • What are some of the most interesting and most popular plugins that have been creatd for Calibre?
  • Can you describe the software architecture for the project and how it has evolved since you first started working on it?
  • You have been maintaining and improving Calibre for a long time now. What is your motivation to keep working on it?
    • How has the focus of the project and the primary use cases changed over the years that you have been working on it?
  • In addition to its longevity, Calibre has also become a de-facto standard for ebook management. What is your opinion as to why it has gained and kept its popularity?
    • What are some of the competing options and how does Calibre differentiate from them?
  • In addition to the myriad devices and platforms, there is a significant amount of complexity involved in supporting the different ebook formats. What have been the most challenging or complex aspects of managing and converting between the formats?
  • One of the challenges around maintaining a private library of electronic resources is the prevalence of DRM restricted content available through major publishers and retailers. What are your thoughts on the current state of digital book marketplaces?
  • What was your motivation for implementing Calibre in Python?
    • If you were to start the project over today would you make the same choice?
    • Are there any aspects of the project that you would implement differently if you were starting over?
  • What are your plans for the future of Calibre?

Keep In Touch

Picks

  • Tobias
  • Kovid
    • Into Thin Air by John Krakauer
      About how an expedition to climb Everest went wrong. Wonderful account of the difficulties of high altitude mountaineering and the determination it needs.
    • The Steerswoman’s Road by Rosemary Kirstein
      About the spirit of scientific enquiry in a fallen civilization on an alien planet with partial terraforming that is slowly failing.

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Entity Extraction, Document Processing, And Knowledge Graphs For Investigative Journalists with Friedrich Lindenberg - Episode 186

Summary

Investigative reporters have a challenging task of identifying complex networks of people, places, and events gleaned from a mixed collection of sources. Turning those various documents, electronic records, and research into a searchable and actionable collection of facts is an interesting and difficult technical challenge. Friedrich Lindenberg created the Aleph project to address this issue and in this episode he explains how it works, why he built it, and how it is being used. He also discusses his hopes for the future of the project and other ways that the system could be used.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so check out Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode today to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Registration for PyCon US, the largest annual gathering across the community, is open now. Don’t forget to get your ticket and I’ll see you there!
  • Your host as usual is Tobias Macey and today I’m interviewing Friedrich Lindenberg about Aleph, a tool to perform entity extraction across documents and structured data

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what Aleph is and how the project got started?
  • What is investigative journalism?
    • How does Aleph fit into their workflow?
    • What are some other tools that would be used alongside Aleph?
    • What are some ways that Aleph could be useful outside of investigative journalism?
  • How is Aleph architected and how has it evolved since you first started working on it?
  • What are the major components of Aleph?
    • What are the types of documents and data formats that Aleph supports?
  • Can you describe the steps involved in entity extraction?
    • What are the most challenging aspects of identifying and resolving entities in the documents stored in Aleph?
  • Can you describe the flow of data through the system from a document being uploaded through to it being displayed as part of a search query?
  • What is involved in deploying and managing an installation of Aleph?
  • What have been some of the most interesting or unexpected aspects of building Aleph?
  • Are there any particularly noteworthy uses of Aleph that you are aware of?
  • What are your plans for the future of Aleph?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Django, Channels, And The Asynchronous Web with Andrew Godwin - Episode 180

Summary

Once upon a time the web was a simple place with one main protocol and a predictable sequence of request/response interactions with backend applications. This is the era when Django began, but in the intervening years there has been an explosion of complexity with new asynchronous protocols and single page Javascript applications. To help bridge the gap and bring the most popular Python web framework into the modern age Andrew Godwin created Channels. In this episode he explains how the first version of the asynchronous layer for Django applications was created, how it has changed in the jump to version 2, and where it will go in the future. Along the way he also discusses the challenges of async development, his work on designing ASGI as the spiritual successor to WSGI, and how you can start using all of this in your own projects today.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Andrew Godwin about Django Channels 2.x and the ASGI specification for modern, asynchronous web protocols

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start with an overview of the problem that Channels is aiming to solve?
  • Asynchronous frameworks have existed in Python for a long time. What are the tradeoffs in those frameworks that would lead someone to prefer the combination of Django and Channels?
  • For someone who is familiar with traditional Django or working on an existing application, what are the steps involved in integrating Channels?
  • Channels is a project that you have been working on for a significant amount of time and which you recently re-architected. What were the shortcomings in the 1.x release that necessitated such a major rewrite?
    • How is the current system architected?
  • What have you found to be the most challenging or confusing aspects of managing asynchronous web protocols both as an author of Channels/ASGI and someone building on top of them?
    • While reading through the documentation there were mentions of the synchronous nature of the Django ORM. What are your thoughts on asynchronous database access and how important that is for future versions of Django and Channels?
  • As part of your implementation of Channels 2.x you introduced a new protocol for asynchronous web applications in Python in the form of ASGI. How does this differ from the WSGI standard and what was your process for developing this specification?
    • What are your hopes for what the Python community will do with ASGI?
  • What are your plans for the future of Channels?
  • What are some of the most interesting or unexpected uses of Channels and/or ASGI?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Keep Your Code Clean Using pre-commit with Anthony Sottile - Episode 178

Summary

Maintaining the health and well-being of your software is a never-ending responsibility. Automating away as much of it as possible makes that challenge more achievable. In this episode Anthony Sottile describes his work on the pre-commit framework to simplify the process of writing and distributing functions to make sure that you only commit code that meets your definition of clean. He explains how it supports tools and repositories written in multiple languages, enforces team standards, and how you can start using it today to ship better software.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Anthony Sottile about pre-commit, a framework for managing and maintaining hooks for multiple languages

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what a pre-commit hook is and some of the ways that they are useful for developers?
  • What was you motivation for creating a framework to manage your pre-commit hooks?
    • How does it differ from other projects built to manage these hooks?
  • What are the steps for getting someone started with pre-commit in a new project?
  • Which other event hooks would be most useful to implement for maintaining the health of a repository?
  • What types of operations are most useful for ensuring the health of a project?
  • What types of routines should be avoided as a pre-commit step?
  • Installing the hooks into a user’s local environment is a manual step, so how do you ensure that all of your developers are using the configured hooks?
    • What factors have you found that lead to developers skipping or disabling hooks?
  • How is pre-commit implemented and how has that design evolved from when you first started?
    • What have been the most difficult aspects of supporting multiple languages and package managers?
    • What would you do differently if you started over today?
    • Would you still use Python?
  • For someone who wants to write a plugin for pre-commit, what are the steps involved?
  • What are some of the strangest or most unusual uses of pre-commit hooks that you have seen?
  • What are your plans for the future of pre-commit?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Infection Monkey Vulnerability Scanner with Daniel Goldberg - Episode 177

Summary

How secure are your servers? The best way to be sure that your systems aren’t being compromised is to do it yourself. In this episode Daniel Goldberg explains how you can use his project Infection Monkey to run a scan of your infrastructure to find and fix the vulnerabilities that can be taken advantage of. He also discusses his reasons for building it in Python, how it compares to other security scanners, and how you can get involved to keep making it better.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Daniel Goldberg about Infection Monkey, an open source system breach simulation tool for evaluating the security of your network

Interview

  • Introductions
  • How did you get introduced to Python?
  • What is infection monkey and what was the reason for building it?
    • What was the reasoning for building it in Python?
    • If you were to start over today what would you do differently?
  • Penetration testing is typically an endeavor that requires a significant amount of knowledge and experience of security practices. What have been some of the most difficult aspects of building an automated vulnerability testing system?
    • How does a deployed instance keep up to date with recent exploits and attack vectors?
  • How does Infection Monkey compare to other tools such as Nessus and Nexpose?
  • What are some examples of the types of vulnerabilities that can be discovered by Infection Monkey?
  • What kinds of information can Infection Monkey discover during a scan?
    • How does that information get reported to the user?
    • How much security experience is necessary to understand and address the findings in a given report generated from a scan?
  • What techniques do you use to ensure that the simulated compromises can be safely reverted?
  • What are some aspects of network security and system vulnerabilities that Infection Monkey is unable to detect and/or analyze?
  • For someone who is interested in using Infection Monkey what are the steps involved in getting it set up?
    • What is the workflow for running a scan?
    • Is Infection Monkey intended to be run continuously, or only with the interaction of an operator?
  • What are your plans for the future of Infection Monkey?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA