Search results for: pandas

Pandas Extension Arrays with Tom Augspurger - Episode 164

Summary

Pandas is a swiss army knife for data processing in Python but it has long been difficult to customize. In the latest release there is now an extension interface for adding custom data types with namespaced APIs. This allows for building and combining domain specific use cases and alternative storage mechanisms. In this episode Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • To get worry-free releases download GoCD, the open source continous delivery server built by Thoughworks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Your host as usual is Tobias Macey and today I’m interviewing Tom Augspurger about the extension interface for Pandas data frames and the use cases that it enables

Interview

  • Introductions
  • How did you get introduced to Python?
  • Most people are familiar with Pandas, but can you describe at a high level the new extension interface?
    • What is the story behind the implementation of this functionality?
    • Prior to this interface what was the option for anyone who wanted to extend Pandas?
  • What are some of the new data types that are available as external packages?
    • What are some of the unique use cases that they enable?
  • How is the new interface implemented within Pandas?
  • What were the most challenging or difficult aspects of building this new functionality?
  • What are some of the more interesting possibilities that you are aware of for new extension types?
  • What are the limitations of the interface for libraries that add new array functionality?
  • What is the next major change or improvement that you would like to add in Pandas?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Pandas with Jeff Reback - Episode 98

Summary

Pandas is one of the most versatile and widely used tools for data manipulation and analysis in the Python ecosystem. This week Jeff Reback explains why that is, how you can use it to make your life easier, and what you can look forward to in the months to come.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • When you’re writing Python you need a powerful editor to automate routine tasks, maintain effective development practices, and simplify challenging things like refactoring. Our sponsor JetBrains delivers the perfect solution for you in the form of PyCharm, providing a complete set of tools for productive Python, Web, Data Analysis and Scientific development, available in 2 editions. The free and open-source PyCharm Community Edition is perfect for pure Python coding. PyCharm Professional Edition is a full-fledged tool, designed for professional Python, Web and Data Analysis developers. Today JetBrains is offering a 3-month free PyCharm Professional Edition individual subscription. Don’t miss this chance to use the best-in-class tool with intelligent code completion, automated testing, and integration with modern tools like Docker – go to <www.podcastinit.com/pycharm> and use the promo code podcastinit during checkout.
  • Visit the site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Your host as usual is Tobias Macey and today I’m interviewing Jeff Reback about Pandas, the swiss army knife of data analysis in Python.

Interview

  • Introductions
  • How did you get introduced to Python?
  • To start off, what is Pandas and what is its origin story?
    • How did you get involved in the project’s development?
  • For someone who is just getting started with Pandas what are the fundamental ideas and abstractions in the library that are necessary to understand how to use it for working with data?
  • Pandas has quite an extensive API and I noticed that the most recent release includes a nice cheat sheet. How do you balance the power and flexibility of such an expressive API with the usability issues that can be introduced by having so many options of how to manipulate the data?
  • There is a strong focus for use in science and data analytics, but there are a number of other areas where Pandas is useful as well. What are some of the most interesting or unexpected uses that you have seen or heard of?
  • What are some of the biggest challenges that you have encountered while working on Pandas?
  • Do you find the constraint of only supporting two dimensional arrays to be limiting, or has it proven to be beneficial for the success of pandas?
  • What’s coming for pandas? Pandas 2.0!

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Computational Musicology For Python Programmers - Episode 198

Summary

Music is a part of every culture around the world and throughout history. Musicology is the study of that music from a structural and sociological perspective. Traditionally this research has been done in a manual and painstaking manner, but the advent of the computer age has enabled an increase of many orders of magnitude in the scope and scale of analysis that we can perform. The music21 project is a Python library for computer aided musicology that is written and used by MIT professor Michael Scott Cuthbert. In this episode he explains how the project was started, how he is using it personally, professionally, and in his lectures, as well as how you can use it for your own exploration of musical analysis.

Announcements

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Michael Cuthbert about music21, a toolkit for computer aided musicology

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what computational musicology is?
  • What is music21 and what motivated you to create it?
    • What are some of the use cases that music21 supports, and what are some common requests that you purposefully don’t support?
  • How much knowledge of musical notation, structure, and theory is necessary to be able to work with music21?
  • Can you talk through a typical workflow for doing analysis of one or more pieces of existing music?
    • What are some of the common challenges that users encounter when working with it (either on the side of Python or musicology/musical theory)?
    • What about for doing exploration of new musical works?
  • As a professor at MIT, what are some of the ways that music21 has been incorporated into your classroom?
    • What have they enjoyed most about it?
  • How is music21 implemented, and how has its structure evolved since you first started it?
    • What have been the most challenging aspects of building and maintaining the music21 project and community?
  • What are some of the most interesting, unusual, or unexpected ways that you have seen music21 used?
    • What are some analyses that you have performed which yielded unexpected results?
  • What do you have planned for the future of music21?
  • Beyond computational analysis of musical theory, what are some of the other ways that you are using Python in your academic and professional pursuits?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Teaching Digital Archaeology With Jupyter Notebooks - Episode 194

Summary

Computers have found their way into virtually every area of human endeavor, and archaeology is no exception. To aid his students in their exploration of digital archaeology Shawn Graham helped to create an online, digital textbook with accompanying interactive notebooks. In this episode he explains how computational practices are being applied to archaeological research, how the Online Digital Archaeology Textbook was created, and how you can use it to get involved in this fascinating area of research.

Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With 200 Gbit/s private networking, scalable shared block storage, node balancers, and a 40 Gbit/s public network, all controlled by a brand new API you’ve got everything you need to scale up. Go to pythonpodcast.com/linode to get a $20 credit and launch a new server in under a minute. And don’t forget to thank them for their continued support of this show!
  • And to keep track of how your team is progressing on building new features and squashing bugs, you need a project management system designed by software engineers, for software engineers. Clubhouse lets you craft a workflow that fits your style, including per-team tasks, cross-project epics, a large suite of pre-built integrations, and a simple API for crafting your own. Podcast.__init__ listeners get 2 months free on any plan by going to pythonpodcast.com/clubhouse today and signing up for a trial.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Shawn Graham about his work on the Online Digital Archaeology Textbook

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by explaining what digital archaeology is?
  • To facilitate your teaching you have collaborated on the O-DATE textbook and associated Jupyter notebooks. Can you describe what that resource covers and how the project got started?
  • What have you found to be the most critical lessons for your students to help them be effective archaeologists?
    • What are the most useful aspects of leveraging computational techniques in an archaeological context?
  • Can you describe some of the sources and formats of data that would commonly be encountered by digital archaeologists?
  • The notebooks that accompany the text have a mixture of R and Python code. What are your personal guidelines for when to use each language?
  • How have the skills and tools of software engineering influenced your views and approach to research and education in the realm of archaeology?
  • What are some of the most novel or engaging ways that you have seen computers applied to the field of archaeology?
  • What are your goals and aspirations for the O-DATE project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Of Checklists, Ethics, and Data with Emily Miller and Peter Bull - Episode 184

Summary

As data science becomes more widespread and has a bigger impact on the lives of people, it is important that those projects and products are built with a conscious consideration of ethics. Keeping ethical principles in mind throughout the lifecycle of a data project helps to reduce the overall effort of preventing negative outcomes from the use of the final product. Emily Miller and Peter Bull of Driven Data have created Deon to improve the communication and conversation around ethics among and between data teams. It is a Python project that generates a checklist of common concerns for data oriented projects at the various stages of the lifecycle where they should be considered. In this episode they discuss their motivation for creating the project, the challenges and benefits of maintaining such a checklist, and how you can start using it today.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 40Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Join the community in the new Zulip chat workspace at podcastinit.com/chat
  • Your host as usual is Tobias Macey and today I’m interviewing Emily Miller and Peter Bull about Deon, an ethics checklist for data projects

Interview

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Deon is and your motivation for creating it?
  • Why a checklist, specifically? What’s the advantage of this over an oath, for example?
  • What is unique to data science in terms of the ethical concerns, as compared to traditional software engineering?
  • What is the typical workflow for a team that is using Deon in their projects?
  • Deon ships with a default checklist but allows for customization. What are some common addendums that you have seen?
    • Have you received pushback on any of the default items?
  • How does Deon simplify communication around ethics across team boundaries?
  • What are some of the most often overlooked items?
  • What are some of the most difficult ethical concerns to comply with for a typical data science project?
  • How has Deon helped you at Driven Data?
  • What are the customer facing impacts of embedding a discussion of ethics in the product development process?
  • Some of the items on the default checklist coincide with regulatory requirements. Are there any cases where regulation is in conflict with an ethical concern that you would like to see practiced?
  • What are your hopes for the future of the Deon project?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Great Expectations For Your Data Pipelines with Abe Gong and James Campbell - Episode 161

Summary

Testing is a critical activity in all software projects, but one that is often neglected in data pipelines. The complexities introduced by the inherent statefulness of the problem domain and the interdependencies between systems contribute to make pipeline testing difficult to manage. To make this endeavor more manageable Abe Gong and James Campbell have created Great Expectations. In this episode they discuss how you can use the project to create tests in the exploratory phase of building a pipeline and leverage those to monitor your systems in production. They also discussed how Great Expectations works, the difficulties associated with pipeline testing and managing associated technical debt, and their future plans for the project.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • Finding a bug in production is never a fun experience, especially when your users find it first. Airbrake error monitoring ensures that you will always be the first to know so you can deploy a fix before anyone is impacted. With open source agents for Python 2 and 3 it’s easy to get started, and the automatic aggregations, contextual information, and deployment tracking ensure that you don’t waste time pinpointing what went wrong. Go to podcastinit.com/airbrake today to sign up and get your first 30 days free, and 50% off 3 months of the Startup plan.
  • To get worry-free releases download GoCD, the open source continous delivery server built by Thoughworks. You can use their pipeline modeling and value stream map to build, control and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected]
  • Your host as usual is Tobias Macey and today I’m interviewing James Campbell and Abe Gong about Great Expectations, a tool for testing the data in your analytics pipelines

Interview

  • Introduction
  • How did you first get introduced to Python?
  • What is Great Expectations and what was your motivation for starting it?
  • What are some of the complexities associated with testing analytics pipelines?
    • What types of tests can be executed to ensure data integrity and accuracy?
  • What are some examples of the potential impact of pipeline debt?
  • What is Great Expectations and how does it simplify the process of building and executing pipeline tests?
  • What are some examples of the types of tests that can be built with Great Expectations?
  • For someone getting started with Great Expectations what does the workflow look like?
  • What was your reason for using Python for building it?
    • How does the choice of language benefit or hinder the contexts in which Great Expectations can be used?
  • What are some cases where Great Expectations would not be usable or useful?
  • What have been some of the most challenging aspects of building and using Great Expectations?
  • What are your hopes for Great Expectations going forward?

Contact Info

Picks

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA

Jake Vanderplas: Data Science For Academic Research - Episode 140

Summary

Jake Vanderplas is an astronomer by training and a prolific contributor to the Python data science ecosystem. His current role is using Python to teach principles of data analysis and data visualization to students and researchers at the University of Washington. In this episode he discusses how he got started with Python, the challenges of teaching best practices for software engineering and reproducible analysis, and how easy to use tools for data visualization can help democratize access to, and understanding of, data.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters.
  • If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added piece of mind.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Your host as usual is Tobias Macey and today I’m interviewing Jake Vanderplas about data science best practices, and applying them to academic sciences

Interview

  • Introductions
  • How did you get introduced to Python?
  • How has your astronomy background informed and influenced your current work?
  • In your work at the University of Washington, what are some of the most common difficulties that students face when learning data science?
    • How does that list differ for professional scientists who are learning how to apply data science to their work?
  • Where is the tooling still lacking in terms of enabling consistent and repeatable workflows?
  • One of the projects that you are spending time on now is Altair, which is a library for generating visualizations from Pandas dataframes. How does that work factor into your teaching?
  • What are some of the most novel applications of data science that you have been involved with?
  • What are some of the trends in data analysis that you are most excited for?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Surprise! Recommendation Algorithms with Nicolas Hug - Episode 135

Summary

A relevant and timely recommendation can be a pleasant surprise that will delight your users. Unfortunately it can be difficult to build a system that will produce useful suggestions, which is why this week’s guest, Nicolas Hug, built a library to help with developing and testing collaborative recommendation algorithms. He explains how he took the code he wrote for his PhD thesis and cleaned it up to release as an open source library and his plans for future development on it.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who supports us on Patreon. Your contributions help to make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at podastinit.com/linode and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. And now you can deliver your work to your users even faster with the newly upgraded 200 GBit network in all of their datacenters.
  • If you’re tired of cobbling together your deployment pipeline then it’s time to try out GoCD, the open source continuous delivery platform built by the people at ThoughtWorks who wrote the book about it. With GoCD you get complete visibility into the life-cycle of your software from one location. To download it now go to podcatinit.com/gocd. Professional support and enterprise plugins are available for added piece of mind.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected])
  • To help other people find the show please leave a review on iTunes, or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Your host as usual is Tobias Macey and today I’m interviewing Nicolas Hug about Surprise, a scikit library for building recommender systems

Interview

  • Introductions
  • How did you get introduced to Python?
  • What is Surprise and what was your motivation for creating it?
  • What are the most challenging aspects of building a recommender system and how does Surprise help simplify that process?
  • What are some of the ways that a user or company can bootstrap a recommender system while they accrue data to use a collaborative algorithm?
  • What are some of the ways that a recommender system can be used, outside of the typical ecommerce example?
  • Once an algorithm has been deployed how can a user test the accuracy of the suggestions?
  • How is Surprise implemented and how has it evolved since you first started working on it?
  • What have been the most difficult aspects of building and maintaining Surprise?
  • competitors?
  • What are the attributes of the system that can be modified to improve the relevance of the recommendations that are provided?
  • For someone who wants to use Surprise in their application, what are the steps involved?
  • What are some of the new features or improvements that you have planned for the future of Surprise?

Keep In Touch

Picks

  • Tobias
    • Silk profiler for Django

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

PyTables with Francesc Alted - Episode 97

Summary

HDF5 is a file format that supports fast and space efficient analysis of large datasets. PyTables is a project that wraps and expands on the capabilities of HDF5 to make it easy to integrate with the larger Python data ecosystem. Francesc Alted explains how the project got started, how it works, and how it can be used for creating sharable and archivable data sets.

Preface

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app. Linode will has announced new plans, including 1GB for $5 plan, high memory plans starting at 16GB for $60/mo and an upgrade in storage from 24GB to 30GB on our 2GB for $10 plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Your host as usual is Tobias Macey and today I’m interviewing Francesc Alted about PyTables

Interview

  • Introductions
  • How did you get introduced to Python?
  • To start with, what is HDF5 and what was the problem that motivated you to wrap Python around it to create PyTables?
  • Which are the most relevant contributors for PyTables? How you interacted?
  • How is the project architected and what are some of the design decisions that you are most proud of?
  • What are some of the typical use cases for PyTables and how does it tie into the broader Python data ecosystem?
  • How common is it to use an HDF5 file as a data interchange format to be shared between researchers or between languages?
  • Given the ability to create custom node types, does that inhibit the ability to interact with the stored data using other libraries?
  • What are some of the capabilities of HDF5 and PyTables that can’t be reasonably replicated in other data storage systems?
  • One of the more intriguing capabilities that I noticed while reading the documentation is the ability to perform undo and redo operations on the data. How might that be leveraged in a real-world use case?
  • What are some of the most interesting or unexpected uses of PyTables that you are aware of?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA

Python for GIS with Sean Gillies - Episode 80

Summary

Location is an increasingly relevant aspect of software systems as we have more internet connected devices with GPS capabilities. GIS (Geographic Information Systems) are used for processing and analyzing this data, and fortunately Python has a suite of libraries to facilitate these endeavors. This week Sean Gillies, an author and contributor of many of these tools, shares the story of his career and contributions, and the work that he is doing at MapBox.

Brief Introduction

  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show you can leave a review on iTunes, or Google Play Music, and tell your friends and co-workers
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your host as usual is Tobias Macey
  • Today I’m interviewing Sean Gillies about writing Geographic Information Systems in Python.

Interview with Sean Gillies

  • Introductions
  • How did you get introduced to Python?
  • Can you start by describing what Geographic Information Systems are and what kinds of projects might take advantage of them?
  • How did you first get involved in the area of GIS and location-based computation?
  • What was the state of the Python ecosystem like for writing these kinds of applications?
  • You have created and contributed to a number of the canonical tools for building GIS systems in Python. Can you list at least some of them and describe how they fit together for different applications?
  • What are some of the unique challenges associated with trying to model geographical features in a manner that allows for effective computation?
    • How does the complexity of modeling and computation scale with increasing land area?
  • Mapping and cartography have an incredibly long history with an ever-evolving set of tools. What does our digital age bring to this time-honored discipline that was previously impossible or impractical?
  • To build accurate and effective representations of our physical world there are a number of domains involved, such as geometry and geography. What advice do you have for someone who is interested in getting started in this particular niche?
  • What level of expertise would you advise for someone who simply wants to add some location-aware features to their application?
  • I know that you joined Mapbox a little while ago. Which parts of their stack are written in Python?
  • What are the areas where Python still falls short and which languages or tools do you turn to in those cases?

Keep In Touch

Picks

Links

The intro and outro music is from Requiem for a Fish The Freak Fandango Orchestra / CC BY-SA