✚ Why I Use R More than Python

Welcome to The Process, where we look closer at how the charts get made. This is issue #256. I’m Nathan Yau. The debate over whether to use R or Python for data visualization, and data analysis in general, is a useless one. Because R is clearly better. (I’m kidding.)

Become a member for access to this — plus tutorials, courses, and guides.

✚ How I Made That: Network Diagrams of All the Household Types

With visualization, there’s a lot of filtering and aggregation so that it’s easier to see general patterns. But lately I’ve been more curious about what we can see from visualizing everything. So I made network diagrams for 4,708 household types in the United States.

Here’s how I made them using Python and R.


Python is coming to Excel

Excel is getting a bump in capabilities with Python integration. From Microsoft:

Excel users now have access to powerful analytics via Python for visualizations, cleaning data, machine learning, predictive analytics, and more. Users can now create end to end solutions that seamlessly combine Excel and Python – all within Excel. Using Excel’s built-in connectors and Power Query, users can easily bring external data into Python in Excel workflows. Python in Excel is compatible with the tools users already know and love, such as formulas, PivotTables, and Excel charts.

Sounds fun for both Excel users and Python developers.

It’s headed to the Beta Channel in Excel for Windows and then Excel for Windows proper. They didn’t announce a timeline for Mac.


Introduction to statistical learning, with Python examples

The second edition of An Introduction to Statistical Learning, with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Rob Tibshirani was released in 2021. They, along with Jonathan Taylor, just released an alternate version with applications in Python. So if Python is your thing, have at it. Like the R version, it is free to download as a PDF.


Switching from Python to R

If you’re looking to switch or just want to expand your skills, this starter guide by Stephanie Lo provides some translations:

Are you curious about delving into the world of R programming? While Python remains the dominant choice amongst the data science community, with approximately 60% of developers using it in 2022, there are instances where R may pop up now and again. That’s because R is optimized for statistics and data. If you, like me, have a foundation in Python but now encounter job listings and internal company tasks that demand R skills, this article aims to break that down. We will explore the fundamental distinctions between Python and R and wrap the project into a data cleaning and visualization tutorial to ensure a smooth transition to R.

I mostly use R, but have always found it helpful to know some Python, especially when there’s some fun library to try.


Introduction to Deep Learning

Sebastian Raschka made 170 videos on deep learning, and you can watch all of the lessons now:

I just sat down this morning and organized all deep learning related videos I recorded in 2021. I am sure this will be a useful reference for my future self, but I am also hoping it might be useful for one or the other person out there.

It’s split into 19 lessons over five parts: introduction, mathematical foundations, neural networks, deep learning for computer vision, and generative models. Might be useful, even if you just want to learn more about what machine learning is.


Spatula, a Python library for maintainable web scraping

This looks promising:

While it is often easy, and tempting, to write a scraper as a dirty one-off script, spatula makes an attempt to provide an easy framework that most scrapers fit within without additional overhead.

This reflects the reality that many scraper projects start small but grow quickly, so reaching for a heavyweight tool from the start often does not seem practical.

The initial overhead imposed by the framework should be as light as possible, providing benefits even for authors that do not wish to use every feature available to them.

Although, without my dirty one-off scripts, what will I put in my tmp data folder?


Generate a color analysis by uploading an image

Mel Dollison and Liza Daly made a fun interactive that lets you upload an image, and it spits out a vintage-looking color analysis a la Vanderpoel:

This generator is based on the works of Emily Noyes Vanderpoel (1842-1939), who hoped her original color analyses would inspire others to study “whatever originals may be at hand in books, shops, private houses, or museums.” We hope you are similarly inspired by her abstract, modernist style employed in the context of everyday objects and photos.

The project was originally conceived as a Twitter bot, and you can find the Python code behind it on GitHub.


✚ How to Make Line Charts in Python, with Pandas and Matplotlib

Line charts can show patterns over time and relationships between variables. This is a comprehensive introduction to making them with two common libraries.
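For a rough idea of what a line chart with Pandas and Matplotlib involves, here is a minimal sketch. It is not taken from the tutorial itself. The data is made up, and the column names `month` and `value` and the output file `line_chart.png` are placeholders for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so this runs without a display

import matplotlib.pyplot as plt
import pandas as pd

# Made-up monthly values, for illustration only.
df = pd.DataFrame(
    {
        "month": pd.date_range("2023-01-01", periods=6, freq="MS"),
        "value": [12, 15, 14, 18, 21, 19],
    }
)

# One line: time on the x-axis, the measured value on the y-axis.
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(df["month"], df["value"], marker="o")
ax.set_xlabel("Month")
ax.set_ylabel("Value")
ax.set_title("Made-up values over time")

fig.autofmt_xdate()  # tilt the date labels so they do not overlap
fig.savefig("line_chart.png", dpi=150)
```

Pandas also has its own `df.plot()` shortcut, but going through `ax.plot()` directly keeps it clear which Matplotlib pieces are doing the drawing.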

Altair for visualization in Python

Vega-Lite is a grammar for interactive graphics primarily used for analysis. Altair is a visualization library in Python that is based on this grammar.

With Altair, you can spend more time understanding your data and its meaning. Altair’s API is simple, friendly and consistent and built on top of the powerful Vega-Lite visualization grammar. This elegant simplicity produces beautiful and effective visualizations with a minimal amount of code.

Jim Vallandingham just put up a useful introduction to the library if you’re looking to get your feet wet.

I do very little visualization-wise with Python since my current toolset typically covers my bases, but this has me curious.
